WO2021135455A1 - Semantic recall method, apparatus, computer device, and storage medium - Google Patents

Info

Publication number
WO2021135455A1
Authority
WO
WIPO (PCT)
Prior art keywords
sentence vector
online
candidate
vector
query data
Application number
PCT/CN2020/118454
Other languages
French (fr)
Chinese (zh)
Inventor
骆迅 (Luo Xun)
Original Assignee
平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Application filed by Ping An Technology (Shenzhen) Co., Ltd. (平安科技(深圳)有限公司)
Publication of WO2021135455A1 publication Critical patent/WO2021135455A1/en

Classifications

    • G06F 16/3329 Natural language query formulation or dialogue systems (G06F 16/00 Information retrieval; G06F 16/30 unstructured textual data; G06F 16/33 Querying; G06F 16/332 Query formulation)
    • G06F 18/22 Matching criteria, e.g. proximity measures (G06F 18/00 Pattern recognition; G06F 18/20 Analysing)
    • G06F 40/30 Semantic analysis (G06F 40/00 Handling natural language data)
    • G06N 3/045 Combinations of networks (G06N 3/00 Computing arrangements based on biological models; G06N 3/02 Neural networks; G06N 3/04 Architecture, e.g. interconnection topology)
    • G06N 3/08 Learning methods (G06N 3/00 Computing arrangements based on biological models; G06N 3/02 Neural networks)

Definitions

  • This application relates to the field of artificial intelligence technology, in particular to a semantic recall method, device, computer equipment and storage medium.
  • the semantic recall model is widely used in AI question answering systems.
  • AI question answering systems are used in more and more places to replace manual question answering to improve processing efficiency.
  • the semantic recall model is mainly based on traditional deep learning models, such as the CNN, LSTM, and ESIM models.
  • the purpose of the embodiments of the present application is to propose a semantic recall method, device, computer equipment, and storage medium, aiming to solve the technical problem of low efficiency of semantic recall model processing corpus data.
  • an embodiment of the present application provides a semantic recall method, which adopts the following technical solutions:
  • a semantic recall method includes the following steps:
  • the candidate sentence vectors are sorted in descending order according to the similarity, and the answer to the candidate question corresponding to the first-ranked candidate sentence vector is returned as the correct answer.
  • an embodiment of the present application also provides a semantic recall device, which adopts the following technical solutions:
  • the first obtaining module is configured to obtain the online sentence vector corresponding to the online query data based on the sentence vector generator when the online query data is received;
  • the second obtaining module is used to obtain the stored candidate sentence vector
  • a splicing module configured to match the online sentence vector and the candidate sentence vector based on a sentence vector splicer to obtain the similarity between the online sentence vector and the candidate sentence vector;
  • the sorting module is configured to sort the candidate sentence vectors in descending order according to the similarity, and return the answer to the candidate question corresponding to the candidate sentence vector ranked first as the correct answer.
  • an embodiment of the present application also provides a computer device, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, where the processor, when executing the computer-readable instructions, implements the steps of the semantic recall method as described below:
  • the candidate sentence vectors are sorted in descending order according to the similarity, and the answer to the candidate question corresponding to the first-ranked candidate sentence vector is returned as the correct answer.
  • embodiments of the present application also provide a computer-readable storage medium storing computer-readable instructions which, when executed by a processor, implement the following steps of the semantic recall method:
  • the candidate sentence vectors are sorted in descending order according to the similarity, and the answer to the candidate question corresponding to the first-ranked candidate sentence vector is returned as the correct answer.
  • the online query data is the input sentence; when the online query data is received, the online sentence vector corresponding to the online query data is obtained based on the sentence vector generator, where the online sentence vector is the vector-form data corresponding to the online query data; the stored candidate sentence vector is obtained, where the candidate sentence vector is the sentence vector corresponding to a candidate question stored in the database in advance; the sentence vector splicer matches the online sentence vector against the candidate sentence vector to obtain the similarity between the two; the candidate sentence vectors can then be screened according to the similarity, so as to filter out the candidate sentence vector that best matches the online sentence vector.
  • the candidate sentence vectors are sorted in descending order according to the similarity, and the answer to the candidate question corresponding to the candidate sentence vector ranked first is returned as the correct answer.
  • Figure 1 is an exemplary system architecture diagram to which the present application can be applied;
  • Figure 2 is a flowchart of an embodiment of a semantic recall method
  • Figure 3 is a schematic diagram of a sentence vector generator
  • Figure 4 is a schematic diagram of a sentence vector splicer
  • Fig. 5 is a schematic structural diagram of an embodiment of a semantic recall device according to the present application.
  • Fig. 6 is a schematic structural diagram of an embodiment of a computer device according to the present application.
  • Semantic recall device 600: first acquisition module 610, second acquisition module 620, splicing module 630, sorting module 640.
  • the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105.
  • the network 104 is used to provide a medium for communication links between the terminal devices 101, 102, 103 and the server 105.
  • the network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, and so on.
  • the user can use the terminal devices 101, 102, and 103 to interact with the server 105 through the network 104 to receive or send messages and so on.
  • Various communication client applications such as web browser applications, shopping applications, search applications, instant messaging tools, email clients, and social platform software, may be installed on the terminal devices 101, 102, and 103.
  • the terminal devices 101, 102, and 103 may be various electronic devices with display screens that support web browsing, including but not limited to smart phones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop computers, desktop computers, and the like.
  • the server 105 may be a server that provides various services, for example, a background server that provides support for pages displayed on the terminal devices 101, 102, and 103.
  • the semantic recall method provided in the embodiments of the present application is generally executed by the server/terminal, and correspondingly, the semantic recall device is generally installed in the server/terminal device.
  • The terminal devices, networks, and servers in FIG. 1 are merely illustrative; there can be any number of terminal devices, networks, and servers according to implementation needs.
  • the semantic recall method includes the following steps:
  • Step S200 when receiving online query data, obtain an online sentence vector corresponding to the online query data based on the sentence vector generator;
  • Online query data is real-time query data received online.
  • the online sentence vector corresponding to the online query data is obtained based on the sentence vector generator.
  • the obtained online sentence vector is the sentence vector corresponding to the online query data.
  • the online query data is a sentence
  • the sentence is input to the tokenizer layer in the sentence vector generator
  • each word in the online query data is converted into an ID based on the tokenizer layer; that is, every word in the sentence is converted into ID format.
  • the ID is passed through the embedding layer to obtain the word vector corresponding to each word in the online query data.
  • convolution processing is performed on the word vector to obtain the online sentence vector corresponding to the current online query data.
  • the sentence vector generator is an independent model structure for processing online query data.
  • the traditional deep learning model usually includes a representation layer and an output layer.
  • the representation layer and the output layer of the traditional deep learning model are separated, and the representation layer part is taken out to obtain the corresponding sentence vector generator.
  • Take the CNN model as an example: in the CNN model, the sentence vector generator is shown in Figure 3.
  • q1(char) represents the input sentence q1, that is, the online query data. The sentence passes through the embedding layer to obtain the word vector corresponding to each word in the online query data; the word vectors then pass through (Conv+GlobalMaxPooling)*3, that is, a three-layer convolutional neural network that performs convolution processing to obtain the convolution results, where Conv is convolution and GlobalMaxPooling is global max pooling.
  • Concat splices the obtained convolution results, and outputs the spliced result to obtain the online sentence vector corresponding to the online query data.
  • In the CNN model, the results of each convolution layer must be spliced; the purpose of multi-layer convolution is to make the obtained data more accurate, so models that do not use multi-layer convolution may not include Concat.
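The sentence vector generator pipeline described above (tokenizer layer → embedding layer → (Conv+GlobalMaxPooling)*3 → Concat) can be sketched in plain NumPy. The vocabulary, dimensions, and random kernels below are illustrative stand-ins for a trained model, not the patent's actual parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative vocabulary and embedding table (stand-ins for a trained model).
vocab = {"what": 1, "is": 2, "semantic": 3, "recall": 4}  # 0 = padding/unknown
EMB_DIM, FILTERS = 8, 4
embedding = rng.normal(size=(len(vocab) + 1, EMB_DIM))
kernels = {w: rng.normal(size=(w, EMB_DIM, FILTERS)) for w in (1, 2, 3)}

def conv_global_max(word_vecs, kernel):
    """1-D convolution over the word sequence, then global max pooling."""
    width = kernel.shape[0]
    feats = np.stack([
        np.tensordot(word_vecs[i:i + width], kernel, axes=([0, 1], [0, 1]))
        for i in range(len(word_vecs) - width + 1)
    ])                              # (positions, FILTERS)
    return feats.max(axis=0)        # GlobalMaxPooling -> (FILTERS,)

def sentence_vector(sentence):
    ids = [vocab.get(w, 0) for w in sentence.lower().split()]  # tokenizer layer: word -> ID
    word_vecs = embedding[ids]                                 # embedding layer: ID -> word vector
    # (Conv + GlobalMaxPooling) * 3, one branch per kernel width, then Concat.
    return np.concatenate([conv_global_max(word_vecs, k) for k in kernels.values()])

vec = sentence_vector("what is semantic recall")
print(vec.shape)  # (12,) -> 3 convolution branches x 4 filters each
```

The splicing of the three branch outputs is what Concat does in Figure 3; each branch contributes one set of pooled features.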
  • Step S300 Obtain a stored candidate sentence vector
  • the candidate sentence vector is pre-stored in the database, and the candidate sentence vector is obtained and stored in advance by the sentence vector generator for the candidate question.
  • candidate questions are obtained in advance, and the sentence vector of the candidate question is generated offline through the offline sentence vector generator.
  • After the candidate sentence vector corresponding to the candidate question is obtained, the candidate sentence vector is stored in the database.
  • Step S400 matching the online sentence vector and the candidate sentence vector based on the sentence vector splicer to obtain the similarity between the online sentence vector and the candidate sentence vector;
  • The similarity between the candidate sentence vector and the online sentence vector is calculated based on the sentence vector splicer. Specifically, the difference feature vectors of the candidate sentence vector and the online sentence vector in different measurement dimensions are calculated, and the difference feature vectors in the different measurement dimensions are then spliced together to obtain the final difference feature vector of the candidate sentence vector and the online sentence vector.
  • regularization processing is performed on the difference feature vector to obtain the similarity between the online sentence vector and the candidate sentence vector.
  • When the online sentence vector and the candidate sentence vector have been obtained, they are input to Diff+Mul+Max, which calculates the difference feature vectors of the two vectors in the three measurement dimensions of subtraction, multiplication, and maximum.
  • Concat splices the difference feature vectors calculated in the three measurement dimensions to obtain the final difference feature vector, which is input to 3*(Dense+BatchNormalization+Relu+Dropout) for regularization.
  • The result of the regularization is input to Sigmoid, the activation function, which yields the similarity between the online sentence vector and the candidate sentence vector.
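A minimal NumPy sketch of the splicer described above, with a single dense layer standing in for the full 3*(Dense+BatchNormalization+Relu+Dropout) block; the weights here are random placeholders rather than trained parameters:

```python
import numpy as np

rng = np.random.default_rng(1)
DIM = 4                       # sentence vector dimension (illustrative)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def splicer_similarity(online_vec, cand_vec, w, b):
    # Difference features in the three measurement dimensions (Diff + Mul + Max).
    diff = online_vec - cand_vec            # subtraction dimension
    mul = online_vec * cand_vec             # multiplication dimension
    mx = np.maximum(online_vec, cand_vec)   # maximum dimension
    feat = np.concatenate([diff, mul, mx])  # Concat of the three dimensions
    hidden = np.maximum(feat @ w + b, 0.0)  # one Dense + ReLU, standing in for the full block
    return float(sigmoid(hidden.sum()))     # Sigmoid -> similarity in (0, 1)

w = rng.normal(size=(3 * DIM, 8))
b = np.zeros(8)
online = rng.normal(size=DIM)
cand = rng.normal(size=DIM)
sim = splicer_similarity(online, cand, w, b)
print(0.0 < sim < 1.0)  # True
```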
  • Step S500 Sort the candidate sentence vectors in descending order according to the similarity, and return the answer to the candidate question corresponding to the candidate sentence vector ranked first as the correct answer.
  • the candidate questions corresponding to the candidate sentence vector are sorted in descending order according to the similarity, that is, sorted from large to small.
  • the answer to the candidate question corresponding to the candidate sentence vector with the highest similarity to the online sentence vector is selected as the correct answer.
  • the representation layer and output layer of the traditional model are separated into a sentence vector generator and a splicer without changing the accuracy of the original model.
  • The sentence vector generator processes the data, and the sentence vector splicer then splices the processed data with the candidate sentence vectors, without requiring the overall model structure. This increases the concurrency of model processing and improves both the model's efficiency in processing corpus data and the accuracy of question-answer matching.
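The overall recall step, sorting candidates in descending order of similarity and returning the first-ranked candidate's answer, can be sketched as follows; cosine similarity here is only a toy stand-in for the splicer's learned similarity:

```python
import numpy as np

def cosine(a, b):
    # Toy similarity; in the patent this score comes from the sentence vector splicer.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def recall_answer(online_vec, candidate_store):
    """candidate_store maps an id to (candidate sentence vector, answer)."""
    ranked = sorted(candidate_store.values(),
                    key=lambda item: cosine(online_vec, item[0]),
                    reverse=True)            # descending order by similarity
    return ranked[0][1]                      # answer of the first-ranked candidate

store = {
    "q1": (np.array([1.0, 0.0]), "answer about billing"),
    "q2": (np.array([0.0, 1.0]), "answer about login"),
}
print(recall_answer(np.array([0.9, 0.1]), store))  # answer about billing
```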
  • This application belongs to the field of artificial intelligence technology and has good performance in both machine learning and deep learning.
  • step S200 includes:
  • Multi-layer convolution processing is performed on the word vector to obtain the online sentence vector of the online query data.
  • the online query data is a single sentence, where the word vector is the vector corresponding to each word in the single sentence.
  • each word in the online query data is IDized according to characteristics such as its word frequency and TF-IDF (term frequency-inverse document frequency) in the online query data, to obtain the ID corresponding to each word in the online query data.
  • the word vector corresponding to each ID in the online query data is obtained.
  • in the embedding layer, there is a mapping relationship between the ID and the word vector; by passing the IDs through the embedding layer, the word vector corresponding to each word in the online query data can be obtained.
  • the convolution result is a set of sentence vectors corresponding to the online query data.
  • a set of sentence vectors cannot fully reflect the characteristic information of the current online query data. Therefore, all the word vectors obtained in the online query data are subjected to multi-layer convolution processing to obtain multiple sets of convolution results. The multiple sets of convolution results obtained are spliced together, and the final result obtained is the online sentence vector corresponding to the online query data.
  • In this way, the online sentence vector of the online query data is obtained from the word vectors. No complete model structure is needed; only a sentence vector generator is required to obtain the corresponding online sentence vector, which improves the model's efficiency in processing corpus data and further improves the concurrency of model processing.
  • acquiring the word vector of the online query data includes:
  • the ID is feature-encoded based on the embedding layer of the sentence vector generator to obtain a word vector corresponding to each word in the online query data.
  • The token analysis layer is the tokenizer layer, and each word in the received online query data can be converted into an ID by the tokenizer layer. Specifically, when online query data is received, characteristics of the online query data such as word frequency and TF-IDF are obtained; based on these characteristics, the tokenizer layer assigns an ID to each word in the online query data (for example, a word with a word frequency of 5 may be assigned the ID 001). After every word in the online query data has been IDized in the tokenizer layer, the resulting IDs are input to the embedding layer.
  • The embedding layer determines the word vector corresponding to each word according to its ID; that is, based on the embedding layer, the ID of each word is feature-encoded and the mapping between each word and a multi-dimensional space is determined, thereby obtaining the word vector corresponding to each word in the currently input online query data.
  • In this way, the analysis and extraction of online query data by the token analysis layer and the embedding layer are realized, which improves the efficiency and accuracy of analyzing online query data, and further improves the efficiency and accuracy of obtaining the data matching the online query data (that is, the correct answer).
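A toy illustration of frequency-based IDization as described above; the exact assignment scheme (here, the most frequent word gets the smallest ID, with 0 reserved for unknown words) is an assumption for the sketch:

```python
from collections import Counter

def build_id_map(tokens):
    # Hypothetical frequency-based IDization: more frequent words get smaller IDs.
    freq = Counter(tokens)
    return {word: i + 1 for i, (word, _) in enumerate(freq.most_common())}  # 0 reserved

def idize(sentence, id_map):
    return [id_map.get(w, 0) for w in sentence.split()]  # tokenizer layer: word -> ID

id_map = build_id_map("the cat sat on the mat near the door".split())
print(idize("the cat", id_map))  # [1, 2]
```

The resulting IDs would then index into the embedding table to produce word vectors.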
  • performing multi-layer convolution processing on the word vector to obtain the online sentence vector of the online query data includes:
  • the semantic features obtained each time are spliced together to obtain the online sentence vector of the online query data.
  • After the word vector corresponding to each word in the online query data is obtained, the word-based semantic features of the online query data are determined, where a semantic feature is the word-based logical representation of the online query data.
  • Using a convolutional neural network, such as a three-layer CNN, the semantic features of the online query data can be extracted from the obtained word vectors.
  • The word vector of each word in the online query data is convolved by the convolutional neural network, and the convolution result is the word-based semantic feature of the online query data; the semantic feature is also a set of vectors.
  • Multi-layer convolution is performed on the word vector through the convolutional neural network, and the semantic features obtained each time are spliced together to obtain the online sentence vector corresponding to the online query data.
  • Specifically, a three-layer convolutional neural network performs three layers of convolution on all word vectors in the online query data; the results of the three convolution layers, that is, the semantic features, are spliced together, and the output is the online sentence vector corresponding to the online query data.
  • In this way, the online sentence vector corresponding to the online query data is obtained by splicing semantic features, which improves the accuracy of obtaining the online sentence vector and further improves the accuracy of matching the vector to obtain the correct answer.
  • step S300 includes:
  • Candidate questions are pre-collected questions, and the candidate questions are pre-stored in the question library.
  • After the candidate questions are obtained, candidate sentence vectors are calculated one by one for all candidate questions in the question library.
  • the candidate question is calculated offline based on the sentence vector generator, and the calculation process is the same as that of the online sentence vector.
  • the candidate sentence vector can also be calculated offline based on the sentence vector generator without a network connection; for online sentence vectors, the sentence vector generator only performs real-time calculations on the received online questions.
  • the calculation of the candidate sentence vector of the candidate question is realized, and the pre-calculation and storage of the candidate sentence vector saves the matching time during question and answer matching, and improves the efficiency of obtaining answers.
  • the method further includes:
  • the candidate sentence vector is stored in a database in the form of a dictionary, in association with the candidate question.
  • When a candidate sentence vector is obtained, it is stored in the form of a dictionary. Specifically, each candidate sentence vector corresponds to unique identification information, and the candidate sentence vector and its corresponding candidate question are stored in association according to the identification information. When extracting the candidate sentence vector and the corresponding candidate question, they can be retrieved directly based on the identification information.
  • pre-storage of candidate sentence vectors in the form of a dictionary is realized, which further improves the extraction efficiency of candidate sentence vectors during matching, and saves the duration of question and answer matching.
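The dictionary-style storage can be sketched as below; the use of a hash of the question text as the unique identification information is a hypothetical choice, since the patent does not specify how the identifier is generated:

```python
import hashlib

candidate_store = {}   # dictionary form: unique identification info -> entry

def store_candidate(question, answer, sentence_vector):
    # Hypothetical identification scheme: a short hash of the question text as the unique id.
    qid = hashlib.md5(question.encode("utf-8")).hexdigest()[:8]
    candidate_store[qid] = {
        "vector": sentence_vector,   # pre-computed offline by the sentence vector generator
        "question": question,
        "answer": answer,
    }
    return qid

def fetch_candidate(qid):
    # Extraction is a direct dictionary lookup on the identification info.
    return candidate_store[qid]

qid = store_candidate("How do I reset my password?", "Use the reset link.", [0.1, 0.9])
print(fetch_candidate(qid)["answer"])  # Use the reset link.
```

Because the vectors are pre-computed and keyed, matching time at query time is a lookup plus the splicer pass, not a full model run per candidate.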
  • the semantic recall method further includes:
  • The multiplication performs element-wise multiplication of the online sentence vector and the candidate sentence vector; the result is their difference feature vector in the multiplication measurement dimension.
  • The subtraction subtracts the candidate sentence vector from the online sentence vector; the result is their difference feature vector in the subtraction measurement dimension.
  • The maximum takes the element-wise maximum of the online sentence vector and the candidate sentence vector; the result is their difference feature vector in the maximum measurement dimension.
  • The difference feature vectors corresponding to the three measurement dimensions of multiplication, subtraction, and maximum are spliced together to obtain the final difference feature vector of the online sentence vector and the candidate sentence vector.
  • the measurement dimension includes, but is not limited to, three measurement dimensions of multiplication, subtraction, and maximum value, and may also include measurement dimensions such as minimum value.
  • The difference feature vector is regularized, subjected to dense-layer dimensionality reduction, and processed by the sigmoid activation function.
  • The sigmoid function maps a variable to a value between 0 and 1, so the output result is a probability value from 0 to 1. The similarity between the online sentence vector and the candidate sentence vector is measured according to this probability value: if the probability is greater than 0.5, the online sentence vector and the candidate sentence vector are determined to be similar; otherwise, they are not similar.
  • In this way, the splicing and matching of online sentence vectors and candidate sentence vectors is realized without processing by the entire model, which improves the processing efficiency of the model; the candidate sentence vector with the highest matching degree is determined from the similarity output, which further improves the accuracy of answering questions.
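The final decision rule, a sigmoid output compared against the 0.5 threshold, comes down to a few lines; the input value 1.2 is an arbitrary illustrative pre-activation score:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical regularized splicer output for one online/candidate pair:
similarity = sigmoid(1.2)          # maps the value into (0, 1)
is_similar = similarity > 0.5      # threshold rule: greater than 0.5 means similar
print(round(similarity, 3), is_similar)  # 0.769 True
```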
  • the processes in the above-mentioned embodiment methods can be implemented by instructing relevant hardware through computer-readable instructions, which can be stored in a computer-readable storage medium.
  • When the computer-readable instructions are executed, they may include the processes of the above-mentioned method embodiments.
  • the aforementioned storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disc, a read-only memory (Read-Only Memory, ROM), or a random access memory (Random Access Memory, RAM), etc.
  • this application provides an embodiment of a semantic recall device.
  • The device embodiment corresponds to the method embodiment shown in FIG. 2, and the device can specifically be applied to various electronic devices.
  • the semantic recall device 600 in this embodiment includes:
  • the first obtaining module 610 is configured to obtain the online sentence vector corresponding to the online query data based on the sentence vector generator when the online query data is received;
  • the first obtaining module 610 includes:
  • the first obtaining unit is configured to obtain the word vector of the online query data based on the sentence vector generator;
  • the first processing unit is configured to perform multi-layer convolution processing on the word vector to obtain the online sentence vector of the online query data.
  • the first acquiring unit further includes:
  • the second processing unit is configured to perform ID processing on each word in the online query data based on the tag analysis layer of the sentence vector generator to obtain an ID corresponding to each word in the online query data;
  • the third processing unit is configured to perform feature encoding on the ID based on the embedding layer of the sentence vector generator to obtain a word vector corresponding to each word in the online query data.
  • the first processing unit further includes:
  • a fourth processing unit configured to perform multi-layer convolution processing on the word vector based on a convolutional neural network to obtain semantic features corresponding to the online query data
  • the first splicing unit is used to splice the semantic features obtained each time together to obtain the online sentence vector of the online query data.
  • the second obtaining module 620 is configured to obtain stored candidate sentence vectors
  • the second obtaining module 620 includes:
  • the second obtaining unit is used to obtain candidate questions stored in the question library
  • the first calculation unit is configured to perform offline calculation on the candidate question based on the sentence vector generator to obtain a candidate sentence vector corresponding to the candidate question.
  • the third acquiring unit is configured to acquire the unique identification information corresponding to each candidate sentence vector
  • the storage unit is configured to store the candidate sentence vector in a database in the form of a dictionary, in association with the candidate question, according to the identification information.
  • the splicing module 630 is configured to match the online sentence vector and the candidate sentence vector based on a sentence vector splicer to obtain the similarity between the online sentence vector and the candidate sentence vector;
  • the splicing module includes:
  • the second calculation unit is configured to calculate the difference feature vectors of the online sentence vector and the candidate sentence vector in the three measurement dimensions of multiplication, subtraction, and maximum;
  • the second splicing unit is configured to splice the difference feature vectors in the three measurement dimensions together to obtain the final difference feature vector;
  • the fifth processing unit is configured to perform regularization processing on the final difference feature vector to obtain a processing result;
  • the sixth processing unit is configured to apply an activation function to the processing result to obtain the similarity between the online sentence vector and the candidate sentence vector.
  • the similarity between the candidate sentence vector and the online sentence vector is calculated by the vector splicer. Specifically, the difference feature vectors of the candidate sentence vector and the online sentence vector are calculated in different measurement dimensions, and the difference feature vectors from the different measurement dimensions are then combined and spliced to obtain the final difference feature vector of the two vectors.
  • regularization processing is performed on the final difference feature vector to obtain the similarity between the online sentence vector and the candidate sentence vector.
  • when the online sentence vector and the candidate sentence vector are obtained, both are input to the Diff+Mul+Max layer, which calculates their difference feature vectors in the three measurement dimensions of subtraction, multiplication, and maximum.
  • the concat layer splices the difference feature vectors calculated in the three measurement dimensions to obtain the final difference feature vector, which is input to 3*(Dense+BatchNormalization+Relu+Dropout) for regularization.
  • the result of the regularization processing is input to Sigmoid.
  • Sigmoid is the activation function; passing the result of the regularization processing through the activation function yields the similarity between the online sentence vector and the candidate sentence vector.
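The bullets above name the activation without reproducing its formula; assuming the standard logistic sigmoid (which the description later identifies explicitly), the function applied to the regularized result would be:

```latex
\sigma(x) = \frac{1}{1 + e^{-x}}
```

This maps the regularized logit to the interval (0, 1), which is read directly as the similarity score.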
  • the sorting module 640 is configured to sort the candidate sentence vectors in descending order according to the similarity, and return the answer to the candidate question corresponding to the first-ranked candidate sentence vector as the correct answer.
  • the candidate questions corresponding to the candidate sentence vectors are sorted in descending order of similarity, that is, from the largest similarity to the smallest.
  • the answer to the candidate question corresponding to the candidate sentence vector with the highest similarity to the online sentence vector is selected as the correct answer.
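The descending sort and answer selection can be sketched as follows; the scored-candidate pairs are illustrative placeholders, not data from the patent.

```python
def best_answer(scored_candidates):
    """scored_candidates: list of (similarity, answer) pairs.

    Sort in descending order of similarity and return the answer of the
    first-ranked candidate as the correct answer.
    """
    ranked = sorted(scored_candidates, key=lambda pair: pair[0], reverse=True)
    return ranked[0][1]


answer = best_answer([(0.31, "answer A"), (0.92, "answer B"), (0.58, "answer C")])
# answer is "answer B", the candidate with the highest similarity
```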
  • FIG. 6 is a block diagram of the basic structure of the computer device in this embodiment.
  • the computer device 6 includes a memory 61, a processor 62, and a network interface 63 that communicate with each other through a system bus. It should be noted that although the figure only shows the computer device 6 with components 61-63, not all of the illustrated components are required; more or fewer components may be implemented instead. Those skilled in the art will understand that the computer device here is a device that can automatically perform numerical calculation and/or information processing according to preset or stored instructions.
  • its hardware includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and so on.
  • the computer device may be a computing device such as a desktop computer, a notebook, a palmtop computer, and a cloud server.
  • the computer device can interact with the user through a keyboard, a mouse, a remote control, a touch panel, or a voice control device.
  • the memory 61 includes at least one type of readable storage medium, including flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disk, and so on.
  • the computer-readable storage medium may be non-volatile or volatile.
  • the memory 61 may be an internal storage unit of the computer device 6, such as a hard disk or a memory of the computer device 6.
  • the memory 61 may also be an external storage device of the computer device 6, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card equipped on the computer device 6.
  • the memory 61 may also include both the internal storage unit of the computer device 6 and its external storage device.
  • the memory 61 is generally used to store an operating system and various application software installed in the computer device 6, such as computer-readable instructions of a semantic recall method.
  • the memory 61 can also be used to temporarily store various types of data that have been output or will be output.
  • the processor 62 may be a central processing unit (CPU), a controller, a microcontroller, a microprocessor, or other data processing chips.
  • the processor 62 is generally used to control the overall operation of the computer device 6.
  • the processor 62 is configured to execute computer-readable instructions or process data stored in the memory 61, for example, computer-readable instructions for executing the semantic recall method.
  • the network interface 63 may include a wireless network interface or a wired network interface, and the network interface 63 is generally used to establish a communication connection between the computer device 6 and other electronic devices.
  • the computer device realizes that, without changing the accuracy of the original model, the representation layer and the output layer of the traditional model are split into a sentence vector generator and a splicer, respectively.
  • when obtaining sentence vectors, the data only needs to be processed by a single sentence vector generator, and the sentence vector splicer then splices the processed data with the candidate sentence vectors; the overall model structure is not needed, which increases the concurrency of model processing.
  • This application also provides another implementation, namely a computer-readable storage medium that stores a semantic recall program; the semantic recall program can be executed by at least one processor to cause the at least one processor to execute the steps of the semantic recall method as described above.
  • the computer-readable storage medium realizes that, without changing the accuracy of the original model, the representation layer and the output layer of the traditional model are split into a sentence vector generator and a splicer, respectively.
  • when obtaining sentence vectors, only a single sentence vector generator is needed to process the data, and a sentence vector splicer is then used to splice the processed data with the candidate sentence vectors; the overall model structure is not needed, which improves the concurrency of model processing.
  • This improves the model's processing efficiency and the accuracy of question-and-answer matching when processing corpus data, and the approach can be applied to many different types of models, with good transferability and high scalability.
  • The technical solution of this application, in essence or in the part contributing to the existing technology, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions to cause a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, etc.) to execute the methods described in the various embodiments of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Provided is a semantic recall method, belonging to the field of artificial intelligence, comprising: upon receiving online query data, obtaining, on the basis of a sentence vector generator, an online sentence vector corresponding to the online query data (S200); obtaining stored candidate sentence vectors (S300); matching the online sentence vector and the candidate sentence vectors on the basis of a sentence vector splicer to obtain the similarity between the online sentence vector and each candidate sentence vector (S400); and sorting the candidate sentence vectors in descending order of similarity, and returning, as the correct answer, the answer of the candidate question corresponding to the first-ranked candidate sentence vector (S500). The method splits the representation layer and output layer of a conventional model into a sentence vector generator and a splicer without changing the accuracy of the original model, increasing the concurrency the model can process and improving both the model's processing efficiency on corpus data and the accuracy of question-answer matching.

Description

Semantic recall method, device, computer equipment and storage medium
This application claims priority to a Chinese patent application filed with the Chinese Patent Office on May 13, 2020, with application number 202010402690.9 and invention title "Semantic Recall Method, Device, Computer Equipment, and Storage Medium", the entire content of which is incorporated by reference in this application.
Technical field
This application relates to the field of artificial intelligence technology, and in particular to a semantic recall method, device, computer equipment, and storage medium.
Background
At present, semantic recall models are widely used in AI question answering systems. With the development of technology, AI question answering systems replace manual question answering in more and more places to improve processing efficiency. Semantic recall models are mainly based on traditional deep learning models, such as CNN, LSTM, and ESTM models.
However, with the rapid development of the information age, the corpus data that models need to process is becoming ever larger, the required precision ever higher, and the coverage ever wider. The inventor realized that current semantic recall models cannot efficiently process large amounts of corpus data: training is slow, convergence takes long, and memory usage is large, which ultimately leads to the technical problem of semantic recall models being inefficient at processing corpus data.
Summary of the invention
The purpose of the embodiments of this application is to propose a semantic recall method, device, computer equipment, and storage medium, aiming to solve the technical problem of the low efficiency of semantic recall models in processing corpus data.
In order to solve the above technical problem, an embodiment of this application provides a semantic recall method, which adopts the following technical solution:
A semantic recall method includes the following steps:
when online query data is received, obtaining an online sentence vector corresponding to the online query data based on a sentence vector generator;
obtaining stored candidate sentence vectors;
matching the online sentence vector and the candidate sentence vectors based on a sentence vector splicer to obtain the similarity between the online sentence vector and each candidate sentence vector;
sorting the candidate sentence vectors in descending order of similarity, and returning the answer to the candidate question corresponding to the first-ranked candidate sentence vector as the correct answer.
In order to solve the above technical problem, an embodiment of this application also provides a semantic recall device, which adopts the following technical solution:
a first obtaining module, configured to obtain, when online query data is received, the online sentence vector corresponding to the online query data based on a sentence vector generator;
a second obtaining module, configured to obtain stored candidate sentence vectors;
a splicing module, configured to match the online sentence vector and the candidate sentence vectors based on a sentence vector splicer to obtain the similarity between the online sentence vector and each candidate sentence vector;
a sorting module, configured to sort the candidate sentence vectors in descending order of similarity and return the answer to the candidate question corresponding to the first-ranked candidate sentence vector as the correct answer.
In order to solve the above technical problem, an embodiment of this application also provides a computer device, including a memory and a processor, and computer-readable instructions stored in the memory and executable on the processor, where the processor implements the following steps of the semantic recall method when executing the computer-readable instructions:
when online query data is received, obtaining an online sentence vector corresponding to the online query data based on a sentence vector generator;
obtaining stored candidate sentence vectors;
matching the online sentence vector and the candidate sentence vectors based on a sentence vector splicer to obtain the similarity between the online sentence vector and each candidate sentence vector;
sorting the candidate sentence vectors in descending order of similarity, and returning the answer to the candidate question corresponding to the first-ranked candidate sentence vector as the correct answer.
In order to solve the above technical problem, embodiments of this application also provide a computer-readable storage medium storing computer-readable instructions, where the computer-readable instructions, when executed by a processor, implement the following steps of the semantic recall method:
when online query data is received, obtaining an online sentence vector corresponding to the online query data based on a sentence vector generator;
obtaining stored candidate sentence vectors;
matching the online sentence vector and the candidate sentence vectors based on a sentence vector splicer to obtain the similarity between the online sentence vector and each candidate sentence vector;
sorting the candidate sentence vectors in descending order of similarity, and returning the answer to the candidate question corresponding to the first-ranked candidate sentence vector as the correct answer.
The details of one or more embodiments of this application are set forth in the following drawings and description; other features and advantages of this application will become apparent from the description, the drawings, and the claims.
With the above semantic recall method, device, computer equipment, and storage medium, when online query data (the input sentence) is received, the online sentence vector corresponding to the online query data is obtained based on the sentence vector generator; this online sentence vector is the vector-form data corresponding to the online query data. The stored candidate sentence vectors are then obtained, where each candidate sentence vector is the sentence vector corresponding to a candidate question pre-stored in the database. The online sentence vector and the candidate sentence vectors are matched based on the sentence vector splicer to obtain the similarity between the online sentence vector and each candidate sentence vector; the candidate sentence vectors can then be screened according to similarity to find the candidate sentence vector that best matches the online sentence vector. Specifically, the candidate sentence vectors are sorted in descending order of similarity, and the answer to the candidate question corresponding to the first-ranked candidate sentence vector is returned as the correct answer. This solves the technical problem of the low efficiency of semantic recall models in processing corpus data.
Description of the drawings
In order to explain the solution in this application more clearly, the following briefly introduces the drawings used in the description of the embodiments. Obviously, the drawings in the following description are only some embodiments of this application; those of ordinary skill in the art can obtain other drawings from these drawings without creative work.
Figure 1 is an exemplary system architecture diagram to which this application can be applied;
Figure 2 is a flowchart of an embodiment of the semantic recall method;
Figure 3 is a schematic diagram of a sentence vector generator;
Figure 4 is a schematic diagram of a sentence vector splicer;
Figure 5 is a schematic structural diagram of an embodiment of the semantic recall device according to this application;
Figure 6 is a schematic structural diagram of an embodiment of the computer device according to this application.
Reference signs: semantic recall device 600, first acquisition module 610, second acquisition module 620, splicing module 630, sorting module 640.
Detailed description
Unless otherwise defined, all technical and scientific terms used herein have the same meanings as commonly understood by those skilled in the technical field of this application. The terms used in the specification are only for describing specific embodiments and are not intended to limit the application. The terms "including" and "having" in the specification and claims of this application and in the above description of the drawings, and any variations thereof, are intended to cover non-exclusive inclusion. The terms "first", "second", and the like in the specification and claims of this application or in the above drawings are used to distinguish different objects, not to describe a specific order.
Reference to an "embodiment" herein means that a specific feature, structure, or characteristic described in conjunction with the embodiment may be included in at least one embodiment of this application. The appearance of the phrase in various places in the specification does not necessarily refer to the same embodiment, nor to an independent or alternative embodiment mutually exclusive with other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein can be combined with other embodiments.
In order to make the objectives, technical solutions, and advantages of this application clearer, the application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain this application and are not used to limit it. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art without creative work fall within the protection scope of this application.
As shown in Figure 1, the system architecture 100 may include terminal devices 101, 102, and 103, a network 104, and a server 105. The network 104 provides the medium for communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links, or fiber optic cables.
A user can use the terminal devices 101, 102, and 103 to interact with the server 105 through the network 104 to receive or send messages. Various communication client applications, such as web browser applications, shopping applications, search applications, instant messaging tools, email clients, and social platform software, may be installed on the terminal devices 101, 102, and 103.
The terminal devices 101, 102, and 103 may be various electronic devices with display screens that support web browsing, including but not limited to smart phones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop portable computers, desktop computers, and so on.
The server 105 may be a server that provides various services, for example, a background server that provides support for the pages displayed on the terminal devices 101, 102, and 103.
It should be noted that the semantic recall method provided in the embodiments of this application is generally executed by the server/terminal; correspondingly, the semantic recall device is generally installed in the server/terminal device.
It should be understood that the numbers of terminals, networks, and servers in Figure 1 are merely illustrative. There can be any number of terminal devices, networks, and servers according to implementation needs.
Continuing to refer to Figure 2, a flowchart of an embodiment of the semantic recall method according to this application is shown. The semantic recall method includes the following steps:
Step S200: when online query data is received, obtain the online sentence vector corresponding to the online query data based on the sentence vector generator.
Online query data is real-time query data received online. When the online query data is received, the online sentence vector corresponding to the online query data is obtained based on the sentence vector generator; the resulting online sentence vector is the sentence vector corresponding to the online query data. Specifically, when online query data is received, the online query data is a sentence. The sentence is input to the tokenizer layer of the sentence vector generator, which converts each character of the online query data into an ID format. Each ID is then passed through the embedding layer to obtain the character vector corresponding to each character in the online query data. Once the character vectors are obtained, convolution processing is applied to them to obtain the online sentence vector corresponding to the current online query data.
The sentence vector generator is an independent model structure for processing online query data. A traditional deep learning model usually includes a representation layer and an output layer; splitting the representation layer off from the traditional deep learning model and using the representation-layer part on its own yields the corresponding sentence vector generator. Taking a CNN model as an example, the sentence vector generator is shown in Figure 3.
As shown in Figure 3, q1(char) represents the input layer q1 of the sentence, that is, the online query data. It passes through the embedding layer to obtain the character vector corresponding to each character in the online query data. The character vectors then pass through (Conv+GlobalMaxPooling)*3, a three-layer convolutional neural network (where Conv is convolution and GlobalMaxPooling is global pooling), to obtain the convolution results. Concat splices the obtained convolution results, and output emits the spliced result, which is the online sentence vector corresponding to the online query data. Because three layers of convolution are performed, the results of each convolution layer must be spliced; the purpose of the multi-layer convolution is to make the obtained data more precise, so other models may not include concat.
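The tokenizer → embedding → (Conv+GlobalMaxPooling)*3 → concat pipeline of Figure 3 can be sketched in NumPy. The toy vocabulary, embedding size, and random filter weights here are illustrative assumptions, not the patent's trained parameters; the point is only the data flow.

```python
import numpy as np

rng = np.random.default_rng(42)
vocab = {"how": 1, "to": 2, "reset": 3, "password": 4}  # toy vocabulary
emb = rng.normal(size=(len(vocab) + 1, 8))              # embedding table, ID 0 = unknown


def sentence_vector(tokens, n_branches=3, n_filters=4):
    """Tokenize -> IDs -> embeddings -> three Conv+GlobalMaxPooling
    branches -> concat, mirroring the Figure 3 layout.

    Assumes the sentence has at least n_branches tokens.
    """
    ids = [vocab.get(t, 0) for t in tokens]   # tokenizer layer: token -> ID
    x = emb[ids]                              # embedding layer, shape (len, 8)
    branches = []
    for width in range(1, n_branches + 1):    # three conv branches of widths 1..3
        filt = rng.normal(size=(width, x.shape[1], n_filters))
        conv = np.stack([
            np.einsum("wd,wdf->f", x[i:i + width], filt)
            for i in range(len(ids) - width + 1)
        ])                                    # valid 1-D convolution
        branches.append(conv.max(axis=0))     # GlobalMaxPooling over positions
    return np.concatenate(branches)           # concat -> sentence vector


vec = sentence_vector(["how", "to", "reset", "password"])
```

The concatenated output has `n_branches * n_filters` dimensions regardless of sentence length, which is what lets vectors for sentences of different lengths be compared downstream.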
Step S300: obtain the stored candidate sentence vectors.
The candidate sentence vectors are pre-stored in the database; each is obtained in advance by passing the corresponding candidate question through the sentence vector generator. In the question answering system, candidate questions are obtained in advance, and their sentence vectors are generated offline through the offline sentence vector generator. When the candidate sentence vector corresponding to a candidate question is obtained, it is stored in the database.
Step S400: match the online sentence vector and the candidate sentence vectors based on the sentence vector splicer to obtain the similarity between the online sentence vector and each candidate sentence vector.
在获取到候选句向量与线上句向量时,基于向量拼接器对该候选句向量和线上句向量的相似度进行计算。具体地,计算该候选句向量和线上句向量在不同衡量维度上的差异特征向量,最后组合拼接不同衡量维度上的差异特征向量,即可得到该候选句向量和线上句向量最终的差异特征向量。在得到该最终的差异特征向量时,对该差异特征向量进行正则化处理,即可得到线上句向量和候选句向量的相似度。When the candidate sentence vector and the online sentence vector are obtained, the similarity between the candidate sentence vector and the online sentence vector is calculated based on the vector splicer. Specifically, the difference feature vectors of the candidate sentence vector and the online sentence vector in different measurement dimensions are calculated, and finally the difference feature vectors in different measurement dimensions are combined and spliced to obtain the final difference between the candidate sentence vector and the online sentence vector Feature vector. When the final difference feature vector is obtained, regularization processing is performed on the difference feature vector to obtain the similarity between the online sentence vector and the candidate sentence vector.
以CNN模型为例,在该CNN模型中,该句向量拼接器如图4所示。由图4可知,该模型中,q1(实时)表示线上句向量,q2(离线)表示候选句向量,在获取到线上句向量及候选句向量时,将该线上句向量及候选句向量输入至Diff+Mul+Max;Diff+Mul+Max则对该线上句向量及候选句向量,从减法、乘法和最大值三个衡量维度进行差异特征向量计算,由此得到该线上句向量及候选句向量在三个维度的差异特征向量;concat对在该三个衡量维度计算得到的差异特征向量进行拼接,得到最终的差异特征向量;将该最终的差异特征向量输入至3*(Dense+BatchNormalization+Relu+Dropout),使其对拼接得到的最终的差异特征向量进行正则化处理。而后将该正则化处理的结果输入至Sigmoid,Sigmoid为激活函数,用
Figure PCTCN2020118454-appb-000001
表示,将该正则化处理的结果通过该激活函数,即可得到该线上句向量和候选句向量的相似度。
Take the CNN model as an example; in this model, the sentence vector splicer is shown in Figure 4. As Figure 4 shows, q1 (real-time) denotes the online sentence vector and q2 (offline) denotes the candidate sentence vector. Once both are obtained, they are input to Diff+Mul+Max, which computes difference feature vectors on three measurement dimensions (subtraction, multiplication, and maximum), yielding the difference feature vectors of the online sentence vector and the candidate sentence vector in those three dimensions. concat splices the three per-dimension difference feature vectors into the final difference feature vector, which is then passed through 3*(Dense+BatchNormalization+Relu+Dropout) to regularize it. The regularized result is finally input to Sigmoid, the activation function σ(x) = 1/(1 + e^(-x)); passing the regularized result through this activation function gives the similarity between the online sentence vector and the candidate sentence vector.
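The patent gives no code for the splicer; the following Python sketch illustrates the pipeline it describes, under stated assumptions: the three dense blocks are replaced by a single linear layer with caller-supplied weights, and all function names are illustrative.

```python
import math

def diff_features(q1, q2):
    """Build the concatenated difference feature vector from two
    sentence vectors, on the splicer's three measurement dimensions:
    subtraction (Diff), multiplication (Mul), and maximum (Max)."""
    sub = [a - b for a, b in zip(q1, q2)]       # Diff
    mul = [a * b for a, b in zip(q1, q2)]       # Mul
    mx = [max(a, b) for a, b in zip(q1, q2)]    # Max
    return sub + mul + mx                        # concat

def sigmoid(x):
    """Activation mapping a scalar score into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def similarity(q1, q2, weights, bias=0.0):
    """Score a vector pair: a single linear layer stands in for the
    3x(Dense+BatchNormalization+Relu+Dropout) stack, then Sigmoid."""
    feats = diff_features(q1, q2)
    score = sum(w * f for w, f in zip(weights, feats)) + bias
    return sigmoid(score)
```

With zero weights the score is 0, so the similarity is exactly 0.5, which is the decision boundary used later in the text.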
步骤S500,根据所述相似度对所述候选句向量进行降序排序,并返回排序第一的候选句向量对应的候选问题的答案作为正确答案。Step S500: Sort the candidate sentence vectors in descending order according to the similarity, and return the answer to the candidate question corresponding to the candidate sentence vector ranked first as the correct answer.
在确定该线上句向量与候选句向量之间的相似度时，根据该相似度对候选句向量对应的候选问题进行降序排序，即从大到小排列。选取问题库中与该线上句向量相似度最高的候选句向量对应的候选问题的答案为正确答案。将该正确答案作为线上查询数据的正确答案，推送至用户界面。When the similarity between the online sentence vector and each candidate sentence vector has been determined, the candidate questions corresponding to the candidate sentence vectors are sorted in descending order of similarity, that is, from largest to smallest. The answer to the candidate question whose candidate sentence vector has the highest similarity to the online sentence vector is selected as the correct answer, which is then pushed to the user interface as the correct answer for the online query data.
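The descending sort and top-1 selection of step S500 can be sketched as follows; the shape of the candidate store and the injected `similarity_fn` are illustrative assumptions, not details from the patent.

```python
def best_answer(online_vec, candidates, similarity_fn):
    """Rank candidate questions by similarity to the online sentence
    vector in descending order and return the top candidate's answer.

    `candidates` is a list of (candidate_vector, answer) pairs.
    """
    ranked = sorted(
        candidates,
        key=lambda item: similarity_fn(online_vec, item[0]),
        reverse=True,  # descending: highest similarity first
    )
    return ranked[0][1]  # answer of the rank-1 candidate
```

Any scoring function can be plugged in; in the patent it would be the splicer's sigmoid output.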
在本实施例中，实现了在不改变原有模型的精度下，将传统模型的表征层和输出层拆分开分别作为句向量生成器和拼接器，在获取句向量时，只需要通过单个的句向量生成器对数据进行处理，再通过句向量拼接器对处理得到的数据与候选句向量进行拼接，而不需要整体的模型结构，提高了模型处理的并发量，提高了模型在处理语料资料时的处理效率及问答匹配的准确率。并且可以被应用于各种不同类型的模型上，具有迁移性，可拓展性高。本申请属于人工智能技术领域，在机器学习、深度学习中均具有较好的表现。In this embodiment, the representation layer and the output layer of a traditional model are split into a sentence vector generator and a sentence vector splicer, respectively, without changing the accuracy of the original model. To obtain a sentence vector, only the standalone sentence vector generator is needed to process the data, and the sentence vector splicer then splices the processed data with the candidate sentence vectors; the full model structure is no longer required. This increases the concurrency of model processing and improves both the processing efficiency on corpus data and the accuracy of question-answer matching. The approach can also be applied to many different types of models, and is therefore transferable and highly extensible. This application belongs to the field of artificial intelligence technology and performs well in both machine learning and deep learning.
在本申请的一些实施例中,步骤S200,包括:In some embodiments of the present application, step S200 includes:
基于句向量生成器,获取所述线上查询数据的字向量;Obtain the word vector of the online query data based on the sentence vector generator;
对所述字向量进行多层卷积处理,得到所述线上查询数据的线上句向量。Multi-layer convolution processing is performed on the word vector to obtain the online sentence vector of the online query data.
线上查询数据为单个的句子，其中，字向量即为该单个的句子中每个字对应的向量。在接收到该线上查询数据时，根据该线上查询数据中的词频、TF-IDF(term frequency–inverse document frequency)等特征对该线上查询数据中的每个字进行id化，得到该线上查询数据中每个字对应的ID。而后基于embedding(嵌入)层获取该线上查询数据中每个ID对应的字向量，在该embedding层中ID与字向量之间为映射关系，在得到每个字的ID时，通过该embedding层即可得到该线上查询数据中每个字对应的字向量。The online query data is a single sentence, and a word vector is the vector corresponding to each word in that sentence. When the online query data is received, each word in it is ID-ized according to features such as word frequency and TF-IDF (term frequency–inverse document frequency), giving the ID corresponding to each word in the online query data. The word vector corresponding to each ID is then obtained through the embedding layer: the embedding layer holds a mapping between IDs and word vectors, so once the ID of each word is obtained, the word vector corresponding to each word in the online query data can be obtained through the embedding layer.
在得到线上查询数据中每个字对应的字向量时，将该线上查询数据所包括的所有字向量输入至卷积神经网络，基于该卷积神经网络对该字向量进行处理得到卷积结果，该卷积结果即为该线上查询数据对应的一组句向量。然而，一组句向量并不能完全反应当前该线上查询数据的特征信息，因此，对该线上查询数据中得到的所有字向量进行多层卷积处理，得到多组卷积结果。将得到的多组卷积结果拼接在一起，得到的最终结果即为该线上查询数据对应的线上句向量。When the word vector for each word in the online query data has been obtained, all the word vectors of the online query data are input to a convolutional neural network, which processes them to produce a convolution result; this convolution result is one set of sentence vectors corresponding to the online query data. However, a single set of sentence vectors cannot fully reflect the feature information of the current online query data, so multi-layer convolution processing is applied to all the word vectors obtained from the online query data, producing multiple sets of convolution results. These sets of convolution results are spliced together, and the final result is the online sentence vector corresponding to the online query data.
在本实施例中，实现了根据字向量对线上查询数据的线上句向量的获取，不需要完整的模型结构，只需要句向量生成器即可获取到对应的线上句向量，提高了模型对语料数据的处理效率，并进一步地提高了模型处理的并发量。In this embodiment, the online sentence vector of the online query data is obtained from the word vectors without a complete model structure; the sentence vector generator alone suffices to obtain the corresponding online sentence vector. This improves the model's processing efficiency on corpus data and further increases the concurrency of model processing.
在本申请的一些实施例中,上述基于句向量生成器,获取所述线上查询数据的字向量包括:In some embodiments of the present application, based on the sentence vector generator described above, acquiring the word vector of the online query data includes:
基于句向量生成器的标记解析层对所述线上查询数据中的每个字进行ID化处理,得到所述线上查询数据中的每个字对应的ID;Based on the tag analysis layer of the sentence vector generator, IDize each word in the online query data to obtain an ID corresponding to each word in the online query data;
基于所述句向量生成器的嵌入层对所述ID进行特征编码,得到所述线上查询数据中每个字对应的字向量。The ID is feature-encoded based on the embedding layer of the sentence vector generator to obtain a word vector corresponding to each word in the online query data.
标记解析层即tokenizer层，根据该tokenizer层即可对接收到的线上查询数据中的每个字进行ID化。具体地，在接收到线上查询数据时，获取该线上查询数据的词频、tfidf等特征，tokenizer层基于该特征，即可对该线上查询数据中的每个字进行id化，例如将词频为5的字划分ID为001。在tokenizer层对该线上查询数据中的每个字的id化处理完成后，将得到的每个字的ID输入至嵌入层，即embedding层。Embedding层则根据该ID确定每个字对应的字向量，即基于该embedding层对每个字的ID进行特征编码，确定每个字与多维空间的映射，由此得到当前输入的线上查询数据中每个字对应的字向量。The token analysis layer is the tokenizer layer, which ID-izes each word in the received online query data. Specifically, when online query data is received, features such as its word frequency and TF-IDF are obtained, and based on these features the tokenizer layer assigns an ID to each word; for example, a word with a word frequency of 5 may be assigned the ID 001. After the tokenizer layer has finished ID-izing every word in the online query data, the resulting IDs are input to the embedding layer. The embedding layer determines the word vector corresponding to each word according to its ID; that is, it feature-encodes each word's ID and determines the mapping from each word into a multi-dimensional space, thereby producing the word vector corresponding to each word in the currently input online query data.
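The tokenizer and embedding layers above can be sketched as two plain lookups. The frequency-ordered ID scheme and all names here are illustrative assumptions; the patent only states that ID-ization is based on features such as word frequency and TF-IDF.

```python
from collections import Counter

def build_vocab(corpus):
    """Assign each character an integer ID ordered by frequency,
    a stand-in for the tokenizer layer's frequency-based ID-ization
    (the exact scheme in the patent is not specified)."""
    freq = Counter(ch for sentence in corpus for ch in sentence)
    # most frequent character gets the smallest ID, starting at 1
    return {ch: i + 1 for i, (ch, _) in enumerate(freq.most_common())}

def tokenize(sentence, vocab):
    """tokenizer layer: one ID per character (0 = unknown)."""
    return [vocab.get(ch, 0) for ch in sentence]

def embed(ids, table):
    """embedding layer: a plain ID -> word-vector lookup."""
    return [table[i] for i in ids]
```

In the real model the lookup table holds learned embeddings; here it is just a dictionary for illustration.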
在本实施例中，实现了根据标记解析层及嵌入层对线上查询数据的解析提取，提高了对线上查询数据的解析效率及准确率，进一步提高了对线上查询数据获取对应匹配数据(即正确答案)的效率及准确率。In this embodiment, the online query data is parsed and its features extracted by the token analysis layer and the embedding layer, which improves the efficiency and accuracy of parsing the online query data and further improves the efficiency and accuracy of obtaining the matching data (that is, the correct answer) for the online query data.
在本申请的一些实施例中,上述对所述字向量进行多层卷积处理,得到所述线上查询数据的线上句向量包括:In some embodiments of the present application, performing multi-layer convolution processing on the word vector to obtain the online sentence vector of the online query data includes:
基于卷积神经网络对所述字向量进行多层卷积处理,得到所述线上查询数据对应的语义特征;Performing multi-layer convolution processing on the word vector based on a convolutional neural network to obtain semantic features corresponding to the online query data;
将每次得到的所述语义特征拼接在一起,得到所述线上查询数据的线上句向量。The semantic features obtained each time are spliced together to obtain the online sentence vector of the online query data.
在得到线上查询数据中每个字对应的字向量时，确定该线上查询数据基于字的语义特征，该语义特征即为该线上查询数据中基于字的逻辑表示。通过卷积神经网络(如CNN三层卷积神经网络)，可以基于得到的字向量，对该线上查询数据的语义特征进行提取。具体地，将线上查询数据中每个字的字向量经过该卷积神经网络进行卷积处理，所得到的卷积结果即为该线上查询数据基于字的语义特征，该语义特征亦为一组向量。通过卷积神经网络对该字向量进行多层卷积，将每次得到的语义特征拼接在一起，即可得到该线上查询数据对应的线上句向量。如通过一个三层卷积神经网络对该线上查询数据中的所有字向量进行三层卷积，将该三层卷积后的结果，即语义特征，拼接在一起，输出即为该线上查询数据对应的线上句向量。When the word vector for each word in the online query data has been obtained, the word-level semantic features of the online query data are determined; these semantic features are the word-level logical representation of the online query data. A convolutional neural network (such as a three-layer CNN) can extract the semantic features of the online query data from the obtained word vectors. Specifically, the word vector of each word in the online query data is convolved by the convolutional neural network, and the resulting convolution output is the word-level semantic feature of the online query data, which is itself a set of vectors. The word vectors are convolved over multiple layers, and the semantic features obtained at each layer are spliced together to give the online sentence vector corresponding to the online query data. For example, a three-layer convolutional neural network applies three layers of convolution to all the word vectors of the online query data, the three layers' results (the semantic features) are spliced together, and the output is the online sentence vector corresponding to the online query data.
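A single Conv+GlobalMaxPooling branch and the final concatenation can be sketched in plain Python. This is a dependency-free illustration, not the patent's implementation: the kernels are hand-written lists standing in for learned filters, and real layers would operate on batched tensors.

```python
def conv_global_max(word_vecs, kernels):
    """One Conv+GlobalMaxPooling branch: slide each kernel over the
    sequence of word vectors and keep the maximum response per kernel."""
    k = len(kernels[0])        # kernel width, in words
    dim = len(word_vecs[0])    # word-vector dimensionality
    out = []
    for kernel in kernels:
        responses = []
        for start in range(len(word_vecs) - k + 1):
            window = word_vecs[start:start + k]
            # dot product of the window with the kernel
            responses.append(sum(
                kernel[i][d] * window[i][d]
                for i in range(k) for d in range(dim)))
        out.append(max(responses))  # GlobalMaxPooling
    return out

def sentence_vector(word_vecs, branches):
    """Run every (Conv+GlobalMaxPooling) branch and concat the pooled
    features into the online sentence vector."""
    vec = []
    for kernels in branches:
        vec.extend(conv_global_max(word_vecs, kernels))
    return vec
```

Each branch contributes one pooled feature per kernel, and concat simply joins the branches, mirroring the (Conv+GlobalMaxPooling)*3 plus concat structure described for the generator.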
在本实施例中，实现了根据语义特征拼接得到线上查询数据对应的线上句向量，提高了对线上查询数据对应线上句向量获取的准确性，进一步提高了在根据该线上句向量进行匹配得到正确答案的准确率。In this embodiment, the online sentence vector corresponding to the online query data is obtained by splicing semantic features, which improves the accuracy of obtaining that online sentence vector and further improves the accuracy of the correct answers obtained by matching against it.
在本申请的一些实施例中,步骤S300,包括:In some embodiments of the present application, step S300 includes:
获取问题库中存储的候选问题;Obtain candidate questions stored in the question library;
基于所述句向量生成器对所述候选问题进行离线计算,得到所述候选问题对应的候选句向量。Perform offline calculation on the candidate question based on the sentence vector generator to obtain a candidate sentence vector corresponding to the candidate question.
候选问题为预先收集的问题，该候选问题被预先存储于问题库中。在得到该候选问题时，对该问题库中的所有的候选问题一一进行候选句向量的计算。具体地，在获取到候选问题时，基于句向量生成器对候选问题进行线下计算，其计算过程与线上句向量的计算方式相同。但候选句向量在无需网络连接下亦可基于该句向量生成器离线计算；而对于线上句向量，该句向量生成器只对接收到的线上问题进行实时的计算。例如，{糖尿病是怎么形成呢:[0.76,0.54,0.77,…,0.65,0.23,0.13],糖尿病应该怎么治疗:[0.12,0.25,0.65,…,0.11,0.86,0.92]}，其中，“糖尿病是怎么形成呢”“糖尿病应该怎么治疗”后面的数字即为这两句话分别对应的句向量表示。在该两个问题为候选问题时，则将该问题以候选句向量的形式进行存储。Candidate questions are collected in advance and pre-stored in the question library. When the candidate questions are obtained, a candidate sentence vector is computed for every candidate question in the library, one by one. Specifically, the sentence vector generator computes the candidate questions offline, using the same calculation as for the online sentence vector; the difference is that candidate sentence vectors can be computed offline by the sentence vector generator without a network connection, whereas for online sentence vectors the generator only performs real-time calculation on received online questions. For example, in {糖尿病是怎么形成呢: [0.76, 0.54, 0.77, …, 0.65, 0.23, 0.13], 糖尿病应该怎么治疗: [0.12, 0.25, 0.65, …, 0.11, 0.86, 0.92]}, the numbers after each of the two questions are the sentence vector representations of those sentences. When these two questions are candidate questions, they are stored in the form of candidate sentence vectors.
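The offline pass over the question library amounts to building the question-to-vector mapping shown in the example above. A minimal sketch, with `precompute_candidates` and the injected `sentence_vector_fn` as illustrative names:

```python
def precompute_candidates(questions, sentence_vector_fn):
    """Offline pass: run the sentence vector generator once over every
    candidate question and keep a {question: vector} store, mirroring
    the example mapping in the text. `sentence_vector_fn` stands in
    for whatever the generator exposes."""
    return {q: sentence_vector_fn(q) for q in questions}
```

At serving time this store is read directly, so no candidate vectors need to be computed online.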
在本实施例中,实现了对候选问题的候选句向量的计算,并通过对候选句向量的预先计算及存储,节省了在问答匹配时的匹配时长,提高了对答案的获取效率。In this embodiment, the calculation of the candidate sentence vector of the candidate question is realized, and the pre-calculation and storage of the candidate sentence vector saves the matching time during question and answer matching, and improves the efficiency of obtaining answers.
在本申请的一些实施例中,上述基于所述句向量生成器对所述候选问题进行离线计算,得到所述候选问题对应的候选句向量的步骤之后,还包括:In some embodiments of the present application, after the above step of performing offline calculation on the candidate question based on the sentence vector generator to obtain the candidate sentence vector corresponding to the candidate question, the method further includes:
获取每个所述候选句向量对应的唯一标识信息;Acquiring unique identification information corresponding to each candidate sentence vector;
根据所述标识信息,将所述候选句向量以字典的形式,与所述候选问题关联存储至数据库中。According to the identification information, the candidate sentence vector is stored in a database in a dictionary in association with the candidate question.
在得到该候选句向量时,则以字典的形式对该候选句向量进行存储。具体地,每个候选句向量对应有唯一的标识信息,根据该标识信息对候选句向量和其对应的候选问题进行关联存储。在对候选句向量及对应的候选问题进行提取时,根据该标识信息即可直接进行提取。When the candidate sentence vector is obtained, the candidate sentence vector is stored in the form of a dictionary. Specifically, each candidate sentence vector corresponds to unique identification information, and the candidate sentence vector and its corresponding candidate question are associated and stored according to the identification information. When extracting the candidate sentence vector and the corresponding candidate question, it can be directly extracted based on the identification information.
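The dictionary-form storage keyed by unique identification information can be sketched as follows. Sequential integer IDs are an assumption for illustration; the patent only requires that each candidate sentence vector have a unique identifier.

```python
def store_candidates(candidates):
    """Index (question, vector) pairs by a unique ID so that both can
    later be extracted directly by that ID, as the text describes."""
    db = {}
    for uid, (question, vector) in enumerate(candidates):
        db[uid] = {"question": question, "vector": vector}
    return db

def fetch(db, uid):
    """Direct extraction by the identification information."""
    entry = db[uid]
    return entry["question"], entry["vector"]
```

Because the ID is the dictionary key, extraction is a single lookup rather than a scan of the question library.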
在本实施例中,实现了对候选句向量通过字典的形式进行预存储,进一步地提高了在匹配时对候选句向量的提取效率,节省了问答匹配时长。In this embodiment, pre-storage of candidate sentence vectors in the form of a dictionary is realized, which further improves the extraction efficiency of candidate sentence vectors during matching, and saves the duration of question and answer matching.
在本申请的一些实施例中,所述语义召回方法还包括:In some embodiments of the present application, the semantic recall method further includes:
计算所述线上句向量和所述候选句向量在乘法、减法和最大值三个衡量维度上的差异特征向量;Calculating the difference feature vectors of the online sentence vector and the candidate sentence vector in the three measurement dimensions of multiplication, subtraction, and maximum;
将所述三个衡量维度上的差异特征向量拼接在一起,得到最终的差异特征向量;Splicing the difference feature vectors in the three measurement dimensions together to obtain the final difference feature vector;
对所述最终的差异特征向量进行正则化处理,得到处理结果;Performing regularization processing on the final difference feature vector to obtain a processing result;
对所述处理结果进行函数处理,得到所述线上句向量和所述候选句向量之间的相似度。Performing function processing on the processing result to obtain the similarity between the online sentence vector and the candidate sentence vector.
在获取到线上句向量和候选句向量时，分别计算该线上句向量和候选句向量在乘法、减法和最大值三个衡量维度上的差异特征向量。其中，该乘法即将该线上句向量和候选句向量进行点乘，得到的结果即为该线上句向量和候选句向量在乘法衡量维度上的差异特征向量；减法即为将该线上句向量和候选句向量进行减法运算，得到的结果即为该线上句向量和候选句向量在减法衡量维度上的差异特征向量；最大值即为将该线上句向量和候选句向量取最大值，得到的最大值即为该线上句向量和候选句向量在最大值衡量维度上的差异特征向量。将该乘法、减法和最大值三个衡量维度上分别对应的差异特征值拼接在一起，得到该线上句向量和候选句向量最终的差异特征向量。其中，该衡量维度包括但不限于乘法、减法和最大值三个衡量维度，还可以包括最小值等衡量维度。When the online sentence vector and the candidate sentence vector are obtained, their difference feature vectors on the three measurement dimensions of multiplication, subtraction, and maximum are calculated separately. Multiplication means performing dot multiplication on the online sentence vector and the candidate sentence vector, and the result is their difference feature vector on the multiplication measurement dimension. Subtraction means performing a subtraction operation on the online sentence vector and the candidate sentence vector, and the result is their difference feature vector on the subtraction measurement dimension. Maximum means taking the maximum of the online sentence vector and the candidate sentence vector, and the resulting maximum is their difference feature vector on the maximum measurement dimension. The difference feature values on the multiplication, subtraction, and maximum measurement dimensions are spliced together to obtain the final difference feature vector of the online sentence vector and the candidate sentence vector. The measurement dimensions include, but are not limited to, multiplication, subtraction, and maximum; dimensions such as minimum may also be included.
在得到该最终的差异特征向量时，对该差异特征向量进行正则化，经过dense层降维和激活函数sigmoid处理，其中，通过sigmoid函数可以将变量映射到0到1之间，由此即可得到一个输出结果为0至1的概率值。根据该概率值来衡量线上句向量与候选句向量之间的相似度；如在概率大于0.5时，确定线上句向量和候选句向量相似，否则，则不相似。When the final difference feature vector is obtained, it is regularized and then processed by dense-layer dimensionality reduction and the sigmoid activation function. The sigmoid function maps a variable into the interval between 0 and 1, so the output is a probability value from 0 to 1. This probability is used to measure the similarity between the online sentence vector and the candidate sentence vector; for example, when the probability is greater than 0.5, the online sentence vector and the candidate sentence vector are determined to be similar, and otherwise they are not.
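The 0.5 decision rule at the end of this step can be made concrete; the function name and the returned (decision, probability) pair are illustrative choices, not from the patent.

```python
import math

def is_similar(score, threshold=0.5):
    """Map a regularized score through sigmoid and apply the 0.5
    decision rule from the text: probability > 0.5 means the online
    and candidate sentence vectors are treated as similar."""
    prob = 1.0 / (1.0 + math.exp(-score))
    return prob > threshold, prob
```

Note that a score of exactly 0 gives probability 0.5, which this strict-inequality rule classifies as not similar.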
在本实施例中，实现了对线上句向量及候选句向量的拼接匹配，同样无需整个模型的处理，提高了模型的处理效率，并通过相似度输出确定匹配度最高的候选句向量，进一步地提高了对问题答案获取的准确率。In this embodiment, splicing-based matching of the online sentence vector and the candidate sentence vectors is realized, again without running the entire model, which improves the processing efficiency of the model; the candidate sentence vector with the highest matching degree is determined through the similarity output, which further improves the accuracy of obtaining answers to questions.
本领域普通技术人员可以理解实现上述实施例方法中的全部或部分流程,是可以通过计算机可读指令来指令相关的硬件来完成,该计算机可读指令可存储于一计算机可读取存储介质中,该计算机可读指令在执行时,可包括如上述各方法的实施例的流程。其中,前述的存储介质可为磁碟、光盘、只读存储记忆体(Read-Only Memory,ROM)等非易失性存储介质,或随机存储记忆体(Random Access Memory,RAM)等。Those of ordinary skill in the art can understand that all or part of the processes in the above-mentioned embodiment methods can be implemented by instructing relevant hardware through computer-readable instructions, which can be stored in a computer-readable storage medium. When the computer-readable instructions are executed, they may include the processes of the above-mentioned method embodiments. Among them, the aforementioned storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disc, a read-only memory (Read-Only Memory, ROM), or a random access memory (Random Access Memory, RAM), etc.
应该理解的是，虽然附图的流程图中的各个步骤按照箭头的指示依次显示，但是这些步骤并不是必然按照箭头指示的顺序依次执行。除非本文中有明确的说明，这些步骤的执行并没有严格的顺序限制，其可以以其他的顺序执行。而且，附图的流程图中的至少一部分步骤可以包括多个子步骤或者多个阶段，这些子步骤或者阶段并不必然是在同一时刻执行完成，而是可以在不同的时刻执行，其执行顺序也不必然是依次进行，而是可以与其他步骤或者其他步骤的子步骤或者阶段的至少一部分轮流或者交替地执行。It should be understood that although the steps in the flowcharts of the drawings are displayed in sequence as indicated by the arrows, these steps are not necessarily executed in that order. Unless explicitly stated herein, the execution of these steps is not strictly limited in order, and they may be executed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily completed at the same time but may be executed at different times; their execution order is not necessarily sequential, and they may be performed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
进一步参考图5，作为对上述图2所示方法的实现，本申请提供了一种语义召回装置的一个实施例，该装置实施例与图2所示的方法实施例相对应，该装置具体可以应用于各种电子设备中。With further reference to FIG. 5, as an implementation of the method shown in FIG. 2 above, this application provides an embodiment of a semantic recall apparatus. The apparatus embodiment corresponds to the method embodiment shown in FIG. 2, and the apparatus can specifically be applied to various electronic devices.
如图5所示,本实施例所述的语义召回装置600包括:As shown in FIG. 5, the semantic recall device 600 in this embodiment includes:
第一获取模块610,用于在接收到线上查询数据时,基于句向量生成器获取所述线上查询数据对应的线上句向量;The first obtaining module 610 is configured to obtain the online sentence vector corresponding to the online query data based on the sentence vector generator when the online query data is received;
其中,所述第一获取模块610包括:Wherein, the first obtaining module 610 includes:
第一获取单元,用于基于句向量生成器,获取所述线上查询数据的字向量;The first obtaining unit is configured to obtain the word vector of the online query data based on the sentence vector generator;
第一处理单元,用于对所述字向量进行多层卷积处理,得到所述线上查询数据的线上句向量。The first processing unit is configured to perform multi-layer convolution processing on the word vector to obtain the online sentence vector of the online query data.
所述第一获取单元还包括:The first acquiring unit further includes:
第二处理单元,用于基于句向量生成器的标记解析层对所述线上查询数据中的每个字进行ID化处理,得到所述线上查询数据中的每个字对应的ID;The second processing unit is configured to perform ID processing on each word in the online query data based on the tag analysis layer of the sentence vector generator to obtain an ID corresponding to each word in the online query data;
第三处理单元,用于基于所述句向量生成器的嵌入层对所述ID进行特征编码,得到所述线上查询数据中每个字对应的字向量。The third processing unit is configured to perform feature encoding on the ID based on the embedding layer of the sentence vector generator to obtain a word vector corresponding to each word in the online query data.
所述第一处理单元还包括:The first processing unit further includes:
第四处理单元,用于基于卷积神经网络对所述字向量进行多层卷积处理,得到所述线上查询数据对应的语义特征;A fourth processing unit, configured to perform multi-layer convolution processing on the word vector based on a convolutional neural network to obtain semantic features corresponding to the online query data;
第一拼接单元,用于将每次得到的所述语义特征拼接在一起,得到所述线上查询数据的线上句向量。The first splicing unit is used to splice the semantic features obtained each time together to obtain the online sentence vector of the online query data.
线上查询数据为在线接收到的实时查询数据。在接收到该线上查询数据时，则基于句向量生成器获取该线上查询数据对应的线上句向量。其中，得到的该线上句向量即为该线上查询数据对应的句向量。具体地，在接收到线上查询数据时，该线上查询数据为一个句子，将该句子输入至句向量生成器中的tokenizer层，基于该tokenizer层对该线上查询数据中的字进行id化，即将句子中的每个字转化为ID的格式。而后将该ID通过embedding(嵌入)层，即可得到该线上查询数据中每个字对应的字向量。在得到该字向量时，对该字向量进行卷积处理则可以得到当前该线上查询数据对应的线上句向量。Online query data is real-time query data received online. When the online query data is received, the online sentence vector corresponding to it is obtained based on the sentence vector generator; this online sentence vector is the sentence vector corresponding to the online query data. Specifically, the received online query data is a sentence, which is input to the tokenizer layer in the sentence vector generator; based on the tokenizer layer, the words in the online query data are ID-ized, that is, each word in the sentence is converted into ID format. The IDs are then passed through the embedding layer to obtain the word vector corresponding to each word in the online query data. Once the word vectors are obtained, convolution processing on them yields the online sentence vector corresponding to the current online query data.
句向量生成器为对线上查询数据进行处理的独立模型结构，传统的深度学习模型通常包括表征层和输出层，将传统深度学习模型的表征层和输出层拆分开，将表征层的部分作为句向量生成器，即得到对应的句向量生成器。以CNN模型为例，在该CNN模型中，该句向量生成器如图3所示。The sentence vector generator is an independent model structure for processing online query data. A traditional deep learning model usually includes a representation layer and an output layer; by splitting these apart and taking the representation-layer part on its own as the sentence vector generator, the corresponding sentence vector generator is obtained. Take the CNN model as an example: in the CNN model, the sentence vector generator is shown in Figure 3.
由图3可知，该模型中，q1(char)表示输入的语句q1，即线上查询数据，而后经过embedding嵌入层得到该线上查询数据中每个字对应的字向量，该字向量通过(Conv+GlobalMaxPooling)*3，即三层卷积神经网络进行卷积处理，得到卷积结果，其中，conv为卷积，GlobalMaxPooling为全局池化。Concat对得到的卷积结果进行拼接，output输出拼接后的结果，即可得到该线上查询数据对应的线上句向量。其中，由于进行了三层卷积，则要对每层卷积的结果进行拼接，多层卷积的目的是使得到的数据更精确，因此，其他模型不一定包括concat。As Figure 3 shows, in this model q1 (char) represents the input sentence q1, that is, the online query data. The embedding layer then produces the word vector for each word in the online query data, and these word vectors pass through (Conv+GlobalMaxPooling)*3, that is, three layers of convolution with global max pooling (conv is convolution, GlobalMaxPooling is global pooling), to obtain the convolution results. Concat splices the obtained convolution results, and output emits the spliced result, which is the online sentence vector corresponding to the online query data. Because three layers of convolution are performed, the results of each layer must be spliced; the purpose of multi-layer convolution is to make the obtained data more precise, so other models do not necessarily include concat.
第二获取模块620,用于获取存储的候选句向量;The second obtaining module 620 is configured to obtain stored candidate sentence vectors;
其中,所述第二获取模块620包括:Wherein, the second obtaining module 620 includes:
第二获取单元,用于获取问题库中存储的候选问题;The second obtaining unit is used to obtain candidate questions stored in the question library;
第一计算单元,用于基于所述句向量生成器对所述候选问题进行离线计算,得到所述候选问题对应的候选句向量。The first calculation unit is configured to perform offline calculation on the candidate question based on the sentence vector generator to obtain a candidate sentence vector corresponding to the candidate question.
第三获取单元,用于获取每个所述候选句向量对应的唯一标识信息;The third acquiring unit is configured to acquire the unique identification information corresponding to each candidate sentence vector;
存储单元,用于根据所述标识信息,将所述候选句向量以字典的形式,与所述候选问题关联存储至数据库中。The storage unit is configured to store the candidate sentence vector in a dictionary in the form of a dictionary in association with the candidate question in a database according to the identification information.
候选句向量被预先存储在数据库中，候选句向量为候选问题通过句向量生成器预先得到并存储。在问答系统中，预先获取候选问题，并通过线下的句向量生成器对候选问题的句向量进行离线生成，在得到该候选问题对应的候选句向量时，则将该候选句向量存储在数据库中。The candidate sentence vectors are pre-stored in the database; each candidate sentence vector is obtained in advance for a candidate question through the sentence vector generator and then stored. In the question answering system, candidate questions are obtained in advance and their sentence vectors are generated offline through the offline sentence vector generator; once the candidate sentence vector corresponding to a candidate question is obtained, it is stored in the database.
拼接模块630,用于基于句向量拼接器匹配所述线上句向量和所述候选句向量,得到所述线上句向量和所述候选句向量的相似度;The splicing module 630 is configured to match the online sentence vector and the candidate sentence vector based on a sentence vector splicer to obtain the similarity between the online sentence vector and the candidate sentence vector;
其中,所述拼接模块包括:Wherein, the splicing module includes:
第二计算单元,用于计算所述线上句向量和所述候选句向量在乘法、减法和最大值三个衡量维度上的差异特征向量;The second calculation unit is used to calculate the difference feature vector of the online sentence vector and the candidate sentence vector in the three measurement dimensions of multiplication, subtraction, and maximum;
第二拼接单元,用于将所述三个衡量维度上的差异特征向量拼接在一起,得到最终的差异特征向量;The second splicing unit is used to splice the difference feature vectors in the three measurement dimensions together to obtain the final difference feature vector;
第五处理单元,用于对所述最终的差异特征向量进行正则化处理,得到处理结果;A fifth processing unit, configured to perform regularization processing on the final difference feature vector to obtain a processing result;
第六处理单元,用于对所述处理结果进行函数处理,得到所述线上句向量和所述候选句向量之间的相似度。The sixth processing unit is configured to perform function processing on the processing result to obtain the similarity between the online sentence vector and the candidate sentence vector.
在获取到候选句向量与线上句向量时，基于向量拼接器对该候选句向量和线上句向量的相似度进行计算。具体地，计算该候选句向量和线上句向量在不同衡量维度上的差异特征向量，最后组合拼接不同衡量维度上的差异特征向量，即可得到该候选句向量和线上句向量最终的差异特征向量。在得到该最终的差异特征向量时，对该差异特征向量进行正则化处理，即可得到线上句向量和候选句向量的相似度。When the candidate sentence vector and the online sentence vector are obtained, the similarity between them is calculated by the sentence vector splicer. Specifically, difference feature vectors between the candidate sentence vector and the online sentence vector are computed on several measurement dimensions, and the per-dimension difference feature vectors are then combined and spliced to obtain the final difference feature vector of the pair. Once this final difference feature vector is obtained, regularization processing is applied to it to obtain the similarity between the online sentence vector and the candidate sentence vector.
以CNN模型为例,在该CNN模型中,该句向量拼接器如图4所示。由图4可知,该模型中,q1(实时)表示线上句向量,q2(离线)表示候选句向量,在获取到线上句向量及候选句向量时,将该线上句向量及候选句向量输入至Diff+Mul+Max;Diff+Mul+Max则对该线上句向量及候选句向量,从减法、乘法和最大值三个衡量维度进行差异特征向量计算,由此得到该线上句向量及候选句向量在三个维度的差异特征向量;concat对在该三个衡量 维度计算得到的差异特征向量进行拼接,得到最终的差异特征向量;将该最终的差异特征向量输入至3*(Dense+BatchNormalization+Relu+Dropout),使其对拼接得到的最终的差异特征向量进行正则化处理。而后将该正则化处理的结果输入至Sigmoid,Sigmoid为激活函数,用
σ(x) = 1/(1 + e^(-x))
表示,将该正则化处理的结果通过该激活函数,即可得到该线上句向量和候选句向量的相似度。
Take the CNN model as an example; in this model, the sentence vector splicer is shown in Figure 4. As Figure 4 shows, q1 (real-time) denotes the online sentence vector and q2 (offline) denotes the candidate sentence vector. Once both are obtained, they are input to Diff+Mul+Max, which computes difference feature vectors on three measurement dimensions (subtraction, multiplication, and maximum), yielding the difference feature vectors of the online sentence vector and the candidate sentence vector in those three dimensions. concat splices the three per-dimension difference feature vectors into the final difference feature vector, which is then passed through 3*(Dense+BatchNormalization+Relu+Dropout) to regularize it. The regularized result is finally input to Sigmoid, the activation function σ(x) = 1/(1 + e^(-x)); passing the regularized result through this activation function gives the similarity between the online sentence vector and the candidate sentence vector.
排序模块640,用于根据所述相似度对所述候选句向量进行降序排序,并返回排序第一的候选句向量对应的候选问题的答案作为正确答案。The sorting module 640 is configured to sort the candidate sentence vectors in descending order according to the similarity, and return the answer to the candidate question corresponding to the first-ranked candidate sentence vector as the correct answer.
在确定该线上句向量与候选句向量之间的相似度时，根据该相似度对候选句向量对应的候选问题进行降序排序，即从大到小排列。选取问题库中与该线上句向量相似度最高的候选句向量对应的候选问题的答案为正确答案。将该正确答案作为线上查询数据的正确答案，推送至用户界面。When the similarity between the online sentence vector and each candidate sentence vector has been determined, the candidate questions corresponding to the candidate sentence vectors are sorted in descending order of similarity, that is, from largest to smallest. The answer to the candidate question whose candidate sentence vector has the highest similarity to the online sentence vector is selected as the correct answer, which is then pushed to the user interface as the correct answer for the online query data.
To solve the above technical problems, an embodiment of the present application further provides a computer device. For details, please refer to FIG. 6, which is a block diagram of the basic structure of the computer device in this embodiment.
The computer device 6 includes a memory 61, a processor 62, and a network interface 63 that are communicatively connected to one another through a system bus. It should be noted that the figure only shows the computer device 6 with components 61-63; it should be understood, however, that not all of the illustrated components are required, and more or fewer components may be implemented instead. Those skilled in the art will understand that the computer device here is a device capable of automatically performing numerical calculation and/or information processing in accordance with preset or stored instructions, and that its hardware includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.
The computer device may be a desktop computer, a notebook, a palmtop computer, a cloud server, or another computing device. The computer device may perform human-computer interaction with a user through a keyboard, a mouse, a remote control, a touch panel, or a voice control device.
The memory 61 includes at least one type of readable storage medium, including flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disc, and the like. The computer-readable storage medium may be non-volatile or volatile. In some embodiments, the memory 61 may be an internal storage unit of the computer device 6, for example a hard disk or internal memory of the computer device 6. In other embodiments, the memory 61 may also be an external storage device of the computer device 6, for example a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash Card provided on the computer device 6. Of course, the memory 61 may also include both the internal storage unit of the computer device 6 and its external storage device. In this embodiment, the memory 61 is generally used to store the operating system and various application software installed on the computer device 6, such as the computer-readable instructions of the semantic recall method. In addition, the memory 61 may also be used to temporarily store various types of data that have been output or are to be output.
In some embodiments, the processor 62 may be a Central Processing Unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 62 is generally used to control the overall operation of the computer device 6. In this embodiment, the processor 62 is configured to run the computer-readable instructions stored in the memory 61 or to process data, for example to run the computer-readable instructions of the semantic recall method.
The network interface 63 may include a wireless network interface or a wired network interface, and is generally used to establish a communication connection between the computer device 6 and other electronic devices.
In this embodiment, the computer device splits the representation layer and the output layer of a traditional model into a sentence vector generator and a sentence vector splicer, respectively, without changing the accuracy of the original model. When a sentence vector is obtained, the data only needs to be processed by the single sentence vector generator, and the processed data is then spliced with the candidate sentence vectors by the sentence vector splicer, without running the model structure as a whole. This increases the concurrency of model processing and improves both the processing efficiency of the model on corpus data and the accuracy of question-answer matching. Moreover, the approach can be applied to various types of models, and is thus transferable and highly scalable.
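The split architecture described above can be sketched as follows. `encode` (the sentence vector generator) and `score` (the sentence vector splicer) are hypothetical stand-ins for the two halves of the split model; the point is that candidate questions are encoded once offline, while each online query needs only one generator pass plus lightweight splicer calls.

```python
class SemanticRecall:
    """Offline: encode every candidate question once with the generator.
    Online: encode only the incoming query, then score it against the
    cached candidate vectors with the splicer."""

    def __init__(self, encode, score, questions, answers):
        self.encode = encode
        self.score = score
        self.answers = answers
        # Offline pre-computation: one generator pass per candidate question.
        self.candidate_vectors = [encode(q) for q in questions]

    def query(self, text):
        q_vec = self.encode(text)  # single online generator pass
        sims = [self.score(q_vec, c) for c in self.candidate_vectors]
        best = max(range(len(sims)), key=sims.__getitem__)  # top similarity
        return self.answers[best]
```

Any encoder and scorer pair can be plugged in; for instance a toy character-set encoder with an overlap score already exercises the offline/online split.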
The present application further provides another implementation, namely a computer-readable storage medium storing a semantic recall process, where the semantic recall process can be executed by at least one processor, so that the at least one processor executes the steps of the semantic recall method described above.
In this embodiment, the computer-readable storage medium splits the representation layer and the output layer of a traditional model into a sentence vector generator and a sentence vector splicer, respectively, without changing the accuracy of the original model. When a sentence vector is obtained, the data only needs to be processed by the single sentence vector generator, and the processed data is then spliced with the candidate sentence vectors by the sentence vector splicer, without running the model structure as a whole. This increases the concurrency of model processing and improves both the processing efficiency of the model on corpus data and the accuracy of question-answer matching. Moreover, the approach can be applied to various types of models, and is thus transferable and highly scalable.
From the description of the above implementations, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to execute the methods described in the various embodiments of the present application.
Obviously, the embodiments described above are only some of the embodiments of the present application, rather than all of them. The drawings show preferred embodiments of the present application, but do not limit its patent scope. The present application can be implemented in many different forms; rather, these embodiments are provided so that the understanding of the disclosure of the present application will be thorough and complete. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in the foregoing specific embodiments, or equivalently replace some of the technical features therein. Any equivalent structure made using the contents of the specification and drawings of the present application, and used directly or indirectly in other related technical fields, likewise falls within the scope of patent protection of the present application.

Claims (20)

  1. A semantic recall method, comprising the following steps:
    when online query data is received, obtaining, based on a sentence vector generator, an online sentence vector corresponding to the online query data;
    obtaining stored candidate sentence vectors;
    matching the online sentence vector and the candidate sentence vector based on a sentence vector splicer, to obtain a similarity between the online sentence vector and the candidate sentence vector;
    sorting the candidate sentence vectors in descending order according to the similarity, and returning the answer to the candidate question corresponding to the top-ranked candidate sentence vector as the correct answer.
  2. The semantic recall method according to claim 1, wherein the step of obtaining, based on the sentence vector generator, the online sentence vector corresponding to the online query data comprises:
    obtaining, based on the sentence vector generator, word vectors of the online query data;
    performing multi-layer convolution processing on the word vectors to obtain the online sentence vector of the online query data.
  3. The semantic recall method according to claim 2, wherein the step of obtaining, based on the sentence vector generator, the word vectors of the online query data comprises:
    converting each word in the online query data into an ID based on a tag analysis layer of the sentence vector generator, to obtain the ID corresponding to each word in the online query data;
    feature-encoding the IDs based on an embedding layer of the sentence vector generator, to obtain the word vector corresponding to each word in the online query data.
  4. The semantic recall method according to claim 2, wherein the step of performing multi-layer convolution processing on the word vectors to obtain the online sentence vector of the online query data comprises:
    performing multi-layer convolution processing on the word vectors based on a convolutional neural network, to obtain semantic features corresponding to the online query data;
    splicing the semantic features obtained each time together, to obtain the online sentence vector of the online query data.
  5. The semantic recall method according to claim 1, wherein the step of obtaining the stored candidate sentence vectors comprises:
    obtaining candidate questions stored in a question library;
    performing offline calculation on the candidate questions based on the sentence vector generator, to obtain the candidate sentence vectors corresponding to the candidate questions.
  6. The semantic recall method according to claim 5, further comprising, after the step of performing offline calculation on the candidate questions based on the sentence vector generator to obtain the candidate sentence vectors corresponding to the candidate questions:
    obtaining unique identification information corresponding to each candidate sentence vector;
    storing, according to the identification information, the candidate sentence vectors in a database in the form of a dictionary, in association with the candidate questions.
  7. The semantic recall method according to any one of claims 1 to 6, wherein the step of matching the online sentence vector and the candidate sentence vector based on the sentence vector splicer to obtain the similarity between the online sentence vector and the candidate sentence vector comprises:
    calculating difference feature vectors of the online sentence vector and the candidate sentence vector along three measurement dimensions: multiplication, subtraction, and maximum;
    splicing the difference feature vectors of the three measurement dimensions together to obtain a final difference feature vector;
    performing regularization processing on the final difference feature vector to obtain a processing result;
    performing function processing on the processing result to obtain the similarity between the online sentence vector and the candidate sentence vector.
  8. A semantic recall apparatus, comprising:
    a first obtaining module, configured to obtain, when online query data is received, an online sentence vector corresponding to the online query data based on a sentence vector generator;
    a second obtaining module, configured to obtain stored candidate sentence vectors;
    a splicing module, configured to match the online sentence vector and the candidate sentence vector based on a sentence vector splicer, to obtain a similarity between the online sentence vector and the candidate sentence vector;
    a sorting module, configured to sort the candidate sentence vectors in descending order according to the similarity, and return the answer to the candidate question corresponding to the top-ranked candidate sentence vector as the correct answer.
  9. A computer device, comprising a memory and a processor, the memory storing computer-readable instructions, wherein the processor, when executing the computer-readable instructions, implements the steps of the following semantic recall method:
    when online query data is received, obtaining, based on a sentence vector generator, an online sentence vector corresponding to the online query data;
    obtaining stored candidate sentence vectors;
    matching the online sentence vector and the candidate sentence vector based on a sentence vector splicer, to obtain a similarity between the online sentence vector and the candidate sentence vector;
    sorting the candidate sentence vectors in descending order according to the similarity, and returning the answer to the candidate question corresponding to the top-ranked candidate sentence vector as the correct answer.
  10. The computer device according to claim 9, wherein the step of obtaining, based on the sentence vector generator, the online sentence vector corresponding to the online query data comprises:
    obtaining, based on the sentence vector generator, word vectors of the online query data;
    performing multi-layer convolution processing on the word vectors to obtain the online sentence vector of the online query data.
  11. The computer device according to claim 10, wherein the step of obtaining, based on the sentence vector generator, the word vectors of the online query data comprises:
    converting each word in the online query data into an ID based on a tag analysis layer of the sentence vector generator, to obtain the ID corresponding to each word in the online query data;
    feature-encoding the IDs based on an embedding layer of the sentence vector generator, to obtain the word vector corresponding to each word in the online query data.
  12. The computer device according to claim 10, wherein the step of performing multi-layer convolution processing on the word vectors to obtain the online sentence vector of the online query data comprises:
    performing multi-layer convolution processing on the word vectors based on a convolutional neural network, to obtain semantic features corresponding to the online query data;
    splicing the semantic features obtained each time together, to obtain the online sentence vector of the online query data.
  13. The computer device according to claim 9, wherein the step of obtaining the stored candidate sentence vectors comprises:
    obtaining candidate questions stored in a question library;
    performing offline calculation on the candidate questions based on the sentence vector generator, to obtain the candidate sentence vectors corresponding to the candidate questions.
  14. The computer device according to claim 13, further comprising, after the step of performing offline calculation on the candidate questions based on the sentence vector generator to obtain the candidate sentence vectors corresponding to the candidate questions:
    obtaining unique identification information corresponding to each candidate sentence vector;
    storing, according to the identification information, the candidate sentence vectors in a database in the form of a dictionary, in association with the candidate questions.
  15. The computer device according to any one of claims 9 to 14, wherein the step of matching the online sentence vector and the candidate sentence vector based on the sentence vector splicer to obtain the similarity between the online sentence vector and the candidate sentence vector comprises:
    calculating difference feature vectors of the online sentence vector and the candidate sentence vector along three measurement dimensions: multiplication, subtraction, and maximum;
    splicing the difference feature vectors of the three measurement dimensions together to obtain a final difference feature vector;
    performing regularization processing on the final difference feature vector to obtain a processing result;
    performing function processing on the processing result to obtain the similarity between the online sentence vector and the candidate sentence vector.
  16. A computer-readable storage medium storing computer-readable instructions, wherein the computer-readable instructions, when executed by a processor, implement the steps of the following semantic recall method:
    when online query data is received, obtaining, based on a sentence vector generator, an online sentence vector corresponding to the online query data;
    obtaining stored candidate sentence vectors;
    matching the online sentence vector and the candidate sentence vector based on a sentence vector splicer, to obtain a similarity between the online sentence vector and the candidate sentence vector;
    sorting the candidate sentence vectors in descending order according to the similarity, and returning the answer to the candidate question corresponding to the top-ranked candidate sentence vector as the correct answer.
  17. The computer-readable storage medium according to claim 16, wherein the step of obtaining, based on the sentence vector generator, the online sentence vector corresponding to the online query data comprises:
    obtaining, based on the sentence vector generator, word vectors of the online query data;
    performing multi-layer convolution processing on the word vectors to obtain the online sentence vector of the online query data.
  18. The computer-readable storage medium according to claim 17, wherein the step of obtaining, based on the sentence vector generator, the word vectors of the online query data comprises:
    converting each word in the online query data into an ID based on a tag analysis layer of the sentence vector generator, to obtain the ID corresponding to each word in the online query data;
    feature-encoding the IDs based on an embedding layer of the sentence vector generator, to obtain the word vector corresponding to each word in the online query data.
  19. The computer-readable storage medium according to claim 17, wherein the step of performing multi-layer convolution processing on the word vectors to obtain the online sentence vector of the online query data comprises:
    performing multi-layer convolution processing on the word vectors based on a convolutional neural network, to obtain semantic features corresponding to the online query data;
    splicing the semantic features obtained each time together, to obtain the online sentence vector of the online query data.
  20. The computer-readable storage medium according to claim 16, wherein the step of obtaining the stored candidate sentence vectors comprises:
    obtaining candidate questions stored in a question library;
    performing offline calculation on the candidate questions based on the sentence vector generator, to obtain the candidate sentence vectors corresponding to the candidate questions.
PCT/CN2020/118454 2020-05-13 2020-09-28 Semantic recall method, apparatus, computer device, and storage medium WO2021135455A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010402690.9A CN111767375A (en) 2020-05-13 2020-05-13 Semantic recall method and device, computer equipment and storage medium
CN202010402690.9 2020-05-13

Publications (1)

Publication Number Publication Date
WO2021135455A1 true WO2021135455A1 (en) 2021-07-08

Family

ID=72719086

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/118454 WO2021135455A1 (en) 2020-05-13 2020-09-28 Semantic recall method, apparatus, computer device, and storage medium

Country Status (2)

Country Link
CN (1) CN111767375A (en)
WO (1) WO2021135455A1 (en)


Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112597208A (en) * 2020-12-29 2021-04-02 深圳价值在线信息科技股份有限公司 Enterprise name retrieval method, enterprise name retrieval device and terminal equipment
CN113254620B (en) * 2021-06-21 2022-08-30 中国平安人寿保险股份有限公司 Response method, device and equipment based on graph neural network and storage medium
CN114328908A (en) * 2021-11-08 2022-04-12 腾讯科技(深圳)有限公司 Question and answer sentence quality inspection method and device and related products
CN114064820B (en) * 2021-11-29 2023-11-24 上证所信息网络有限公司 Mixed architecture-based table semantic query coarse arrangement method
CN114969486B (en) * 2022-08-02 2022-11-04 平安科技(深圳)有限公司 Corpus recommendation method, apparatus, device and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109086386A (en) * 2018-07-26 2018-12-25 腾讯科技(深圳)有限公司 Data processing method, device, computer equipment and storage medium
CN109815318A (en) * 2018-12-24 2019-05-28 平安科技(深圳)有限公司 The problems in question answering system answer querying method, system and computer equipment
CN110020009A (en) * 2017-09-29 2019-07-16 阿里巴巴集团控股有限公司 Online answering method, apparatus and system
CN110287296A (en) * 2019-05-21 2019-09-27 平安科技(深圳)有限公司 A kind of problem answers choosing method, device, computer equipment and storage medium
CN110347807A (en) * 2019-05-20 2019-10-18 平安科技(深圳)有限公司 Problem information processing method and processing device
CN110704587A (en) * 2019-08-22 2020-01-17 平安科技(深圳)有限公司 Text answer searching method and device


Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113837307A (en) * 2021-09-29 2021-12-24 平安科技(深圳)有限公司 Data similarity calculation method and device, readable medium and electronic equipment
CN115952270A (en) * 2023-03-03 2023-04-11 中国海洋大学 Intelligent question and answer method and device for refrigerator and storage medium
CN115952270B (en) * 2023-03-03 2023-05-30 中国海洋大学 Intelligent question-answering method and device for refrigerator and storage medium

Also Published As

Publication number Publication date
CN111767375A (en) 2020-10-13

Similar Documents

Publication Publication Date Title
WO2021135455A1 (en) Semantic recall method, apparatus, computer device, and storage medium
CN109241524B (en) Semantic analysis method and device, computer-readable storage medium and electronic equipment
US10762150B2 (en) Searching method and searching apparatus based on neural network and search engine
WO2021121198A1 (en) Semantic similarity-based entity relation extraction method and apparatus, device and medium
CN112231569B (en) News recommendation method, device, computer equipment and storage medium
CN107145485B (en) Method and apparatus for compressing topic models
CN114780727A (en) Text classification method and device based on reinforcement learning, computer equipment and medium
CN111666416B (en) Method and device for generating semantic matching model
WO2019154411A1 (en) Word vector retrofitting method and device
CN112836521A (en) Question-answer matching method and device, computer equipment and storage medium
US20230008897A1 (en) Information search method and device, electronic device, and storage medium
CN114003682A (en) Text classification method, device, equipment and storage medium
CN115438149A (en) End-to-end model training method and device, computer equipment and storage medium
CN111078849B (en) Method and device for outputting information
CN111444321B (en) Question answering method, device, electronic equipment and storage medium
CN115374771A (en) Text label determination method and device
CN111008213A (en) Method and apparatus for generating language conversion model
CN113837307A (en) Data similarity calculation method and device, readable medium and electronic equipment
CN110807097A (en) Method and device for analyzing data
CN114742058B (en) Named entity extraction method, named entity extraction device, computer equipment and storage medium
CN116881446A (en) Semantic classification method, device, equipment and storage medium thereof
CN116957006A (en) Training method, device, equipment, medium and program product of prediction model
WO2021169356A1 (en) Voice file repairing method and apparatus, computer device, and storage medium
CN111274818B (en) Word vector generation method and device
CN115129863A (en) Intention recognition method, device, equipment, storage medium and computer program product

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20908918

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20908918

Country of ref document: EP

Kind code of ref document: A1