WO2023168814A1 - Sentence vector generation method and apparatus, computer device and storage medium - Google Patents


Info

Publication number
WO2023168814A1
WO2023168814A1 (PCT/CN2022/089817; CN2022089817W)
Authority
WO
WIPO (PCT)
Prior art keywords: sentence, sequence, model, context, current
Application number
PCT/CN2022/089817
Other languages: French (fr), Chinese (zh)
Inventor
陈浩
Original Assignee
平安科技(深圳)有限公司
Application filed by 平安科技(深圳)有限公司
Publication of WO2023168814A1

Classifications

    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/30 Semantic analysis
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods
    • G06Q30/0631 Item recommendations

Definitions

  • This application relates to the field of artificial intelligence technology, and in particular to sentence vector generation methods, devices, computer equipment and storage media.
  • Sentence embedding, as a vector representation of text data, is widely used in many application scenarios of natural language processing.
  • By mapping text data into a quantifiable vector space, we can obtain sentence vector representations that capture text features, semantics, grammar and other information, and then use vector clustering, classification and other methods to obtain the relationships between sentences, enabling the application of sentence vectors in real-world scenarios.
  • Existing solutions for sentence vector construction mainly include construction methods based on the word vector average and construction methods based on contrastive learning.
  • Construction methods based on the word vector average include word2vec, GloVe, BERT, etc.
  • Construction methods based on contrastive learning build positive samples for contrastive learning using methods such as dropout, token replacement, deletion, back-translation, etc.
  • The inventor realized that the shortcomings of the existing solutions are: 1) the construction method based on the average word vector destroys the dependencies between words in a sentence, so the accuracy of feature extraction is low; 2) in the construction method based on contrastive learning, the similarity between randomly selected negative samples and the original sentences is low, which makes model training insufficiently difficult.
  • As a result, the transfer ability of the model in actual tasks is insufficient, which in turn leads to lower accuracy of the generated sentence vectors.
  • Therefore, this application provides a sentence vector generation method, device, computer equipment and storage medium.
  • Its main purpose is to solve two technical problems in the existing technology: the construction method based on the word vector average has low accuracy in sentence feature extraction, and the construction method based on contrastive learning yields insufficient model transfer ability in actual tasks, resulting in low accuracy of the generated sentence vectors.
  • A sentence vector generation method is provided, which includes:
  • using a pre-built sentence vector generation model, the vector representation of the sentence text is obtained through encoding processing for predicting the context of the sentence text, where the sentence vector generation model is the encoding layer of a trained sequence-to-sequence model;
  • the trained sequence-to-sequence model is obtained through the following steps:
  • using the initial sequence-to-sequence model, encoding processing and context decoding processing are performed on the current sentence in the context sentence pair sequences of the constructed sentence sample set, to obtain the preceding predicted sentence and the following predicted sentence of the current sentence;
  • based on the preceding predicted sentence and the following predicted sentence, the trained sequence-to-sequence model is obtained.
  • A sentence vector generation device is provided, which includes:
  • a model training module, which can be used to use the initial sequence-to-sequence model to perform encoding processing and context decoding processing on the current sentence in the context sentence pair sequences of the constructed sentence sample set, obtaining the preceding predicted sentence and the following predicted sentence of the current sentence; and, based on the preceding predicted sentence and the following predicted sentence, to obtain the trained sequence-to-sequence model;
  • a preprocessing module, which is used to perform semantic segmentation on the obtained initial sentence text to obtain segmented sentence text;
  • an encoding module, configured to utilize a pre-built sentence vector generation model to obtain a vector representation of the sentence text through encoding processing used to predict the context of the sentence text, where the sentence vector generation model is the encoding layer of the trained sequence-to-sequence model.
  • A storage medium is provided, on which a computer program is stored. When the program is executed, the above sentence vector generation method is implemented, including:
  • using a pre-built sentence vector generation model, the vector representation of the sentence text is obtained through encoding processing for predicting the context of the sentence text, where the sentence vector generation model is the encoding layer of a trained sequence-to-sequence model;
  • the trained sequence-to-sequence model is obtained through the following steps:
  • using the initial sequence-to-sequence model, encoding processing and context decoding processing are performed on the current sentence in the context sentence pair sequences of the constructed sentence sample set, to obtain the preceding predicted sentence and the following predicted sentence of the current sentence;
  • based on the preceding predicted sentence and the following predicted sentence, the trained sequence-to-sequence model is obtained.
  • A computer device is provided, including a storage medium, a processor, and a computer program stored on the storage medium and executable on the processor. When the processor executes the program, the above sentence vector generation method is realized, including:
  • using a pre-built sentence vector generation model, the vector representation of the sentence text is obtained through encoding processing for predicting the context of the sentence text, where the sentence vector generation model is the encoding layer of a trained sequence-to-sequence model;
  • the trained sequence-to-sequence model is obtained through the following steps:
  • using the initial sequence-to-sequence model, encoding processing and context decoding processing are performed on the current sentence in the context sentence pair sequences of the constructed sentence sample set, to obtain the preceding predicted sentence and the following predicted sentence of the current sentence;
  • based on the preceding predicted sentence and the following predicted sentence, the trained sequence-to-sequence model is obtained.
  • By training a sequence-to-sequence model on context sentence pair sequences and using the encoding layer of the trained sequence-to-sequence model to generate sentence vectors, this application can effectively improve the accuracy of sentence vector generation while increasing the difficulty of model training, and ensures the integrity of the semantic information and grammatical information of the generated sentence vectors.
  • It thereby effectively avoids the technical problems of the existing solutions: the construction method based on the average word vector destroys the dependencies between words in a sentence, resulting in low accuracy of sentence feature extraction; and in the construction method based on contrastive learning, the training difficulty of the model is low, the transfer ability of the model in actual tasks is insufficient, and the accuracy of the generated sentence vectors is low.
  • Figure 1 shows a schematic flowchart of a sentence vector generation method provided by an embodiment of the present application
  • Figure 2 shows a schematic flowchart of another sentence vector generation method provided by an embodiment of the present application
  • Figure 3 shows a schematic diagram of the initial sequence-to-sequence model architecture provided by the embodiment of the present application
  • Figure 4 shows a schematic structural diagram of a sentence vector generation device provided by an embodiment of the present application
  • Figure 5 shows a schematic structural diagram of another sentence vector generation device provided by an embodiment of the present application.
  • Artificial intelligence (AI) is the theory, method, technology and application system that uses digital computers, or machines controlled by digital computers, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use knowledge to obtain the best results.
  • Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technology, operation/interaction systems, and mechatronics.
  • Artificial intelligence software technology mainly includes computer vision technology, robotics technology, biometric technology, speech processing technology, natural language processing technology, and machine learning/deep learning.
  • this embodiment provides a sentence vector generation method, as shown in Figure 1.
  • This method is explained by taking its application to computer equipment such as servers as an example.
  • The server can be an independent server, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery network (CDN) services, and big data and artificial intelligence platforms.
  • the above method includes the following steps:
  • Step 101 Perform semantic segmentation on the obtained initial sentence text to obtain segmented sentence text.
  • For example, in the book recommendation scenario, this method is suitable for recommending other similar books based on the obtained book text content.
  • When a book recommendation request is received, the book text content corresponding to the book title in the request is obtained, the book text content is segmented based on Chinese punctuation, and multiple sentence texts are obtained through text segmentation for input into the sentence vector generation model.
  • The book text content can be book abstract text, book introduction text, etc., which is not specifically limited here.
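As a sketch of this segmentation step, the book text content can be split on Chinese end-of-sentence punctuation as follows. The exact punctuation set (。！？；) and the function name are assumptions for illustration; the patent only states that segmentation is based on Chinese punctuation.

```python
import re

def split_sentences(text):
    """Split Chinese text into sentence texts at end-of-sentence punctuation.

    Keeps the punctuation attached to the sentence it ends; the punctuation
    set here is an illustrative assumption.
    """
    parts = re.split(r"(?<=[。！？；])", text)
    return [p.strip() for p in parts if p.strip()]

print(split_sentences("这本书讲述人工智能。内容深入浅出！适合初学者？"))
# → ['这本书讲述人工智能。', '内容深入浅出！', '适合初学者？']
```

Each element of the returned list is one sentence text ready to be fed to the sentence vector generation model.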
  • Step 102: Use a pre-built sentence vector generation model to obtain a vector representation of the sentence text through encoding processing for predicting the context of the sentence text, where the sentence vector generation model is the encoding layer of a trained sequence-to-sequence model.
  • The trained sequence-to-sequence model is obtained through the following steps: using the initial sequence-to-sequence model, encoding processing and context decoding processing are performed on the current sentence in the context sentence pair sequences of the constructed sentence sample set, to obtain the preceding predicted sentence and the following predicted sentence of the current sentence; and, based on the preceding predicted sentence and the following predicted sentence, the trained sequence-to-sequence model is obtained.
  • Specifically, the initial sequence-to-sequence model is trained on the constructed sentence sample set, which contains context sentence pair sequences; each context sentence pair sequence includes a current sentence and the context sentences corresponding to it.
  • The current sentence is input into the encoding layer of the initial sequence-to-sequence model for encoding, producing a vector representation that contains the context feature information of the current sentence.
  • This vector representation is then input into the two decoding layers set up in parallel in the initial sequence-to-sequence model, and the preceding and following sentences of the current sentence are obtained through decoding.
  • The encoding layer of the trained sequence-to-sequence model thus has the ability to accurately predict the context of the current sentence and retains the integrity of the semantic and grammatical information of that context; the vector representation output on this basis therefore contains the complete contextual feature information of the current sentence, ensuring the accuracy of subsequent book recommendations.
  • Using the context sentence pair sequences constructed from each current sentence and its context sentences as the input data of the initial sequence-to-sequence model retains the interdependence and mutual influence between words without destroying the overall structure of the text data.
  • This ensures that the model can learn the complete semantic and grammatical information contained in the sentence text, improving the accuracy of the model in extracting contextual sentence features.
  • According to the above solution, the obtained initial sentence text can be semantically segmented to obtain segmented sentence text, and a pre-built sentence vector generation model can be used to obtain the vector representation of the sentence text through encoding processing for predicting its context.
  • The sentence vector generation model is the encoding layer of the trained sequence-to-sequence model, where the trained sequence-to-sequence model is obtained through the following steps: using the initial sequence-to-sequence model, encoding processing and context decoding processing are performed on the current sentence in the context sentence pair sequences of the constructed sentence sample set, to obtain the preceding predicted sentence and the following predicted sentence of the current sentence; and, based on the preceding predicted sentence and the following predicted sentence, the trained sequence-to-sequence model is obtained.
  • Compared with the existing technology, this embodiment trains a sequence-to-sequence model on context sentence pair sequences and uses the encoding layer of the trained sequence-to-sequence model to generate sentence vectors.
  • The generated sentence vector of the sentence text can thus retain the integrity of the semantic and grammatical information of the sentence text, thereby effectively improving the accuracy of sentence vector generation.
  • Step 201: Use the initial sequence-to-sequence model to perform encoding processing and context decoding processing on the current sentence in the context sentence pair sequences of the constructed sentence sample set, to obtain the preceding predicted sentence and the following predicted sentence of the current sentence.
  • The context sentence pair sequence specifically includes: the current sentence, which is input to the encoding layer of the initial sequence-to-sequence model for context sentence prediction; and the preceding target sentence and following target sentence, which are used to train the output results of the initial sequence-to-sequence model. The output results are the preceding predicted sentence and the following predicted sentence produced during model training.
  • In specific application scenarios, step 201 may specifically include: using a word segmentation tool to perform word segmentation processing on the context sentence pair sequences, obtaining context sentence pair sequences after word segmentation;
  • using the encoding layer of the initial sequence-to-sequence model to obtain the sentence embedding vector of the current sentence in the segmented context sentence pair sequence; and, based on the sentence embedding vector of the current sentence,
  • using the two decoding layers set up in parallel in the initial sequence-to-sequence model to obtain the preceding predicted sentence and the following predicted sentence respectively, where the two decoding layers refer to the first decoding layer used to predict the preceding sentence and the second decoding layer used to predict the following sentence.
  • The first decoding layer used to predict the preceding sentence is a first GRU model, and the second decoding layer used to predict the following sentence is a second GRU model.
  • The step of obtaining the preceding predicted sentence and the following predicted sentence respectively, based on the sentence embedding vector of the current sentence and using the two decoding layers set up in parallel in the initial sequence-to-sequence model, specifically includes:
  • using the sentence embedding vector of the current sentence as the input data of the reset gate, update gate and candidate memory unit in the first GRU model, and obtaining the preceding predicted sentence of the current sentence through decoding processing;
  • and using the sentence embedding vector of the current sentence as the input data of the second GRU model, and obtaining the following predicted sentence of the current sentence through decoding processing.
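The one-encoder, two-decoder data flow described above can be sketched in miniature as follows. This is an illustrative toy, not the patent's implementation: the dimensions, random weights, and the way each decoder feeds its hidden state back to itself stand in for real token generation, and the pre-Decoder here uses a standard GRU step rather than the modified gates described later.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # hidden/embedding size (toy value; the patent does not fix dimensions)

def gru_step(x, h, p):
    """One standard GRU step with update gate z, reset gate r."""
    z = 1 / (1 + np.exp(-(p["Wz"] @ x + p["Uz"] @ h)))
    r = 1 / (1 + np.exp(-(p["Wr"] @ x + p["Ur"] @ h)))
    h_cand = np.tanh(p["Wk"] @ x + p["Uk"] @ (r * h))
    return (1 - z) * h + z * h_cand

def make_params():
    return {k: rng.standard_normal((d, d)) * 0.1
            for k in ("Wz", "Uz", "Wr", "Ur", "Wk", "Uk")}

encoder, pre_decoder, next_decoder = make_params(), make_params(), make_params()

def encode(token_vectors):
    """Run the encoder GRU over the current sentence's token vectors;
    the final hidden state plays the role of the sentence embedding h_s."""
    h = np.zeros(d)
    for x in token_vectors:
        h = gru_step(x, h, encoder)
    return h

def decode(h_s, params, steps=3):
    """Unroll a decoder GRU from h_s, feeding the hidden state back in
    as the next input (a toy stand-in for token-by-token generation)."""
    h, outs = h_s, []
    for _ in range(steps):
        h = gru_step(h, h, params)
        outs.append(h)
    return outs

sentence = [rng.standard_normal(d) for _ in range(4)]
h_s = encode(sentence)
prev_pred = decode(h_s, pre_decoder)   # pre-Decoder: preceding sentence
next_pred = decode(h_s, next_decoder)  # next-Decoder: following sentence
```

The key structural point is that the same h_s is handed to both decoders in parallel, so gradients from both the preceding-sentence and following-sentence objectives flow back into the one shared encoder.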
  • Before the step of using the initial sequence-to-sequence model to obtain the preceding predicted sentence and the following predicted sentence of the current sentence based on the current sentence in the context sentence pair sequence, the method also includes: constructing a sentence sample set containing the context sentence pair sequences. Specific steps include:
  • The context sentence pair sequences are expressed as (S_1, S_2, S_3), (S_2, S_3, S_4), (S_3, S_4, S_5), ..., (S_{i-1}, S_i, S_{i+1}), ..., (S_{n-2}, S_{n-1}, S_n), where S_i represents the current sentence, S_{i-1} represents the preceding target sentence adjacent to S_i, and S_{i+1} represents the following target sentence adjacent to S_i.
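Constructing these (S_{i-1}, S_i, S_{i+1}) triples from an ordered list of sentences can be sketched as follows (the function name is illustrative):

```python
def build_context_pairs(sentences):
    """Build context sentence pair sequences (S_{i-1}, S_i, S_{i+1})
    from an ordered list of sentences S_1..S_n."""
    return [(sentences[i - 1], sentences[i], sentences[i + 1])
            for i in range(1, len(sentences) - 1)]

s = ["S1", "S2", "S3", "S4", "S5"]
print(build_context_pairs(s))
# → [('S1', 'S2', 'S3'), ('S2', 'S3', 'S4'), ('S3', 'S4', 'S5')]
```

Each middle element is a current sentence; its neighbours serve as the preceding and following target sentences for the two decoders.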
  • During training, the encoding layer (Encoder) of the initial sequence-to-sequence model outputs the sentence embedding vector h_s of the current sentence, which is simultaneously input into the first decoding layer (pre-Decoder), used to predict the preceding sentence, and the second decoding layer (next-Decoder), used to predict the following sentence.
  • The first decoding layer pre-Decoder and the second decoding layer next-Decoder are used to obtain, respectively, the preceding predicted sentence and the following predicted sentence of the current sentence, as shown in Figure 3. Specific steps include:
  • the initial sequence-to-sequence model includes one encoding layer and two decoding layers, and the basic model of both the encoding layer and the decoding layers is the gated recurrent unit (GRU: Gate Recurrent Unit).
  • The first decoding layer pre-Decoder and the second decoding layer next-Decoder decode the sentence embedding vector h_s synchronously, obtaining the preceding predicted sentence and the following predicted sentence corresponding to the current sentence. This specifically includes:
  • taking the sentence embedding vector h_s as the input of the first decoding layer pre-Decoder (preceding-sentence decoding), and obtaining the preceding predicted sentence Y_{i-1} corresponding to the current sentence through decoding processing.
  • That is, the sentence embedding vector h_s of the current sentence S_i is used to predict the preceding predicted sentence Y_{i-1} corresponding to the current sentence. Since predicting backwards does not conform to the characteristics of natural language, the training difficulty of the first decoding layer pre-Decoder is greater than that of the second decoding layer next-Decoder; the GRU model architecture is therefore improved, so as to improve the accuracy of preceding-sentence prediction while ensuring training efficiency and preventing gradient disappearance.
  • Specifically, the GRU model at each time step can incorporate the sentence embedding vector h_s of the current sentence S_i.
  • The specific formulas are as follows:
  • z_t = σ(W_z x_t + U_z h_{t-1} + V_z h_s)
  • r_t = σ(W_r x_t + U_r h_{t-1} + V_r h_s)
  • h̃_t = tanh(W_k x_t + U_k (r_t ⊙ h_{t-1}) + V_k h_s)
  • h_t = (1 - z_t) ⊙ h_{t-1} + z_t ⊙ h̃_t
  • where z_t represents the update gate of the GRU model; W_z, U_z are the update gate parameters of the original GRU model; x_t represents the input vector at the current time t; h_{t-1} represents the output vector at the previous time t-1; V_z represents the parameter set for the sentence embedding vector h_s; the reset gate r_t and the candidate memory unit h̃_t likewise incorporate h_s, with W_r, U_r, V_r the parameters of the reset gate and W_k, U_k, V_k the parameters of the candidate memory unit; tanh represents the activation function; h_t represents the output vector at the current time t; σ represents the fully connected layer with an activation function; and ⊙ represents element-wise multiplication of vectors.
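A minimal numpy sketch of one step of this modified GRU follows. It is illustrative only: the weight values are random, and the convex-combination form of the final state h_t follows the standard GRU convention, which the translated text does not spell out.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def modified_gru_step(x_t, h_prev, h_s, p):
    """One step of the pre-Decoder GRU: the sentence embedding h_s enters
    the update gate, reset gate and candidate memory unit through the extra
    parameters V_z, V_r, V_k, as in the formulas for z_t, r_t, h̃_t."""
    z_t = sigmoid(p["Wz"] @ x_t + p["Uz"] @ h_prev + p["Vz"] @ h_s)
    r_t = sigmoid(p["Wr"] @ x_t + p["Ur"] @ h_prev + p["Vr"] @ h_s)
    h_cand = np.tanh(p["Wk"] @ x_t + p["Uk"] @ (r_t * h_prev) + p["Vk"] @ h_s)
    # Standard GRU blending of old state and candidate (assumed convention).
    return (1 - z_t) * h_prev + z_t * h_cand

d = 6  # toy dimension
rng = np.random.default_rng(1)
params = {k: rng.standard_normal((d, d)) * 0.1
          for k in ("Wz", "Uz", "Vz", "Wr", "Ur", "Vr", "Wk", "Uk", "Vk")}
h_t = modified_gru_step(rng.standard_normal(d), np.zeros(d),
                        rng.standard_normal(d), params)
```

Compared with a plain GRU step, the only change is the three additional V·h_s terms, which is exactly how the current sentence's embedding is injected at every decoding time step.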
  • Take the sentence embedding vector h_s as the initial vector of the second decoding layer next-Decoder and, through decoding processing, obtain the following predicted sentence Y_{i+1} corresponding to the current sentence. Predicting the following sentence from the current sentence conforms to the top-down characteristics of natural language; therefore, the second decoding layer next-Decoder uses the existing GRU model, and the sentence embedding vector h_s is only used as its initial vector.
  • In this way, predicting the preceding sentence of the current sentence within the encoder-decoder model framework breaks the top-down rule of natural language, increases the difficulty of model training, and enables the model to be fully trained, so that it outputs sentence vector representations with complete semantic and grammatical information. Furthermore, by improving the update gate, reset gate and candidate memory unit of the GRU model, the training efficiency of the model can be effectively ensured while the difficulty of model training is increased.
  • Step 202: Use the target loss function to train the initial sequence-to-sequence model based on the preceding predicted sentence and the following predicted sentence of the current sentence, to obtain a trained sequence-to-sequence model.
  • The target loss function is determined as the sum of a first loss function and a second loss function; the first loss function is set based on the first decoding layer used to predict the preceding sentence, and the second loss function is set based on the second decoding layer used to predict the following sentence.
  • In specific application scenarios, the target loss function is used to train the network parameters of the initialized sequence-to-sequence model until the model converges, yielding the trained sequence-to-sequence model.
  • The cross-entropy loss function is used as the basic loss function, of the form:
  • CE(S, Y) = -Σ_{j=1}^{l} log P(y_j = t_j)
  • where CE represents the cross-entropy loss function; S represents the current sentence; Y represents the predicted sentence generated by the decoding layer (Decoder); l represents the number of tokens determined after segmentation of the current sentence S; t_j represents the j-th token obtained by segmenting the current sentence S; and y_j represents the j-th token in the predicted sentence Y.
  • Based on the basic loss function, the corresponding preceding-sentence loss function (first loss function) and following-sentence loss function (second loss function) are determined, and the target loss function of the initialized sequence-to-sequence model is then obtained as the sum of the preceding-sentence loss function and the following-sentence loss function.
  • The initialized sequence-to-sequence model is trained until its target loss function value converges; training then ends, and the trained sequence-to-sequence model is obtained.
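As a toy illustration of this target loss, the first and second loss functions are per-token cross-entropies and the target loss is their sum. The probability values, vocabulary size and token ids below are made up for the example:

```python
import numpy as np

def token_cross_entropy(probs, target_ids):
    """Cross-entropy of one predicted sentence: -sum_j log P(y_j = t_j),
    where probs[j] is the decoder's distribution over the vocabulary at
    position j and target_ids[j] is the id of the target token t_j."""
    return -sum(np.log(probs[j][t]) for j, t in enumerate(target_ids))

# Toy decoder outputs over a 3-token vocabulary (illustrative values only).
pre_probs = [np.array([0.7, 0.2, 0.1]), np.array([0.1, 0.8, 0.1])]
next_probs = [np.array([0.6, 0.3, 0.1])]

loss_pre = token_cross_entropy(pre_probs, [0, 1])   # first loss function
loss_next = token_cross_entropy(next_probs, [0])    # second loss function
target_loss = loss_pre + loss_next                  # sum of the two
```

Minimizing target_loss trains the shared encoder against both decoding objectives at once, which is what forces h_s to carry both preceding-sentence and following-sentence information.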
  • Step 203 Perform semantic segmentation on the obtained initial sentence text to obtain segmented sentence text.
  • Step 204: Use a pre-built sentence vector generation model to obtain a vector representation of the sentence text through encoding processing for predicting the context of the sentence text, where the sentence vector generation model is the encoding layer of a trained sequence-to-sequence model.
  • In specific application scenarios, the encoding layer of the trained sequence-to-sequence model is extracted as the sentence vector generation model. After a book recommendation request is received, the introduction text corresponding to the book title in the request is obtained, the introduction text is segmented into sentences based on Chinese punctuation, and the Harbin Institute of Technology LTP model is used to perform word segmentation on the segmented introduction text, obtaining the sentence text after word segmentation.
  • The sentence vector generation model is then used to encode the sentence text to obtain the vector representation of the sentence text.
  • Step 205: Calculate the similarity value between the vector representation of the sentence text and the sentence embedding vectors in the preset book sample library, where the sentence embedding vectors in the preset book sample library are obtained from the output of the sentence vector generation model.
  • In specific application scenarios, the sentence vector generation model is used to output the sentence embedding vector corresponding to each book's introduction text, and the preset book sample library is constructed from these output sentence embedding vectors.
  • The cosine similarity algorithm is then used to calculate the similarity value between the sentence vector output for the book recommendation request and the sentence embedding vector corresponding to each book in the preset book sample library.
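The cosine-similarity ranking in this step can be sketched as follows; the book titles and vectors are hypothetical placeholders for model outputs:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two sentence vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical query vector and a small "book sample library" of sentence
# embedding vectors keyed by book title.
query = np.array([1.0, 0.0, 1.0])
library = {
    "book_a": np.array([1.0, 0.1, 0.9]),
    "book_b": np.array([-1.0, 0.5, 0.0]),
}

# Rank library entries by similarity to the query, in descending order.
ranked = sorted(library, key=lambda k: cosine_similarity(query, library[k]),
                reverse=True)
print(ranked[0])  # → book_a
```

Books whose similarity values meet the preset condition (for example, the top-k of this descending order) would then be returned as recommendations.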
  • Step 206 Generate book recommendation information for the sentence text based on the sentence embedding vectors whose similarity values meet the preset conditions in the preset book sample library.
  • For example, when a user browses a book on the platform, that book is taken as the target book, and a book recommendation request containing the title of the target book is generated.
  • The sentence vector generation model is used to generate the corresponding sentence vectors, and the similarity values between the generated sentence vectors and each set of sentence embedding vectors in the platform's preset book sample library are calculated and arranged in descending order.
  • The book information corresponding to the sentence embedding vectors whose similarity values meet the preset conditions is then recommended to the user as similar books.
  • According to this embodiment, the obtained initial sentence text is semantically segmented to obtain segmented sentence text, and the pre-built sentence vector generation model is used to obtain the vector representation of the sentence text through encoding processing for predicting its context.
  • The sentence vector generation model is the encoding layer of the trained sequence-to-sequence model, where the trained sequence-to-sequence model is obtained through the following steps: using the initial sequence-to-sequence model, encoding processing and context decoding processing are performed on the current sentence in the context sentence pair sequences of the constructed sentence sample set, to obtain the preceding predicted sentence and the following predicted sentence of the current sentence; and, based on the preceding predicted sentence and the following predicted sentence, the trained sequence-to-sequence model is obtained.
  • Training a sequence-to-sequence model on context sentence pair sequences and using the encoding layer of the trained model to generate sentence vectors can effectively improve the accuracy of sentence vector generation while increasing the difficulty of model training, and ensures the integrity of the semantic and grammatical information of the generated sentence vectors.
  • It thereby effectively avoids the technical problems of the existing solutions: the construction method based on the average word vector destroys the dependencies between words in a sentence, resulting in low accuracy of sentence feature extraction; and in the construction method based on contrastive learning, the training difficulty of the model is low, the transfer ability of the model in actual tasks is insufficient, and the accuracy of the generated sentence vectors is low.
  • the embodiment of the present application provides a sentence vector generation device, as shown in Figure 4.
  • the device includes: a model training module 41, a preprocessing module 42, and an encoding module 43.
  • The model training module 41 can be used to use the initial sequence-to-sequence model to perform encoding processing and context decoding processing on the current sentence in the context sentence pair sequences of the constructed sentence sample set, obtaining the preceding predicted sentence and the following predicted sentence of the current sentence; and, based on the preceding predicted sentence and the following predicted sentence, to obtain the trained sequence-to-sequence model.
  • the preprocessing module 42 can be used to perform semantic segmentation on the obtained initial sentence text to obtain segmented sentence text.
  • the encoding module 43 may be used to utilize a pre-built sentence vector generation model to obtain a vector representation of the sentence text through encoding processing for predicting the context of the sentence text.
  • the sentence vector generation model is the encoding layer of a trained sequence-to-sequence model.
  • a book recommendation module 44 is also included.
  • the model training module 41 includes a training unit 411.
  • the training unit 411 may be used to train the initial sequence-to-sequence model using a target loss function, based on the preceding predicted sentence and the following predicted sentence of the current sentence, to obtain the trained sequence-to-sequence model; wherein the target loss function is determined as the sum of the first loss function and the second loss function.
  • the context sentence pair sequence specifically includes: a current sentence to be input into the encoding layer of the initial sequence-to-sequence model for context sentence prediction; and a preceding target sentence and a following target sentence used to train the output results of the initial sequence-to-sequence model, the output results being the preceding predicted sentence and the following predicted sentence produced during model training.
  • the model training module 41 can be used to perform word segmentation processing on the context sentence pair sequence using a word segmentation tool, obtaining a segmented context sentence pair sequence.
  • for the current sentence in the segmented context sentence pair sequence, the encoding layer of the initial sequence-to-sequence model is used to obtain the sentence embedding vector of the current sentence.
  • from the sentence embedding vector of the current sentence, the two decoding layers set up in parallel in the initial sequence-to-sequence model are used to obtain the preceding predicted sentence and the following predicted sentence respectively, where the two decoding layers refer to a first decoding layer for predicting the preceding context and a second decoding layer for predicting the following context.
  • the first decoding layer used to predict the preceding context is a first GRU model
  • the second decoding layer used to predict the following context is a second GRU model
  • the step of using the sentence embedding vector of the current sentence with the two decoding layers set up in parallel in the initial sequence-to-sequence model to obtain the preceding predicted sentence and the following predicted sentence respectively specifically includes:
  • using the sentence embedding vector of the current sentence as the input data of the reset gate, the update gate and the candidate memory unit in the first GRU model, and obtaining the preceding predicted sentence of the current sentence through decoding processing; and
  • using the sentence embedding vector of the current sentence as the input data of the second GRU model, and obtaining the following predicted sentence of the current sentence through decoding processing.
  • the first loss function in the target loss function is set based on the first decoding layer used for predicting the preceding context, and the second loss function in the target loss function is set based on the second decoding layer used for predicting the following context.
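The sum-of-losses objective described above can be written out as follows. The patent gives no explicit formulas, so the cross-entropy form and the symbols below (h, θ₁, θ₂) are assumptions used only to make the relationship concrete:

```latex
L_{\text{target}} = L_{1} + L_{2}, \qquad
L_{1} = -\sum_{t}\log p_{\theta_1}\!\left(w^{\text{pre}}_{t}\mid w^{\text{pre}}_{<t},\,\mathbf{h}\right), \qquad
L_{2} = -\sum_{t}\log p_{\theta_2}\!\left(w^{\text{fol}}_{t}\mid w^{\text{fol}}_{<t},\,\mathbf{h}\right)
```

where **h** denotes the sentence embedding vector of the current sentence produced by the encoding layer, θ₁ parameterizes the first (preceding-context) GRU decoder, θ₂ parameterizes the second (following-context) GRU decoder, and w^pre, w^fol are the tokens of the preceding and following target sentences.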
  • the book recommendation module 44 includes a similarity calculation unit 441 and a generation unit 442.
  • the similarity calculation unit 441 may be used to calculate the similarity value between the vector representation of the sentence text and the sentence embedding vector in the preset book sample library.
  • the generation unit 442 may be configured to generate book recommendation information for the sentence text based on the sentence embedding vectors in the preset book sample library whose similarity values satisfy a preset condition; wherein the sentence embedding vectors in the preset book sample library are obtained as outputs of the sentence vector generation model.
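As a rough illustration of how the similarity calculation unit 441 and the generation unit 442 might work together, the sketch below uses cosine similarity as the similarity measure and a similarity threshold as the "preset condition". Both choices, and every name in the code, are assumptions — the patent does not specify the similarity measure or the condition:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Similarity value between two sentence vectors (unit 441's role)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def recommend_books(query_vec: np.ndarray, book_library: dict, threshold: float = 0.8) -> list:
    """Return titles whose stored sentence embedding satisfies the preset
    condition (here: cosine similarity >= threshold) — unit 442's role.
    book_library maps a book title to its precomputed sentence embedding,
    which the patent says is produced by the sentence vector generation model."""
    recommendations = []
    for title, book_vec in book_library.items():
        if cosine_similarity(query_vec, book_vec) >= threshold:
            recommendations.append(title)
    return recommendations
```

In practice the library embeddings would be stacked into a matrix so all similarity values are computed in one vectorized operation; the loop above is kept for readability.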
  • a sentence vector generation method includes:
  • obtaining the vector representation of the sentence text through encoding processing used for predicting the context of the sentence text, the sentence vector generation model being the encoding layer of the trained sequence-to-sequence model;
  • the trained sequence-to-sequence model is obtained through the following steps:
  • using the initial sequence-to-sequence model, encoding processing and context decoding processing are performed on the current sentence in the context sentence pair sequences of the constructed sentence sample set, obtaining the preceding predicted sentence and the following predicted sentence of the current sentence;
  • based on the preceding predicted sentence and the following predicted sentence, the trained sequence-to-sequence model is obtained.
  • the step of obtaining the trained sequence-to-sequence model based on the preceding predicted sentence and the following predicted sentence includes:
  • using the target loss function to train the initial sequence-to-sequence model to obtain the trained sequence-to-sequence model
  • the target loss function is determined based on the sum of the first loss function and the second loss function.
  • the context sentence pair sequence specifically includes:
  • the preceding target sentence and the following target sentence used to train the output results of the initial sequence-to-sequence model, the output results being the preceding predicted sentence and the following predicted sentence produced during model training.
  • the storage medium is a computer-readable storage medium, which may be non-volatile or volatile.
  • the technical solution of the present application can be embodied in the form of a software product. The software product can be stored in a storage medium (which may be a CD-ROM, a USB flash drive, a removable hard disk, etc.) and includes a number of instructions that enable a computer device (which may be a personal computer, a server, a network device, etc.) to execute the methods described in each implementation scenario of this application.
  • embodiments of the present application also provide a computer device, which can be a personal computer, a server, a network device, etc. The physical device includes a storage medium and a processor; the storage medium is used to store a computer program; the processor is used to execute the computer program to implement the above sentence vector generation method shown in Figure 1 and Figure 2, including:
  • obtaining the vector representation of the sentence text through encoding processing used for predicting the context of the sentence text, the sentence vector generation model being the encoding layer of the trained sequence-to-sequence model;
  • the trained sequence-to-sequence model is obtained through the following steps:
  • using the initial sequence-to-sequence model, encoding processing and context decoding processing are performed on the current sentence in the context sentence pair sequences of the constructed sentence sample set, obtaining the preceding predicted sentence and the following predicted sentence of the current sentence;
  • based on the preceding predicted sentence and the following predicted sentence, the trained sequence-to-sequence model is obtained.
  • the step of obtaining the trained sequence-to-sequence model based on the preceding predicted sentence and the following predicted sentence includes:
  • using the target loss function to train the initial sequence-to-sequence model to obtain the trained sequence-to-sequence model
  • the target loss function is determined based on the sum of the first loss function and the second loss function.
  • the context sentence pair sequence specifically includes:
  • the preceding target sentence and the following target sentence used to train the output results of the initial sequence-to-sequence model, the output results being the preceding predicted sentence and the following predicted sentence produced during model training.
  • the computer device may also include a user interface, a network interface, a camera, a radio frequency (Radio Frequency, RF) circuit, a sensor, an audio circuit, a WI-FI module, etc.
  • the user interface may include a display screen (Display), an input unit such as a keyboard (Keyboard), etc.
  • the optional user interface may also include a USB interface, a card reader interface, etc.
  • Optional network interfaces may include standard wired interfaces, wireless interfaces (such as Bluetooth interfaces, WI-FI interfaces), etc.
  • the structure of the computer device does not constitute a limitation on the physical device, which may include more or fewer components, combine certain components, or adopt a different arrangement of components.
  • the storage medium may also include an operating system and a network communication module.
  • An operating system is a program that manages the hardware and software resources of a computer device and supports the operation of information processing programs and other software and/or programs.
  • the network communication module is used to implement communication between components within the storage medium, as well as communication with other hardware and software in the physical device.
  • this embodiment performs sequence-to-sequence model training on context sentence pair sequences and uses the encoding layer of the well-trained sequence-to-sequence model to generate the sentence vectors of sentence texts, which can ensure the integrity of the semantic information and grammatical information of the sentence text and thereby effectively improve the accuracy of sentence vector generation.
  • this effectively avoids the existing construction method based on averaged word vectors, which destroys the dependencies between words in a sentence and results in low accuracy of sentence feature extraction.
  • it also avoids the construction method based on contrastive learning, in which the training difficulty of the model is low and the transfer ability of the model in actual tasks is insufficient.
  • the accompanying drawings are only schematic diagrams of preferred implementation scenarios, and the modules or processes in the accompanying drawings are not necessarily required for implementing the present application.
  • the modules in the devices in the implementation scenario can be distributed in the devices in the implementation scenario according to the description of the implementation scenario, or can be correspondingly changed and located in one or more devices different from the implementation scenario.
  • the modules of the above implementation scenarios can be combined into one module or further split into multiple sub-modules.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Business, Economics & Management (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Machine Translation (AREA)

Abstract

The present application discloses a sentence vector generation method and apparatus, a computer device, and a storage medium, which relate to the technical field of artificial intelligence, and can improve the accuracy of sentence vector generation. The method comprises: performing semantic segmentation on obtained initial sentence text to obtain segmented sentence text; and utilizing a pre-constructed sentence vector generation model to obtain a vector representation of the sentence text by means of encoding processing used to predict the context of the sentence text, the sentence vector generation model being an encoding layer of a trained sequence-to-sequence model. The present application is suitable for book recommendation on the basis of sentence vectors of book texts.

Description

Sentence vector generation method, apparatus, computer device and storage medium
This application claims priority to the Chinese patent application filed with the China Patent Office on March 9, 2022, with application number 202210232057.9 and the application title "Sentence Vector Generation Method, Apparatus, Computer Device and Storage Medium", the entire content of which is incorporated in this application by reference.
Technical field
This application relates to the field of artificial intelligence technology, and in particular to sentence vector generation methods, apparatuses, computer devices and storage media.
Background
Natural language processing is an important direction in the fields of computer science and artificial intelligence. Sentence embeddings, as vector representations of text data, are widely used in many application scenarios of natural language processing. By mapping text data into a quantifiable vector space, sentence vector representations that characterize the features, semantics, grammar and other information of the text data are obtained; vector clustering, classification and other methods can then be used to derive the relationships between text sentences, enabling the application of sentence vectors in practical scenarios.
Existing solutions for sentence vector construction mainly include construction methods based on averaged word vectors, such as word2vec, glove and bert, and construction methods based on contrastive learning, which construct positive samples for contrastive learning in different ways, such as dropout, replacement, deletion and back-translation. The inventor realized that the shortcomings of the existing solutions are: 1) the construction method based on averaged word vectors destroys the dependencies between words in a sentence, so the accuracy of feature extraction is low; 2) in the construction method based on contrastive learning, although there are many ways to obtain positive samples, the similarity between randomly selected negative samples and the original sentences is low, which makes model training too easy; the model's transfer ability in actual tasks is therefore insufficient, which in turn leads to low accuracy of the generated sentence vectors.
Summary of the invention
In view of this, this application provides a sentence vector generation method, apparatus, computer device and storage medium, the main purpose of which is to solve the technical problems in the prior art that the construction method based on averaged word vectors has low accuracy in sentence feature extraction, and that the construction method based on contrastive learning leaves the model with insufficient transfer ability in actual tasks, resulting in low accuracy of the generated sentence vectors.
According to one aspect of the present application, a sentence vector generation method is provided. The method includes:
performing semantic segmentation on the obtained initial sentence text to obtain segmented sentence text;
using a pre-built sentence vector generation model, obtaining a vector representation of the sentence text through encoding processing used for predicting the context of the sentence text, the sentence vector generation model being the encoding layer of a trained sequence-to-sequence model;
wherein the trained sequence-to-sequence model is obtained through the following steps:
using the initial sequence-to-sequence model, performing encoding processing and context decoding processing on the current sentence in the context sentence pair sequences of the constructed sentence sample set to obtain the preceding predicted sentence and the following predicted sentence of the current sentence;
obtaining the trained sequence-to-sequence model based on the preceding predicted sentence and the following predicted sentence.
According to another aspect of the present application, a sentence vector generation apparatus is provided. The apparatus includes:
a model training module, which can be used to: use the initial sequence-to-sequence model to perform encoding processing and context decoding processing on the current sentence in the context sentence pair sequences of the constructed sentence sample set, obtaining the preceding predicted sentence and the following predicted sentence of the current sentence; and, based on the preceding predicted sentence and the following predicted sentence, obtain the trained sequence-to-sequence model;
a preprocessing module, used to perform semantic segmentation on the obtained initial sentence text to obtain segmented sentence text;
an encoding module, used to utilize a pre-built sentence vector generation model to obtain a vector representation of the sentence text through encoding processing used for predicting the context of the sentence text, the sentence vector generation model being the encoding layer of a trained sequence-to-sequence model.
According to yet another aspect of the present application, a storage medium is provided, on which a computer program is stored. When the program is executed by a processor, the above sentence vector generation method is implemented, including:
performing semantic segmentation on the obtained initial sentence text to obtain segmented sentence text;
using a pre-built sentence vector generation model, obtaining a vector representation of the sentence text through encoding processing used for predicting the context of the sentence text, the sentence vector generation model being the encoding layer of a trained sequence-to-sequence model;
wherein the trained sequence-to-sequence model is obtained through the following steps:
using the initial sequence-to-sequence model, performing encoding processing and context decoding processing on the current sentence in the context sentence pair sequences of the constructed sentence sample set to obtain the preceding predicted sentence and the following predicted sentence of the current sentence;
obtaining the trained sequence-to-sequence model based on the preceding predicted sentence and the following predicted sentence.
According to a further aspect of the present application, a computer device is provided, including a storage medium, a processor, and a computer program stored on the storage medium and executable on the processor. When the processor executes the program, the above sentence vector generation method is implemented, including:
performing semantic segmentation on the obtained initial sentence text to obtain segmented sentence text;
using a pre-built sentence vector generation model, obtaining a vector representation of the sentence text through encoding processing used for predicting the context of the sentence text, the sentence vector generation model being the encoding layer of a trained sequence-to-sequence model;
wherein the trained sequence-to-sequence model is obtained through the following steps:
using the initial sequence-to-sequence model, performing encoding processing and context decoding processing on the current sentence in the context sentence pair sequences of the constructed sentence sample set to obtain the preceding predicted sentence and the following predicted sentence of the current sentence;
obtaining the trained sequence-to-sequence model based on the preceding predicted sentence and the following predicted sentence.
With the above technical solution, performing sequence-to-sequence model training on context sentence pair sequences and using the encoding layer of the trained sequence-to-sequence model to generate sentence vectors can, on the basis of raising the difficulty of model training, effectively improve the accuracy of sentence vector generation and ensure the integrity of the semantic information and grammatical information of the generated sentence vectors. This effectively avoids the technical problems that the existing construction method based on averaged word vectors destroys the dependencies between words in a sentence, resulting in low accuracy of sentence feature extraction, and that in the construction method based on contrastive learning, the training difficulty of the model is low, the transfer ability of the model in actual tasks is insufficient, and the accuracy of the generated sentence vectors is low.
The above description is only an overview of the technical solutions of the present application. In order that the technical means of the present application may be understood more clearly and implemented according to the content of the description, and in order to make the above and other objects, features and advantages of the present application more obvious and understandable, specific implementations of the present application are set out below.
Description of the drawings
The drawings described here are used to provide a further understanding of the present application and constitute a part of the present application. The illustrative embodiments of the present application and their descriptions are used to explain the present application and do not constitute an improper limitation of the present application. In the drawings:
Figure 1 shows a schematic flowchart of a sentence vector generation method provided by an embodiment of the present application;
Figure 2 shows a schematic flowchart of another sentence vector generation method provided by an embodiment of the present application;
Figure 3 shows a schematic diagram of the initial sequence-to-sequence model architecture provided by an embodiment of the present application;
Figure 4 shows a schematic structural diagram of a sentence vector generation device provided by an embodiment of the present application;
Figure 5 shows a schematic structural diagram of another sentence vector generation device provided by an embodiment of the present application.
Detailed description
The present application will be described in detail below with reference to the accompanying drawings and embodiments. It should be noted that, as long as there is no conflict, the embodiments of this application and the features in the embodiments can be combined with each other.
The embodiments of this application can acquire and process relevant data based on artificial intelligence technology. Artificial intelligence (AI) is the theory, method, technology and application system that uses digital computers or digital-computer-controlled machines to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results.
Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technology, operation/interaction systems, mechatronics and other technologies. Artificial intelligence software technology mainly includes several major directions: computer vision technology, robotics, biometrics, speech processing technology, natural language processing technology, and machine learning/deep learning.
In view of the technical problems in the prior art that the construction method based on averaged word vectors has low accuracy in sentence feature extraction, and that the construction method based on contrastive learning leaves the model with insufficient transfer ability in actual tasks and produces sentence vectors of low accuracy, this embodiment provides a sentence vector generation method, as shown in Figure 1. The method is described as applied to a computer device such as a server, where the server may be an independent server or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery networks (CDN), and big data and artificial intelligence platforms. The above method includes the following steps:
Step 101: Perform semantic segmentation on the obtained initial sentence text to obtain segmented sentence text.
In this embodiment, taking a book recommendation scenario as an example, the method is suitable for recommending other similar books based on the obtained book text content. Specifically, when a book recommendation request is received, the book text content corresponding to the book title in the request is obtained; the book text content is split into sentences based on Chinese punctuation, and multiple sentence texts to be input into the sentence vector generation model are obtained through text segmentation. Depending on the needs of the actual application scenario, the book text content can be book abstract text, book introduction text, etc., which is not specifically limited here.
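The punctuation-based sentence splitting in step 101 could be sketched as follows. The exact punctuation set is an assumption — the patent only states that the book text is split based on Chinese punctuation:

```python
import re

def split_sentences(book_text: str) -> list:
    """Split book text content into candidate sentence texts at Chinese
    end-of-sentence punctuation (。！？；), keeping each delimiter attached
    to the sentence it terminates via a zero-width lookbehind split."""
    parts = re.split(r"(?<=[。！？；])", book_text)
    return [p.strip() for p in parts if p.strip()]
```

Each returned sentence text would then be fed to the sentence vector generation model individually.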
Step 102: Use a pre-built sentence vector generation model to obtain a vector representation of the sentence text through encoding processing used for predicting the context of the sentence text, the sentence vector generation model being the encoding layer of a trained sequence-to-sequence model. The trained sequence-to-sequence model is obtained through the following steps: using the initial sequence-to-sequence model, performing encoding processing and context decoding processing on the current sentence in the context sentence pair sequences of the constructed sentence sample set to obtain the preceding predicted sentence and the following predicted sentence of the current sentence; and obtaining the trained sequence-to-sequence model based on the preceding predicted sentence and the following predicted sentence.
In this embodiment, the initial sequence-to-sequence model is trained on a constructed sentence sample set containing context sentence pair sequences, where a context sentence pair sequence includes a current sentence and the context sentences corresponding to it. The current sentence is input into the encoding layer of the initial sequence-to-sequence model for encoding, yielding a vector representation containing the contextual feature information of the current sentence. This vector representation is then fed into the two decoding layers set up in parallel in the initial sequence-to-sequence model, and the preceding predicted sentence and the following predicted sentence of the current sentence are obtained through decoding. Further, by using the preceding sentence and the following sentence of the current sentence in the context sentence pair sequence as the training targets for the preceding predicted sentence and the following predicted sentence, the trained sequence-to-sequence model is obtained. The encoding layer of the trained sequence-to-sequence model thus has the encoding ability to accurately predict the context of the current sentence and can retain the integrity of the semantic and grammatical information of that context; on this basis, the output vector representation can contain the complete contextual feature information of the current sentence, which in turn ensures the accuracy of subsequent book recommendations.
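The encoder-plus-two-parallel-decoders structure described above resembles a skip-thought-style architecture. The sketch below is a minimal illustration under assumptions: the class names, dimensions and random initialization are hypothetical, and the decoders return hidden states rather than word distributions (the output projection, softmax and training loop are omitted):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUCell:
    """Minimal GRU cell with a reset gate, an update gate and a candidate memory unit."""
    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = np.random.default_rng(seed)
        def w(rows, cols):
            return rng.standard_normal((rows, cols)) * 0.1
        self.Wr, self.Ur = w(hidden_dim, input_dim), w(hidden_dim, hidden_dim)
        self.Wz, self.Uz = w(hidden_dim, input_dim), w(hidden_dim, hidden_dim)
        self.Wh, self.Uh = w(hidden_dim, input_dim), w(hidden_dim, hidden_dim)

    def step(self, x, h):
        r = sigmoid(self.Wr @ x + self.Ur @ h)              # reset gate
        z = sigmoid(self.Wz @ x + self.Uz @ h)              # update gate
        h_cand = np.tanh(self.Wh @ x + self.Uh @ (r * h))   # candidate memory unit
        return (1 - z) * h + z * h_cand

class DualDecoderSeq2Seq:
    """Encoder producing a sentence embedding, plus two GRU decoders set up in
    parallel: one for the preceding context, one for the following context."""
    def __init__(self, emb_dim, hidden_dim):
        self.encoder = GRUCell(emb_dim, hidden_dim, seed=1)
        self.decoder_pre = GRUCell(hidden_dim, hidden_dim, seed=2)  # first decoding layer
        self.decoder_fol = GRUCell(hidden_dim, hidden_dim, seed=3)  # second decoding layer
        self.hidden_dim = hidden_dim

    def encode(self, word_vectors):
        # Run the encoder over the word vectors of the current sentence;
        # the final hidden state serves as the sentence embedding vector.
        h = np.zeros(self.hidden_dim)
        for x in word_vectors:
            h = self.encoder.step(x, h)
        return h

    def decode_context(self, sentence_emb, steps=3):
        # Both decoders take the sentence embedding as input at every step,
        # mirroring its role as input to the gates of the GRU decoders.
        h_pre = np.zeros(self.hidden_dim)
        h_fol = np.zeros(self.hidden_dim)
        pre_states, fol_states = [], []
        for _ in range(steps):
            h_pre = self.decoder_pre.step(sentence_emb, h_pre)
            h_fol = self.decoder_fol.step(sentence_emb, h_fol)
            pre_states.append(h_pre)
            fol_states.append(h_fol)
        return pre_states, fol_states
```

After training, only `encode` would be kept as the sentence vector generation model; both decoders exist solely to shape the embedding during training.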
其中,将基于当前句子及其上下文句子构建的上下文句子对序列作为初始序列到序列模型的输入数据,能够不破坏文本数据的整体结构,保留词语之间相互依赖,相互影响的 文本特征,从而保证模型能够学习到句子文本蕴含的完整语义信息和语法信息,提升模型对上下文句子特征提取的准确性。Among them, the context sentence pair sequence constructed based on the current sentence and its context sentences is used as the input data of the initial sequence-to-sequence model, which can retain the interdependence and mutual influence between words without destroying the overall structure of the text data, thereby ensuring The model can learn the complete semantic information and grammatical information contained in the sentence text, improving the accuracy of the model in extracting contextual sentence features.
对于本实施例可以按照上述方案,对获取到的初始句子文本进行语义分割,得到分割后的句子文本,并利用预先构建的句子向量生成模型,通过用于预测所述句子文本上下文的编码处理,得到所述句子文本的向量表示,所述句子向量生成模型为训练好的序列到序列模型的编码层;其中,所述训练好的序列到序列模型通过下述步骤得到:利用初始序列到序列模型,对构建的句子样本集中的上下文句子对序列中的当前句子进行编码处理和上下文解码处理,得到所述当前句子的上文预测句子和下文预测句子;根据上文预测句子和下文预测句子,得到训练好的序列到序列模型。与现有基于词向量平均值的构造、基于对比学习的构造等句子向量生成方案相比,本实施例利用上下文句子对序列进行序列到序列模型训练,利用训练好的序列到序列模型的编码层生成的句子文本的句子向量,能够保证句子文本语义信息和语法信息的完整性,从而有效提升句子向量生成的准确性。For this embodiment, the obtained initial sentence text can be semantically segmented according to the above solution to obtain the segmented sentence text, and a pre-built sentence vector generation model can be used to predict the context of the sentence text through coding processing. Obtain the vector representation of the sentence text, and the sentence vector generation model is the encoding layer of the trained sequence-to-sequence model; wherein the trained sequence-to-sequence model is obtained through the following steps: using the initial sequence-to-sequence model , perform coding processing and context decoding processing on the context sentences in the constructed sentence sample set to the current sentence in the sequence, and obtain the upper prediction sentence and the lower prediction sentence of the current sentence; according to the upper prediction sentence and the lower prediction sentence, we get Trained sequence-to-sequence model. Compared with existing sentence vector generation solutions such as the construction based on word vector average and the construction based on contrastive learning, this embodiment uses context sentences to perform sequence-to-sequence model training on sequences, and uses the coding layer of the trained sequence-to-sequence model The generated sentence vector of the sentence text can ensure the integrity of the semantic information and grammatical information of the sentence text, thereby effectively improving the accuracy of sentence vector generation.
Further, as a refinement and extension of the specific implementation of the above embodiment, and in order to fully describe the implementation process of this embodiment, another sentence vector generation method is provided. As shown in Figure 2, the method includes:
Step 201: Using the initial sequence-to-sequence model, encode and context-decode the current sentence in each context sentence pair sequence of the constructed sentence sample set to obtain the predicted preceding sentence and predicted following sentence of the current sentence.
Here, a context sentence pair sequence specifically includes: the current sentence, which is input to the encoding layer of the initial sequence-to-sequence model for context sentence prediction; and the preceding target sentence and following target sentence, which are used to supervise the output results of the initial sequence-to-sequence model, the output results being the predicted preceding sentence and predicted following sentence produced during model training.
To illustrate a specific implementation of step 201, as a preferred embodiment, step 201 may specifically include: performing word segmentation on the context sentence pair sequence with a word segmentation tool to obtain the segmented context sentence pair sequence; feeding the current sentence of the segmented context sentence pair sequence to the encoding layer of the initial sequence-to-sequence model to obtain the sentence embedding vector of the current sentence; and, from the sentence embedding vector of the current sentence, obtaining the predicted preceding sentence and the predicted following sentence with two decoding layers arranged in parallel in the initial sequence-to-sequence model, where the two decoding layers are a first decoding layer for predicting the preceding sentence and a second decoding layer for predicting the following sentence.
To illustrate step 201 further, as another preferred embodiment, the first decoding layer for predicting the preceding sentence is a first GRU model and the second decoding layer for predicting the following sentence is a second GRU model. The step of obtaining the predicted preceding sentence and predicted following sentence from the sentence embedding vector of the current sentence with the two parallel decoding layers specifically includes: feeding the sentence embedding vector of the current sentence to the reset gate, update gate, and candidate memory unit of the first GRU model and decoding to obtain the predicted preceding sentence of the current sentence; and feeding the sentence embedding vector of the current sentence to the second GRU model as input and decoding to obtain the predicted following sentence of the current sentence.
In implementation, before the step of obtaining the predicted preceding and following sentences of the current sentence of a context sentence pair sequence with the initial sequence-to-sequence model, the method further includes constructing the sentence sample set, which consists of context sentence pair sequences. The specific steps are:
1) Randomly select an arbitrary book text and split it into sentences based on Chinese punctuation, obtaining a book text D = [S_1, S_2, S_3, S_4, S_5, …, S_i, …, S_n], where S_i denotes the i-th sentence of book text D and n denotes the number of sentences obtained by splitting D. For example, given a book text collection of 3727 books together with the full text of each book, an arbitrary book text is selected at random and all of its text content is split into sentences.
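The sentence-splitting step can be sketched as follows; this is a minimal illustration, and the exact punctuation rule is an assumption, since the embodiment only states that splitting is based on Chinese punctuation:

```python
import re

def split_sentences(text):
    # Split after Chinese sentence-final punctuation (。！？), keeping the
    # delimiter attached to its sentence via a zero-width lookbehind.
    parts = re.split(r'(?<=[。！？])', text)
    return [s for s in parts if s.strip()]

D = split_sentences("第一句。第二句！第三句？第四句。")
```

Each element of `D` then plays the role of one sentence S_i of the book text.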
2) Construct context sentence pair sequences (sentence pairs) from the book text D, that is, traverse each sentence of D to build the context sentence pair sequences, yielding the sentence sample set G. The context sentence pair sequences are (S_1, S_2, S_3), (S_2, S_3, S_4), (S_3, S_4, S_5), …, (S_{i-1}, S_i, S_{i+1}), …, (S_{n-2}, S_{n-1}, S_n), where S_i denotes the current sentence, S_{i-1} denotes the preceding target sentence adjacent to S_i, and S_{i+1} denotes the following target sentence adjacent to S_i.
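The traversal of D into context sentence pair sequences can be sketched as a sliding window of width three (the helper name is illustrative):

```python
def build_sentence_pairs(sentences):
    # Each training example is (preceding sentence, current sentence,
    # following sentence), produced by sliding a window of size 3 over D.
    return [(sentences[i - 1], sentences[i], sentences[i + 1])
            for i in range(1, len(sentences) - 1)]

G = build_sentence_pairs(["S1", "S2", "S3", "S4", "S5"])
```

A document of n sentences thus yields n − 2 context sentence pair sequences.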
In implementation, the encoding layer (Encoder) of the initial sequence-to-sequence model outputs the sentence embedding vector h_s of the current sentence, which is fed simultaneously to the first decoding layer (pre-Decoder), used to predict the preceding sentence sequence, and to the second decoding layer (next-Decoder), used to predict the following sentence sequence; the pre-Decoder and next-Decoder then yield the predicted preceding sentence and predicted following sentence of the current sentence, respectively. As shown in Figure 3, the specific steps include:
1) Use a word segmentation tool (the HIT LTP model) to segment the sentences in every context sentence pair sequence of the sentence sample set G; a segmented sentence is represented as S_i = [t_1, t_2, …, t_p, …, t_l], where t_p denotes the p-th token of S_i and l denotes the number of tokens obtained after segmenting S_i.
2) Build the initial sequence-to-sequence model on the encoder-decoder architecture. The initial sequence-to-sequence model comprises one encoding layer and two decoding layers, and the base model of both the encoding layer and the decoding layers is the gated recurrent unit (GRU).
3) Feed the segmented sentence sample set G to the initial sequence-to-sequence model: the current sentence of each sentence pair sequence is input to the encoding layer (Encoder), which encodes it into the sentence embedding vector h_s of the current sentence; the first decoding layer (pre-Decoder) and the second decoding layer (next-Decoder) then decode h_s simultaneously, yielding the predicted preceding sentence and the predicted following sentence of the current sentence, respectively. Specifically:
① The current sentence of a context sentence pair sequence serves as the input to the encoding layer (Encoder) of the initial sequence-to-sequence model. Taking (S_{i-1}, S_i, S_{i+1}) as an example, the segmented sentence S_i = [t_1, t_2, …, t_p, …, t_l] is fed to the Encoder, which encodes it into the sentence embedding vector h_s of S_i.
② The sentence embedding vector h_s serves as the input to the first decoding layer (pre-Decoder, decoding the preceding sentence), which decodes it into the predicted preceding sentence Y_{i-1} of the current sentence. Because predicting the preceding sentence Y_{i-1} from the sentence embedding vector h_s of the current sentence S_i runs against the natural top-down order of language, the first decoding layer is harder to train than the second decoding layer (next-Decoder, decoding the following sentence). The GRU model architecture is therefore improved so as to raise the accuracy of preceding-sentence prediction while preserving training efficiency and preventing vanishing gradients. Specifically, the sentence embedding vector h_s of the current sentence is added, with corresponding parameters, to the inputs of the update gate, reset gate, and candidate memory unit of the first decoding layer, ensuring that during token-by-token generation the GRU model at every time step can draw on the sentence embedding vector h_s of the current sentence S_i. The formulas are as follows:
z_t = σ(W_z x_t + U_z h_{t-1} + V_z h_s)

r_t = σ(W_r x_t + U_r h_{t-1} + V_r h_s)

k_t = tanh(W_k x_t + U_k (r_t ⊙ h_{t-1}) + V_k h_s)

h_t = (1 − z_t) ⊙ h_{t-1} + z_t ⊙ k_t

where z_t denotes the update gate of the GRU model; W_z and U_z are the update-gate parameters of the original GRU model; x_t denotes the input vector at the current time step t; h_{t-1} denotes the vector passed from the previous time step t−1 to the current time step t; and V_z denotes the parameter set for the sentence embedding vector h_s. Likewise, the reset gate r_t and the candidate memory unit k_t of the GRU model both incorporate the sentence embedding h_s: W_r, U_r, and V_r denote the parameters of the reset gate; tanh denotes the activation function; W_k, U_k, and V_k denote the parameters of the candidate memory unit; h_t denotes the output vector at the current time step t; σ denotes a fully connected layer with an activation function; and ⊙ denotes element-wise multiplication of vectors.
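A single decoding step of this modified GRU can be sketched in plain Python. The scalar dimension, the parameter values, and the (1 − z_t) ⊙ h_{t-1} + z_t ⊙ k_t update convention are illustrative assumptions, not the patent's reference implementation; the point is that the V-terms inject the sentence embedding h_s into every gate:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def modified_gru_step(x_t, h_prev, h_s, p):
    # One (scalar-dimension) decoding step of the pre-Decoder GRU.
    z = sigmoid(p["Wz"] * x_t + p["Uz"] * h_prev + p["Vz"] * h_s)          # update gate z_t
    r = sigmoid(p["Wr"] * x_t + p["Ur"] * h_prev + p["Vr"] * h_s)          # reset gate r_t
    k = math.tanh(p["Wk"] * x_t + p["Uk"] * (r * h_prev) + p["Vk"] * h_s)  # candidate k_t
    return (1.0 - z) * h_prev + z * k                                      # hidden state h_t

params = {name: 0.1 for name in ("Wz", "Uz", "Vz", "Wr", "Ur", "Vr", "Wk", "Uk", "Vk")}
h_t = modified_gru_step(1.0, 0.0, 0.5, params)  # x_t = 1.0, h_{t-1} = 0.0, h_s = 0.5
```

In a real model each parameter would be a weight matrix and the step would run token by token over the decoder's inputs.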
③ In parallel with the first decoding layer, the sentence embedding vector h_s is input to the second decoding layer (next-Decoder), which decodes it into the predicted following sentence Y_{i+1} of the current sentence. Predicting the following sentence from the current sentence matches the natural top-down order of language, so the second decoding layer (next-Decoder) uses the existing GRU model, and the sentence embedding vector h_s serves only as the initial vector of the second decoding layer.
It can be seen that predicting the preceding sentence of the current sentence on the encoder-decoder framework breaks the top-down regularity of natural language and raises the difficulty of model training, so that the model is trained thoroughly and outputs sentence vector representations containing complete semantic and syntactic information. Further, the improvements to the update gate, reset gate, and candidate memory unit of the GRU model raise the difficulty of model training while effectively preserving the model's training efficiency.
Step 202: Train the initial sequence-to-sequence model with a target loss function based on the predicted preceding sentence and predicted following sentence of the current sentence, obtaining the trained sequence-to-sequence model. The target loss function is determined as the sum of a first loss function and a second loss function, where the first loss function in the target loss function is set based on the first decoding layer, which predicts the preceding sentence, and the second loss function in the target loss function is set based on the second decoding layer, which predicts the following sentence.
In implementation, the network parameters of the initialized sequence-to-sequence model are trained with the target loss function based on the preceding target sentence S_{i-1}, the following target sentence S_{i+1}, the predicted preceding sentence Y_{i-1}, and the predicted following sentence Y_{i+1}, until the initialized sequence-to-sequence model converges, yielding the trained sequence-to-sequence model. Specifically, the cross-entropy loss function is used as the basic loss function:

CE(S, Y) = −Σ_{j=1}^{l} log p(y_j = t_j)

where CE denotes the cross-entropy loss function, S denotes the current sentence, Y denotes the predicted sentence generated by the decoding layer (Decoder), l denotes the number of tokens obtained by segmenting the current sentence S, t_j denotes the j-th token of the segmented current sentence S, and y_j denotes the j-th token of the predicted sentence Y.
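The cross-entropy term can be illustrated with a toy decoder whose output at each step is a probability distribution over a small vocabulary; the tokens and probabilities below are invented for illustration:

```python
import math

def cross_entropy(target_tokens, step_distributions):
    # CE(S, Y): negative log-probability the decoder assigns to each target
    # token t_j, summed over the l tokens of the sentence.
    return -sum(math.log(dist[t]) for t, dist in zip(target_tokens, step_distributions))

# Toy three-token vocabulary; each dict is the decoder's distribution at step j.
step_distributions = [{"a": 0.7, "b": 0.2, "c": 0.1},
                      {"a": 0.1, "b": 0.8, "c": 0.1}]
loss = cross_entropy(["a", "b"], step_distributions)
```

The loss is smallest when the decoder concentrates probability mass on the target token at every step.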
Further, based on the first decoding layer (pre-Decoder) and the second decoding layer (next-Decoder), which output the predicted preceding sentence and the predicted following sentence respectively, the corresponding preceding-sentence loss function (the first loss function) and following-sentence loss function (the second loss function) are determined, giving the target loss function of the initialized sequence-to-sequence model as the sum of the preceding-sentence loss function and the following-sentence loss function:

Loss = CE(S_{i-1}, Y_{i-1}) + CE(S_{i+1}, Y_{i+1})

where CE(S_{i-1}, Y_{i-1}) denotes the preceding-sentence loss function pre-loss and CE(S_{i+1}, Y_{i+1}) denotes the following-sentence loss function next-loss.
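The combined objective can be sketched by evaluating the same token-level cross-entropy for each decoding layer and summing the two; the probability tables are invented for illustration:

```python
import math

def ce(targets, step_distributions):
    # Token-level cross-entropy, as the basic loss function above.
    return -sum(math.log(dist[t]) for t, dist in zip(targets, step_distributions))

# Hypothetical per-step output distributions of the two decoding layers.
pre_dists  = [{"a": 0.6, "b": 0.4}]   # pre-Decoder, predicting S_{i-1}
next_dists = [{"a": 0.3, "b": 0.7}]   # next-Decoder, predicting S_{i+1}
pre_loss  = ce(["a"], pre_dists)      # first loss function (pre-loss)
next_loss = ce(["b"], next_dists)     # second loss function (next-loss)
loss = pre_loss + next_loss           # target loss = pre-loss + next-loss
```

Both decoders are thus trained jointly: gradients from each term flow back through the shared encoder.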
According to the needs of the actual application scenario, the initialized sequence-to-sequence model is trained with a batch size of 128, 50 epochs, and a learning rate lr of 0.005, until the value of its target loss function stabilizes; training then ends and the trained sequence-to-sequence model is obtained.
Step 203: Perform semantic segmentation on the obtained initial sentence text to obtain segmented sentence text.
Step 204: Use the pre-built sentence vector generation model to obtain the vector representation of the sentence text through encoding processing used to predict the context of the sentence text, the sentence vector generation model being the encoding layer of the trained sequence-to-sequence model.
In implementation, the encoding layer of the trained sequence-to-sequence model is extracted as the sentence vector generation model, so that after a book recommendation request is received, the synopsis text corresponding to the book title in the request is obtained, split into sentences based on Chinese punctuation, and segmented into words with the HIT LTP model to obtain the segmented sentence text; the sentence vector generation model then encodes the sentence text to obtain its vector representation.
Step 205: Calculate similarity values between the vector representation of the sentence text and the sentence embedding vectors in a preset book sample library, where the sentence embedding vectors in the preset book sample library are output by the sentence vector generation model.
In implementation, for the synopsis text of every book in the initial book sample library, the sentence vector generation model outputs the sentence embedding vector of the corresponding synopsis text, and the preset book sample library is built from these output sentence embedding vectors; the cosine similarity algorithm then computes the similarity value between the sentence vector output for the book recommendation request and the sentence embedding vector corresponding to each book in the preset book sample library.
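The cosine-similarity ranking step can be sketched as follows; the query vector, book names, and library embeddings are hypothetical two-dimensional stand-ins for the model's output vectors:

```python
import math

def cosine(u, v):
    # Cosine similarity: dot product divided by the product of the norms.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

query = [1.0, 0.0]  # sentence vector generated for the recommendation request
library = {"book_a": [0.9, 0.1],   # preset book sample library: title -> embedding
           "book_b": [0.0, 1.0],
           "book_c": [0.7, 0.7]}
ranked = sorted(library, key=lambda k: cosine(query, library[k]), reverse=True)
```

The books at the top of `ranked` are the candidates whose similarity values satisfy the preset condition.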
Step 206: Generate book recommendation information for the sentence text based on the sentence embedding vectors in the preset book sample library whose similarity values satisfy a preset condition.
In implementation, when a user browses a book on the platform, that book is taken as the target book and a book recommendation request containing the target book's title is generated. From the synopsis text corresponding to the target book's title, the sentence vector generation model generates the corresponding sentence vector; the similarity values between the generated sentence vector and each sentence embedding vector in the platform's preset book sample library are then computed and sorted in descending order, so that the book information corresponding to the sentence embedding vectors whose similarity values satisfy the preset condition is recommended to the user as similar books. Experiments show that, according to online A/B test results, the user click-through rate obtained with this embodiment is effectively improved by 2.31%.
By applying the technical solution of this embodiment, the obtained initial sentence text is semantically segmented to obtain segmented sentence text, and the pre-built sentence vector generation model — the encoding layer of a trained sequence-to-sequence model — obtains the vector representation of the sentence text through encoding processing used to predict the context of the sentence text. The trained sequence-to-sequence model is obtained through the following steps: using the initial sequence-to-sequence model, the current sentence in each context sentence pair sequence of the constructed sentence sample set is encoded and context-decoded to obtain the predicted preceding sentence and predicted following sentence of the current sentence; the trained sequence-to-sequence model is then obtained from these predictions. It can be seen that training a sequence-to-sequence model on context sentence pair sequences and generating sentence vectors with the encoding layer of the trained model raises the difficulty of model training while effectively improving the accuracy of sentence vector generation and preserving the integrity of the semantic and syntactic information of the generated sentence vectors. This effectively avoids the technical problems of existing methods: construction from averaged word vectors destroys the dependencies between the words of a sentence, so sentence features are extracted with low accuracy, while construction based on contrastive learning makes model training insufficiently difficult, leaves the model with inadequate transfer ability in practical tasks, and yields sentence vectors of low accuracy.
Further, as a specific implementation of the method of Figure 1, an embodiment of the present application provides a sentence vector generation apparatus. As shown in Figure 4, the apparatus includes a model training module 41, a preprocessing module 42, and an encoding module 43.
The model training module 41 may be configured to use the initial sequence-to-sequence model to encode and context-decode the current sentence in each context sentence pair sequence of the constructed sentence sample set, obtaining the predicted preceding sentence and predicted following sentence of the current sentence, and to obtain the trained sequence-to-sequence model from the predicted preceding and following sentences.
The preprocessing module 42 may be configured to perform semantic segmentation on the obtained initial sentence text to obtain segmented sentence text.
The encoding module 43 may be configured to use the pre-built sentence vector generation model to obtain the vector representation of the sentence text through encoding processing used to predict the context of the sentence text, the sentence vector generation model being the encoding layer of the trained sequence-to-sequence model.
In a specific application scenario, as shown in Figure 5, the apparatus further includes a book recommendation module 44.
In a specific application scenario, the model training module 41 includes a training unit 411.
The training unit 411 may be configured to train the initial sequence-to-sequence model with the target loss function based on the predicted preceding sentence and predicted following sentence of the current sentence, obtaining the trained sequence-to-sequence model, where the target loss function is determined as the sum of the first loss function and the second loss function.
In a specific application scenario, a context sentence pair sequence specifically includes: the current sentence, input to the encoding layer of the initial sequence-to-sequence model for context sentence prediction; and the preceding target sentence and following target sentence used to supervise the output results of the initial sequence-to-sequence model, the output results being the predicted preceding sentence and predicted following sentence produced during model training.
In a specific application scenario, the model training module 41 may specifically be configured to: perform word segmentation on the context sentence pair sequence with a word segmentation tool to obtain the segmented context sentence pair sequence; feed the current sentence of the segmented context sentence pair sequence to the encoding layer of the initial sequence-to-sequence model to obtain the sentence embedding vector of the current sentence; and, from the sentence embedding vector of the current sentence, obtain the predicted preceding sentence and predicted following sentence with the two decoding layers arranged in parallel in the initial sequence-to-sequence model, where the two decoding layers are the first decoding layer for predicting the preceding sentence and the second decoding layer for predicting the following sentence.
In a specific application scenario, the first decoding layer for predicting the preceding sentence is a first GRU model and the second decoding layer for predicting the following sentence is a second GRU model. The step of obtaining the predicted preceding and following sentences from the sentence embedding vector of the current sentence with the two parallel decoding layers specifically includes: feeding the sentence embedding vector of the current sentence to the reset gate, update gate, and candidate memory unit of the first GRU model and decoding to obtain the predicted preceding sentence of the current sentence; and feeding the sentence embedding vector of the current sentence to the second GRU model as input and decoding to obtain the predicted following sentence of the current sentence.
In a specific application scenario, the first loss function in the target loss function is set based on the first decoding layer, which predicts the preceding sentence, and the second loss function in the target loss function is set based on the second decoding layer, which predicts the following sentence.
In a specific application scenario, the book recommendation module 44 includes a similarity calculation unit 441 and a generation unit 442.
The similarity calculation unit 441 may be configured to calculate the similarity values between the vector representation of the sentence text and the sentence embedding vectors in the preset book sample library.
The generation unit 442 may be configured to generate book recommendation information for the sentence text based on the sentence embedding vectors in the preset book sample library whose similarity values satisfy the preset condition, where the sentence embedding vectors in the preset book sample library are output by the sentence vector generation model.
It should be noted that, for other corresponding descriptions of the functional units involved in the sentence vector generation apparatus provided by this embodiment of the present application, reference may be made to the corresponding descriptions of Figures 1 and 2, which are not repeated here.
Based on the methods shown in Figures 1 and 2 above, an embodiment of the present application correspondingly further provides a storage medium storing a computer program which, when executed by a processor, implements the sentence vector generation method of Figures 1 and 2, including:
performing semantic segmentation on the obtained initial sentence text to obtain segmented sentence text;
using a pre-built sentence vector generation model to obtain the vector representation of the sentence text through encoding processing used to predict the context of the sentence text, the sentence vector generation model being the encoding layer of a trained sequence-to-sequence model;
wherein the trained sequence-to-sequence model is obtained through the following steps:
using the initial sequence-to-sequence model, encoding and context-decoding the current sentence in each context sentence pair sequence of the constructed sentence sample set to obtain the predicted preceding sentence and predicted following sentence of the current sentence;
根据上文预测句子和下文预测句子,得到训练好的序列到序列模型。Based on the above predicted sentences and the following predicted sentences, the trained sequence-to-sequence model is obtained.
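The training procedure recited above — one shared encoding layer feeding two parallel decoding layers whose losses are summed — can be sketched structurally as follows. Every component here is a toy stand-in (a sum-of-embeddings encoder, squared-error decoder losses, invented parameter names); the patent's actual model uses GRU decoding layers with learned parameters:

```python
def encode(tokens, params):
    # Stand-in encoding layer: sums per-token embedding vectors.
    dim = 2
    vec = [0.0] * dim
    for tok in tokens:
        for i, v in enumerate(params["emb"].get(tok, [0.0] * dim)):
            vec[i] += v
    return vec

def decoder_loss(sentence_vec, target_tokens, params, head):
    # Stand-in for one decoding layer's loss: the "w_prev" head supervises
    # the predicted preceding sentence, "w_next" the predicted following one.
    target = encode(target_tokens, params)
    w = params[head]
    return sum((w * s - t) ** 2 for s, t in zip(sentence_vec, target))

def training_step(prev_tokens, cur_tokens, next_tokens, params):
    vec = encode(cur_tokens, params)                              # encode current sentence
    loss_prev = decoder_loss(vec, prev_tokens, params, "w_prev")  # first decoding layer
    loss_next = decoder_loss(vec, next_tokens, params, "w_next")  # second decoding layer
    return loss_prev + loss_next                                  # target loss = first + second

params = {"emb": {"a": [1.0, 0.0], "b": [0.0, 1.0]}, "w_prev": 1.0, "w_next": 0.5}
total = training_step(["a"], ["a"], ["b"], params)
```

After training, only `encode` (the encoding layer) would be kept as the sentence vector generation model; the two decoder heads exist solely to supply the training signal.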
Optionally, the step of obtaining the trained sequence-to-sequence model based on the predicted preceding sentence and the predicted following sentence specifically includes:
training the initial sequence-to-sequence model with a target loss function based on the predicted preceding sentence and the predicted following sentence of the current sentence, to obtain the trained sequence-to-sequence model;
wherein the target loss function is determined as the sum of a first loss function and a second loss function.
Optionally, the context sentence pair sequence specifically includes:
the current sentence, which is input to the encoding layer of the initial sequence-to-sequence model for context sentence prediction;
and a preceding target sentence and a following target sentence, which are used to supervise the output of the initial sequence-to-sequence model, the output being the predicted preceding sentence and predicted following sentence produced during model training.
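The context sentence pair sequence described above — a current sentence serving as encoder input, flanked by the preceding and following target sentences serving as supervision — can be built from an ordered list of document sentences with a simple sliding window, as in this illustrative sketch (the function name and tuple layout are assumptions, not taken from the patent):

```python
def build_context_pairs(sentences):
    # For each interior sentence, form (preceding target, current sentence,
    # following target): the current sentence is fed to the encoding layer,
    # and the two targets supervise the two decoding layers' outputs.
    return [(sentences[i - 1], sentences[i], sentences[i + 1])
            for i in range(1, len(sentences) - 1)]

doc = ["S1.", "S2.", "S3.", "S4."]
pairs = build_context_pairs(doc)
```

A document of n sentences yields n - 2 training triples; documents shorter than three sentences contribute none.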
Optionally, the storage medium is a computer-readable storage medium, which may be non-volatile or volatile.
Based on this understanding, the technical solution of the present application may be embodied in the form of a software product. The software product may be stored in a storage medium (such as a CD-ROM, USB flash drive, or removable hard disk) and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the implementation scenarios of this application.
Based on the methods shown in Figures 1 and 2 and the virtual apparatus embodiments shown in Figures 4 and 5, to achieve the above objectives, an embodiment of the present application further provides a computer device, which may specifically be a personal computer, a server, a network device, or the like. The physical device includes a storage medium and a processor; the storage medium is used to store a computer program; the processor is used to execute the computer program to implement the sentence vector generation method shown in Figures 1 and 2, including:
performing semantic segmentation on the acquired initial sentence text to obtain segmented sentence text;
using a pre-built sentence vector generation model, obtaining a vector representation of the sentence text through an encoding process used for predicting the context of the sentence text, where the sentence vector generation model is the encoding layer of a trained sequence-to-sequence model;
wherein the trained sequence-to-sequence model is obtained through the following steps:
using an initial sequence-to-sequence model, performing encoding and context decoding on the current sentence in the context sentence pair sequence of a constructed sentence sample set, to obtain a predicted preceding sentence and a predicted following sentence for the current sentence;
obtaining the trained sequence-to-sequence model based on the predicted preceding sentence and the predicted following sentence.
Optionally, the step of obtaining the trained sequence-to-sequence model based on the predicted preceding sentence and the predicted following sentence specifically includes:
training the initial sequence-to-sequence model with a target loss function based on the predicted preceding sentence and the predicted following sentence of the current sentence, to obtain the trained sequence-to-sequence model;
wherein the target loss function is determined as the sum of a first loss function and a second loss function.
Optionally, the context sentence pair sequence specifically includes:
the current sentence, which is input to the encoding layer of the initial sequence-to-sequence model for context sentence prediction;
and a preceding target sentence and a following target sentence, which are used to supervise the output of the initial sequence-to-sequence model, the output being the predicted preceding sentence and predicted following sentence produced during model training.
Optionally, the computer device may further include a user interface, a network interface, a camera, a radio frequency (RF) circuit, sensors, an audio circuit, a Wi-Fi module, and so on. The user interface may include a display and an input unit such as a keyboard, and may optionally also include a USB interface, a card reader interface, and the like. The network interface may optionally include a standard wired interface, a wireless interface (such as a Bluetooth or Wi-Fi interface), and the like.
Those skilled in the art will understand that the computer device structure provided in this embodiment does not limit the physical device, which may include more or fewer components, combine certain components, or arrange the components differently.
The storage medium may further include an operating system and a network communication module. The operating system is a program that manages the hardware and software resources of the computer device and supports the operation of the information processing program and other software and/or programs. The network communication module is used to implement communication between the components within the storage medium, as well as communication with other hardware and software in the physical device.
From the description of the above embodiments, those skilled in the art will clearly understand that the present application may be implemented by software plus a necessary general-purpose hardware platform, or by hardware. Compared with existing sentence vector generation schemes, such as constructions based on averaged word vectors or on contrastive learning, this embodiment trains a sequence-to-sequence model on context sentence pair sequences and generates sentence vectors for sentence text with the encoding layer of the trained model. This preserves the integrity of the semantic and syntactic information of the sentence text and thereby effectively improves the accuracy of sentence vector generation. It thus avoids the technical problems of the existing approaches: construction methods based on averaged word vectors destroy the dependency relations between the words of a sentence, leading to low accuracy of sentence feature extraction, while construction methods based on contrastive learning pose too little training difficulty, so the model transfers poorly to real tasks and the generated sentence vectors are of low accuracy.
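The point about averaged word vectors destroying inter-word dependencies can be made concrete with a toy example: two sentences containing the same words in a different order — and hence with different dependency structure — map to identical averaged vectors. The vocabulary and embeddings below are invented purely for illustration:

```python
def average_word_vectors(tokens, emb):
    # The word-vector-averaging construction criticized above: the mean
    # of per-token embeddings, which is invariant to word order.
    dim = len(next(iter(emb.values())))
    total = [0.0] * dim
    for t in tokens:
        for i, v in enumerate(emb[t]):
            total[i] += v
    return [x / len(tokens) for x in total]

emb = {"dog": [1.0, 0.0], "bites": [0.0, 1.0], "man": [1.0, 1.0]}
s1 = average_word_vectors(["dog", "bites", "man"], emb)
s2 = average_word_vectors(["man", "bites", "dog"], emb)
```

Although "dog bites man" and "man bites dog" mean different things, `s1 == s2`, so any downstream task sees them as the same sentence; an encoder trained to predict context, by contrast, processes the tokens in order.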
Those skilled in the art will understand that the accompanying drawings are merely schematic diagrams of preferred implementation scenarios, and that the modules or processes in the drawings are not necessarily required to implement the present application. Those skilled in the art will also understand that the modules of the apparatus in an implementation scenario may be distributed in the apparatus of that scenario as described, or may be correspondingly changed so as to be located in one or more apparatuses different from that of the present implementation scenario. The modules of the above implementation scenarios may be combined into one module, or further split into multiple sub-modules.
The serial numbers of the present application above are for description only and do not indicate the relative merits of the implementation scenarios. What is disclosed above are only a few specific implementation scenarios of the present application; however, the present application is not limited thereto, and any change conceivable by those skilled in the art shall fall within the protection scope of the present application.

Claims (20)

  1. A sentence vector generation method, comprising:
    performing semantic segmentation on acquired initial sentence text to obtain segmented sentence text;
    using a pre-built sentence vector generation model, obtaining a vector representation of the sentence text through an encoding process used for predicting the context of the sentence text, wherein the sentence vector generation model is the encoding layer of a trained sequence-to-sequence model;
    wherein the trained sequence-to-sequence model is obtained through the following steps:
    using an initial sequence-to-sequence model, performing encoding and context decoding on a current sentence in a context sentence pair sequence of a constructed sentence sample set, to obtain a predicted preceding sentence and a predicted following sentence for the current sentence;
    obtaining the trained sequence-to-sequence model based on the predicted preceding sentence and the predicted following sentence.
  2. The method according to claim 1, wherein the step of obtaining the trained sequence-to-sequence model based on the predicted preceding sentence and the predicted following sentence specifically comprises:
    training the initial sequence-to-sequence model with a target loss function based on the predicted preceding sentence and the predicted following sentence of the current sentence, to obtain the trained sequence-to-sequence model;
    wherein the target loss function is determined as the sum of a first loss function and a second loss function.
  3. The method according to claim 1 or 2, wherein the context sentence pair sequence specifically comprises:
    the current sentence, which is input to the encoding layer of the initial sequence-to-sequence model for context sentence prediction;
    and a preceding target sentence and a following target sentence, which are used to supervise the output of the initial sequence-to-sequence model, the output being the predicted preceding sentence and predicted following sentence produced during model training.
  4. The method according to claim 1, wherein the step of using the initial sequence-to-sequence model to perform encoding and context decoding on the current sentence in the context sentence pair sequence of the constructed sentence sample set, to obtain the predicted preceding sentence and the predicted following sentence for the current sentence, specifically comprises:
    performing word segmentation on the context sentence pair sequence with a word segmentation tool, to obtain a word-segmented context sentence pair sequence;
    obtaining a sentence embedding vector of the current sentence from the current sentence in the word-segmented context sentence pair sequence, using the encoding layer of the initial sequence-to-sequence model;
    obtaining the predicted preceding sentence and the predicted following sentence respectively from the sentence embedding vector of the current sentence, using two decoding layers arranged in parallel in the initial sequence-to-sequence model;
    wherein the two decoding layers are a first decoding layer for predicting the preceding context and a second decoding layer for predicting the following context.
  5. The method according to claim 4, wherein the first decoding layer for predicting the preceding context is a first GRU model, the second decoding layer for predicting the following context is a second GRU model, and the step of obtaining the predicted preceding sentence and the predicted following sentence respectively from the sentence embedding vector of the current sentence, using the two decoding layers arranged in parallel in the initial sequence-to-sequence model, specifically comprises:
    using the sentence embedding vector of the current sentence as the input data of the reset gate, the update gate, and the candidate memory cell in the first GRU model, and obtaining the predicted preceding sentence of the current sentence through decoding;
    using the sentence embedding vector of the current sentence as the input data of the second GRU model, and obtaining the predicted following sentence of the current sentence through decoding.
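By way of illustration, the decoding step recited in claim 5 — feeding an input vector into the reset gate, update gate, and candidate memory cell of a GRU — can be sketched as a single toy, scalar GRU update. The parameter names (w_r, u_r, etc.) and values are illustrative assumptions, not taken from the patent:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(h_prev, x, p):
    # One update of a scalar GRU cell. The input x (standing in for the
    # sentence embedding vector) feeds all three components named in the
    # claim: reset gate r, update gate z, and candidate memory cell h_tilde.
    r = sigmoid(p["w_r"] * x + p["u_r"] * h_prev)                # reset gate
    z = sigmoid(p["w_z"] * x + p["u_z"] * h_prev)                # update gate
    h_tilde = math.tanh(p["w_h"] * x + p["u_h"] * (r * h_prev))  # candidate memory cell
    return (1.0 - z) * h_prev + z * h_tilde                      # new hidden state

p = {"w_r": 0.5, "u_r": 0.1, "w_z": 0.5, "u_z": 0.1, "w_h": 1.0, "u_h": 0.5}
h = 0.0
for x in [1.0, -1.0, 0.5]:   # decoder steps conditioned on the input
    h = gru_step(h, x, p)
```

In the full model, the hidden state at each decoder step would be projected onto the vocabulary to emit the next token of the predicted preceding (first GRU) or following (second GRU) sentence.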
  6. The method according to claim 2 or 4, wherein the first loss function in the target loss function is set based on the first decoding layer for predicting the preceding context, and the second loss function in the target loss function is set based on the second decoding layer for predicting the following context.
  7. The method according to claim 1, wherein after the step of obtaining the vector representation of the sentence text with the sentence vector generation model through the encoding process used for predicting the context of the sentence text, the method further comprises:
    calculating similarity values between the vector representation of the sentence text and sentence embedding vectors in a preset book sample library;
    generating book recommendation information for the sentence text based on the sentence embedding vectors in the preset book sample library whose similarity values satisfy a preset condition;
    wherein the sentence embedding vectors in the preset book sample library are obtained from the output of the sentence vector generation model.
  8. A sentence vector generation apparatus, comprising:
    a model training module, configured to use an initial sequence-to-sequence model to perform encoding and context decoding on a current sentence in a context sentence pair sequence of a constructed sentence sample set, to obtain a predicted preceding sentence and a predicted following sentence for the current sentence; and to obtain a trained sequence-to-sequence model based on the predicted preceding sentence and the predicted following sentence;
    a preprocessing module, configured to perform semantic segmentation on acquired initial sentence text to obtain segmented sentence text;
    an encoding module, configured to use a pre-built sentence vector generation model to obtain a vector representation of the sentence text through an encoding process used for predicting the context of the sentence text, wherein the sentence vector generation model is the encoding layer of the trained sequence-to-sequence model.
  9. The apparatus according to claim 8, wherein the model training module specifically comprises:
    a training unit, configured to train the initial sequence-to-sequence model with a target loss function based on the predicted preceding sentence and the predicted following sentence of the current sentence, to obtain the trained sequence-to-sequence model;
    wherein the target loss function is determined as the sum of a first loss function and a second loss function.
  10. The apparatus according to claim 8 or 9, wherein the context sentence pair sequence specifically comprises:
    the current sentence, which is input to the encoding layer of the initial sequence-to-sequence model for context sentence prediction;
    and a preceding target sentence and a following target sentence, which are used to supervise the output of the initial sequence-to-sequence model, the output being the predicted preceding sentence and predicted following sentence produced during model training.
  11. The apparatus according to claim 8, wherein the model training module is specifically configured to:
    perform word segmentation on the context sentence pair sequence with a word segmentation tool, to obtain a word-segmented context sentence pair sequence;
    obtain a sentence embedding vector of the current sentence from the current sentence in the word-segmented context sentence pair sequence, using the encoding layer of the initial sequence-to-sequence model;
    obtain the predicted preceding sentence and the predicted following sentence respectively from the sentence embedding vector of the current sentence, using two decoding layers arranged in parallel in the initial sequence-to-sequence model;
    wherein the two decoding layers are a first decoding layer for predicting the preceding context and a second decoding layer for predicting the following context.
  12. The apparatus according to claim 11, wherein the first decoding layer for predicting the preceding context is a first GRU model, the second decoding layer for predicting the following context is a second GRU model, and obtaining the predicted preceding sentence and the predicted following sentence respectively from the sentence embedding vector of the current sentence, using the two decoding layers arranged in parallel in the initial sequence-to-sequence model, specifically comprises:
    using the sentence embedding vector of the current sentence as the input data of the reset gate, the update gate, and the candidate memory cell in the first GRU model, and obtaining the predicted preceding sentence of the current sentence through decoding;
    using the sentence embedding vector of the current sentence as the input data of the second GRU model, and obtaining the predicted following sentence of the current sentence through decoding.
  13. The apparatus according to claim 9 or 11, wherein the first loss function in the target loss function is set based on the first decoding layer for predicting the preceding context, and the second loss function in the target loss function is set based on the second decoding layer for predicting the following context.
  14. The apparatus according to claim 8, further comprising a book recommendation module, which specifically comprises:
    a similarity calculation unit, configured to calculate similarity values between the vector representation of the sentence text and sentence embedding vectors in a preset book sample library;
    a generation unit, configured to generate book recommendation information for the sentence text based on the sentence embedding vectors in the preset book sample library whose similarity values satisfy a preset condition;
    wherein the sentence embedding vectors in the preset book sample library are obtained from the output of the sentence vector generation model.
  15. A computer device, comprising a storage medium, a processor, and a computer program stored on the storage medium and executable on the processor, wherein the processor, when executing the program, implements a sentence vector generation method comprising:
    performing semantic segmentation on acquired initial sentence text to obtain segmented sentence text;
    using a pre-built sentence vector generation model, obtaining a vector representation of the sentence text through an encoding process used for predicting the context of the sentence text, wherein the sentence vector generation model is the encoding layer of a trained sequence-to-sequence model;
    wherein the trained sequence-to-sequence model is obtained through the following steps:
    using an initial sequence-to-sequence model, performing encoding and context decoding on a current sentence in a context sentence pair sequence of a constructed sentence sample set, to obtain a predicted preceding sentence and a predicted following sentence for the current sentence;
    obtaining the trained sequence-to-sequence model based on the predicted preceding sentence and the predicted following sentence.
  16. The computer device according to claim 15, wherein the step of obtaining the trained sequence-to-sequence model based on the predicted preceding sentence and the predicted following sentence specifically comprises:
    training the initial sequence-to-sequence model with a target loss function based on the predicted preceding sentence and the predicted following sentence of the current sentence, to obtain the trained sequence-to-sequence model;
    wherein the target loss function is determined as the sum of a first loss function and a second loss function.
  17. The computer device according to claim 15 or 16, wherein the context sentence pair sequence specifically comprises:
    the current sentence, which is input to the encoding layer of the initial sequence-to-sequence model for context sentence prediction;
    and a preceding target sentence and a following target sentence, which are used to supervise the output of the initial sequence-to-sequence model, the output being the predicted preceding sentence and predicted following sentence produced during model training.
  18. A storage medium having a computer program stored thereon, wherein the program, when executed by a processor, implements a sentence vector generation method comprising:
    performing semantic segmentation on acquired initial sentence text to obtain segmented sentence text;
    using a pre-built sentence vector generation model, obtaining a vector representation of the sentence text through an encoding process used for predicting the context of the sentence text, wherein the sentence vector generation model is the encoding layer of a trained sequence-to-sequence model;
    wherein the trained sequence-to-sequence model is obtained through the following steps:
    using an initial sequence-to-sequence model, performing encoding and context decoding on a current sentence in a context sentence pair sequence of a constructed sentence sample set, to obtain a predicted preceding sentence and a predicted following sentence for the current sentence;
    obtaining the trained sequence-to-sequence model based on the predicted preceding sentence and the predicted following sentence.
  19. The storage medium according to claim 18, wherein the step of obtaining the trained sequence-to-sequence model based on the predicted preceding sentence and the predicted following sentence specifically comprises:
    training the initial sequence-to-sequence model with a target loss function based on the predicted preceding sentence and the predicted following sentence of the current sentence, to obtain the trained sequence-to-sequence model;
    wherein the target loss function is determined as the sum of a first loss function and a second loss function.
  20. The storage medium according to claim 18 or 19, wherein the context sentence pair sequence specifically comprises:
    the current sentence, which is input to the encoding layer of the initial sequence-to-sequence model for context sentence prediction;
    and a preceding target sentence and a following target sentence, which are used to supervise the output of the initial sequence-to-sequence model, the output being the predicted preceding sentence and predicted following sentence produced during model training.
PCT/CN2022/089817 2022-03-09 2022-04-28 Sentence vector generation method and apparatus, computer device and storage medium WO2023168814A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210232057.9A CN114444471A (en) 2022-03-09 2022-03-09 Sentence vector generation method and device, computer equipment and storage medium
CN202210232057.9 2022-03-09

Publications (1)

Publication Number Publication Date
WO2023168814A1 true WO2023168814A1 (en) 2023-09-14

Family

ID=81359057

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/089817 WO2023168814A1 (en) 2022-03-09 2022-04-28 Sentence vector generation method and apparatus, computer device and storage medium

Country Status (2)

Country Link
CN (1) CN114444471A (en)
WO (1) WO2023168814A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111178082A (en) * 2019-12-05 2020-05-19 北京葡萄智学科技有限公司 Sentence vector generation method and device and electronic equipment
US20200218780A1 (en) * 2019-01-03 2020-07-09 International Business Machines Corporation Automated contextual dialog generation for cognitive conversation
WO2020151688A1 (en) * 2019-01-24 2020-07-30 腾讯科技(深圳)有限公司 Coding method and device, equipment and storage medium
CN111602128A (en) * 2017-10-27 2020-08-28 巴比伦合伙有限公司 Computer-implemented method and system for determining
CN112052329A (en) * 2020-09-02 2020-12-08 平安科技(深圳)有限公司 Text abstract generation method and device, computer equipment and readable storage medium


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
RYAN KIROS, YUKUN ZHU, RUSLAN SALAKHUTDINOV, RICHARD S ZEMEL, ANTONIO TORRALBA, RAQUEL URTASUN, SANJA FIDLER: "Skip-Thought Vectors", 22 June 2015 (2015-06-22), XP055428189, Retrieved from the Internet <URL:https://arxiv.org/pdf/1506.06726.pdf> *

Also Published As

Publication number Publication date
CN114444471A (en) 2022-05-06

Similar Documents

Publication Publication Date Title
CN111444340B (en) Text classification method, device, equipment and storage medium
WO2022007823A1 (en) Text data processing method and device
CN111967266A (en) Chinese named entity recognition model and construction method and application thereof
WO2022022421A1 (en) Language representation model system, pre-training method and apparatus, device and medium
CN110163181B (en) Sign language identification method and device
CN112131366A (en) Method, device and storage medium for training text classification model and text classification
CN111159485B (en) Tail entity linking method, device, server and storage medium
WO2020244475A1 (en) Method and apparatus for language sequence labeling, storage medium, and computing device
CN113051356B (en) Open relation extraction method and device, electronic equipment and storage medium
CN113553412B (en) Question-answering processing method, question-answering processing device, electronic equipment and storage medium
US20220414400A1 (en) Multi-dimensional language style transfer
CN114912450B (en) Information generation method and device, training method, electronic device and storage medium
WO2022228127A1 (en) Element text processing method and apparatus, electronic device, and storage medium
CN111145914B (en) Method and device for determining text entity of lung cancer clinical disease seed bank
JP2023002690A (en) Semantics recognition method, apparatus, electronic device, and storage medium
CN116541492A (en) Data processing method and related equipment
CN116050425A (en) Method for establishing pre-training language model, text prediction method and device
CN115408488A (en) Segmentation method and system for novel scene text
CN110188158B (en) Keyword and topic label generation method, device, medium and electronic equipment
CN115114407A (en) Intention recognition method and device, computer equipment and storage medium
CN117275466A (en) Business intention recognition method, device, equipment and storage medium thereof
WO2023116572A1 (en) Word or sentence generation method and related device
WO2023137903A1 (en) Reply statement determination method and apparatus based on rough semantics, and electronic device
CN115357710B (en) Training method and device for table description text generation model and electronic equipment
WO2023168814A1 (en) Sentence vector generation method and apparatus, computer device and storage medium

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22930443

Country of ref document: EP

Kind code of ref document: A1