WO2023070732A1 - Text recommendation method and apparatus based on deep learning, and related medium - Google Patents


Info

Publication number
WO2023070732A1
WO2023070732A1 · PCT/CN2021/129027 · CN2021129027W
Authority
WO
WIPO (PCT)
Prior art keywords
text
milvus
database
information
vector
Prior art date
Application number
PCT/CN2021/129027
Other languages
French (fr)
Chinese (zh)
Inventor
钱启
王天星
杨东泉
程佳宇
Original Assignee
深圳前海环融联易信息科技服务有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳前海环融联易信息科技服务有限公司
Publication of WO2023070732A1 publication Critical patent/WO2023070732A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31Indexing; Data structures therefor; Storage structures
    • G06F16/316Indexing structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/335Filtering based on additional data, e.g. user or group profiles
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Definitions

  • the present application relates to the technical field of computer software, in particular to a text recommendation method, device and related media based on deep learning.
  • Natural language processing is an important direction in the field of artificial intelligence. It studies various theories and methods that can realize effective communication between humans and computers using natural language.
  • natural language processing technology includes text processing, machine translation, semantic understanding, knowledge graph, intelligent question answering and other technologies.
  • text matching is a very important application direction of text processing, which plays a very important role in real life.
  • the development of this technology offers users a feasible way to retrieve and match content effectively within a vast sea of information. In fact, text matching plays an important role in many practical scenarios.
  • the system needs to search the corpus for content as semantically similar as possible to the text to be matched, and return the matching result to the user.
  • the system needs to find the most similar question in the question answer database according to the question raised by the user, and return the answer corresponding to the similar question. In these scenarios, the accuracy of text matching directly affects the user experience.
  • the so-called text matching generally involves calculating the semantic similarity between two texts through an algorithm and judging their degree of match by that similarity: the higher the similarity value, the better the match; the lower, the worse.
  • the current text matching mainly adopts relatively complex methods and does not have dynamic scalability.
  • that is, the text database does not expand automatically, but needs to be expanded manually.
  • Embodiments of the present application provide a text recommendation method, device, computer equipment, and storage medium based on deep learning, aiming at improving the efficiency and accuracy of text recommendation.
  • the embodiment of the present application provides a text recommendation method based on deep learning, including:
  • the text feature vector is converted into Milvus vector index information, and stored in the Milvus database;
  • the sentence vector containing semantic information in the text to be matched is obtained through the twin neural network structure;
  • the embodiment of the present application provides a text recommendation device based on deep learning, including:
  • the first vector generation unit is used to collect different types of text information to construct a text database, and generates a text feature vector for each text information in the text database through a twin neural network structure;
  • the first vector conversion unit is used to convert the text feature vector into Milvus vector index information, and store it in the Milvus database;
  • the second vector generation unit is used to obtain the sentence vector containing semantic information in the text to be matched through the twin neural network structure when the text to be matched is matched;
  • the text matching unit is used to select the top N pieces of Milvus vector index information with the highest semantic similarity in the Milvus database, and based on the correspondence between Milvus vector index information and text feature vectors, to select the corresponding first N pieces of text information in the text database as the matching results of the text to be matched.
  • an embodiment of the present application provides a computer device, including a memory, a processor, and a computer program stored on the memory and operable on the processor; when the processor executes the computer program, the deep learning-based text recommendation method described in the first aspect is implemented.
  • an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the deep learning-based text recommendation method described in the first aspect is implemented.
  • the embodiment of the present application provides a deep learning-based text recommendation method, device, computer equipment, and storage medium. The method includes: collecting different types of text information to build a text database, and generating a text feature vector for each piece of text information in the text database through a twin neural network structure; converting the text feature vectors into Milvus vector index information and storing them in the Milvus database; when matching text to be matched, obtaining the sentence vector containing semantic information in the text to be matched through the twin neural network structure; and selecting the top N pieces of Milvus vector index information with the highest semantic similarity in the Milvus database and, based on the correspondence between Milvus vector index information and text feature vectors, selecting the corresponding top N pieces of text information in the text database as the matching results of the text to be matched.
  • by constructing a text database and introducing the Milvus database, the embodiment of the present application overcomes the time-consuming and labor-intensive defect of matching the text to be matched against text information one by one; the recommendation matching process of this embodiment is simple to implement, highly accurate, and fast. When recommending text, it achieves fast retrieval and real-time feedback, and the text data in the text database is dynamically scalable.
  • FIG. 1 is a schematic flow diagram of a text recommendation method based on deep learning provided in an embodiment of the present application
  • FIG. 2 is a schematic subflow diagram of a text recommendation method based on deep learning provided in an embodiment of the present application
  • FIG. 3 is a schematic block diagram of a text recommendation device based on deep learning provided by an embodiment of the present application
  • FIG. 4 is a sub-schematic block diagram of an apparatus for recommending text based on deep learning provided by an embodiment of the present application.
  • FIG. 1 is a schematic flowchart of a text recommendation method based on deep learning provided by an embodiment of the present application, which specifically includes steps S101 to S104.
  • S101 Collect different types of text information to construct a text database, and generate a text feature vector for each text information in the text database through a twin neural network structure;
  • a text database is constructed by using different types of text information, and at the same time, a text feature vector is generated for the text information in the text database through a twin neural network structure. Then convert the generated text feature vectors into Milvus vector index information and store them in the Milvus database.
  • the corresponding sentence vector is also generated for the text to be matched through the twin neural network structure; the similarity between the sentence vector and each piece of Milvus vector index information is then calculated through the Milvus database, the top N pieces of Milvus vector index information with the highest similarity are selected, and the corresponding text information in the text database is returned as the matching result or recommendation result.
  • the text database is a CSV text database (that is, a text database in CSV format).
  • the specific steps of constructing the text database may be as follows: divide the texts according to the categories to be recommended, collect several texts under each category, and store each category as a CSV file.
  • the column names of the CSV file can be question and flag, where question represents the text content and flag represents the category name; within a single CSV file, the flag value is uniform.
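As an illustrative sketch only (the file contents and sample rows below are hypothetical, not taken from the application), one such per-category CSV file could be built with Python's standard csv module:

```python
import csv
import io

# Hypothetical rows for one category: "question" holds the text content,
# "flag" holds the category name, which is uniform within a single CSV file.
rows = [
    {"question": "How do I reset my password?", "flag": "account"},
    {"question": "I forgot my login password.", "flag": "account"},
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["question", "flag"])
writer.writeheader()
writer.writerows(rows)
csv_text = buf.getvalue()
print(csv_text)
```

In practice each category would be written to its own file on disk; the in-memory buffer here only keeps the sketch self-contained.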
  • in order to facilitate data modification and data cleaning, this embodiment stores the text database in the MySQL database in the form of structured data.
  • the purpose of this is that when performing data cleaning, you can directly write Python scripts to operate MySQL data tables to update text data.
  • compared with the common practice of using CSV-format files as text databases, the MySQL database has the advantages of intuitive display, flexible operation, and convenient dynamic expansion of the text database.
  • this embodiment uses the Milvus database to store the characteristic information of the text database, so as to realize fast retrieval.
  • the so-called Milvus database is an open-source vector database that supports addition, deletion, modification, and near-real-time query and retrieval of TB-level vectors. It has the characteristics of high flexibility, stability, reliability, and high-speed query.
  • conventionally, text features of two pieces of text information are extracted, and whether the two match is then judged on the basis of the extracted features.
  • the word vectors of the text information are often simply summed, or weighted by the importance of the words in the text, to construct the text features of the text information.
  • the resulting text vector can be skewed by individual words in the text, so the constructed features cannot accurately reflect the semantics of the text information, resulting in low matching accuracy.
  • the most common ways of representing sentence vectors are to average the vectors of the BERT output layer, or to use the first token of the BERT output layer as the representation, both of which tend to produce poor sentence encodings.
  • by constructing a text database and introducing the Milvus database, the deep learning-based text recommendation method overcomes the time-consuming and labor-intensive defect of matching the text to be matched against text information one by one; the recommendation matching process of this embodiment is simple to implement, highly accurate, and fast.
  • a request only takes about 30 milliseconds to return the result.
  • the step S101 includes:
  • the text information in the text database is first combined in pairs; for each combination of two pieces of text information, the two are respectively input into a BERT network model and an average pooling layer of identical structure, and two corresponding encoding results are obtained.
  • the encoding result of this model is the text feature vector carrying semantic information. It is worth noting that this Siamese neural network structure generates fixed-size vectors for the input sentences, and the semantic information of these vectors can be used to calculate similarity.
  • this embodiment makes improvements based on the BERT network model.
  • the full name of the BERT network model is Bidirectional Encoder Representations from Transformers, which is a pre-trained network.
  • the goal of the BERT network model is to use large-scale unlabeled corpus training to obtain a semantic representation of text containing rich semantic information, then fine-tune that representation for a specific NLP task, and finally apply it to that task.
  • These tasks can include intelligent question answering, sentence classification, sentence pair representation, etc.
  • a major disadvantage of the BERT network model is that it does not calculate independent sentence codes, which makes it difficult to obtain good sentence codes through the BERT network model.
  • the improvement of this embodiment mainly lies in adding an average pooling operation after the output layer of the BERT network model.
  • the role of the pooling layer is feature translation invariance.
  • the advantage of this setting is that after adding the average pooling layer, the final output vector size is fixed for different sentences.
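To illustrate the point above (a minimal numpy sketch of the pooling operation only, not the application's actual model code), averaging BERT's token-level output yields a sentence vector of fixed size regardless of sentence length:

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray) -> np.ndarray:
    """Average token-level embeddings of shape (seq_len, hidden)
    into one fixed-size sentence vector of shape (hidden,)."""
    return token_embeddings.mean(axis=0)

# Two "sentences" of different lengths, toy hidden size 4.
rng = np.random.default_rng(0)
short = mean_pool(rng.normal(size=(3, 4)))   # 3 tokens
long_ = mean_pool(rng.normal(size=(11, 4)))  # 11 tokens

# Both pooled vectors have the same fixed size.
print(short.shape, long_.shape)
```

In the real model the `(seq_len, hidden)` matrix would be the BERT output layer; here random values stand in for it.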
  • the step S102 includes:
  • when the text feature vector is converted into Milvus vector index information, the text feature vector is first normalized. The specific steps of the normalization process are: input the 2 pieces of text information, pass each through the BERT network model and average pooling layer of identical structure to obtain two encoding results, and then normalize each encoding result to obtain the normalized text feature vectors.
  • the normalized feature vectors are converted into Milvus vector index information and stored in the Milvus database to obtain the Milvus vector information.
  • the text database and the Milvus database are corresponding (that is, the ID numbers of the two are exactly the same), which is convenient for returning the text information of the original text database after query, instead of only returning the difficult-to-recognize index information of Milvus.
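A small numpy sketch of the normalization step (assuming ordinary L2 normalization, a common choice that the application does not spell out: it scales each vector to unit length, so the inner product of two normalized vectors equals their cosine similarity):

```python
import numpy as np

def l2_normalize(v: np.ndarray) -> np.ndarray:
    """Scale a vector to unit length so that dot products
    act directly as cosine similarities."""
    return v / np.linalg.norm(v)

u = l2_normalize(np.array([3.0, 4.0]))
w = l2_normalize(np.array([4.0, 3.0]))

# For unit vectors, the inner product IS the cosine similarity.
cos_sim = float(u @ w)
print(round(cos_sim, 4))
```

This is why normalizing before insertion is convenient: the vector database can then rank by inner product while effectively ranking by cosine similarity.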
  • the step S103 includes:
  • the vector size of the text semantic representation is fixed through an average pooling layer to obtain the sentence vector.
  • although the twin neural network structure is used both when generating the text feature vectors and when generating the sentence vector, the two cases differ: when generating text feature vectors, the twin structure has 2 inputs, so 2 pieces of text information are input at the same time.
  • since the twin neural network structure already has a feature representation ability adapted to similar data once the text feature vectors have been generated, the text to be matched only needs to be input on its own, that is, passed sequentially through the BERT network model and the average pooling layer.
  • the step S104 includes:
  • the cosine similarity method is used to calculate the similarity between the sentence vector and the Milvus vector index information, so as to retrieve the top N semantically similar text matching results, that is, the top N pieces of Milvus vector index information.
  • the corresponding text information can then be found in the text database.
  • the Milvus vector index information is sorted and selected according to the degree of confidence.
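The ranking step can be sketched in plain numpy (this illustrates only the cosine top-N selection; in the application this search is performed inside the Milvus database, not in client code):

```python
import numpy as np

def top_n_cosine(query: np.ndarray, index: np.ndarray, n: int) -> np.ndarray:
    """Return the indices of the n index vectors most cosine-similar
    to the query vector, highest score first."""
    q = query / np.linalg.norm(query)
    idx = index / np.linalg.norm(index, axis=1, keepdims=True)
    scores = idx @ q                  # cosine similarity per row
    return np.argsort(-scores)[:n]   # sort descending, keep top n

index_vectors = np.array([[1.0, 0.0],
                          [0.0, 1.0],
                          [0.9, 0.1]])
query_vec = np.array([1.0, 0.1])
top = top_n_cosine(query_vec, index_vectors, n=2)
print(top.tolist())
```

The returned positions play the role of the shared ID numbers: because the text database and the Milvus database use the same IDs, each top-N hit maps straight back to a readable piece of text.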
  • before starting the Milvus Docker container, it is necessary to modify the MySQL address in the configuration file and expose port 19530. Once started, the container automatically creates 4 Milvus metadata tables in the MySQL database. If the text matching model is updated, the Milvus index vectors need to be rebuilt.
  • the Milvus vector database and the Siamese neural network structure jointly build a semantic search engine for text recommendation.
  • the text recommendation based on deep learning further includes: steps S201-S204.
  • the sample label score value output by the Siamese neural network structure is set to a number from 0 to 5.
  • the advantage of this scheme is that it describes the similarity between two texts more finely than a 0-1 label, which only distinguishes the two cases of similar and dissimilar; how similar two texts are, or whether they are exactly the same, cannot be seen from a label of 1 alone.
  • the number 0 means that the semantics of text A and text B are completely different.
  • the number 5 means that the semantics of text A and text B are exactly the same.
  • Other numbers (such as: 1, 2, 3, 4) represent the degree of semantic similarity between the two sentences in the middle.
  • these label score values need to be divided by 5 to obtain a normalized score value.
  • the optimizer uses the Adam optimizer, and the learning rate is 2e-5.
  • the loss function used is the cosine similarity loss function.
  • other loss functions could also be used, but compared with the alternatives the cosine similarity loss function has a clear speed advantage: directly using cosine similarity to measure the similarity between two sentence vectors greatly improves inference speed.
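As a numerical sketch of the loss (a common formulation assumed here, since the application does not give the formula: mean-squared error between the cosine similarity of the two sentence vectors and the 0-5 label normalized to [0, 1] by dividing by 5):

```python
import numpy as np

def cosine_similarity_loss(u: np.ndarray, v: np.ndarray, label_0_to_5: float) -> float:
    """Squared error between cos(u, v) and the normalized gold score."""
    cos = float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
    target = label_0_to_5 / 5.0  # normalize the 0-5 label to [0, 1]
    return (cos - target) ** 2

u = np.array([1.0, 0.0])
v = np.array([1.0, 0.0])
# Identical vectors with a "semantics exactly the same" label of 5
# should give zero loss.
print(cosine_similarity_loss(u, v, 5.0))
```

During training this scalar would be averaged over a batch of sentence pairs and back-propagated through both branches of the Siamese network.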
  • the performance effect on the test set can be observed quantitatively and qualitatively.
  • the text recommendation based on deep learning also includes:
  • the corresponding update text feature vector is generated through the twin neural network structure, and the update text feature vector is converted into Milvus update vector index information and stored in the Milvus database.
  • the text information in the MySQL database may come from multiple tables. For example, if the information in the text database contains 3 categories, the information for those 3 categories can be obtained by data cleaning on the specific content of a field in 3 database tables. Multiple tables represent different types of data sources; after their data is cleaned, the desired data is obtained and stored in the text database. These tables are called data source tables. Now consider the situation where, after the text database has been built, the data source tables keep receiving new data.
  • this embodiment therefore adds a timing synchronization stage, which aims to perform data cleaning on the newly added data (that is, the text update information) according to querying the data source table at a specific time, and synchronize the latest data to the text database.
  • the reason for data cleaning is that the data of the text database itself is cleaned, processed and extracted from the specific content information of a certain field in the database table.
  • the primary key ID number is returned.
  • the apscheduler timing framework in the Python language can be used to complete the execution of the timing task.
  • the data in the data source table is constantly updated. If there is no timing synchronization mechanism, the newly added data cannot be automatically stored in the text database, nor can the Milvus index be created in time and inserted into the Milvus vector database. Then, when querying, the text information cannot keep pace with the times.
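A minimal sketch of the incremental synchronization logic described above (the table layout, field names, and cleaning step are hypothetical; the application schedules such a job with the Python apscheduler framework, which is only referenced in a comment here):

```python
# In the application this function would be registered with apscheduler,
# e.g. BackgroundScheduler().add_job(sync_new_rows, "cron", hour=2);
# here the incremental step is modeled over in-memory rows instead of
# MySQL tables so the sketch stays self-contained.

def sync_new_rows(source_table, text_db, last_synced_id):
    """Copy rows newer than last_synced_id from the data source table
    into the text database after a trivial 'cleaning' step, and return
    the new high-water-mark primary key ID."""
    new_rows = [r for r in source_table if r["id"] > last_synced_id]
    for row in new_rows:
        cleaned = {"id": row["id"],
                   "question": row["content"].strip(),  # stand-in cleaning
                   "flag": row["flag"]}
        text_db.append(cleaned)
    return max((r["id"] for r in new_rows), default=last_synced_id)

source = [
    {"id": 1, "content": " old text ", "flag": "a"},
    {"id": 2, "content": " new text ", "flag": "a"},
]
db = [{"id": 1, "question": "old text", "flag": "a"}]
last_id = sync_new_rows(source, db, last_synced_id=1)
print(last_id, len(db))
```

Tracking the returned primary key ID between runs is what lets each scheduled execution pick up only the newly added rows; in the full pipeline the same new rows would also be encoded and inserted into the Milvus index.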
  • FIG. 3 is a schematic block diagram of a text recommendation device 300 based on deep learning provided by an embodiment of the present application.
  • the device 300 includes:
  • the first vector generation unit 301 is used to collect different types of text information to construct a text database, and generate a text feature vector for each text information in the text database through a twin neural network structure;
  • the first vector conversion unit 302 is used to convert the text feature vector into Milvus vector index information, and store it in the Milvus database;
  • the second vector generation unit 303 is used to obtain the sentence vector containing semantic information in the text to be matched through the twin neural network structure when the text to be matched is matched;
  • the text matching unit 304 is used to select the top N pieces of Milvus vector index information with the highest semantic similarity in the Milvus database, and based on the correspondence between Milvus vector index information and text feature vectors, to select the corresponding first N pieces of text information in the text database as the matching results of the text to be matched.
  • the first vector generation unit 301 includes:
  • the encoding output unit is used to combine the text information in the text database in pairs, input the two pieces of text information in each combination into the BERT network model and average pooling layer of identical structure, obtain the encoding result corresponding to each of the two pieces of text information, and use these encoding results as the text feature vectors of the two pieces of text information.
  • the first vector conversion unit 302 includes:
  • a normalization unit, configured to perform normalization processing on the text feature vector to obtain a normalized text feature vector;
  • the second vector conversion unit is used to convert the normalized text feature vector into Milvus vector index information.
  • the second vector generating unit 303 includes:
  • the text semantic representation acquisition unit is used to separately input the text to be matched into the BERT network model to obtain the text semantic representation corresponding to the text to be matched;
  • the vector fixing unit is used to fix the vector size of the text semantic representation through the average pooling layer to obtain the sentence vector.
  • the text matching unit 304 includes:
  • a similarity calculation unit for utilizing the cosine similarity method to calculate the similarity score between the sentence vector and each of the Milvus vector index information
  • the index information selection unit is used to select the first N Milvus vector index information with the highest similarity score.
  • the deep learning-based text recommendation device 300 further includes:
  • the training learning unit 402 is used to train the twin neural network structure with the text data in the training set, setting the hyperparameter batch size of the twin neural network structure to 16 and the learning rate to 2e-5;
  • An optimization evaluation unit 403 configured to optimize parameters of the twin neural network structure by using an Adam optimizer, and perform performance evaluation on the twin neural network structure by using a cosine similarity loss function;
  • a parameter update unit 404 configured to update the parameters of the optimized Siamese neural network structure using the text data in the test set.
  • the text recommendation device 300 based on deep learning also includes:
  • An update information acquiring unit configured to acquire text update information, and store the text update information in the text database after data cleaning
  • the update storage unit is used to generate a corresponding updated text feature vector through the twin neural network structure according to the text update information in the text database, convert the updated text feature vector into Milvus update vector index information, and store it in the Milvus database.
  • the embodiment of the present application also provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps provided in the above-mentioned embodiments are implemented.
  • the storage medium may include: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk, and other media capable of storing program code.
  • the embodiment of the present application also provides a computer device, which may include a memory and a processor.
  • a computer program is stored in the memory.
  • the processor invokes the computer program in the memory, the steps provided in the above embodiments can be implemented.
  • the computer equipment may also include components such as various network interfaces and power supplies.


Abstract

Disclosed in the present application are a text recommendation method and apparatus based on deep learning, and a related medium. The method comprises: collecting different types of text information to construct a text database, and generating a text feature vector for each piece of text information in the text database by means of a siamese neural network structure; converting the text feature vector into Milvus vector index information, and storing same in a Milvus database; when matching is performed on text to be matched, acquiring a sentence vector, which includes semantic information, in said text by means of the siamese neural network structure; and selecting, from the Milvus database, the first N pieces of Milvus vector index information having the highest semantic similarity, and selecting on the basis of the correspondence between Milvus vector index information and text feature vectors, the corresponding first N pieces of text information from the text database to serve as a matching result of said text. According to the embodiments of the present application, a text database is constructed and a Milvus database is introduced, such that when text is recommended, rapid retrieval and real-time feedback can be achieved, and the accuracy is high.

Description

一种基于深度学习的文本推荐方法、装置及相关介质A text recommendation method, device and related media based on deep learning
本申请是以申请号为202111255426.8、申请日为2021年10月27日的中国专利申请为基础,并主张其优先权,该申请的全部内容在此作为整体引入本申请中。This application is based on a Chinese patent application with application number 202111255426.8 and a filing date of October 27, 2021, and claims its priority. The entire content of this application is hereby incorporated into this application as a whole.
技术领域technical field
本申请涉及计算机软件技术领域,特别涉及一种基于深度学习的文本推荐方法、装置及相关介质。The present application relates to the technical field of computer software, in particular to a text recommendation method, device and related media based on deep learning.
背景技术Background technique
随着科技的快速发展,机器学习领域在深度学习方向也取得了具有前景的迅猛发展。自然语言处理是人工智能领域中的一个重要方向,它研究能实现人与计算机之间用自然语言进行有效通信的各种理论和方法。通常来说,自然语言处理技术包括文本处理、机器翻译、语义理解、知识图谱、智能问答等技术。其中,文本匹配是文本处理的一个非常重要的应用方向,在现实生活中起到了十分重要的作用。与此同时,这一技术的发展,为用户在纷繁冗杂的信息海洋中进行比较好的检索、匹配提供了一个可行的方案。事实上,文本匹配在很多实际场景中都扮演着重要角色。比如,在搜索场景中,用户输入一条待匹配文本,系统需要去语料库中寻找与该待匹配文本尽可能语义相似的内容,并将匹配结果返回给用户。再比如,在智能问答系统中,用户提出一个问题,系统需根据用户提出的问题在问答库中找到最相似的问题,并返回该相似问题对应的答案。在这些场景中,文本匹配的准确性直接影响用户体验效果。With the rapid development of science and technology, the field of machine learning has also achieved promising rapid development in the direction of deep learning. Natural language processing is an important direction in the field of artificial intelligence. It studies various theories and methods that can realize effective communication between humans and computers using natural language. Generally speaking, natural language processing technology includes text processing, machine translation, semantic understanding, knowledge graph, intelligent question answering and other technologies. Among them, text matching is a very important application direction of text processing, which plays a very important role in real life. At the same time, the development of this technology provides a feasible solution for users to search and match better in the sea of complicated information. In fact, text matching plays an important role in many practical scenarios. For example, in a search scenario, when a user inputs a piece of text to be matched, the system needs to search the corpus for content as semantically similar as possible to the text to be matched, and return the matching result to the user. For another example, in the intelligent question answering system, when a user asks a question, the system needs to find the most similar question in the question answer database according to the question raised by the user, and return the answer corresponding to the similar question. In these scenarios, the accuracy of text matching directly affects the user experience.
所谓文本匹配,其过程一般是针对两个文本,通过算法计算二者语义相似度,通过相似度大小来判定二者的匹配度。相似度数值越高,越匹配。反之,越不匹配。当前文本匹配主要是采用较为复杂的方法,且不具备动态扩展性。这里,动态扩展性指文本资料库不自动进行扩充,需要人为手动扩充。The so-called text matching generally involves calculating the semantic similarity between two texts through an algorithm, and judging the matching degree between the two through the similarity. The higher the similarity value, the better the match. On the contrary, the more mismatched. The current text matching mainly adopts more complex methods, and does not have dynamic scalability. Here, dynamic scalability means that the text database does not automatically expand, but needs to be expanded manually.
申请内容application content
本申请实施例提供了一种基于深度学习的文本推荐方法、装置、计算机设备及存储介质,旨在提高文本推荐效率和精度。Embodiments of the present application provide a text recommendation method, device, computer equipment, and storage medium based on deep learning, aiming at improving the efficiency and accuracy of text recommendation.
In a first aspect, an embodiment of the present application provides a deep learning-based text recommendation method, including:
collecting text information of different categories to construct a text corpus, and generating a text feature vector for each piece of text information in the text corpus through a siamese neural network structure;
converting the text feature vectors into Milvus vector index information and storing it in a Milvus database;
when matching a text to be matched, obtaining, through the siamese neural network structure, a sentence vector containing the semantic information of the text to be matched; and
selecting, from the Milvus database, the top N pieces of Milvus vector index information with the highest semantic similarity, and, based on the correspondence between Milvus vector index information and text feature vectors, selecting the corresponding top N pieces of text information from the text corpus as the matching result for the text to be matched.
In a second aspect, an embodiment of the present application provides a deep learning-based text recommendation apparatus, including:
a first vector generation unit, configured to collect text information of different categories to construct a text corpus, and to generate a text feature vector for each piece of text information in the text corpus through a siamese neural network structure;
a first vector conversion unit, configured to convert the text feature vectors into Milvus vector index information and store it in a Milvus database;
a second vector generation unit, configured to obtain, through the siamese neural network structure when matching a text to be matched, a sentence vector containing the semantic information of the text to be matched; and
a text matching unit, configured to select, from the Milvus database, the top N pieces of Milvus vector index information with the highest semantic similarity, and, based on the correspondence between Milvus vector index information and text feature vectors, to select the corresponding top N pieces of text information from the text corpus as the matching result for the text to be matched.
In a third aspect, an embodiment of the present application provides a computer device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the deep learning-based text recommendation method according to the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the deep learning-based text recommendation method according to the first aspect.
Embodiments of the present application provide a deep learning-based text recommendation method, apparatus, computer device, and storage medium. The method includes: collecting text information of different categories to construct a text corpus, and generating a text feature vector for each piece of text information in the text corpus through a siamese neural network structure; converting the text feature vectors into Milvus vector index information and storing it in a Milvus database; when matching a text to be matched, obtaining, through the siamese neural network structure, a sentence vector containing the semantic information of the text to be matched; and selecting, from the Milvus database, the top N pieces of Milvus vector index information with the highest semantic similarity, and, based on the correspondence between Milvus vector index information and text feature vectors, selecting the corresponding top N pieces of text information from the text corpus as the matching result for the text to be matched. By constructing a text corpus and introducing the Milvus database, the embodiments of the present application overcome the time-consuming and labor-intensive drawback of matching the text to be matched against the corpus entry by entry. The recommendation and matching process of the embodiments is simple to implement, highly accurate, and fast; when recommending texts it achieves rapid retrieval and real-time feedback, and the text data in the corpus can be extended dynamically.
BRIEF DESCRIPTION OF THE DRAWINGS
To illustrate the technical solutions of the embodiments of the present application more clearly, the drawings required in the description of the embodiments are briefly introduced below. Evidently, the drawings described below show some embodiments of the present application, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of a deep learning-based text recommendation method provided by an embodiment of the present application;
FIG. 2 is a schematic sub-flowchart of a deep learning-based text recommendation method provided by an embodiment of the present application;
FIG. 3 is a schematic block diagram of a deep learning-based text recommendation apparatus provided by an embodiment of the present application;
FIG. 4 is a schematic sub-block diagram of a deep learning-based text recommendation apparatus provided by an embodiment of the present application.
DETAILED DESCRIPTION OF EMBODIMENTS
The technical solutions in the embodiments of the present application are described below clearly and completely with reference to the drawings in the embodiments of the present application. Evidently, the described embodiments are some rather than all of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort shall fall within the protection scope of the present application.
It should be understood that, when used in this specification and the appended claims, the terms "comprising" and "including" indicate the presence of the described features, integers, steps, operations, elements and/or components, but do not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or combinations thereof.
It should also be understood that the terminology used in this specification is for the purpose of describing particular embodiments only and is not intended to limit the present application. As used in this specification and the appended claims, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should further be understood that the term "and/or" as used in this specification and the appended claims refers to and encompasses any and all possible combinations of one or more of the associated listed items.
Referring to FIG. 1, FIG. 1 is a schematic flowchart of a deep learning-based text recommendation method provided by an embodiment of the present application, which specifically includes steps S101 to S104.
S101: collecting text information of different categories to construct a text corpus, and generating a text feature vector for each piece of text information in the text corpus through a siamese neural network structure;
S102: converting the text feature vectors into Milvus vector index information and storing it in a Milvus database;
S103: when matching a text to be matched, obtaining, through the siamese neural network structure, a sentence vector containing the semantic information of the text to be matched;
S104: selecting, from the Milvus database, the top N pieces of Milvus vector index information with the highest semantic similarity, and, based on the correspondence between Milvus vector index information and text feature vectors, selecting the corresponding top N pieces of text information from the text corpus as the matching result for the text to be matched.
In this embodiment, a text corpus is first constructed from text information of different categories, and a text feature vector is generated for each piece of text information in the corpus through a siamese neural network structure. The generated text feature vectors are then converted into Milvus vector index information and stored in the Milvus database. When a matching recommendation is required for a text to be matched, a corresponding sentence vector is likewise generated for it through the siamese neural network structure; the similarity between this sentence vector and each piece of Milvus vector index information is then computed in the Milvus database, the top N pieces with the highest similarity are selected, and the corresponding text information is retrieved from the text corpus as the matching or recommendation result.
In a specific application scenario, the text corpus is a CSV corpus (i.e., a corpus of files in CSV format). The corpus may be constructed as follows: divide the texts by the categories to be recommended, collect a number of texts under each category, and store each category as one CSV file. The column names of each CSV file may be question and flag, where question holds the content of a text and flag holds the category name; within one CSV file the flag value is uniform. There are as many CSV files as there are text categories.
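The per-category CSV layout described above can be sketched as follows; the file contents, category name, and rows are purely illustrative and not taken from the application:

```python
import csv
import io

# One category = one CSV file with columns "question" (text content) and
# "flag" (category name, uniform within the file). io.StringIO stands in
# for a file on disk; the rows are hypothetical examples.
finance_csv = io.StringIO()
writer = csv.DictWriter(finance_csv, fieldnames=["question", "flag"])
writer.writeheader()
writer.writerow({"question": "How do I open a margin account?", "flag": "finance"})
writer.writerow({"question": "What is the current deposit rate?", "flag": "finance"})

# Reading the file back recovers the texts of this category.
finance_csv.seek(0)
rows = list(csv.DictReader(finance_csv))
categories = {row["flag"] for row in rows}
assert categories == {"finance"}  # flag is uniform within one file
```

With one such file per category, the corpus simply consists of as many CSV files as there are categories.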
In a specific embodiment, to facilitate data modification and data cleaning, the text corpus is stored in a MySQL database in the form of structured data. The benefit is that, during data cleaning, a Python script can directly operate on the MySQL tables to update the text data. Compared with using plain CSV files as the corpus, a MySQL database offers intuitive inspection, flexible operation, and convenient dynamic extension of the corpus data.
Although the above steps can already return candidate texts, guaranteeing timely recommendation would require saving the features of the entire corpus as offline feature files in advance, and such feature files occupy considerable storage space. Moreover, the usual approach is, for each incoming text, to compare it with every text in the corpus one by one using the model and return the most semantically similar entries; for a corpus with a large amount of data this is far too slow. Storing the corpus features offline avoids regenerating them on every request, but the resulting offline feature files are large, and whenever the corpus changes they become invalid and can no longer be used, which makes maintenance inconvenient. Therefore, in an embodiment, this problem is solved by storing the feature information of the text corpus in a Milvus database to enable fast retrieval. Milvus is an open-source vector database that supports insertion, deletion, and update of TB-scale vectors together with near-real-time query and retrieval, and is highly flexible, stable, reliable, and fast.
In existing text matching techniques, text features are usually extracted from the two pieces of text, and whether the two texts match is then judged from the extracted features. When extracting text features, the word vectors of a text are often simply summed, or weighted by the words' own weights, to construct the text feature. The resulting text vector, however, may be dominated by individual words, so the constructed feature cannot accurately reflect the semantics of the text, which lowers the matching accuracy. In addition, the most common ways to represent a sentence vector are to average the vectors of the BERT output layer or to use the first token of the BERT output layer; both undoubtedly yield rather poor sentence encodings. Worse still, over a collection of 10,000 sentences, finding the most similar sentence pair with such methods takes 65 hours. These techniques are therefore complex, costly, inefficient, and time-consuming.
In view of the above problems, the deep learning-based text recommendation method provided in this embodiment, by constructing a text corpus and introducing the Milvus database, overcomes the time-consuming and labor-intensive drawback of matching the text to be matched against the corpus entry by entry. The recommendation and matching process of this embodiment is simple to implement, highly accurate, and fast; it achieves rapid retrieval and real-time feedback when recommending texts, and the corpus data can be extended dynamically. In a specific test scenario, a request returns its result in about 30 milliseconds.
In an embodiment, step S101 includes:
combining the text information in the text corpus in pairs, feeding the two texts of each pair separately through a BERT network model and an average pooling layer of identical structure, outputting the encoding results corresponding to the two texts, and taking these encoding results as the text feature vectors of the two texts.
In this embodiment, when generating text feature vectors through the siamese neural network structure, the texts in the corpus are first combined in pairs; the two texts of each pair are then fed separately through a BERT network model and an average pooling layer of identical structure, yielding two encoding results, which are the obtained text feature vectors carrying semantic information. Notably, this siamese neural network structure produces a fixed-size vector for each input sentence, and the semantic information carried by these vectors can be used to compute similarity.
In addition, to obtain fixed-size text feature vectors, this embodiment improves on the BERT network model. BERT (Bidirectional Encoder Representations from Transformers) is a pre-trained network. As the name suggests, its goal is to learn, from large-scale unlabeled corpora, semantic representations of text rich in semantic information, which are then fine-tuned for a specific NLP task and finally applied to that task. Such tasks include intelligent question answering, sentence classification, sentence-pair representation, and so on. A major shortcoming of the BERT network model, however, is that it does not compute an independent sentence encoding, which makes it difficult to obtain a good sentence encoding from BERT alone.
Given this limitation of BERT, the improvement in this embodiment mainly consists of adding an average pooling operation after the output layer of the BERT network model. The pooling layer provides translation invariance of features, and the benefit of this arrangement is that, with the average pooling layer added, the final output vector has a fixed size for sentences of different lengths.
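The mean-pooling step described above can be sketched in isolation. The snippet below uses random arrays as a stand-in for real BERT output-layer embeddings (hidden size 768 is that of BERT-base, but no actual model is loaded), and masks out padding positions before averaging:

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average the token vectors of the encoder output layer, ignoring padding.

    token_embeddings: (seq_len, hidden_dim) output of the BERT encoder
    attention_mask:   (seq_len,) with 1 for real tokens, 0 for padding
    """
    mask = attention_mask[:, None].astype(float)      # (seq_len, 1)
    summed = (token_embeddings * mask).sum(axis=0)    # (hidden_dim,)
    count = mask.sum()                                # number of real tokens
    return summed / count                             # fixed-size sentence vector

# Stand-in for real BERT output: two "sentences" of different lengths.
rng = np.random.default_rng(0)
short = mean_pool(rng.normal(size=(5, 768)), np.array([1, 1, 1, 1, 0]))
long_ = mean_pool(rng.normal(size=(12, 768)), np.ones(12, dtype=int))
assert short.shape == long_.shape == (768,)  # same fixed size regardless of length
```

This is the property the embodiment relies on: sentences of any length map to vectors of one fixed size, so they can be compared directly.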
In an embodiment, step S102 includes:
normalizing the text feature vectors to obtain normalized text feature vectors; and
converting the normalized text feature vectors into Milvus vector index information.
In this embodiment, when converting text feature vectors into Milvus vector index information, the text feature vectors are first normalized. Specifically, two pieces of text information are input and passed separately through a BERT network model and an average pooling layer of identical structure, yielding two encoding results, and each of these is then normalized to obtain the normalized text feature vectors. The normalized text feature vectors are then converted into Milvus vector index information and stored in the Milvus database. In this way, the text corpus and the Milvus database correspond to each other (i.e., their ID numbers are exactly the same), so that a query can return the text information of the original corpus rather than only the hard-to-interpret Milvus index information.
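A minimal sketch of the normalization step and of the shared-ID correspondence between corpus entries and stored vectors; the IDs and vectors are illustrative, and no actual Milvus calls are made:

```python
import numpy as np

def l2_normalize(v: np.ndarray) -> np.ndarray:
    """Scale a feature vector to unit length, so that for two normalized
    vectors the inner product equals their cosine similarity."""
    return v / np.linalg.norm(v)

# The corpus (e.g. MySQL rows) and the vector store share the same ID,
# so a vector hit can be mapped back to readable text (IDs hypothetical).
corpus = {101: "text A", 102: "text B"}
vectors = {i: l2_normalize(np.random.default_rng(i).normal(size=4)) for i in corpus}

assert set(vectors) == set(corpus)  # one stored vector per corpus entry
assert all(abs(np.linalg.norm(v) - 1.0) < 1e-9 for v in vectors.values())
```

Keeping the IDs identical on both sides is what lets a search result be translated back into original corpus text rather than an opaque index entry.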
In an embodiment, step S103 includes:
inputting the text to be matched on its own into the BERT network model to obtain the text semantic representation corresponding to the text to be matched; and
fixing the vector size of the text semantic representation through the average pooling layer to obtain the sentence vector.
In this embodiment, the siamese neural network structure is used both for generating text feature vectors and for generating sentence vectors. When generating text feature vectors, the siamese structure has two inputs, so two pieces of text are input at the same time. When generating a sentence vector, however, the siamese structure has already acquired, through the earlier feature-vector generation, the ability to represent similar data, so the text to be matched only needs to be input on its own, i.e., passed in turn through the BERT network model and the average pooling layer.
In an embodiment, step S104 includes:
computing a similarity score between the sentence vector and each piece of Milvus vector index information using the cosine similarity method; and
selecting the top N pieces of Milvus vector index information with the highest similarity scores.
In this embodiment, when performing text recommendation, the similarity between the sentence vector of the text to be matched and the Milvus vector index information is computed by the cosine similarity method, so as to retrieve the top N semantically similar text matching results, i.e., the top N pieces of Milvus vector index information. The corresponding text information can then be found in the text corpus. In a specific embodiment, the pieces of Milvus vector index information are ranked and selected by confidence.
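As an illustration of the retrieval step, the brute-force stand-in below ranks stored vectors by cosine similarity and keeps the top N; this shows what the Milvus search computes (at scale, and with an index), and is not the Milvus API itself. The index contents are toy values:

```python
import numpy as np

def top_n_cosine(query: np.ndarray, index: dict, n: int = 3):
    """Rank stored vectors by cosine similarity to the query and
    return the n best (id, score) pairs, highest score first."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    scored = sorted(((i, cos(query, v)) for i, v in index.items()),
                    key=lambda t: t[1], reverse=True)
    return scored[:n]

# Toy 2-D "index": ids 1 and 3 point roughly along the x-axis, id 2 along y.
index = {1: np.array([1.0, 0.0]), 2: np.array([0.0, 1.0]), 3: np.array([0.9, 0.1])}
best = top_n_cosine(np.array([1.0, 0.05]), index, n=2)
assert [i for i, _ in best] == [1, 3]  # the two x-axis-like vectors win
```

The returned IDs are then looked up in the text corpus to produce the final recommendation list.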
In another specific embodiment, before starting the Milvus Docker container, the MySQL address in the configuration file must be modified and port 19530 exposed. On startup, the container automatically creates four Milvus metadata tables in the MySQL database. If the text matching model is updated, the Milvus index vectors must be rebuilt. The Milvus vector database and the siamese neural network structure together form a semantic search engine for text recommendation.
In an embodiment, as shown in FIG. 2, the deep learning-based text recommendation further includes steps S201 to S204.
S201: selecting a text dataset and dividing it into a training set and a test set at a ratio of training set : test set = 7 : 3;
S202: training the siamese neural network structure on the text data of the training set, with the hyperparameter batch size of the siamese neural network structure set to 16 and the learning rate set to 2e-5;
S203: optimizing the parameters of the siamese neural network structure with the Adam optimizer, and evaluating the performance of the siamese neural network structure with a cosine similarity loss function;
S204: updating the parameters of the optimized siamese neural network structure with the text data of the test set.
In this embodiment, to ensure a degree of generalization, the dataset follows the principle of a 7 : 3 ratio between training set and test set. Further, the sample label scores for the siamese neural network structure are set to the numbers 0 to 5. The benefit of such label scores is that they characterize the degree of similarity between two texts more finely than 0-1 labels, which distinguish only similar from dissimilar: how similar two texts are, or whether they are exactly the same, cannot be read off a label of 1 alone. The number 0 means that text A and text B are semantically completely different; the number 5 means that their semantics are exactly the same; the other numbers (1, 2, 3, 4) represent intermediate degrees of semantic similarity between the two sentences. To train the network properly, these label scores are divided by 5 during training to obtain normalized score values.
Training also involves a number of hyperparameters. For example, the batch size is set to 16, the Adam optimizer is used, and the learning rate is 2e-5. The loss function used is the cosine similarity loss function. Other loss functions could also be used here, but compared with them the cosine similarity loss function has a clear speed advantage: directly using cosine similarity to measure the similarity between two sentence vectors greatly improves inference speed.
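The label normalization and cosine-similarity loss described above can be sketched as follows. This is one minimal interpretation (squared error between the cosine similarity of the two sentence vectors and the 0-5 annotation divided by 5); the application names the loss but does not spell out its exact formula, so the squared-error form is an assumption:

```python
import numpy as np

def cosine_similarity_loss(u: np.ndarray, v: np.ndarray, label_score: int) -> float:
    """Per-pair loss: the 0-5 human annotation is divided by 5 to give a
    target in [0, 1], and the squared error between that target and the
    cosine similarity of the two sentence vectors is the loss (assumed form)."""
    target = label_score / 5.0
    cos = float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
    return (cos - target) ** 2

u = np.array([1.0, 0.0])
v = np.array([1.0, 0.0])
assert cosine_similarity_loss(u, v, 5) == 0.0  # identical vectors, label 5: zero loss
```

At inference time no loss is needed: the cosine similarity itself serves as the match score, which is what makes retrieval fast.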
When predicting on the test set, the performance on the test set can be observed quantitatively and qualitatively. From the prediction results, it can be judged whether the model has converged.
In an embodiment, the deep learning-based text recommendation further includes:
acquiring text update information, and storing it in the text corpus after data cleaning; and
generating, through the siamese neural network structure, updated text feature vectors corresponding to the text update information in the text corpus, converting the updated text feature vectors into Milvus update vector index information, and storing it in the Milvus database.
The text information in the MySQL database may come from multiple tables. For example, the corpus may contain three categories whose information is obtained, through data cleaning, from the specific contents of certain fields of three database tables; multiple tables represent different category data sources, and after their data has been cleaned, the desired data can be stored in the text corpus. These tables are called data source tables. Consider the situation where, after the corpus has been built, the data source tables keep receiving new data. To enhance dynamic scalability, this embodiment therefore adds a timed synchronization stage: at a specified time, the data source tables are queried, the newly added data (i.e., the text update information) is cleaned, and the latest data is synchronized into the text corpus. Data cleaning is needed here because the corpus data itself is cleaned, processed, and distilled from the specific contents of database table fields. At the same time, the primary key ID numbers of the information synchronized into the corpus are returned.
The newly added text data is passed through the trained text matching model to obtain its text feature encoding vector; the vector is normalized and, combined with this ID number, a Milvus index is created: the text feature vector is encoded, indexed, and inserted into the Milvus vector database for efficient subsequent queries. When comparing text similarity, the index vectors in the Milvus vector database are searched and cosine similarity is computed to obtain the matching results.
In a specific embodiment, the apscheduler timing framework of the Python language can be used to execute the timed task. The data in the data source tables is constantly updated; without a timed synchronization mechanism, newly added data could neither be stored automatically in the text corpus nor have Milvus indexes created and inserted into the Milvus vector database in time, and queries would then fail to keep up with the latest text information.
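The timed synchronization stage can be sketched with an in-memory stand-in for the data source table. In a real deployment the sync function would be registered with a timer framework such as apscheduler rather than called directly, and the table rows, ID scheme, and cleaning rule here are all hypothetical:

```python
# Poll the source "table" for rows added since the last synced primary key,
# clean them, and append them to the corpus under the same ID.
source_table = [(1, "  old text "), (2, "new text one\n"), (3, "new text two")]
corpus = {1: "old text"}  # already synced, keyed by primary-key ID

def clean(raw: str) -> str:
    """Toy data-cleaning rule: collapse whitespace."""
    return " ".join(raw.split())

def sync_new_rows(source, corpus):
    last_id = max(corpus)  # high-water mark of already-synced IDs
    for row_id, raw in source:
        if row_id > last_id:
            corpus[row_id] = clean(raw)
            # ...here the new text would also be encoded by the trained model,
            # normalized, and inserted into the Milvus index under the same ID...
    return corpus

sync_new_rows(source_table, corpus)
assert sorted(corpus) == [1, 2, 3]
```

Running this function on a schedule keeps the corpus and the Milvus index aligned with the growing source tables without manual intervention.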
FIG. 3 is a schematic block diagram of a deep learning-based text recommendation apparatus 300 provided by an embodiment of the present application. The apparatus 300 includes:
a first vector generation unit 301, configured to collect text information of different categories to construct a text corpus, and to generate a text feature vector for each piece of text information in the text corpus through a siamese neural network structure;
a first vector conversion unit 302, configured to convert the text feature vectors into Milvus vector index information and store it in a Milvus database;
a second vector generation unit 303, configured to obtain, through the siamese neural network structure when matching a text to be matched, a sentence vector containing the semantic information of the text to be matched; and
a text matching unit 304, configured to select, from the Milvus database, the top N pieces of Milvus vector index information with the highest semantic similarity, and, based on the correspondence between Milvus vector index information and text feature vectors, to select the corresponding top N pieces of text information from the text corpus as the matching result for the text to be matched.
In one embodiment, the first vector generation unit 301 includes:
an encoding output unit, configured to combine the text information in the text database in pairs, input the two texts of each pair in turn into a BERT network model and an average pooling layer of identical structure, output the encoding results corresponding to the two texts, and then use the encoding results as the text feature vectors of the two texts.
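The BERT encoder itself is not reproduced here; assuming the token embeddings and attention mask produced by either branch of the Siamese network are available as arrays, the average-pooling step that turns them into a fixed-size sentence vector can be sketched as follows (a hypothetical helper, not the embodiment's actual code):

```python
import numpy as np

def mean_pool(token_embeddings, attention_mask):
    """Average the token embeddings of each sentence, ignoring padded
    positions, to obtain one fixed-size sentence vector per input.

    token_embeddings: (batch, seq_len, hidden) array
    attention_mask:   (batch, seq_len) array of 0/1
    """
    mask = attention_mask[:, :, None].astype(float)   # (batch, seq, 1)
    summed = (token_embeddings * mask).sum(axis=1)    # sum real tokens
    counts = mask.sum(axis=1)                         # number of real tokens
    return summed / counts
```

Because both branches of the Siamese structure share the same encoder and the same pooling, two texts are mapped into the same vector space and can be compared directly.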
In one embodiment, the first vector conversion unit 302 includes:
a normalization unit, configured to normalize the text feature vectors to obtain normalized text feature vectors;
a second vector conversion unit, configured to convert the normalized text feature vectors into Milvus vector index information.
In one embodiment, the second vector generation unit 303 includes:
a text semantic representation acquisition unit, configured to input the text to be matched on its own into the BERT network model to obtain the text semantic representation corresponding to the text to be matched;
a vector fixing unit, configured to fix the vector size of the text semantic representation through the average pooling layer to obtain the sentence vector.
In one embodiment, the text matching unit 304 includes:
a similarity calculation unit, configured to calculate, using the cosine similarity method, a similarity score between the sentence vector and each piece of Milvus vector index information;
an index information selection unit, configured to select the top N pieces of Milvus vector index information with the highest similarity scores.
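These two sub-steps can be sketched as a brute-force search, under the assumption that the index vectors are available as a plain array (in the embodiment, Milvus performs this search internally over the stored index):

```python
import numpy as np

def top_n_by_cosine(sentence_vec, index_vecs, n):
    """Score the query sentence vector against every index vector with
    cosine similarity and return the positions of the N best matches,
    highest score first."""
    q = np.asarray(sentence_vec, dtype=float)
    m = np.asarray(index_vecs, dtype=float)
    scores = (m @ q) / (np.linalg.norm(m, axis=1) * np.linalg.norm(q))
    return np.argsort(-scores)[:n].tolist()
```

The returned positions correspond to Milvus vector index entries, which in turn map back to rows of the text database through their primary-key IDs.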
In one embodiment, as shown in FIG. 4, the deep-learning-based text recommendation apparatus 300 further includes:
a data set division unit 401, configured to select a text data set and divide it into a training set and a test set at a ratio of training set : test set = 7 : 3;
a training learning unit 402, configured to train the Siamese neural network structure on the text data in the training set, with the hyperparameters of the Siamese neural network structure set to a batch size of 16 and a learning rate of 2e-5;
an optimization evaluation unit 403, configured to optimize the parameters of the Siamese neural network structure with the Adam optimizer, and to evaluate the performance of the Siamese neural network structure with a cosine similarity loss function;
a parameter update unit 404, configured to update the parameters of the optimized Siamese neural network structure using the text data in the test set.
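The dataset division step can be sketched as follows; `split_dataset` is a hypothetical helper, and the actual fine-tuning loop (Adam optimizer, cosine-similarity loss) is not reproduced here:

```python
import random

# Hyperparameters stated in the embodiment.
BATCH_SIZE = 16
LEARNING_RATE = 2e-5

def split_dataset(samples, train_ratio=0.7, seed=42):
    """Shuffle the text data set and split it into training and test
    subsets at the 7:3 ratio used by the data set division unit."""
    shuffled = samples[:]                 # do not mutate the caller's list
    random.Random(seed).shuffle(shuffled) # fixed seed for reproducibility
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]
```

Fixing the shuffle seed keeps the split reproducible across runs, which matters when the test set is later used to evaluate and update the optimized model.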
In one embodiment, the deep-learning-based text recommendation apparatus 300 further includes:
an update information acquisition unit, configured to acquire text update information and store it in the text database after data cleaning;
an update storage unit, configured to generate, according to the text update information in the text database, corresponding updated text feature vectors through the Siamese neural network structure, convert the updated text feature vectors into Milvus update vector index information, and store it in the Milvus database.
Since the embodiments of the apparatus correspond to the embodiments of the method, refer to the description of the method embodiments for the apparatus embodiments; details are not repeated here.
An embodiment of the present application further provides a computer-readable storage medium on which a computer program is stored; when executed, the computer program can implement the steps provided in the above embodiments. The storage medium may include various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
An embodiment of the present application further provides a computer device, which may include a memory and a processor. A computer program is stored in the memory, and when the processor invokes the computer program in the memory, the steps provided in the above embodiments can be implemented. The computer device may, of course, also include components such as network interfaces and a power supply.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the others, and the same or similar parts of the embodiments may be referred to one another. Since the system disclosed in the embodiments corresponds to the method disclosed therein, its description is relatively brief; for the relevant details, refer to the description of the method. It should be pointed out that those of ordinary skill in the art can make several improvements and modifications to the present application without departing from its principles, and such improvements and modifications also fall within the protection scope of the claims of the present application.
It should also be noted that in this specification, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between those entities or operations. Moreover, the terms "comprise", "include", or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or device comprising a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that comprises it.

Claims (10)

  1. A deep-learning-based text recommendation method, characterized in that it comprises:
    collecting text information of different categories to build a text database, and generating a text feature vector for each piece of text information in the text database through a Siamese neural network structure;
    converting the text feature vectors into Milvus vector index information and storing it in a Milvus database;
    when matching a text to be matched, obtaining a sentence vector containing the semantic information of the text to be matched through the Siamese neural network structure; and
    selecting, from the Milvus database, the top N pieces of Milvus vector index information with the highest semantic similarity and, based on the correspondence between Milvus vector index information and text feature vectors, selecting the corresponding top N pieces of text information from the text database as the matching result for the text to be matched.
  2. The deep-learning-based text recommendation method according to claim 1, characterized in that generating a text feature vector for each piece of text information in the text database through a Siamese neural network structure comprises:
    combining the text information in the text database in pairs, inputting the two texts of each pair in turn into a BERT network model and an average pooling layer of identical structure, outputting the encoding results corresponding to the two texts, and then using the encoding results as the text feature vectors corresponding to the two texts.
  3. The deep-learning-based text recommendation method according to claim 1, characterized in that converting the text feature vectors into Milvus vector index information and storing it in a Milvus database comprises:
    normalizing the text feature vectors to obtain normalized text feature vectors; and
    converting the normalized text feature vectors into Milvus vector index information.
  4. The deep-learning-based text recommendation method according to claim 1, characterized in that obtaining, when matching a text to be matched, a sentence vector containing the semantic information of the text to be matched through the Siamese neural network structure comprises:
    inputting the text to be matched on its own into the BERT network model to obtain the text semantic representation corresponding to the text to be matched; and
    fixing the vector size of the text semantic representation through the average pooling layer to obtain the sentence vector.
  5. The deep-learning-based text recommendation method according to claim 1, characterized in that selecting, from the Milvus database, the top N pieces of Milvus vector index information with the highest semantic similarity comprises:
    calculating, using the cosine similarity method, a similarity score between the sentence vector and each piece of Milvus vector index information; and
    selecting the top N pieces of Milvus vector index information with the highest similarity scores.
  6. The deep-learning-based text recommendation method according to claim 1, characterized in that it further comprises:
    selecting a text data set, and dividing the text data set into a training set and a test set at a ratio of training set : test set = 7 : 3;
    training the Siamese neural network structure on the text data in the training set, with the hyperparameters of the Siamese neural network structure set to a batch size of 16 and a learning rate of 2e-5;
    optimizing the parameters of the Siamese neural network structure with the Adam optimizer, and evaluating the performance of the Siamese neural network structure with a cosine similarity loss function; and
    updating the parameters of the optimized Siamese neural network structure using the text data in the test set.
  7. The deep-learning-based text recommendation method according to claim 1, characterized in that it further comprises:
    acquiring text update information, and storing the text update information in the text database after data cleaning; and
    generating, according to the text update information in the text database, corresponding updated text feature vectors through the Siamese neural network structure, converting the updated text feature vectors into Milvus update vector index information, and storing it in the Milvus database.
  8. A deep-learning-based text recommendation apparatus, characterized in that it comprises:
    a first vector generation unit, configured to collect text information of different categories to build a text database, and to generate a text feature vector for each piece of text information in the text database through a Siamese neural network structure;
    a first vector conversion unit, configured to convert the text feature vectors into Milvus vector index information and store it in a Milvus database;
    a second vector generation unit, configured to obtain, when matching a text to be matched, a sentence vector containing the semantic information of the text to be matched through the Siamese neural network structure; and
    a text matching unit, configured to select, from the Milvus database, the top N pieces of Milvus vector index information with the highest semantic similarity and, based on the correspondence between Milvus vector index information and text feature vectors, to select the corresponding top N pieces of text information from the text database as the matching result for the text to be matched.
  9. A computer device, characterized in that it comprises a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the deep-learning-based text recommendation method according to any one of claims 1 to 7.
  10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and the computer program, when executed by a processor, implements the deep-learning-based text recommendation method according to any one of claims 1 to 7.
PCT/CN2021/129027 2021-10-27 2021-11-05 Text recommendation method and apparatus based on deep learning, and related medium WO2023070732A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111255426.8A CN113704386A (en) 2021-10-27 2021-10-27 Text recommendation method and device based on deep learning and related media
CN202111255426.8 2021-10-27

Publications (1)

Publication Number Publication Date
WO2023070732A1 true WO2023070732A1 (en) 2023-05-04

Family

ID=78647112

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/129027 WO2023070732A1 (en) 2021-10-27 2021-11-05 Text recommendation method and apparatus based on deep learning, and related medium

Country Status (2)

Country Link
CN (1) CN113704386A (en)
WO (1) WO2023070732A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116384494A (en) * 2023-06-05 2023-07-04 安徽思高智能科技有限公司 RPA flow recommendation method and system based on multi-modal twin neural network
CN117762917A (en) * 2024-01-16 2024-03-26 北京三维天地科技股份有限公司 Medical instrument data cleaning method and system based on deep learning

Families Citing this family (5)

Publication number Priority date Publication date Assignee Title
CN114386421A (en) * 2022-01-13 2022-04-22 平安科技(深圳)有限公司 Similar news detection method and device, computer equipment and storage medium
CN114817511B (en) * 2022-06-27 2022-09-23 深圳前海环融联易信息科技服务有限公司 Question-answer interaction method and device based on kernel principal component analysis and computer equipment
CN115238065B (en) * 2022-09-22 2022-12-20 太极计算机股份有限公司 Intelligent document recommendation method based on federal learning
CN116911641B (en) * 2023-09-11 2024-02-02 深圳市华傲数据技术有限公司 Sponsored recommendation method, sponsored recommendation device, computer equipment and storage medium
CN117574877B (en) * 2023-11-21 2024-05-24 北京假日阳光环球旅行社有限公司 Session text matching method and device, storage medium and equipment

Citations (4)

Publication number Priority date Publication date Assignee Title
CN109740126A (en) * 2019-01-04 2019-05-10 平安科技(深圳)有限公司 Text matching technique, device and storage medium, computer equipment
CN110413988A (en) * 2019-06-17 2019-11-05 平安科技(深圳)有限公司 Method, apparatus, server and the storage medium of text information matching measurement
CN111026937A (en) * 2019-11-13 2020-04-17 百度在线网络技术(北京)有限公司 Method, device and equipment for extracting POI name and computer storage medium
CN112395385A (en) * 2020-11-17 2021-02-23 中国平安人寿保险股份有限公司 Text generation method and device based on artificial intelligence, computer equipment and medium



Also Published As

Publication number Publication date
CN113704386A (en) 2021-11-26

Similar Documents

Publication Publication Date Title
WO2023070732A1 (en) Text recommendation method and apparatus based on deep learning, and related medium
US10346540B2 (en) Self-learning statistical natural language processing for automatic production of virtual personal assistants
JP7162648B2 (en) Systems and methods for intent discovery from multimedia conversations
CN109101479A (en) A kind of clustering method and device for Chinese sentence
CN109829104A (en) Pseudo-linear filter model information search method and system based on semantic similarity
CN111462749B (en) End-to-end dialogue system and method based on dialogue state guidance and knowledge base retrieval
CN112256847B (en) Knowledge base question-answering method integrating fact texts
CN112948562A (en) Question and answer processing method and device, computer equipment and readable storage medium
Bai et al. Applied research of knowledge in the field of artificial intelligence in the intelligent retrieval of teaching resources
CN112632250A (en) Question and answer method and system under multi-document scene
CN113064999A (en) Knowledge graph construction algorithm, system, equipment and medium based on IT equipment operation and maintenance
Aghaei et al. Question answering over knowledge graphs: A case study in tourism
CN112417170B (en) Relationship linking method for incomplete knowledge graph
CN113159187A (en) Classification model training method and device, and target text determining method and device
CN116432653A (en) Method, device, storage medium and equipment for constructing multilingual database
CN116361416A (en) Speech retrieval method, system and medium based on semantic analysis and high-dimensional modeling
CN114970733A (en) Corpus generation method, apparatus, system, storage medium and electronic device
CN114942981A (en) Question-answer query method and device, electronic equipment and computer readable storage medium
Wang et al. Refbert: Compressing bert by referencing to pre-computed representations
CN113536772A (en) Text processing method, device, equipment and storage medium
Mazumder On-the-job continual and interactive learning of factual knowledge and language grounding
Zajíc et al. First insight into the processing of the language consulting center data
CN117313748B (en) Multi-feature fusion semantic understanding method and device for government affair question and answer
Li et al. An Innovative Similar Complaint Recommendation Model Integrating Semantic and Graph Embeddings
Yan Research on keyword extraction based on abstract extraction

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 21962079

Country of ref document: EP

Kind code of ref document: A1