CN116069903A - Class search method, system, electronic equipment and storage medium

Class search method, system, electronic equipment and storage medium

Info

Publication number
CN116069903A
Authority
CN
China
Prior art keywords
case
encoder
search
vector representation
retrieval
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310187729.3A
Other languages
Chinese (zh)
Inventor
邹游
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Terminus Technology Group Co Ltd
Original Assignee
Terminus Technology Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Terminus Technology Group Co Ltd filed Critical Terminus Technology Group Co Ltd
Priority to CN202310187729.3A
Publication of CN116069903A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/3332 Query translation
    • G06F16/3334 Selection or weighting of terms from queries, including natural language queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/088 Non-supervised learning, e.g. competitive learning
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the disclosure provide a case retrieval method, a system, an electronic device, and a storage medium, belonging to the field of case retrieval. The method comprises the following steps: receiving a case retrieval request from a user; inputting the retrieval case sentence into a pre-trained case retrieval encoder to obtain a corresponding sentence vector representation; and obtaining cases similar to the retrieval case sentence according to the sentence vector representation and a set of case vector representations. The case retrieval method, system, electronic device, and storage medium solve the problem of training on case-library data that carry no labels, avoiding the labor cost and time of manual labeling, and improve the quality of the retrieval sentence vector representation, thereby improving retrieval accuracy.

Description

Class search method, system, electronic equipment and storage medium
Technical Field
Embodiments of the present disclosure belong to the field of data retrieval, and in particular relate to a case retrieval method, a case retrieval system, an electronic device, and a storage medium.
Background
Case retrieval often has to operate without labeled data, and in that situation a supervised model cannot be trained. Current practice either retrieves similar cases in a supervised manner with manually added labels, which increases labor cost, or simply retrieves cases by keyword matching, which yields low accuracy.
When an unsupervised approach is used instead, the sentence vector representations are often of low quality, so the accuracy of case retrieval is hard to guarantee. Current practice simply composes word vectors into a sentence vector, resulting in very poor sentence representation quality.
In addition, when the case library holds a large amount of data, retrieving similar cases is also seriously time-consuming.
Disclosure of Invention
Embodiments of the present disclosure aim to solve at least one of the technical problems in the prior art, and provide a case retrieval method, a system, an electronic device, and a storage medium.
One aspect of the present disclosure provides a case retrieval method, including:
receiving a case retrieval request from a user; wherein the retrieval request includes a retrieval case sentence;
inputting the retrieval case sentence into a pre-trained case retrieval encoder to obtain a corresponding sentence vector representation; wherein the case retrieval encoder is pre-trained in an unsupervised manner based on contrastive learning;
obtaining cases similar to the retrieval case sentence according to the sentence vector representation and a set of case vector representations; wherein the set of case vector representations is obtained by processing the case set with the case retrieval encoder.
Optionally, the case retrieval encoder is trained by the following steps:
setting up the case retrieval encoder and a momentum encoder;
updating the network weights of the encoder and the momentum encoder according to a loss function $L_i$, to obtain the trained case retrieval encoder.
Optionally, the loss function $L_i$ satisfies the following relation:

$$L_i = -\log \frac{e^{\mathrm{sim}(h_i,\, h_i^{+})}}{e^{\mathrm{sim}(h_i,\, h_i^{+})} + \sum_{j=1}^{N} e^{\mathrm{sim}(h_i,\, h_j^{-})} + \sum_{m=1}^{M} e^{\mathrm{sim}(h_i,\, h_m^{-})}} \qquad (1)$$

wherein $M$ represents the queue size of the momentum encoder, $h_m^{-}$ represents the encoded representation of a negative sample in the momentum encoder's queue, $N$ represents the size of each mini-batch, $h_j^{-}$ represents the encoded representation of a negative sample in the mini-batch, $h_i$ and $h_i^{+}$ represent each sample and its augmented positive sample respectively, and $\mathrm{sim}(\cdot,\cdot)$ denotes the similarity between two vector representations.
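By way of illustration only, a loss of this form can be computed as in the following minimal sketch. It assumes cosine similarity for sim(·,·) and a temperature hyperparameter tau, neither of which is fixed by the text, so it is an interpretation rather than the patented implementation:

```python
import torch
import torch.nn.functional as F

def contrastive_loss(h, h_pos, queue, tau=0.05):
    """InfoNCE-style loss over in-batch and queue negatives.

    h     : (N, d) encoded mini-batch samples h_i
    h_pos : (N, d) their augmented positive views h_i^+
    queue : (M, d) negative encodings held in the momentum encoder's queue
    tau   : temperature (assumed hyperparameter, not given in the text)
    """
    h, h_pos, queue = (F.normalize(t, dim=-1) for t in (h, h_pos, queue))
    batch_logits = h @ h_pos.T / tau   # (N, N): diagonal entries are positives
    queue_logits = h @ queue.T / tau   # (N, M): queue negatives
    logits = torch.cat([batch_logits, queue_logits], dim=1)
    labels = torch.arange(h.size(0), device=h.device)  # positive index = i
    return F.cross_entropy(logits, labels)             # mean of L_i over batch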
Optionally, the updating of the network weights of the encoder and the momentum encoder includes:

updating the network weight of the encoder by back propagation;

updating the network weight of the momentum encoder by the following formula (2):

$$\theta_k \leftarrow m\,\theta_k + (1 - m)\,\theta_q \qquad (2)$$

wherein $\theta_k$ is the network weight of the momentum encoder, $\theta_q$ is the network weight of the encoder, and $m \in [0, 1)$ is the momentum coefficient.

During the update of the momentum encoder, each latest mini-batch of data enters a queue and the oldest data leaves the queue; when each mini-batch is trained, the encodings held in the queue are used as negative samples for contrastive learning.
Optionally, the obtaining of cases similar to the retrieval case sentence according to the sentence vector representation and the set of case vector representations includes:
calculating the similarity between the sentence vector representation and each case vector representation in the set respectively, and selecting the several cases with the highest similarity as the similar cases.
Optionally, the calculating of the similarity between the sentence vector representation and each case vector representation, and the selecting of the cases with the highest similarity as the similar cases, include:
distributing the case vector representations across a corresponding plurality of nodes;
for each node, calculating the cosine similarity between the sentence vector representation and the case vector representations held by that node, obtaining the case selection results of the plurality of nodes;
merging the case selection results of all nodes, sorting them in descending order of similarity, and selecting the top K cases as the similar cases.
Another aspect of the present disclosure provides a case retrieval system, the system comprising:
a receiving module, configured to receive a case retrieval request from a user, wherein the retrieval request includes a retrieval case sentence;
an encoding module, configured to input the retrieval case sentence into a pre-trained case retrieval encoder to obtain a corresponding sentence vector representation, wherein the case retrieval encoder is pre-trained in an unsupervised manner based on contrastive learning;
a calculation module, configured to obtain cases similar to the retrieval case sentence according to the sentence vector representation and the set of case vector representations, wherein the set of case vector representations is obtained by processing the case set with the case retrieval encoder.
Optionally, the system further comprises a training module, configured to:
set up the case retrieval encoder and a momentum encoder;
update the network weights of the encoder and the momentum encoder according to a loss function $L_i$, to obtain the trained case retrieval encoder.
Another aspect of the present disclosure provides an electronic device, including:
at least one processor; and
a memory communicatively coupled to the at least one processor, storing one or more programs that, when executed by the at least one processor, cause the at least one processor to implement the case retrieval method described above.
A final aspect of the present disclosure provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the case retrieval method described above.
The case retrieval method, system, electronic device, and storage medium of the present disclosure solve the problem of training on case-library data that carry no labels, avoiding the labor cost and time of manual labeling, and improve the quality of the retrieval sentence vector representation, thereby improving retrieval accuracy.
Drawings
FIG. 1 is a flow chart of a case retrieval method according to an embodiment of the disclosure;
FIG. 2 is a schematic diagram of a case retrieval system according to another embodiment of the disclosure;
FIG. 3 is a schematic structural diagram of an electronic device according to another embodiment of the disclosure.
Detailed Description
In order that those skilled in the art will better understand the technical solutions of the present disclosure, the present disclosure will be described in further detail with reference to the accompanying drawings and detailed description.
As shown in FIG. 1, an embodiment of the present disclosure provides a case retrieval method including the following steps.
step S11, training a case search encoder.
Taking a large number of cases as a training set, and performing unsupervised training based on contrast learning. In the embodiment of the disclosure, each sample x of the training set can only be set to 256 for a graphics card with 12G video memory because of the limitation of video memory. To increase the negative sample for comparison, the present embodiment adopts a contrast learning method of Momentum Contrast.
Specifically, a Chinese-version DeBERTa model is used as the case retrieval encoder, and a second copy is additionally set up as the momentum encoder, whose initial weights are copied from the encoder. The momentum encoder does not back-propagate gradients; its network weights are updated by the following formula (2):

$$\theta_k \leftarrow m\,\theta_k + (1 - m)\,\theta_q \qquad (2)$$

wherein $\theta_k$ is the network weight of the momentum encoder, $\theta_q$ is the network weight of the encoder, and $m \in [0, 1)$ is the momentum coefficient. Meanwhile, $\theta_q$ is updated by back propagation.

The momentum encoder maintains a queue of size $M$, equal to $Z$ times the batch size. During training, each latest mini-batch of data enters the queue and the oldest data leaves it; when each mini-batch is trained, the encodings held in the queue serve as negative samples for contrastive learning.
The loss function here is as follows (1):

$$L_i = -\log \frac{e^{\mathrm{sim}(h_i,\, h_i^{+})}}{e^{\mathrm{sim}(h_i,\, h_i^{+})} + \sum_{j=1}^{N} e^{\mathrm{sim}(h_i,\, h_j^{-})} + \sum_{m=1}^{M} e^{\mathrm{sim}(h_i,\, h_m^{-})}} \qquad (1)$$

wherein $M$ represents the queue size of the momentum encoder, $h_m^{-}$ represents the encoded representation of a negative sample in the momentum encoder's queue, $N$ represents the size of each mini-batch, $h_j^{-}$ represents the encoded representation of a negative sample in the mini-batch, $h_i$ and $h_i^{+}$ represent each sample and its augmented positive sample respectively, and $\mathrm{sim}(\cdot,\cdot)$ denotes the similarity between two vector representations.
Self-supervised contrastive training is performed with this loss function; the encoder weights obtained after the final training are the final weights of the pre-trained language model.
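Illustratively, the pieces above can be combined into the following condensed training sketch, which reuses the contrastive_loss, momentum_update, and dequeue_and_enqueue helpers from the earlier sketches. The checkpoint name, the use of the [CLS] vector as the sentence representation, dropout as the augmentation that produces the positive view, and all hyperparameters other than the batch size of 256 are assumptions, since the text specifies only a Chinese-version DeBERTa and Momentum Contrast:

```python
import copy
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

MODEL = "IDEA-CCNL/Erlangshen-DeBERTa-v2-97M-Chinese"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL)
encoder = AutoModel.from_pretrained(MODEL)
momentum_encoder = copy.deepcopy(encoder)   # initial weights copied from the encoder
for p in momentum_encoder.parameters():
    p.requires_grad = False                 # the momentum branch gets no gradients

BATCH, Z = 256, 16                          # queue size M = Z * BATCH (Z assumed)
DIM = encoder.config.hidden_size
queue = F.normalize(torch.randn(Z * BATCH, DIM), dim=-1)
optimizer = torch.optim.AdamW(encoder.parameters(), lr=3e-5)
encoder.train()                             # keep dropout active so two passes differ

def embed(model, sentences):
    toks = tokenizer(sentences, padding=True, truncation=True,
                     max_length=256, return_tensors="pt")
    return model(**toks).last_hidden_state[:, 0]       # [CLS] sentence vector

case_loader = [["示例案情描述"] * BATCH]     # placeholder; real training iterates the case library

for sentences in case_loader:
    h = embed(encoder, sentences)
    h_pos = embed(encoder, sentences)       # second dropout pass = positive view (assumption)
    with torch.no_grad():                   # keys for the queue come from the momentum branch
        k = F.normalize(embed(momentum_encoder, sentences), dim=-1)
    loss = contrastive_loss(h, h_pos, queue)
    optimizer.zero_grad()
    loss.backward()                         # back propagation updates the encoder
    optimizer.step()
    momentum_update(encoder, momentum_encoder)          # formula (2)
    queue = dequeue_and_enqueue(queue, k)   # newest batch in, oldest out
```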
Step S12, receiving a case retrieval request from a user.
Specifically, when the user needs to retrieve similar cases, the keyword sentence to be searched is entered on a computer or other device to form a retrieval case sentence, which completes the receipt of the user's retrieval request.
Step S13, inputting the retrieval case sentence into the pre-trained case retrieval encoder to obtain the corresponding sentence vector representation.
Specifically, in an initialization stage, all cases in the case library are first encoded by the case retrieval encoder obtained from the self-supervised contrastive training, yielding the vector representations $v_1, v_2, \ldots$ of all cases in the library. Each time the user searches, the retrieval sentence is passed through the case retrieval encoder to obtain its sentence vector representation $q$.
Step S14, obtaining cases similar to the retrieval case sentence according to the sentence vector representation and the set of case vector representations.
Specifically, the similarity between the sentence vector representation $q$ of the user's retrieval sentence and each case vector representation $v_i$ in the case library is calculated using the cosine function, and the top $K$ cases ($K \in \mathbb{N}$) with the highest similarity are the similar cases to be returned.
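Illustratively, steps S13 and S14 can be sketched as follows, reusing the embed helper and trained encoder from the training sketch above; the vectors are pre-normalised so that a dot product equals cosine similarity, and the function names are illustrative:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def build_case_index(encoder, case_texts):
    """Initialization stage: encode every case in the library once."""
    encoder.eval()                              # deterministic inference encoding
    vecs = torch.cat([embed(encoder, case_texts[i:i + 64])
                      for i in range(0, len(case_texts), 64)])
    return F.normalize(vecs, dim=-1)            # (S, d), unit-length rows

@torch.no_grad()
def search_similar_cases(encoder, case_index, query, k=10):
    """Per-query stage: encode the retrieval sentence, rank by cosine."""
    q = F.normalize(embed(encoder, [query]), dim=-1).squeeze(0)
    sims = case_index @ q                       # cosine similarity per case
    top = torch.topk(sims, min(k, sims.numel()))
    return list(zip(top.indices.tolist(), top.values.tolist()))
```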
Embodiments of the present disclosure thus provide an unsupervised, contrastive-learning-based case retrieval method that solves the problem of training when the case-library data carry no labels, avoids the labor cost and time of manual labeling, and improves the quality of the retrieval sentence vector representation, thereby improving retrieval accuracy.
Illustratively, to speed up the similarity computation in step S14, embodiments of the present disclosure provide a distributed computation method, including:
distributing the case vector representations across a corresponding plurality of nodes;
for each node, calculating the cosine similarity between the sentence vector representation and the case vector representations held by that node, obtaining the case selection results of the plurality of nodes;
merging the case selection results of all nodes, sorting them in descending order of similarity, and selecting the top K cases as the similar cases.
Specifically, the case vector representations $v_i$ are distributed across $n$ nodes: master, node1, node2, node3, …. Suppose the number of cases to be searched is $S$; the case vectors are distributed evenly so that each node holds $S/n$ of them. Each node then computes in parallel, by cosine similarity, the similarity between the sentence vector representation $q$ and the case vectors it holds, and takes its local top-$K$ results. The master node merges the partial results of all $n$ nodes, sorts the merged results by similarity from high to low, and returns the top $K$. Ideally, the computation time is inversely proportional to the number of nodes $n$.
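Illustratively, the scatter/compute/merge scheme can be sketched as follows; a thread pool stands in for the master and node1…noden machines of a real cluster, and all names are illustrative:

```python
import heapq
import torch
import torch.nn.functional as F
from concurrent.futures import ThreadPoolExecutor

def node_top_k(vecs, offset, q, k):
    """One node's share of the work: cosine similarity over its shard of
    case vectors, returning that node's local top-K (score, global index)."""
    sims = F.normalize(vecs, dim=-1) @ F.normalize(q, dim=0)
    top = torch.topk(sims, min(k, sims.numel()))
    return [(s.item(), offset + i.item())
            for s, i in zip(top.values, top.indices)]

def distributed_search(case_vectors, q, n_nodes=4, k=10):
    # Scatter: split the S case vectors evenly, roughly S/n per node.
    shards = list(torch.chunk(case_vectors, n_nodes))
    offsets, acc = [], 0
    for s in shards:
        offsets.append(acc)
        acc += len(s)
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:  # stand-in for real nodes
        futures = [pool.submit(node_top_k, s, o, q, k)
                   for s, o in zip(shards, offsets)]
        partial = [hit for f in futures for hit in f.result()]
    # Master node: merge the n partial lists, sort descending, keep the top K.
    return heapq.nlargest(k, partial)
```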
Embodiments of the present disclosure thus provide a distributed similarity calculation method, which solves the problem of serious time consumption when searching a large number of cases and improves retrieval efficiency.
Another embodiment of the present disclosure provides a case retrieval system, as shown in FIG. 2, comprising:
a receiving module 201, configured to receive a case retrieval request from a user, wherein the retrieval request includes a retrieval case sentence;
an encoding module 202, configured to input the retrieval case sentence into the pre-trained case retrieval encoder to obtain a corresponding sentence vector representation, wherein the case retrieval encoder is pre-trained in an unsupervised manner based on contrastive learning;
a calculation module 203, configured to obtain cases similar to the retrieval case sentence according to the sentence vector representation and the set of case vector representations, wherein the set of case vector representations is obtained by processing the case set with the case retrieval encoder.
Specifically, when the user needs to retrieve similar cases, the keyword sentence to be searched is entered on a computer or other device to form a retrieval case sentence, and the receiving module 201 receives it. In the initialization stage, the encoding module 202 passes all cases in the case library through the contrastively trained case retrieval encoder to obtain the vector representations $v_i$ of all cases in the library. Each time the user searches, the retrieval sentence is passed through the case retrieval encoder to obtain its sentence vector representation $q$. The calculation module 203 computes the similarity between $q$ and each case vector representation $v_i$ in the case library using the cosine function; the top $K$ cases ($K \in \mathbb{N}$) with the highest similarity are the similar cases to be returned.
Illustratively, the system further includes a training module 204, configured to:
set up the case retrieval encoder and a momentum encoder;
update the network weights of the encoder and the momentum encoder according to a loss function $L_i$, to obtain the trained case retrieval encoder.
Specifically, the training module 204 performs the training method of the case retrieval encoder described in step S11, thereby obtaining a trained case retrieval encoder for use by the encoding module 202.
The case retrieval system of this embodiment, by means of the unsupervised contrastive-learning-based retrieval method, solves the problem of training on case-library data that carry no labels, avoiding the labor cost and time of manual labeling, and improves the quality of the retrieval sentence vector representation, thereby improving retrieval accuracy.
As shown in fig. 3, another embodiment of the present disclosure provides an electronic device, including:
at least one processor 301, and a memory 302 communicatively coupled to the at least one processor 301 for storing one or more programs that, when executed by the at least one processor 301, enable the at least one processor 301 to implement a class retrieval method as described above.
Where the memory and the processor are connected by a bus, the bus may comprise any number of interconnected buses and bridges, the buses connecting the various circuits of the one or more processors and the memory together. The bus may also connect various other circuits such as peripherals, voltage regulators, and power management circuits, which are well known in the art, and therefore, will not be described any further herein. The bus interface provides an interface between the bus and the transceiver. The transceiver may be one element or may be a plurality of elements, such as a plurality of receivers and transmitters, providing a means for communicating with various other apparatus over a transmission medium. The data processed by the processor is transmitted over the wireless medium via the antenna, which further receives the data and transmits the data to the processor.
The processor is responsible for managing the bus and general processing and may also provide various functions including timing, peripheral interfaces, voltage regulation, power management, and other control functions. And memory may be used to store data used by the processor in performing operations.
The electronic device of this embodiment, by implementing the above case retrieval method, solves the problem of training on case-library data that carry no labels, avoiding the labor cost and time of manual labeling, and improves the quality of the retrieval sentence vector representation, thereby improving retrieval accuracy.
Another embodiment of the present disclosure provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the case retrieval method described above.
The computer readable storage medium may be included in the system and the electronic device of the present disclosure, or may exist alone.
A computer-readable storage medium may be any tangible medium that contains or stores a program; it may be an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device. More specific examples include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, an optical fiber, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination thereof.
The computer readable storage medium may also include a data signal propagated in baseband or as part of a carrier wave, with the computer readable program code embodied therein, specific examples of which include, but are not limited to, electromagnetic signals, optical signals, or any suitable combination thereof.
It is to be understood that the above embodiments are merely exemplary embodiments employed to illustrate the principles of the present disclosure, however, the present disclosure is not limited thereto. Various modifications and improvements may be made by those skilled in the art without departing from the spirit and substance of the disclosure, and are also considered to be within the scope of the disclosure.

Claims (10)

1. A case retrieval method, the method comprising:
receiving a case retrieval request from a user; wherein the retrieval request includes a retrieval case sentence;
inputting the retrieval case sentence into a pre-trained case retrieval encoder to obtain a corresponding sentence vector representation; wherein the case retrieval encoder is pre-trained in an unsupervised manner based on contrastive learning;
obtaining cases similar to the retrieval case sentence according to the sentence vector representation and a set of case vector representations; wherein the set of case vector representations is obtained by processing the case set with the case retrieval encoder.
2. The case retrieval method according to claim 1, wherein the case retrieval encoder is trained by:
setting up the case retrieval encoder and a momentum encoder;
updating the network weights of the encoder and the momentum encoder according to a loss function $L_i$, to obtain the trained case retrieval encoder.
3. The case retrieval method according to claim 2, wherein the loss function $L_i$ satisfies the following relation:

$$L_i = -\log \frac{e^{\mathrm{sim}(h_i,\, h_i^{+})}}{e^{\mathrm{sim}(h_i,\, h_i^{+})} + \sum_{j=1}^{N} e^{\mathrm{sim}(h_i,\, h_j^{-})} + \sum_{m=1}^{M} e^{\mathrm{sim}(h_i,\, h_m^{-})}} \qquad (1)$$

wherein $M$ represents the queue size of the momentum encoder, $h_m^{-}$ represents the encoded representation of a negative sample in the momentum encoder's queue, $N$ represents the size of each mini-batch, $h_j^{-}$ represents the encoded representation of a negative sample in the mini-batch, $h_i$ and $h_i^{+}$ represent each sample and its augmented positive sample respectively, and $\mathrm{sim}(\cdot,\cdot)$ denotes the similarity between two vector representations.
4. The case retrieval method according to claim 2, wherein the updating of the network weights of the encoder and the momentum encoder includes:
updating the network weight of the encoder by back propagation;
updating the network weight of the momentum encoder by the following formula (2):

$$\theta_k \leftarrow m\,\theta_k + (1 - m)\,\theta_q \qquad (2)$$

wherein $\theta_k$ is the network weight of the momentum encoder, $\theta_q$ is the network weight of the encoder, and $m \in [0, 1)$ is the momentum coefficient; and
during the update of the momentum encoder, each latest mini-batch of data enters a queue and the oldest data leaves the queue, and when each mini-batch is trained, the encodings held in the queue are used as negative samples for contrastive learning.
5. The case retrieval method according to claim 1, wherein the obtaining of cases similar to the retrieval case sentence according to the sentence vector representation and the set of case vector representations includes:
calculating the similarity between the sentence vector representation and each case vector representation in the set respectively, and selecting the several cases with the highest similarity as the similar cases.
6. The case retrieval method according to claim 5, wherein the calculating of the similarity between the sentence vector representation and each case vector representation, and the selecting of the cases with the highest similarity as the similar cases, include:
distributing the case vector representations across a corresponding plurality of nodes;
for each node, calculating the cosine similarity between the sentence vector representation and the case vector representations held by that node, obtaining the case selection results of the plurality of nodes;
merging the case selection results of all nodes, sorting them in descending order of similarity, and selecting the top K cases as the similar cases.
7. A case retrieval system, the system comprising:
a receiving module, configured to receive a case retrieval request from a user, wherein the retrieval request includes a retrieval case sentence;
an encoding module, configured to input the retrieval case sentence into a pre-trained case retrieval encoder to obtain a corresponding sentence vector representation, wherein the case retrieval encoder is pre-trained in an unsupervised manner based on contrastive learning;
a calculation module, configured to obtain cases similar to the retrieval case sentence according to the sentence vector representation and the set of case vector representations, wherein the set of case vector representations is obtained by processing the case set with the case retrieval encoder.
8. The case retrieval system of claim 7, further comprising a training module configured to:
set up the case retrieval encoder and a momentum encoder;
update the network weights of the encoder and the momentum encoder according to a loss function $L_i$, to obtain the trained case retrieval encoder.
9. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor, storing one or more programs that, when executed by the at least one processor, cause the at least one processor to implement the case retrieval method of any one of claims 1 to 6.
10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the case retrieval method of any one of claims 1 to 6.
CN202310187729.3A 2023-03-02 2023-03-02 Class search method, system, electronic equipment and storage medium Pending CN116069903A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310187729.3A CN116069903A (en) 2023-03-02 2023-03-02 Class search method, system, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310187729.3A CN116069903A (en) 2023-03-02 2023-03-02 Class search method, system, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116069903A 2023-05-05

Family

ID=86180194

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310187729.3A Pending CN116069903A (en) 2023-03-02 2023-03-02 Class search method, system, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116069903A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113837370A (en) * 2021-10-20 2021-12-24 北京房江湖科技有限公司 Method and apparatus for training a model based on contrast learning
CN113934830A (en) * 2021-10-19 2022-01-14 平安国际智慧城市科技股份有限公司 Text retrieval model training, question and answer retrieval method, device, equipment and medium
CN114154518A (en) * 2021-12-02 2022-03-08 泰康保险集团股份有限公司 Data enhancement model training method and device, electronic equipment and storage medium
CN114201581A (en) * 2021-11-29 2022-03-18 中国科学院深圳先进技术研究院 Long text retrieval model based on contrast learning
CN114443891A (en) * 2022-01-14 2022-05-06 北京有竹居网络技术有限公司 Encoder generation method, fingerprint extraction method, medium, and electronic device
CN114881043A (en) * 2022-07-11 2022-08-09 四川大学 Deep learning model-based legal document semantic similarity evaluation method and system
CN115495555A (en) * 2022-09-26 2022-12-20 中国科学院深圳先进技术研究院 Document retrieval method and system based on deep learning
CN115640799A (en) * 2022-09-07 2023-01-24 天津工业大学 Sentence vector characterization method based on enhanced momentum contrast learning

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113934830A (en) * 2021-10-19 2022-01-14 平安国际智慧城市科技股份有限公司 Text retrieval model training, question and answer retrieval method, device, equipment and medium
CN113837370A (en) * 2021-10-20 2021-12-24 北京房江湖科技有限公司 Method and apparatus for training a model based on contrast learning
CN114201581A (en) * 2021-11-29 2022-03-18 中国科学院深圳先进技术研究院 Long text retrieval model based on contrast learning
CN114154518A (en) * 2021-12-02 2022-03-08 泰康保险集团股份有限公司 Data enhancement model training method and device, electronic equipment and storage medium
CN114443891A (en) * 2022-01-14 2022-05-06 北京有竹居网络技术有限公司 Encoder generation method, fingerprint extraction method, medium, and electronic device
CN114881043A (en) * 2022-07-11 2022-08-09 四川大学 Deep learning model-based legal document semantic similarity evaluation method and system
CN115640799A (en) * 2022-09-07 2023-01-24 天津工业大学 Sentence vector characterization method based on enhanced momentum contrast learning
CN115495555A (en) * 2022-09-26 2022-12-20 中国科学院深圳先进技术研究院 Document retrieval method and system based on deep learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
TING CHEN et al.: "A Simple Framework for Contrastive Learning of Visual Representations", Retrieved from the Internet <URL:http://arxiv.org> *

Similar Documents

Publication Publication Date Title
US20190057164A1 (en) Search method and apparatus based on artificial intelligence
CN110598078B (en) Data retrieval method and device, computer-readable storage medium and electronic device
CN113806582B (en) Image retrieval method, image retrieval device, electronic equipment and storage medium
CN111382270A (en) Intention recognition method, device and equipment based on text classifier and storage medium
CN114329029B (en) Object retrieval method, device, equipment and computer storage medium
CN114385780A (en) Program interface information recommendation method and device, electronic equipment and readable medium
CN110634050B (en) Method, device, electronic equipment and storage medium for identifying house source type
CN114494709A (en) Feature extraction model generation method, image feature extraction method and device
CN113591490B (en) Information processing method and device and electronic equipment
CN114490926A (en) Method and device for determining similar problems, storage medium and terminal
CN112487813A (en) Named entity recognition method and system, electronic equipment and storage medium
CN104615620A (en) Map search type identification method and device and map search method and system
JP2022541832A (en) Method and apparatus for retrieving images
CN116069903A (en) Class search method, system, electronic equipment and storage medium
CN113240089B (en) Graph neural network model training method and device based on graph retrieval engine
CN114880991A (en) Knowledge map question-answer entity linking method, device, equipment and medium
CN110688508B (en) Image-text data expansion method and device and electronic equipment
CN115146033A (en) Named entity identification method and device
CN112417260B (en) Localized recommendation method, device and storage medium
CN109325198B (en) Resource display method and device and storage medium
EP4127957A1 (en) Methods and systems for searching and retrieving information
CN112784600A (en) Information sorting method and device, electronic equipment and storage medium
CN111949765A (en) Similar text searching method, system, equipment and storage medium based on semantics
CN110781227B (en) Information processing method and device
CN116049414B (en) Topic description-based text clustering method, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 2023-05-05