CN111581332A - Similar judicial case matching method and system based on triple deep hash learning - Google Patents


Info

Publication number
CN111581332A
Authority
CN
China
Prior art keywords
judicial case
document
matched
documents
hash
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010354059.6A
Other languages
Chinese (zh)
Inventor
尹义龙
聂秀山
刘兴波
崔超然
韩晓晖
马玉玲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong University
Original Assignee
Shandong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong University filed Critical Shandong University
Priority to CN202010354059.6A priority Critical patent/CN111581332A/en
Publication of CN111581332A publication Critical patent/CN111581332A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F16/316 Indexing structures
    • G06F16/325 Hash tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure provides a similar judicial case matching method and system based on triple deep hash learning. The method comprises: obtaining a judicial case document to be matched; inputting the judicial case document to be matched into a pre-trained feature extraction model to obtain a feature representation vector of the judicial case document to be matched; inputting the feature representation vector of the judicial case document to be matched into a pre-trained triple deep hash learning model to obtain a hash code of the judicial case document to be matched; and calculating the similarity of judicial case documents based on the hash code of the judicial case document to be matched and the hash codes of known judicial case documents. Accurate similarity matching of judicial case documents is thereby achieved.

Description

Similar judicial case matching method and system based on triple deep hash learning
Technical Field
The disclosure relates to the technical field of natural language processing and big data retrieval, in particular to a similar judicial case matching method and system based on triple deep hash learning.
Background
The statements in this section merely provide background information related to the present disclosure and may not constitute prior art.
With the development of society, the number of legal cases of all kinds is growing rapidly, and similar case matching technology has therefore attracted wide attention. To pursue accuracy, existing methods generally convert case documents into real-valued representations, measure similarity by computing distances between those representations, and judge the degree of matching accordingly. In implementing the present disclosure, the inventors found that this approach is not suitable for large-scale similar case matching scenarios.
Disclosure of Invention
In order to overcome the deficiencies of the prior art, the present disclosure provides a similar judicial case matching method and system based on triple deep hash learning.
In a first aspect, the present disclosure provides a similar judicial case matching method based on triple deep hash learning.
The similar judicial case matching method based on triple deep hash learning comprises:
acquiring a judicial case document to be matched;
inputting the judicial case document to be matched into a pre-trained feature extraction model to obtain a feature representation vector of the judicial case document to be matched;
inputting the feature representation vector of the judicial case document to be matched into a pre-trained triple deep hash learning model to obtain a hash code of the judicial case document to be matched; and
calculating the similarity of judicial case documents based on the hash code of the judicial case document to be matched and the hash codes of known judicial case documents.
In a second aspect, the present disclosure provides a similar judicial case matching system based on triple deep hash learning.
The similar judicial case matching system based on triple deep hash learning comprises:
an acquisition module configured to acquire a judicial case document to be matched;
a feature extraction module configured to input the judicial case document to be matched into a pre-trained feature extraction model to obtain a feature representation vector of the judicial case document to be matched;
a hash code extraction module configured to input the feature representation vector of the judicial case document to be matched into a pre-trained triple deep hash learning model to obtain a hash code of the judicial case document to be matched; and
a similarity matching module configured to calculate the similarity of judicial case documents based on the hash code of the judicial case document to be matched and the hash codes of known judicial case documents.
In a third aspect, the present disclosure also provides an electronic device, including: one or more processors, one or more memories, and one or more computer programs; wherein a processor is connected to the memory, the one or more computer programs are stored in the memory, and when the electronic device is running, the processor executes the one or more computer programs stored in the memory, so as to make the electronic device execute the method according to the first aspect.
In a fourth aspect, the present disclosure also provides a computer-readable storage medium for storing computer instructions which, when executed by a processor, perform the method of the first aspect.
In a fifth aspect, the present disclosure also provides a computer program (product) comprising a computer program for implementing the method of any one of the preceding first aspects when run on one or more processors.
Compared with the prior art, the beneficial effects of the present disclosure are:
inputting the judicial case document to be matched into a pre-trained feature extraction model yields a feature representation vector of the judicial case document to be matched, which facilitates accurate similarity matching of judicial case documents; and
inputting the feature representation vector of the judicial case document to be matched into the pre-trained triple deep hash learning model yields the hash code of the judicial case document to be matched, thereby realizing accurate similarity matching of judicial case documents.
Drawings
The accompanying drawings, which are included to provide a further understanding of the disclosure, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure and are not to limit the disclosure.
FIG. 1 is a flow chart of the method of the first embodiment.
Detailed Description
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present disclosure. As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise, and it should be understood that the terms "comprises" and "comprising", and any variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The embodiments and features of the embodiments in the present disclosure may be combined with each other without conflict.
Example one
The embodiment provides a similar judicial case matching method based on triple deep hash learning;
as shown in fig. 1, the similar judicial case matching method based on triple deep hash learning includes:
S101: acquiring a judicial case document to be matched;
S102: inputting the judicial case document to be matched into a pre-trained feature extraction model to obtain a feature representation vector of the judicial case document to be matched;
S103: inputting the feature representation vector of the judicial case document to be matched into a pre-trained triple deep hash learning model to obtain a hash code of the judicial case document to be matched;
S104: calculating the similarity of judicial case documents based on the hash code of the judicial case document to be matched and the hash codes of known judicial case documents.
As one or more embodiments, in S101, the judicial case document to be matched is obtained through the following specific steps:
acquiring the judicial case document to be matched;
deleting characters that have no practical significance from the judicial case document to be matched; and
grouping the processed judicial case document into groups of N Chinese characters each, where N is a positive integer.
It should be understood that deleting characters that have no practical significance from the judicial case document to be matched comprises:
removing characters such as digits, punctuation marks, and function words without practical meaning by means of text preprocessing.
It should be understood that grouping the processed judicial case document into groups of N Chinese characters each comprises:
dividing the document into groups of 1024 Chinese characters.
Further, a document fragment with fewer than 1024 Chinese characters is padded with the digit 0.
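The preprocessing steps above (stripping characters with no practical significance, grouping into 1024-character segments, zero-padding the final segment) can be sketched as follows. This is a minimal illustration; the regular expression used to decide which characters count as "having no practical significance" is an assumption, since the disclosure does not give an exact character set:

```python
import re

GROUP_SIZE = 1024  # number of Chinese characters per group, as in the disclosure


def preprocess(document: str, group_size: int = GROUP_SIZE) -> list:
    """Strip non-substantive characters and split the document into
    fixed-size groups, padding the final group with the digit '0'."""
    # Keep only CJK unified ideographs; digits, punctuation and other
    # symbols are treated here as having no practical significance.
    cleaned = "".join(re.findall(r"[\u4e00-\u9fff]", document))
    groups = [cleaned[i:i + group_size]
              for i in range(0, len(cleaned), group_size)]
    if groups and len(groups[-1]) < group_size:
        groups[-1] = groups[-1].ljust(group_size, "0")  # pad with digit 0
    return groups
```

A 10-character document yields one group padded to 1024 characters; an empty document yields no groups.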
As one or more embodiments, in S102, the judicial case document to be matched is input into the pre-trained feature extraction model to obtain the feature representation vector of the judicial case document to be matched through the following specific steps:
inputting each group of Chinese characters into the pre-trained feature extraction model to obtain a vector representation, and repeating this step to obtain the vector representation corresponding to each group of Chinese characters; and
concatenating all vector representations to obtain the feature representation vector of the judicial case document to be matched.
Illustratively, the feature extraction model may be the natural language processing model BERT.
The natural language processing model BERT is used so that the number of characters in the document is reduced as much as possible while the document semantics are preserved, thereby reducing the dimensionality of the compressed document feature representation.
Specifically, the training set used to train the feature extraction model consists of multiple groups of Chinese characters with known vector representations.
It should be understood that inputting each group of Chinese characters into the pre-trained feature extraction model yields a 768-dimensional vector representation for each group.
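The group-wise encoding and concatenation of S102 can be sketched as below. A real implementation would run each group through a pre-trained BERT encoder (for example via the `transformers` library) to obtain its 768-dimensional vector; here a deterministic stub stands in for BERT so the concatenation logic is self-contained. The stub function, its seeding trick, and the function names are illustrative assumptions, not the disclosure's actual model:

```python
import numpy as np

BERT_DIM = 768  # dimensionality of each group's vector, as stated in the disclosure


def encode_group_stub(group: str) -> np.ndarray:
    """Placeholder for a BERT forward pass: returns a deterministic
    768-dimensional vector derived from the group's characters."""
    rng = np.random.default_rng(abs(hash(group)) % (2 ** 32))
    return rng.standard_normal(BERT_DIM)


def document_feature(groups: list) -> np.ndarray:
    """Encode each group and concatenate the per-group vectors into a
    single feature representation vector for the whole document."""
    vectors = [encode_group_stub(g) for g in groups]
    return np.concatenate(vectors) if vectors else np.zeros(0)
```

A document split into three groups thus yields a 3 × 768 = 2304-dimensional feature representation vector.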
As one or more embodiments, in S103, the triple deep hash learning model specifically includes: a deep neural network, the loss function of which is a triplet loss function.
As one or more embodiments, in S103, the training process of the pre-trained triple deep hash learning model includes:
constructing a hash learning model and constructing a training set; and
inputting the training set into the hash learning model for training, and stopping training when the triplet loss function reaches its minimum, thereby obtaining the pre-trained triple deep hash learning model.
Further, the training set is a plurality of document triplets; each document triplet comprises a known feature representation vector of each document in the three documents and a known hash code of each document in the three documents;
assuming that a document triplet is represented as (d, d1, d2), d represents a feature representation vector of a first document, d1 represents a feature representation vector of a second document, and d2 represents a feature representation vector of a third document, for the training set, the similarity between the feature representation vector of the first document and the feature representation vector of the second document is greater than the similarity between the feature representation vector of the first document and the feature representation vector of the third document.
Further, the loss function is:
(The loss formula is rendered as an image, Figure BDA0002472866590000061, in the original document.)
wherein F is the deep neural network; I, I+ and I- are the feature representation vectors extracted by the BERT model for the first, second and third documents, respectively; K is the hash code length; F(I), F(I+) and F(I-) are the hash codes of the first, second and third documents, respectively; and Ltriplet(F(I), F(I+), F(I-)) denotes the loss function.
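The exact formula appears only as an image in the original, but the surrounding description (hash codes F(I), F(I+), F(I-), with the first and second documents more similar than the first and third) matches a standard margin-based triplet loss. A minimal sketch under that assumption, with an arbitrarily chosen margin value:

```python
import numpy as np


def triplet_loss(f_i, f_ip, f_in, margin: float = 1.0) -> float:
    """Standard margin-based triplet loss: pull F(I) toward F(I+) and
    push it away from F(I-). The margin value is an assumption."""
    d_pos = np.sum((f_i - f_ip) ** 2)  # squared distance to the similar document
    d_neg = np.sum((f_i - f_in) ** 2)  # squared distance to the dissimilar document
    return float(max(0.0, d_pos - d_neg + margin))
```

When the positive pair is already much closer than the negative pair (by more than the margin), the loss is zero and the triplet contributes no gradient.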
The loss function is established based on triplet document similarity; that is, it is designed according to the consistency between the similarities among the three documents and the similarities among their corresponding hash codes, so that the final document hash codes preserve the similarity relations among the original documents.
Given the feature representations I, I+ and I- of three documents, the similarity between document I and document I+ is greater than the similarity between document I and document I-.
Using the document feature representations extracted by BERT in the previous step as input, a deep neural network is used to learn a nonlinear mapping from the document feature representation to Hamming space, which is used to generate the hash representation of an unknown document.
In this disclosure, to reduce training overhead and model complexity, F is defined as a deep neural network with two hidden layers. The first hidden layer uses the ReLU activation function to mitigate vanishing and exploding gradients, and the second hidden layer uses the sigmoid activation function to map outputs into the interval (0, 1), so that 0.5 can be used as the threshold for converting real values into binary codes (hash codes). The neural network is trained with stochastic gradient descent, with the learning rate set to 0.001 and the number of iterations set to 120 epochs.
The feature representation of a document is input into the pre-trained hash learning deep neural network to obtain a real-valued representation of the document. This real-valued representation is then binarized with a threshold of 0.5: values greater than 0.5 become 1 and values less than 0.5 become 0. Finally, the document is converted into a hash code of length K.
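The forward pass described above (two hidden layers, ReLU then sigmoid, binarization at 0.5) can be sketched as follows. The layer widths and the random weights are placeholders for illustration; in the disclosure the weights would be learned by stochastic gradient descent (learning rate 0.001, 120 epochs):

```python
import numpy as np


def relu(x):
    return np.maximum(0.0, x)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def hash_forward(x: np.ndarray, w1: np.ndarray, w2: np.ndarray) -> np.ndarray:
    """Forward pass of the two-hidden-layer hash network F: ReLU in the
    first hidden layer, sigmoid in the second to map outputs into (0, 1),
    then thresholding at 0.5 to produce a K-bit hash code."""
    h1 = relu(x @ w1)              # first hidden layer
    real_code = sigmoid(h1 @ w2)   # real-valued representation, length K
    return (real_code > 0.5).astype(np.uint8)  # binarize at 0.5


# Illustrative shapes: feature dim 8, hidden width 16, hash length K = 4.
rng = np.random.default_rng(0)
w1 = rng.standard_normal((8, 16))
w2 = rng.standard_normal((16, 4))
code = hash_forward(rng.standard_normal(8), w1, w2)
```

The output is a binary vector of length K; every entry is 0 or 1 by construction of the threshold step.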
As one or more embodiments, in S104, the similarity of judicial case documents is calculated based on the hash code of the judicial case document to be matched and the hash codes of known judicial case documents, comprising: calculating the similarity of the judicial case documents according to the Hamming distance between the hash code of the judicial case document to be matched and the hash codes of the known judicial case documents.
Specifically, when the Hamming distance is less than a set threshold, the similarity of the judicial case documents is high; otherwise, the similarity of the judicial case documents is low.
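The Hamming-distance matching of S104 can be sketched directly on bit vectors; the threshold value below is an assumption, since the disclosure does not specify one:

```python
import numpy as np


def hamming_distance(code_a: np.ndarray, code_b: np.ndarray) -> int:
    """Hamming distance between two hash codes: the number of bit
    positions where they differ (equivalent to XOR plus popcount)."""
    return int(np.count_nonzero(code_a != code_b))


def is_similar(code_a, code_b, threshold: int = 8) -> bool:
    """Two judicial case documents are judged similar when the Hamming
    distance between their hash codes is below the set threshold."""
    return hamming_distance(code_a, code_b) < threshold
```

On packed integer codes the same distance can be computed with a hardware XOR and popcount, which is the source of the efficiency advantage discussed below.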
Most existing case matching algorithms compare case similarity using real-valued representations, which is unsuitable for large-scale matching. Hash methods can convert multimedia data such as documents, images and videos into compact binary codes while preserving the similarity relations among the original data. The distance between binary codes (also called hash codes) is measured by the Hamming distance, which can be computed quickly with hardware XOR operations. Hash methods therefore offer great advantages in both storage and efficiency.
By adopting hash learning, the present disclosure greatly reduces the storage overhead of document representations and improves the matching speed of similar cases, making the method suitable for large-scale similar case matching scenarios.
Table 1 shows a simulation experiment of the disclosed method, measured by matching accuracy. The dataset used for this task consists of legal documents from the public "network of official documents", where each sample consists of three legal documents. For each legal document, only the description of the facts is provided.
Each sample is represented by (d, d1, d2), where d, d1 and d2 each correspond to a document. For the training data, it is guaranteed that document d is more similar to d1 than to d2, i.e., sim(d, d1) > sim(d, d2). The dataset comprises five thousand document triplets in total, all of which concern private lending cases. 4500 document triplets were used as the training set and 500 as the test set.
Compared with the prior art, the method and the device have the advantages that Hash learning is adopted, so that the storage cost of document feature representation is greatly reduced, and the matching speed of similar cases is improved.
Table 1 Comparison of the accuracy of the present disclosure with other algorithms
(Table 1 is rendered as an image, Figure BDA0002472866590000081, in the original document.)
Example two
The embodiment provides a similar judicial case matching system based on triple deep hash learning;
the similar judicial case matching system based on triple deep hash learning comprises the following steps:
an acquisition module configured to: acquiring a judicial case document to be matched;
a feature extraction module configured to: inputting the judicial case document to be matched into a pre-trained feature extraction model to obtain a feature expression vector of the judicial case document to be matched;
a hash code extraction module configured to: simultaneously inputting the feature expression vectors of the judicial case documents to be matched into a pre-trained triple deep Hash learning model to obtain Hash codes of the judicial case documents to be matched;
a similarity matching module configured to: and calculating the similarity of the judicial case documents based on the hash codes of the judicial case documents to be matched and the hash codes of the known judicial case documents.
It should be noted here that the acquisition module, the feature extraction module, the hash code extraction module and the similarity matching module correspond to steps S101 to S104 in the first embodiment; the modules share the same implementation examples and application scenarios as the corresponding steps, but are not limited to the contents disclosed in the first embodiment. It should also be noted that the modules described above, as part of a system, may be implemented in a computer system as a set of computer-executable instructions.
In the foregoing embodiments, the descriptions of the embodiments have different emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
The proposed system can be implemented in other ways. For example, the above-described system embodiments are merely illustrative, and for example, the division of the above-described modules is merely a logical functional division, and in actual implementation, there may be other divisions, for example, multiple modules may be combined or integrated into another system, or some features may be omitted, or not executed.
EXAMPLE III
The present embodiment also provides an electronic device, including: one or more processors, one or more memories, and one or more computer programs; wherein, a processor is connected with the memory, the one or more computer programs are stored in the memory, and when the electronic device runs, the processor executes the one or more computer programs stored in the memory, so as to make the electronic device execute the method according to the first embodiment.
It should be understood that in this embodiment, the processor may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory may include both read-only memory and random access memory, and may provide instructions and data to the processor, and a portion of the memory may also include non-volatile random access memory. For example, the memory may also store device type information.
In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in a processor or instructions in the form of software.
The method in the first embodiment may be implemented directly by a hardware processor, or by a combination of hardware and software modules in the processor. The software modules may be located in RAM, flash memory, ROM, PROM, EPROM, registers, or other storage media well known in the art. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the method in combination with its hardware. To avoid repetition, details are not described here.
Those of ordinary skill in the art will appreciate that the various illustrative elements, i.e., algorithm steps, described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
Example four
The present embodiments also provide a computer-readable storage medium for storing computer instructions, which when executed by a processor, perform the method of the first embodiment.
The above description is only a preferred embodiment of the present disclosure and is not intended to limit the present disclosure, and various modifications and changes may be made to the present disclosure by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present disclosure should be included in the protection scope of the present disclosure.

Claims (10)

1. A similar judicial case matching method based on triple deep hash learning, characterized by comprising:
acquiring a judicial case document to be matched;
inputting the judicial case document to be matched into a pre-trained feature extraction model to obtain a feature representation vector of the judicial case document to be matched;
inputting the feature representation vector of the judicial case document to be matched into a pre-trained triple deep hash learning model to obtain a hash code of the judicial case document to be matched; and
calculating the similarity of judicial case documents based on the hash code of the judicial case document to be matched and the hash codes of known judicial case documents.
2. The method as claimed in claim 1, characterized by obtaining the judicial case documents to be matched; the method comprises the following specific steps:
acquiring a judicial case document to be matched;
deleting characters which have no practical significance to the judicial case documents to be matched;
grouping the processed judicial case documents into a group according to N Chinese characters; n is a positive integer.
3. The method as claimed in claim 1, wherein the judicial case documents to be matched are input into the pre-trained feature extraction model to obtain the feature expression vectors of the judicial case documents to be matched; the method comprises the following specific steps:
inputting each group of grouped Chinese characters into a pre-trained feature extraction model to obtain vector representation; repeating the current step to obtain the vector representation corresponding to each group of Chinese characters;
and splicing all vector representations to obtain the feature representation vectors of the judicial case documents to be matched.
4. The method of claim 1, wherein the pre-trained triple deep hash learning model is trained by a process comprising:
constructing a Hash learning model; constructing a training set;
and inputting the training set into a Hash learning model for training, and stopping training when the triple loss function reaches the minimum value to obtain a pre-trained triple deep Hash learning model.
5. The method of claim 1, wherein the training set is a plurality of document triplets; each document triplet comprises a known feature representation vector of each document in the three documents and a known hash code of each document in the three documents;
assuming that a document triplet is represented as (d, d1, d2), d represents a feature representation vector of a first document, d1 represents a feature representation vector of a second document, and d2 represents a feature representation vector of a third document, for the training set, the similarity between the feature representation vector of the first document and the feature representation vector of the second document is greater than the similarity between the feature representation vector of the first document and the feature representation vector of the third document.
6. The method of claim 1, wherein a loss function based on triple document similarity is established, which is designed based on consistency of similarity between three documents and similarity between their corresponding hash codes, such that the final document hash code retains the similarity relationship between the original documents.
7. The method as claimed in claim 1, wherein the similarity of the judicial case documents is calculated based on the hash code of the judicial case documents to be matched and the hash code of the known judicial case documents; the method comprises the following steps: and calculating the similarity of the judicial case documents according to the hamming distance between the hash codes of the judicial case documents to be matched and the hash codes of the known judicial case documents.
8. A similar judicial case matching system based on triple deep hash learning, characterized by comprising:
an acquisition module configured to acquire a judicial case document to be matched;
a feature extraction module configured to input the judicial case document to be matched into a pre-trained feature extraction model to obtain a feature representation vector of the judicial case document to be matched;
a hash code extraction module configured to input the feature representation vector of the judicial case document to be matched into a pre-trained triple deep hash learning model to obtain a hash code of the judicial case document to be matched; and
a similarity matching module configured to calculate the similarity of judicial case documents based on the hash code of the judicial case document to be matched and the hash codes of known judicial case documents.
9. An electronic device, comprising: one or more processors, one or more memories, and one or more computer programs; wherein a processor is connected to the memory, the one or more computer programs being stored in the memory, the processor executing the one or more computer programs stored in the memory when the electronic device is running, to cause the electronic device to perform the method of any of the preceding claims 1-7.
10. A computer-readable storage medium storing computer instructions which, when executed by a processor, perform the method of any one of claims 1 to 7.
CN202010354059.6A 2020-04-29 2020-04-29 Similar judicial case matching method and system based on triple deep hash learning Pending CN111581332A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010354059.6A CN111581332A (en) 2020-04-29 2020-04-29 Similar judicial case matching method and system based on triple deep hash learning


Publications (1)

Publication Number Publication Date
CN111581332A true CN111581332A (en) 2020-08-25

Family

ID=72127610

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010354059.6A Pending CN111581332A (en) 2020-04-29 2020-04-29 Similar judicial case matching method and system based on triple deep hash learning

Country Status (1)

Country Link
CN (1) CN111581332A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104657350A (en) * 2015-03-04 2015-05-27 中国科学院自动化研究所 Hash learning method for short text integrated with implicit semantic features
CN108629414A (en) * 2018-05-09 2018-10-09 清华大学 depth hash learning method and device
CN110134761A (en) * 2019-04-16 2019-08-16 深圳壹账通智能科技有限公司 Adjudicate document information retrieval method, device, computer equipment and storage medium
CN110222140A (en) * 2019-04-22 2019-09-10 中国科学院信息工程研究所 A kind of cross-module state search method based on confrontation study and asymmetric Hash

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112308743A (en) * 2020-10-21 2021-02-02 上海交通大学 Trial risk early warning method based on triple similar tasks
CN112308743B (en) * 2020-10-21 2022-11-11 上海交通大学 Trial risk early warning method based on triple similar tasks

Similar Documents

Publication Publication Date Title
CN108959246B (en) Answer selection method and device based on improved attention mechanism and electronic equipment
CN110413785B (en) Text automatic classification method based on BERT and feature fusion
WO2020224219A1 (en) Chinese word segmentation method and apparatus, electronic device and readable storage medium
CN111814466A (en) Information extraction method based on machine reading understanding and related equipment thereof
CN110928997A (en) Intention recognition method and device, electronic equipment and readable storage medium
CN111680494B (en) Similar text generation method and device
WO2021056710A1 (en) Multi-round question-and-answer identification method, device, computer apparatus, and storage medium
CN110866098B (en) Machine reading method and device based on transformer and lstm and readable storage medium
CN112052684A (en) Named entity identification method, device, equipment and storage medium for power metering
CN112380837B (en) Similar sentence matching method, device, equipment and medium based on translation model
CN112084794A (en) Tibetan-Chinese translation method and device
CN111259113A (en) Text matching method and device, computer readable storage medium and computer equipment
CN111291794A (en) Character recognition method, character recognition device, computer equipment and computer-readable storage medium
CN112085091B (en) Short text matching method, device, equipment and storage medium based on artificial intelligence
CN111767714B (en) Text smoothness determination method, device, equipment and medium
CN115062134B (en) Knowledge question-answering model training and knowledge question-answering method, device and computer equipment
CN114064852A (en) Method and device for extracting relation of natural language, electronic equipment and storage medium
CN113722512A (en) Text retrieval method, device and equipment based on language model and storage medium
CN115859302A (en) Source code vulnerability detection method, device, equipment and storage medium
CN113360654B (en) Text classification method, apparatus, electronic device and readable storage medium
CN112232195B (en) Handwritten Chinese character recognition method, device and storage medium
CN114445808A (en) Swin transform-based handwritten character recognition method and system
Touati-Hamad et al. Arabic quran verses authentication using deep learning and word embeddings
CN113887169A (en) Text processing method, electronic device, computer storage medium, and program product
CN113449081A (en) Text feature extraction method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination