CN106815266B - Referee document retrieval method and device - Google Patents

Info

Publication number
CN106815266B
Authority
CN
China
Prior art keywords
referee document
referee
document
preset
complexity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510869926.9A
Other languages
Chinese (zh)
Other versions
CN106815266A (en)
Inventor
李轶
崔维福
杜宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Gridsum Technology Co Ltd
Original Assignee
Beijing Gridsum Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Gridsum Technology Co Ltd
Priority to CN201510869926.9A
Publication of CN106815266A
Application granted
Publication of CN106815266B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/34: Browsing; Visualisation therefor
    • G06F16/31: Indexing; Data structures therefor; Storage structures
    • G06F16/313: Selection or weighting of terms for indexing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a referee document retrieval method and device. The method comprises the following steps: acquiring a referee document set matched with the search terms; calculating the relevance score and the complexity of each referee document in the referee document set, wherein the relevance score expresses the degree to which the referee document matches the search terms, and the complexity expresses how complex the referee document is; calculating a corrected relevance score for each referee document in the referee document set from its relevance score and complexity, to obtain a plurality of corrected relevance scores corresponding respectively to the plurality of referee documents in the referee document set; sorting the referee documents in the referee document set according to the plurality of corrected relevance scores to obtain a sorting result; and displaying the referee documents in the referee document set according to the sorting result. The application thereby addresses the technical problem in the related art of low accuracy in ranking referee document search results.

Description

Referee document retrieval method and device
Technical Field
The application relates to the field of data processing, and in particular to a referee document retrieval method and device.
Background
Since 2014, people's courts at all levels have gradually begun to publish effective referee documents on the internet on a large scale, as required by the Supreme People's Court. This large body of effective referee documents has high reference and research value for judges, lawyers, legal researchers and other legal practitioners. Because referee documents are long text data, information retrieval technology is widely used for them.
However, conventional referee document retrieval stops at the hit, that is, it returns all documents that match the user's search conditions. If many documents meet the conditions, they cannot all be shown on the first few pages of the search results, and current technology can only prioritize their display by relevance. This retrieval mode uses only the relevance measured by keywords as the ranking reference, omits other characteristics of the referee documents, and ignores the real needs of the user. China is not a case-law country, and judges must reason strictly from statutory provisions when deciding cases, so users hope to see cases with thorough reasoning and significant influence.
No effective solution has yet been proposed for the problem of low accuracy in ranking referee document search results in the related art.
Disclosure of Invention
The present application mainly aims to provide a referee document retrieval method and device, so as to solve the problem of low accuracy in ranking the retrieval results of referee documents in the related art.
In order to achieve the above object, according to one aspect of the present application, there is provided a referee document retrieval method. The method comprises the following steps: acquiring a referee document set matched with the search terms; calculating the relevance score and the complexity of each referee document in the referee document set, wherein the relevance score expresses the degree to which the referee document matches the search terms, and the complexity expresses how complex the referee document is; calculating a corrected relevance score for each referee document in the referee document set from its relevance score and complexity, to obtain a plurality of corrected relevance scores corresponding respectively to the plurality of referee documents in the referee document set; sorting the referee documents in the referee document set according to the plurality of corrected relevance scores to obtain a sorting result; and displaying the referee documents in the referee document set according to the sorting result.
Further, the referee document set includes a first referee document, and calculating the complexity of each referee document in the referee document set includes: acquiring index parameters of the first referee document, wherein the index parameters comprise at least one of the following parameters: the length of the first referee document, the number of applicable laws of the first referee document, and the litigation amount of the first referee document; and calculating the complexity of the first referee document according to the index parameters.
Further, calculating the complexity of the first referee document according to the index parameters includes: acquiring the weight of each parameter in the index parameters; and calculating the complexity of the first referee document according to the value of each parameter in the index parameters and the weight of each parameter.
Further, the method further comprises setting the weight according to the following preset rules: calculating the relevance score of each referee document in a plurality of preset referee document sets, wherein the plurality of preset referee document sets are referee document sets respectively matched with a plurality of preset training words; calculating the complexity of each referee document in the plurality of preset referee document sets, wherein the complexity is calculated according to the value of each parameter in the index parameters and the initial weight of each parameter, the same index parameters are used for every referee document participating in the calculation, and the initial weight of a given parameter is the same for every referee document; calculating a corrected relevance score of each referee document in the plurality of preset referee document sets according to its relevance score and corresponding complexity, and determining, in each preset referee document set, a first preset number of referee documents ranked highest by corrected relevance score; correcting the initial weight of each parameter according to a comparison between the ranking, by corrected relevance score, of the first preset number of referee documents corresponding to each preset referee document set and a reference ranking; and taking the corrected initial weight of each parameter as the weight of that parameter, wherein the corrected weight of a given parameter is the same for every referee document.
Further, correcting the initial weight of each parameter according to the ranking, by corrected relevance score, of the first preset number of referee documents corresponding to each preset referee document set and the reference ranking comprises: calculating, for each preset referee document set, the proportion of the first preset number of referee documents whose ranking by corrected relevance score differs from the reference ranking, to obtain a plurality of ratio values; judging whether the ratio values are all smaller than a preset threshold; correcting the initial weight of each parameter when any of the ratio values is larger than the preset threshold; and ending the correction of the initial weight of each parameter when all of the ratio values are smaller than the preset threshold.
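The stopping check described above can be sketched in Python as follows. This is only an illustrative sketch: the function names are hypothetical, and comparing the two rankings position by position is an assumption about how a document "differs from the reference ranking". The text does not say how the weights themselves are adjusted, so that step is left out.

```python
from typing import Dict, List


def mismatch_ratio(ranked: List[str], reference: List[str]) -> float:
    """Share of positions at which the two rankings hold different documents.

    Assumption: a document counts as differing when its position in the
    ranking by corrected relevance score is not its position in the
    reference ranking.
    """
    if len(ranked) != len(reference):
        raise ValueError("rankings must have the same length")
    if not ranked:
        return 0.0
    differing = sum(1 for got, want in zip(ranked, reference) if got != want)
    return differing / len(ranked)


def needs_correction(
    ranked_by_set: Dict[str, List[str]],
    reference_by_set: Dict[str, List[str]],
    threshold: float,
) -> bool:
    """Return True unless every preset set's ratio is below the threshold."""
    ratios = [
        mismatch_ratio(ranked_by_set[name], reference_by_set[name])
        for name in ranked_by_set
    ]
    return not all(ratio < threshold for ratio in ratios)


# Two of the four positions differ, so the ratio is 0.5.
ranked = {"a": ["d1", "d2", "d3", "d4"]}
reference = {"a": ["d1", "d3", "d2", "d4"]}
print(mismatch_ratio(ranked["a"], reference["a"]))
print(needs_correction(ranked, reference, threshold=0.4))
```

A caller would repeat "adjust weights, re-rank, check" until `needs_correction` returns `False`.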
In order to achieve the above object, according to another aspect of the present application, there is provided a referee document retrieval device. The device includes: an acquisition unit for acquiring a referee document set matched with the search terms; a first calculation unit for calculating the relevance score and the complexity of each referee document in the referee document set, wherein the relevance score expresses the degree to which the referee document matches the search terms, and the complexity expresses how complex the referee document is; a second calculation unit for calculating a corrected relevance score for each referee document in the referee document set from its relevance score and complexity, to obtain a plurality of corrected relevance scores corresponding respectively to the plurality of referee documents in the referee document set; a sorting unit for sorting the referee documents in the referee document set according to the plurality of corrected relevance scores to obtain a sorting result; and a display unit for displaying the referee documents in the referee document set according to the sorting result.
Further, the referee document set includes a first referee document, and the first calculation unit includes: an obtaining module for obtaining index parameters of the first referee document, wherein the index parameters include at least one of the following parameters: the length of the first referee document, the number of applicable laws of the first referee document, and the litigation amount of the first referee document; and a first calculation module for calculating the complexity of the first referee document according to the index parameters.
Further, the first calculation module includes: the obtaining submodule is used for obtaining the weight of each parameter in the index parameters; and the calculation submodule is used for calculating the complexity of the first referee document according to the values of all the parameters in the index parameters and the weights of all the parameters.
Further, the device further comprises: a third calculation unit for calculating the relevance score of each referee document in a plurality of preset referee document sets, wherein the plurality of preset referee document sets are referee document sets respectively matched with a plurality of preset training words; a fourth calculation unit for calculating the complexity of each referee document in the plurality of preset referee document sets, wherein the complexity is calculated according to the value of each parameter in the index parameters and the initial weight of each parameter, the same index parameters are used for every referee document participating in the calculation, and the initial weight of a given parameter is the same for every referee document; a fifth calculation unit for calculating a corrected relevance score of each referee document in the plurality of preset referee document sets according to its relevance score and corresponding complexity, and determining, in each preset referee document set, a first preset number of referee documents ranked highest by corrected relevance score; a correction unit for correcting the initial weight of each parameter according to a comparison between the ranking, by corrected relevance score, of the first preset number of referee documents corresponding to each preset referee document set and a reference ranking; and a determining unit for taking the corrected initial weight of each parameter as the weight of that parameter, wherein the corrected weight of a given parameter is the same for every referee document.
Further, the correction unit includes: a second calculation module for calculating, for each preset referee document set, the proportion of the first preset number of referee documents whose ranking by corrected relevance score differs from the reference ranking, to obtain a plurality of ratio values; a judging module for judging whether the ratio values are all smaller than a preset threshold; and a correction module for correcting the initial weight of each parameter when any of the ratio values is larger than the preset threshold, and ending the correction of the initial weight of each parameter when all of the ratio values are smaller than the preset threshold.
The present application acquires a referee document set matched with the search terms; calculates the relevance score and the complexity of each referee document in the referee document set, wherein the relevance score expresses the degree to which the referee document matches the search terms, and the complexity expresses how complex the referee document is; calculates a corrected relevance score for each referee document in the referee document set from its relevance score and complexity, to obtain a plurality of corrected relevance scores corresponding respectively to the plurality of referee documents in the referee document set; sorts the referee documents in the referee document set according to the plurality of corrected relevance scores to obtain a sorting result; and displays the referee documents in the referee document set according to the sorting result. This solves the problem in the related art of low accuracy in ranking referee document search results and improves the accuracy of that ranking.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the application and, together with the description, serve to explain the application and are not intended to limit the application. In the drawings:
FIG. 1 is a flow chart of a method of referee document retrieval according to an embodiment of the application;
FIG. 2 is a flow chart of setting weights according to a preset rule according to an embodiment of the present application; and
FIG. 3 is a schematic diagram of a referee document retrieval device according to an embodiment of the present application.
Detailed Description
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only partial embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It should be understood that the data so used may be interchanged under appropriate circumstances such that embodiments of the application described herein may be used. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
According to an embodiment of the present application, a referee document retrieval method is provided. FIG. 1 is a flowchart of the referee document retrieval method according to the embodiment of the present application. As shown in FIG. 1, the method includes steps S102 to S110 as follows:
Step S102: acquire a referee document set matched with the search terms.
The search term in this embodiment is a keyword used to search for referee documents, and there may be one or more search terms. When documents are searched, a search term entered by the user is generally received, and the referee documents matching it are retrieved from a referee document database. Optionally, either the referee documents containing the whole search term or the referee documents containing part of its content may be acquired. For example, if the search term is "contract invalid", the referee documents containing "contract invalid" may be acquired to form the referee document set; alternatively, the referee documents containing "contract invalid" together with the referee documents containing both "contract" and "invalid" may be acquired to form the referee document set.
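The two matching options in the "contract invalid" example can be sketched in Python as below. This is a minimal illustration under stated assumptions: `matches` and `retrieve` are hypothetical names, the database is a plain list of strings, and the search term is split into parts on whitespace, which the text does not specify.

```python
from typing import List


def matches(text: str, search_term: str, allow_partial: bool = False) -> bool:
    """Return True if the document text matches the search term.

    With allow_partial=False only the whole term counts. With
    allow_partial=True a document that contains every part of the term
    separately also counts (assumption: parts are whitespace-separated).
    """
    if search_term in text:
        return True
    if allow_partial:
        parts = search_term.split()
        return len(parts) > 1 and all(part in text for part in parts)
    return False


def retrieve(database: List[str], search_term: str, allow_partial: bool = False) -> List[str]:
    """Collect the referee documents that match the search term."""
    return [doc for doc in database if matches(doc, search_term, allow_partial)]


documents = [
    "the contract invalid claim is upheld",
    "the contract was found invalid",
    "unrelated lease dispute",
]
print(retrieve(documents, "contract invalid"))                      # whole term only
print(retrieve(documents, "contract invalid", allow_partial=True))  # whole term or all parts
```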
Step S104: calculate the relevance score and the complexity of each referee document in the referee document set, wherein the relevance score expresses the degree to which the referee document matches the search terms, and the complexity expresses how complex the referee document is.
The relevance score in this embodiment measures the degree to which a referee document matches the search term: the larger the relevance score, the better the match, and the smaller the relevance score, the poorer the match. Optionally, the relevance score may be calculated by counting how many times, and how completely, the search term appears in the referee document: the more often and the more completely the search term appears, the higher the relevance score of that referee document.
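One way to combine occurrence count and completeness is sketched below. The text gives no formula, so everything here is an assumption made for illustration: the name `relevance_score`, the whitespace split of the term into parts, and the `partial_weight` of 0.5 applied to parts that appear outside a complete occurrence.

```python
def relevance_score(text: str, search_term: str, partial_weight: float = 0.5) -> float:
    """Score a document by how often and how completely the term appears.

    Assumed scheme: each complete occurrence of the term counts 1.0, and
    each occurrence of a single part of the term outside a complete
    occurrence counts partial_weight.
    """
    full_hits = text.count(search_term)
    parts = search_term.split()
    partial_hits = 0
    if len(parts) > 1:
        # Blank out complete occurrences so their parts are not counted twice.
        remainder = text.replace(search_term, " ")
        partial_hits = sum(remainder.count(part) for part in parts)
    return full_hits + partial_weight * partial_hits


complete = relevance_score("contract invalid; the contract is invalid", "contract invalid")
partial = relevance_score("the contract stands", "contract invalid")
print(complete > partial)  # the document with a complete occurrence scores higher
```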
Generally, cases with thorough reasoning and significant influence have higher reference and research value for users. Therefore, when referee documents are retrieved, the relevance score and the complexity of each referee document are considered together in returning the search results, so that the returned referee documents are better targeted. The complexity in this embodiment measures how complex a referee document is, for example how complicated and how important the underlying case is. Specifically, the complexity of a referee document can be measured by index parameters such as the litigation amount, the number of applicable laws, and the length of the referee document. In practice, the applicable index parameters may be selected as required: a single index parameter may be used to calculate the complexity of the referee document, or several index parameters may be used together.
Step S106: calculate a corrected relevance score for each referee document in the referee document set from its relevance score and complexity, to obtain a plurality of corrected relevance scores corresponding respectively to the plurality of referee documents in the referee document set.
For example, the relevance score of a referee document can be multiplied by its complexity to obtain the corrected relevance score of that referee document. In this embodiment, each referee document in the referee document set has one corrected relevance score, calculated from the relevance score and the complexity of that referee document.
Step S108: sort the referee documents in the referee document set according to the plurality of corrected relevance scores to obtain a sorting result.
Step S110: display the referee documents in the referee document set according to the sorting result.
Specifically, referee documents with higher corrected relevance scores can be displayed nearer the top of the search results, so that users can readily see the referee documents that involve complex cases and have high reference value.
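Steps S106 to S110 can be sketched as follows, using the multiplication given as an example in the text. The class and function names are hypothetical, and the relevance and complexity values are made up purely for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScoredDocument:
    doc_id: str
    relevance: float   # output of step S104
    complexity: float  # output of step S104

    @property
    def corrected_relevance(self) -> float:
        # Step S106: the example in the text multiplies the two values.
        return self.relevance * self.complexity


def rank(documents: List[ScoredDocument]) -> List[ScoredDocument]:
    """Step S108: sort by corrected relevance score, highest first."""
    return sorted(documents, key=lambda d: d.corrected_relevance, reverse=True)


# Illustrative values only.
results = rank([
    ScoredDocument("A", relevance=0.9, complexity=1.0),
    ScoredDocument("B", relevance=0.6, complexity=2.0),
    ScoredDocument("C", relevance=0.8, complexity=1.2),
])
# Step S110: display in sorted order.
for doc in results:
    print(doc.doc_id)
```

In this made-up data, document A has the highest raw relevance but ranks last once complexity is taken into account, which is the behaviour the method is aiming for.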
In this embodiment, a referee document set matched with the search terms is acquired; the relevance score and the complexity of each referee document in the referee document set are calculated, wherein the relevance score expresses the degree to which the referee document matches the search terms, and the complexity expresses how complex the referee document is; a corrected relevance score is calculated for each referee document in the referee document set from its relevance score and complexity, to obtain a plurality of corrected relevance scores corresponding respectively to the plurality of referee documents in the referee document set; the referee documents in the referee document set are sorted according to the plurality of corrected relevance scores to obtain a sorting result; and the referee documents in the referee document set are displayed according to the sorting result. According to the embodiment of the application, the relevance scores of the referee documents are corrected according to the complexity of the referee documents, and the retrieval results are sorted and displayed according to the corrected relevance scores.
Preferably, the referee document set includes a first referee document, and calculating the complexity of each referee document in the referee document set includes: acquiring index parameters of the first referee document, wherein the index parameters comprise at least one of the following parameters: the length of the first referee document, the number of applicable laws of the first referee document, and the litigation amount of the first referee document; and calculating the complexity of the first referee document according to the index parameters.
The first referee document may be any referee document in the referee document set. The length of the first referee document may be its full-text length, or the length of its legal reasoning part; specifically, the number of words may be counted as the length. The number of applicable laws of the first referee document can be counted by extracting the article numbers of the laws cited in the referee document according to a preset information extraction rule; optionally, the same article number under different laws and different article numbers under different laws may each be counted. The litigation amount of the first referee document is a charge related to that document, for example a case acceptance fee, compensation, or a fine. Optionally, if the litigation cost has been halved, the extracted amount should be restored to its value before halving. For example, a referee document has the following contents:
"This court holds that the plaintiff and the defendant have been registered as married for seven years and established a marital relationship after marriage. Although the plaintiff alleges a departure from the family home, the plaintiff has offered no proof, and the allegation is not accepted. The plaintiff's request for divorce does not meet the statutory conditions and is not supported. In accordance with Article 32 of the Marriage Law of the People's Republic of China and Article 144 of the Litigation Law of the People's Republic of China, the judgment is as follows:
The divorce between the plaintiff and the defendant is not granted.
The case acceptance fee of 300 yuan shall be borne by the plaintiff.
A party that does not accept this judgment may, within fifteen days from the date of service of the judgment, submit an appeal petition to this court, with copies according to the number of opposing parties, and appeal to the Intermediate People's Court of Weifang City, Shandong Province."
For this referee document, the length is 216, the number of applicable laws is 2 (namely Article 32 of the Marriage Law of the People's Republic of China and Article 144 of the Litigation Law of the People's Republic of China), and the litigation amount is 300.
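The three index parameters of this worked example can be held in a small Python structure, as sketched below. The names are hypothetical. Counting distinct (law, article) pairs and doubling a halved fee are assumptions drawn from the optional rules described above, not a confirmed extraction procedure; the citations are entered by hand rather than extracted from text.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class IndexParameters:
    length: int
    applicable_laws: int
    litigation_amount: float


def count_applicable_laws(citations: Iterable[Tuple[str, int]]) -> int:
    """Count distinct (law, article number) pairs.

    Assumption: the same article number under two different laws counts
    twice, while a repeated citation of the same article counts once.
    """
    return len(set(citations))


def restore_litigation_amount(charged: float, halved: bool = False) -> float:
    """Undo a halved litigation cost so that amounts are comparable."""
    return charged * 2 if halved else charged


# Values taken from the worked example in the text.
example = IndexParameters(
    length=216,
    applicable_laws=count_applicable_laws(
        [("Marriage Law", 32), ("Litigation Law", 144)]
    ),
    litigation_amount=restore_litigation_amount(300),
)
print(example)
```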
Preferably, calculating the complexity of the first referee document according to the index parameters includes: acquiring the weight of each parameter in the index parameters; and calculating the complexity of the first referee document according to the value of each parameter in the index parameters and the weight of each parameter.
Specifically, the index parameters in the embodiments of the present application may be any two different index parameters, for example the number of applicable laws and the litigation amount, or more than two different index parameters, for example the number of applicable laws, the litigation amount, and the document length. The value of each parameter is its value for a specific referee document; for example, for referee document 1 the number of applicable laws is A1 and the litigation amount is B1, and for referee document 2 the number of applicable laws is A2 and the litigation amount is B2. The weight of each parameter may be a value preset from experience, or a value obtained by training according to a preset rule.
For example, the complexity of the first referee document can be calculated using the following formula:
where C denotes the complexity of the first referee document; L, M, and N denote, respectively, the length of the first referee document, its number of applicable laws, and its litigation amount; and pL, pM, and pN denote the weights of the length, the number of applicable laws, and the litigation amount, respectively.
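The formula itself is not reproduced in this text. Purely as an illustration of how the symbols defined above could be combined, the sketch below assumes a weighted sum, C = pL·L + pM·M + pN·N. That form is an assumption and may differ from the formula in the original; the weights used in the usage line are likewise made up.

```python
def complexity(
    length: float,
    applicable_laws: float,
    litigation_amount: float,
    p_l: float,
    p_m: float,
    p_n: float,
) -> float:
    """Assumed weighted-sum form: C = pL*L + pM*M + pN*N.

    The weighted sum is a hypothetical stand-in for the formula, which is
    not reproduced in the text.
    """
    return p_l * length + p_m * applicable_laws + p_n * litigation_amount


# Index parameters from the worked example (L=216, M=2, N=300);
# the weights are illustrative values in the interval (0, 1).
print(complexity(216, 2, 300, p_l=0.5, p_m=0.25, p_n=0.125))
```

Because the three parameters have very different scales, a practical implementation would probably normalize them before weighting; the text does not say whether it does so.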
Preferably, in order to further improve the accuracy of ranking the search results, before the search terms are received and the referee document search is performed, the weights may further be set according to the following preset rules: calculating the relevance score of each referee document in a plurality of preset referee document sets, wherein the plurality of preset referee document sets are referee document sets respectively matched with a plurality of preset training words; calculating the complexity of each referee document in the plurality of preset referee document sets, wherein the complexity is calculated according to the value of each parameter in the index parameters and the initial weight of each parameter, the same index parameters are used for every referee document participating in the calculation, and the initial weight of a given parameter is the same for every referee document; calculating a corrected relevance score of each referee document in the plurality of preset referee document sets according to its relevance score and corresponding complexity, and determining, in each preset referee document set, a first preset number of referee documents ranked highest by corrected relevance score; correcting the initial weight of each parameter according to a comparison between the ranking, by corrected relevance score, of the first preset number of referee documents corresponding to each preset referee document set and a reference ranking; and taking the corrected initial weight of each parameter as the weight of that parameter, wherein the corrected weight of a given parameter is the same for every referee document.
The preset training words are keywords used for the weight training, and in the embodiment of the application, a group of keywords used for the weight training is preset, and preset referee document sets corresponding to the preset training words are respectively obtained, wherein each referee document in the preset referee document sets is a referee document matched with the corresponding preset training word, for example, a group of preset training words comprises a preset training word a and a preset training word b, the referee documents matched with the preset training word a are obtained to form a preset referee document set a, and the referee documents matched with the preset training word b are obtained to form a preset referee document set b.
After obtaining the plurality of preset referee document sets, calculating the relevance score and the complexity of each referee document in the plurality of preset referee document sets. Specifically, the calculation method of the relevance score and the complexity of the referee document is the same as above, and is not repeated herein, and it is to be noted that, for different referee documents, the selected index parameters are the same, and the initial weights of the same index parameters are also the same, for example, for the referee document 1, the selected index parameters are the number of applicable legal articles and the litigation amount, wherein the initial weight of the number of applicable legal articles is x, and the initial weight of the litigation amount is y, for the referee document 2, the selected index parameters are the number of applicable legal articles and the litigation amount, and the initial weight of the number of applicable legal articles is x, and the initial weight of the litigation amount is y, and the initial weights of the parameters may be any one value in the interval (0, 1).
After obtaining the relevance score and the complexity of each referee document in each preset referee document set, calculating the corrected relevance score of each referee document, for example, multiplying the relevance score and the complexity of the referee document to obtain the corrected relevance score. After the corrected relevance score of each referee document in a plurality of preset referee document sets is obtained, a first preset number of referee documents with corrected relevance scores in each preset referee document set in a front sequence are determined.
For example, there are 100 referee documents in the preset referee document set a and 130 referee documents in the preset referee document set b. The 100 referee documents in set a and the 130 referee documents in set b are each sorted by corrected relevance score, and the 10 highest-ranked referee documents in set a and in set b are determined respectively.
The reference ranking of the embodiment of the application is a ranking preset by a user and is used to measure the accuracy of the ranking obtained from the corrected relevance scores. Specifically, this accuracy can be determined by counting the proportion of pairwise orderings that are the same, or the proportion that differ, between the ranking by corrected relevance score and the reference ranking.
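The selection of the highest-ranked documents in one preset set can be sketched as follows. This is a minimal illustration, assuming the corrected relevance score is the product of relevance score and complexity as in the example given earlier; the function name `top_corrected`, the document ids and all score values are hypothetical.

```python
from typing import Dict, List, Tuple


def top_corrected(relevance: Dict[str, float],
                  complexity: Dict[str, float],
                  k: int = 10) -> List[Tuple[str, float]]:
    """Return the k documents with the highest corrected relevance score."""
    # Corrected relevance score = relevance score * complexity.
    corrected = {doc: relevance[doc] * complexity[doc] for doc in relevance}
    # Sort by score, highest first; break ties by document id for determinism.
    ranked = sorted(corrected.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


# Hypothetical scores for a tiny preset referee document set.
relevance = {"d1": 2.0, "d2": 3.0, "d3": 1.0}
complexity = {"d1": 0.9, "d2": 0.5, "d3": 0.8}
top2 = top_corrected(relevance, complexity, k=2)
```

Here `d2` has the highest raw relevance score, but `d1` ranks first after correction because its complexity is higher.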
For example, the top 10 referee documents by corrected relevance score returned for the preset training word "contract invalid" are A1 to A10. Ranked by corrected relevance score they are A1 > A2 > A3 > A4 > A5 > A6 > A7 > A8 > A9 > A10, and the reference ranking of A1 to A10 is A2 > A3 > A1 > A5 > A6 > A4 > A7 > A8 > A10 > A9. By the principle of combinations, the 10 referee documents A1 to A10 form 45 pairwise orderings, for example A2 > A3 and A8 > A10. Of these, 40 are the same in the ranking by corrected relevance score and in the reference ranking, and 5 differ, namely the orderings of A2 and A1, A3 and A1, A5 and A4, A6 and A4, and A10 and A9. The same ratio and the different ratio between the ranking by corrected relevance score and the reference ranking are therefore about 89% and 11%, respectively.
Preferably, correcting the initial weight of each parameter according to the ranking by corrected relevance score of the first preset number of referee documents corresponding to each preset referee document set and the reference ranking comprises: calculating, for the first preset number of referee documents corresponding to each preset referee document set, the ratio value by which the ranking by corrected relevance score differs from the reference ranking, so as to obtain a plurality of ratio values; judging whether the plurality of ratio values are all smaller than a preset threshold; correcting the initial weight of each parameter when any of the plurality of ratio values is larger than the preset threshold; and ending the correction of the initial weight of each parameter when the plurality of ratio values are all smaller than the preset threshold.
The ratio value by which the ranking by corrected relevance score differs from the reference ranking is counted for the first preset number of referee documents corresponding to each preset referee document set, and each ratio value is compared with the preset threshold. The preset threshold can be set according to actual conditions, for example 5% or 8%.
For example, suppose the preset threshold is 5%. For the top 10 referee documents A1 to A10 of the preset referee document set a, ranked by corrected relevance score, 12% of the pairwise orderings differ from the reference ranking, that is, ratio value a is 12%. For the top 10 referee documents B1 to B10 of the preset referee document set b, ranked by corrected relevance score, 20% of the pairwise orderings differ from the reference ranking, that is, ratio value b is 20%. Both ratio value a and ratio value b are therefore greater than the preset threshold, and the initial weights of the parameters need to be corrected, for example the three initial weights pL, pM, and pN in the formula.
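The ratio of differing pairwise orderings and its comparison with the threshold can be sketched as follows. This is only an illustration: the function name `differing_ratio`, the four hypothetical documents D1 to D4 and the 5% threshold are chosen for the example, and both rankings are assumed to contain the same documents.

```python
from itertools import combinations
from typing import Sequence


def differing_ratio(ranking: Sequence[str], reference: Sequence[str]) -> float:
    """Fraction of document pairs that the two rankings order differently."""
    ref_pos = {doc: i for i, doc in enumerate(reference)}
    # Every pair (a, b) with a placed before b in `ranking`.
    pairs = list(combinations(ranking, 2))
    # A pair differs when the reference ranking places b before a.
    differing = sum(1 for a, b in pairs if ref_pos[a] > ref_pos[b])
    return differing / len(pairs)


# Hypothetical four-document example: only the pair (D1, D2) is swapped.
by_score = ["D1", "D2", "D3", "D4"]
reference = ["D2", "D1", "D3", "D4"]
ratio = differing_ratio(by_score, reference)  # 1 of the 6 pairs differs
needs_correction = ratio > 0.05               # compare with a 5% threshold
```

Four documents form six pairs, so a single swapped pair already exceeds a 5% threshold and would trigger a correction of the weights.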
Specifically, in the embodiment of the present application, after the initial weights of the parameters are corrected, the complexity of each referee document in a plurality of preset referee document sets is recalculated according to the corrected initial weights of the parameters, and the above steps are repeatedly executed until the ratio values of the first preset number of referee documents corresponding to each preset referee document set, which are different from the reference sequence according to the sequence of the corrected relevance scores, are all smaller than the preset threshold, and at this time, the finally corrected initial weights of the parameters are used as the weights of the parameters.
The initial weight of each parameter may be modified by adjusting the value of the initial weight of each parameter according to a preset step length, or by manually adjusting the value of the initial weight of each parameter.
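One way to adjust the weights by a preset step can be sketched as follows. This is a sketch under assumptions rather than the procedure of the application: each adjustment moves a single weight up or down by one step, keeps it inside the interval (0, 1), and accepts the move that lowers a supplied error function the most. The names `neighbours`, `adjust_once` and `toy_error` are hypothetical, and the toy error function merely stands in for the ratio of differing orderings.

```python
from typing import Callable, Dict, Iterator

Weights = Dict[str, float]


def neighbours(weights: Weights, step: float) -> Iterator[Weights]:
    """Yield weight sets that differ from `weights` in one parameter by +/- step."""
    for name in weights:
        for delta in (step, -step):
            value = round(weights[name] + delta, 10)
            if 0.0 < value < 1.0:  # keep every weight inside the interval (0, 1)
                yield {**weights, name: value}


def adjust_once(weights: Weights, step: float,
                error: Callable[[Weights], float]) -> Weights:
    """Return the neighbouring weight set with the lowest error, if it improves."""
    best = min(neighbours(weights, step), key=error, default=weights)
    return best if error(best) < error(weights) else weights


# Toy error standing in for the differing-order ratio (hypothetical target).
target = {"pL": 0.2, "pN": 0.5, "pM": 0.5}


def toy_error(w: Weights) -> float:
    return sum(abs(w[name] - target[name]) for name in target)


adjusted = adjust_once({"pL": 0.5, "pN": 0.5, "pM": 0.5}, 0.1, toy_error)
```

Calling `adjust_once` repeatedly until it returns its input unchanged gives a simple step-wise search; manual adjustment would replace the automatic choice of neighbour.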
Fig. 2 is a flowchart of setting weights according to a preset rule according to an embodiment of the present application. As shown in fig. 2, the setting of the weight according to the preset rule includes the following steps:
Step S202: Obtain a set of referee documents.
Optionally, the referee documents may be stored, keyed by their unique identifiers, in memory, in a database on a hard disk, or in another data structure.
Step S204: Traverse the set of referee documents and extract the full text length L, the number N of applicable legal articles, and the litigation amount M of each referee document.
Specifically, the set of referee documents is traversed and the following operations are performed for each referee document: first, the full text length of the document is extracted and recorded as L; second, the legal articles applied in the judgment are extracted and their total number is recorded as N, wherein several articles of the same law are counted as several articles; third, the litigation amount M of the document is extracted, wherein, when the litigation cost has been halved, the extracted amount is restored to its level before halving. It should be noted that the three extracted index parameters can all be stored in fields of the corresponding referee document record.
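The three extraction operations can be sketched as follows. The sketch assumes that the cited articles, the amount and the halving flag have already been parsed out of the document text by some upstream step that is not shown; the names `IndexParams` and `extract_params`, the law names and all values are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class IndexParams:
    length: int      # L: full text length in characters
    articles: int    # N: total number of applicable legal articles
    amount: float    # M: litigation amount, restored if it was halved


def extract_params(text: str,
                   cited: Dict[str, List[str]],
                   amount: float,
                   halved: bool) -> IndexParams:
    """Build the three index parameters from hypothetical pre-parsed fields."""
    # Every cited article counts, even when several belong to the same law.
    n = sum(len(articles) for articles in cited.values())
    # Undo the halving so that amounts are comparable across documents.
    m = amount * 2 if halved else amount
    return IndexParams(length=len(text), articles=n, amount=m)


params = extract_params(
    text="x" * 1200,
    cited={"Contract Law": ["Art. 52", "Art. 58"],
           "Civil Procedure Law": ["Art. 64"]},
    amount=2500.0,
    halved=True,
)
```

The resulting record can be stored alongside the document so that the complexity can later be recomputed without parsing the text again.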
Step S206: Acquire three preset initial parameters pL, pN and pM, and calculate the complexity index C of the current referee document.
Specifically, the complexity index C, i.e. the complexity, is calculated during the traversal of step S204 according to the three initial parameters (i.e. initial weights) pL, pN, and pM given in advance, and the calculation formula is as follows:
where L, N, and M denote the length of the current referee document, the number of applicable legal articles of the current referee document, and the litigation amount of the current referee document, respectively, and pL, pN, and pM denote the initial weights of the length, the number of applicable legal articles, and the litigation amount of the current referee document, respectively. After the complexity index C of the current referee document is obtained, it is stored in a field of the current referee document record, wherein the current referee document is the referee document currently being processed.
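The formula itself is not reproduced in this text, so the following sketch does not restate it. It assumes, purely for illustration, that the complexity index is a plain weighted sum of the three index parameters; the function name `complexity_index` and all values are hypothetical, and the actual formula may combine or scale the parameters differently.

```python
def complexity_index(L: float, N: float, M: float,
                     pL: float, pN: float, pM: float) -> float:
    """ASSUMED form: a plain weighted sum of the three index parameters."""
    return pL * L + pN * N + pM * M


# Hypothetical values. In practice the parameters would probably need to be
# scaled to comparable ranges so that no single one dominates the sum.
c = complexity_index(L=1200, N=3, M=5000.0, pL=0.25, pN=0.5, pM=0.125)
```

Whatever the exact form, the index depends only on the stored values L, N, M and the current weights, so it can be recomputed for every document each time the weights are corrected.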
Step S208: Given a group of preset training words, obtain the relevance score of each matching document for each preset training word, multiply that relevance score by the complexity index C of the document to obtain a corrected relevance score, and, after the corrected relevance scores of all matching documents are obtained, push the top 10 matching documents in descending order of corrected relevance score.
Specifically, taking one preset training word as an example, for the preset training word "contract invalid", a full-text search may be performed on the stored referee documents to obtain the relevance score of each matching document (i.e., each referee document matching the preset training word), where the more often the preset training word appears in the matching document and the more completely it is matched, the higher the relevance score. The relevance score of each matching document corresponding to the preset training word is multiplied by the complexity index C of that matching document to obtain the corrected relevance score described above.
After all referee documents matched with the preset training word have been processed in this way and the corrected relevance scores of all the matching documents are obtained, the matching documents are ranked from high to low by corrected relevance score, and the top 10 matching documents are pushed. This processing is executed for each preset training word in the group, so that the top 10 matching documents corresponding to each preset training word are obtained.
Step S210: Manually rank the top 10 matching documents corresponding to each preset training word, and calculate the proportion accu% of pairwise orderings that differ between the manual ranking and the ranking by corrected relevance score.
Specifically, the top 10 matching documents corresponding to each preset training word in step S208 are ranked manually (this manual ranking is the reference ranking). For example, the top 10 referee documents by corrected relevance score returned for the preset training word "contract invalid" are A1 to A10; ranked by corrected relevance score they are A1 > A2 > A3 > A4 > A5 > A6 > A7 > A8 > A9 > A10, and ranked manually they are A2 > A3 > A1 > A5 > A6 > A4 > A7 > A8 > A10 > A9.
Specifically, for the top 10 matching documents corresponding to each of the preset training words, the proportion accu% of pairwise orderings that differ between the manual ranking and the ranking by corrected relevance score is calculated.
Step S212: Taking the minimization of accu% as the objective function, adjust the initial values of the three parameters pL, pN and pM in a gradient manner to obtain the optimal solutions pL', pN' and pM' of the three parameters.
The optimal solutions pL', pN' and pM' of the three parameters are the weights corresponding to the index parameters.
Specifically, the accu% corresponding to each preset training word may be compared with a preset threshold. If the accu% corresponding to every preset training word is less than the preset threshold, pL, pN, and pM do not need to be corrected, and the current pL, pN, and pM are the optimal solutions. If the accu% corresponding to any preset training word is greater than the preset threshold, the three parameters pL, pN, and pM are corrected and the complexity of each matching document is recalculated with the corrected pL, pN, and pM, that is, steps S206 to S212 are repeated until the accu% corresponding to every preset training word is less than the preset threshold, at which point the corresponding pL, pN, and pM are the optimal solutions.
Step S214: Traverse all the referee documents again and calculate their complexity indexes according to pL', pN' and pM'.
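The repetition of steps S206 to S212 until the stopping condition holds can be sketched as follows. Several simplifications are assumptions of this sketch and not part of the described procedure: complexity is taken to be a weighted sum of already normalised index parameters, the manual reference ranking is assumed to cover every document of a set, and the correction is a step-wise sweep over a grid of candidate weights. The names `differing_ratio`, `rank` and `train` and the toy data are hypothetical.

```python
from itertools import combinations, product


def differing_ratio(ranking, reference):
    """Share of document pairs ordered differently from the reference (accu%)."""
    pos = {doc: i for i, doc in enumerate(reference)}
    pairs = list(combinations(ranking, 2))
    if not pairs:
        return 0.0
    return sum(pos[a] > pos[b] for a, b in pairs) / len(pairs)


def rank(docs, weights, k=10):
    """Rank document ids by relevance * complexity and keep the top k."""
    def score(doc_id):
        relevance, params = docs[doc_id]
        # ASSUMPTION: complexity is a weighted sum of the index parameters.
        complexity = sum(weights[name] * params[name] for name in weights)
        return relevance * complexity
    return sorted(docs, key=lambda d: (-score(d), d))[:k]


def train(sets, references, weights, step=0.1, threshold=0.05):
    """Return the first weights whose accu% is below threshold for every word."""
    def acceptable(w):
        return all(
            differing_ratio(rank(sets[word], w), references[word]) < threshold
            for word in sets
        )
    if acceptable(weights):
        return weights  # the initial weights are already good enough
    grid = [round(step * i, 10) for i in range(1, round(1 / step))]
    for values in product(grid, repeat=len(weights)):
        candidate = dict(zip(weights, values))
        if acceptable(candidate):
            return candidate
    return None  # no candidate on the grid met the threshold


# Toy data: one training word, three documents with normalised L, N, M values.
sets = {"contract invalid": {
    "a": (1.0, {"L": 0.9, "N": 0.1, "M": 0.1}),
    "b": (1.0, {"L": 0.1, "N": 0.9, "M": 0.2}),
    "c": (1.0, {"L": 0.1, "N": 0.1, "M": 0.1}),
}}
references = {"contract invalid": ["b", "a", "c"]}
best = train(sets, references, {"L": 0.9, "N": 0.1, "M": 0.1})
```

With the initial weights, document `a` outranks `b` because length dominates; the sweep stops at the first candidate for which the ranking agrees with the manual one.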
It should be noted that, in the user retrieval process, the relevance score is multiplied by the new complexity index to obtain a new corrected relevance score, and the matching results are sorted and displayed according to the score.
The embodiment of the application is a sorting mode fusing the importance (complexity) of the referee documents, and the mode adds indexes (such as length, involved amount, number of applicable laws and the like) for measuring the importance of the referee documents on the basis of the traditional full-text retrieval relevance sorting, and adjusts the values of relevant parameters by combining manual judgment and a mode of solving the optimal solution of an objective function to obtain the optimal effect, so that the sorting result of the referee documents is more in line with the retrieval requirements of users.
It should be noted that the steps illustrated in the flowcharts of the figures may be performed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is illustrated in the flowcharts, in some cases the steps illustrated or described may be performed in an order different from that presented herein.
According to another aspect of the embodiments of the present application, there is provided a referee document retrieval apparatus which can be used to execute the referee document retrieval method according to the embodiments of the present application, and the referee document retrieval method according to the embodiments of the present application can also be executed by the referee document retrieval apparatus according to the embodiments of the present application.
Fig. 3 is a schematic diagram of a referee document retrieval apparatus according to an embodiment of the present application. As shown in fig. 3, the apparatus includes: an acquisition unit 10, a first calculation unit 20, a second calculation unit 30, a sorting unit 40 and a display unit 50.
The acquisition unit 10 is configured to acquire a referee document set matched with the search term.
The search term of the embodiment of the application is a keyword for searching referee documents, and there can be one or more search terms. When documents are searched, a search term input by a user is generally received, and the referee documents matched with the search term are obtained from a referee document database. Optionally, the referee documents containing the whole search term may be acquired, or the referee documents containing parts of the search term may also be acquired. For example, if the search term is "contract invalid", the referee documents containing "contract invalid" may be acquired to form the referee document set, or the referee documents containing "contract invalid" together with those containing both "contract" and "invalid" may be acquired to form the referee document set.
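The two matching options can be sketched as follows. This is an illustration under assumptions: the search term is split into parts on whitespace (Chinese text would need word segmentation instead), matching is plain substring search, and the function name `match_documents` and the sample texts are hypothetical.

```python
from typing import Dict, List


def match_documents(docs: Dict[str, str], search_term: str,
                    allow_partial: bool = False) -> List[str]:
    """Return the ids of the documents matching the search term."""
    parts = search_term.split()
    matched = []
    for doc_id, text in docs.items():
        if search_term in text:
            matched.append(doc_id)  # contains the whole search term
        elif allow_partial and all(part in text for part in parts):
            matched.append(doc_id)  # contains every part separately
    return matched


docs = {
    "d1": "the court holds the contract invalid",
    "d2": "the contract is declared invalid",
    "d3": "the contract remains in force",
}
strict = match_documents(docs, "contract invalid")
loose = match_documents(docs, "contract invalid", allow_partial=True)
```

The strict option returns only `d1`; the looser option also returns `d2`, which contains both "contract" and "invalid" but not the whole phrase.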
The first calculating unit 20 is configured to calculate a relevance score and a complexity of each referee document in the referee document set, where the relevance score is used to indicate a matching degree of the referee document and the search term, and the complexity is used to indicate a complexity of the referee document.
The relevance score of the embodiment of the application is used to measure the matching degree between the referee document and the search term: the larger the relevance score, the higher the matching degree, and the smaller the relevance score, the lower the matching degree. The complexity of the embodiment of the application is used to measure how complex the referee document is, for example how complicated and how important the case of the referee document is. Specifically, the complexity of the referee document can be measured by index parameters such as the amount involved in the case, the number of applicable legal articles, and the length of the referee document.
The second calculating unit 30 is configured to calculate a corrected relevance score of each referee document in the referee document set according to the relevance score and the complexity, and obtain a plurality of corrected relevance scores corresponding to the plurality of referee documents in the referee document set.
The sorting unit 40 is configured to sort the referee documents in the referee document set according to the plurality of corrected relevance scores to obtain a sorting result.
The display unit 50 is configured to display the referee documents in the referee document set according to the sorting result.
In the embodiment of the application, the acquisition unit 10 acquires the referee document set matched with the search term; the first calculating unit 20 calculates the relevance score and the complexity of each referee document in the referee document set, wherein the relevance score indicates the matching degree of the referee document and the search term, and the complexity indicates how complex the referee document is; the second calculating unit 30 calculates the corrected relevance score of each referee document in the referee document set according to the relevance score and the complexity, to obtain a plurality of corrected relevance scores respectively corresponding to the plurality of referee documents in the referee document set; the sorting unit 40 sorts the referee documents in the referee document set according to the plurality of corrected relevance scores to obtain a sorting result; and the display unit 50 displays the referee documents in the referee document set according to the sorting result. According to the embodiment of the application, the relevance scores of the referee documents are corrected according to the complexity of the referee documents, and the retrieval results are sorted and displayed according to the corrected relevance scores.
Preferably, the referee document set includes a first referee document, and the first calculation unit 20 includes: an obtaining module, configured to obtain an index parameter of the first referee document, where the index parameter includes at least one of the following parameters: the length of the first referee document, the number of applicable legal articles of the first referee document, and the litigation amount of the first referee document; and a first calculating module, configured to calculate the complexity of the first referee document according to the index parameter.
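The cooperation of the five units can be sketched as follows. This is one hypothetical arrangement, with each unit written as a method of a single class; the class name `RefereeDocumentRetriever`, the method names and the stand-in relevance and complexity functions (occurrence count and text length) are assumptions made for the illustration.

```python
from typing import Callable, Dict, List, Tuple


class RefereeDocumentRetriever:
    """Hypothetical arrangement of units 10 to 50 as methods of one class."""

    def __init__(self, docs: Dict[str, str],
                 relevance: Callable[[str, str], float],
                 complexity: Callable[[str], float]) -> None:
        self.docs = docs
        self.relevance = relevance
        self.complexity = complexity

    def acquire(self, term: str) -> List[str]:                    # unit 10
        return [d for d, text in self.docs.items() if term in text]

    def score(self, ids: List[str],
              term: str) -> Dict[str, Tuple[float, float]]:       # unit 20
        return {d: (self.relevance(self.docs[d], term),
                    self.complexity(self.docs[d])) for d in ids}

    def correct(self, scores: Dict[str, Tuple[float, float]]
                ) -> Dict[str, float]:                             # unit 30
        return {d: r * c for d, (r, c) in scores.items()}

    def sort(self, corrected: Dict[str, float]) -> List[str]:     # unit 40
        return sorted(corrected, key=lambda d: (-corrected[d], d))

    def display(self, ranked: List[str]) -> None:                 # unit 50
        for position, doc_id in enumerate(ranked, start=1):
            print(position, doc_id)

    def search(self, term: str) -> List[str]:
        ids = self.acquire(term)
        ranked = self.sort(self.correct(self.score(ids, term)))
        self.display(ranked)
        return ranked


# Stand-ins: relevance = number of occurrences, complexity = text length.
retriever = RefereeDocumentRetriever(
    docs={
        "d1": "contract invalid",
        "d2": "contract invalid; the contract invalid claim is upheld after appeal",
    },
    relevance=lambda text, term: float(text.count(term)),
    complexity=lambda text: float(len(text)),
)
ranked = retriever.search("contract invalid")
```

Passing the relevance and complexity functions in as parameters keeps the units independent, so a trained complexity function can replace the stand-in without touching the sorting or display steps.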
Preferably, the first calculation module comprises: the obtaining submodule is used for obtaining the weight of each parameter in the index parameters; and the calculation submodule is used for calculating the complexity of the first referee document according to the values of all the parameters in the index parameters and the weights of all the parameters.
Further, the apparatus further comprises: the third calculating unit is used for calculating the relevance score of each referee document in a plurality of preset referee document sets, wherein the plurality of preset referee document sets are a plurality of referee document sets respectively matched with a plurality of preset training words; the fourth calculating unit is used for calculating the complexity of each referee document in a plurality of preset referee document sets, wherein the complexity is calculated according to the value of each parameter in the index parameters and the initial weight of each parameter, the index parameters used by each referee document participating in calculation are correspondingly consistent, and the initial weights of the same parameters in the index parameters are equal; a fifth calculating unit, configured to calculate a modified relevance score of each referee document in the plurality of preset referee document sets according to the relevance score and corresponding complexity of each referee document in the plurality of preset referee document sets, and determine a first preset number of referee documents with modified relevance scores in each preset referee document set ranked in the top; the correcting unit is used for correcting the initial weight of each parameter according to the sorting and the reference sorting of the corrected relevance scores of the first preset number of referee documents corresponding to each preset referee document set; and a determining unit for respectively using the initial weight of each parameter after correction as the weight of each parameter, wherein the weights after correction of the same parameter are equal.
Further, the correction unit includes: the second calculation module is used for calculating the ratio values of the first preset number of referee documents corresponding to each preset referee document set, which are different from the reference sequence according to the sequence of the corrected relevance scores, so as to obtain a plurality of ratio values; the judging module is used for judging whether the ratio values are all smaller than a preset threshold value; and the correction module is used for correcting the initial weight of each parameter when judging that the ratio values larger than the preset threshold exist in the plurality of ratio values, and finishing the correction of the initial weight of each parameter when judging that the ratio values are smaller than the preset threshold.
The referee document retrieval device comprises a processor and a memory, wherein the acquisition unit, the first calculation unit, the second calculation unit, the sequencing unit, the display unit and the like are stored in the memory as program units, and the processor executes the program units stored in the memory to realize corresponding functions.
The processor comprises a kernel, and the kernel calls the corresponding program unit from the memory. One or more kernels may be provided, and accurate retrieval of referee documents is achieved by adjusting kernel parameters.
The memory may include forms of computer-readable media such as volatile memory, random access memory (RAM), and/or non-volatile memory such as read-only memory (ROM) or flash memory (flash RAM), and the memory includes at least one memory chip.
The present application further provides a computer program product which, when executed on a data processing device, is adapted to execute program code initialized with the following method steps: acquiring a referee document set matched with the search term; calculating the relevance score and the complexity of each referee document in the referee document set, wherein the relevance score represents the matching degree of the referee document and the search term, and the complexity represents how complex the referee document is; calculating a corrected relevance score of each referee document in the referee document set according to the relevance score and the complexity, to obtain a plurality of corrected relevance scores respectively corresponding to the plurality of referee documents in the referee document set; sorting the referee documents in the referee document set according to the corrected relevance scores to obtain a sorting result; and displaying the referee documents in the referee document set according to the sorting result.
The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments.
In the above embodiments of the present application, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units may be a logical division, and in actual implementation, there may be another division, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes: a USB flash disk, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disk, and other media capable of storing program code.
The foregoing is only a preferred embodiment of the present application. It should be noted that those skilled in the art can make several improvements and modifications without departing from the principle of the present application, and these improvements and modifications should also be regarded as falling within the protection scope of the present application.

Claims (6)

1. A method for retrieving referee documents, comprising:
acquiring a referee document set matched with the search terms;
calculating the relevance score and the complexity of each referee document in the referee document set, wherein the relevance score is used for representing the matching degree of the referee document and the search word, and the complexity is used for representing the complexity of the referee document;
calculating a corrected relevance score of each referee document in the referee document set according to the relevance score and the complexity respectively to obtain a plurality of corrected relevance scores corresponding to a plurality of referee documents in the referee document set respectively;
sequencing the referee documents in the referee document set according to the corrected relevance scores to obtain a sequencing result; and
displaying the referee documents in the referee document set according to the sequencing result,
wherein the referee document set comprises a first referee document, and the calculating of the complexity of each referee document in the referee document set comprises: acquiring an index parameter of the first referee document, wherein the index parameter comprises at least one of the following parameters: the length of the first referee document, the number of applicable laws of the first referee document, and the litigation amount of the first referee document; and calculating the complexity of the first referee document according to the index parameter,
calculating the complexity of the first referee document according to the index parameter comprises: acquiring the weight of each parameter in the index parameters; calculating the complexity of the first referee document according to the value of each parameter in the index parameters and the weight of each parameter;
the complexity of the first referee document is calculated using the following formula:
where C denotes the complexity of the first referee document; L, N, and M denote the length of the first referee document, the number of applicable laws of the first referee document, and the litigation amount of the first referee document, respectively; and pL, pN, and pM denote the weight of the length of the first referee document, the weight of the number of applicable laws of the first referee document, and the weight of the litigation amount of the first referee document, respectively.
2. The method of claim 1, further comprising: setting the weight according to the following preset rules:
calculating the relevance score of each referee document in a plurality of preset referee document sets, wherein the plurality of preset referee document sets are a plurality of referee document sets respectively matched with a plurality of preset training words;
calculating the complexity of each referee document in the preset referee document sets, wherein the complexity is calculated according to the value of each parameter in the index parameters and the initial weight of each parameter, the index parameters used by each referee document participating in calculation are correspondingly consistent, and the initial weights of the same parameters in the index parameters are equal;
calculating a corrected relevance score of each referee document in the plurality of preset referee document sets according to the relevance score and corresponding complexity of each referee document in the plurality of preset referee document sets, and determining a first preset number of referee documents with corrected relevance scores in each preset referee document set in a front sequence;
correcting the initial weight of each parameter according to the ranking by corrected relevance score of the first preset number of referee documents corresponding to each preset referee document set and a reference ranking; and
taking the corrected initial weight of each parameter as the weight of that parameter, wherein the corrected weights of the same parameters are equal.
3. The method according to claim 2, wherein correcting the initial weight of each parameter according to the ranking by corrected relevance score of the first preset number of referee documents corresponding to each preset referee document set and the reference ranking comprises:
calculating a ratio value of the first preset number of referee documents corresponding to each preset referee document set, which is different from the reference sequence according to the sequence of the corrected relevance scores, to obtain a plurality of ratio values;
judging whether the ratio values are all smaller than a preset threshold value;
when the ratio value which is larger than the preset threshold value exists in the plurality of ratio values, correcting the initial weight of each parameter; and
and when the plurality of ratio values are judged to be smaller than the preset threshold value, finishing the correction of the initial weight of each parameter.
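The ratio test in this claim can be sketched in a few lines. The sketch assumes the ratio is the fraction of rank positions at which the top-ranked list differs from the reference ranking, which is one reading of the claim; the function names are hypothetical, and a ratio exactly equal to the threshold is treated as not yet finished because the claim does not address that case. How the weights are adjusted between rounds is not specified here, so it is left out.

```python
def mismatch_ratio(ranked_ids, reference_ids):
    """Fraction of positions where the ranking differs from the reference."""
    if len(ranked_ids) != len(reference_ids):
        raise ValueError("rankings must have the same length")
    if not ranked_ids:
        return 0.0
    differing = sum(1 for r, ref in zip(ranked_ids, reference_ids) if r != ref)
    return differing / len(ranked_ids)


def correction_finished(ratios, threshold):
    """True once every ratio value is smaller than the preset threshold."""
    return all(r < threshold for r in ratios)


# One ratio per preset referee document set (hypothetical rankings).
ratios = [
    mismatch_ratio(["b", "a", "c", "d"], ["b", "a", "d", "c"]),  # 2 of 4 differ
    mismatch_ratio(["x", "y"], ["x", "y"]),                      # identical
]
done = correction_finished(ratios, threshold=0.3)
```

With these sample rankings the first set still disagrees with its reference at half of its positions, so another round of weight correction would be needed under a threshold of 0.3.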
4. A referee document retrieval apparatus, comprising:
an acquisition unit, configured to acquire a referee document set matched with a search term;
a first calculating unit, configured to calculate a relevance score and a complexity of each referee document in the referee document set, where the relevance score indicates the degree of matching between the referee document and the search term, and the complexity indicates how complex the referee document is;
a second calculating unit, configured to calculate a corrected relevance score of each referee document in the referee document set according to its relevance score and complexity, to obtain a plurality of corrected relevance scores respectively corresponding to the referee documents in the referee document set;
a sorting unit, configured to sort the referee documents in the referee document set according to the corrected relevance scores to obtain a sorting result; and
a display unit, configured to display the referee documents in the referee document set according to the sorting result,
wherein the referee document set includes a first referee document, and the first calculating unit includes: an obtaining module, configured to obtain an index parameter of the first referee document, where the index parameter includes at least one of the following parameters: the length of the first referee document, the number of laws applicable to the first referee document, and the litigation amount of the first referee document; and a first calculating module, configured to calculate the complexity of the first referee document according to the index parameter,
wherein the first calculating module includes: an obtaining submodule, configured to obtain the weight of each parameter in the index parameters; and a calculating submodule, configured to calculate the complexity of the first referee document according to the value of each parameter in the index parameters and the weight of each parameter; and
wherein the complexity of the first referee document is calculated using the following formula:
where C denotes the complexity of the first referee document; L, M, and N denote the length of the first referee document, the number of laws applicable to the first referee document, and the litigation amount of the first referee document, respectively; and pL, pM, and pN denote the weight of the length, the weight of the number of applicable laws, and the weight of the litigation amount of the first referee document, respectively.
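The flow through the second calculating unit and the sorting unit can be illustrated with a minimal sketch. The class `RefereeDocument`, the function `rank_documents`, and the sample values are hypothetical, and multiplying relevance by complexity is an assumed way of forming the corrected relevance score, since the claim does not fix the combination.

```python
from dataclasses import dataclass


@dataclass
class RefereeDocument:
    doc_id: str
    relevance: float   # degree of matching with the search term
    complexity: float  # complexity computed from the index parameters


def rank_documents(docs):
    """Sort documents by corrected relevance score, highest first."""
    # Assumption: corrected score = relevance * complexity.
    return sorted(docs, key=lambda d: d.relevance * d.complexity, reverse=True)


docs = [
    RefereeDocument("d1", relevance=0.9, complexity=1.0),
    RefereeDocument("d2", relevance=0.7, complexity=2.0),
    RefereeDocument("d3", relevance=0.8, complexity=0.5),
]
ordered_ids = [d.doc_id for d in rank_documents(docs)]
```

In this example the document with the highest raw relevance, d1, is overtaken by the more complex d2 once the correction is applied, which is the reordering effect the apparatus is meant to produce.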
5. The apparatus of claim 4, further comprising:
a third calculating unit, configured to calculate a relevance score of each referee document in a plurality of preset referee document sets, wherein the plurality of preset referee document sets are referee document sets respectively matched with a plurality of preset training words;
a fourth calculating unit, configured to calculate the complexity of each referee document in the plurality of preset referee document sets, where the complexity is calculated from the value of each parameter in the index parameters and an initial weight of each parameter, the same index parameters are used for every referee document participating in the calculation, and the initial weight of a given parameter is the same for every referee document;
a fifth calculating unit, configured to calculate a corrected relevance score of each referee document in the plurality of preset referee document sets according to the relevance score and corresponding complexity of that referee document, and to determine, in each preset referee document set, a first preset number of referee documents ranked highest by corrected relevance score;
a correcting unit, configured to correct the initial weight of each parameter according to the ranking by corrected relevance score of the first preset number of referee documents corresponding to each preset referee document set and a reference ranking; and
a determining unit, configured to take the corrected initial weight of each parameter as the weight of that parameter, wherein the corrected weight of a given parameter is the same for every referee document.
6. The apparatus of claim 5, wherein the correcting unit comprises:
a second calculating module, configured to calculate, for each preset referee document set, the ratio of the first preset number of referee documents whose ranking by corrected relevance score differs from the reference ranking, so as to obtain a plurality of ratio values;
a judging module, configured to judge whether the plurality of ratio values are all smaller than a preset threshold; and
a correcting module, configured to correct the initial weight of each parameter when any of the plurality of ratio values is larger than the preset threshold, and to finish the correction of the initial weight of each parameter when the plurality of ratio values are all smaller than the preset threshold.
CN201510869926.9A 2015-12-01 2015-12-01 Referee document retrieval method and device Active CN106815266B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510869926.9A CN106815266B (en) 2015-12-01 2015-12-01 Referee document retrieval method and device

Publications (2)

Publication Number Publication Date
CN106815266A CN106815266A (en) 2017-06-09
CN106815266B true CN106815266B (en) 2020-06-16

Family

ID=59107655

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510869926.9A Active CN106815266B (en) 2015-12-01 2015-12-01 Referee document retrieval method and device

Country Status (1)

Country Link
CN (1) CN106815266B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110019659B (en) * 2017-07-31 2021-07-30 北京国双科技有限公司 Method and device for searching referee document
CN109388796B (en) * 2017-08-11 2023-04-18 北京国双科技有限公司 Method and device for pushing referee document
CN108573057A (en) * 2018-04-25 2018-09-25 王慧 A kind of legal documents and laws and regulations correspondence search method
CN109446511B (en) * 2018-09-10 2022-07-08 平安科技(深圳)有限公司 Referee document processing method, referee document processing device, computer equipment and storage medium
CN109299382B (en) * 2018-11-01 2021-08-10 厦门市美亚柏科信息股份有限公司 Recommendation method and system for character data and computer storage medium
CN114417084A (en) * 2021-05-21 2022-04-29 深圳市智尊宝数据开发有限公司 Information retrieval method, related apparatus and medium, and program product

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1172994A (en) * 1996-05-29 1998-02-11 松下电器产业株式会社 Document retrieval system
WO2001080087A1 (en) * 2000-04-14 2001-10-25 Rightnow Technologies, Inc. Temporal updates of relevancy rating of retrieved information in an information search system
CN1517914A (en) * 2003-01-06 2004-08-04 Searching of structural file
CN103793418A (en) * 2012-10-31 2014-05-14 珠海富讯网络科技有限公司 Search method of real-time vertical search engine for security industry

Also Published As

Publication number Publication date
CN106815266A (en) 2017-06-09

Similar Documents

Publication Publication Date Title
CN106815266B (en) Referee document retrieval method and device
CN105183731B (en) Recommendation information generation method, device and system
US11514242B2 (en) Method for automatically summarizing internet web page and text information
US9535911B2 (en) Processing a content item with regard to an event
CN106445963B (en) Advertisement index keyword automatic generation method and device of APP platform
CN104361115B (en) It is a kind of based on the entry Weight Determination clicked jointly and device
CN110737859A (en) UP main matching method and device
JP2013522720A (en) Determination of word information entropy
CN107943910B (en) Personalized book recommendation method based on combined algorithm
CN106294744A (en) Interest recognition methods and system
CN109388796B (en) Method and device for pushing referee document
CN105512333A (en) Product comment theme searching method based on emotional tendency
CN110825977A (en) Data recommendation method and related equipment
CN113626700B (en) Lawyer recommendation method, system and equipment
CN111553151A (en) Question recommendation method and device based on field similarity calculation and server
CN115858731A (en) Method, device and system for matching laws and regulations of law and regulation library
JP5367632B2 (en) Knowledge amount estimation apparatus and program
CN109522275B (en) Label mining method based on user production content, electronic device and storage medium
CN105740276B (en) Method and device for estimating click feedback model suitable for commercial search
CN104615685B (en) A kind of temperature evaluation method of network-oriented topic
CN110909532A (en) User name matching method and device, computer equipment and storage medium
CN105447087A (en) Video recommendation method and apparatus
CN105893397A (en) Video recommendation method and apparatus
CN114610796A (en) Text similarity determination method and device, storage medium and electronic equipment
CN107992524A (en) A kind of expert info search and field scoring computational methods

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100083 No. 401, 4th Floor, Haitai Building, 229 North Fourth Ring Road, Haidian District, Beijing

Applicant after: Beijing Guoshuang Technology Co.,Ltd.

Address before: 100086 Cuigong Hotel, 76 Zhichun Road, Shuangyushu District, Haidian District, Beijing

Applicant before: Beijing Guoshuang Technology Co.,Ltd.

GR01 Patent grant