CN111274229A - Method and device for verifying denoising result of retrieved data - Google Patents
Method and device for verifying denoising result of retrieved data
- Publication number
- CN111274229A (application number CN202010116110.XA)
- Authority
- CN
- China
- Prior art keywords
- database
- patent document
- core word
- obtaining
- denoising
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24564—Applying rules; Deductive queries
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides a method and a device for verifying a denoising result of retrieved data. The method comprises: obtaining a first keyword of a first patent document; obtaining a first database from a patent retrieval database according to the first keyword; obtaining a first denoising instruction, by which a second database comprising patent documents that include a second keyword is deleted from the first database; obtaining, from the second database, a second patent document having first classification information; judging whether the first classification information meets a first predetermined condition; and, when it does, obtaining a first recovery instruction for restoring the second patent document to the first database. The method solves the technical problems in the prior art that a denoised retrieval database lacks a verification process, or that verification must be performed manually, so that the target database is over-denoised or the target patent database is incomplete. By analyzing and comparing classification numbers, the denoising result is verified, erroneous operations during denoising are effectively avoided, the accuracy of the retrieved target database is ensured, and the method is suitable for a wide range of users.
Description
Technical Field
The invention relates to the technical field of data processing, in particular to a method and a device for verifying a denoising result of retrieved data.
Background
With the continuous development and improvement of social systems, the number of patent documents is increasing rapidly, and the protection of enterprises' patent rights is becoming more and more important in every country. For an enterprise, accurately retrieving and analyzing the information it needs from a large volume of patent documents is very important to its overall development. In the era of the knowledge economy, intellectual property rights are regarded as strategic resources that provide core competitiveness for an enterprise or even a country, and their importance is unprecedented. Patents contain a large amount of technical information. By searching and analyzing related patents, a user can learn the development trend of a technical field, which gives direction to later research and development and helps avoid infringement risks. Patent document retrieval is the basic work by which an enterprise gains a comprehensive understanding of the prior art, raises its research and development starting point, and avoids intellectual property risks. Because the original patent data published on the internet is incomplete, written in obscure language, lengthy and hard to understand, enterprises find searching difficult unless they have mastered professional search methods and skills.
However, the applicant of the present invention finds that the prior art has at least the following technical problems:
In the prior art, a retrieval database that has been denoised lacks a verification process, or the verification must be performed manually by a professional. The manual process is tedious, limited in its scope of application, and cannot guarantee an accurate verification result, so the target database may be over-denoised or the target patent database may be incomplete.
Disclosure of Invention
The embodiment of the invention provides a method and a device for verifying a denoising result of retrieved data, which solve the technical problems in the prior art that a denoised retrieval database lacks a verification process, or that the verification must be performed manually by a professional, which is tedious, limited in its scope of application, and cannot guarantee an accurate verification result, so that the target database is over-denoised or the target patent database is incomplete.
In view of the foregoing problems, embodiments of the present application provide a method and an apparatus for verifying a denoising result of retrieved data.
In a first aspect, the present invention provides a method for verifying a denoising result of retrieved data, the method including: obtaining a first patent document having a first keyword; obtaining a first database from a patent retrieval database according to the first keyword, wherein the patent documents in the first database include the first keyword; obtaining a first denoising instruction, wherein the first denoising instruction is that a user deletes a second database from the first database, the patent documents in the second database include a second keyword, and the second keyword is denoising information; obtaining a second patent document from the second database, the second patent document having first classification information; judging whether the first classification information meets a first predetermined condition; and, when the first classification information meets the first predetermined condition, obtaining a first recovery instruction, wherein the first recovery instruction is used for restoring the second patent document to the first database.
Preferably, after judging whether the first classification information meets the first predetermined condition, the method includes: when the first classification information does not meet the first predetermined condition, taking the first database as a target database.
Preferably, before obtaining the first denoising instruction, the method includes: obtaining a first core word according to the first patent document; obtaining a second core word according to the first core word, wherein the second core word is information of a denoising target; obtaining a third patent document from the first database; obtaining a third core word according to the third patent document; obtaining a first relevance according to the third core word and the second core word; judging whether the first relevance meets a first preset threshold; and, when the first relevance meets the first preset threshold, determining the second keyword according to the third core word.
Preferably, after judging whether the first relevance meets the first preset threshold, the method includes: when the first relevance does not meet the first preset threshold, obtaining a first attribute according to the third core word; obtaining second classification information according to the first attribute; judging whether the second classification information and the first classification information meet a second relevance; and, when the second classification information and the first classification information meet the second relevance, taking the first database as the target database.
Preferably, the first relevance is that the third core word and the second core word can be mutually replaced.
In a second aspect, the present invention provides a device for verifying a denoising result of retrieved data, the device comprising:
a first obtaining unit configured to obtain a first patent document having a first keyword;
a second obtaining unit, configured to obtain a first database from a patent retrieval database according to a first keyword, where a patent document in the first database includes the first keyword;
a third obtaining unit, configured to obtain a first denoising instruction, where the first denoising instruction is that a user deletes a second database from the first database, where a patent document in the second database includes a second keyword, and the second keyword is denoising information;
a fourth obtaining unit configured to obtain a second patent document from the second database, the second patent document having first classification information;
a first judging unit, configured to judge whether the first classification information satisfies a first predetermined condition;
a first execution unit, configured to obtain a first recovery instruction when the first classification information satisfies the first predetermined condition, where the first recovery instruction is used to recover the second patent document into the first database.
Preferably, the apparatus further comprises:
a second execution unit configured to take the first database as a target database when the first classification information does not satisfy the first predetermined condition.
Preferably, the apparatus further comprises:
a fifth obtaining unit configured to obtain a first core word according to the first patent document;
a sixth obtaining unit, configured to obtain a second core word according to the first core word, where the second core word is information of a denoising target;
a seventh obtaining unit configured to obtain a third patent document from the first database;
an eighth obtaining unit, configured to obtain a third core word according to the third patent document;
a ninth obtaining unit, configured to obtain a first relevance according to the third core word and the second core word;
a second judging unit, configured to judge whether the first relevance satisfies a first preset threshold;
a first determining unit, configured to determine the second keyword according to the third core word when the first relevance satisfies a first preset threshold.
Preferably, the apparatus further comprises:
a tenth obtaining unit, configured to, when the first relevance does not satisfy a first preset threshold, obtain a first attribute according to the third core word;
an eleventh obtaining unit, configured to obtain second classification information according to the first attribute;
a third judging unit, configured to judge whether the second classification information and the first classification information satisfy a second relevance;
a third executing unit, configured to, when the second classification information and the first classification information satisfy the second relevance, take the first database as the target database.
Preferably, the first relevance is that the third core word and the second core word can be mutually replaced.
In a third aspect, the present invention provides an apparatus for verifying a denoising result of retrieved data, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of any one of the above methods when executing the program.
In a fourth aspect, the invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of any of the methods described above.
One or more technical solutions in the embodiments of the present application have at least one or more of the following technical effects:
According to the method and device for verifying a denoising result of retrieved data provided by the embodiment of the invention, a first patent document having a first keyword is obtained; a first database is obtained from a patent retrieval database according to the first keyword, wherein the patent documents in the first database include the first keyword; a first denoising instruction is obtained, wherein the first denoising instruction is that a user deletes a second database from the first database, the patent documents in the second database include a second keyword, and the second keyword is denoising information; a second patent document having first classification information is obtained from the second database; it is judged whether the first classification information meets a first predetermined condition; and, when the first classification information meets the first predetermined condition, a first recovery instruction is obtained, wherein the first recovery instruction is used for restoring the second patent document to the first database. The denoising result is thus verified by analyzing and comparing classification numbers, which effectively avoids erroneous operations during denoising, ensures the accuracy of the retrieved target database, and provides a sound basis for subsequent patent analysis. The process is convenient and suitable for a wide range of users.
This solves the technical problems in the prior art that a denoised retrieval database lacks a verification process, or that the verification must be performed manually by a professional, which is tedious, limited in its scope of application, and cannot guarantee an accurate verification result, so that the target database is over-denoised or the target patent database is incomplete.
The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.
Drawings
FIG. 1 is a schematic flow chart of a method for verifying a denoising result of retrieved data according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of an apparatus for verifying a denoising result of retrieved data according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of another apparatus for verifying a denoising result of retrieved data according to an embodiment of the present invention.
Description of reference numerals: a first obtaining unit 11, a second obtaining unit 12, a third obtaining unit 13, a fourth obtaining unit 14, a first judging unit 15, a first executing unit 16, a bus 300, a receiver 301, a processor 302, a transmitter 303, a memory 304, and a bus interface 306.
Detailed Description
The embodiment of the invention provides a method and a device for verifying a denoising result of retrieved data, which solve the technical problems in the prior art that a denoised retrieval database lacks a verification process, or that the verification must be performed manually by a professional, which is tedious, limited in its scope of application, and cannot guarantee an accurate verification result, so that the target database is over-denoised or the target patent database is incomplete.
The technical scheme provided by the invention has the following general idea:
Obtaining a first patent document having a first keyword; obtaining a first database from a patent retrieval database according to the first keyword, wherein the patent documents in the first database include the first keyword; obtaining a first denoising instruction, wherein the first denoising instruction is that a user deletes a second database from the first database, the patent documents in the second database include a second keyword, and the second keyword is denoising information; obtaining a second patent document from the second database, the second patent document having first classification information; judging whether the first classification information meets a first predetermined condition; and, when the first classification information meets the first predetermined condition, obtaining a first recovery instruction, wherein the first recovery instruction is used for restoring the second patent document to the first database. The denoising result is thus verified by analyzing and comparing classification numbers, which effectively avoids erroneous operations during denoising, ensures the accuracy of the retrieved target database, and provides a sound basis for subsequent patent analysis. The process is convenient and suitable for a wide range of users.
The technical solutions of the present invention are described in detail below with reference to the drawings and specific embodiments. It should be understood that the specific features in the embodiments and examples of the present invention are detailed descriptions of the technical solutions of the present application and do not limit them, and that the technical features in the embodiments and examples of the present application may be combined with each other where they do not conflict.
The term "and/or" herein merely describes an association between associated objects and means that three relationships may exist. For example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" herein generally indicates that the objects before and after it are in an "or" relationship.
Example one
FIG. 1 is a schematic flow chart of a method for verifying a denoising result of retrieved data according to an embodiment of the present invention. As shown in FIG. 1, an embodiment of the present invention provides a method for verifying a denoising result of retrieved data, where the method includes:
Step 110: a first patent document is obtained, the first patent document having a first keyword.
Step 120: a first database is obtained from a patent retrieval database according to the first keyword, wherein the patent documents in the first database include the first keyword.
Specifically, the first patent document is the patent document to be searched; its content is analyzed in order to retrieve related patent content. A common search process uses patent keywords. The keywords reflect the subject matter mainly described by the first patent document: words that summarize the document well can be obtained by comprehensively analyzing and comparing its title, claims, the specific problem to be solved, the technical effect and other aspects. Searching the patent retrieval database with these keywords returns the patent documents that contain all of them, and all of the patent documents so retrieved are called the first database. However, a keyword may have several meanings, or may be used in a different application environment or for a different subject, so a database retrieved by keyword is noisy and does not consist only of patent content that closely matches the first patent document. Some patent documents contain a search keyword although their main content is entirely different from that of the first patent document; these are not target patent documents. The retrieved patent database therefore needs to be denoised to ensure the reliability of the search result. In the prior art, denoising is performed by a professional searching manually again, which is tedious, prone to incomplete denoising or erroneous operations, highly limited, and difficult for ordinary company staff to carry out.
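Steps 110 and 120 can be sketched as a simple keyword filter. This is a minimal illustration under stated assumptions, not the embodiment's implementation: the `PatentDocument` record, the `build_first_database` helper, the sample documents and the case-insensitive substring match are all hypothetical, and a real patent retrieval database would use its own query engine.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PatentDocument:
    """Hypothetical record for one patent document (not defined in the source)."""
    number: str
    text: str
    ipc_codes: tuple = field(default_factory=tuple)


def contains_keyword(doc: PatentDocument, keyword: str) -> bool:
    # Assumption: a keyword counts as "included" when it appears as a
    # case-insensitive substring of the document text.
    return keyword.lower() in doc.text.lower()


def build_first_database(retrieval_db, first_keywords):
    """Step 120: keep the documents that contain every first keyword."""
    return [
        doc for doc in retrieval_db
        if all(contains_keyword(doc, kw) for kw in first_keywords)
    ]


# Made-up sample data for illustration only.
retrieval_db = [
    PatentDocument("D1", "A manual toothbrush with replaceable bristles"),
    PatentDocument("D2", "An electric toothbrush driven by a motor"),
    PatentDocument("D3", "A paint brush with a wooden handle"),
]
first_database = build_first_database(retrieval_db, ["toothbrush"])
print([doc.number for doc in first_database])
```

In this toy data the first database still contains the electric toothbrush document, which is the kind of noise the later steps are meant to remove.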
Step 130: a first denoising instruction is obtained, wherein the first denoising instruction is that a user deletes a second database from the first database, the patent documents in the second database include a second keyword, and the second keyword is denoising information.
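The denoising instruction of step 130 amounts to partitioning the first database. The following is a minimal sketch, assuming documents are plain dictionaries with a `text` field and that matching is a case-insensitive substring test; the function name `apply_denoising` and the sample data are illustrative and do not come from the source.

```python
def apply_denoising(first_database, second_keywords):
    """Split the first database into (kept, removed).

    `removed` plays the role of the second database: every document that
    contains at least one denoising (second) keyword.
    """
    kept, removed = [], []
    for doc in first_database:
        text = doc["text"].lower()
        if any(kw.lower() in text for kw in second_keywords):
            removed.append(doc)
        else:
            kept.append(doc)
    return kept, removed


# Made-up sample data for illustration only.
first_database = [
    {"number": "D1", "text": "A manual toothbrush with replaceable bristles"},
    {"number": "D2", "text": "An electric toothbrush driven by a motor"},
    {"number": "D4", "text": "A manual toothbrush whose handle is molded by a motor-driven press"},
]
kept, second_database = apply_denoising(first_database, ["motor"])
print([d["number"] for d in kept])
print([d["number"] for d in second_database])
```

Document D4 describes a manual toothbrush but is removed because it mentions a motor, which is the kind of over-denoising that a verification step has to catch. Keeping the removed documents in a separate list, rather than discarding them, is what makes a later restore possible.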
Further, before obtaining the first denoising instruction, the method includes: obtaining a first core word according to the first patent document; obtaining a second core word according to the first core word, wherein the second core word is information of a denoising target; obtaining a third patent document from the first database; obtaining a third core word according to the third patent document; obtaining a first relevance according to the third core word and the second core word; judging whether the first relevance meets a first preset threshold; and, when the first relevance meets the first preset threshold, determining the second keyword according to the third core word.
Further, the first relevance is that the third core word and the second core word can be mutually replaced.
Specifically, the embodiment of the invention uses an automatic denoising process that runs according to the user's denoising requirement. The denoising keyword can be set by the user or determined by analyzing patent content. To determine the denoising keyword, a first core word is first obtained from the first patent document, and a second core word is obtained from the first core word. The second core word is the denoising core word, that is, patents related to the second core word need to be removed from the database. For example, suppose the first patent document relates to a toothbrush, and the user is interested in manual toothbrushes and wants to remove electric toothbrushes. Based on the specific features of an electric toothbrush, patent documents that include a motor can be deleted, so the denoising core word can be set to "motor".
To make the denoising thorough, the core word can be expanded from the second core word when the denoising keyword is determined. This avoids the situation in which a denoising target exists in the first database under wording that differs from the second core word, so that noise remains in the database after denoising. To this end, a patent document in the first database is analyzed to obtain a third core word, and it is judged whether the third core word is relevant to the second core word. The preset threshold is set according to the specific denoising requirement. Because denoising with strongly relevant core words generally ensures a valid denoising result, the relevance threshold should be set high: for example, the two words are interchangeable, or their similarity reaches at least 85%. When the relevance between the third core word and the second core word is judged, the two core words are analyzed in terms of their attributes, meanings, fields of application, effects and so on, the parameters and parameter values are weighted, and this multi-aspect analysis determines whether the first relevance requirement is met. For example, if the second core word is "motor" and the third core word is "mini motor", the two words are interchangeable within a certain range, so the judgment needs to take into account the specific subject matter of the patent documents in the patent database. When the two words are determined to be interchangeable, or their similarity meets the threshold, the final denoising keyword is determined from the second core word and the third core word. There may be one or more denoising keywords, and they may be further expanded or generalized to broader terms based on the second and third core words to ensure denoising accuracy.
Once the denoising keyword, namely the second keyword, is determined, it is used to search the first database. The patent documents related to the second keyword are found and deleted from the first database, which completes the denoising process, avoids the tedium of manual denoising, and gives higher accuracy.
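The weighted, multi-aspect relevance check described above can be sketched as follows. The source names the aspects (attribute, meaning, field of application, effect) and gives 85% as an example threshold, but it does not specify weights or a scoring formula. The weights, the weighted-average formula, the function names and the hand-assigned scores below are therefore assumptions made for illustration.

```python
# Hypothetical per-aspect weights; the source names the aspects but gives no values.
ASPECT_WEIGHTS = {"attribute": 0.3, "meaning": 0.3, "field": 0.2, "effect": 0.2}
RELEVANCE_THRESHOLD = 0.85  # the "at least 85%" example from the text


def first_relevance(aspect_scores, weights=ASPECT_WEIGHTS):
    """Weighted average of per-aspect similarity scores, each in [0, 1]."""
    total = sum(weights.values())
    return sum(w * aspect_scores.get(a, 0.0) for a, w in weights.items()) / total


def select_denoising_keywords(second_core_word, candidates):
    """Return the second core word plus every candidate third core word
    whose relevance to it meets the threshold."""
    keywords = [second_core_word]
    for word, scores in candidates.items():
        if first_relevance(scores) >= RELEVANCE_THRESHOLD:
            keywords.append(word)
    return keywords


# Illustrative, hand-assigned scores against the second core word "motor"
# (not measured data).
candidates = {
    "mini motor": {"attribute": 0.9, "meaning": 0.9, "field": 0.9, "effect": 0.9},
    "bristle": {"attribute": 0.1, "meaning": 0.0, "field": 0.5, "effect": 0.1},
}
second_keywords = select_denoising_keywords("motor", candidates)
print(second_keywords)
```

How the per-aspect scores are produced is left open here; in practice they might come from a thesaurus, embeddings or manual review, none of which the source prescribes.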
Step 140: a second patent document is obtained from the second database, the second patent document having first classification information.
Specifically, to avoid errors or excessive deletion in the denoising process, the embodiment of the present invention includes a process for verifying the denoising result. The verification uses classification numbers: the classification numbers of the patent documents in the second database, that is, the database of patent documents to be removed by denoising, are analyzed to determine the classification number information of each patent document. Patent documents are classified according to the International Patent Classification (IPC), which China also adopts. The IPC combines function and application, with function as the primary criterion and application as the secondary one. Technical content is classified hierarchically, level by level, into section, class, subclass, main group and subgroup, forming a complete classification system. Given the international classification of a product, patent information in the technical field to which the product belongs can be found easily. The classification numbers of invention and utility model patent applications are assigned using the IPC table. A patent may have several classification numbers, in which case the first is called the main classification number. For example, when an invention or utility model patent application involves different types of technical subjects that all constitute invention information, it should be classified under each technical subject concerned and given several classification numbers, with the classification number that best represents the invention information listed first.
The classification table is a tool for uniformly classifying the patent documents of each country. Its primary purpose is to serve as an effective search tool for the patent document searches that patent offices and other users conduct when determining the novelty and inventive step of patent applications, including the evaluation of technical advancement and practical value.
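The hierarchy of an IPC symbol can be made concrete with a small parser. This is a sketch only: the `IpcSymbol` type, the `parse_ipc` and `main_classification` helpers and the regular expression are assumptions for illustration, and the pattern covers only full symbols of the common form "G06F16/2455" or "A46B 9/04".

```python
import re
from typing import NamedTuple


class IpcSymbol(NamedTuple):
    section: str      # one letter A to H
    ipc_class: str    # two digits following the section letter
    subclass: str     # one letter
    main_group: str   # digits before the slash
    subgroup: str     # digits after the slash


# Assumed pattern for a full IPC symbol, with an optional space before the group.
_IPC_PATTERN = re.compile(r"^([A-H])(\d{2})([A-Z])\s*(\d{1,4})/(\d{2,6})$")


def parse_ipc(symbol: str) -> IpcSymbol:
    """Split a full IPC symbol into its five hierarchy levels."""
    match = _IPC_PATTERN.match(symbol.strip().upper())
    if match is None:
        raise ValueError(f"not a full IPC symbol: {symbol!r}")
    return IpcSymbol(*match.groups())


def main_classification(ipc_codes):
    """Treat the first listed classification number as the main one."""
    return parse_ipc(ipc_codes[0])


parsed = parse_ipc("G06F16/2455")
print(parsed)
print(main_classification(["A46B 9/04", "A61C 17/22"]).subclass)
```

Splitting the symbol this way lets two documents be compared level by level, for example same section, same class or same subclass, instead of requiring an exact string match.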
Step 150: it is judged whether the first classification information meets a first predetermined condition.
Step 160: when the first classification information meets the first predetermined condition, a first recovery instruction is obtained, wherein the first recovery instruction is used for restoring the second patent document to the first database.
Further, after judging whether the first classification information meets the first predetermined condition, the method includes: when the first classification information does not meet the first predetermined condition, taking the first database as a target database.
Specifically, the classification information of the patent documents in the second database is analyzed to determine their classification numbers, and it is judged whether those classification numbers meet the predetermined condition. To make this judgment, the classification information of the first patent document is first determined and analyzed, and it is then judged whether the first classification information of the second patent document and the classification number information of the first patent document meet the relevance requirement. The first predetermined condition is that the first classification information is highly similar to the classification number information of the first patent document, that is, close to it or interchangeable with it. Comparing classification numbers allows the specific content of the two patent documents to be analyzed and judged more accurately. If the classification numbers of the two patent documents meet the predetermined condition, the contents of the two patents are of the same type or extremely similar, so the second patent document can serve as a target patent document, and its removal by the second keyword was an erroneous operation. A corresponding instruction is obtained according to the comparison result, and the second patent document is restored, that is, added back to the first database, to form a new database that serves as the target database. This ensures the accuracy of the retrieved target database. Verification by classification number effectively avoids erroneous operations in denoising and provides a sound basis for subsequent patent analysis; the process is convenient and suitable for a wide range of users.
This solves the technical problems in the prior art that a denoised retrieval database lacks a verification process, or that the verification must be performed manually by a professional, which is tedious, limited in its scope of application, and cannot guarantee an accurate verification result, so that the target database is over-denoised or the target patent database is incomplete. If the first classification information of the second patent document and the classification number information of the first patent document do not meet the predetermined requirement, there is a gap between the two, and the second patent document does not meet the target retrieval requirement. In that case the first database, from which the patent documents of the second database have been deleted, is kept as the target database, and no data needs to be restored.
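Steps 140 to 160 can be sketched as a loop over the removed documents. The source requires only that the classification numbers be "highly similar" or interchangeable; reading this as "shares at least one IPC subclass with the first patent document" is an assumption made here for illustration, as are the function names, the dictionary layout and the sample IPC symbols.

```python
def ipc_subclass(code: str) -> str:
    # Assumption: the first four characters of an IPC symbol (e.g. "A46B")
    # identify its section, class and subclass.
    return code.replace(" ", "").upper()[:4]


def meets_first_condition(candidate_codes, reference_codes) -> bool:
    """Hypothetical reading of the first predetermined condition:
    the candidate shares at least one IPC subclass with the reference."""
    reference = {ipc_subclass(c) for c in reference_codes}
    return any(ipc_subclass(c) in reference for c in candidate_codes)


def verify_denoising(kept, removed, reference_codes):
    """Restore wrongly removed documents; return (target_db, still_removed)."""
    target_db = list(kept)
    still_removed = []
    for doc in removed:
        if meets_first_condition(doc["ipc"], reference_codes):
            target_db.append(doc)       # first recovery instruction
        else:
            still_removed.append(doc)   # the deletion stands
    return target_db, still_removed


# Made-up sample data; the IPC symbols are chosen for illustration only.
first_patent_ipc = ["A46B 9/04"]
kept = [{"number": "D1", "ipc": ["A46B 9/04"]}]
removed = [
    {"number": "D2", "ipc": ["A61C 17/22"]},
    {"number": "D4", "ipc": ["A46B 5/00", "B29C 45/00"]},
]
target_db, still_removed = verify_denoising(kept, removed, first_patent_ipc)
print([d["number"] for d in target_db])
print([d["number"] for d in still_removed])
```

A stricter or looser condition, such as matching down to the main group or only the class, would change which documents are restored; the source leaves that choice to the predetermined condition.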
Further, after judging whether the first relevance satisfies the first preset threshold, the method includes: when the first relevance does not satisfy the first preset threshold, obtaining a first attribute according to the third core word; obtaining second classification information according to the first attribute; judging whether the second classification information and the first classification information satisfy a second relevance; and when the second classification information and the first classification information satisfy the second relevance, taking the first database as the target database.
Specifically, in the process of determining the denoising keyword, if the third core word and the second core word do not meet the predetermined threshold, further classification analysis is performed on the third core word to ensure the accuracy of the target database. When the condition is met, the patent document corresponding to the third core word is kept in the first database, that is, the target database, as a target document; when it is not met, the third core word is analyzed further, for example by analyzing its correlation with the first patent document, to determine whether it meets the retrieval requirement. First, attribute analysis is performed on the third core word, covering its meaning, function, effect, subject, range of application, technical field, and so on. The attribute analysis yields the specific application environment of the third core word and hence the corresponding classification information, which may be a single classification or several. Each item of second classification information is compared with the first classification information to determine whether the second relevance requirement is met; like the first, this requirement is that the two are similar enough to be interchangeable. When the second relevance requirement is met, the content associated with the third core word is consistent with the content of the first patent document and meets the retrieval requirement, so the patent document corresponding to the third core word is kept in the target database as a target patent document. The third patent document in the first database then needs no adjustment, and the first database is taken as the target database.
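The attribute-based fallback can be sketched as follows. This is only an illustration under assumptions: the source does not say how attributes are derived or mapped to classification numbers, so the two lookup tables, the function names `second_classifications` and `meets_second_relevance`, and the four-character prefix comparison are all hypothetical.

```python
# Hypothetical attribute table: core word -> technical fields it belongs to.
CORE_WORD_FIELDS = {
    "index": ["database maintenance", "query processing"],
    "kettle": ["beverage apparatus"],
}

# Hypothetical mapping from a technical field to classification numbers.
FIELD_TO_CLASSIFICATIONS = {
    "database maintenance": ["G06F16/21"],
    "query processing": ["G06F16/245"],
    "beverage apparatus": ["A47J31/00"],
}


def second_classifications(core_word: str) -> set[str]:
    """Derive second classification information from the core word's attributes.

    The result may be empty, a single classification, or several.
    """
    codes: set[str] = set()
    for field_name in CORE_WORD_FIELDS.get(core_word, []):
        codes.update(FIELD_TO_CLASSIFICATIONS.get(field_name, []))
    return codes


def meets_second_relevance(
    core_word: str, first_classifications: list[str], prefix_len: int = 4
) -> bool:
    """Assumed rule: relevance holds if any derived code shares a prefix
    with any item of the first classification information."""
    first_prefixes = {c[:prefix_len] for c in first_classifications}
    return any(
        code[:prefix_len] in first_prefixes
        for code in second_classifications(core_word)
    )


print(meets_second_relevance("index", ["G06F16/22"]))   # True
print(meets_second_relevance("kettle", ["G06F16/22"]))  # False
```

When the function returns `True`, the document behind the third core word stays in the first database, which is then used as the target database unchanged.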
Example two
Based on the same inventive concept as the method for verifying the denoising result of retrieved data in the foregoing embodiment, the present invention further provides an apparatus for verifying the denoising result of retrieved data. As shown in fig. 2, the apparatus includes:
a first obtaining unit 11, the first obtaining unit 11 being configured to obtain a first patent document, the first patent document having a first keyword;
a second obtaining unit 12, configured to obtain a first database from a patent retrieval database according to a first keyword, where a patent document in the first database includes the first keyword;
a third obtaining unit 13, where the third obtaining unit 13 is configured to obtain a first denoising instruction, where the first denoising instruction is that a user deletes a second database from the first database, where a patent document in the second database includes a second keyword, and the second keyword is denoising information;
a fourth obtaining unit 14, wherein the fourth obtaining unit 14 is configured to obtain a second patent document from the second database, and the second patent document has first classification information;
a first judging unit 15, wherein the first judging unit 15 is used for judging whether the first classification information meets a first preset condition;
a first execution unit 16, wherein the first execution unit 16 is configured to obtain a first recovery instruction when the first classification information satisfies the first predetermined condition, and the first recovery instruction is configured to recover the second patent document into the first database.
Further, the apparatus further comprises:
a second execution unit configured to take the first database as a target database when the first classification information does not satisfy the first predetermined condition.
Further, the apparatus further comprises:
a fifth obtaining unit configured to obtain a first core word according to the first patent document;
a sixth obtaining unit, configured to obtain a second core word according to the first core word, where the second core word is denoising target information;
a seventh obtaining unit configured to obtain a third patent document from the first database;
an eighth obtaining unit, configured to obtain a third core word according to the third patent document;
a ninth obtaining unit, configured to obtain a first relevance according to the third core word and the second core word;
a second judging unit, configured to judge whether the first relevance satisfies a first preset threshold;
a first determining unit, configured to determine the second keyword according to the third core word when the first relevance satisfies a first preset threshold.
Further, the apparatus further comprises:
a tenth obtaining unit, configured to, when the first relevance does not satisfy a first preset threshold, obtain a first attribute according to the third core word;
an eleventh obtaining unit, configured to obtain second classification information according to the first attribute;
a third judging unit, configured to judge whether the second classification information and the first classification information satisfy a second relevance;
a third executing unit, configured to, when the second classification information and the first classification information satisfy the second relevance, take the first database as the target database.
Further, the first relevance is that the third core word and the second core word can be mutually replaced.
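The selection of denoising keywords by the fifth through ninth obtaining units, the second judging unit, and the first determining unit can be sketched as follows. This is an assumption-laden illustration: the source defines the first relevance only as the two core words being mutually replaceable, so the sketch collapses the "first preset threshold" into a yes/no substitutability test against a hypothetical table of substitution groups. The names `SUBSTITUTION_GROUPS`, `mutually_replaceable`, and `select_denoising_keywords` are not from the source.

```python
# Hypothetical substitution groups: words in the same group are treated as
# interchangeable. A real system might use a thesaurus or embedding similarity.
SUBSTITUTION_GROUPS = [
    {"kettle", "teapot", "boiler"},
    {"database", "data store"},
]


def mutually_replaceable(word_a: str, word_b: str) -> bool:
    """First relevance, assumed here to be a boolean substitutability test."""
    a, b = word_a.lower(), word_b.lower()
    if a == b:
        return True
    return any(a in group and b in group for group in SUBSTITUTION_GROUPS)


def select_denoising_keywords(
    third_core_words: list[str], second_core_word: str
) -> list[str]:
    """Keep the third core words that can replace the denoising target word.

    Each returned word is a candidate second keyword (denoising information).
    """
    return [w for w in third_core_words if mutually_replaceable(w, second_core_word)]


print(select_denoising_keywords(["Teapot", "database", "index"], "kettle"))
# ['Teapot']
```

Core words that fail this test are not discarded outright; in the described method they go on to the attribute-based classification check handled by the tenth and eleventh obtaining units.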
The variations and specific examples of the method for verifying the denoising result of retrieved data in the first embodiment (fig. 1) also apply to the apparatus of the present embodiment. From the foregoing detailed description of the method, those skilled in the art can clearly understand how the apparatus of the present embodiment is implemented, so for brevity the details are not repeated here.
Example three
Based on the same inventive concept as the method for verifying the denoising result of retrieved data in the foregoing embodiments, the present invention further provides an apparatus for verifying the denoising result of retrieved data, as shown in fig. 3, including a memory 304, a processor 302, and a computer program stored on the memory 304 and executable on the processor 302, wherein the processor 302, when executing the program, implements the steps of any one of the foregoing methods for verifying the denoising result of retrieved data.
In fig. 3, the bus architecture is represented by bus 300. Bus 300 may include any number of interconnected buses and bridges, and it links together various circuits, including one or more processors, represented by processor 302, and memory, represented by memory 304. The bus 300 may also link together various other circuits such as peripherals, voltage regulators, and power management circuits, which are well known in the art and therefore are not described further herein. A bus interface 306 provides an interface between the bus 300 and the receiver 301 and transmitter 303. The receiver 301 and the transmitter 303 may be the same element, i.e., a transceiver, providing a means for communicating with various other apparatus over a transmission medium. The processor 302 is responsible for managing the bus 300 and general processing, and the memory 304 may be used for storing data used by the processor 302 in performing operations.
Example four
Based on the same inventive concept as the method for verifying the denoising result of retrieved data in the foregoing embodiments, the present invention further provides a computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements the following steps: obtaining a first patent document having a first keyword; obtaining a first database from a patent retrieval database according to the first keyword, wherein a patent document in the first database comprises the first keyword; obtaining a first denoising instruction, wherein the first denoising instruction is that a user deletes a second database from the first database, a patent document in the second database comprises a second keyword, and the second keyword is denoising information; obtaining a second patent document from the second database, the second patent document having first classification information; judging whether the first classification information meets a first preset condition; and when the first classification information meets the first preset condition, obtaining a first recovery instruction, wherein the first recovery instruction is used for recovering the second patent document into the first database.
In a specific implementation, when the program is executed by a processor, any method step in the first embodiment may be further implemented.
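The stored program's steps can be sketched end to end as follows. This is a minimal sketch under assumptions, not the actual program: documents are modeled as dictionaries with `id`, `text`, and `classification` keys, keyword matching is a case-insensitive substring test, and the first preset condition is passed in as a caller-supplied predicate. The function names and the sample corpus (including its classification strings) are illustrative only.

```python
from typing import Callable


def contains(doc: dict, keyword: str) -> bool:
    """Assumed matching rule: case-insensitive substring search in the text."""
    return keyword.lower() in doc["text"].lower()


def build_target_database(
    corpus: list[dict],
    first_keyword: str,
    second_keyword: str,
    meets_first_condition: Callable[[str], bool],
) -> list[dict]:
    # First database: documents retrieved by the first keyword.
    first_db = [d for d in corpus if contains(d, first_keyword)]
    # Second database: documents the denoising instruction deletes.
    second_db = [d for d in first_db if contains(d, second_keyword)]
    remaining = [d for d in first_db if not contains(d, second_keyword)]
    # Verification: recover deleted documents whose classification passes.
    recovered = [d for d in second_db if meets_first_condition(d["classification"])]
    return remaining + recovered


# Illustrative corpus; the classification strings are sample values.
corpus = [
    {"id": "P1", "text": "electric kettle with heating base", "classification": "A47J27/21"},
    {"id": "P2", "text": "kettle heating toy for children", "classification": "A63H33/00"},
    {"id": "P3", "text": "toy-shaped electric kettle", "classification": "A47J27/21"},
    {"id": "P4", "text": "database index structure", "classification": "G06F16/22"},
]

target = build_target_database(
    corpus, "kettle", "toy", lambda code: code.startswith("A47J")
)
print([d["id"] for d in target])  # ['P1', 'P3']
```

Here P3 is deleted by the denoising keyword "toy" but recovered because its classification matches the reference subclass, whereas P2 stays deleted and P4 never enters the first database.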
One or more technical solutions in the embodiments of the present application have at least one or more of the following technical effects:
According to the method and apparatus for verifying the denoising result of retrieved data provided by the embodiments of the invention, a first patent document having a first keyword is obtained; a first database is obtained from a patent retrieval database according to the first keyword, wherein a patent document in the first database comprises the first keyword; a first denoising instruction is obtained, wherein the first denoising instruction is that a user deletes a second database from the first database, a patent document in the second database comprises a second keyword, and the second keyword is denoising information; a second patent document having first classification information is obtained from the second database; whether the first classification information meets a first preset condition is judged; and when the first classification information meets the first preset condition, a first recovery instruction is obtained, wherein the first recovery instruction is used for recovering the second patent document into the first database. The denoising result is thus verified by analyzing and comparing classification numbers, which effectively avoids erroneous operations in the denoising processing, ensures the accuracy of the retrieved target database, and provides a sound basis for subsequent patent analysis; the process is convenient and suitable for a wide range of users.
This solves the technical problems of the prior art, in which a denoised retrieval database lacks a verification process, or the verification must be performed manually by a professional, making the process cumbersome, limiting its range of application, leaving the accuracy of the verification result unguaranteed, and resulting in an over-denoised target database or an incomplete target patent database.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.
Claims (10)
1. A method for verifying a denoising result of retrieved data is characterized by comprising the following steps:
obtaining a first patent document having a first keyword;
obtaining a first database from a patent retrieval database according to a first keyword, wherein a patent document in the first database comprises the first keyword;
obtaining a first denoising instruction, wherein the first denoising instruction is that a user deletes a second database from the first database, a patent document in the second database comprises a second keyword, and the second keyword is denoising information;
obtaining a second patent document from the second database, the second patent document having first classification information;
judging whether the first classification information meets a first preset condition or not;
when the first classification information meets the first preset condition, obtaining a first recovery instruction, wherein the first recovery instruction is used for recovering the second patent document into the first database.
2. The method of claim 1, wherein after said judging whether the first classification information meets the first preset condition, the method comprises:
when the first classification information does not meet the first preset condition, taking the first database as a target database.
3. The method of claim 1, wherein prior to obtaining the first denoising instruction, the method comprises:
obtaining a first core word according to the first patent document;
obtaining a second core word according to the first core word, wherein the second core word is denoising target information;
obtaining a third patent document according to the first database;
obtaining a third core word according to the third patent document;
obtaining a first relevance according to the third core word and the second core word;
judging whether the first relevance meets a first preset threshold value or not;
and when the first relevance meets a first preset threshold value, determining the second keyword according to the third core word.
4. The method of claim 3, wherein after said judging whether the first relevance meets the first preset threshold value, the method comprises:
when the first relevance does not meet a first preset threshold value, obtaining a first attribute according to the third core word;
obtaining second classification information according to the first attribute;
judging whether the second classification information and the first classification information meet second relevance;
and when the second classification information and the first classification information meet a second relevance, taking the first database as the target database.
5. The method of claim 3, wherein the first relevance is that the third core word and the second core word are substitutable for each other.
6. An apparatus for verifying a denoising result of retrieved data, the apparatus comprising:
a first obtaining unit configured to obtain a first patent document having a first keyword;
a second obtaining unit, configured to obtain a first database from a patent retrieval database according to a first keyword, where a patent document in the first database includes the first keyword;
a third obtaining unit, configured to obtain a first denoising instruction, where the first denoising instruction is that a user deletes a second database from the first database, where a patent document in the second database includes a second keyword, and the second keyword is denoising information;
a fourth obtaining unit configured to obtain a second patent document from the second database, the second patent document having first classification information;
a first judging unit, configured to judge whether the first classification information satisfies a first predetermined condition;
a first execution unit, configured to obtain a first recovery instruction when the first classification information satisfies the first predetermined condition, where the first recovery instruction is used to recover the second patent document into the first database.
7. The apparatus of claim 6, wherein the apparatus further comprises:
a second execution unit configured to take the first database as a target database when the first classification information does not satisfy the first predetermined condition.
8. The apparatus of claim 6, wherein the apparatus further comprises:
a fifth obtaining unit configured to obtain a first core word according to the first patent document;
a sixth obtaining unit, configured to obtain a second core word according to the first core word, where the second core word is denoising target information;
a seventh obtaining unit configured to obtain a third patent document from the first database;
an eighth obtaining unit, configured to obtain a third core word according to the third patent document;
a ninth obtaining unit, configured to obtain a first relevance according to the third core word and the second core word;
a second judging unit, configured to judge whether the first relevance satisfies a first preset threshold;
a first determining unit, configured to determine the second keyword according to the third core word when the first relevance satisfies a first preset threshold.
9. An apparatus for verifying a denoising result of retrieved data, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the method according to any one of claims 1 to 5 when executing the program.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010116110.XA CN111274229A (en) | 2020-02-25 | 2020-02-25 | Method and device for verifying denoising result of retrieved data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010116110.XA CN111274229A (en) | 2020-02-25 | 2020-02-25 | Method and device for verifying denoising result of retrieved data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111274229A true CN111274229A (en) | 2020-06-12 |
Family
ID=71002409
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010116110.XA Withdrawn CN111274229A (en) | 2020-02-25 | 2020-02-25 | Method and device for verifying denoising result of retrieved data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111274229A (en) |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112579155B (en) | Code similarity detection method and device and storage medium | |
CN110442847B (en) | Code similarity detection method and device based on code warehouse process management | |
US20170024374A1 (en) | Enhanced Document Input Parsing | |
CN115630640B (en) | Intelligent writing method, device, equipment and medium | |
Ling et al. | Knowledge representation model for crime analysis | |
CN112000929A (en) | Cross-platform data analysis method, system, equipment and readable storage medium | |
Healy et al. | Characterization of graphs using degree cores | |
CN114090784A (en) | Entity label clustering method and device for knowledge graph in material field | |
US11880403B2 (en) | Document data management via graph cliques for layout understanding | |
Sreenivasula Reddy et al. | Intuitionistic fuzzy rough sets and fruit fly algorithm for association rule mining | |
Du et al. | SemCluster: a semi-supervised clustering tool for crowdsourced test reports with deep image understanding | |
CN117492825A (en) | Method for generating stability annotation based on context learning and large language model | |
CN111274229A (en) | Method and device for verifying denoising result of retrieved data | |
CN111274364A (en) | Automatic denoising method and device based on keyword retrieval data | |
Shao et al. | An improved approach to the recovery of traceability links between requirement documents and source codes based on latent semantic indexing | |
Shahzad et al. | Generating process model collection with diverse label and structural features | |
CN111353023A (en) | Target database optimization method and device based on keyword retrieval | |
CN101847097B (en) | Method for maintaining tracking relationship between requirement item and work product | |
CN111368062A (en) | Verification method and device for denoising patent retrieval database | |
Udagawa | Source code retrieval using sequence based similarity | |
CN111309895A (en) | Automatic denoising method and device for retrieval data | |
CN111324726A (en) | Method and device for automatically drying patent database | |
CN111339243A (en) | Method and device for denoising and checking retrieval data based on competitive product information | |
CN111339123A (en) | Double-retrieval patent database establishing method and device | |
CN112579841B (en) | Multi-mode database establishment method, retrieval method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
WW01 | Invention patent application withdrawn after publication | Application publication date: 20200612 |