CN111339243A - Method and device for denoising and checking retrieval data based on competitive product information - Google Patents

Publication number
CN111339243A
Authority
CN
China
Prior art keywords
database
obtaining
keyword
denoising
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202010131705.2A
Other languages
Chinese (zh)
Inventor
邓梅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Rainpat Data Service Co ltd
Original Assignee
Jiangsu Rainpat Data Service Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Rainpat Data Service Co ltd filed Critical Jiangsu Rainpat Data Service Co ltd
Priority to CN202010131705.2A priority Critical patent/CN111339243A/en
Publication of CN111339243A publication Critical patent/CN111339243A/en
Withdrawn legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31Indexing; Data structures therefor; Storage structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10Services
    • G06Q50/18Legal services
    • G06Q50/184Intellectual property management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Tourism & Hospitality (AREA)
  • Technology Law (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Operations Research (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method and a device for denoising and checking retrieval data based on competitive product information. A first patent document is obtained, and a first patent database is obtained from a patent retrieval database according to a first keyword and first classification number information of that document; a first denoising instruction is obtained to delete a second patent database from the first patent database, wherein the second patent database comprises second applicant information; a second patent document having a second keyword is obtained from the second patent database; a first relevance is obtained according to the first keyword and the second keyword; and whether the first relevance meets a first preset condition is judged, and if so, a first recovery instruction is obtained to recover the second patent document, yielding a third patent database as the target database. This solves the technical problems that a retrieval database after denoising processing lacks a validation process and that the target database is excessively denoised or its retrieval results are inaccurate. It achieves the technical effects of using keyword analysis to verify excessive denoising performed according to the competitive product information and of improving the reliability of the target database.

Description

Method and device for denoising and checking retrieval data based on competitive product information
Technical Field
The invention relates to the technical field of data processing, in particular to a method and a device for denoising and checking retrieval data based on competitive product information.
Background
With the continuous development and improvement of social systems, the number of patent documents has increased rapidly, and the protection of enterprises' patent rights has become increasingly important in every country. For an enterprise, accurately retrieving and analyzing the information it needs from a large body of patent documents is very important to its overall development. In the era of the knowledge economy, intellectual property rights are regarded as strategic resources that provide core competitiveness for an enterprise or even a country, and their importance is unprecedented. Patents contain a large amount of technical information; by searching and analyzing related patents, a user can learn the development trend of the current technical field, which gives direction to later research and development and helps avoid infringement risks. Patent literature retrieval is the basic work by which an enterprise gains a comprehensive understanding of the prior art, raises its research and development starting point, and avoids intellectual property risks. Because the original patent data disclosed on the internet is incomplete, obscurely worded, long, and difficult to understand, enterprises find searching difficult unless they have mastered professional search methods and skills.
However, the applicant of the present invention finds that the prior art has at least the following technical problems:
In the prior art, a retrieval database after denoising processing lacks a validation process, so the target database may be excessively denoised or the retrieval result of the target patent database may be inaccurate.
Disclosure of Invention
The embodiment of the invention provides a method and a device for denoising and checking retrieval data based on competitive product information, and solves the technical problems that in the prior art, a retrieval database after denoising processing is lack of a validation process, and excessive denoising of a target database or inaccurate retrieval result of the target patent database exist.
In view of the above problems, the embodiments of the present application provide a method and an apparatus for denoising and checking retrieval data based on competitive product information.
In a first aspect, the invention provides a method for denoising and checking retrieval data based on competitive product information, which comprises the following steps: obtaining a first patent document having a first keyword and first classification number information; obtaining a first patent database from a patent retrieval database according to the first keyword and the first classification number information; obtaining a first denoising instruction, wherein the first denoising instruction is that a user deletes a second patent database from the first patent database, a patent document in the second patent database comprises second applicant information, and the second applicant information is denoising information; obtaining a second patent document from the second patent database, the second patent document having a second keyword; obtaining a first relevance according to the first keyword and the second keyword; and judging whether the first relevance meets a first preset condition, and when the first relevance meets the first preset condition, obtaining a first recovery instruction, wherein the first recovery instruction is used for recovering the second patent document into the first patent database to obtain a third patent database, and the third patent database is a target database.
Preferably, after determining whether the first relevance satisfies a first predetermined condition, the method includes: when the first relevance does not satisfy the first predetermined condition, the first patent database is taken as the target database.
Preferably, before obtaining the first patent database from the patent retrieval database according to the first keyword and the first classification number information, the method includes: obtaining a first core word according to the first patent document; obtaining a fourth patent database from the patent retrieval database according to the first core word; obtaining the first classification number information according to the first patent document; retrieving a fifth patent database from the fourth patent database according to the first classification number information; obtaining a first database proportion according to the fourth patent database and the fifth patent database; judging whether the first database proportion meets a second preset condition or not; and when the first database proportion meets the second preset condition, determining that the first core word is the first keyword.
Preferably, the obtaining the first denoising instruction includes: obtaining the first applicant information according to the first patent document; searching the competitive product information from a competitive product database according to the first applicant information; determining second applicant information according to the competitive product information; and obtaining the first denoising instruction according to the second applicant information.
Preferably, the obtaining a first relevance according to the first keyword and the second keyword includes: obtaining a first attribute according to the first keyword; obtaining a second attribute according to the second keyword; and obtaining the first relevance according to the first attribute and the second attribute.
In a second aspect, the present invention provides a device for denoising and checking retrieval data based on competitive product information, the device comprising:
a first obtaining unit configured to obtain a first patent document having a first keyword and first classification number information;
a second obtaining unit, configured to obtain a first patent database from a patent retrieval database according to the first keyword and the first classification number information;
a third obtaining unit, configured to obtain a first denoising instruction, where the first denoising instruction is that a user deletes a second patent database from the first patent database, where a patent document in the second patent database includes second applicant information, and the second applicant information is denoising information;
a fourth obtaining unit configured to obtain a second patent document from the second patent database, the second patent document having a second keyword;
a fifth obtaining unit, configured to obtain a first relevance according to the first keyword and the second keyword;
a first execution unit, configured to determine whether the first relevance meets a first predetermined condition, and obtain a first recovery instruction when the first relevance meets the first predetermined condition, where the first recovery instruction is used to recover the second patent document to the first patent database to obtain a third patent database, and the third patent database is a target database.
Preferably, the apparatus further comprises:
a second execution unit configured to take the first patent database as the target database when the first relevance does not satisfy the first predetermined condition.
Preferably, the apparatus further comprises:
a sixth obtaining unit, configured to obtain a first core word according to the first patent document;
a seventh obtaining unit, configured to obtain a fourth patent database from the patent search database according to the first core word;
an eighth obtaining unit configured to obtain the first classification number information according to the first patent document;
a ninth obtaining unit, configured to retrieve a fifth patent database from the fourth patent database according to the first classification number information;
a tenth obtaining unit, configured to obtain a first database proportion according to the fourth patent database and the fifth patent database;
a first judging unit, configured to judge whether the first database proportion satisfies a second predetermined condition;
a first determining unit, configured to determine that the first core word is the first keyword when the first database proportion satisfies the second predetermined condition.
Preferably, the apparatus further comprises:
an eleventh obtaining unit configured to obtain the first applicant information according to the first patent document;
a third execution unit, configured to retrieve the competitive product information from the competitive product database according to the first applicant information;
a second determination unit, configured to determine second applicant information according to the competitive product information;
a twelfth obtaining unit, configured to obtain the first denoising instruction according to the second applicant information.
Preferably, the apparatus further comprises:
a thirteenth obtaining unit, configured to obtain a first attribute according to the first keyword;
a fourteenth obtaining unit, configured to obtain a second attribute according to the second keyword;
a fifteenth obtaining unit, configured to obtain the first relevance according to the first attribute and the second attribute.
In a third aspect, the present invention provides a device for denoising and checking retrieval data based on competitive product information, including a memory, a processor, and a computer program stored on the memory and operable on the processor, where the processor implements the steps of any one of the above methods when executing the program.
In a fourth aspect, the invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of any of the methods described above.
One or more technical solutions in the embodiments of the present application have at least one or more of the following technical effects:
the embodiment of the invention provides a method and a device for denoising and checking retrieval data based on competitive product information, which are characterized in that a first patent document is obtained, and the first patent document has a first keyword and first classification number information; obtaining a first patent database from a patent retrieval database according to the first keyword and the first classification number information; obtaining a first denoising instruction, wherein the first denoising instruction is that a user deletes a second patent database from the first patent database, a patent document in the second patent database comprises second applicant information, and the second applicant information is denoising information; obtaining a second patent document from the second patent database, the second patent document having a second keyword; obtaining a first relevance according to the first keyword and the second keyword; and judging whether the first relevance meets a first preset condition, and when the first relevance meets the first preset condition, obtaining a first recovery instruction, wherein the first recovery instruction is used for recovering the second patent document into the first patent database to obtain a third patent database, and the third patent database is a target database. The method achieves the technical effects of verifying excessive denoising or denoising errors occurring according to the competitive product information through analysis processing of the keywords, improving the reliability of the target retrieval database, avoiding misoperation on patent documents meeting requirements, providing favorable support for subsequent patent analysis, and being simple and convenient in process. 
Therefore, the technical problems that in the prior art, a retrieval database after denoising processing is lack of a validation process, and excessive denoising of a target database or inaccurate retrieval result of the target patent database exists are solved.
The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.
Drawings
Fig. 1 is a schematic flow chart of a method for denoising and checking retrieval data based on competitive product information in an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a device for denoising and checking retrieval data based on competitive product information in an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of another device for denoising and checking retrieval data based on competitive product information in an embodiment of the present invention.
Description of reference numerals: a first obtaining unit 11, a second obtaining unit 12, a third obtaining unit 13, a fourth obtaining unit 14, a fifth obtaining unit 15, a first executing unit 16, a bus 300, a receiver 301, a processor 302, a transmitter 303, a memory 304, and a bus interface 306.
Detailed Description
The embodiment of the invention provides a method and a device for denoising and checking retrieval data based on competitive product information, which are used for solving the technical problems that in the prior art, a retrieval database after denoising processing is lack of a validation process, and excessive denoising of a target database or inaccurate retrieval result of the target patent database exists.
The technical scheme provided by the invention has the following general idea:
obtaining a first patent document having a first keyword and first classification number information; obtaining a first patent database from a patent retrieval database according to the first keyword and the first classification number information; obtaining a first denoising instruction, wherein the first denoising instruction is that a user deletes a second patent database from the first patent database, a patent document in the second patent database comprises second applicant information, and the second applicant information is denoising information; obtaining a second patent document from the second patent database, the second patent document having a second keyword; obtaining a first relevance according to the first keyword and the second keyword; and judging whether the first relevance meets a first preset condition, and when the first relevance meets the first preset condition, obtaining a first recovery instruction, wherein the first recovery instruction is used for recovering the second patent document into the first patent database to obtain a third patent database, and the third patent database is a target database. The method achieves the technical effects of verifying excessive denoising or denoising errors occurring according to the competitive product information through analysis processing of the keywords, improving the reliability of the target retrieval database, avoiding misoperation on patent documents meeting requirements, providing favorable support for subsequent patent analysis, and being simple and convenient in process.
The technical solutions of the present invention are described in detail below with reference to the drawings and specific embodiments. It should be understood that the specific features in the embodiments and examples of the present invention are detailed illustrations of the technical solutions of the present application rather than limitations on them, and that the technical features in the embodiments and examples of the present application may be combined with each other where they do not conflict.
The term "and/or" herein merely describes an association between associated objects and means that three relationships may exist. For example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" herein generally indicates that the objects before and after it are in an "or" relationship.
Example one
Fig. 1 is a schematic flow chart of a method for denoising and checking retrieval data based on competitive product information in an embodiment of the present invention. As shown in Fig. 1, an embodiment of the present invention provides a method for denoising and checking retrieval data based on competitive product information, where the method includes:
Step 110: Obtaining a first patent document having a first keyword and first classification number information.
Step 120: Obtaining a first patent database from a patent retrieval database according to the first keyword and the first classification number information.
Further, before obtaining the first patent database from the patent retrieval database according to the first keyword and the first classification number information, the method includes: obtaining a first core word according to the first patent document; obtaining a fourth patent database from the patent retrieval database according to the first core word; obtaining the first classification number information according to the first patent document; retrieving a fifth patent database from the fourth patent database according to the first classification number information; obtaining a first database proportion according to the fourth patent database and the fifth patent database; judging whether the first database proportion meets a second preset condition or not; and when the first database proportion meets the second preset condition, determining that the first core word is the first keyword.
Specifically, the first patent document is the search target, and the keywords to be searched are determined by analyzing its title, content, classification number information, and the like. To determine the search keyword, the specific content of the first patent document is analyzed to obtain its core words. A core word may be the subject matter, a main inventive point, or an element principally protected in the claims, and there may be one core word or several. Once determined, the core word or words are searched on a search platform or in a search database; several core words may be searched together or separately. The search results of each core word are then analyzed to determine the search keyword. This analysis is mainly a verification against the classification number information of the first patent document: the classification number information of the patent documents in the core word's search results is analyzed and summarized, and the core word with the more accurate search results is determined to be the search keyword. For example, if the database retrieved by the first core word contains N patent documents, the classification number information of the N patent documents is extracted and compared with that of the first patent document. If most of the classification number information in the N patent documents is the same as or similar to that of the first patent document, the core word's search results are confirmed to be relatively accurate; otherwise, they are not accurate enough. The specific quantity is set according to the actual situation but is at least half: that is, when the classification number information of at least half of the N patent documents is the same as or similar to that of the first patent document, the core word is determined to meet the requirement. When there are several core words, each can be judged in turn to determine the final search keyword, which may be a single core word or a combination of several. Once the first keyword is determined as the search keyword, the first keyword and the first classification number information are combined, and the results retrieved from the search database are taken as the first patent database.
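The core-word check described above can be sketched in Python as follows. This is only an illustrative sketch, not the patented implementation: the `Patent` record, the substring match used as a "search", the rule that two classification numbers are "similar" when their first four characters agree, and the 0.5 threshold are all assumptions introduced for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patent:
    """Hypothetical minimal record; real patent data carries far more fields."""
    pub_no: str
    text: str
    ipc: str  # classification number, e.g. "G06F16/31"


def same_or_similar(ipc_a, ipc_b, prefix_len=4):
    # Assumption: "similar" means sharing the same leading characters
    # (for example the subclass "G06F"). The source does not define this.
    return ipc_a[:prefix_len] == ipc_b[:prefix_len]


def database_proportion(core_word, target_ipc, corpus):
    # "Fourth patent database": everything the core word retrieves.
    fourth_db = [p for p in corpus if core_word in p.text]
    if not fourth_db:
        return 0.0
    # "Fifth patent database": the subset whose classification matches.
    fifth_db = [p for p in fourth_db if same_or_similar(p.ipc, target_ipc)]
    return len(fifth_db) / len(fourth_db)


def select_keywords(core_words, target_ipc, corpus, threshold=0.5):
    # A core word becomes a search keyword when its proportion reaches the
    # (assumed) threshold, i.e. the "second predetermined condition".
    return [w for w in core_words
            if database_proportion(w, target_ipc, corpus) >= threshold]


corpus = [
    Patent("P1", "denoising of patent retrieval results", "G06F16/31"),
    Patent("P2", "denoising a query result set", "G06F16/332"),
    Patent("P3", "image denoising filter", "G06T5/00"),
    Patent("P4", "retrieval of chemical structures", "C07D1/00"),
    Patent("P5", "retrieval robot arm", "B25J9/00"),
]
keywords = select_keywords(["denoising", "retrieval"], "G06F16/31", corpus)
```

In this toy corpus, two of the three documents retrieved by "denoising" share the target subclass, so that core word is kept, while "retrieval" matches the subclass in only one of its three documents and is rejected.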
Step 130: obtaining a first denoising instruction, wherein the first denoising instruction is that a user deletes a second patent database from the first patent database, a patent document in the second patent database includes second applicant information, and the second applicant information is denoising information.
Further, the obtaining the first denoising instruction includes: obtaining the first applicant information according to the first patent document; searching the competitive product information from a competitive product database according to the first applicant information; determining second applicant information according to the competitive product information; and obtaining the first denoising instruction according to the second applicant information.
Specifically, to improve the accuracy of the search results, the embodiment of the present invention also provides an automatic denoising function. Denoising is performed based on the competitive product information of the first patent document, so that the search results are free from interference by competitive product information. The first applicant information is determined from the basic information of the first patent document and is used to search a competitive product information database, such as a competitive product information platform for enterprise investigation, to obtain the competitive product information related to the first applicant. Competitive products are products that compete with one another. The second applicant information is determined from the competitive product information and is designated as the denoising information. The first patent database is then searched with the second applicant information, the patent documents related to the second applicant are obtained as the second patent database, and the second patent database is deleted from the first patent database to obtain the denoised first patent database.
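A minimal Python sketch of this applicant-based denoising step is given below. The dictionary-based competitive product database, the document layout, and the company names are hypothetical placeholders; the source does not specify any data format.

```python
def competitor_applicants(first_applicant, competitor_db):
    # Look up the competitors ("second applicants") of the first applicant.
    # competitor_db is an assumed mapping: applicant -> list of competitors.
    return set(competitor_db.get(first_applicant, ()))


def denoise(first_db, noise_applicants):
    # Split the first patent database into the documents to keep and the
    # "second patent database" of documents filed by competitor applicants.
    second_db = [doc for doc in first_db
                 if doc["applicant"] in noise_applicants]
    denoised_db = [doc for doc in first_db
                   if doc["applicant"] not in noise_applicants]
    return denoised_db, second_db


# Hypothetical example data.
competitor_db = {"Acme": ["Globex", "Initech"]}
first_db = [
    {"pub_no": "P1", "applicant": "Acme"},
    {"pub_no": "P2", "applicant": "Globex"},
    {"pub_no": "P3", "applicant": "Umbrella"},
    {"pub_no": "P4", "applicant": "Initech"},
]
noise = competitor_applicants("Acme", competitor_db)
denoised_db, second_db = denoise(first_db, noise)
```

Returning the removed documents alongside the kept ones, rather than discarding them, keeps the second patent database available for the later keyword check.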
Step 140: a second patent document is obtained from the second patent database, the second patent document having a second keyword.
Step 150: Obtaining a first relevance according to the first keyword and the second keyword.
Further, the obtaining a first relevance according to the first keyword and the second keyword includes: obtaining a first attribute according to the first keyword; obtaining a second attribute according to the second keyword; and obtaining the first relevance according to the first attribute and the second attribute.
Specifically, because the denoising process of the embodiment of the present invention is based on the competitive product information and uses applicant information, denoising errors or excessive denoising may occur. The embodiment therefore adds a keyword-based verification function: the patent documents in the second patent database removed by denoising are checked again using their keywords. The second keyword contained in a patent document of the second patent database is compared with the first keyword determined from the first patent document to determine the relevance between them. The relevance is determined by analyzing aspects of the keywords such as their attributes, word senses, application environments, and functions. The attributes of the first keyword and of the second keyword can each be scored, and the degree of relevance is obtained from the ratios between these parameters and their weights.
Step 160: Judging whether the first relevance meets a first preset condition, and when the first relevance meets the first preset condition, obtaining a first recovery instruction, wherein the first recovery instruction is used for recovering the second patent document into the first patent database to obtain a third patent database, and the third patent database is a target database.
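The source does not give a formula for combining attribute scores, so the following Python sketch shows only one plausible reading of "ratio and weighting": each attribute contributes the ratio of the smaller score to the larger, and the ratios are averaged with weights. The attribute names, score scale, and weights are assumptions made for illustration.

```python
def relevance(first_attrs, second_attrs, weights):
    """Weighted average of per-attribute score ratios, in the range [0, 1].

    first_attrs / second_attrs: attribute name -> non-negative score.
    weights: attribute name -> non-negative weight (assumed values).
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    score = 0.0
    for name, weight in weights.items():
        a = first_attrs.get(name, 0.0)
        b = second_attrs.get(name, 0.0)
        larger = max(a, b)
        # Ratio is 1.0 for identical scores and approaches 0.0 as they diverge.
        ratio = min(a, b) / larger if larger > 0 else 0.0
        score += weight * ratio
    return score / total_weight


# Hypothetical scores for two keywords on two aspects.
first_attrs = {"word_sense": 4.0, "function": 2.0}
second_attrs = {"word_sense": 2.0, "function": 2.0}
weights = {"word_sense": 1.0, "function": 1.0}
first_relevance = relevance(first_attrs, second_attrs, weights)
```

With these example values the word-sense ratio is 0.5 and the function ratio is 1.0, so the equally weighted relevance is 0.75.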
Specifically, it is judged whether the first relevance between the second keyword and the first keyword meets the validity criterion, that is, the first predetermined condition. The predetermined condition is generally that the first keyword and the second keyword are highly relevant: their attributes in aspects such as field and function are close or identical, and the two keywords can reasonably be substituted for each other. When the predetermined condition is met, the denoising is determined to have been erroneous and recovery is needed. A first recovery instruction is then obtained, the deleted patent document associated with the second keyword is recovered and added back to the first patent database, and the resulting new patent database is the third patent database, which serves as the target database. Through this analysis of keywords, excessive denoising or denoising errors arising from the competitive product information are verified, the reliability of the target retrieval database is improved, mishandling of patent documents that meet the requirements is avoided, and favorable support is provided for subsequent patent analysis; the process is simple and suitable for operation by a wide range of personnel. This solves the technical problems that, in the prior art, a retrieval database after denoising processing lacks a validation process and the target database is excessively denoised or the retrieval result of the target patent database is inaccurate.
Further, after the judging whether the first relevance meets the first predetermined condition, the method includes: when the first relevance does not meet the first predetermined condition, taking the first patent database as the target database.
Specifically, when the relevance analysis shows that the second keyword and the first keyword do not meet the predetermined condition, the relevance between them is insufficient, so the original denoising result is kept; that is, the denoised first patent database is used as the target database.
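The two branches of step 160 can be sketched as follows. This is a minimal illustration, not the patented implementation: the `PatentDoc` type, the `relevance` callable, and the numeric `threshold` standing in for the first predetermined condition are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class PatentDoc:
    """Hypothetical record: a document identifier and its main keyword."""
    doc_id: str
    keyword: str


def verify_denoising(
    first_db: Iterable[PatentDoc],
    removed_docs: Iterable[PatentDoc],
    first_keyword: str,
    relevance: Callable[[str, str], float],
    threshold: float = 0.8,  # assumed stand-in for the first predetermined condition
) -> List[PatentDoc]:
    """Return the target database after checking the denoising result.

    Each removed document whose keyword is sufficiently relevant to the
    first keyword is restored (third patent database). If nothing is
    restored, the denoised first patent database is returned unchanged.
    """
    target = list(first_db)
    for doc in removed_docs:
        if relevance(first_keyword, doc.keyword) >= threshold:
            target.append(doc)  # first recovery instruction
    return target
```

With a toy relevance table in which "mopping robot" scores 0.9 against "sweeping robot", a removed document keyed on "mopping robot" would be restored, while one keyed on "coffee machine" would stay removed.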
Example two
Based on the same inventive concept as the method for denoising and verifying retrieval data based on competitive product information in the foregoing embodiment, the present invention further provides a device for denoising and verifying retrieval data based on competitive product information. As shown in fig. 2, the device includes:
a first obtaining unit 11, the first obtaining unit 11 being configured to obtain a first patent document, the first patent document having a first keyword and first classification number information;
a second obtaining unit 12, wherein the second obtaining unit 12 is configured to obtain a first patent database from a patent retrieval database according to the first keyword and the first classification number information;
a third obtaining unit 13, where the third obtaining unit 13 is configured to obtain a first denoising instruction, where the first denoising instruction is that a user deletes a second patent database from the first patent database, where a patent document in the second patent database includes second applicant information, and the second applicant information is denoising information;
a fourth obtaining unit 14, wherein the fourth obtaining unit 14 is configured to obtain a second patent document from the second patent database, and the second patent document has a second keyword;
a fifth obtaining unit 15, where the fifth obtaining unit 15 is configured to obtain a first relevance according to the first keyword and the second keyword;
a first execution unit 16, where the first execution unit 16 is configured to determine whether the first relevance meets a first predetermined condition, and obtain a first recovery instruction when the first relevance meets the first predetermined condition, where the first recovery instruction is used to restore the second patent document to the first patent database to obtain a third patent database, and the third patent database is the target database.
Further, the apparatus further comprises:
a second execution unit, configured to take the first patent database as the target database when the first relevance does not satisfy the first predetermined condition.
Further, the apparatus further comprises:
a sixth obtaining unit, configured to obtain a first core word according to the first patent document;
a seventh obtaining unit, configured to obtain a fourth patent database from the patent retrieval database according to the first core word;
an eighth obtaining unit configured to obtain the first classification number information according to the first patent document;
a ninth obtaining unit, configured to retrieve a fifth patent database from the fourth patent database according to the first classification number information;
a tenth obtaining unit, configured to obtain a first database proportion according to the fourth patent database and the fifth patent database;
a first judging unit, configured to judge whether the first database proportion satisfies a second predetermined condition;
a first determining unit, configured to determine that the first core word is the first keyword when the first database proportion satisfies the second predetermined condition.
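The keyword check performed by the sixth through tenth obtaining units and the first judging and determining units can be sketched as below. This is only an illustration under stated assumptions: documents are modeled as dictionaries with hypothetical `text` and `ipc` fields, retrieval by core word is approximated by substring matching, and the second predetermined condition is assumed to be a minimum ratio `min_ratio`, which this section does not specify.

```python
from typing import Dict, List


def database_proportion(
    retrieval_db: List[Dict[str, str]], core_word: str, classification: str
) -> float:
    """Share of core-word hits that also carry the given classification number."""
    # Fourth patent database: documents retrieved by the core word.
    fourth_db = [d for d in retrieval_db if core_word in d["text"]]
    if not fourth_db:
        return 0.0
    # Fifth patent database: the subset with the first classification number.
    fifth_db = [d for d in fourth_db if d["ipc"] == classification]
    return len(fifth_db) / len(fourth_db)


def is_first_keyword(
    retrieval_db: List[Dict[str, str]],
    core_word: str,
    classification: str,
    min_ratio: float = 0.5,  # assumed form of the second predetermined condition
) -> bool:
    """Accept the core word as the first keyword when the proportion is high enough."""
    return database_proportion(retrieval_db, core_word, classification) >= min_ratio
```

The intent of such a check is that a core word whose hits mostly fall outside the document's own classification is probably too generic to serve as the first keyword.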
Further, the apparatus further comprises:
an eleventh obtaining unit configured to obtain the first applicant information according to the first patent document;
a third execution unit, configured to retrieve the competitive product information from a competitive product database according to the first applicant information;
a second determination unit, configured to determine second applicant information according to the competitive product information;
a twelfth obtaining unit, configured to obtain the first denoising instruction according to the second applicant information.
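The chain from first applicant information to the first denoising instruction can be sketched as follows. The section does not state the rule by which second applicant information is derived from competitive product information, so that rule is passed in as a hypothetical `pick_noise` callable; the dictionary layouts are likewise assumptions made for illustration.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple

Doc = Dict[str, str]


def second_applicants_for(
    first_applicant: str,
    competitor_db: Dict[str, List[dict]],
    pick_noise: Callable[[List[dict]], Iterable[str]],
) -> Set[str]:
    """Look up competitive product information and derive the second applicants."""
    info = competitor_db.get(first_applicant, [])
    return set(pick_noise(info))  # pick_noise is a hypothetical, user-supplied rule


def apply_denoising(
    first_db: List[Doc], second_applicants: Set[str]
) -> Tuple[List[Doc], List[Doc]]:
    """Split the first patent database into (kept documents, second patent database)."""
    kept: List[Doc] = []
    second_db: List[Doc] = []
    for doc in first_db:
        (second_db if doc["applicant"] in second_applicants else kept).append(doc)
    return kept, second_db
```

Returning the removed documents as a separate list, rather than discarding them, keeps the second patent database available for the later relevance check.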
Further, the apparatus further comprises:
a thirteenth obtaining unit, configured to obtain a first attribute according to the first keyword;
a fourteenth obtaining unit, configured to obtain a second attribute according to the second keyword;
a fifteenth obtaining unit, configured to obtain the first relevance according to the first attribute and the second attribute.
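One way to turn two keyword attributes into a relevance score is sketched below. The document does not say how the first relevance is computed; Jaccard similarity over attribute sets is a substitute chosen here purely for illustration, and the `attributes` lookup table is a hypothetical input.

```python
from typing import Dict, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def first_relevance(
    first_keyword: str, second_keyword: str, attributes: Dict[str, Set[str]]
) -> float:
    """Relevance of two keywords from their attribute sets (field, function, ...)."""
    first_attr = attributes.get(first_keyword, set())    # first attribute
    second_attr = attributes.get(second_keyword, set())  # second attribute
    return jaccard(first_attr, second_attr)
```

Keywords that share most of their field and function attributes score close to 1, which matches the description of keywords that can reasonably be substituted for each other.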
The variations and specific examples of the method for denoising and verifying retrieval data based on competitive product information in the first embodiment (fig. 1) apply equally to the device of the present embodiment. From the foregoing detailed description of the method, those skilled in the art can clearly understand how the device of the present embodiment is implemented, so the details are not repeated here for brevity.
Example three
Based on the same inventive concept as the method for denoising and verifying the retrieved data based on the competitive product information in the foregoing embodiment, the present invention further provides a device for denoising and verifying the retrieved data based on the competitive product information, as shown in fig. 3, including a memory 304, a processor 302, and a computer program stored in the memory 304 and operable on the processor 302, wherein the processor 302, when executing the program, implements the steps of any one of the methods for denoising and verifying the retrieved data based on the competitive product information.
In fig. 3, a bus architecture is represented by bus 300. Bus 300 may include any number of interconnected buses and bridges, and links together various circuits, including one or more processors, represented by processor 302, and memory, represented by memory 304. The bus 300 may also link together various other circuits such as peripherals, voltage regulators, and power management circuits, which are well known in the art and are therefore not described further here. A bus interface 306 provides an interface between the bus 300 and the receiver 301 and transmitter 303. The receiver 301 and the transmitter 303 may be the same element, i.e., a transceiver, providing a means for communicating with various other apparatus over a transmission medium. The processor 302 is responsible for managing the bus 300 and general processing, and the memory 304 may be used for storing data used by the processor 302 in performing operations.
Example four
Based on the same inventive concept as the method for denoising and checking the search data based on the competitive product information in the foregoing embodiments, the present invention further provides a computer-readable storage medium on which a computer program is stored, which when executed by a processor implements the following steps: obtaining a first patent document having a first keyword and first classification number information; obtaining a first patent database from a patent retrieval database according to the first keyword and the first classification number information; obtaining a first denoising instruction, wherein the first denoising instruction is that a user deletes a second patent database from the first patent database, a patent document in the second patent database comprises second applicant information, and the second applicant information is denoising information; obtaining a second patent document from the second patent database, the second patent document having a second keyword; obtaining a first relevance according to the first keyword and the second keyword; and judging whether the first relevance meets a first preset condition, and when the first relevance meets the first preset condition, obtaining a first recovery instruction, wherein the first recovery instruction is used for recovering the second patent document into the first patent database to obtain a third patent database, and the third patent database is a target database.
In a specific implementation, when the program is executed by a processor, any method step in the first embodiment may be further implemented.
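The stored program's steps can be strung together end to end as in the sketch below. It is a toy model under explicit assumptions: the `Doc` fields, substring matching for keyword retrieval, the precomputed `second_applicants` set, the `relevance` callable, and the `threshold` are all hypothetical and not taken from the patent.

```python
from typing import Callable, List, NamedTuple, Set


class Doc(NamedTuple):
    doc_id: str
    applicant: str
    keyword: str  # the document's own main keyword
    ipc: str      # classification number
    text: str


def build_target_database(
    retrieval_db: List[Doc],
    first_doc: Doc,
    second_applicants: Set[str],
    relevance: Callable[[str, str], float],
    threshold: float = 0.8,  # assumed first predetermined condition
) -> List[Doc]:
    # First patent database: hits on the first keyword and classification number.
    first_db = [
        d for d in retrieval_db
        if first_doc.keyword in d.text and d.ipc == first_doc.ipc
    ]
    # First denoising instruction: remove documents of the second applicants.
    kept = [d for d in first_db if d.applicant not in second_applicants]
    second_db = [d for d in first_db if d.applicant in second_applicants]
    # Check each removed document and restore it when relevance is high enough.
    restored = [
        d for d in second_db
        if relevance(first_doc.keyword, d.keyword) >= threshold
    ]
    # Third patent database if anything was restored, else the denoised first database.
    return kept + restored
```

Keeping the restore step as a separate pass over the second patent database mirrors the order of the steps listed above and leaves the original denoising result intact when nothing qualifies.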
One or more technical solutions in the embodiments of the present application have at least one or more of the following technical effects:
The embodiments of the present invention provide a method and a device for denoising and checking retrieval data based on competitive product information. A first patent document is obtained, the first patent document having a first keyword and first classification number information; a first patent database is obtained from a patent retrieval database according to the first keyword and the first classification number information; a first denoising instruction is obtained, wherein the first denoising instruction is that a user deletes a second patent database from the first patent database, a patent document in the second patent database comprises second applicant information, and the second applicant information is denoising information; a second patent document is obtained from the second patent database, the second patent document having a second keyword; a first relevance is obtained according to the first keyword and the second keyword; and whether the first relevance meets a first predetermined condition is judged, and when it does, a first recovery instruction is obtained, wherein the first recovery instruction is used for restoring the second patent document to the first patent database to obtain a third patent database, the third patent database being the target database. By analyzing the keywords, this checks for excessive or erroneous denoising performed according to the competitive product information, improves the reliability of the target retrieval database, prevents patent documents that meet the requirements from being removed by mistake, supports subsequent patent analysis, and keeps the process simple.
This solves the technical problem in the prior art that a denoised retrieval database lacks a verification process, so that the target database is excessively denoised or the retrieval results of the target patent database are inaccurate.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (10)

1. A retrieval data denoising and checking method based on competitive product information is characterized by comprising the following steps:
obtaining a first patent document having a first keyword and first classification number information;
obtaining a first patent database from a patent retrieval database according to the first keyword and the first classification number information;
obtaining a first denoising instruction, wherein the first denoising instruction is that a user deletes a second patent database from the first patent database, a patent document in the second patent database comprises second applicant information, and the second applicant information is denoising information;
obtaining a second patent document from the second patent database, the second patent document having a second keyword;
obtaining a first relevance according to the first keyword and the second keyword;
and judging whether the first relevance meets a first preset condition, and when the first relevance meets the first preset condition, obtaining a first recovery instruction, wherein the first recovery instruction is used for recovering the second patent document into the first patent database to obtain a third patent database, and the third patent database is a target database.
2. The method of claim 1, wherein after said determining whether said first relevance satisfies a first predetermined condition, the method comprises:
when the first relevance does not satisfy the first predetermined condition, the first patent database is taken as the target database.
3. The method of claim 1, wherein prior to obtaining the first patent database from the patent retrieval database based on the first keyword and the first classification number information, the method comprises:
obtaining a first core word according to the first patent document;
obtaining a fourth patent database from the patent retrieval database according to the first core word;
obtaining the first classification number information according to the first patent document;
retrieving a fifth patent database from the fourth patent database according to the first classification number information;
obtaining a first database proportion according to the fourth patent database and the fifth patent database;
judging whether the first database proportion meets a second preset condition or not;
and when the first database proportion meets the second preset condition, determining that the first core word is the first keyword.
4. The method of claim 1, wherein the obtaining a first denoising instruction comprises:
obtaining the first applicant information according to the first patent document;
searching the competitive product information from a competitive product database according to the first applicant information;
determining second applicant information according to the competitive product information;
and obtaining the first denoising instruction according to the second applicant information.
5. The method of claim 1, wherein obtaining a first relevance based on the first keyword and the second keyword comprises:
obtaining a first attribute according to the first keyword;
obtaining a second attribute according to the second keyword;
and obtaining the first relevance according to the first attribute and the second attribute.
6. A retrieval data denoising and checking device based on competitive product information is characterized by comprising:
a first obtaining unit configured to obtain a first patent document having a first keyword and first classification number information;
a second obtaining unit, configured to obtain a first patent database from a patent retrieval database according to the first keyword and the first classification number information;
a third obtaining unit, configured to obtain a first denoising instruction, where the first denoising instruction is that a user deletes a second patent database from the first patent database, where a patent document in the second patent database includes second applicant information, and the second applicant information is denoising information;
a fourth obtaining unit configured to obtain a second patent document from the second patent database, the second patent document having a second keyword;
a fifth obtaining unit, configured to obtain a first relevance according to the first keyword and the second keyword;
a first execution unit, configured to determine whether the first relevance meets a first predetermined condition, and obtain a first recovery instruction when the first relevance meets the first predetermined condition, where the first recovery instruction is used to recover the second patent document to the first patent database to obtain a third patent database, and the third patent database is a target database.
7. The apparatus of claim 6, wherein the apparatus further comprises:
a second execution unit, configured to take the first patent database as the target database when the first relevance does not satisfy the first predetermined condition.
8. The apparatus of claim 6, wherein the apparatus further comprises:
a sixth obtaining unit, configured to obtain a first core word according to the first patent document;
a seventh obtaining unit, configured to obtain a fourth patent database from the patent retrieval database according to the first core word;
an eighth obtaining unit configured to obtain the first classification number information according to the first patent document;
a ninth obtaining unit, configured to retrieve a fifth patent database from the fourth patent database according to the first classification number information;
a tenth obtaining unit, configured to obtain a first database proportion according to the fourth patent database and the fifth patent database;
a first judging unit, configured to judge whether the first database proportion satisfies a second predetermined condition;
a first determining unit, configured to determine that the first core word is the first keyword when the first database proportion satisfies the second predetermined condition.
9. A device for denoising and verifying retrieved data based on competitive product information, comprising a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor implements the steps of the method according to any one of claims 1-5 when executing the program.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 5.
CN202010131705.2A 2020-02-29 2020-02-29 Method and device for denoising and checking retrieval data based on competitive product information Withdrawn CN111339243A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010131705.2A CN111339243A (en) 2020-02-29 2020-02-29 Method and device for denoising and checking retrieval data based on competitive product information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010131705.2A CN111339243A (en) 2020-02-29 2020-02-29 Method and device for denoising and checking retrieval data based on competitive product information

Publications (1)

Publication Number Publication Date
CN111339243A true CN111339243A (en) 2020-06-26

Family

ID=71185843

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010131705.2A Withdrawn CN111339243A (en) 2020-02-29 2020-02-29 Method and device for denoising and checking retrieval data based on competitive product information

Country Status (1)

Country Link
CN (1) CN111339243A (en)

Similar Documents

Publication Publication Date Title
KR101557294B1 (en) Search results ranking using editing distance and document information
US8285702B2 (en) Content analysis simulator for improving site findability in information retrieval systems
CN108460014A (en) Recognition methods, device, computer equipment and the storage medium of business entity
CN107102993B (en) User appeal analysis method and device
Liu et al. Has this bug been reported?
US20110208715A1 (en) Automatically mining intents of a group of queries
Feng et al. Practical duplicate bug reports detection in a large web-based development community
CN114416667A (en) Method and device for rapidly sharing network disk file, network disk and storage medium
CN112883030A (en) Data collection method and device, computer equipment and storage medium
CN113449168A (en) Method, device and equipment for capturing theme webpage data and storage medium
US8862586B2 (en) Document analysis system
CN109829048B (en) Electronic device, interview assisting method, and computer-readable storage medium
CN110069455B (en) File merging method and device
CN111444312A (en) Method and device for multi-platform combined patent retrieval
CN111339243A (en) Method and device for denoising and checking retrieval data based on competitive product information
CN116501733A (en) Data product generation method, device, equipment and storage medium
CN114416174A (en) Model reconstruction method and device based on metadata, electronic equipment and storage medium
CN111368062A (en) Verification method and device for denoising patent retrieval database
CN111274364A (en) Automatic denoising method and device based on keyword retrieval data
CN111353023A (en) Target database optimization method and device based on keyword retrieval
CN111291094A (en) Retrieval method and device based on keywords and multi-platform classification numbers
CN111274229A (en) Method and device for verifying denoising result of retrieved data
CN111274293A (en) Information processing method and device for automatically denoising according to competitive product information
CN111597294A (en) Information searching method and device
CN111309895A (en) Automatic denoising method and device for retrieval data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20200626