CN115757759A - Drawing method, device, medium and electronic equipment based on patent technical map - Google Patents

Info

Publication number
CN115757759A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202211443316.9A
Other languages
Chinese (zh)
Inventor
洪英文
邹伟东
林振良
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qizhidao Network Technology Co Ltd
Original Assignee
Qizhidao Network Technology Co Ltd
Application filed by Qizhidao Network Technology Co Ltd
Priority to CN202211443316.9A
Publication of CN115757759A

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to a method, a device, a medium and electronic equipment for drawing a patent-based technical map. The method comprises the following steps: acquiring the main classification number and a plurality of technical feature words of a target patent; acquiring a patent collection whose main classification number is consistent with that of the target patent, and screening out from the patent collection a similar-field patent collection associated with the target patent; calculating a first weight for each of the technical feature words through a keyword weight algorithm, and determining the key technical feature words among them according to the first weights; and taking the key technical feature words as nodes of a technical map, and drawing the technical map of the target patent and each patent in the similar-field patent collection. The application takes the relations among the technologies of the patents into account when drawing the technical map, which improves the reference value of the technical map.

Description

Drawing method, device, medium and electronic equipment based on patent technical map
Technical Field
The application relates to the technical field of technical maps, and in particular to a method, a device, a medium and electronic equipment for drawing a patent-based technical map.
Background
A technical map is a special kind of knowledge map. It is built on complex network techniques: by analyzing the relations among technical fields, scientific and technological achievements (papers, patents and the like), authors, research institutions and keywords, it uncovers clues to research and technical trends and identifies the network distribution of key and hotspot technologies. As demand for competitive intelligence on patented technologies grows, technical maps are increasingly applied in the patent field. By constructing a technical map, the connections between patented technologies and their development can be analyzed, so that one's own direction of technical development can be better determined.
However, a technical map of patents is usually created by extracting technical feature words from one patent and from another patent and linking the two patents through the technical feature words they share. A technical map constructed in this way considers only the relations between the technical points corresponding to the technical feature words and does not consider the connections between the technologies of the patents themselves, so the reference value of the resulting technical map is low.
Disclosure of Invention
To better account for the relations between the technologies of patents when drawing a technical map of patents, and thereby improve the reference value of the technical map, the application provides a method, a device, a storage medium and electronic equipment for drawing a patent-based technical map.
In a first aspect of the present application, a method for drawing a patent-based technical map is provided, which specifically includes:
acquiring a main classification number and a plurality of technical feature words of a target patent;
acquiring a patent collection consistent with the main classification number of the target patent, and screening out from the patent collection a similar-field patent collection associated with the target patent;
calculating first weights of the plurality of technical feature words through a keyword weight algorithm, and determining key technical feature words among the plurality of technical feature words according to the first weights;
and taking the key technical feature words as nodes of a technical map, and drawing the technical map of the target patent and each patent in the similar-field patent collection.
With this technical scheme, a patent collection whose main classification number is consistent with that of the target patent is first obtained, which finds the patents whose main technical field matches that of the target patent. A similar-field patent collection, whose technical fields are closer still to the target patent, is then screened out of that patent collection. Next, the first weight of each technical feature word of the target patent is calculated with the keyword weight algorithm; the technical feature words with the larger first weights are determined to be the key technical feature words of the target patent, and the technical feature words that have little to do with the key technology of the patent are removed. Finally, the key technical feature words are taken as nodes of the technical map, and these nodes serve as links that relate the target patent to the patents in the similar-field patent collection in which the key technical feature words appear. The technical map drawn in this way is closely connected, which further improves its reference value.
Optionally, acquiring a patent collection consistent with the main classification number of the target patent, and screening out from the patent collection a similar-field patent collection associated with the target patent, includes:
acquiring a patent collection consistent with the main classification number of the target patent, and extracting the applicant name of each patent in the patent collection; judging whether each applicant name is a company name;
if so, extracting a first technical keyword from the official website corresponding to the company name;
and calculating the similarity between the first technical keyword and the plurality of technical feature words, and if a first target similarity exceeding a similarity threshold exists, taking the patents in the patent collection that correspond to the first target similarity as the similar-field patent collection.
With this technical scheme, once the patent collection whose main classification number is consistent with that of the target patent has been determined, the type of the applicant name of each patent in the collection is judged. If the applicant name is a company name, first technical keywords describing the subdivided fields in which the company works are extracted from the company's official website, which reflects the technical field of the patent at a finer granularity. Finally, the similarity between the first technical keyword and each technical feature word of the target patent is calculated. If the similarity to any one of the technical feature words is high, the patent is close to the technical field of the target patent and is placed in the similar-field patent collection, so that the technologies of the patents in the similar-field patent collection are more closely related to the target patent.
Optionally, after judging whether each applicant name is a company name, the method further includes:
if not, extracting the inventor name from the patent corresponding to the applicant name, and combining the inventor name with the applicant name to form identity information;
searching for journal papers corresponding to the identity information, extracting a second technical keyword from the journal papers, and calculating the similarity between the second technical keyword and the plurality of technical feature words;
and if a second target similarity exceeding the similarity threshold exists, taking the patents in the patent collection that correspond to the second target similarity as the similar-field patent collection.
With this technical scheme, if the applicant name is not a company name but, for example, the name of a college or a hospital, the applicant name alone is not enough to screen out the patents closer to the technical field of the target patent. The inventor name of the patent is therefore extracted, and the corresponding journal papers are found from the inventor name together with the applicant name. The second technical keyword extracted from those papers better reflects the subdivided technical field in which the inventor specializes. If the similarity between the second technical keyword and the technical feature words exceeds the similarity threshold, the technical field of the patent is close to that of the target patent, and the patent is placed in the similar-field patent collection, so that the technologies of the patents in the similar-field patent collection are more closely related to the target patent.
Optionally, acquiring a patent collection consistent with the main classification number of the target patent, and screening out from the patent collection a similar-field patent collection associated with the target patent, includes:
acquiring a patent collection consistent with the main classification number of the target patent, and counting, for each patent in the patent collection, the number of classification numbers it shares with the target patent;
comparing each number of common classification numbers with a number threshold;
and combining the patents in the patent collection whose number of common classification numbers is greater than the number threshold into the similar-field patent collection.
With this technical scheme, once the patent collection whose main classification number is consistent with that of the target patent has been determined, the number of classification numbers that each patent in the collection shares with the target patent is counted, and each count is compared with the number threshold. If the count is greater than the number threshold, the technical fields of that patent overlap more with those of the target patent and the technical connection is likely to be closer. Finally, these patents are combined into the similar-field patent collection, so that the technologies of the patents in the similar-field patent collection are more closely related to the target patent.
Optionally, calculating first weights of the plurality of technical feature words through a keyword weight algorithm, and determining key technical feature words among the plurality of technical feature words according to the first weights, includes:
calculating first weights of the plurality of technical feature words through the keyword weight algorithm, and comparing the first weight of each technical feature word with a weight threshold;
screening out from the plurality of technical feature words a first feature word set whose first weights are greater than the weight threshold, and determining the words of the first feature word set to be key technical feature words;
and screening out from the plurality of technical feature words a second feature word set whose first weights are not greater than the weight threshold, screening out from the second feature word set the technical feature words whose similarity to a technical feature word in the first feature word set exceeds a similarity threshold, and determining those words to be key technical feature words as well.
With this technical scheme, after the first weight of each technical feature word is calculated, each first weight is compared with the weight threshold. If a first weight is greater than the weight threshold, the corresponding technical feature word is important to the technology of the patent; such words are screened out, combined into the first feature word set, and determined to be key technical feature words. If a first weight is less than or equal to the weight threshold, the corresponding technical feature word is less important to the technology of the patent; such words are screened out and combined into the second feature word set. If the similarity between a technical feature word in the second feature word set and a technical feature word in the first feature word set exceeds the similarity threshold, the two words differ in form but agree in meaning and can be regarded as the same technical feature of the patent, so such words in the second feature word set are also determined to be key technical feature words. The key technical feature words screened out in this way are therefore comprehensive.
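The two-stage screening described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the function name `select_key_feature_words`, the example weights and thresholds, and the toy `toy_similarity` lookup are all hypothetical, and the text leaves the concrete similarity measure and threshold values open at this point.

```python
def select_key_feature_words(first_weights, weight_threshold,
                             similarity, similarity_threshold):
    """Return key technical feature words from a {word: first_weight} mapping.

    Stage 1: words whose first weight is greater than the weight threshold.
    Stage 2: remaining words that are similar enough to a stage-1 word.
    """
    first_set = [w for w, wt in first_weights.items() if wt > weight_threshold]
    second_set = [w for w, wt in first_weights.items() if wt <= weight_threshold]
    rescued = [
        w for w in second_set
        if any(similarity(w, key) > similarity_threshold for key in first_set)
    ]
    return first_set + rescued


# Hypothetical similarity: a hand-written table of near-synonym pairs.
_SYNONYMS = {frozenset({"codec", "encoder-decoder"}): 0.9}


def toy_similarity(a, b):
    """Look up a similarity score for an unordered word pair (assumed data)."""
    return _SYNONYMS.get(frozenset({a, b}), 0.0)


example_weights = {"codec": 1.4, "encoder-decoder": 0.2, "server": 0.3}
key_words = select_key_feature_words(example_weights, 1.0, toy_similarity, 0.8)
```

In this toy run, "codec" passes the weight threshold directly, "encoder-decoder" is kept because it is similar to "codec", and "server" is dropped.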
Optionally, taking the key technical feature words as nodes of a technical map, and drawing the technical map of the target patent and each patent in the similar-field patent collection, includes:
taking each key technical feature word as a node of the technical map, and screening out from the similar-field patent collection the associated patents in which the key technical feature words occur more times than a preset value;
calculating, according to the keyword weight algorithm, a second weight of each such key technical feature word in the associated patents;
and forming a patent association matrix from the second weights and the first weights, and drawing the technical map of the target patent and each patent in the similar-field patent collection according to the patent association matrix, wherein the rows and columns of the patent association matrix represent nodes and patents respectively.
With this technical scheme, after the key technical feature words are taken as nodes of the technical map, the patents in the similar-field patent collection in which the key technical feature words occur more times than the preset value (that is, occur frequently) are found and treated as associated patents of the target patent. The second weight of each key technical feature word in each associated patent in which it occurs is then calculated. Finally, the second weights and the first weights form an association matrix, and the technical map of the target patent is drawn according to it. The target patent in the technical map is thus closely associated with the technologies of those patents, and the technical map has high reference value.
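As a rough sketch of the steps above, the association matrix can be assembled with rows for nodes (key technical feature words) and columns for patents. The function names, the `"target"` column label, the example data, and the choice to store 0.0 where a word has no weight in a patent are assumptions made for illustration; the text does not specify them.

```python
def find_associated_patents(key_words, collection, preset_value):
    """Return ids of patents in which some key word occurs more than preset_value times.

    collection maps a patent id to the list of terms extracted from that patent.
    """
    return [
        patent_id
        for patent_id, words in collection.items()
        if any(words.count(key) > preset_value for key in key_words)
    ]


def build_association_matrix(key_words, first_weights, second_weights):
    """Build a matrix with one row per node and one column per patent.

    first_weights:  {word: weight in the target patent}
    second_weights: {patent_id: {word: weight in that associated patent}}
    The first column is the target patent; missing weights are set to 0.0 (assumption).
    """
    patent_ids = ["target"] + list(second_weights)
    rows = []
    for word in key_words:
        row = [first_weights.get(word, 0.0)]
        row += [second_weights[pid].get(word, 0.0) for pid in patent_ids[1:]]
        rows.append(row)
    return patent_ids, rows


example_collection = {"P1": ["codec", "codec", "server"], "P2": ["network"]}
associated = find_associated_patents(["codec"], example_collection, 1)
columns, matrix = build_association_matrix(
    ["codec"], {"codec": 1.5}, {"P1": {"codec": 0.7}}
)
```

The resulting matrix could then be handed to whatever graph-drawing tool is used to render the technical map.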
Optionally, the keyword weight algorithm is:
$W_{id} = TF_{id} \times \log(N / DF_i)$, where $TF_{id}$ denotes the frequency with which technical feature word or key technical feature word $i$ occurs in patent $d$; $DF_i$ denotes the number of patents in the similar-field patent collection in which technical feature word or key technical feature word $i$ occurs, plus 1; $W_{id}$ denotes the weight of technical feature word or key technical feature word $i$ in patent $d$; and $N$ denotes the number of patents in the similar-field patent collection plus 1.
With this technical scheme, the keyword weight algorithm evaluates how important a word is to one document in a document collection or corpus. In this scheme, it can accurately evaluate both the importance of the plurality of technical feature words in the target patent and the importance of the key technical feature words in the patents in which they occur (the patents in the similar-field patent collection).
In a second aspect of the present application, a device for drawing a patent-based technical map is provided, which specifically includes:
a patent information acquisition module, used for acquiring the main classification number and a plurality of technical feature words of the target patent;
a patent collection screening module, used for acquiring a patent collection consistent with the main classification number of the target patent and screening out from the patent collection a similar-field patent collection associated with the target patent;
a key feature word determination module, used for calculating first weights of the plurality of technical feature words through a keyword weight algorithm and determining key technical feature words among the plurality of technical feature words according to the first weights;
and a technical map drawing module, used for drawing the technical map of the target patent and each patent in the similar-field patent collection by taking the key technical feature words as nodes of the technical map.
With this technical scheme, after the patent information acquisition module acquires the main classification number and the plurality of technical feature words of the target patent, the patent collection screening module screens out, from the patent collection consistent with the main classification number of the target patent, the similar-field patent collection close to the technical field of the target patent. The key feature word determination module then calculates the first weight of each technical feature word and determines the key technical feature words according to the first weights. Finally, the technical map drawing module takes the key technical feature words as nodes and draws the technical map of the target patent and the patents in the similar-field patent collection.
In summary, the present application includes at least one of the following beneficial technical effects:
A patent collection whose main classification number is consistent with that of the target patent is obtained, which finds the patents whose main technical field matches that of the target patent. A similar-field patent collection, whose technical fields are closer still to the target patent, is then screened out of that patent collection. Next, the first weight of each technical feature word of the target patent is calculated with the keyword weight algorithm; the technical feature words with the higher first weights are determined to be the key technical feature words of the target patent, and the technical feature words that have little to do with the key technology of the patent are removed. Finally, the key technical feature words are taken as nodes of the technical map, and these nodes serve as links that relate the target patent to the patents in the similar-field patent collection in which the key technical feature words appear. The technical map drawn in this way is closely connected, which further improves its reference value.
Drawings
Fig. 1 is a schematic flowchart of a method for drawing a patent-based technical map provided in an embodiment of the present application;
Fig. 2 is a schematic flowchart of another method for drawing a patent-based technical map provided in an embodiment of the present application;
Fig. 3 is a schematic flowchart of yet another method for drawing a patent-based technical map provided in an embodiment of the present application;
Fig. 4 is a schematic structural diagram of a device for drawing a patent-based technical map provided in an embodiment of the present application.
Description of reference numerals: 11. patent information acquisition module; 12. patent collection screening module; 13. key feature word determination module; 14. technical map drawing module.
Detailed Description
To help those skilled in the art better understand the technical solutions in this specification, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are obviously only some, and not all, of the embodiments of the present application.
In the description of the embodiments of the present application, the words "exemplary," "such as," or "for example" are used to indicate examples or illustrations. Any embodiment or design described herein as "exemplary," "such as," or "for example" is not to be construed as preferred or advantageous over other embodiments or designs. Rather, these words are intended to present the relevant concepts in a concrete fashion.
Referring to Fig. 1, the present application discloses a flowchart of a method for drawing a patent-based technical map. The method can be implemented by a computer program and can run on a patent-based technical map drawing device built on the von Neumann architecture. The computer program can be integrated into an application or run as an independent tool application. The method specifically includes the following steps:
S101: Acquire a main classification number and a plurality of technical feature words of the target patent.
Specifically, a technical map is a graph showing the development process and structural relations of scientific knowledge. The knowledge resources that humans have accumulated over time, their carriers, and the related scientific and technical knowledge are described, drawn, mined, analyzed and displayed by means of visualization technology. In the embodiments of the application, the technical map uses nodes as links to establish the interrelations between the target patent and other patents.
The main classification number is the first classification number of a patent. The classification number here is the International Patent Classification (IPC) number; the IPC supports the organization, management and retrieval of large volumes of patent documents. A patent that involves several technical fields has several classification numbers, and its main technical field is indicated by the main classification number. The technical feature words are the keywords of the technical scheme that solves the technical problem.
The main classification number of the target patent is acquired as follows: the text of the target patent is searched with "int.cl." as the search keyword to obtain the classification numbers of the target patent, and the first of these classification numbers is extracted as the main classification number. In addition, the text of the target patent is divided into several regions according to headings such as the invention title, abstract, claims, technical field, background, disclosure of invention and detailed description. The text of the region corresponding to the invention title is then divided into three parts by word segmentation, and the middle part is extracted as a technical feature word. For example, the invention title "a data transmission method" is divided into "a", "data transmission" and "method", and "data transmission" is extracted as the technical feature word. The text of the region corresponding to the claims is searched for occurrences of the keyword "the", and the term that follows each occurrence is taken as a technical feature word.
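The lookup of the main classification number can be sketched as below. This is only an illustration: the regular expression for an IPC symbol, the function names, and the sample text are assumptions, and real patent text may lay out the classification field differently.

```python
import re

# Assumed shape of an IPC symbol such as "G06F 16/36" (section, class,
# subclass, main group "/" subgroup). Real documents may deviate from this.
IPC_PATTERN = re.compile(r"[A-H]\d{2}[A-Z]\s*\d{1,4}/\d{2,6}")


def extract_classification_numbers(patent_text):
    """Return all IPC-like symbols found after the 'int.cl.' marker, in order."""
    marker = patent_text.lower().find("int.cl.")
    if marker == -1:
        return []
    return IPC_PATTERN.findall(patent_text[marker:])


def main_classification_number(patent_text):
    """Return the first classification number listed, or None if none is found."""
    numbers = extract_classification_numbers(patent_text)
    return numbers[0] if numbers else None


sample_text = "Title: A data transmission method\nInt.Cl. G06F 16/36; G06F 40/289"
main_number = main_classification_number(sample_text)
```

The title segmentation step is not sketched here because it depends on a word segmentation tool that the text does not name.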
S102: Acquire a patent collection consistent with the main classification number of the target patent, and screen out from the patent collection a similar-field patent collection associated with the target patent.
Specifically, after the main classification number of the target patent is obtained, it is used as the matching condition: a patent database platform is searched for patents whose text contains the main classification number, and the patents found are assembled into the patent collection. All classification numbers of the target patent are then compared with all classification numbers of each patent in the patent collection. If common classification numbers exist and their number exceeds the number threshold, that patent is more closely related to the technology of the target patent. The number threshold is a preset fixed value, a critical value for the number of common classification numbers that is used to measure the relatedness of two patents. A common classification number is a classification number that appears in the text of both patents. Finally, the patents of this kind are screened out of the patent collection and assembled into the similar-field patent collection.
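The screening by common classification numbers amounts to a set intersection followed by a threshold test. The sketch below assumes each patent is represented simply by its list of classification numbers; the function names, patent ids and threshold value are hypothetical.

```python
def count_common_classifications(target_numbers, candidate_numbers):
    """Count classification numbers that appear in both patents."""
    return len(set(target_numbers) & set(candidate_numbers))


def screen_similar_field(target_numbers, collection, number_threshold):
    """Keep patents sharing more than number_threshold classification numbers.

    collection maps a patent id to that patent's list of classification numbers.
    """
    return [
        patent_id
        for patent_id, numbers in collection.items()
        if count_common_classifications(target_numbers, numbers) > number_threshold
    ]


target = ["G06F 16/36", "G06F 40/289", "G06F 16/35"]
patent_collection = {
    "P1": ["G06F 16/36", "G06F 40/289"],
    "P2": ["G06F 16/36"],
    "P3": ["G06F 16/36", "G06F 40/289", "G06F 16/35"],
}
similar_field = screen_similar_field(target, patent_collection, 1)
```

With a threshold of 1, P2 is excluded because it shares only the main classification number with the target.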
S103: Calculate first weights of the plurality of technical feature words through a keyword weight algorithm, and determine key technical feature words among the plurality of technical feature words according to the first weights.
Specifically, the keyword weight algorithm is the Term Frequency-Inverse Document Frequency (TF-IDF) algorithm, a statistical method for evaluating how important a word is to one document in a document collection or corpus. TF is the term frequency, and IDF is the inverse document frequency, a measure of the general importance of a term.
The keyword weight algorithm is expressed as $W_{id} = TF_{id} \times \log(N / DF_i)$, where $TF_{id}$ denotes the frequency with which technical feature word or key technical feature word $i$ occurs in patent $d$; $DF_i$ denotes the number of patents in the similar-field patent collection in which $i$ occurs, plus 1, that is, the number of patents among the similar-field patent collection and the target patent in which $i$ occurs; $W_{id}$ denotes the weight of $i$ in patent $d$; and $N$ denotes the number of patents in the similar-field patent collection plus 1, that is, the number of patents in the similar-field patent collection plus the number of target patents. Patent $d$ is either the target patent or a patent in the similar-field patent collection. In the embodiments of the present application, the number of target patents is 1.
Using this formula, the frequency $TF_{id}$ with which each technical feature word occurs in the target patent is counted, $DF_i$ is obtained by adding 1 to the number of patents in the similar-field patent collection in which the technical feature word occurs, and these values are substituted together with $N$ into the formula to obtain the first weight $W_{id}$ of each technical feature word in the target patent. The larger the first weight $W_{id}$, the more important the technical feature word is in the target patent and the closer it is to the key technology of the patent. Finally, the technical feature words with the larger first weights are determined to be the key technical feature words, and the remaining technical feature words are removed.
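The weight formula can be illustrated with a few lines of code. The sketch assumes that each patent has already been reduced to a list of extracted terms, that the frequency is a raw occurrence count, and that the logarithm is natural; the text fixes none of these details, and the function name and example data are hypothetical.

```python
import math


def keyword_weight(word, patent_words, similar_field_patents):
    """Compute W = TF * log(N / DF) for one word in one patent.

    patent_words:          list of terms extracted from the patent being scored
    similar_field_patents: list of term lists, one per patent in the
                           similar-field patent collection
    """
    tf = patent_words.count(word)                      # raw count (assumption)
    df = 1 + sum(1 for words in similar_field_patents if word in words)
    n = len(similar_field_patents) + 1                 # collection plus target
    return tf * math.log(n / df)                       # natural log (assumption)


target_words = ["codec", "codec", "cloud game", "server"]
pool = [["codec", "server"], ["server"], ["network"]]

w_codec = keyword_weight("codec", target_words, pool)
w_server = keyword_weight("server", target_words, pool)
```

Here N is 4. For "codec", TF is 2 and DF is 2, so the weight is 2 × log(4/2); for "server", TF is 1 and DF is 3, so the weight is log(4/3), which is smaller, as expected for a word that is common across the collection.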
S104: Take the key technical feature words as nodes of the technical map, and draw the technical map of the target patent and each patent in the similar-field patent collection.
Specifically, a node is a connection point in the technical map, that is, a point where relationship lines intersect or branch. The concept of a node is broad, and its practical meaning depends on the content of the technical map; in the embodiments of the application, the nodes are the key technical feature words. In other embodiments, classification numbers can also serve as nodes. The patents in the similar-field patent collection in which the key technical feature words appear are then associated with the target patent through the key technical feature words, and the technical map of the target patent and each patent in the similar-field patent collection is drawn from a matrix formed by the weights of the key technical feature words in the target patent and their weights in the patents of the similar-field patent collection. It should be noted that the embodiments of the application use preset Bibexcel software to draw the technical map; in other embodiments, SPSS or UCINET software can also be used.
Referring to Fig. 2, the present embodiment discloses a flowchart of another method for drawing a patent-based technical map. The method can be implemented by a computer program and can run on a patent-based technical map drawing device built on the von Neumann architecture. The computer program can be integrated into an application or run as an independent tool application. The method specifically includes the following steps:
S201: Acquire a main classification number and a plurality of technical feature words of the target patent.
Specifically, refer to step S101; the details are not repeated here.
S202: Acquire a patent collection consistent with the main classification number of the target patent, and extract the applicant name of each patent in the patent collection.
Specifically, after the patent collection whose main classification number is the same as that of the target patent is obtained, the text of each patent in the collection is searched with "applicant" as the search keyword, and the corresponding applicant name is extracted from the matching content. The applicant of a patent is the natural person, legal person, or other organization that, by law or by contract, holds the right to apply for the patent. Other organizations include companies, colleges, hospitals and the like. If the applicant is a company, the applicant name is the company name; if the applicant is a college or hospital, the applicant name is the college name or hospital name.
S203: Determine whether each applicant name is a company name.
S204: If so, extract the first technical keyword from the official website corresponding to the company name.
Specifically, after the applicant name of each patent is extracted, it is checked for the keyword "company". If the applicant name contains the keyword "company", the applicant name of the patent is determined to be a company name. The company name is then used as a keyword to reach the company's official website, the company introduction area of the website is located, and first technical keywords reflecting the subdivided fields in which the company works are extracted from that area. For example, if the company name is Beijing XX Technology Limited Company, the company name is used to reach the corresponding official website, the company introduction area containing the keyword "about us" is found, its text is extracted, and first technical keywords related to the technical field, such as "game", "cloud game" and "codec", are extracted from that text. The first technical keywords further narrow down the technical field of the patents corresponding to the company name.
S205: Calculate the similarity between the first technical keyword and the plurality of technical feature words; if a first target similarity exceeding the similarity threshold exists, take the patents in the patent collection corresponding to the first target similarity as the similar field patent collection.
Specifically, the similarity threshold is a semantic similarity threshold used to judge whether the first technical keyword and a technical feature word have the same meaning. After the first technical keyword is determined, the similarity is calculated with the semantic similarity formula $\mathrm{sim}(C_i, C_j) = |C_iS \cap C_jS| / |C_iS \cup C_jS|$, where $\mathrm{sim}(C_i, C_j)$ is the similarity between the first technical keyword and the technical feature word, $C_i$ is the first technical keyword, $C_j$ is the technical feature word, $C_iS$ is the set formed by the first technical keyword and all of its upper concepts, $C_jS$ is the set formed by the technical feature word and all of its upper concepts, and $|\cdot|$ denotes the number of elements in a set. It should be noted that the similarity here refers to semantic similarity. For example, if the similarity between the first technical keyword and one of the plurality of technical feature words is 0.9 and the similarity threshold is 0.8, the similarity exceeds the similarity threshold, indicating that the first technical keyword is highly similar to that technical feature word. Finally, the patents corresponding to the first target similarity are determined as the similar field patent collection.
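The similarity formula of step S205 can be sketched as a ratio of set sizes. The following is a minimal illustration only: the `PARENT` hierarchy, the terms in it, and the function names are hypothetical, and a real system would take upper concepts from a domain thesaurus or ontology that the text does not specify.

```python
from typing import Dict, Set

# Hypothetical toy hierarchy: each term maps to its direct upper concept.
PARENT: Dict[str, str] = {
    "cloud game": "game",
    "mobile game": "game",
    "game": "software",
    "codec": "signal processing",
}


def concept_set(term: str, parent: Dict[str, str]) -> Set[str]:
    """Return the term together with all of its upper concepts."""
    concepts = {term}
    # Walk upwards; the second condition guards against cyclic hierarchies.
    while term in parent and parent[term] not in concepts:
        term = parent[term]
        concepts.add(term)
    return concepts


def similarity(term_i: str, term_j: str, parent: Dict[str, str]) -> float:
    """Shared concepts divided by all concepts of the two terms."""
    set_i = concept_set(term_i, parent)
    set_j = concept_set(term_j, parent)
    return len(set_i & set_j) / len(set_i | set_j)
```

With this toy hierarchy, "cloud game" and "mobile game" share two of four concepts, so the value stays below a threshold of 0.8; how close two real terms come out depends entirely on the hierarchy used.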
In an implementable manner, after step S203, the method further includes: if not, extracting the name of the inventor from the patent corresponding to the name of the applicant, and combining the name of the inventor with the name of the applicant to form identity information;
searching a journal paper corresponding to the identity information, extracting a second technical keyword in the journal paper, and calculating the similarity between the second technical keyword and a plurality of technical feature words;
and if the second target similarity exceeding the similarity threshold exists, taking the patents corresponding to the second target similarity in the patent collection as the patent collection in the similar field.
Specifically, if the applicant name extracted from a patent in the patent collection does not contain the keyword "company", the applicant name of that patent is determined not to be a company name. An inventor name is then extracted from the patent (usually the first of the inventors listed), and the inventor name and the applicant name (i.e., the unit to which the inventor belongs) are combined into identity information. Using the identity information, a platform such as CNKI is accessed over the internet to search for journal papers matching the identity information, and the content of the "keywords" field of those papers is extracted as the second technical keyword. The similarity (i.e., semantic similarity) between the second technical keyword and each of the plurality of technical feature words is calculated with the semantic similarity formula given in step S205 and compared with the similarity threshold. If a second target similarity exceeding the similarity threshold exists, the technical field of the journal papers published by the inventor of the patent is close to that of the target patent, which indicates that the technical field of that patent in the patent collection is also close to that of the target patent; the patents corresponding to the second target similarity are therefore taken as the similar field patent collection. In this embodiment, the first word in the "keywords" field of the journal paper may be used as the second technical keyword; in other embodiments, all the words in the "keywords" field may be used as second technical keywords.
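The branch between the company path (steps S204 and S205) and the journal paper path can be sketched as a small routing function. This is an illustrative sketch only: the function name, the returned labels, and the way the identity string is joined are assumptions, and the actual website or journal lookup is left out.

```python
from typing import List, Tuple


def plan_keyword_lookup(applicant: str, inventors: List[str],
                        marker: str = "company") -> Tuple[str, str]:
    """Decide where technical keywords are looked up for one patent.

    Returns a (source, query) pair; both labels are hypothetical.
    """
    if marker.lower() in applicant.lower():
        # Applicant name contains the marker: treat it as a company name.
        return ("official_website", applicant)
    # Otherwise combine the first listed inventor with the applicant name.
    first_inventor = inventors[0] if inventors else ""
    return ("journal_papers", f"{first_inventor} {applicant}".strip())
```

For example, `plan_keyword_lookup("XX University", ["Li Si", "Wang Wu"])` routes to the journal paper path with the identity string "Li Si XX University".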
S206: calculating first weights of the plurality of technical feature words through a keyword weight algorithm, and determining key technical feature words in the plurality of technical feature words according to the first weights.
Specifically, refer to step S103, which is not described herein again.
S207: Take each key technical feature word as a node of the technical map, and screen out, from the similar field patent collection, the associated patents in which the number of occurrences of the key technical feature words exceeds a preset value.
Specifically, after the key technical feature words among the plurality of technical feature words are determined, they are taken as nodes of the technical map, and each patent in the similar field patent collection is checked for whether it contains a key technical feature word. If it does, the number of occurrences of the key technical feature word is counted and compared with the preset value. If the count exceeds the preset value, the patent is highly correlated with the target patent and is screened out as an associated patent of the target patent; otherwise, the correlation between the patent and the target patent is low and the patent is not screened out. In the embodiment of the present application, the preset value may be 5; in other embodiments, it may also be 6 or 7.
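The screening in step S207 can be sketched as a count-and-compare filter. The sketch below assumes that each key technical feature word is counted separately and that counting is a plain case-insensitive substring count; the text does not fix either choice, and the function names are hypothetical.

```python
from typing import Dict, List


def count_occurrences(text: str, word: str) -> int:
    """Case-insensitive count of non-overlapping occurrences of word."""
    return text.lower().count(word.lower())


def associated_patents(patents: Dict[str, str], key_words: List[str],
                       preset: int = 5) -> Dict[str, Dict[str, int]]:
    """Map each associated patent to the key words that exceed the preset value."""
    result: Dict[str, Dict[str, int]] = {}
    for patent_id, text in patents.items():
        hits: Dict[str, int] = {}
        for word in key_words:
            count = count_occurrences(text, word)
            if count > preset:  # strictly greater, as in "exceeds"
                hits[word] = count
        if hits:
            result[patent_id] = hits
    return result
```

A patent in which a key word appears exactly 5 times is not kept when the preset value is 5, because the count must exceed the preset value.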
S208: Calculate the second weight of each key technical feature word in the associated patents according to the keyword weight algorithm.
Specifically, after the associated patents are determined from the similar field patent collection, the second weight of each key technical feature word in each associated patent in which it appears is calculated according to the keyword weight algorithm, so there are a plurality of second weights. The keyword weight algorithm is discussed in detail in step S103. The second weight is calculated as follows:
$W_{id} = TF_{id} \cdot \log(N / DF_i)$, where $TF_{id}$ is the frequency with which the key technical feature word appears in the corresponding associated patent, and $DF_i$ is the number of patents in the similar field patent collection in which the key technical feature word appears, plus 1. $TF_{id}$, $DF_i$, and $N$ are substituted into the formula to obtain the second weight.
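The second weight can be computed directly from the formula above. In the sketch below, the natural logarithm and the reading of N as the total number of patents considered are assumptions, since this passage does not state the base of the logarithm or restate the definition of N; the function names are hypothetical.

```python
import math


def document_frequency(patents_containing_word: int) -> int:
    """DF_i: patents in the similar field collection containing the word, plus 1."""
    return patents_containing_word + 1


def second_weight(tf_id: int, df_i: int, n: int) -> float:
    """W_id = TF_id * log(N / DF_i), using the natural logarithm (assumed)."""
    if df_i <= 0 or n <= 0:
        raise ValueError("df_i and n must be positive")
    return tf_id * math.log(n / df_i)
```

For instance, a key word that appears 6 times in an associated patent and occurs in 3 patents of the similar field collection would give `second_weight(6, document_frequency(3), n)` for whatever value of N applies.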
S209: Form a patent incidence matrix from the second weights and the first weights, and draw the technical map of the target patent and each patent in the similar field patent collection according to the patent incidence matrix, where the rows of the patent incidence matrix represent nodes and the columns represent patents.
Specifically, after the second weights are determined, the second weights and the first weights are combined to form the patent incidence matrix for the associated patents and the target patent. For example, if there are 10 associated patents and 5 nodes (i.e., 5 key technical feature words), the resulting patent incidence matrix is:
$$\begin{pmatrix} W_{1,1} & W_{1,2} & \cdots & W_{1,11} \\ W_{2,1} & W_{2,2} & \cdots & W_{2,11} \\ W_{3,1} & W_{3,2} & \cdots & W_{3,11} \\ W_{4,1} & W_{4,2} & \cdots & W_{4,11} \\ W_{5,1} & W_{5,2} & \cdots & W_{5,11} \end{pmatrix}$$
The rows of the matrix represent nodes, for a total of 5 rows. The columns of the matrix represent patents, namely the associated patents plus the target patent, for a total of 11 columns. The first column is $W_{1,1}, W_{2,1}, W_{3,1}, W_{4,1}, W_{5,1}$, which represent the weights in the first patent of the first, second, third, fourth, and fifth key technical feature words, respectively. The first row is $W_{1,1}, W_{1,2}, \ldots, W_{1,11}$, which represent the weights of the first key technical feature word in the first patent, the second patent, the third patent, and so on up to the eleventh patent, respectively.
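Assembling the patent incidence matrix from the two kinds of weights can be sketched as follows. Two points are assumptions not fixed by the text: the target patent is placed in the first column, and a key technical feature word with no second weight in an associated patent is given weight 0.0. The function and parameter names are hypothetical.

```python
from typing import Dict, List


def incidence_matrix(first_weights: Dict[str, float],
                     second_weights: Dict[str, Dict[str, float]],
                     key_words: List[str],
                     associated_ids: List[str]) -> List[List[float]]:
    """Rows are nodes (key words); columns are the target plus associated patents."""
    matrix: List[List[float]] = []
    for word in key_words:
        # Assumed layout: column 1 holds the first weight in the target patent.
        row = [first_weights[word]]
        for patent_id in associated_ids:
            # Assumed default: a missing second weight is treated as 0.0.
            row.append(second_weights.get(patent_id, {}).get(word, 0.0))
        matrix.append(row)
    return matrix
```

With 5 key technical feature words and 10 associated patents, the result has 5 rows and 11 columns, matching the example above.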
Referring to fig. 3, the present embodiment discloses a flowchart of another patent-based technical map drawing method, which may be implemented by a computer program and may run on a patent-based technical map drawing apparatus built on the von Neumann architecture. The computer program may be integrated into an application or run as an independent tool application. The method specifically comprises the following steps:
S301: Acquire a main classification number and a plurality of technical feature words of the target patent.
Specifically, refer to step S101, which is not described herein again.
S302: Acquire a patent collection consistent with the main classification number of the target patent, and count, for each patent in the patent collection, the number of classification numbers it shares with the target patent.
S303: Compare each number of common classification numbers with a number threshold.
S304: Combine the patents in the patent collection whose number of common classification numbers is greater than the number threshold into the similar field patent collection.
Specifically, after a patent collection consistent with the main classification number of the target patent is obtained, the classification numbers that each patent in the patent collection shares with the target patent, including the main classification number, are counted; this count is the number of common classification numbers. If the number of common classification numbers is greater than the number threshold, the patent is technically close to the target patent and strongly related to it, and such patents are screened out of the patent collection and combined into the similar field patent collection. In the embodiment of the present application, the number threshold may be 1.
For example, suppose patent c in the patent collection has classification numbers B, B1, and B3, and the target patent has classification numbers B, B3, and B4. The classification numbers appearing in both patents are B and B3, so the number of common classification numbers is 2, which is greater than the number threshold of 1. This indicates that patent c is highly related to the target patent and shares several technical fields with it, so patent c is determined to belong to the similar field patent collection. This process has already been discussed in step S102 and is not described again here.
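Steps S302 to S304 amount to a set intersection followed by a threshold test. The sketch below reproduces the example with patent c; the function name and the second patent `d` are hypothetical.

```python
from typing import Dict, List, Set


def similar_field_collection(target_codes: Set[str],
                             collection: Dict[str, Set[str]],
                             threshold: int = 1) -> List[str]:
    """Keep patents sharing more than `threshold` classification numbers with the target."""
    return [patent_id
            for patent_id, codes in collection.items()
            if len(codes & target_codes) > threshold]


target = {"B", "B3", "B4"}
collection = {
    "c": {"B", "B1", "B3"},  # shares B and B3 with the target
    "d": {"B", "B7"},        # hypothetical patent sharing only B
}
selected = similar_field_collection(target, collection)
```

Here `selected` contains only patent c, because patent d shares a single classification number, which does not exceed the threshold of 1.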
In one implementation, the target patent and the patent collection are combined into a candidate collection, and for each secondary classification number the number of patents in which it appears together with the main classification number is counted, giving at least one patent count. Each patent count is compared with a high-frequency co-occurrence classification number threshold, which is given by the following formula:
[Formula image BDA0003948730520000111, not reproduced in the text: the high-frequency co-occurrence classification number threshold a expressed in terms of b]
where b is the number of secondary classification numbers that appear together with the main classification number exactly once, and a is the high-frequency co-occurrence classification number threshold. If the number of patents in which a secondary classification number appears together with the main classification number exceeds the threshold, the co-occurrence of that secondary classification number with the main classification number is high-frequency, which in turn indicates a high correlation between the technical field represented by the secondary classification number and the technical field represented by the main classification number. The patents in the patent collection containing that secondary classification number can then be screened out and combined into the similar field patent collection.
For example, suppose that in the candidate collection only three secondary classification numbers, in the combinations A + A1, A + A2, and A + A3, appear together with the main classification number exactly once, so that b is 3 and a is 2; that is, the high-frequency co-occurrence classification number threshold is 2. If the number of patents in which the secondary classification number A4 appears together with the main classification number (A + A4) is 5, which is greater than the threshold, the technical field of the secondary classification number A4 is strongly associated with that of the main classification number, and the patents in the patent collection whose classification numbers include A4 are screened out and combined into the similar field patent collection.
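The threshold formula is available only as an image in the source text, so the sketch below substitutes an assumption: Donohue's boundary between high- and low-frequency terms, $a = (-1 + \sqrt{1 + 8b})/2$. It agrees with the worked example (b = 3 gives a = 2) but is not confirmed by the text, and the function names are hypothetical.

```python
import math
from typing import Dict, List


def high_frequency_threshold(b: int) -> float:
    """Assumed formula (Donohue's boundary): a = (-1 + sqrt(1 + 8b)) / 2."""
    return (-1 + math.sqrt(1 + 8 * b)) / 2


def high_frequency_codes(co_counts: Dict[str, int]) -> List[str]:
    """Secondary classification numbers whose co-occurrence count exceeds a.

    co_counts maps each secondary classification number to the number of
    patents in which it appears together with the main classification number.
    """
    b = sum(1 for count in co_counts.values() if count == 1)
    a = high_frequency_threshold(b)
    return [code for code, count in co_counts.items() if count > a]
```

For the counts in the example, `high_frequency_codes({"A1": 1, "A2": 1, "A3": 1, "A4": 5})` keeps only A4.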
S305: calculating first weights of a plurality of technical feature words through a keyword weight algorithm, and comparing the first weights of the technical feature words with a weight threshold value respectively.
S306: and screening a first feature word set with a first weight larger than a weight threshold value from the plurality of technical feature words, and determining the first feature word set as a key technical feature word.
Specifically, the detailed process of calculating the first weights of the plurality of technical feature words through the keyword weight algorithm may refer to step S103, and is not described herein again. The weight threshold is a critical value for determining whether the first weight is a weight corresponding to the key technology. Comparing the first weight of each technical feature word with a weight threshold value respectively, if the first weight is greater than the weight threshold value, indicating that the corresponding technical feature word is a key technical feature word, and forming the key technical feature word into a first feature word set; if the first weight is not larger than the weight threshold, the corresponding technical feature word is possibly a common technical feature word and is possibly associated with the core technology of the target patent to a lower degree.
S307: Screen out, from the plurality of technical feature words, a second feature word set whose first weights are not greater than the weight threshold; from the second feature word set, screen out the feature words whose similarity with a feature word in the first feature word set exceeds the similarity threshold, and determine them as key technical feature words as well.
Specifically, if the first weight of the technical feature word is not greater than the weight threshold, such technical feature words are screened out to form a second feature word set, and then the similarity between each technical feature word in the second feature word set and the key technical feature word is respectively calculated by using a semantic similarity calculation formula, where the detailed calculation process is discussed in step S205. If the similarity exceeds the similarity threshold, the corresponding technical feature word in the second feature word set is also considered as the key technical feature word, because the meaning of the technical feature word is the same as that of the real key technical feature word. And finally, the technical feature word is moved from the second feature word set to the first feature word set and is also determined as a key technical feature word.
For example, "cloud game" is determined as a key technical feature word from a plurality of technical feature words because the first weight exceeds the weight threshold, and "game on demand" is classified into a second feature word set because the first weight does not exceed the weight threshold. However, the similarity of the game on demand and the cloud game is calculated to exceed the similarity threshold, which shows that the semantics of the game on demand and the cloud game are the same and the same concept is represented. Then "game on demand" should also be identified as a key technical feature word.
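Steps S305 to S307 can be sketched as a threshold split followed by a similarity-based promotion. In the sketch below, `toy_similarity` is a hypothetical stand-in for the semantic similarity of step S205, and the weights and thresholds are made-up values chosen only to mirror the example.

```python
from typing import Callable, Dict, Set


def key_feature_words(first_weights: Dict[str, float],
                      weight_threshold: float,
                      similarity: Callable[[str, str], float],
                      similarity_threshold: float) -> Set[str]:
    """Return the first feature word set plus promoted words from the second set."""
    first_set = {w for w, weight in first_weights.items() if weight > weight_threshold}
    second_set = set(first_weights) - first_set
    # Promote a low-weight word if it is similar enough to any key word.
    promoted = {w for w in second_set
                if any(similarity(w, k) > similarity_threshold for k in first_set)}
    return first_set | promoted


def toy_similarity(a: str, b: str) -> float:
    """Hypothetical stand-in: only one pair is treated as near-synonymous."""
    return 0.9 if {a, b} == {"cloud game", "game on demand"} else 0.1


weights = {"cloud game": 0.7, "game on demand": 0.2, "network": 0.1}
keys = key_feature_words(weights, 0.5, toy_similarity, 0.8)
```

Here `keys` contains "cloud game" by weight and "game on demand" by promotion, while "network" stays in the second feature word set.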
S308: Take the key technical feature words as nodes of the technical map, and draw the technical map of the target patent and each patent in the similar field patent collection.
Specifically, refer to step S104, which is not described herein again.
The implementation principle of the drawing method of the technical map based on the patent in the embodiment of the application is as follows: the method comprises the steps of obtaining a main classification number and a plurality of technical feature words of a target patent, searching a patent collection consistent with the main classification number of the target patent, screening out a similar field patent collection close to the technical field related to the target patent from the patent collection, calculating first weights of the technical feature words in the target patent, determining key technical feature words from the technical feature words according to the first weights, and establishing a patent technical map in the target patent and similar field patent collection by taking the key technical feature words as nodes of the technical map.
The following are embodiments of the apparatus of the present application that may be used to perform embodiments of the method of the present application. For details which are not disclosed in the embodiments of the apparatus of the present application, reference is made to the embodiments of the method of the present application.
Please refer to fig. 4, which is a schematic structural diagram of a patent-based technical map drawing apparatus according to an embodiment of the present application. The patent-based technical map drawing apparatus may be implemented as all or part of an apparatus by software, hardware, or a combination of both. The apparatus 1 comprises a patent information acquisition module 11, a patent collection screening module 12, a key feature word determination module 13, and a technical map drawing module 14.
The patent information acquisition module 11 is used for acquiring a main classification number and a plurality of technical feature words of a target patent;
the patent collection screening module 12 is used for acquiring a patent collection consistent with the main classification number of the target patent, and screening a similar field patent collection associated with the target patent from the patent collection;
the key feature word determining module 13 is configured to calculate first weights of the plurality of technical feature words through a keyword weight algorithm, and determine a key technical feature word from the plurality of technical feature words according to each first weight;
and the technical map drawing module 14 is used for drawing the technical map of the target patent and each patent in the similar field patent collection by taking the key technical feature words as the nodes of the technical map.
Optionally, the patent collection screening module 12 is specifically configured to:
acquiring a patent collection consistent with the main classification number of a target patent, and extracting the name of an applicant of the patent in the patent collection;
judging whether the name of each applicant is a company name or not;
if so, extracting a first technical keyword from an official website corresponding to the company name;
and calculating the similarity of the first technical keyword and the plurality of technical feature words, and if the first target similarity exceeding the similarity threshold exists, taking the patent corresponding to the first target similarity in the patent collection as the patent collection in the similar field.
Optionally, the patent collection screening module 12 is specifically further configured to:
if not, extracting the name of the inventor from the patent corresponding to the name of the applicant, and combining the name of the inventor with the name of the applicant to form identity information; searching a journal paper corresponding to the identity information, extracting a second technical keyword in the journal paper, and calculating the similarity between the second technical keyword and a plurality of technical feature words;
and if the second target similarity exceeding the similarity threshold exists, taking the patents corresponding to the second target similarity in the patent collection as the patent collection in the similar field.
Optionally, the patent collection screening module 12 is specifically further configured to:
acquiring a patent collection consistent with the main classification number of the target patent, and counting the number of the classification numbers shared by the patent collection and the target patent; comparing the number of each common classification number with a number threshold;
and combining the patents with the number of the common classification numbers in the patent collection larger than the number threshold value into the patent collection in the similar field.
Optionally, the key feature word determining module 13 is specifically configured to:
calculating first weights of a plurality of technical feature words through a keyword weight algorithm, and comparing the first weights of the technical feature words with a weight threshold value respectively;
screening a first feature word set with a first weight larger than a weight threshold value from a plurality of technical feature words, and determining the first feature word set as a key technical feature word;
and screening a second feature word set with the first weight not greater than a weight threshold from the plurality of feature words, screening the feature words with the similarity exceeding a similarity threshold from the second feature word set and the first feature word set, and determining the feature words as key feature words.
Optionally, the technical map drawing module 14 is specifically configured to:
taking each key technical feature word as a node of a technical map, and screening out associated patents with key technical feature words with occurrence times exceeding a preset value from a patent collection in similar fields;
calculating a second weight of the same key technical feature word in the associated patent according to a keyword weight algorithm;
and forming a patent incidence matrix from the second weights and the first weights, and drawing the technical map of the target patent and each patent in the similar field patent collection according to the patent incidence matrix, wherein the rows of the patent incidence matrix represent nodes and the columns represent patents.
It should be noted that, when the drawing apparatus based on the patent technical map provided in the above embodiment executes the drawing method based on the patent technical map, only the division of the above functional modules is taken as an example, and in practical applications, the above function distribution may be completed by different functional modules according to needs, that is, the internal structure of the apparatus may be divided into different functional modules to complete all or part of the above described functions. In addition, the drawing device based on the patent technical map provided by the embodiment and the drawing method embodiment based on the patent technical map belong to the same concept, and the embodiment of the implementation process is detailed in the method embodiment and is not repeated herein.
The embodiment of the application further discloses a computer readable storage medium, and the computer readable storage medium stores a computer program, wherein when the computer program is executed by a processor, the method for drawing the technical map based on the patent of the embodiment is adopted.
The computer program may be stored in a computer readable medium, the computer program includes computer program code, the computer program code may be in a source code form, an object code form, an executable file or some intermediate form, and the like, the computer readable medium includes any entity or device capable of carrying the computer program code, a recording medium, a usb disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like, and the computer readable medium includes but is not limited to the above components.
The computer-readable storage medium stores the patent-based technical map drawing method of the above embodiment, which is loaded and executed on a processor, thereby facilitating storage and application of the method.
The embodiment of the application also discloses an electronic device, wherein a computer program is stored in a computer readable storage medium, and when the computer program is loaded and executed by a processor, the drawing method based on the technical map of the patent is adopted.
The electronic device may be an electronic device such as a desktop computer, a notebook computer, or a cloud server, and the electronic device includes but is not limited to a processor and a memory, for example, the electronic device may further include an input/output device, a network access device, a bus, and the like.
The processor may be a Central Processing Unit (CPU) or, depending on actual use, another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor or any conventional processor, which is not limited in this application.
The memory may be an internal storage unit of the electronic device, for example, a hard disk or a memory of the electronic device, or an external storage device of the electronic device, for example, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card (FC) provided on the electronic device. The memory may also be a combination of the internal storage unit and the external storage device of the electronic device. The memory is used for storing the computer program and other programs and data required by the electronic device, and may also be used for temporarily storing data that has been output or will be output, which is not limited in this application.
The electronic device stores the patent-based technical map drawing method of the above embodiment in its memory, and the method is loaded and executed on the processor of the electronic device, which makes it convenient to use.
The above description is merely an exemplary embodiment of the present disclosure, and the scope of the present disclosure is not limited thereto. That is, all equivalent changes and modifications made in accordance with the teachings of the present disclosure are intended to be included within the scope of the present disclosure. Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

Claims (10)

1. A drawing method of a patent-based technical map is characterized by comprising the following steps:
acquiring a main classification number and a plurality of technical characteristic words of a target patent;
acquiring a patent collection consistent with the main classification number of the target patent, and screening out a similar field patent collection associated with the target patent from the patent collection;
calculating first weights of the technical feature words through a keyword weight algorithm, and determining key technical feature words in the technical feature words according to the first weights;
and taking the key technical feature words as nodes of a technical map, and drawing the technical map of each patent in the target patent and the similar field patent collection.
2. The patent-based technical map drawing method according to claim 1, wherein the acquiring of a patent collection consistent with the main classification number of the target patent and the screening out of a similar field patent collection associated with the target patent from the patent collection comprise:
acquiring a patent collection consistent with the main classification number of the target patent, and extracting the name of the applicant of the patent in the patent collection;
judging whether the name of each applicant is a company name or not;
if yes, extracting a first technical keyword from an official website corresponding to the company name;
and calculating the similarity between the first technical keyword and the plurality of technical feature words, and if the first target similarity exceeding a similarity threshold exists, taking the patent corresponding to the first target similarity in the patent collection as a patent collection in the similar field.
3. The patent-based technical map drawing method according to claim 2, wherein the judging whether each of the applicant names is a company name further comprises:
if not, extracting the name of the inventor from the patent corresponding to the name of the applicant, and combining the name of the inventor with the name of the applicant to form identity information;
searching a journal paper corresponding to the identity information, extracting a second technical keyword in the journal paper, and calculating the similarity between the second technical keyword and the plurality of technical feature words;
and if the second target similarity exceeding the similarity threshold exists, taking the patents corresponding to the second target similarity in the patent collection as the patent collection in the similar field.
4. The patent-based technical map drawing method according to claim 1, wherein the acquiring of a patent collection consistent with the main classification number of the target patent and the screening out of a similar field patent collection associated with the target patent from the patent collection comprise:
acquiring a patent collection consistent with the main classification number of the target patent, and counting the number of the classification numbers shared by the patent collection and the target patent;
comparing the number of each of the common classification numbers to a number threshold;
and combining the patents of which the number of the common classification numbers in the patent collection is greater than a number threshold value into a patent collection in the similar field.
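As a minimal sketch of the counting step in claim 4, assuming each patent is represented by a list of classification number strings (the function names and the default threshold below are hypothetical):

```python
def shared_class_count(target_classes, candidate_classes) -> int:
    """Number of classification numbers a candidate shares with the target patent."""
    return len(set(target_classes) & set(candidate_classes))


def screen_by_classification(target_classes, collection, count_threshold=2):
    """Keep patents sharing more than count_threshold classification numbers.

    collection: patent id -> list of classification numbers
    """
    return [
        pid for pid, classes in collection.items()
        if shared_class_count(target_classes, classes) > count_threshold
    ]
```

Note that the comparison is strict ("greater than"), so a patent sharing exactly `count_threshold` classification numbers is not kept.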
5. The patent-based technical map drawing method according to claim 1, wherein calculating the first weights of the technical feature words through a keyword weight algorithm, and determining the key technical feature words among the technical feature words according to the first weights, comprises:
calculating the first weights of the technical feature words through the keyword weight algorithm, and comparing the first weight of each technical feature word with a weight threshold;
screening out, from the plurality of technical feature words, a first feature word set whose first weights are greater than the weight threshold, and determining the feature words in the first feature word set as key technical feature words;
and screening out, from the plurality of technical feature words, a second feature word set whose first weights are not greater than the weight threshold, screening out from the second feature word set the feature words whose similarity to a feature word in the first feature word set exceeds a similarity threshold, and determining those feature words as key technical feature words.
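The two-stage selection in claim 5 might look like the sketch below. It assumes the first weights are already available as a mapping from word to weight, and it again substitutes `difflib.SequenceMatcher` for the unspecified similarity measure; `select_key_words` is a hypothetical name.

```python
from difflib import SequenceMatcher


def select_key_words(weights, weight_threshold, similarity_threshold):
    """Pick key technical feature words from a word -> first weight mapping."""
    # First feature word set: weight strictly above the threshold.
    first = [w for w, v in weights.items() if v > weight_threshold]
    # Second feature word set: everything else.
    second = [w for w, v in weights.items() if v <= weight_threshold]
    # Low-weight words are kept if they closely resemble a high-weight word.
    rescued = [
        w for w in second
        if any(SequenceMatcher(None, w, k).ratio() > similarity_threshold
               for k in first)
    ]
    return first + rescued
```

The second stage keeps near-variants of a high-weight word (for example a plural form) that would otherwise be dropped by the weight threshold alone.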
6. The patent-based technical map drawing method according to claim 1, wherein taking the key technical feature words as nodes of the technical map, and drawing the technical map of the target patent and each patent in the similar field patent collection, comprises:
taking each key technical feature word as a node of the technical map, and screening out, from the similar field patent collection, the associated patents in which a key technical feature word occurs more than a preset number of times;
calculating a second weight of the same key technical feature word in each associated patent according to the keyword weight algorithm;
and forming a patent incidence matrix from the second weights and the first weights, and drawing the technical map of the target patent and each patent in the similar field patent collection according to the patent incidence matrix, wherein the rows and columns of the patent incidence matrix represent the nodes and the patents.
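One way to picture the patent incidence matrix of claim 6 is sketched below. Several points are assumptions rather than part of the claim: rows are taken as nodes and columns as patents, a patent counts as "associated" when at least one key technical feature word occurs in it more than `min_count` times, the term frequency is a raw occurrence count, and the second weights use the formula recited in claim 7.

```python
import math


def build_incidence_matrix(first_weights, collection_counts, min_count=1):
    """Build a node-by-patent weight matrix.

    first_weights: key word -> first weight in the target patent
    collection_counts: patent id -> {word: occurrence count}
    Returns (row labels, column labels, matrix as a list of rows).
    """
    words = list(first_weights)
    # Assumed rule: associated if any key word occurs more than min_count times.
    associated = [
        pid for pid, counts in collection_counts.items()
        if any(counts.get(w, 0) > min_count for w in words)
    ]
    n = len(collection_counts) + 1  # N: collection size plus 1
    columns = ["target"] + associated
    matrix = []
    for w in words:
        # DF: patents in the collection containing the word, plus 1.
        df = 1 + sum(1 for c in collection_counts.values() if c.get(w, 0) > 0)
        row = [first_weights[w]]  # first weight, from the target patent
        for pid in associated:
            tf = collection_counts[pid].get(w, 0)
            row.append(tf * math.log(n / df))  # second weight
        matrix.append(row)
    return words, columns, matrix
```

Each row can then be read as one node of the map, with nonzero entries indicating which patents that node should be linked to and how strongly.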
7. The patent-based technical map drawing method according to claim 1, 5 or 6, wherein the keyword weight algorithm is:
$W_{id} = TF_{id} \cdot \log(N / DF_i)$, wherein $TF_{id}$ denotes the frequency of occurrence of technical feature word or key technical feature word $i$ in patent $d$; $DF_i$ denotes the number of patents in the similar field patent collection that contain technical feature word or key technical feature word $i$, plus 1; $W_{id}$ denotes the weight of technical feature word $i$ in patent $d$; and $N$ denotes the number of patents in the similar field patent collection, plus 1.
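The weight in claim 7 is a smoothed TF-IDF. A minimal sketch follows, assuming each patent is available as a list of already-extracted words and that the term frequency is a raw occurrence count (the claim does not say whether it is normalized); `keyword_weight` is a hypothetical name.

```python
import math


def keyword_weight(word, patent_tokens, collection_tokens):
    """Weight of `word` in one patent: TF * log(N / DF).

    patent_tokens: list of words in patent d
    collection_tokens: list of word lists, one per patent in the
        similar field patent collection
    """
    tf = patent_tokens.count(word)                 # TF: occurrences in patent d
    n = len(collection_tokens) + 1                 # N: collection size plus 1
    df = 1 + sum(1 for tokens in collection_tokens if word in tokens)  # DF
    return tf * math.log(n / df)
```

Because both N and DF carry the extra 1, DF never exceeds N, so the logarithm is never negative, and a word that appears in every patent of the collection receives a weight of zero.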
8. A patent-based technical map drawing device, comprising:
a patent information acquisition module (11) for acquiring the main classification number and a plurality of technical feature words of a target patent;
a patent collection screening module (12) for acquiring a patent collection consistent with the main classification number of the target patent, and screening out a similar field patent collection associated with the target patent from the patent collection;
a key feature word determining module (13) for calculating first weights of the technical feature words through a keyword weight algorithm, and determining key technical feature words among the technical feature words according to the first weights;
and a technical map drawing module (14) for drawing the technical map of the target patent and each patent in the similar field patent collection by taking the key technical feature words as nodes of the technical map.
9. A computer-readable storage medium, in which a computer program is stored, which, when loaded and executed by a processor, carries out the method of any one of claims 1 to 7.
10. An electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the method of any one of claims 1 to 7 is carried out when the computer program is loaded and executed by the processor.
CN202211443316.9A 2022-11-17 2022-11-17 Drawing method, device, medium and electronic equipment based on patent technical map Withdrawn CN115757759A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211443316.9A CN115757759A (en) 2022-11-17 2022-11-17 Drawing method, device, medium and electronic equipment based on patent technical map

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211443316.9A CN115757759A (en) 2022-11-17 2022-11-17 Drawing method, device, medium and electronic equipment based on patent technical map

Publications (1)

Publication Number Publication Date
CN115757759A true CN115757759A (en) 2023-03-07

Family

ID=85373031

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211443316.9A Withdrawn CN115757759A (en) 2022-11-17 2022-11-17 Drawing method, device, medium and electronic equipment based on patent technical map

Country Status (1)

Country Link
CN (1) CN115757759A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117151052A (en) * 2023-11-01 2023-12-01 北京知呱呱科技有限公司 Patent query report generation method based on large language model and graph algorithm

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117151052A (en) * 2023-11-01 2023-12-01 北京知呱呱科技有限公司 Patent query report generation method based on large language model and graph algorithm
CN117151052B (en) * 2023-11-01 2024-01-23 北京知呱呱科技有限公司 Patent query report generation method based on large language model and graph algorithm


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 518000 2201, block D, building 1, Chuangzhi Yuncheng bid section 1, Liuxian Avenue, Xili community, Xili street, Nanshan District, Shenzhen City, Guangdong Province

Applicant after: Qizhi Technology Co.,Ltd.

Address before: 518000 2201, block D, building 1, Chuangzhi Yuncheng bid section 1, Liuxian Avenue, Xili community, Xili street, Nanshan District, Shenzhen City, Guangdong Province

Applicant before: Qizhi Network Technology Co.,Ltd.

WW01 Invention patent application withdrawn after publication

Application publication date: 20230307
