CN113886574A

CN113886574A - Patent topographic map drawing method and device based on structural text clustering

Info

Publication number: CN113886574A
Application number: CN202111025719.7A
Authority: CN
Inventors: 朱欣昱; 程序; 刘琦; 孔文娟; 李艳; 陈亚鑫; 张素兰
Original assignee: Beijing Zhongzhi Zhihui Technology Co ltd
Current assignee: Beijing Zhongzhi Zhihui Technology Co ltd
Priority date: 2021-09-02
Filing date: 2021-09-02
Publication date: 2022-01-04

Abstract

The invention discloses a method and a device for drawing a patent topographic map based on structural text clustering. The method comprises the following steps: acquiring all target patent texts; extracting key feature words from each target patent text according to different types of fields and the preset weight corresponding to each type of field; determining the intra-document weight of each key feature word within its patent text; determining the inter-document weight of each key feature word across all patent texts; determining the key feature words to be added to the clustering set according to the intra-document weight and the inter-document weight; clustering the target patent texts according to the key feature words added to the clustering set to obtain a clustering result; and drawing a patent topographic map according to the clustering result. The invention can accurately draw the patent topographic map based on structural text clustering, thereby accurately reflecting information such as the degree of technical association and the technology-dense points of the patent technology.

Description

Patent topographic map drawing method and device based on structural text clustering

Technical Field

The invention relates to the technical field of big data, in particular to a method and a device for drawing a patent topographic map based on structural text clustering.

Background

This section is intended to provide a background or context to the embodiments of the invention that are recited in the claims. The description herein is not admitted to be prior art by inclusion in this section.

A patent topographic map differs from a patent map in the general sense of a statistical chart: patents and technologies are clustered and arranged, as coordinate points, in a topographic map with three-dimensional coordinates and elements such as contour lines. The result is used to intuitively reflect information such as the degree of technical association and the technology-dense points of the patent technology. Existing methods for drawing patent topographic maps suffer from low drawing precision, so that such information cannot be accurately reflected.

Disclosure of Invention

The embodiment of the invention provides a patent topographic map drawing method based on structural text clustering, which is used for accurately drawing a patent topographic map based on the structural text clustering and comprises the following steps:

acquiring all target patent texts;

extracting key feature words from each target patent text according to different types of fields and preset weights corresponding to each type of field;

determining the intra-document weight of each key feature word within its patent text; determining the inter-document weight of each key feature word across all patent texts;

determining key feature words added into the clustering set according to the intra-document weight and the inter-document weight; clustering the target patent text according to the key feature words added into the clustering set to obtain a clustering result;

and drawing a patent topographic map according to the clustering result.

The embodiment of the invention also provides a device for drawing the patent topographic map based on the structural text clustering, which is used for accurately drawing the patent topographic map based on the structural text clustering and comprises the following components:

the acquisition unit is used for acquiring all target patent texts;

the extraction unit is used for extracting key feature words from each target patent text according to different types of fields and preset weights corresponding to the fields;

the weight determining unit is used for determining the intra-document weight of each key feature word within its patent text, and determining the inter-document weight of each key feature word across all patent texts;

the processing unit is used for determining key characteristic words added into the clustering set according to the intra-document weight and the inter-document weight; clustering the target patent text according to the key feature words added into the clustering set to obtain a clustering result;

and the drawing unit is used for drawing the patent topographic map according to the clustering processing result.

The embodiment of the invention also provides computer equipment which comprises a memory, a processor and a computer program which is stored on the memory and can run on the processor, wherein when the processor executes the computer program, the patent topographic map drawing method for clustering the structural texts is realized.

The embodiment of the invention also provides a computer readable storage medium, which stores a computer program for executing the patent topographic map drawing method for the structural text clustering.

In the embodiment of the invention, the patent topographic map drawing scheme based on structural text clustering comprises: acquiring all target patent texts; extracting key feature words from each target patent text according to different types of fields and the preset weight corresponding to each type of field; determining the intra-document weight of each key feature word within its patent text; determining the inter-document weight of each key feature word across all patent texts; determining the key feature words to be added to the clustering set according to the intra-document weight and the inter-document weight; clustering the target patent texts according to the key feature words added to the clustering set to obtain a clustering result; and drawing the patent topographic map according to the clustering result. The patent topographic map can thereby be drawn accurately on the basis of structural text clustering, so that information such as the degree of technical association and the technology-dense points of the patent technology is accurately reflected.

Drawings

In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort. In the drawings:

FIG. 1 is a schematic flow chart of a method for drawing a topographic patent map based on structured text clustering according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of a key feature word extraction process in an embodiment of the present invention;

FIG. 3 is a schematic diagram of polar transformation in an embodiment of the present invention;

FIG. 4 is a diagram illustrating keyword extraction settings according to an embodiment of the present invention;

FIG. 5 is an exemplary diagram of a patent vector in an embodiment of the present invention;

FIG. 6 is a diagram illustrating a patent clustering result according to an embodiment of the present invention;

FIG. 7 is a topographic map of a patent including a patent drawing point according to an embodiment of the present invention;

FIG. 8 is a topographical view of only a center plot in accordance with an embodiment of the present invention;

FIG. 9 is a schematic structural diagram of a device for drawing a topographic patent map based on structural text clustering in an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the embodiments of the present invention are further described in detail below with reference to the accompanying drawings. The exemplary embodiments and descriptions of the present invention are provided to explain the present invention, but not to limit the present invention.

The embodiment of the invention provides a patent topographic map drawing scheme based on structural text clustering. The scheme studies patent texts that have a structure, clusters the patent texts on that basis, and further studies the drawing algorithm of the cluster topographic map so that the map accurately expresses the corresponding physical meaning; related patent analysis can then be carried out on the basis of the topographic map. The scheme is described in detail below.

Fig. 1 is a schematic flow chart of a method for drawing a topographic patent map based on structured text clustering according to an embodiment of the present invention, and as shown in fig. 1, the method includes the following steps:

step 101: acquiring all target patent texts;

step 102: extracting key feature words from each target patent text according to different types of fields and preset weights corresponding to each type of field;

step 103: determining the intra-document weight of each key feature word within its patent text; determining the inter-document weight of each key feature word across all patent texts;

step 104: determining key feature words added into the clustering set according to the intra-document weight and the inter-document weight; clustering the target patent text according to the key feature words added into the clustering set to obtain a clustering result;

step 105: and drawing a patent topographic map according to the clustering result.

The patent topographic map drawing method for the structural text clustering provided by the embodiment of the invention can realize the accurate drawing of the patent topographic map based on the structural text clustering, thereby accurately reflecting the information such as the technical association degree, the technical dense points and the like of the patent technology. This is described in detail below with reference to fig. 2 to 8.

First, the above step 101 is described.

In specific implementation, a target patent text is a structured text that is to be analyzed and then clustered, and all the target patent texts together form a document set.

Next, the above step 102, i.e. extracting the feature information of the structured patent document, is described.

In specific implementation, as an experiment, the embodiment of the invention extracts keywords from 256 Chinese patents in the field of industrial robots. Patent text differs from informational text such as general news: because the patent application process as a whole is relatively standardized, the writing form and document structure of patent text are relatively fixed. Patents contain a large amount of fixed bibliographic information; the textual information that can participate in text analysis is shown in Table 1 below.

Table 1: patent text type information (field type) and corresponding information connotation thereof

These text fields are based on the indexing produced by primary processing of Chinese patent data. It can be seen that the information content corresponding to each field is relatively fixed and differs from field to field.

In an embodiment, as shown in fig. 2, extracting key feature words from each of the target patent texts according to different types of fields and preset weights corresponding to each type of field may include: extracting a key feature word corresponding to a target patent text according to the following method:

extracting candidate feature words from a target patent text according to different types of fields and preset weights corresponding to the fields of each type; i.e. the step of collecting candidate subject words in fig. 2;

calculating co-occurrence factors among candidate characteristic words extracted from a target patent text; i.e. the step of calculating co-occurrence factors in fig. 2;

determining the weight of the candidate characteristic words in the target patent text full text according to the co-occurrence factor; i.e. the step of calculating weights in fig. 2;

extracting the key feature words corresponding to the target patent text according to the weights of the candidate feature words in the full text of the target patent text; i.e. the step of normalizing the weights and taking the top 20 in fig. 2, in which the 20 candidate feature words with the highest weights after normalization are taken as the key feature words.
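The final selection step above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `top_key_feature_words` is hypothetical, and the normalization (dividing each weight by the sum of all weights) is an assumption, since the text does not state which normalization is used.

```python
def top_key_feature_words(candidate_weights, top_n=20):
    """Normalize candidate weights and keep the top_n highest-ranked words.

    candidate_weights maps each candidate feature word to its full-text weight.
    Assumption: normalization divides by the sum of all weights.
    """
    total = sum(candidate_weights.values())
    if total == 0:
        return []
    normalized = {word: weight / total for word, weight in candidate_weights.items()}
    # Sort by descending weight; ties are broken alphabetically for determinism.
    ranked = sorted(normalized.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


if __name__ == "__main__":
    weights = {"robot": 3.0, "arm": 1.0, "joint": 0.0}
    print(top_key_feature_words(weights, top_n=2))
```

Breaking ties alphabetically is a design choice made here only so that repeated runs give the same ranking.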

In specific implementation, in the step of collecting candidate subject words in fig. 2, the field types may be those listed in the information field column of Table 1 above. Specifically, the embodiment of the present invention extracts key feature words from four fields (four field types): title, abstract, main claim, and full text. The weight setting (preset weight) of each field is shown in fig. 4.

In specific implementation, in the step of calculating the co-occurrence factor in fig. 2, the co-occurrence factor between the candidate feature words extracted from one target patent text may be calculated according to formula (4) below. In the formulas, w_ip is the weight of the candidate feature word within each paragraph, w_p is the weight of the paragraph, w_ipf is the word frequency weight of the candidate feature word, and w_ipd is the co-occurrence factor.

In specific implementation, in the step of calculating the weight in fig. 2, the weight of the candidate feature word in the full text of the target patent text may be calculated according to formula (1) below. The calculation of w_ip in formula (1) is given by formula (2).

In specific implementation, in the final word selection step (the step of normalizing the weights and taking the top 20 in fig. 2), words are selected after the weights have been normalized, which can improve the precision and efficiency of word selection.

In specific implementation, the embodiment of extracting feature words shown in fig. 2 can improve the accuracy of extracting features, thereby improving the accuracy of subsequently drawing a topographic map of a patent. An example of a patent vector after extracting keywords may be as shown in fig. 5.

Third, step 103 above is described.

The main idea of step 103 in the embodiment of the present invention is to divide the weight of a feature word into two parts: the intra-document weight (w_l) and the inter-document weight (w_g). The intra-document weight is calculated from the distribution of the feature word inside the document, and the inter-document weight is calculated mainly from the occurrence of the feature word across the document set. The final weight is the product of the two: w = w_l × w_g.

1) The determinants of the weights within the document are: word frequency (frequency) + co-occurrence distance (co-location) + paragraph position (opportunity) + concept hierarchy (Similarity).

Since the patent text has a definite paragraph structure and different paragraphs differ in importance, in the embodiment of the present invention the weight of each paragraph is assigned subjectively, so that the weight of a feature word in the whole text is the sum of its weights in the individual paragraphs:

w_i = Σ_p w_ip; (1)

where w_i is the weight of a feature word (candidate feature word or key feature word) in the whole text, and w_ip is the weight of the feature word (candidate feature word or key feature word) in each paragraph.

From the above, in one embodiment, determining the intra-document weight of each key feature word in the patent text may include:

determining the weight of each key feature word in each paragraph;

and determining the weight of each key characteristic word in the document in the patent text according to the weight of each key characteristic word in each paragraph.

The embodiment of the invention mainly studies the weight distribution scheme within a paragraph. Assuming that the weight of a paragraph is w_p, the weight of a feature word within the paragraph can be expressed as:

w_ip = w_ipf × (1 + w_ipd) × w_p; (2)

where w_ip is the weight of the key feature word (or candidate feature word) within each paragraph, w_p is the weight of the paragraph, w_ipf is the word frequency weight of the key feature word (or candidate feature word), and w_ipd is the co-occurrence factor.

As can be seen from the above, in one embodiment, determining the weight of the key feature word in each paragraph may include determining the weight of the key feature word in each paragraph according to the above formula (2).
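As a worked illustration of formula (2) and of summing paragraph weights into a full-text weight, the following minimal sketch may help. The function names `paragraph_weight` and `full_text_weight` and the numeric inputs are hypothetical; the word frequency weight and co-occurrence factor are taken as given inputs rather than computed.

```python
def paragraph_weight(w_ipf, w_ipd, w_p):
    """Formula (2): weight of a feature word within one paragraph."""
    return w_ipf * (1.0 + w_ipd) * w_p


def full_text_weight(paragraph_terms):
    """Sum the per-paragraph weights of one feature word over all paragraphs.

    paragraph_terms is an iterable of (w_ipf, w_ipd, w_p) tuples,
    one tuple per paragraph in which the feature word is evaluated.
    """
    return sum(paragraph_weight(w_ipf, w_ipd, w_p)
               for w_ipf, w_ipd, w_p in paragraph_terms)


if __name__ == "__main__":
    # Hypothetical values: two paragraphs with different paragraph weights.
    terms = [(0.5, 0.2, 2.0), (0.25, 0.0, 1.0)]
    print(full_text_weight(terms))
```

Because the co-occurrence factor enters as (1 + w_ipd), a word with no co-occurrence (w_ipd = 0) keeps its plain frequency-based weight scaled by the paragraph weight.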

In a paragraph, the frequency of a word represents its weight, i.e., the higher the frequency, the greater the weight. That is, in one embodiment, the above method for drawing a patent topographic map based on structural text clustering may further include calculating the word frequency weight according to the following formula:

where w_ipf is the word frequency weight of the key feature word, f_ip is the number of occurrences of the key feature word in a paragraph, n is the total number of key feature words, and j is the index of the key feature word.

Meanwhile, the embodiment of the invention evaluates the degree of co-occurrence of words within paragraphs. Suppose that the co-occurrence distances of two feature words are d_1, d_2, d_3, …, d_m, respectively.

Then the co-occurrence factor of the two feature words can be defined as:

where w_ipd is the co-occurrence factor, d_j is the co-occurrence distance, m is the total number of feature words, and j is the index of the feature words.

As can be seen from the above, in an embodiment, the above method for drawing a patent terrain map based on structured text clustering may further include calculating a co-occurrence factor according to equation (4).

2) The decision factors for the inter-document weight are: document rate (concurrence).

In one embodiment, determining the inter-document weight of each key feature word in all patent texts may include:

determining the distribution condition of each key characteristic word in all patent texts;

and determining the inter-document weight of each key characteristic word in all patent texts according to the distribution condition of each key characteristic word in all patent texts.

In specific implementation, the inter-document weight means the following: if a feature word is uniformly distributed in the document set, i.e., it appears in many texts, it is considered to have weak ability to represent any particular text, and its inter-document weight is 0; if the feature word appears in only one text, it can be considered to have strong ability to represent that text, and its inter-document weight is the largest. That is, in one embodiment, determining the inter-document weight of each key feature word in all patent texts according to the distribution of each key feature word in all patent texts may include: the inter-document weight of each key feature word in all patent texts decreases as the number of patent texts in which that key feature word appears increases.

In specific implementation, the mean square error can be used to evaluate the distribution of a feature word across the documents:

Suppose the weights of the feature word T in the documents of the document set are w_k (k = 1, 2, …, |D|). The aim is to evaluate whether these weights are evenly distributed across the documents, so the distribution of the weights is calculated using the properties of the mean square error:

That is, the larger w_g is, the more the weights of the feature word differ between documents; if the feature word is uniformly distributed across the documents, w_g = 0 and the feature word is excluded from the clustering (i.e., the feature word is not added to the clustering set for cluster analysis in step 104). Considering the spatial sparsity of the feature words, this can be simplified as:

where w_g is the inter-document weight, w_k is the intra-document weight (i.e., the weight of the feature word in the k-th document), k is the index (order) of the document, w̄ is the average of the weights, and i is the index (order) of the weight within the document.
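The behaviour described above (an inter-document weight of 0 for an evenly distributed feature word, and a larger value the more unevenly the word is distributed) can be illustrated with a small sketch. It is an assumption that the measure is the population mean of squared deviations from the average weight; the exact formula and its simplified form are not reproduced here. The names `inter_document_weight` and `final_weight` are hypothetical.

```python
def inter_document_weight(weights_per_document):
    """Mean squared deviation of a feature word's weights across documents.

    weights_per_document holds w_k for k = 1..|D| (0.0 where the word is absent).
    Assumption: population variance is used as the mean square error.
    """
    count = len(weights_per_document)
    if count == 0:
        return 0.0
    mean = sum(weights_per_document) / count
    return sum((w - mean) ** 2 for w in weights_per_document) / count


def final_weight(w_l, w_g):
    """Final weight as the product of intra- and inter-document weights."""
    return w_l * w_g


if __name__ == "__main__":
    print(inter_document_weight([0.5, 0.5, 0.5, 0.5]))  # evenly distributed word
    print(inter_document_weight([1.0, 0.0, 0.0, 0.0]))  # word in a single document
```

A feature word whose inter-document weight is 0 also gets a final weight of 0, which matches its exclusion from the clustering set.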

Fourthly, for ease of understanding, steps 104 and 105 above are introduced together.

In step 104, the K-means text clustering algorithm may be adopted, and the patent clustering result may be as shown in fig. 6.
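For readers unfamiliar with K-means, the following is a minimal sketch of the standard Lloyd iteration applied to dense vectors. It is not the patented program: seeding the centroids with the first k points, the squared Euclidean distance, and the names `kmeans` and `squared_distance` are all assumptions made for illustration.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iter=100):
    """Cluster points into k groups; returns (assignments, centroids).

    Assumption: the first k points seed the centroids.
    """
    centroids = [list(p) for p in points[:k]]
    assignments = [-1] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments, centroids


if __name__ == "__main__":
    vectors = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (10.0, 11.0)]
    print(kmeans(vectors, k=2))
```

In practice each patent would be represented by its weighted key feature word vector, and the iteration limit corresponds to a maximum number of loops and a termination condition.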

In one embodiment, in the step 105, drawing a patent topographic map according to the clustering result may include:

mapping the feature vector corresponding to each key feature word in the clustering processing result to a pre-established polar coordinate axis of a corresponding angle, and calculating to obtain a polar coordinate corresponding to each feature vector;

converting the polar coordinates corresponding to each feature vector into Cartesian coordinates to obtain the centroid of the polygon enclosed by each feature vector; the centroid is the plane coordinate of each feature vector mapped onto the Cartesian coordinate system;

calculating the similarity of the cluster where each feature vector is located; the similarity is a Z coordinate of the corresponding feature vector;

and obtaining the patent topographic map according to the plane coordinate and the Z coordinate of each feature vector.

In specific implementation, the patent topographic map drawing algorithm may include:

The N-dimensional data space is mapped onto a plane for display using a polar coordinate transformation, as shown in fig. 3.

The N dimensions are distributed at equal angles around the circumference (2π), and the scale of each axis is set according to the actual value range of that dimension.

For any vector V_k = {v_i} (i = 0, 1, 2, …, N-1), the value of each dimension is mapped onto the coordinate axis at the corresponding angle θ_i, and the polar coordinate of that point is calculated:

(v_i, θ_i);

Converting it to Cartesian coordinates gives:

(v_i cosθ_i, v_i sinθ_i);

The centroid of the polygon enclosed by the vector V_k is then:

This centroid is the plane coordinate of the vector V_k mapped onto the Cartesian coordinate system. The similarity of the cluster in which the vector is located is taken as the Z coordinate of the point. This completes the design of the drop point of the vector on the patent map. The drawing result of the patent topographic map may be as shown in fig. 7 and fig. 8.
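The mapping from an N-dimensional vector to a plane point can be sketched as below. Two assumptions are made and should be read as such: axis i is placed at angle span × i / N (with span defaulting to the full circle, 2π), and the centroid is taken as the mean of the polygon's vertices. The function name `vector_to_plane` is hypothetical.

```python
import math


def vector_to_plane(vector, span=2 * math.pi):
    """Map an N-dimensional vector to a 2D point.

    Each value v_i is placed on an axis at angle span * i / N, converted to
    Cartesian coordinates (v_i cos θ_i, v_i sin θ_i), and the vertices are
    averaged. Assumption: the centroid is the mean of the vertices.
    """
    n = len(vector)
    if n == 0:
        return 0.0, 0.0
    xs = [v * math.cos(span * i / n) for i, v in enumerate(vector)]
    ys = [v * math.sin(span * i / n) for i, v in enumerate(vector)]
    return sum(xs) / n, sum(ys) / n


if __name__ == "__main__":
    # Over the full circle, equal values on opposing axes cancel out.
    print(vector_to_plane([1.0, 1.0, 1.0, 1.0]))
    # Restricting the axes to a quarter circle avoids that cancellation.
    print(vector_to_plane([1.0, 1.0, 1.0, 1.0], span=math.pi / 2))
```

The Z coordinate (the similarity of the cluster containing the vector) would be attached to the returned point separately.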

In specific implementation, the detailed implementation of drawing the patent map may include:

1) To prevent data in different dimensions that lie on opposing axes (such as 0° and 180°, or 90° and 270°) from cancelling each other out in the centroid, the feature vector axes are distributed over a 90° range, which serves as the entire vector coordinate space.

2) Calculate cluster (one of the clustering results) coordinates:

a) Calculate the cluster coordinates by polar coordinate transformation, taking the origin as the center.

b) Calculate the distance of each cluster coordinate from the origin.

c) Shrink all cluster coordinates by the same scale factor (the reciprocal of the farthest cluster distance from the origin), so that they all lie within the unit circle.
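Sub-steps b) and c) of the cluster coordinate calculation can be sketched as follows, assuming the cluster coordinates have already been obtained as 2D points. The function name `scale_to_unit_circle` is hypothetical.

```python
import math


def scale_to_unit_circle(cluster_coords):
    """Shrink 2D cluster coordinates so the farthest one lies on the unit circle."""
    if not cluster_coords:
        return []
    # Distance of each cluster coordinate from the origin; keep the largest.
    farthest = max(math.hypot(x, y) for x, y in cluster_coords)
    if farthest == 0:
        return list(cluster_coords)  # all clusters sit at the origin
    # Dividing by the farthest distance equals multiplying by its reciprocal.
    return [(x / farthest, y / farthest) for x, y in cluster_coords]


if __name__ == "__main__":
    print(scale_to_unit_circle([(3.0, 4.0), (0.0, 1.0)]))
```

Uniform scaling preserves the relative distances between clusters, so the layout keeps its shape while fitting inside the unit circle.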

3) Calculating the patent coordinates:

a) calculate the coverage radius of each cluster: 1/2 of the distance between adjacent nearest clusters.

b) And calculating the patent coordinates according to a polar coordinate transformation mode by taking the cluster coordinates where the patents are located as the center.

c) Contract all patent coordinates to within the coverage radius of the cluster according to the similarity between each patent and its cluster, namely:

Fifthly, to facilitate a comprehensive understanding, the main interface design of the patent clustering and mapping algorithm program is introduced below.

a) Extracting subject words

Function: analyze the text content, extract the subject words of the patent, and evaluate the contribution weight of each subject word to the subject of the full text.

Input: the title, abstract, main claim, body text, and other contents of the patent document, together with the weight of each section.

Output: the keywords of the patent, with their respective weights and concept groupings.

b) Clustering function

Function: automatically group a collection of patent documents by topic similarity.

Input: the ID of each patent document, its subject words, the weights and concept groups of the subject words, a reference word list, the number of clusters, whether to calculate coordinates, the maximum number of iterations, the clustering termination condition, and the number of worker threads.

Output: the subject word vector of each cluster, the patent documents contained in each cluster, and the distance between each patent document and the center of its cluster.

c) Comparing similarity

Function: compare the similarity of two word vectors.

Input: the two vectors to be compared.

Output: the similarity between the vectors.
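The similarity comparison interface can be illustrated with a short sketch. The text does not say which similarity measure is used; cosine similarity over sparse word-weight vectors is an assumption chosen here because it is a common choice for text vectors. The function name `cosine_similarity` and the sample vectors are hypothetical.

```python
import math


def cosine_similarity(vec_a, vec_b):
    """Cosine similarity of two sparse word vectors given as {word: weight} dicts."""
    dot = sum(weight * vec_b.get(word, 0.0) for word, weight in vec_a.items())
    norm_a = math.sqrt(sum(weight * weight for weight in vec_a.values()))
    norm_b = math.sqrt(sum(weight * weight for weight in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty or all-zero vector is similar to nothing
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    patent_a = {"robot": 0.6, "arm": 0.4}
    patent_b = {"robot": 0.5, "gripper": 0.5}
    print(cosine_similarity(patent_a, patent_b))
```

Storing vectors as dictionaries keeps only the words that actually occur, which suits the sparse feature word space.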

Therefore, the method for drawing the patent topographic map based on the structural text clustering provided by the embodiment of the invention well achieves the following purposes:

1) Field-segmented extraction of patent subject words and vectorized representation of patents. This is the basis for text clustering of patents. Because a segmented extraction method and a professional lexicon are used, the amount of non-technical vocabulary in the patent vectors is greatly reduced.

2) Text clustering of patents. Based on the vectorization method specific to patents, the result of patent text clustering is closer to the result of patent technology classification.

3) Drawing of the patent topographic map. The drawing handles well the calculation of three kinds of distances: between category center points, between category center points and patent points, and between patent points. On the basis of an overall uniform distribution of the center points, the first two of these three distance relations reflect text similarity as far as possible. Meanwhile, the density of the patent points faithfully reflects the distribution of technical research.

The embodiment of the invention also provides a device for drawing the patent topographic map based on structural text clustering, as described in the following embodiment. Because the principle by which the device solves the problem is similar to that of the method for drawing the patent topographic map based on structural text clustering, the implementation of the device can refer to the implementation of the method, and repeated details are omitted.

Fig. 9 is a schematic structural diagram of a device for drawing a topographic patent map based on structured text clustering according to an embodiment of the present invention, as shown in fig. 9, the device includes:

the acquiring unit 01 is used for acquiring all target patent texts;

the extracting unit 02 is used for extracting key feature words from each target patent text according to different types of fields and preset weights corresponding to each type of field;

the weight determining unit 03 is used for determining the intra-document weight of each key feature word within its patent text, and determining the inter-document weight of each key feature word across all patent texts;

the processing unit 04 is configured to determine a key feature word added to the cluster set according to the intra-document weight and the inter-document weight; clustering the target patent text according to the key feature words added into the clustering set to obtain a clustering result;

and the drawing unit 05 is used for drawing the patent topographic map according to the clustering processing result.

In an embodiment, the extracting unit may be specifically configured to: extracting a key feature word corresponding to a target patent text according to the following method:

extracting candidate feature words from a target patent text according to different types of fields and preset weights corresponding to the fields of each type;

calculating co-occurrence factors among candidate characteristic words extracted from a target patent text;

determining the weight of the candidate characteristic words in the target patent text full text according to the co-occurrence factor;

and extracting a key feature word corresponding to the target patent text according to the weight of the candidate feature word in the target patent text full text.

In an embodiment, the weight determining unit may be specifically configured to:

determining the weight of each key feature word in each paragraph;

In an embodiment, the weight determining unit may be specifically configured to determine the weight of the key feature word in each paragraph according to the following formula:

w_ip = w_ipf × (1 + w_ipd) × w_p;

where w_ip is the weight of the key feature word within each paragraph, w_p is the weight of the paragraph, w_ipf is the word frequency weight of the key feature word, and w_ipd is the co-occurrence factor.

In one embodiment, the weight determining unit may be specifically configured so that the inter-document weight of each key feature word in all patent texts decreases as the number of patent texts in which that key feature word appears increases.

In an embodiment, the drawing unit may be specifically configured to:

As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

The above embodiments further illustrate the objects, technical solutions, and advantages of the present invention in detail. It should be understood that they are only exemplary embodiments of the present invention and are not intended to limit its scope; any modification, equivalent substitution, or improvement made within the spirit and principle of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A patent topographic map drawing method based on structural text clustering is characterized by comprising the following steps:

acquiring all target patent texts;

and drawing a patent topographic map according to the clustering result.

2. The method for drawing a patent topographic map based on structural text clustering as claimed in claim 1, wherein the extracting key feature words from each of the target patent texts according to different types of fields and preset weights corresponding to each type of field comprises: extracting a key feature word corresponding to a target patent text according to the following method:

3. The method for drawing a patent topographic map based on structural text clustering as claimed in claim 1, wherein the determining the intra-document weight of each key feature word in the patent text comprises:

determining the weight of each key feature word in each paragraph;

4. The method for drawing a patent topographic map based on structural text clustering as claimed in claim 3, wherein the determining the weight of each key feature word in each paragraph comprises: determining the weight of the key feature word in each paragraph according to the following formula:

w_ip = w_ipf × (1 + w_ipd) × w_p;

5. The method for drawing a patent topographic map based on structural text clustering as claimed in claim 1, wherein the determining the inter-document weight of each key feature word in all patent texts comprises:

6. The method for drawing a patent topographic map based on structural text clustering as claimed in claim 5, wherein the determining the inter-document weight of each key feature word in all patent texts according to the distribution of each key feature word in all patent texts comprises: the inter-document weight of each key feature word in all patent texts decreases as the number of patent texts in which that key feature word is distributed increases.

7. The method for drawing a patent topographic map based on structural text clustering as claimed in claim 1, wherein the drawing a patent topographic map according to the clustering result comprises:

8. A patent topographic map drawing device based on structural text clustering is characterized by comprising the following components:

the acquisition unit is used for acquiring all target patent texts;

9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of any one of claims 1 to 7 when executing the computer program.

10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program for executing the method of any one of claims 1 to 7.