Summary of the invention
The present invention provides a method and apparatus for processing intellectual property information. It addresses the technical problem of how to process intellectual property information intelligently, so that an enterprise can gain a forward-looking understanding of the technology development and patent distribution in its industry.
The present invention achieves intelligent processing of intellectual property information through the following steps:
Step 1: Build a patent information library. According to the International Patent Classification (IPC) table, construct a multiway tree index of classification numbers and allocate a storage space to each classification number for storing its classification number semantic vector. Keywords are extracted in advance from the definition of each classification number and stored in the classification number semantic vector, thereby constructing a classification number index table. The classification number semantic vectors in the index table are then supplemented according to the patent information crawled from a patent database. For each crawled patent, a patent semantic vector is generated using a text vector generation method, thereby generating the patent information library;
Step 2: Crawl the enterprise's own patent information from the patent database and form an own patent information library using the method of Step 1;
Step 3: Set competitor information, crawl the competitors' patent information from the patent database, and form a competitor patent information library;
Step 4: Cross-compare the own patent information library with the competitor patent information library according to the classification number index, to obtain the directional similarity and the patent similarity. The cross-comparison includes: traversing each patent under the same classification number in the two information libraries and calculating the patent similarity from the patent semantic vectors; and, for the same classification number in the two information libraries, calculating the directional similarity from the classification number semantic vectors;
Step 5: Visualize the ranking of the directional similarity, so that the user can adjust the R&D direction according to the ranking; and provide an expected grant value for each patent according to the patent similarity, which the user can refer to when deciding how to handle the patent.
In addition, for a technical solution to be filed, the present invention further includes extracting the text vector of the solution, calculating its similarity to each classification number semantic vector, taking the classification numbers whose similarity exceeds a certain threshold as the recommended classification numbers of the solution, and determining, according to the directional similarity of the recommended classification numbers, whether to file a patent application for the solution.
The present application analyzes the patent distribution of an enterprise and its competitors from two perspectives: the classification number and the individual patent. The calculated directional similarity and patent similarity guide the enterprise in adjusting its own R&D direction and patent distribution. The application can also intelligently analyze the likelihood that a technical solution to be filed will be patentable, which helps the enterprise avoid unnecessary application fees and achieves intelligent processing and analysis of intellectual property information.
Detailed description of the embodiments
The embodiments are described in detail below with reference to the accompanying drawings.
Figure 1 shows the flow chart of the method of the present invention:
Step 1: Build a patent information library. According to the International Patent Classification (IPC) table, construct a multiway tree index of classification numbers and allocate a storage space to each classification number for storing its classification number semantic vector. Keywords are extracted in advance from the definition of each classification number and stored in the classification number semantic vector, thereby constructing a classification number index table. The classification number semantic vectors in the index table are then supplemented according to the patent information crawled from a patent database. For each crawled patent, a patent semantic vector is generated using a text vector generation method, thereby generating the patent information library;
When the multiway tree index is built from the IPC classification table, the root nodes correspond to the sections of the classification. A root node is created for each section, and the tree is then divided successively in the order of the IPC class, subclass, main group, and subgroup. A classification number semantic vector is generated for each node and updated in real time, so that it can guide the subsequent classification of other patent information. The classification number index table may also be constructed for only a part of the classification as needed, for example depending on the specific research field of the enterprise, in order to reduce the amount of data to be processed.
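The tree construction described above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the names IpcNode, ipc_path, and insert are hypothetical, the regular expression assumes full symbols written like "G06F 16/35", and each semantic vector is assumed to be a keyword-to-weight dictionary.

```python
import re
from dataclasses import dataclass, field

# Assumed input format: section letter, two-digit class, subclass letter,
# main group number, "/" and subgroup number, e.g. "G06F 16/35".
IPC_PATTERN = re.compile(r"^([A-H])(\d{2})([A-Z])\s*(\d{1,4})/(\d{2,6})$")


@dataclass
class IpcNode:
    symbol: str                                          # e.g. "G", "G06", "G06F16/35"
    semantic_vector: dict = field(default_factory=dict)  # keyword -> weight
    children: dict = field(default_factory=dict)         # symbol -> IpcNode


def ipc_path(symbol: str) -> list:
    """Return the node symbols from section down to (sub)group for one IPC symbol."""
    match = IPC_PATTERN.match(symbol.strip().upper())
    if match is None:
        raise ValueError(f"not a full IPC symbol: {symbol!r}")
    section, cls, subclass, main_group, subgroup = match.groups()
    prefix = section + cls + subclass
    path = [section, section + cls, prefix, f"{prefix}{main_group}/00"]
    if subgroup.strip("0"):  # a main group itself ends in "/00"
        path.append(f"{prefix}{main_group}/{subgroup}")
    return path


def insert(roots: dict, symbol: str) -> IpcNode:
    """Create any missing nodes along the path and return the deepest node."""
    level, node = roots, None
    for part in ipc_path(symbol):
        node = level.setdefault(part, IpcNode(part))
        level = node.children
    return node


roots = {}
insert(roots, "G06F 16/35")
insert(roots, "G06F 16/33")
```

Only the symbols that are actually inserted produce nodes, which matches the option of building the index for just the enterprise's research field.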
When the classification number semantic vector is supplemented, the classification number information provided by each patent is used to locate the corresponding classification number semantic vector, which is then supplemented and updated with the keywords extracted from the patent. The weight of a keyword in the semantic vector is adjusted according to the source of the keyword. Keyword sources include the abstract, the background art, the claims, and the description. The abstract conveys the inventive point of the patent, and the background art better reflects the technical field of the patent; therefore, keywords extracted from the abstract and the background art are given higher weights.
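A source-weighted update of this kind might look like the sketch below. The numeric weights are assumptions chosen for illustration; the text only states that keywords from the abstract and the background art receive higher weights. The function name update_class_vector and the dictionary representation of the vector are likewise hypothetical.

```python
from collections import Counter

# Hypothetical weights: only the ordering (abstract and background higher)
# comes from the description, not the actual values.
SOURCE_WEIGHTS = {
    "abstract": 2.0,
    "background": 2.0,
    "claims": 1.0,
    "description": 1.0,
}


def update_class_vector(class_vector: dict, keywords_by_source: dict) -> dict:
    """Add one patent's keywords to a classification number semantic vector.

    class_vector maps keyword -> accumulated weight and is updated in place.
    keywords_by_source maps a source name to the list of keywords found there.
    """
    for source, keywords in keywords_by_source.items():
        weight = SOURCE_WEIGHTS.get(source, 1.0)
        for keyword, count in Counter(keywords).items():
            class_vector[keyword] = class_vector.get(keyword, 0.0) + weight * count
    return class_vector


vector = {"retrieval": 1.0}
update_class_vector(
    vector,
    {"abstract": ["retrieval", "index"], "claims": ["index", "index"]},
)
```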
The text vector generation method may be any well-known text vector generation method, for example a neural network or Doc2vec.
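For illustration only, the sketch below uses a hashed bag-of-words vector as a simple stand-in for such a method. It is not Doc2vec or a neural network; it merely shows the interface that the later steps rely on, namely text in and a fixed-length vector out. The function name text_vector and the dimension of 64 are assumptions.

```python
import hashlib
import math
import re


def text_vector(text: str, dim: int = 64) -> list:
    """Map text to a unit-length vector by hashing each token into one of dim buckets."""
    vec = [0.0] * dim
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        digest = hashlib.md5(token.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    # An empty text has no tokens and stays the zero vector.
    return [v / norm for v in vec] if norm else vec


patent_vector = text_vector("A method for building a patent classification index")
```

In practice a trained embedding model would replace this function, and the rest of the pipeline would stay unchanged as long as every vector has the same length.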
Step 2: Crawl the enterprise's own patent information from the patent database and form an own patent information library using the method of Step 1;
The own patent information library contains the patent information formed by the own classification number index table and the own patent semantic vectors. The classification number semantic vectors in the own classification number index table have been updated according to the own patent information.
Step 3: Set competitor information, crawl the competitors' patent information from the patent database, and form a competitor patent information library using the method of Step 1;
The competitor patent information library contains the patent information formed by the competitor classification number index table and the competitor patent semantic vectors. The classification number semantic vectors in the competitor classification number index table have been updated according to the competitor patent information.
Step 4: Cross-compare the own patent information library with the competitor patent information library according to the classification number index table, to obtain the directional similarity and the patent similarity. The cross-comparison includes: traversing each patent under the same classification number in the two information libraries and calculating the patent similarity from the patent semantic vectors; and, for the same classification number in the two information libraries, calculating the directional similarity from the classification number semantic vectors.
For example, for classification number A, suppose the own patent information library contains patents a1, b1, and c1, and the competitor patent information library contains patents a2 and b2. The patent semantic vectors of the pairs (a1, a2), (a1, b2), (b1, a2), (b1, b2), (c1, a2), and (c1, b2) are then compared, and the resulting vector similarities define the patent similarity. The similarity between the classification number semantic vector of classification number A in the own patent information library and that of classification number A in the competitor patent information library defines the directional similarity.
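The example for classification number A can be sketched as follows. Cosine similarity is assumed as the similarity measure, since the text does not name one, and the library layout (a class_vector plus a patents dictionary per classification number) and the function names are hypothetical.

```python
import math
from itertools import product


def cosine(u: list, v: list) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cross_compare(own: dict, rival: dict) -> dict:
    """Compare two libraries of the form
    {classification number: {"class_vector": [...], "patents": {patent id: [...]}}}.
    Only classification numbers present in both libraries are compared.
    """
    result = {}
    for code in sorted(own.keys() & rival.keys()):
        own_patents, rival_patents = own[code]["patents"], rival[code]["patents"]
        result[code] = {
            # one entry per (own patent, competitor patent) pair
            "patent_similarity": {
                (a, b): cosine(own_patents[a], rival_patents[b])
                for a, b in product(own_patents, rival_patents)
            },
            "directional_similarity": cosine(
                own[code]["class_vector"], rival[code]["class_vector"]
            ),
        }
    return result


own = {"A": {"class_vector": [1.0, 1.0],
             "patents": {"a1": [1.0, 0.0], "b1": [0.0, 1.0], "c1": [1.0, 1.0]}}}
rival = {"A": {"class_vector": [1.0, 1.0],
               "patents": {"a2": [1.0, 0.0], "b2": [0.0, 1.0]}}}
report = cross_compare(own, rival)
```

With three own patents and two competitor patents under classification number A, the report holds the six pairs listed in the example.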
The directional similarity can be divided into main-group similarity and subgroup similarity. Depending on how much the competitor's research field overlaps with the enterprise's own, a parameter can be adjusted dynamically to calculate either the main-group similarity or the subgroup similarity, which gives the enterprise finer-grained guidance on its R&D direction. Here, main group and subgroup refer to the main group and subgroup levels of the classification table.
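One way to switch between the two granularities is sketched below. It assumes that a main-group vector can be obtained by summing the keyword weights of its subgroups, which the text does not specify, and that symbols are written like "G06F16/35". The level parameter stands in for the dynamically adjustable parameter, and all function names are hypothetical.

```python
import math
from collections import defaultdict


def to_main_group(symbol: str) -> str:
    """Map a subgroup symbol such as "G06F16/35" to its main group "G06F16/00"."""
    return symbol.split("/")[0] + "/00"


def group_vectors(class_vectors: dict, level: str) -> dict:
    """Aggregate {symbol: {keyword: weight}} at the "main_group" or "subgroup" level."""
    if level not in ("main_group", "subgroup"):
        raise ValueError(f"unknown level: {level!r}")
    grouped = defaultdict(dict)
    for symbol, vector in class_vectors.items():
        key = to_main_group(symbol) if level == "main_group" else symbol
        for keyword, weight in vector.items():
            grouped[key][keyword] = grouped[key].get(keyword, 0.0) + weight
    return dict(grouped)


def sparse_cosine(u: dict, v: dict) -> float:
    """Cosine similarity of two keyword -> weight dictionaries."""
    dot = sum(weight * v.get(keyword, 0.0) for keyword, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def directional_similarity(own: dict, rival: dict, level: str = "subgroup") -> dict:
    """Directional similarity for every group present in both libraries."""
    own_g, rival_g = group_vectors(own, level), group_vectors(rival, level)
    return {key: sparse_cosine(own_g[key], rival_g[key])
            for key in sorted(own_g.keys() & rival_g.keys())}


own = {"G06F16/35": {"index": 1.0}}
rival = {"G06F16/33": {"index": 2.0}}
coarse = directional_similarity(own, rival, level="main_group")
fine = directional_similarity(own, rival, level="subgroup")
```

In this toy input the two libraries share no subgroup, so the fine-grained comparison is empty, while the main-group comparison still reveals the overlap.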
Step 5: Visualize the ranking of the directional similarity, so that the user can adjust the R&D direction according to the ranking; and provide an expected grant value for each patent according to the patent similarity, which the user can refer to when deciding how to handle the patent.
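A rough sketch of this step follows. The ranking is a plain descending sort rendered as a text bar chart. The text does not give a formula for the expected grant value, so the rule used here (one minus the highest similarity to any competitor patent) is purely a placeholder assumption, as are all function names.

```python
def rank_directions(directional_similarity: dict) -> list:
    """Sort (classification number, similarity) pairs from most to least similar."""
    return sorted(directional_similarity.items(), key=lambda item: item[1], reverse=True)


def text_chart(ranking: list, width: int = 20) -> list:
    """Render the ranking as text bars, one line per classification number."""
    return [f"{code:<12}{'#' * round(score * width)} {score:.2f}"
            for code, score in ranking]


def expected_grant_value(pair_similarity: dict) -> dict:
    """Hypothetical rule: 1 - max similarity of an own patent to any competitor patent.

    pair_similarity maps (own patent id, competitor patent id) -> similarity in [0, 1].
    """
    highest = {}
    for (own_id, _), similarity in pair_similarity.items():
        highest[own_id] = max(highest.get(own_id, 0.0), similarity)
    return {own_id: 1.0 - similarity for own_id, similarity in highest.items()}


ranking = rank_directions({"G06F16/00": 0.5, "H04L9/00": 0.9})
for line in text_chart(ranking):
    print(line)
```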
In Step 3, the competitor information can be adjusted dynamically.
In another embodiment, for a technical solution to be filed, the present invention further includes extracting the text vector of the solution, calculating its similarity to each classification number semantic vector, taking the classification numbers whose similarity exceeds a certain threshold as the recommended classification numbers of the solution, and determining, according to the directional similarity of the recommended classification numbers, whether to file a patent application for the solution.
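The recommendation step might be sketched as below. Cosine similarity and both threshold values are assumptions, and the text does not state how the directional similarity drives the filing decision, so the sketch only flags recommended classification numbers whose directional similarity is high and leaves the decision to the user. The function and field names are hypothetical.

```python
import math


def cosine(u: list, v: list) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend_classifications(solution_vector: list, class_vectors: dict,
                              directional_similarity: dict,
                              similarity_threshold: float = 0.5,
                              review_threshold: float = 0.8) -> list:
    """Return recommended classification numbers, most similar first.

    Both thresholds are illustrative. needs_review marks a classification number
    in which the competitor's direction closely matches the enterprise's own.
    """
    recommendations = []
    for code, class_vector in class_vectors.items():
        similarity = cosine(solution_vector, class_vector)
        if similarity > similarity_threshold:
            direction = directional_similarity.get(code, 0.0)
            recommendations.append({
                "code": code,
                "similarity": similarity,
                "directional_similarity": direction,
                "needs_review": direction > review_threshold,
            })
    return sorted(recommendations, key=lambda r: r["similarity"], reverse=True)


recommended = recommend_classifications(
    [1.0, 0.0],
    {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [1.0, 1.0]},
    {"A": 0.9, "C": 0.1},
)
```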
Figure 2 shows a specific embodiment of the present invention: an intellectual property information processing apparatus, comprising the following modules:
An information library construction module, for building a patent information library: according to the International Patent Classification (IPC) table, constructing a multiway tree index of classification numbers and allocating a storage space to each classification number for storing its classification number semantic vector; extracting keywords in advance from the definition of each classification number and storing them in the classification number semantic vector, thereby constructing a classification number index table; supplementing the classification number semantic vectors in the index table according to the patent information crawled from a patent database; and, for each crawled patent, generating a patent semantic vector using a text vector generation method, thereby generating the patent information library;
An own patent information library generation module, for crawling the enterprise's own patent information from the patent database and forming an own patent information library using the method of Step 1;
A competitor patent information library generation module, for setting competitor information, crawling the competitors' patent information from the patent database, and forming a competitor patent information library;
A similarity calculation module, for cross-comparing the own patent information library with the competitor patent information library according to the classification number index, to obtain the directional similarity and the patent similarity; wherein the cross-comparison includes traversing each patent under the same classification number in the two information libraries and calculating the patent similarity from the patent semantic vectors, and, for the same classification number in the two information libraries, calculating the directional similarity from the classification number semantic vectors;
A visualization module, for visualizing the ranking of the directional similarity, so that the user can adjust the R&D direction according to the ranking; and for providing an expected grant value for each patent according to the patent similarity, which the user can refer to when deciding how to handle the patent.
When supplementing the classification number semantic vector, the information library construction module uses the classification number information provided by each patent to locate the corresponding classification number semantic vector, and supplements and updates it with the keywords extracted from the patent. The weight of a keyword in the semantic vector is adjusted according to the source of the keyword. Keyword sources include the abstract, the background art, the claims, and the description; keywords extracted from the abstract and the background art are given higher weights.
In the competitor patent information library generation module, the competitor information can be adjusted dynamically.
In the similarity calculation module, the directional similarity can be divided into main-group similarity and subgroup similarity.
In another embodiment, the apparatus further includes an analysis module which, for a technical solution to be filed, extracts the text vector of the solution, calculates its similarity to each classification number semantic vector, takes the classification numbers whose similarity exceeds a certain threshold as the recommended classification numbers of the solution, and determines, according to the directional similarity of the recommended classification numbers, whether to file a patent application for the solution.
The embodiments described above are merely preferred embodiments of the present invention, and the protection scope of the present invention is not limited thereto. Any variation or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.