CN109408782A - Method for detecting evolution behaviors of research hotspots based on KL distance similarity measurement - Google Patents
- Publication number
- CN109408782A (application CN201811216206.2A)
- Authority
- CN
- China
- Prior art keywords
- topic
- publication
- time slice
- theme
- time
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/194—Calculation of difference between files
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
Abstract
The invention discloses a method for detecting the evolution behaviors of research hotspots based on KL distance similarity measurement. It combines the topicality and the time order of publications to propose a time-series publication topic model, TS-JTM, which extracts the temporal hotspots of academic journals. On this basis, a topic-snapshot model of publication research hotspot evolution over the time series is established. Using the KL distance between probability distributions as the similarity measure, a method is proposed for detecting topic evolution behaviors between topic snapshots at adjacent moments, enabling fine-grained analysis of how research hotspots in a publication evolve.
Description
Technical Field
The invention belongs to the technical field of literature topic analysis and detection, and particularly relates to a method for detecting the evolution behaviors of research hotspots based on KL distance similarity measurement.
Background
As scientific research advances, the research hotspots of an academic field change. Interpenetration among disciplines and the application of new technologies drive this change over time: some old research problems disappear, new ones continually emerge, and others split or fuse with other problems. These behaviors shape the development of academic research hotspots. It is therefore necessary to analyze how the research hotspots of an academic field develop and to trace their trajectories in order to predict their trends. Such analysis helps scholars understand the current hot research problems and helps researchers and research managers grasp the patterns of scientific development. The results and progress of researchers are reflected in the papers published in academic publications. These publications collect large numbers of research results by category, and their periodic issues essentially record the development of the journal's research field. Extracting a journal's research hotspots and discovering how they evolve over time is therefore highly meaningful.
In literature topic analysis, the Author-Topic Model (ATM) is a commonly used topic clustering method. ATM models the interests of a document's authors and can be used to analyze their academic preferences[1]. It is a three-layer Bayesian probability model comprising words, topics, and author interests. The model can be mapped directly onto a publication topic model: a publication selects a topic with a certain probability, and the topic generates topic words with a certain probability. However, the evolution of topics over time is an important factor in topic extraction, and the author-topic model does not consider time. When it is applied directly to the corpus of each time slice, each slice has independent model parameters with no temporal dependency, so the influence of topic change over time is ignored and the uncertainty in assigning topic words to topics increases. The DTM (Dynamic Topic Model) was proposed by Blei on the basis of the LDA (Latent Dirichlet Allocation) model[2]. However, DTM does not model publications, so it cannot obtain the topics contained in each publication of a literature data set and their evolution over time, and thus cannot meet the needs of publication topic research.
Therefore, the prior art offers no effective means of detecting evolution behaviors based on the time-series topics of publications.
Disclosure of Invention
In view of the defects of the prior art, the invention aims to provide a method for detecting the evolution behaviors of research hotspots based on KL distance similarity measurement. A time-series publication topic model, TS-JTM (Time Sequence Journal Topic Model), is proposed by combining the topicality and the time order of publications. Temporal topics are extracted from publications with TS-JTM, and topic evolution is measured by KL-distance topic similarity, so that the continuation, new generation, splitting, fusion, and extinction of topics can be detected.
A method for detecting evolution behaviors of research hotspots based on KL distance similarity measurement comprises the following steps:
Step 1: acquire publication documents, and construct a topic-word corpus with time attributes based on the publication time of the documents;
time slices are divided by document publication time; the topic-word corpus consists of the data sets of the individual time slices, and the data set of each time slice consists of the document feature vectors of the publication documents published in that time slice:
$$C_t = \{(w_1, j_1), (w_2, j_2), \ldots, (w_{n_1}, j_{n_1})\}, \qquad w_i = \{c_1, c_2, \ldots, c_{n_2}\}$$
where $C_t$ is the data set on time slice $t$, $(w_i, j_i)$ is the document feature vector of publication document $i$, $w_i$ is the set of feature words of publication document $i$, $j_i$ is the publication to which publication document $i$ belongs, $c_i$ is the $i$-th feature word in the feature-word set, $n_1$ is the number of publication documents on time slice $t$, and $n_2$ is the number of feature words of publication document $i$;
the feature words of a publication document are obtained by applying word segmentation to the content of the document;
Step 2: construct a time-series publication topic model based on the topicality and time order of publications;
each time slice in the time-series publication topic model corresponds to one publication topic model; for two adjacent time slices, the Dirichlet prior parameter α of the publication-topic distribution θ and the Dirichlet prior parameter β of the topic-word distribution φ in the later slice's publication topic model are associated with the two Dirichlet prior parameters α and β of the earlier slice;
Step 3: based on the publication topic model on each time slice of the time-series publication topic model, extract topics in sequence from the data set of the matching time slice to obtain the publication-topic distribution and the topic-word distribution on each time slice;
Step 4: obtain the topics and topic-word distributions of the publication under test on each time slice, calculate from the topic-word distributions the KL distance between any two topics of the same publication under test on adjacent time slices, and obtain the evolution behavior of each topic in the publication under test based on a topic-snapshot model of publication research hotspot evolution;
the topic-snapshot model of publication research hotspot evolution comprises detection rules for five types of evolution behavior, namely topic continuation, new generation, extinction, splitting, and fusion. Each type is identified from the similarity of topics on adjacent time slices and from the characteristics of the evolution behavior, which are expressed in terms of that similarity; the similarity of two topics is measured by the KL distance.
On the one hand, the invention provides a topic-snapshot model of publication research hotspot evolution. It uses the KL distance to measure the similarity between two topics of the same publication under test on two adjacent time slices, covers detection rules for the continuation, new generation, splitting, fusion, and extinction behaviors in topic evolution, and thereby detects the evolution behaviors of time-series topics in the publication under test. The evolution behaviors are characterized as follows:
① continuation: a topic of the current time slice continues into the next time slice, so it is very similar to exactly one topic of the next time slice and dissimilar to the others;
② new generation: a topic of the current time slice has no connection with the topics of the previous time slice, so it is dissimilar to all topics of the previous time slice;
③ splitting: a topic of the previous time slice divides into several topics, so two or more topics of the current time slice are similar to that topic;
④ fusion: several topics of the previous time slice merge into one, so a topic of the current time slice is similar to two or more topics of the previous time slice;
⑤ extinction: a topic of the current time slice does not continue, so it is dissimilar to all topics of the next time slice.
The detection rules for these behaviors are derived from the similarity of topics of the same publication on adjacent time slices, so they reflect the topic evolution process.
On the other hand, the time-series publication topic model is constructed from the topicality and time order of publications. It accounts for the change of topics over time and links the publication topic models of adjacent time slices by parameter passing, which reduces the uncertainty in assigning topic words to topics and lowers the perplexity of the model. In addition, the model targets the publications in a literature data set, and a publication represents the topics of a subject field more strongly than an author does. Compared with the conventional author-topic model ATM and the conventional DTM model, the time-series publication topic model therefore better meets the needs of research on publication topic evolution.
Further preferably, the topic-snapshot model of publication research hotspot evolution comprises the following detection rules:
a: when the KL distance between topic i on time slice t and one topic on the next time slice t+1 is less than the similarity threshold, and the KL distances between topic i and the remaining topics on time slice t+1 are greater than or equal to the similarity threshold, topic i continues in the next time slice t+1;
b: when the KL distance between topic i on time slice t and every topic on the previous time slice t-1 is greater than the similarity threshold, topic i is a new topic on time slice t;
c: when the KL distance between topic i on time slice t and every topic on the next time slice t+1 is greater than the similarity threshold, topic i does not continue into time slice t+1 and dies out;
d: when the KL distances between topic i on time slice t and at least two topics on the next time slice t+1 are all less than the similarity threshold, topic i splits into multiple topics in time slice t+1;
e: when the KL distances between topic i on time slice t and at least two topics on the previous time slice t-1 are all less than the similarity threshold, topic i on time slice t is formed by the fusion of multiple topics of time slice t-1.
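The five rules can be illustrated with a short Python sketch. It assumes a precomputed matrix `kl[i][j]` holding the KL distance from topic i on the earlier slice to topic j on the later slice; the function name, the matrix layout, and the label `"inherited"` (a later topic with exactly one similar predecessor) are hypothetical and not part of the source. Distances equal to the threshold are treated as dissimilar, which is one reading of the rules.

```python
def classify_evolution(kl, threshold=0.4):
    """Apply rules a-e to one pair of adjacent time slices.

    kl[i][j] is the KL distance from topic i on slice t to topic j on
    slice t+1 (layout assumed for this sketch). Returns two dicts:
    behavior of each earlier topic, and origin of each later topic.
    """
    n_prev = len(kl)
    n_next = len(kl[0]) if kl else 0

    forward = {}  # rules a, c, d: what happens to each topic of slice t
    for i in range(n_prev):
        similar = [j for j in range(n_next) if kl[i][j] < threshold]
        if not similar:
            forward[i] = "extinction"      # rule c
        elif len(similar) == 1:
            forward[i] = "continuation"    # rule a
        else:
            forward[i] = "split"           # rule d

    backward = {}  # rules b, e: where each topic of slice t+1 comes from
    for j in range(n_next):
        similar = [i for i in range(n_prev) if kl[i][j] < threshold]
        if not similar:
            backward[j] = "new"            # rule b
        elif len(similar) >= 2:
            backward[j] = "fusion"         # rule e
        else:
            backward[j] = "inherited"      # hypothetical label, one predecessor
    return forward, backward


# Toy distances, invented for illustration only.
kl = [
    [0.1, 0.9, 0.9, 0.9],  # topic 0 is close to later topic 0 only
    [0.9, 0.2, 0.3, 0.9],  # topic 1 is close to later topics 1 and 2
    [0.8, 0.7, 0.9, 0.6],  # topic 2 is close to nothing
    [0.3, 0.9, 0.9, 0.9],  # topic 3 is also close to later topic 0
]
forward, backward = classify_evolution(kl)
```

In this toy matrix, earlier topics 0 and 3 both continue into later topic 0, so that later topic is reported as a fusion, while later topic 3 has no similar predecessor and is reported as new.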
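Further preferably, the detection rules in the topic-snapshot model of publication research hotspot evolution have the following detection formulas:
the detection formula of the continuation behavior in rule a is:
$$\exists\, j \in T_{t+1}:\; KL(\phi_i^{t} \,\|\, \phi_j^{t+1}) < \mathrm{threshold\_A} \;\wedge\; \forall\, k \in T_{t+1},\ k \neq j:\; KL(\phi_i^{t} \,\|\, \phi_k^{t+1}) \geq \mathrm{threshold\_A}$$
where $KL(\phi_i^{t} \,\|\, \phi_j^{t+1})$ and $KL(\phi_i^{t} \,\|\, \phi_k^{t+1})$ are the KL distances between topic i and topic j on time slice t+1 and between topic i and topic k on time slice t+1, respectively; $\phi_i^{t}$, $\phi_j^{t+1}$, and $\phi_k^{t+1}$ are the topic-word distributions of topic i, topic j, and topic k; $T_{t+1}$ is the topic set on time slice t+1; and threshold_A is the similarity threshold;
the detection formula of the new-topic behavior in rule b is:
$$\forall\, j \in T_{t-1}:\; KL(\phi_j^{t-1} \,\|\, \phi_i^{t}) > \mathrm{threshold\_A}$$
where $KL(\phi_j^{t-1} \,\|\, \phi_i^{t})$ is the KL distance between topic j on time slice t-1 and topic i, and $T_{t-1}$ is the topic set on time slice t-1;
the detection formula of the extinction behavior in rule c is:
$$\forall\, j \in T_{t+1}:\; KL(\phi_i^{t} \,\|\, \phi_j^{t+1}) > \mathrm{threshold\_A}$$
the detection formula of the splitting behavior in rule d is:
$$\exists\, j, k \in T_{t+1},\ j \neq k:\; KL(\phi_i^{t} \,\|\, \phi_j^{t+1}) < \mathrm{threshold\_A} \;\wedge\; KL(\phi_i^{t} \,\|\, \phi_k^{t+1}) < \mathrm{threshold\_A}$$
the detection formula of the fusion behavior in rule e is:
$$\exists\, j, k \in T_{t-1},\ j \neq k:\; KL(\phi_j^{t-1} \,\|\, \phi_i^{t}) < \mathrm{threshold\_A} \;\wedge\; KL(\phi_k^{t-1} \,\|\, \phi_i^{t}) < \mathrm{threshold\_A}$$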
Further preferably, the KL distance between two topics is calculated as:
$$KL(\phi_j^{t-1} \,\|\, \phi_i^{t}) = \sum_{x \in X} \phi_j^{t-1}(x) \, \log \frac{\phi_j^{t-1}(x)}{\phi_i^{t}(x)}$$
where $KL(\phi_j^{t-1} \,\|\, \phi_i^{t})$ is the KL distance from topic j on time slice t-1 to topic i on time slice t; $\phi_j^{t-1}$ and $\phi_i^{t}$ are the topic-word distributions of topic j on time slice t-1 and topic i on time slice t; $\phi_j^{t-1}(x)$ and $\phi_i^{t}(x)$ are the probabilities of topic word x under these two distributions; X is the topic-word set of topic j on time slice t-1; and x is any topic word in X.
It should be understood that this is a general formula and is also used to compute the KL distance on any other pair of adjacent time slices. Note that if topic word x does not exist in the topic-word set of topic i at time t, $\phi_i^{t}(x)$ is set to a small preset value, for example 0.001.
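A minimal Python sketch of this distance follows. It assumes each topic-word distribution is stored as a dict from topic word to probability and uses the natural logarithm; the source does not state the log base, and the function and argument names are hypothetical.

```python
import math


def kl_distance(phi_prev, phi_curr, floor=0.001):
    """KL distance from an earlier topic to a later topic.

    phi_prev, phi_curr: dicts mapping topic word -> probability.
    The sum runs over the words of the earlier topic; a word missing
    from the later topic gets the small preset probability `floor`.
    """
    total = 0.0
    for word, p in phi_prev.items():
        if p <= 0.0:
            continue  # zero-probability terms contribute nothing
        q = phi_curr.get(word, floor)
        total += p * math.log(p / q)
    return total


# Toy distributions, invented for illustration only.
same = kl_distance({"a": 0.5, "b": 0.5}, {"a": 0.5, "b": 0.5})
different = kl_distance({"a": 0.5, "b": 0.5}, {"a": 0.5, "c": 0.5})
```

Identical distributions give a distance of zero, while a word missing from the later topic is penalized through the floor value. Because the distance is asymmetric, the argument order (earlier topic first) matters.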
Further preferably, the similarity threshold is 0.4.
Further preferably, in step 2 the Dirichlet prior parameter α of the publication-topic distribution θ and the Dirichlet prior parameter β of the topic-word distribution φ in the publication topic models on adjacent time slices are related as follows:
$$\beta_t \mid \beta_{t-1} \sim N(\beta_{t-1}, \sigma^2 I)$$
$$\alpha_t \mid \alpha_{t-1} \sim N(\alpha_{t-1}, \delta^2 I)$$
where $\beta_t$ and $\beta_{t-1}$ are the Dirichlet prior parameters of the topic-word distribution in the publication topic models on time slice t and time slice t-1; $\alpha_t$ and $\alpha_{t-1}$ are the Dirichlet prior parameters of the publication-topic distribution in the publication topic models on time slice t and time slice t-1; $N(\beta_{t-1}, \sigma^2 I)$ and $N(\alpha_{t-1}, \delta^2 I)$ are both normal distributions; and $\sigma^2 I$ and $\delta^2 I$ are the variances of the corresponding random variables.
The first relation states that the prior parameter $\beta_t$ of the topic-word distribution on time slice t depends on the prior parameter $\beta_{t-1}$ of the previous time slice t-1 and follows the distribution $N(\beta_{t-1}, \sigma^2 I)$. The second states that the Dirichlet prior parameter $\alpha_t$ of the publication-topic distribution on time slice t depends on $\alpha_{t-1}$ of the previous time slice t-1 and follows the distribution $N(\alpha_{t-1}, \delta^2 I)$.
The invention considers that academic publications are issued periodically and that their topics evolve gradually, so adjacent time slices are linked by parameter passing, namely through the two Dirichlet prior parameters α and β. Because the values of α and β influence how topics form and change the distribution of words within a topic, the invention passes the influence of the publication-topic distribution θ and the topic-word distribution φ of the preceding time slice to the topic model parameters of the next time slice through α and β. This reduces the uncertainty in assigning topic words to topics and lowers the perplexity of the model.
Further preferably, in step 2 the number of topics of the time-series publication topic model, and the Dirichlet prior parameter α of the publication-topic distribution θ and the Dirichlet prior parameter β of the topic-word distribution φ in the publication topic model on the first time slice, are preset values.
Further preferably, the number of topics of the time-series publication topic model is 50.
Further preferably, in the publication topic model on the first time slice, the Dirichlet prior parameter α of the publication-topic distribution θ is 1, and the Dirichlet prior parameter β of the topic-word distribution φ is 0.01.
Advantageous effects
1. The invention provides a new topic-snapshot model of publication research hotspot evolution. It uses the KL (Kullback-Leibler) distance to measure the similarity between two topics of the same publication under test on two adjacent time slices, and detects the continuation, new generation, splitting, fusion, and extinction of topics between topic snapshots at adjacent moments. This enables fine-grained analysis of research hotspot evolution in a publication and fills the gap left by the prior art, which lacks an effective means of detecting evolution behaviors based on publication time-series topics. The detection rules for continuation, new generation, splitting, fusion, and extinction in the model are derived from the similarity of topics of the same publication on adjacent time slices and accurately reflect the topic evolution process.
2. The time-series publication topic model is constructed from the topicality and time order of publications and combines the characteristics of the publication topic model JTM and the DTM model. On the one hand, it accounts for topic change over time and links the publication topic models on adjacent time slices by parameter passing, which reduces the uncertainty in assigning topic words to topics and lowers the perplexity of the model. This overcomes the defect of a standalone publication topic model, which ignores topic change over time and so increases that uncertainty. Adjacent time slices are linked through the two Dirichlet prior parameters α and β: because their values influence how topics form and change the distribution of words within a topic, the influence of the publication-topic distribution θ and the topic-word distribution φ in the preceding time slice is passed to the next time slice through α and β. On the other hand, the model targets the publications in a literature data set, so it is better suited than the ATM and DTM models to studying the topic evolution of publications.
3. Experiments show that the time-series publication topic model provided by the invention performs better in both perplexity and running time: its perplexity is lower than that of the author-topic model ATM and the DTM model, and its running time is close to that of the DTM model and shorter than that of ATM.
Drawings
FIG. 1 is a schematic flow chart of a method for detecting evolution behavior of a research hotspot based on KL distance similarity measurement according to the present invention;
FIG. 2 is a schematic view of a publication topic model provided by the present invention;
FIG. 3 is a schematic diagram of a temporal publication topic model provided by the present invention;
FIG. 4 is a schematic diagram of topic evolution behavior in a topic snapshot publication research hotspot evolution model provided by the invention;
FIG. 5 is a schematic diagram of the evolution of the subject under publication ID 003 in the years 2010-2016 according to the present invention;
FIG. 6 is a diagram comparing the perplexity of the ATM model, the DTM model, and the TS-JTM model according to the present invention.
Detailed Description
The present invention will be further described with reference to the following examples.
Because the research hotspots of an academic field are mainly reflected in academic publications, analyzing the evolution behaviors of topics in an academic publication data set helps researchers understand the development trajectory of research hotspots in a discipline and grasp their patterns of development. To meet this need, as shown in FIG. 1, the invention provides a method for detecting the evolution behaviors of research hotspots based on KL distance similarity measurement, which includes the following steps:
Step 1: preprocess the literature information. Publication documents are first obtained from a public literature information base and preprocessed, and a topic-word corpus with time attributes is then constructed based on the publication time of the documents.
The preprocessing is as follows: extract the title, abstract, keywords, journal name, and publication time of each publication document and format them; split the abstract and the title into phrases with a word segmentation tool; delete stop words; and combine the remaining phrases with the keywords to form the feature words of the document. In other possible embodiments, the feature words of a document may also be derived from the abstract only, from the abstract and the keywords, or from the abstract or the document title; the invention is not particularly limited in this respect.
After the feature-word set of each document is obtained, time slices are divided according to document publication time. The feature words of the documents belonging to the same time slice, together with their publication information, form the data set of that time slice. The data sets of all time slices constitute the topic-word corpus.
For example, scientific and technical literature information is acquired from the public literature repository of the China National Knowledge Infrastructure (CNKI) to construct the topic-word corpus. The abstracts of 6487 papers, with the corresponding journal names and publication times, are selected from computer-science publications of 2010-2016 as experimental data. All literature information is divided by year into data sets for 7 time slices. Each abstract is then segmented, and its stop words removed, with the NLPIR Chinese word segmentation system of the Chinese Academy of Sciences, forming the feature-word sets of all documents. The document feature vector of document i is written $(w_i, j_i)$, where $w_i$ is the set of feature words of document i and $j_i$ is the publication in which document i was published. The data set $C_t$ composed of the $n_1$ documents in time slice t can be expressed as $C_t = \{(w_1, j_1), (w_2, j_2), \ldots, (w_{n_1}, j_{n_1})\}$.
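The corpus construction described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the record fields, the tiny English stop-word list, and whitespace tokenization are hypothetical stand-ins, whereas the embodiment segments Chinese abstracts with NLPIR.

```python
from collections import defaultdict

# Hypothetical stop-word list; the embodiment uses a Chinese stop-word list.
STOP_WORDS = {"the", "a", "of", "and", "for", "in", "on"}


def feature_words(text):
    """Lower-case, split on whitespace, and drop stop words.

    Stand-in for a real word segmentation tool such as NLPIR.
    """
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def build_corpus(records):
    """Group document feature vectors (w_i, j_i) by time slice (year)."""
    corpus = defaultdict(list)
    for rec in records:
        w_i = feature_words(rec["abstract"])
        corpus[rec["year"]].append((w_i, rec["journal"]))
    # Sort by year so that time slices are visited in chronological order.
    return dict(sorted(corpus.items()))


# Toy records, invented for illustration only.
records = [
    {"year": 2010, "journal": "J1", "abstract": "Training of a neural network"},
    {"year": 2011, "journal": "J1", "abstract": "Deep learning for the web"},
    {"year": 2010, "journal": "J2", "abstract": "Genetic algorithm and search"},
]
corpus = build_corpus(records)
```

Each value `corpus[t]` plays the role of the data set $C_t$: a list of `(w_i, j_i)` pairs for the documents published in time slice t.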
Step 2: a temporal journal topic model (TS-JTM) is constructed based on the topic and the chronological order.
The model of the time series publication topic model (TS-JTM) in each time slice is a publication topic model, which is shown in FIG. 2.α and β in the model represent Dirichlet (Dirichlet) prior parameters of a publication-topic distribution theta and a topic-word distribution phi, respectively, K represents the total number of publications, and T represents the number of topics.
The journal topic model on adjacent time slices in the time series topic model (TS-JTM) of the present invention has an association relationship, as with the DTM model, as shown in FIG. 3, adjacent time slices are connected by Dirichlet priors α and β, wherein the values of Dirichlet priors α and β affect the formation of the topic and change the distribution of words in the topic.
βt|βt-1~N(βt-1,σ2I) (1)
αt|αt-1~N(αt-1,δ2I) (2)
φt~Dir(βt) (3)
θt~Dir(αt) (4)
Wherein, formula 1 represents prior parameter β of topic-word distribution under time slice ttSubject-word distribution prior parameter β for last time slice t-1t-1And satisfies N (β)t-1,σ2I) Distribution, βtAnd βt-1Satisfying the first order Markov process, and in the same way, equation 2 represents the Dirichlet prior parameter α of publication-subject distribution under time slice ttDirichlet prior parameter α subject to publication-topic distribution under last time slice t-1t-1And satisfies N (α)t-1,δ2I) Distribution equation (3) and equation (4) represent the parameter βtAnd αtRespectively, topic-word distribution in the modeltAnd publication-subject θtDirichlet prior parameter αtAnd βtThe value of (b) will affect publication-topic distributions and topic-word distributions.
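The parameter chain in formulas (1) to (4) can be sketched with NumPy as below. The standard deviations, the toy sizes, and the clipping of negative draws are assumptions of this sketch; the source does not say how it keeps the Gaussian draws valid as Dirichlet parameters.

```python
import numpy as np


def propagate_prior(prev, std, rng, floor=1e-6):
    """Draw the next slice's prior vector from N(prev, std^2 I)."""
    prev = np.asarray(prev, dtype=float)
    draw = rng.normal(loc=prev, scale=std, size=prev.shape)
    # Assumption of this sketch: clip so every entry stays positive,
    # because Dirichlet parameters must be strictly positive.
    return np.maximum(draw, floor)


rng = np.random.default_rng(0)
T, N = 5, 8                  # toy sizes: number of topics, vocabulary size
alpha = np.full(T, 1.0)      # alpha_1, preferred value stated in the text
beta = np.full(N, 0.01)      # beta_1, preferred value stated in the text

for t in range(2, 4):        # time slices 2 and 3
    alpha = propagate_prior(alpha, 0.1, rng)    # formula (2), delta assumed
    beta = propagate_prior(beta, 0.005, rng)    # formula (1), sigma assumed
    theta_k = rng.dirichlet(alpha)   # formula (4): one publication's topic mix
    phi_j = rng.dirichlet(beta)      # formula (3): one topic's word mix
```

Small standard deviations keep each slice's priors close to those of the previous slice, which matches the gradual topic evolution the description assumes.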
Based on the structure of the time-series publication topic model, the number of topics and the values of the Dirichlet prior parameters $\beta_1$ and $\alpha_1$ on the first time slice are set. Topic extraction on the data set of the first time slice yields the publication-topic distribution $\theta_1$ and the topic-word distribution $\phi_1$ of that slice. New parameters $\beta_1'$ and $\alpha_1'$ are then computed from $\beta_1$ and $\alpha_1$ with formulas (1) and (2) and passed to the model of the next time slice. Repeating this process yields the publication-topic and topic-word distributions on every time slice; that is, the Dirichlet prior parameters $\alpha_t$ and $\beta_t$ on each later time slice are computed from $\alpha_{t-1}$ and $\beta_{t-1}$ of the preceding time slice. Using the publication topic model of a time slice to extract topics from that slice's data set and obtain the publication-topic and topic-word distributions is prior art, so the invention describes it only briefly.
The parameter inference of the publication topic distribution theta and the topic word distribution phi in the time-series publication topic model adopts a Gibbs Sampling (Gibbs Sampling) method. For each word, the publication and topic are sampled according to equation 5, with p (topic | journal) · p (word | topic) on the right in equation 5, i.e., the probability that the publication selects the topic and the topic selects the word. Since there are T topics (topic) and K publications (journal), the physical meaning of the formula is to sample in these K T paths.
In the formula, $z_i = j$, $x_i = k$ indicates that the i-th word in a document is assigned to the j-th topic and the k-th publication. $w_i = m$ indicates that the i-th word is the m-th word in the dictionary. $z_{-i}$, $x_{-i}$ represent the topic and publication assignments of all words other than the i-th word. The two count terms represent, respectively, the total number of times word m has been assigned to topic j before this assignment, and the total number of words of publication k that have been assigned to topic j so far. N is the total number of words in the dictionary, which consists of all the distinct feature words in the data set. Parameter estimation with this formula only needs to record two matrices: the N×T topic-word count matrix (word by topic) and the K×T publication-topic count matrix (journal by topic). From these two count matrices, the topic-word distribution φ and the publication-topic distribution θ are calculated by formula (6) and formula (7), respectively.
In the formulas, $\phi_{mj}$ represents the probability that topic j uses word m, $\theta_{kj}$ represents the probability that publication k selects topic j, m' denotes any word assigned to topic j, and j' denotes any topic assigned to publication k.
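The estimation of φ and θ from the two count matrices can be sketched as follows. This is a minimal illustration that assumes the usual smoothed estimates (count plus prior, divided by the column or row total plus the prior mass) and symmetric scalar priors; the function name and the toy counts are hypothetical and not taken from the patent.

```python
import numpy as np

def estimate_distributions(word_topic_counts, journal_topic_counts, alpha, beta):
    """Estimate phi (N x T) and theta (K x T) from the two count matrices.

    word_topic_counts    : N x T array, [m, j] = times word m was assigned to topic j
    journal_topic_counts : K x T array, [k, j] = words of publication k assigned to topic j
    alpha, beta          : symmetric Dirichlet priors (scalars here, an assumption)
    """
    wt = np.asarray(word_topic_counts, dtype=float)
    jt = np.asarray(journal_topic_counts, dtype=float)
    n_words, n_topics = wt.shape
    # phi[m, j]: probability that topic j uses word m (each column sums to 1)
    phi = (wt + beta) / (wt.sum(axis=0, keepdims=True) + n_words * beta)
    # theta[k, j]: probability that publication k selects topic j (each row sums to 1)
    theta = (jt + alpha) / (jt.sum(axis=1, keepdims=True) + n_topics * alpha)
    return phi, theta

# Toy counts: 3 dictionary words, 2 topics, 2 publications (illustrative values only)
wt_counts = np.array([[4, 0], [1, 3], [0, 2]])
jt_counts = np.array([[5, 1], [0, 4]])
phi, theta = estimate_distributions(wt_counts, jt_counts, alpha=1.0, beta=0.01)
```

Keeping only the two count matrices in memory is what makes the sampler cheap to update: each reassignment of a word changes one entry in each matrix.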
Step 3: Based on the publication topic model on each time slice in the time-series publication topic model, topic extraction is performed in sequence on the data set of the matching time slice to obtain the publication-topic distribution and the topic-word distribution on each time slice.
Based on the framework of the time-series publication topic model constructed in step 2, this embodiment sets the number of topics of the model and the initial values of the Dirichlet prior parameter α of the publication-topic distribution θ and the Dirichlet prior parameter β of the topic-word distribution φ, and then performs topic extraction on the data sets of all time slices in sequence to obtain the publication-topic distribution and the topic-word distribution on every time slice. Topic extraction uses the TS-JTM model; that is, steps 1.1, 1.2 and 1.3 are executed in a loop for each time slice t:
1.1 On time slice t, use the TS-JTM model to extract topics from the data set, obtaining the topic set $T_t$ and the topic-word distribution;
1.2 Add the topic set $T_t$ to the set TC of time-series topics;
1.3 Use the parameters $\alpha_t$, $\beta_t$ of the current time-slice model to update the TS-JTM model.
It should be appreciated that updating the TS-JTM model means updating the model parameters α, β of the next time slice in the time-series publication topic model.
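The loop over steps 1.1 to 1.3 can be sketched as below. The extractor and the prior-update rule are passed in as callables because their internals (Gibbs sampling and formulas (1) and (2)) are outside this sketch; `run_time_series_extraction`, `fake_extract` and `fake_update` are hypothetical names, and the stand-in update rule is NOT the patent's formula.

```python
def run_time_series_extraction(datasets, alpha1, beta1, extract_topics, update_priors):
    """Process time slices in order, chaining the priors from one slice to the next."""
    tc = []                      # the set TC of time-series topics, one entry per slice
    alpha, beta = alpha1, beta1  # preset priors of the first time slice
    for data in datasets:
        # 1.1 extract the topic set and topic-word distribution on this slice
        topics, phi = extract_topics(data, alpha, beta)
        # 1.2 add the topic set to TC
        tc.append((topics, phi))
        # 1.3 derive the priors used by the next slice's model
        alpha, beta = update_priors(alpha, beta, phi)
    return tc

def fake_extract(data, alpha, beta):
    # Stand-in extractor: one "topic" per distinct word; records the priors it was given
    return sorted(set(data)), {"alpha": alpha, "beta": beta}

def fake_update(alpha, beta, phi):
    # Stand-in update rule for illustration only
    return alpha * 0.9, beta * 1.1

tc = run_time_series_extraction([["a", "b"], ["b", "c"]], 1.0, 0.01,
                                fake_extract, fake_update)
```

Passing the two steps as functions keeps the time-slice chaining separate from the choice of sampler, so either part can be swapped without touching the loop.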
Step 4: Obtain the topics and the topic-word distributions of the publication to be tested on each time slice, calculate the KL distance between any two topics of the same publication on adjacent time slices based on the topic-word distributions, and obtain the evolution behavior of each topic in the publication to be tested based on the topic snapshot publication research hotspot evolution model.
As shown in FIG. 4, the topic snapshot publication research hotspot evolution model provided by the invention covers five behavior characteristics of topics: ① a one-to-one relation indicates that the topic of the current time slice continues from a topic of the previous time slice; ② a topic of the current time slice that is not connected to any topic of the previous time slice is a new topic; ③ a one-to-many relation indicates that a topic of the previous time slice has split into several topics; ④ a many-to-one relation indicates that several topics have fused into one topic; ⑤ a topic of the previous time slice that is not connected to any topic of the next time slice has become extinct.
To measure the similarity between two topics, the invention employs the KL distance. The KL (Kullback-Leibler divergence) distance, proposed by Solomon Kullback and Richard Leibler [3] and also called relative entropy, is often used to measure the similarity between two probability distributions, so it can be used to measure the similarity between any two topics on adjacent time slices. Formula 8 below is the calculation formula of the KL distance:

$$KL(P \,\|\, Q) = \sum_{x \in X} P(x) \log \frac{P(x)}{Q(x)} \tag{8}$$

in which P and Q represent two probability distributions. The value of the KL distance is 0 when the two probability distributions are identical.
The KL distance is used to measure the similarity between two topics on two adjacent time slices and thereby establish the correspondence between the topics of adjacent time slices; the probability distributions in the formula correspond to the topic-word distributions of the topics.
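The KL distance between two topic-word distributions can be computed as in the sketch below. It assumes the standard definition (the sum over words of P(x) · log(P(x) / Q(x))), the natural logarithm, and a small smoothing constant `eps` to avoid taking the logarithm of zero; the source states neither the log base nor any smoothing, so both are assumptions.

```python
import numpy as np

def kl_distance(p, q, eps=1e-12):
    """KL distance from distribution p to distribution q over the same vocabulary."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Smooth and renormalize so zero probabilities do not produce log(0) (assumption)
    p = (p + eps) / (p + eps).sum()
    q = (q + eps) / (q + eps).sum()
    return float(np.sum(p * np.log(p / q)))

same = kl_distance([0.7, 0.2, 0.1], [0.7, 0.2, 0.1])   # identical distributions
forward = kl_distance([0.5, 0.5], [0.9, 0.1])
backward = kl_distance([0.9, 0.1], [0.5, 0.5])
```

Note that the KL distance is not symmetric (`forward` and `backward` differ), so the order of the two topics matters when comparing time slice t with t-1 or t+1.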
Based on the evolution behaviors ① to ⑤ above, the topic snapshot publication research hotspot evolution model of the invention comprises the following detection rules:
a: When the KL distance between topic i on time slice t and one topic on the next adjacent time slice t+1 is less than the similarity threshold, and the KL distances between topic i and all remaining topics on time slice t+1 are greater than or equal to the similarity threshold, topic i continues in the next time slice t+1.
b: When the KL distance between topic i on time slice t and every topic on the previous adjacent time slice t-1 is greater than the similarity threshold, topic i on time slice t is a new topic.
c: When the KL distance between topic i on time slice t and every topic on the next adjacent time slice t+1 is greater than the similarity threshold, topic i on time slice t does not continue in the next time slice t+1, i.e., topic i becomes extinct.
d: When the KL distances between topic i on time slice t and at least two topics on the next adjacent time slice t+1 are less than the similarity threshold, topic i on time slice t splits into multiple topics in the next time slice t+1.
e: When the KL distances between topic i on time slice t and at least two topics on the previous adjacent time slice t-1 are less than the similarity threshold, topic i on time slice t is formed by the fusion of multiple topics of the previous time slice t-1.
To sum up, detection rules a to e are combined in detection formula 9, which assigns an evolution behavior state identifier to the i-th topic of the t-th time slice; threshold_A is the similarity threshold. Repeated experiments show that setting threshold_A to 0.4 reflects the evolution behavior of topics reasonably well; other values may be used in other feasible embodiments.
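Rules a to e can be sketched as two small classifiers, one looking forward to time slice t+1 and one looking back to time slice t-1. The function names and the returned labels are illustrative assumptions; a distance exactly equal to the threshold is treated here as "not similar", which follows rule a but is a choice the other rules leave open.

```python
THRESHOLD_A = 0.4  # similarity threshold reported in the description

def forward_behavior(kl_to_next, threshold=THRESHOLD_A):
    """Classify topic i on slice t from its KL distances to all topics on slice t+1."""
    close = sum(1 for d in kl_to_next if d < threshold)
    if close == 0:
        return "extinction"     # rule c: no similar topic on t+1
    if close == 1:
        return "continuation"   # rule a: exactly one similar topic on t+1
    return "splitting"          # rule d: at least two similar topics on t+1

def backward_behavior(kl_from_prev, threshold=THRESHOLD_A):
    """Classify topic i on slice t from its KL distances to all topics on slice t-1."""
    close = sum(1 for d in kl_from_prev if d < threshold)
    if close == 0:
        return "new"            # rule b: no similar topic on t-1
    if close >= 2:
        return "fusion"         # rule e: at least two similar topics on t-1
    return "continued from previous slice"

examples = [
    forward_behavior([0.20, 0.90, 1.10]),
    forward_behavior([1.20, 0.83, 1.59]),
    forward_behavior([0.55, 0.27, 0.21, 0.69]),
    backward_behavior([1.16, 0.75, 1.37]),
    backward_behavior([0.74, 0.23, 0.16, 0.81]),
]
```

A topic on an interior time slice receives both a forward and a backward label, since, for example, a newly generated topic may also continue or split in the following slice.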
For the publication to be tested, the following steps 2.2.1 and 2.2.2 are executed on each time slice:
2.2.1 Extract from the set TC the topic set $T_t$ of the current time slice and the topic sets $T_{t-1}$, $T_{t+1}$ of the two time slices adjacent to the current time slice, and obtain the topics of the publication to be tested in $T_{t-1}$, $T_t$, $T_{t+1}$;
2.2.2 Detect the evolution behavior of each topic of the publication to be tested on the current time slice according to formula 9.
In order to more clearly describe the aspects of the present invention, a number of examples will be provided below.
1. Change of topic words over time
As shown in Table 1 below, for the publication with ID 003 in the data set, topic number 2 in the 2010 topic distribution is a topic related to the face recognition field. Table 1 gives the topic-word distribution of topic number 2 from 2010 to 2016, listing for each year the 10 topic words with the highest probability. The table shows that the core words of the "face recognition" topic change little over time: common words related to face recognition, such as "image", "feature" and "face recognition", stay in the topic throughout. In contrast, "genetic algorithm", which appears in 2013, and "deep learning", which appears in 2015, reflect the application of new methods in the "face recognition" field. From 2010 to 2016, the KL values between the topics of adjacent time slices are 0.20, 0.26, 0.23, 0.17, 0.21 and 0.19, respectively, all smaller than the similarity threshold threshold_A, while the KL distances between the "face recognition" topic and the other topics of the next time slice are all larger than threshold_A. The "face recognition" topic therefore continues from 2010 to 2016.
Table 1. Topic-word distribution of the "face recognition" topic
2. Evolution of publication topics over time
For convenience of description, topics are referred to below by their English abbreviations. Table 2 shows the 10 topic words with the highest probability in 2010-2016 for the three topics "Neural Network (NN)", "Deep Learning (DL)" and "Speech Recognition (SR)". As can be seen from Table 2, core words of topic NN such as "neural network", "neuron" and "feature" remain essentially unchanged, while the distribution of peripheral words such as "sample" and "particle swarm" varies considerably between time slices. The top 10 topic words of topic NN in 2013 and topic DL in 2014 share the words "training", "classification", "performance", "feature" and "neuron". Because of this similarity in word distribution, the KL value between the two topics is small, namely 0.27, which is below the similarity threshold threshold_A. The KL values between topic NN in 2013 and all topics in 2014 are 0.55, 0.27, 0.21, 0.69, 1.84, 1.16, 0.92 and 1.53, respectively; the two values below the similarity threshold correspond to DL and NN, and the remaining values are greater than the similarity threshold, so topic DL is generated by the splitting of topic NN.
Table 2. Topic-word distributions of "speech recognition" and two other topics, 2010-2016
3. Publication topic evolution analysis
The evolution of the topics of the publication with ID 003 between 2010 and 2016 is shown in FIG. 5. Because clustering assigns different numbers to the same topic on different time slices, each topic is represented in the figure by its English abbreviation. As can be seen from FIG. 5, the KL values between the topics in 2015 and topic SR in 2016 are 0.74, 0.46, 0.23, 0.16, 0.81, 0.95 and 1.37, respectively; the two topics below the similarity threshold are NN and SR, and the remaining KL values are all greater than the similarity threshold, indicating that topic NN is fused into topic SR. The KL distances between the "aircraft" topic in 2014 and the topics in 2015 are 1.72, 1.46, 1.25, 1.07, 1.20, 0.83 and 1.59; the minimum KL value, 0.83, is greater than the similarity threshold, so the "aircraft" topic becomes extinct in 2015. The KL values between all topics in 2010 and the "cloud computing" topic in 2011 are 1.16, 0.75, 1.37, 2.32 and 1.51, respectively; the minimum KL value, 0.75, is greater than the similarity threshold, so this topic is newly generated. Similarly, the "target tracking" topic remains in a continuation state throughout, and "entity identification" is a new topic in 2013.
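The three cases quoted above can be checked with a few lines of arithmetic. The sketch below only counts how many of the listed KL values fall below the 0.4 threshold and reports the minimum; the dictionary keys are descriptive labels chosen for this illustration.

```python
THRESHOLD_A = 0.4

# KL values quoted in the analysis of publication ID 003
cases = {
    "SR in 2016 vs. topics in 2015": [0.74, 0.46, 0.23, 0.16, 0.81, 0.95, 1.37],
    "aircraft in 2014 vs. topics in 2015": [1.72, 1.46, 1.25, 1.07, 1.20, 0.83, 1.59],
    "cloud computing in 2011 vs. topics in 2010": [1.16, 0.75, 1.37, 2.32, 1.51],
}

# For each case: (number of values below the threshold, smallest value)
summary = {
    name: (sum(1 for d in values if d < THRESHOLD_A), min(values))
    for name, values in cases.items()
}

for name, (n_close, smallest) in summary.items():
    print(f"{name}: {n_close} below threshold, minimum {smallest}")
```

Two values below the threshold in the first case match the fusion reading, and zero values below the threshold in the other two match extinction and new generation, respectively.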
Model performance verification
To verify the performance of the time-series publication topic model (TS-JTM) provided by the invention, the perplexity index is adopted. Equation 10 below is the calculation formula for perplexity:

$$\mathrm{Perplexity}(D_{test}) = \exp\left(-\frac{\sum_{d=1}^{M} \log p(W_d)}{\sum_{d=1}^{M} N_d}\right) \tag{10}$$

where $D_{test}$ represents the test set, a collection of M documents; $p(W_d)$ represents the probability of the words in document d being selected; $N_d$ represents the number of words in document d; and $W_d = (w_{1d}, w_{2d}, \ldots, w_{id}, \ldots, w_{nd})$ is the word vector of document d. A smaller perplexity value indicates better model performance.
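Perplexity can be computed from per-document log-likelihoods as in the sketch below, which assumes the standard definition: the exponential of the negative total log-likelihood divided by the total number of words. The function name and its inputs are hypothetical; how log p(W_d) is obtained from a fitted model is outside this sketch.

```python
import math

def perplexity(doc_log_probs, doc_lengths):
    """Perplexity of a test set.

    doc_log_probs[d] : natural log of p(W_d), the likelihood of document d
    doc_lengths[d]   : N_d, the number of words in document d
    """
    if len(doc_log_probs) != len(doc_lengths):
        raise ValueError("one log-probability and one length per document")
    total_words = sum(doc_lengths)
    if total_words == 0:
        raise ValueError("test set contains no words")
    return math.exp(-sum(doc_log_probs) / total_words)

# Sanity check: a model that gives every word probability 1/100
# should have perplexity 100 regardless of document lengths.
uniform = perplexity([5 * math.log(0.01), 3 * math.log(0.01)], [5, 3])
```

The uniform case gives an intuition for the scale: a perplexity of 100 means the model is, on average, as uncertain as a uniform choice among 100 words.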
To measure the performance of the time-series publication topic model (TS-JTM), 3 parameters of the model need to be set before the experiment. The number of topics |T| is increased step by step starting from 10. The two Dirichlet hyperparameters of ATM are set to α = 50/|T| and β = 0.01. For DTM and the time-series publication topic model, the two Dirichlet hyperparameters of the first time slice are set to α = 50/|T| and β = 0.01, and α and β of the remaining time slices are obtained automatically by the model. The comparison of the experimental results is shown in FIG. 6, where the horizontal axis represents the number of topics and the vertical axis represents the perplexity. The perplexity of TS-JTM is always the lowest as the number of topics changes, which indicates that TS-JTM performs best. In addition, the perplexity decreases as the number of topics increases; when the number of topics is greater than 50, the perplexity is approximately 3950. The number of topics of the TS-JTM model is therefore preferably set to 50.
In addition, the invention tests the running time of the time-series publication topic model (TS-JTM) on the data set and compares it with the running times of the Author Topic Model (ATM) and the Dynamic Topic Model (DTM). The same data were processed with the three models, and the running times of TS-JTM, ATM and DTM were 23.8 minutes, 25.6 minutes and 24.2 minutes, respectively. The running times of TS-JTM and DTM are thus very close, and ATM has the longest running time. Together with the perplexity results of FIG. 6, this shows that the time-series publication topic model (TS-JTM) not only has low perplexity but also performs well in running time.
In summary, the topic evolution of academic publications reflects the development trend of research hotspots in an academic field. Because the topics and the timeliness of a publication influence the distribution and evolution of its topics, and because evolution behaviors occur during topic evolution, identifying the evolution track of publication research hotspots is complicated. The invention combines the topics of a publication with its time sequence, provides the time-series publication topic model TS-JTM, uses TS-JTM to extract time-dependent hotspots from academic publications, and verifies the performance of TS-JTM through a perplexity comparison experiment. On this basis, a time-series-based topic snapshot publication research hotspot evolution model is established; with similarity measured by the KL distance, the continuation, new generation, splitting, fusion and extinction behaviors of topics in the topic snapshots of adjacent time slices are detected, realizing fine-grained analysis of research hotspot evolution in publications.
It should be emphasized that the examples described herein are illustrative and not restrictive. The invention is therefore not limited to these examples; it also covers other embodiments that those skilled in the art may devise based on the teachings herein, and various modifications, alterations, and substitutions are possible without departing from the spirit and scope of the invention.
The references are as follows:
[1] Rosen-Zvi M, Griffiths T, Steyvers M. The Author-Topic Model for Authors and Documents[C]. Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence, 2004: 487-494.
[2] Blei D M, Lafferty J D. Dynamic Topic Models[C]. Proceedings of the 23rd International Conference on Machine Learning, 2006: 113-120.
[3] David J.C. MacKay. Information Theory, Inference, and Learning Algorithms[M]. Cambridge University Press, 2003: 22-48.
Claims (9)
1. A method for detecting the evolution behavior of research hotspots based on KL distance similarity measurement, characterized by comprising the following steps:
step 1: acquiring publication documents, and constructing a subject term corpus with time attributes based on publication time of the publication documents;
dividing time slices by publication document publication time, wherein the subject term corpus is composed of data sets on each time slice, and the data set on each time slice is composed of document feature vectors of publication documents published at matching time;
in the formula, $C_t$ is the data set on time slice t, $(w_i, j_i)$ is the document feature vector of publication document i, $w_i$ is the set of feature words of publication document i, $j_i$ is the publication to which publication document i belongs, $c_i$ is the i-th feature word in the feature word set, $n_1$ is the number of publication documents on time slice t, and $n_2$ is the number of feature words in publication document i;
wherein, the characteristic words of the publication documents are obtained after the content of the publication documents is subjected to word segmentation processing;
step 2: constructing a time-series publication topic model based on publication topics and time sequence;
each time slice in the time-series publication topic model corresponds to a publication topic model, and for two adjacent time slices, the Dirichlet prior parameter α of the publication-topic distribution θ and the Dirichlet prior parameter β of the topic-word distribution φ in the publication topic model of the later time slice are associated with the two Dirichlet prior parameters α and β of the earlier time slice;
step 3: based on the publication topic model on each time slice in the time-series publication topic model, performing topic extraction in sequence on the data set of the matching time slice to obtain the publication-topic distribution and the topic-word distribution on each time slice;
step 4: obtaining the topics and the topic-word distributions of the publication to be tested on each time slice, calculating the KL distance between any two topics of the same publication to be tested on adjacent time slices based on the topic-word distributions, and obtaining the evolution behavior of each topic in the publication to be tested based on a topic snapshot publication research hotspot evolution model;
the topic snapshot publication research hotspot evolution model comprises five types of evolution behavior detection rules, namely topic continuation, new generation, extinction, splitting and fusion; each type of evolution behavior detection rule is identified based on the similarity of topics on adjacent time slices and the evolution behavior characteristics, the evolution behavior characteristics are related to the similarity, and the similarity of two topics is measured by the KL distance.
2. The method of claim 1, wherein: the topic snapshot publication research hotspot evolution model comprises the following detection rules:
a: when the KL distance between topic i on time slice t and one topic on the next adjacent time slice t+1 is less than the similarity threshold, and the KL distances between topic i and all remaining topics on time slice t+1 are greater than or equal to the similarity threshold, topic i continues in the next time slice t+1;
b: when the KL distance between topic i on time slice t and every topic on the previous adjacent time slice t-1 is greater than the similarity threshold, topic i on time slice t is a new topic;
c: when the KL distance between topic i on time slice t and every topic on the next adjacent time slice t+1 is greater than the similarity threshold, topic i on time slice t does not continue in the next time slice t+1, i.e., topic i becomes extinct;
d: when the KL distances between topic i on time slice t and at least two topics on the next adjacent time slice t+1 are less than the similarity threshold, topic i on time slice t splits into multiple topics in the next time slice t+1;
e: when the KL distances between topic i on time slice t and at least two topics on the previous adjacent time slice t-1 are less than the similarity threshold, topic i on time slice t is formed by the fusion of multiple topics of the previous time slice t-1.
3. The method of claim 2, wherein: the detection formula of each detection rule in the topic snapshot publication research hotspot evolution model is as follows:
the detection formula of the continuation evolution behavior in rule a is as follows:
in the formula, the two KL terms are the KL distances between topic i and topic j on time slice t+1 and between topic i and topic k on time slice t+1, respectively; the three distribution terms are the topic-word distributions of topic i, topic j and topic k; $T_{t+1}$ is the topic set on time slice t+1; and threshold_A is the similarity threshold;
the detection formula of the new-topic evolution behavior in rule b is as follows:
in the formula, the KL term is the KL distance between topic j on time slice t-1 and topic i, and $T_{t-1}$ is the topic set on time slice t-1;
the detection formula of the extinction evolution behavior in rule c is as follows:
the detection formula of the splitting evolution behavior in rule d is as follows:
the detection formula of the fusion evolution behavior in rule e is as follows:
4. The method of claim 1, wherein: the KL distance between two topics is calculated as follows:
in the formula, the left-hand side is the KL distance from topic j on time slice t-1 to topic i on time slice t; the two distribution terms are the topic-word distributions of topic j on time slice t-1 and of topic i on time slice t, respectively; the two probability terms are the word probabilities of topic word x under these two topic-word distributions, respectively; X represents the set of topic words, and x represents any topic word in X.
5. The method of claim 1, wherein: the similarity threshold is 0.4.
6. The method as claimed in claim 1, wherein the Dirichlet prior parameter α of the publication-topic distribution θ and the Dirichlet prior parameter β of the topic-word distribution φ in the publication topic models on adjacent time slices in step 2 are related as follows:
$\beta_t \mid \beta_{t-1} \sim N(\beta_{t-1}, \sigma^2 I)$
$\alpha_t \mid \alpha_{t-1} \sim N(\alpha_{t-1}, \delta^2 I)$
in the formula, $\beta_t$, $\beta_{t-1}$ are the Dirichlet prior parameters of the topic-word distribution in the publication topic models on time slice t and time slice t-1, respectively; $\alpha_t$, $\alpha_{t-1}$ are the Dirichlet prior parameters of the publication-topic distribution in the publication topic models on time slice t and time slice t-1, respectively; $N(\beta_{t-1}, \sigma^2 I)$ and $N(\alpha_{t-1}, \delta^2 I)$ are both normal distributions, and $\sigma^2 I$ and $\delta^2 I$ represent the variances of the corresponding random variables;
$\beta_t \mid \beta_{t-1} \sim N(\beta_{t-1}, \sigma^2 I)$ means that the prior parameter $\beta_t$ of the topic-word distribution on time slice t, given the prior parameter $\beta_{t-1}$ of the topic-word distribution on the previous time slice t-1, follows the $N(\beta_{t-1}, \sigma^2 I)$ distribution; $\alpha_t \mid \alpha_{t-1} \sim N(\alpha_{t-1}, \delta^2 I)$ means that the prior parameter $\alpha_t$ of the publication-topic distribution on time slice t, given the prior parameter $\alpha_{t-1}$ of the publication-topic distribution on the previous time slice t-1, follows the $N(\alpha_{t-1}, \delta^2 I)$ distribution.
7. The method as claimed in claim 6, wherein the number of topics of the time-series publication topic model in step 2, and the Dirichlet prior parameter α of the publication-topic distribution θ and the Dirichlet prior parameter β of the topic-word distribution φ in the publication topic model on the first time slice, are preset values.
8. The method of claim 7, wherein: the number of topics of the time-series publication topic model is 50.
9. The method as recited in claim 7, wherein the Dirichlet prior parameter α of the publication-topic distribution θ is 1 and the Dirichlet prior parameter β of the topic-word distribution φ is 0.01 in the publication topic model at the first time slice.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811216206.2A CN109408782B (en) | 2018-10-18 | 2018-10-18 | KL distance similarity measurement-based research hotspot evolution behavior detection method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811216206.2A CN109408782B (en) | 2018-10-18 | 2018-10-18 | KL distance similarity measurement-based research hotspot evolution behavior detection method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109408782A (en) | 2019-03-01 |
CN109408782B (en) | 2020-07-03 |
Family
ID=65468456
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811216206.2A Active CN109408782B (en) | 2018-10-18 | 2018-10-18 | KL distance similarity measurement-based research hotspot evolution behavior detection method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109408782B (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102646114A (en) * | 2012-02-17 | 2012-08-22 | 清华大学 | News topic timeline abstract generating method based on breakthrough point |
CN102902700A (en) * | 2012-04-05 | 2013-01-30 | 中国人民解放军国防科学技术大学 | Online-increment evolution topic model based automatic software classifying method |
CN103559176A (en) * | 2012-10-29 | 2014-02-05 | 中国人民解放军国防科学技术大学 | Microblog emotional evolution analysis method and system |
CN103984681A (en) * | 2014-03-31 | 2014-08-13 | 同济大学 | News event evolution analysis method based on time sequence distribution information and topic model |
CN105868415A (en) * | 2016-05-06 | 2016-08-17 | 黑龙江工程学院 | Microblog real-time filtering model based on historical microblogs |
US20160241346A1 (en) * | 2015-02-17 | 2016-08-18 | Adobe Systems Incorporated | Source separation using nonnegative matrix factorization with an automatically determined number of bases |
CN106204140A (en) * | 2016-07-12 | 2016-12-07 | 华东师范大学 | A kind of colony based on KL distance viewpoint migrates detection method |
CN107918611A (en) * | 2016-10-09 | 2018-04-17 | 郑州大学 | A kind of model analyzed microblog topic and developed |
Also Published As
Publication number | Publication date |
---|---|
CN109408782B (en) | 2020-07-03 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right | Effective date of registration: 20240315 Address after: 230000 floor 1, building 2, phase I, e-commerce Park, Jinggang Road, Shushan Economic Development Zone, Hefei City, Anhui Province Patentee after: Dragon totem Technology (Hefei) Co.,Ltd. Country or region after: China Address before: Yuelu District City, Hunan province 410083 Changsha Lushan Road No. 932 Patentee before: CENTRAL SOUTH University Country or region before: China |