CN106250910B - Semi-structured data classification method based on label sequence and nGrams - Google Patents

Semi-structured data classification method based on label sequence and nGrams Download PDF

Info

Publication number
CN106250910B
CN106250910B CN201610555498.7A CN201610555498A CN106250910B CN 106250910 B CN106250910 B CN 106250910B CN 201610555498 A CN201610555498 A CN 201610555498A CN 106250910 B CN106250910 B CN 106250910B
Authority
CN
China
Prior art keywords
tsgrams
semi
structured data
feature
class
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610555498.7A
Other languages
Chinese (zh)
Other versions
CN106250910A (en
Inventor
张利军
李宁
高锦涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yunyao Technology (Zhejiang) Co.,Ltd.
Original Assignee
Northwestern Polytechnical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northwestern Polytechnical University filed Critical Northwestern Polytechnical University
Publication of CN106250910A publication Critical patent/CN106250910A/en
Application granted granted Critical
Publication of CN106250910B publication Critical patent/CN106250910B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/80Information retrieval; Database structures therefor; File system structures therefor of semi-structured data, e.g. markup language structured data such as SGML, XML or HTML

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a semi-structured data classification method based on a label sequence and nGrams, which is used for solving the technical problem of poor accuracy of the existing semi-structured data classification method. The technical scheme includes that TSGrams features are used as a basic unit for representing semi-structured data, structural information of the semi-structured data is captured through a label sequence, content information of the semi-structured data is captured through nGrams, the TSGrams features and the content information are fused to serve as inclusion relation between feature capture structures and content, mutual relation among different keywords in the content information is considered, information gain is used for screening the TSGrams features, a TSGrams feature construction feature space with strong classification capacity is obtained, a category classification model is built according to the mutual information between the TSGrams features and categories, similarity among different structures is considered during classification, and accuracy of classification of the semi-structured data is improved.

Description

Semi-structured data classification method based on label sequence and nGrams
Technical Field
The invention relates to a semi-structured data classification method, in particular to a semi-structured data classification method based on a label sequence and nGrams.
Background
Semi-structured data classification is generally divided into three steps: firstly, extracting features from a semi-structured data set of known classes, then constructing a classification model by using the extracted features, and finally classifying the data of the unknown classes by using the constructed model. The semi-structured data contains structure and content information, and for the classification based on the structure and the content, the following factors need to be considered when extracting the features:
1. the containment relationship between structure and content, i.e., the content is organized to be contained in different hierarchies;
2. the mutual relations among the elements in the structure, namely brother relations, father-son relations, ancestor-descendant relations and the like among the elements;
3. the interrelationship between keywords in the content.
Most of the existing semi-structured data classification methods based on structures and contents extend the traditional unstructured text document vector space model on the basis of the vector space model, so that the structured text document vector space model contains structural information and is then used for classification. For example, documents 1 "train T, Nayak R, Bruza P d. combining Structure and Content properties for XML Document clustering. proceedings of the 7th architecture Data Mining Conference (AusDM '08)," 2008.219-226 "and" ghos, Mitra P. combining Content and Structure Similarity for XML Document Classification Composite SVM kernels. proceedings of19th International Conference on Pattern Recognition (ICPR'08), Tampa,2008.1-4 "all employ such methods in that the Structure information and the Content information respectively indicate that the mutual relationship between the Structure and the Content is broken, i.e., the first factor mentioned above is ignored.
There are also methods that consider the relationship between structure and Content, such as document 2 "Yuanjia, Goodde, Bohong. XML Web Classification research based on the correlation between structure and text keywords. computer research and development, 2006,43(8): 1361-. Although these methods consider the inclusion relationship between the structure and the content, the structure is modeled as a path, and the path represents the hierarchical (parent-child, ancestor-descendant) relationship between elements, but neglects the mutual relationship between different paths and the similarity of paths, and the like, i.e. the second factor cannot be processed well.
Document 4 "Yang J, Zhang F. XML Document Classification Using Extended VSM. proceedings of Focused Access to XML Documents, the 6th International works for the Evaluation of XML recommendation (INEX'07), Dagstuhl cast, Germany: springer Berlin/Heidelberg,2008.234-244, "extends the vector space model, represents semi-structured data as a matrix, thereby capturing the association between structure and content, and the relationship among the internal (elements) of the structure is embodied by a core matrix, and the structure element structure subtrees are replaced by structure subtrees in the Document 5 'Yang J, Wang S.extended VSM for XML Document Classification Using frequency subtrees of Focused Retrieval and Evaluation, the 8th International Workshop of the Initiative for the Evaluation of XML Retrieval (INEX'09), spring Berlin/Heidelberg,2010.441-448 ], so as to further embody the relevance. This approach, while taking into account the first two factors to some extent, has not yet explicitly addressed the third factor.
In summary, when the existing semi-structured data classification method extracts features from semi-structured data, the three factors are not fully considered, so that the semi-structured data classification model constructed by the methods lacks part of intrinsic information between the semi-structured data features and the semi-structured data categories, thereby affecting the accuracy of semi-structured data classification.
Disclosure of Invention
In order to overcome the defect of poor accuracy of the existing semi-structured data classification method, the invention provides a semi-structured data classification method based on a label sequence and nGrams. The method comprises the steps of taking TSGrams characteristics as a basic unit for representing semi-structured data, capturing structure information of the semi-structured data by using a label sequence, capturing content information of the semi-structured data by using nGrams, fusing the TSGrams characteristics and the content information as an inclusion relation between a characteristic capture structure and the content, considering the mutual relation between different keywords in the content information, screening the TSGrams characteristics by using information gain, obtaining a TSGrams characteristic structure characteristic space with strong classification capability, constructing a category classification model according to the mutual information between the TSGrams characteristics and the categories, and considering the similarity between different structures during classification, so that the accuracy of classification of the semi-structured data is improved.
The technical scheme adopted by the invention for solving the technical problems is as follows: a semi-structured data classification method based on a tag sequence and nGrams is characterized by comprising the following steps:
step one, constructing a TSGrams characteristic space.
(1) And (5) extracting TSGrams characteristics. For each data document D in the data set D, traversing all text nodes of D by using a tree model, forming a label sequence by using paths from a root node to a parent node of the root node, and extracting all nGrams with the length less than or equal to 3 from text contents. And then combining the tag sequences and nGrams to form TSGrams characteristics, and recording a set formed by all the TSGrams characteristics as TSGramsSet.
(2) An information gain is calculated. An information gain IG (f) is calculated for each TSGrams feature f: < s, g > in TSGramsSet. The calculation method is as follows:
Figure BDA0001049512060000031
wherein the content of the first and second substances,
Figure BDA0001049512060000032
Figure BDA0001049512060000033
Figure BDA0001049512060000034
(3) sorting all TSGrams characteristics with the length of1 in the TSGramsSet from large to small according to information gain, and setting the information gain of the Nth characteristic as IGN. Wherein N is a parameter.
(4) Selecting all information gain values in TSGramsSet to be larger than IGNConstitutes a feature space omega.
And step two, constructing a classification model.
(1) Calculating each TSGrams feature f in the feature space omega:<s,g>and class CiMutual information MI (f, C)i). The calculation method is as follows:
Figure BDA0001049512060000035
(2) the feature space Ω is divided into k disjoint subsets, each subset representing a class CiThe subset is called the class CiIs recorded as a classification model of
Figure BDA0001049512060000036
Any feature f in the feature space omega is divided into a classification model phi with the highest mutual informationC*In (1). Namely:
Figure BDA0001049512060000037
(3) according to the above division, any one of the classes CiClassification model of
Figure BDA0001049512060000038
Can be expressed as a vector in the TSGrams eigenspace Ω:
Figure BDA0001049512060000041
wherein, wi,jFor TSGrams feature fj:<sj,gj>In class CiWeight in (1), if fjNot being CiIf the weight is 0, otherwise, the two information are mutually obtained and normalized, that is to say
Figure BDA0001049512060000042
And step three, classifying the unknown class data.
(1) Preprocessing the unknown class semi-structured data d to be classified, obtaining the TSGrams features in the unknown class semi-structured data d by the method, and discarding the features which appear in the unknown class semi-structured data d but are not contained in the feature space omega, so that the unknown class semi-structured data d is represented as a vector in the feature space omega:
Figure BDA0001049512060000043
wherein, wjIs the jth feature f in the TSGrams feature space omegaj:<sj,gj>The values obtained after normalization of the frequencies that appear in document d.
(2) Computing the unknown class semi-structured data d and any class C using its vector representationiClassification model of
Figure BDA0001049512060000044
The similarity between the two is calculated as follows:
Figure BDA0001049512060000045
wherein, wd(<sj,gj>) For TSGrams characteristic < sj,gj>The weights in the unknown class semi-structured data d,
Figure BDA0001049512060000046
for TSGrams characteristic < sk,gj>In-category model
Figure BDA0001049512060000047
The weights in (1), d, and
Figure BDA0001049512060000048
are respectively d and
Figure BDA0001049512060000049
euclidean norm of, and sim(s)j,sk) Is a tag sequence sjAnd skThe similarity between them is defined as:
Figure BDA00010495120600000410
wherein m and n are respectively a tag sequence sjAnd skLength of (d), and ed(s)j,sk) Is s isjAnd skThe edit distance of (1).
(3) Assigning the category of document d to C with the highest similarity*I.e. by
Figure BDA00010495120600000411
The invention has the beneficial effects that: the method takes TSGrams characteristics as a basic unit for representing semi-structured data, captures structural information of the semi-structured data by using a label sequence, captures content information of the semi-structured data by using nGrams, fuses the two to be used as an inclusion relation between a characteristic capture structure and the content, considers the mutual relation between different keywords in the content information, screens the TSGrams characteristics by using information gain, obtains a TSGrams characteristic structure characteristic space with strong classification capability, constructs a category classification model according to the mutual information between the TSGrams characteristics and the categories, considers the similarity between different structures during classification, and improves the accuracy of classification of the semi-structured data.
The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.
Drawings
FIG. 1 is a flow chart of the present invention label sequence and nGrams based semi-structured data classification method.
Detailed Description
Refer to fig. 1. The method for classifying semi-structured data based on the label sequence and nGrams comprises the following specific steps:
1. a TSGrams feature space is constructed.
1> TSGrams feature extraction: for each data Document D in The data set D, traversing all text nodes of D in a tree model, constructing a tag sequence from The path of The root node to its parent node, and extracting all nGrams with a length less than or equal to 3 from The text content by a method similar to The documents "Tesar R, Strnad V, Jezek K, et al. And then combining the tag sequences and nGrams to form TSGrams characteristics, and recording a set formed by all the TSGrams characteristics as TSGramsSet.
And 2, calculating information gain. An information gain IG (f) is calculated for each TSGrams feature f: < s, g > in TSGramsSet.
The calculation method is as follows:
Figure BDA0001049512060000051
wherein:
Figure BDA0001049512060000052
Figure BDA0001049512060000053
Figure BDA0001049512060000061
3>general in TSGramsSetThe TSGrams features with the length of1 are sorted from large to small according to the information gain, and the information gain of the Nth feature is set as IGN. Wherein N is a parameter, different values are taken according to different training data sets, and adjustment is needed according to experimental results.
4>Selecting all information gain values in TSGramsSet to be larger than IGNConstitutes a feature space omega.
2. And constructing a classification model.
1>Calculating each TSGrams feature f in the feature space omega:<s,g>and class CiMutual information MI (f, C)i). The calculation method is as follows:
Figure BDA0001049512060000062
2>the feature space Ω is divided into k disjoint subsets, each subset representing a class CiThe subset is called the class CiIs recorded as a classification model of
Figure BDA0001049512060000063
Any feature f in the feature space omega is divided into a classification model phi with the highest mutual informationC*In (1). Namely:
Figure BDA0001049512060000064
3>according to the above division, any one of the classes CiClassification model of
Figure BDA0001049512060000065
Can be expressed as a vector in the TSGrams eigenspace Ω:
Figure BDA0001049512060000066
wherein, wi,jFor TSGrams feature fj:<sj,gj>In class CiWeight in (1), if fjNot being CiIf the weight is 0, otherwise, the two information are mutually obtained and normalized, that is to say
Figure BDA0001049512060000067
3. And classifying the unknown class data.
1, preprocessing unknown class semi-structured data d to be classified, acquiring TSGrams features in the unknown class semi-structured data d by the method, and discarding features which appear in the unknown class semi-structured data d but are not included in a feature space Ω, so that the unknown class semi-structured data d can be represented as a vector in the feature space Ω:
Figure BDA0001049512060000068
wherein, wjIs the jth feature f in the TSGrams feature space omegaj:<sj,gj>The values obtained after normalization of the frequencies that appear in document d.
2>Computing the unknown class semi-structured data d and any class C using its vector representationiClassification model of
Figure BDA0001049512060000071
The similarity between the two is calculated as follows:
Figure BDA0001049512060000072
wherein, wd(<sj,gj>) For TSGrams characteristic < sj,gj>The weight in d is such that,
Figure BDA0001049512060000073
for TSGrams characteristic < sk,gj>In-category model
Figure BDA0001049512060000074
The weights in (1), d, and
Figure BDA0001049512060000078
are respectively d and
Figure BDA0001049512060000075
euclidean norm of, and sim(s)j,sk) Is a tag sequence sjAnd skThe similarity between them is defined as:
Figure BDA0001049512060000076
wherein m and n are respectively a tag sequence sjAnd skLength of (d), and ed(s)j,sk) Is s isjAnd skFor the calculation of the edit distance, please refer to the document "Levenshtein VI. binary codes capable of correcting spectral errors and deletions of ones. schemes of Information transmission.1965", which is different from the basic unit edited in the method is the label (tag).
3>Assigning the category of document d to C with the highest similarity*I.e. by
Figure BDA0001049512060000077

Claims (1)

1. A semi-structured data classification method based on label sequences and nGrams is characterized by comprising the following steps:
step one, constructing a TSGrams characteristic space:
(1) TSGrams feature extraction: for each data document D in the data set D, traversing all text nodes of D by using a tree model, forming a label sequence from a path from a root node to a parent node of the data document D, extracting nGrams with the length less than or equal to 3 from text content, combining the label sequence and the nGrams to form TSGrams characteristics, and recording a set formed by all the TSGrams characteristics as TSGramsSet;
(2) calculating information gain: and calculating the information gain IG (f) of each TSGrams characteristic f: < s, g > in the TSGramsSet, wherein the calculation method comprises the following steps:
Figure FDF0000010457700000011
wherein the content of the first and second substances,
Figure FDF0000010457700000012
Figure FDF0000010457700000013
Figure FDF0000010457700000014
(3) sorting all TSGrams characteristics with the length of1 in the TSGramsSet from large to small according to information gain, and setting the information gain of the Nth characteristic as IGNWherein N is a parameter;
(4) selecting all information gain values in TSGramsSet to be larger than IGNThe characteristic of (a) constitutes a characteristic space Ω;
step two, constructing a classification model:
(1) calculating each TSGrams feature f in the feature space omega:<s,g>and class CiMutual information MI (f, C)i) The calculation method comprises the following steps:
Figure FDF0000010457700000015
(2) the feature space Ω is divided into k disjoint subsets, each subset representing a class CiThe subset is called the class CiIs recorded as a classification model of
Figure FDF0000010457700000016
Any feature f in feature space omega is divided into classification models with highest mutual information
Figure FDF0000010457700000017
In (1), namely:
Figure FDF0000010457700000021
(3) according to the above division, any one of the classes CiClassification model of
Figure FDF0000010457700000022
Can be expressed as a vector in the TSGrams eigenspace Ω:
Figure FDF0000010457700000023
wherein, wi,jFor TSGrams feature fj:<sj,gj>In class CiWeight in (1), if fjNot being CiIf the weight is 0, otherwise, the two information are mutually obtained and normalized, that is to say
Figure FDF0000010457700000024
Step three, classifying the unknown class data:
(1) preprocessing the unknown class semi-structured data d to be classified, obtaining the TSGrams features in the unknown class semi-structured data d by the method, and discarding the features which appear in the unknown class semi-structured data d but are not contained in the feature space omega, so that the unknown class semi-structured data d is represented as a vector in the feature space omega:
Figure FDF0000010457700000025
wherein, wjIs the jth feature f in the TSGrams feature space omegaj:<sj,gj>The value obtained after normalization of the frequency of occurrence in the document d;
(2) computing the unknown class semi-structured data d and any class C using its vector representationiClassification model of
Figure FDF0000010457700000026
The similarity between the two groups is calculated by the following method:
Figure FDF0000010457700000027
wherein, wd(<sj,gj>) is TSGrams characteristic < sj,gjWeight in the unknown class semi-structured data d,
Figure FDF0000010457700000028
for TSGrams characteristic < sk,gjIn class model
Figure FDF0000010457700000029
The weights in (1), d, and
Figure FDF00000104577000000210
are respectively d and
Figure FDF00000104577000000211
euclidean norm of, and sim(s)j,sk) Is a tag sequence sjAnd skThe similarity between them is defined as:
Figure FDF00000104577000000212
wherein m and n are respectively a tag sequence sjAnd skLength of (e) andd(sj,sk) Is s isjAnd skThe edit distance of (d);
(3) assigning the category of document d to C with the highest similarity*I.e. by
Figure FDF00000104577000000213
CN201610555498.7A 2016-01-28 2016-07-14 Semi-structured data classification method based on label sequence and nGrams Active CN106250910B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2016100599996 2016-01-28
CN201610059999 2016-01-28

Publications (2)

Publication Number Publication Date
CN106250910A CN106250910A (en) 2016-12-21
CN106250910B true CN106250910B (en) 2021-01-05

Family

ID=57613103

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610555498.7A Active CN106250910B (en) 2016-01-28 2016-07-14 Semi-structured data classification method based on label sequence and nGrams

Country Status (1)

Country Link
CN (1) CN106250910B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109993235A (en) * 2019-04-10 2019-07-09 苏州浪潮智能科技有限公司 A kind of multivariate data classification method and device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102033867A (en) * 2010-12-14 2011-04-27 西北工业大学 Semantic-similarity measuring method for XML (Extensible Markup Language) document classification
CN102890698A (en) * 2012-06-20 2013-01-23 杜小勇 Method for automatically describing microblogging topic tag
CN103577452A (en) * 2012-07-31 2014-02-12 国际商业机器公司 Website server and method and device for enriching content of website
CN104063472A (en) * 2014-06-30 2014-09-24 电子科技大学 KNN text classifying method for optimizing training sample set
CN104794169A (en) * 2015-03-30 2015-07-22 明博教育科技有限公司 Subject term extraction method and system based on sequence labeling model

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102033867A (en) * 2010-12-14 2011-04-27 西北工业大学 Semantic-similarity measuring method for XML (Extensible Markup Language) document classification
CN102890698A (en) * 2012-06-20 2013-01-23 杜小勇 Method for automatically describing microblogging topic tag
CN103577452A (en) * 2012-07-31 2014-02-12 国际商业机器公司 Website server and method and device for enriching content of website
CN104063472A (en) * 2014-06-30 2014-09-24 电子科技大学 KNN text classifying method for optimizing training sample set
CN104794169A (en) * 2015-03-30 2015-07-22 明博教育科技有限公司 Subject term extraction method and system based on sequence labeling model

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Extending the Single Words-Based Document Model:A Comparison of Bigrams and 2-Itemsets;Roman Tesar;《proceeding DocEng’06 proceedins of the 2006 ACM symposium on document engineering》;20061013;第2节、第4节、第4.4节、第7.1节 *
Karl-Michael Schneider.A New Feature Selection Score for Multinomial Naïve Bayes Text Classification Based on KL-Divergence.《proceedings ACLdemo’04 proceedings of the ACL 2004 on interactive poster and demonstration sessions》.2004, *
基于标签序列的半结构化数据相似度度量;张利军等;《华中科技大学学报(自然科学版)》;20120823;第40卷(第8期);摘要 *
文本分类的归纳学习算法和描述;郑东飞等;《计算机工程与设计》;20060228;第27卷(第4 期);第679-681页 *

Also Published As

Publication number Publication date
CN106250910A (en) 2016-12-21

Similar Documents

Publication Publication Date Title
CN109471938B (en) Text classification method and terminal
AU2011326430B2 (en) Learning tags for video annotation using latent subtags
US9087297B1 (en) Accurate video concept recognition via classifier combination
CN108256104B (en) Comprehensive classification method of internet websites based on multidimensional characteristics
CN104834940A (en) Medical image inspection disease classification method based on support vector machine (SVM)
CN104239553A (en) Entity recognition method based on Map-Reduce framework
CN109446804B (en) Intrusion detection method based on multi-scale feature connection convolutional neural network
CN110222218A (en) Image search method based on multiple dimensioned NetVLAD and depth Hash
CN107341199B (en) Recommendation method based on document information commonality mode
CN112784031B (en) Method and system for classifying customer service conversation texts based on small sample learning
CN112699953A (en) Characteristic pyramid neural network architecture searching method based on multi-information path aggregation
CN115048464A (en) User operation behavior data detection method and device and electronic equipment
CN116582300A (en) Network traffic classification method and device based on machine learning
CN114676346A (en) News event processing method and device, computer equipment and storage medium
CN106250910B (en) Semi-structured data classification method based on label sequence and nGrams
Adami et al. Bootstrapping for hierarchical document classification
CN116578708A (en) Paper data name disambiguation algorithm based on graph neural network
US11514233B2 (en) Automated nonparametric content analysis for information management and retrieval
CN114265954B (en) Graph representation learning method based on position and structure information
CN111768214A (en) Product attribute prediction method, system, device and storage medium
Saund A graph lattice approach to maintaining and learning dense collections of subgraphs as image features
Asirvatham et al. Web page categorization based on document structure
CN107729557A (en) A kind of classification of inventory information, search method and device
CN114611668A (en) Vector representation learning method and system based on heterogeneous information network random walk
CN110717100B (en) Context perception recommendation method based on Gaussian embedded representation technology

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20210915

Address after: 310000 room 660, building 5, No. 16, Zhuantang science and technology economic block, Xihu District, Hangzhou City, Zhejiang Province

Patentee after: Yunyao Technology (Zhejiang) Co.,Ltd.

Address before: 710072 No. 127 Youyi West Road, Shaanxi, Xi'an

Patentee before: Northwestern Polytechnical University