CN107291748A - Feature extraction method and device - Google Patents
- Publication number: CN107291748A
- Application number: CN201610202581.6A
- Authority: CN (China)
- Prior art keywords: word, string, feature, address text, segmentation processing
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3329—Natural language query formulation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
- G06F16/355—Creation or modification of classes or clusters
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Mathematical Physics (AREA)
- Artificial Intelligence (AREA)
- Human Computer Interaction (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The application relates to the technical field of data mining, and in particular to a feature extraction method and device. The feature extraction method provided by the application includes: determining an address text that has undergone word segmentation; and, according to a preset number of words to take and a preset skip count, taking words from the segmented address text to form feature word strings of the segmented address text, wherein the number of words in each feature word string equals the number of words to take, and in each feature word string there are two adjacent words that are separated in the address text by a number of words equal to the skip count. The scheme of the application can apply skip-word processing to address text, which makes it possible to obtain feature word strings with stronger distinguishing power and improves the mining effect on address text.
Description
Technical field
The application relates to the technical field of data mining, and in particular to a feature extraction method and device.
Background
With the rapid growth of text information in data warehouses, text mining has become a research hotspot in the information field. Address information is stored in data warehouses as text. Because address information plays a very important role in big data analysis, address feature mining, as one kind of text mining, is becoming increasingly important.
Word segmentation of Chinese address text is the basis for text mining; this is determined by the characteristics of the Chinese language. For example, after word segmentation, the Chinese address text "Zhejiang Province, Hangzhou City, Yuhang District, Wuchang Subdistrict, Jingfeng Community, Wenyi West Road" yields an address text made up of the words Zhejiang Province, Hangzhou City, Yuhang District, Wuchang Subdistrict, Jingfeng Community and Wenyi West Road. Each word in the segmented address text has its own address meaning (for example, the three characters that make up "Zhejiang Province" carry no address meaning when viewed individually, but the word formed by combining them does). In many cases, even if only some of the words in a Chinese address text are extracted, the extracted words still have strong distinguishing power.
Fig. 1 shows the process of feature extraction on Chinese address text in text classification. As can be seen from Fig. 1, in text mining the Chinese address text is first segmented into words, then features are extracted (that is, words are taken from the Chinese address text), and classification is then performed based on the words taken. Therefore, once the Chinese address text has been segmented, feature extraction is the primary factor affecting the mining effect on Chinese address text.
At present, feature extraction is mainly implemented with the n-gram model. An n-gram is defined as follows: if an address text consists of m words (w1w2w3…wm), where wi is the i-th word in the address text, then the n-grams are $\{\, w_i w_{i+1} \cdots w_{i+n-1} \mid 1 \le i \le m-n+1 \,\}$.
For example, if the current address text consists of 5 words, w1w2w3w4w5, then:
When n=1, the 1-grams generated are w1, w2, w3, w4, w5;
When n=2, the 2-grams generated are w1w2, w2w3, w3w4, w4w5;
When n=3, the 3-grams generated are w1w2w3, w2w3w4, w3w4w5.
A mixed n-gram model takes the union of all the grams; for example, the grams of the mixed 3-gram model are w1, w2, w3, w4, w5, w1w2, w2w3, w3w4, w4w5, w1w2w3, w2w3w4, w3w4w5.
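The n-gram definition and the mixed model above can be sketched in a few lines of Python. This is only an illustration of the definition; the function names `ngrams` and `mixed_ngrams` are hypothetical and do not come from the patent.

```python
def ngrams(words, n):
    """Return every run of n consecutive words, in address-text order."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def mixed_ngrams(words, max_n):
    """Union of the 1-grams up to the max_n-grams (the mixed n-gram model)."""
    grams = []
    for n in range(1, max_n + 1):
        grams.extend(ngrams(words, n))
    return grams


words = ["w1", "w2", "w3", "w4", "w5"]
print(ngrams(words, 3))              # the three 3-grams listed above
print(len(mixed_ngrams(words, 3)))   # 5 + 4 + 3 = 12 grams
```

For an address text of m words there are m-n+1 n-grams, which matches the bound on i in the definition.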
It can be seen that n-gram-based address feature extraction simply extracts n consecutive words from the address text to obtain a feature word string of n words. In some cases, however, there are long-distance dependencies between words in the address text, or people omit unimportant words when describing the same address. Take the standard address text "Zhejiang Province, Hangzhou City, Yuhang District, Wuchang Subdistrict, Jingfeng Community, No. 969 Wenyi West Road, Alibaba Xixi Campus" as an example. When entering the address, people may use a short form: "Hangzhou, Yuhang District, No. 969 Wenyi West Road, Alibaba Xixi Campus". Clearly, the current feature extraction approach cannot extract such a short-form address, because "Yuhang District, Wenyi West Road", which appears in the short form, is not contiguous in the standard address text, yet "Yuhang District, Wenyi West Road" is precisely what has very strong distinguishing power.
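The limitation can be checked with a small Python sketch. It assumes the six-word segmentation given earlier in the background, with English renderings of the place names chosen here for illustration; the variable names are hypothetical.

```python
# Segmented standard address (English renderings are illustrative assumptions).
standard = [
    "Zhejiang Province", "Hangzhou City", "Yuhang District",
    "Wuchang Subdistrict", "Jingfeng Community", "Wenyi West Road",
]
target = ("Yuhang District", "Wenyi West Road")

# All contiguous 2-grams of the standard address.
bigrams = {tuple(standard[i:i + 2]) for i in range(len(standard) - 1)}
print(target in bigrams)  # False: the pair is never adjacent

# Number of words separating the two target words in the standard address.
gap = standard.index(target[1]) - standard.index(target[0]) - 1
print(gap)  # 2
```

Under this segmentation the two words are two words apart, so no contiguous n-gram smaller than the 4-gram spanning both can relate them, and that 4-gram also carries the two unimportant words in between.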
In summary, when address features are currently extracted from address text, the words in each extracted feature word string are all contiguous in the address text, so feature word strings with stronger distinguishing power may be missed, which leads to a poor mining effect on address text.
Summary of the invention
The embodiments of the application provide a feature extraction method and device to improve the mining effect on address text.
An embodiment of the application provides a feature extraction method, including:
Determining an address text that has undergone word segmentation, the segmented address text containing N words, N being an integer greater than 1;
Taking words from the segmented address text according to a preset number of words to take and a preset skip count, to form feature word strings of the segmented address text, wherein the number of words in each feature word string equals the number of words to take, and in each feature word string there are two adjacent words that are separated in the address text by a number of words equal to the skip count.
Optionally, taking words from the segmented address text according to the preset number of words to take and the preset skip count, to form the feature word strings of the segmented address text, specifically includes:
Presetting the number of words to take as n and the skip count as the integers from 1 to k, where n is an integer greater than 1 and less than N, and k is an integer greater than 1 and less than N-1;
According to the current skip count s, choosing n words from the segmented address text starting from the word at the current position, to obtain the feature word string, where s is an integer greater than 0 and less than or equal to k.
Optionally, choosing n words from the segmented address text starting from the word at the current position according to the current skip count s, to obtain the feature word string, includes:
In the segmented address text, choosing n consecutive words starting from the word at the current position, to obtain a first word string;
In the segmented address text, determining the words remaining after the n consecutive words chosen starting from the word at the current position; when the number of remaining words is greater than or equal to s, choosing s consecutive words starting from the first of the remaining words, to obtain a second word string;
Determining first target words among the words of the first word string other than its first word, and determining, in the second word string, second target words equal in number to the first target words;
Determining the feature word string by replacing the first target words in the first word string with the second target words.
Optionally, determining the first target words among the words of the first word string other than its first word, and determining, in the second word string, the second target words equal in number to the first target words, includes:
Taking each word from the second word to the last word of the first word string in turn as the starting skip word, and performing the following operations:
When, in the first word string, the number q of words from the starting skip word to the n-th word of the first word string is greater than or equal to s, determining the s consecutive words starting from the starting skip word as the first target words, and determining the words in the second word string as the second target words, where q is an integer greater than 1 and less than n;
When, in the first word string, the number q of words from the starting skip word to the n-th word of the first word string is less than s, determining the q words from the starting skip word to the n-th word as the first target words, and choosing q consecutive words from the second word string, starting from its last word and moving toward its first word, as the second target words.
Optionally, determining the feature word string by replacing the first target words in the first word string with the second target words includes:
Replacing the first target words in the first word string with the second target words, to obtain a third word string;
Reordering the words in the third word string according to the order of the N words in the segmented address text, to obtain the feature word string.
An embodiment of the application provides a feature extraction device, including:
A determining module, configured to determine an address text that has undergone word segmentation, the segmented address text containing N words, N being an integer greater than 1;
A word-taking module, configured to take words from the segmented address text according to a preset number of words to take and a preset skip count, to form feature word strings of the segmented address text, wherein the number of words in each feature word string equals the number of words to take, and in each feature word string there are two adjacent words that are separated in the address text by a number of words equal to the skip count.
The scheme of the application can apply skip-word processing to address text, which makes it possible to obtain feature word strings with stronger distinguishing power and improves the mining effect on address text.
Other features and advantages of the application will be set forth in the following description, and in part will become apparent from the description or be understood by practicing the application. The objectives and other advantages of the application can be realized and obtained by the structure particularly pointed out in the written description, the claims and the accompanying drawings.
Brief description of the drawings
The accompanying drawings are provided for further understanding of the application and constitute a part of the specification. Together with the embodiments of the application, they serve to explain the application and do not limit it. In the drawings:
Fig. 1 is a flow chart of text classification in data mining in the prior art;
Fig. 2 is a flow chart of a feature extraction method provided in an embodiment of the application;
Fig. 3A is a schematic diagram of generating a feature word string in the case of s < n, provided in an embodiment of the application;
Fig. 3B is another schematic diagram of generating a feature word string in the case of s < n, provided in an embodiment of the application;
Fig. 4 is a schematic diagram of generating a feature word string in the case of s ≥ n, provided in an embodiment of the application;
Fig. 5 is a schematic structural diagram of a feature extraction device provided in an embodiment of the application.
Detailed description
Preferred embodiments of the application are described below with reference to the accompanying drawings. It should be understood that the preferred embodiments described here are only used to illustrate and explain the application and are not intended to limit it. Where no conflict arises, the embodiments of the application and the features in the embodiments may be combined with one another.
An embodiment of the application provides a feature extraction method which, as shown in Fig. 2, includes:
Step 21: determine an address text that has undergone word segmentation. The segmented address text contains N words, where N is an integer greater than 1.
Step 22: according to a preset number of words to take and a preset skip count, take words from the segmented address text to form feature word strings of the segmented address text.
In step 22, words are taken from the segmented address text according to the preset number of words to take and the preset skip count, to obtain a feature word string set containing at least one feature word string, wherein the number of words in each feature word string equals the number of words to take, and in each feature word string there are two adjacent words that are separated in the segmented address text by a number of words equal to the skip count.
In a specific implementation, words may additionally be taken from the segmented address text by contiguous extraction. The resulting feature word string set then contains both the feature word strings extracted contiguously and the feature word strings extracted by the non-contiguous extraction provided by the scheme of the application.
In a specific implementation, step 22 may be implemented, but is not limited to being implemented, as follows:
Step A1: preset the number of words to take as n and the skip count as the integers from 1 to k (if words are also taken from the address text by contiguous extraction, the skip count may be set to the integers from 0 to k), where n is an integer greater than 1 and less than N, and k is an integer greater than 1 and less than N-1.
For example, for the segmented address text w1w2w3w4w5, which contains the 5 words w1, w2, w3, w4, w5 (i.e. N=5), the number of words to take may be set to n=3 and the skip count to 1, 2 or to 0, 1, 2 (i.e. k=2).
In a specific implementation, if words are taken from the address text by contiguous extraction in addition to non-contiguous extraction, the skip count may take the values from 0 to k. The description below also uses skip counts from 0 to k as an example. It should be noted that a skip count of 0 is only one implementation of the embodiments of the application; in practice, the skip count need not include the value 0.
Step A2: according to the current skip count s, choose n words from the segmented address text starting from the word at the current position, to obtain the feature word string.
In a specific implementation, the embodiments of the application may realize non-contiguous feature extraction based on k-skip-n-gram, or based on conti-k-skip-n-gram. Here, k-skip-n-gram means that any two adjacent words in an extracted feature word string are separated by k words in the address text, whereas conti-k-skip-n-gram means that only one pair of adjacent words in an extracted feature word string is separated by k words in the address text. Because k-skip-n-gram is difficult to implement in a program when k>2 and n>3, the embodiments of the application preferably realize feature extraction based on conti-k-skip-n-gram.
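To make the distinction concrete, the following Python sketch implements the k-skip-n-gram variant exactly as it is defined in this text, where every pair of adjacent words in the feature word string is separated by k words. The function name `strict_skip_ngrams` is hypothetical, and this reading of the term may differ from how "skip-gram" is used elsewhere.

```python
def strict_skip_ngrams(words, n, k):
    """n words taken at a fixed stride of k + 1, so that every adjacent
    pair in the result is separated by exactly k words of the address text."""
    step = k + 1
    span = (n - 1) * step + 1          # words covered from first to last pick
    return [tuple(words[i:i + span:step])
            for i in range(len(words) - span + 1)]


words = ["w1", "w2", "w3", "w4", "w5"]
print(strict_skip_ngrams(words, 2, 1))  # pairs one word apart
print(strict_skip_ngrams(words, 3, 1))  # only w1 w3 w5 fits in 5 words
print(strict_skip_ngrams(words, 3, 2))  # needs 7 words, so nothing fits
```

The span grows as (n-1)(k+1)+1, so for short address texts this variant quickly yields no feature word strings at all, whereas the conti variant only needs n+k words.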
Step A2 is executed in a loop. The loop may be organized as an outer loop over the current skip count with a nested loop over the current position, or as an outer loop over the current position with a nested loop over the current skip count. Each is described below.
Loop mode one: an outer loop over the current skip count, with a nested loop over the current position.
In this mode, for the current skip count s, each word from the first word to the (N-n-s+1)-th word of the segmented address text is taken in turn as the word at the current position (to obtain feature word strings that satisfy both the number of words to take and the skip count, the current position must not exceed N-n-s+1), and the following operation is performed: according to the current skip count s, choose n words from the segmented address text starting from the word at the current position, to obtain feature word strings.
For example, for the segmented address text w1w2w3w4w5, which contains the 5 words w1, w2, w3, w4, w5 (i.e. N=5), with the number of words to take set to n=3 and the skip count s taking the values 0, 1, 2 (i.e. k=2), the operation under loop mode one is:
1) The current skip count s is 0. With w1 as the word at the current position, 3 consecutive words are chosen starting from w1, giving the feature word string w1w2w3; next, with w2 as the word at the current position, 3 consecutive words are chosen starting from w2, giving w2w3w4; next, with w3 as the word at the current position, 3 consecutive words are chosen starting from w3, giving w3w4w5. Since N-n-s+1=5-3-0+1=3, when the current skip count s is 0 the current position goes only as far as w3.
2) The current skip count s is 1. With w1 as the word at the current position, 3 words are chosen starting from w1 such that two adjacent words in the chosen feature word string are separated by 1 word in the address text, giving the feature word strings w1w3w4 and w1w2w4; next, with w2 as the word at the current position, 3 words are chosen starting from w2 in the same way, giving w2w3w5 and w2w4w5. Since N-n-s+1=5-3-1+1=2, when the current skip count s is 1 the current position goes only as far as w2.
3) The current skip count s is 2. With w1 as the word at the current position, 3 words are chosen starting from w1 such that two adjacent words in the chosen feature word string are separated by 2 words in the address text, giving the feature word strings w1w4w5 and w1w2w5. Since N-n-s+1=5-3-2+1=1, when the current skip count s is 2 the current position goes only as far as w1.
Thus the feature word strings obtained with loop mode one are w1w2w3, w2w3w4, w3w4w5, w1w3w4, w1w2w4, w2w3w5, w2w4w5, w1w4w5, w1w2w5.
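Loop mode one can be sketched in Python as below. This is a minimal sketch under stated assumptions, not the patent's implementation: the function names `conti_skip_ngram` and `extract_mode_one` are hypothetical, positions are 0-based in the code (the text uses 1-based positions), and each feature word string is built directly by index arithmetic rather than by the replace-and-reorder steps described later.

```python
def conti_skip_ngram(words, start, n, s):
    """Feature word strings of n words beginning at 0-based `start`,
    each with exactly one gap of s words. s == 0 gives the plain n-gram."""
    if s == 0:
        return [tuple(words[start:start + n])]
    results = []
    for j in range(1, n):                      # the gap opens after the j-th word
        head = words[start:start + j]
        tail = words[start + j + s:start + n + s]
        results.append(tuple(head + tail))
    return results


def extract_mode_one(words, n, k, include_contiguous=True):
    """Outer loop over the skip count, inner loop over the current position."""
    total = len(words)
    features = []
    for s in range(0 if include_contiguous else 1, k + 1):
        # 1-based positions 1 .. N-n-s+1 correspond to 0-based 0 .. N-n-s.
        for start in range(total - n - s + 1):
            features.extend(conti_skip_ngram(words, start, n, s))
    return features


feats = extract_mode_one(["w1", "w2", "w3", "w4", "w5"], n=3, k=2)
print(["".join(f) for f in feats])
```

For the five-word example this produces the same nine feature word strings as the walkthrough, although the two strings generated at each position may appear in a different order.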
Loop mode two: an outer loop over the current position, with a nested loop over the current skip count.
In this mode, for the word at the current position (each of the 1st to the (N-n+1)-th words in turn), each value from 0 to k is taken in turn as the current skip count, and the following operation is performed: according to the current skip count s, choose n words from the segmented address text starting from the word at the current position, to obtain feature word strings.
Again taking the segmented address text w1w2w3w4w5 as an example, which contains the 5 words w1, w2, w3, w4, w5 (i.e. N=5), with the number of words to take set to n=3 and the skip count s taking the values 0, 1, 2 (i.e. k=2), the operation under loop mode two is:
1) The word at the current position is w1. With the current skip count s=0, 3 consecutive words are chosen starting from w1, giving the feature word string w1w2w3. With s=1, 3 words are chosen starting from w1 such that two adjacent words in the chosen feature word string are separated by 1 word in the address text, giving the feature word strings w1w3w4 and w1w2w4. With s=2, 3 words are chosen starting from w1 such that two adjacent words in the chosen feature word string are separated by 2 words in the address text, giving w1w4w5 and w1w2w5.
2) The word at the current position is w2. With the current skip count s=0, 3 consecutive words are chosen starting from w2, giving the feature word string w2w3w4. With s=1, 3 words are chosen starting from w2 such that two adjacent words in the chosen feature word string are separated by 1 word in the address text, giving w2w3w5 and w2w4w5. To obtain feature word strings that satisfy both the number of words to take and the skip count, the current position must not exceed N-n-s+1; since N-n-s+1=2 gives s=1 (that is, s is at most 1), when the word at the current position is w2 the current skip count goes only as far as 1.
3) The word at the current position is w3. With the current skip count s=0, 3 consecutive words are chosen starting from w3, giving the feature word string w3w4w5. Since N-n-s+1=3 gives s=0, when the word at the current position is w3 the current skip count goes only as far as 0.
Thus the feature word strings obtained with loop mode two are w1w2w3, w1w3w4, w1w2w4, w1w4w5, w1w2w5, w2w3w4, w2w3w5, w2w4w5, w3w4w5.
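A corresponding sketch of loop mode two follows. It assumes 1-based positions as in the text and derives the largest usable skip count at each position from the constraint that the current position must not exceed N-n-s+1. The name `extract_mode_two` and the index-based construction are illustrative choices, not taken from the patent.

```python
def extract_mode_two(words, n, k):
    """Outer loop over the current position, inner loop over the skip count."""
    total = len(words)
    features = []
    for pos in range(1, total - n + 2):            # 1-based positions 1 .. N-n+1
        max_s = min(k, total - n - pos + 1)        # from pos <= N - n - s + 1
        for s in range(0, max_s + 1):
            # With s == 0 there is no gap; otherwise the gap opens after word j.
            gap_points = [n] if s == 0 else range(1, n)
            for j in gap_points:
                idx = [pos - 1 + t + (s if t >= j else 0) for t in range(n)]
                features.append("".join(words[i] for i in idx))
    return features


print(extract_mode_two(["w1", "w2", "w3", "w4", "w5"], n=3, k=2))
```

Both loop modes visit the same (position, skip count) pairs, so they yield the same set of feature word strings and differ only in the order in which the strings are produced.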
Besides the two loop modes above, the loop need not increase the current skip count and the current position by 1 in sequence, as long as all possible combinations of current skip count and current position are eventually traversed.
As the above results show, the embodiments of the application obtain 9 feature word strings, whereas traditional contiguous extraction obtains only the three feature word strings w1w2w3, w2w3w4, w3w4w5. With the scheme of the application, therefore, non-contiguous feature word strings are obtained and the number of feature word strings increases, so the mining effect on address text can be improved.
Whichever loop mode is used, for each iteration (corresponding to one current skip count s and one current position), step A2 above may be implemented, but is not limited to being implemented, by the following steps:
Step B1: in the segmented address text, choose n consecutive words starting from the word at the current position, to obtain a first word string.
Here, choosing consecutively means choosing consecutively toward the last word of the address text (below, unless choosing toward the first word is explicitly stated, consecutive choosing always means choosing toward the last word).
Step B1 is a contiguous word-selection process. For the segmented address text w1w2w3w4w5, if the word at the current position is w1, then n=3 consecutive words are chosen and the first word string is w1w2w3.
Step B2: in the segmented address text, determine the words remaining after the n consecutive words (the first word string) chosen starting from the word at the current position; when the number of remaining words is greater than or equal to s, choose s consecutive words starting from the first of the remaining words, to obtain a second word string.
Here, if the number of remaining words is less than s, n words cannot be selected from the word at the current position with the current skip count s, because the selection would run past the end of the address text. In that case the iteration ends. For example, for the segmented address text w1w2w3w4w5, suppose n=3, s=2 and the word at the current position is w2. The number of words remaining after the 3 words starting from w2 is 1, so 2 consecutive words cannot be chosen starting from w5. Therefore, in the embodiments of the application, the second word string is obtained only when the number of remaining words is not less than s.
For example, for the segmented address text w1w2w3w4w5, the words remaining after 3 consecutive words are chosen starting from the word at the current position w1 are w4, w5. The number of remaining words is 2, equal to s (current skip count s=2), so the 1st and 2nd of the remaining words w4, w5 are chosen and the second word string is w4w5.
In a specific implementation, when the current skip count s is known, the maximum current position may be set to N-n-s+1 (see loop mode one above), which guarantees that the number of remaining words is not less than s.
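Steps B1 and B2, including the boundary check, can be sketched as follows. The helper name `first_and_second` is hypothetical, and returning `None` to signal that the iteration ends is a choice made for this illustration.

```python
def first_and_second(words, pos, n, s):
    """Steps B1 and B2 for 1-based current position `pos`.

    Returns (first_word_string, second_word_string), or None when fewer
    than s words remain after the first word string (the iteration ends).
    """
    start = pos - 1
    first = words[start:start + n]        # B1: n consecutive words
    remaining = words[start + n:]         # words left after the first string
    if len(first) < n or len(remaining) < s:
        return None
    return first, remaining[:s]           # B2: first s of the remaining words


words = ["w1", "w2", "w3", "w4", "w5"]
print(first_and_second(words, pos=1, n=3, s=2))  # first w1 w2 w3, second w4 w5
print(first_and_second(words, pos=2, n=3, s=2))  # None: only w5 remains
```

Limiting `pos` to at most N-n-s+1 in the calling loop makes the `None` branch unreachable, which is the guarantee described in the preceding paragraph.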
Step B3: among the words of the first word string other than its first word, determine first target words, and determine, in the second word string, second target words equal in number to the first target words.
For example, among the words w2, w3 of the first word string w1w2w3 other than its first word w1, the first target word is determined to be w3; in the second word string w4w5, the second target word is determined to be w5.
Step B4: determine the feature word string by replacing the first target words in the first word string with the second target words.
Here, after the first target words in the first word string are replaced with the second target words, the order of the words may not match their order in the address text, so step B4 may be performed as:
Step B4*: replace the first target words in the first word string with the second target words, to obtain a third word string; then reorder the words in the third word string according to the order of the N words in the segmented address text, to obtain the feature word string.
For example, replacing the first target word w3 in the first word string w1w2w3 with the second target word w5 gives w1w5w2; reordering w1w5w2 according to the order of the words in the address text w1w2w3w4w5 then gives the feature word string w1w2w5.
In a specific implementation, the first target words and second target words determined in step B3 must, besides being equal in number, also ensure that the skip count in the feature word string obtained after step B4 equals the current skip count. To satisfy this condition, one could first choose all possible first target words in the first word string and all possible second target words in the second word string, determine all possible combinations of first and second target words, and then select from them the combinations that satisfy the current skip count. However, this approach involves a heavy workload and consumes considerable system resources. For this reason, the embodiments of the application propose a preferred implementation based on steps C1 to C2, described below.
Step B3 above may be implemented as follows:
Take each word from the second word to the last word of the first word string in turn as the starting skip word (for example, take w2 and w3 of the first word string w1w2w3 in turn as the starting skip word), and perform the following steps:
Step C1: when, in the first word string, the number q of words from the starting skip word to the n-th word of the first word string is greater than or equal to s, determine the s consecutive words starting from the starting skip word as the first target words, and determine the words in the second word string as the second target words; q is an integer greater than 1 and less than n.
For example, in the first word string w1w2w3, the number of words from the starting skip word w2 to the 3rd word w3 of the first word string is 2, equal to the current skip count 2, so the 2 consecutive words starting from the starting skip word w2, namely w2 and w3, are determined as the first target words, and the words w4, w5 of the second word string w4w5 are determined as the second target words. Replacing the first target words w2, w3 in the first word string w1w2w3 with the second target words w4, w5 then gives w1w4w5.
Step C2:When in the first word string, since word is jumped in starting to the first word string n-th of word word
When quantity q is less than s, q word jumping word since the starting to n-th of word is defined as first object
Word, and since second word string last word, the direction of first word towards in the second word string,
Continuously choose q word and be used as the second target word.
Such as, the first word string w1w2w3In, jump word w from starting3Start to the 3rd word w3Word quantity 1
Less than when front jumping word number 2, then word w is jumped into starting3It is defined as the first object word, by the second word string w4w5
In last word w5It is defined as second target word.So, by the first word string w1w2w3In
It is w that one target word, which is replaced with after the second target word,1w2w5。
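
Steps C1 and C2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the application: the function name `replace_target_words`, the use of lists for word strings, and the 0-based index `start` of the starting skip word (the word at which the skip begins) are all hypothetical.

```python
def replace_target_words(first, second, start):
    """Apply step C1 or C2 and return the resulting word string.

    first  -- the first word string (n words), as a list
    second -- the second word string (s words), as a list
    start  -- hypothetical 0-based index in `first` of the starting skip
              word; it must point at the 2nd..last word, i.e. 1 .. n-1
    """
    n, s = len(first), len(second)
    if not 1 <= start < n:
        raise ValueError("the starting skip word must be the 2nd..last word")
    q = n - start  # words from the starting skip word to the n-th word
    if q >= s:
        # Step C1: s consecutive words are replaced by the whole second string.
        kept = first[:start] + first[start + s:]
        substitutes = list(second)
    else:
        # Step C2: q words are replaced by the last q words of the second string.
        kept = first[:start]
        substitutes = second[s - q:]
    # Every word of the second string follows the first string in the text,
    # so appending the substitutes preserves the original word order.
    return kept + substitutes


first, second = ["w1", "w2", "w3"], ["w4", "w5"]
print(replace_target_words(first, second, 1))  # starting skip word w2
print(replace_target_words(first, second, 2))  # starting skip word w3
```

With the word strings used in the two examples above, the calls return w1 w4 w5 and w1 w2 w5 respectively.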
For a better understanding of the embodiment of the present application, its implementation is illustrated below with specific examples.
Example one
Fig. 3A and Fig. 3B are schematic diagrams of the generation process of feature word strings in the case s < n.
In this example, n = 6 and s = 3, and the word at the current position is the i-th word counting from the first word of the address text (i.e. the current position is i). The gram generation process is as follows:
1. Starting from the word at the current position, n (n = 6) consecutive words are chosen and placed in a buffer (buff); the words in buff are concatenated into the first word string (corresponding to step B1).
2. Starting from the first word after the first word string (i.e. position i+n), s consecutive words are chosen as the second word string (corresponding to step B2).
3. Depending on the position of the starting skip word in the first word string, two situations arise: the number q of words from the starting skip word to the n-th word of the first word string is not less than s (situation one), or q is less than s (situation two).
For situation one, for example when the starting skip word is the 2nd word of the first word string, i.e. the (i+1)-th word, the s (s = 3) consecutive words beginning at the starting skip word (positions i+1, i+2, i+3) are determined as the first target words, and the words at positions i+n, i+n+1 and i+n+2 (the words of the second word string) are determined as the second target words (corresponding to step C1). The detailed process is shown in Fig. 3A.
For situation two, for example when the starting skip word is the 5th word of the first word string (position i+4), the words from the starting skip word to the n-th (n = 6) word of the first word string (the words at positions i+4 and i+5) are determined as the first target words. Starting from the last word of the second word string (position i+n+2) and moving towards its first word, the same number of consecutive words are chosen as the second target words, i.e. the words at positions i+n+2 and i+n+1 (corresponding to step C2). The detailed process is shown in Fig. 3B.
4. The first target words in the first word string are replaced with the second target words, and the resulting word string is reordered according to the order in which the N words are arranged in the address text, giving the reordered feature word string (corresponding to step B4).
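
The positions that end up in the feature word string in Example one can be traced with a short Python sketch. It is only an illustration under assumptions: the function name `example_offsets` and the representation of words by their offsets from the current position i are hypothetical, and `start` is the 0-based offset of the starting skip word.

```python
def example_offsets(n, s, start):
    """Return the sorted offsets (relative to position i) of one feature string."""
    first = list(range(n))            # offsets 0 .. n-1, step 1
    second = list(range(n, n + s))    # offsets n .. n+s-1, step 2
    q = n - start                     # words from the starting skip word to the n-th word
    if q >= s:
        # Situation one: replace s words by the whole second word string.
        third = first[:start] + second + first[start + s:]
    else:
        # Situation two: replace q words by the last q words of the second string.
        third = first[:start] + second[s - q:]
    return sorted(third)              # step 4: restore the address-text order


for start in (1, 4):  # 2nd word (situation one) and 5th word (situation two)
    labels = ["i" if d == 0 else f"i+{d}" for d in example_offsets(6, 3, start)]
    print(f"starting skip word at i+{start}:", labels)
```

In both situations the printed positions show one run of s = 3 skipped positions, which is the condition the feature word string has to satisfy.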
Example two
Fig. 4 is a schematic diagram of another gram generation process, for the case s ≥ n, provided by example two.
In this example, n = 4 and s = 5, and the word at the current position is the i-th word counting from the first word of the address text (i.e. the current position is i). The gram generation process is as follows:
1. Starting from the word at the current position, n (n = 4) consecutive words are chosen and placed in a buffer (buff); the words in buff are concatenated into the first word string (corresponding to step B1).
2. Starting from the first word after the first word string (i.e. position i+n), s consecutive words are chosen as the second word string (corresponding to step B2).
3. In the case s ≥ n, wherever the starting skip word lies in the first word string, the number q of words from the starting skip word to the n-th word of the first word string is always less than s. For example, when the starting skip word is the 2nd word of the first word string (position i+1), the words at the remaining positions including that of the starting skip word (i+1, i+2, i+3) are determined as the first target words. Starting from the word at the last position of the second word string (i+n+4) and moving towards its first word, the same number of consecutive words are chosen as the second target words, i.e. the words at positions i+n+4, i+n+3 and i+n+2 (corresponding to step C2).
4. The first target words are replaced with the second target words to obtain the feature word string (corresponding to step B4).
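
A small Python sketch can enumerate every starting skip word for Example two and confirm that step C2 always applies when s ≥ n. The function name `grams_when_s_not_less_than_n` and the offset representation (positions relative to i) are assumptions made for illustration only.

```python
def grams_when_s_not_less_than_n(n, s):
    """Map each starting skip offset to the offsets of its feature word string."""
    if s < n:
        raise ValueError("this sketch only covers the case s >= n")
    second = list(range(n, n + s))        # second word string: offsets n .. n+s-1
    grams = {}
    for start in range(1, n):             # 2nd .. last word of the first word string
        q = n - start                     # q <= n-1 < s, so step C2 always applies
        kept = list(range(start))         # words before the starting skip word
        grams[start] = kept + second[s - q:]  # last q words of the second string
    return grams


for start, offsets in grams_when_s_not_less_than_n(4, 5).items():
    # Number of address-text words between the two words around the skip.
    skipped = offsets[start] - offsets[start - 1] - 1
    print(f"start=i+{start} offsets={offsets} skipped={skipped}")
```

For the starting skip word at i+1 the sketch selects offsets 0, 6, 7 and 8, i.e. positions i, i+n+2, i+n+3 and i+n+4, matching step 3 above.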
Here, the number of feature word strings extracted based on Conti-k-skip-n-gram is compared with the number extracted based on n-gram.
Table 1 lists comparative data for the number of feature word strings (grams) extracted with 2-gram and with Conti-k-skip-2-gram (k = 1, 2, 3 and 4), and Table 2 lists comparative data for the number of grams extracted with 3-gram and with Conti-k-skip-3-gram (k = 1, 2, 3 and 4).
Table 1
| N | 2-gram | Conti-1-skip | Conti-2-skip | Conti-3-skip | Conti-4-skip |
|---|---|---|---|---|---|
| 5 | 4 | 7 | 9 | 10 | 10 |
| 10 | 9 | 17 | 24 | 30 | 35 |
| 15 | 14 | 27 | 39 | 50 | 60 |
| 20 | 19 | 35 | 51 | 66 | 80 |
Table 2
As Tables 1 and 2 show, the number of grams extracted based on Conti-k-skip-n-gram is markedly larger than the number extracted based on n-gram. In other words, Conti-k-skip-n-gram can produce grams that n-gram cannot (grams in which some adjacent words are not adjacent in the address text).
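
The counts in Table 1 can be checked with the following Python sketch. It rests on one reading of steps B1 to B4, not on a confirmed implementation: after reordering, each feature word string is taken to consist of the words before the starting skip word followed by the words after a run of s skipped words, and a window is only used when the second word string exists. The function names are hypothetical. Under this reading the sketch reproduces the rows for N = 5, 10 and 15; for N = 20 it yields 19, 37, 54, 70 and 85, which differs from the table row, so that row may follow a different counting convention.

```python
def conti_skip_grams(words, n, s):
    """Grams of n words containing one run of s skipped words (assumed reading)."""
    grams = []
    total = len(words)
    for i in range(total - n - s + 1):       # the second word string must exist
        for start in range(1, n):            # 2nd .. last word of the window
            gram = words[i:i + start] + words[i + start + s:i + n + s]
            grams.append(tuple(gram))
    return grams


def count_row(total_words, n, max_k):
    """Return [n-gram count, Conti-1-skip count, ..., Conti-max_k-skip count]."""
    words = [f"w{j}" for j in range(1, total_words + 1)]
    running = max(total_words - n + 1, 0)    # plain n-gram count
    row = [running]
    for s in range(1, max_k + 1):
        running += len(conti_skip_grams(words, n, s))
        row.append(running)
    return row


for total_words in (5, 10, 15):
    print(total_words, count_row(total_words, n=2, max_k=4))
```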
In addition, as a counterpart to the traditional mixed n-gram model, the embodiment of the present application may use a Conti-k-skip mixed n-gram model (k = 2, n = 3). Take the address of the Alibaba Xixi Campus as an example: "Zhejiang Province Hangzhou City Yuhang District Wuchang Subdistrict Jingfeng Community Wenyi West Road No. 969 Alibaba Xixi Campus Building 5". The statistics of the number of grams generated are shown in Table 3.
Table 3
| Model | Number of grams |
|---|---|
| Conti-2-skip-1-gram | 10+0 |
| Conti-2-skip-2-gram | 9+15 |
| Conti-2-skip-3-gram | 8+26 |
| Total | 68 |
As Table 3 shows, the Conti-2-skip mixed 3-gram model generates 68 grams, whereas the traditional n-gram mixed 3-gram model generates 27 grams; that is, 41 more grams are produced. The additional 41 grams comprise:
1. 15 2-grams:
1) Zhejiang Province / Yuhang District
2) Zhejiang Province / Wuchang Subdistrict (※)
3) Hangzhou City / Wuchang Subdistrict (※)
4) Hangzhou City / Jingfeng Community (※)
5) Yuhang District / Jingfeng Community (※)
6) Yuhang District / Wenyi West Road (※)
7) Wuchang Subdistrict / Wenyi West Road (※)
8) Wuchang Subdistrict / 969
9) Jingfeng Community / 969
10) Jingfeng Community / No.
11) Wenyi West Road / No.
12) Wenyi West Road / Alibaba Xixi Campus (※)
13) 969 / Alibaba Xixi Campus (※)
14) 969 / Building 5
15) Alibaba Xixi Campus / Building 5 (※)
2. 26 3-grams:
1) Zhejiang Province / Yuhang District / Wuchang Subdistrict (※)
2) Zhejiang Province / Hangzhou City / Wuchang Subdistrict (※)
3) Zhejiang Province / Wuchang Subdistrict / Jingfeng Community (※)
4) Zhejiang Province / Hangzhou City / Jingfeng Community (※)
5) Hangzhou City / Wuchang Subdistrict / Jingfeng Community (※)
6) Hangzhou City / Yuhang District / Jingfeng Community (※)
7) Hangzhou City / Jingfeng Community / Wenyi West Road (※)
8) Hangzhou City / Yuhang District / Wenyi West Road (※)
9) Yuhang District / Jingfeng Community / Wenyi West Road (※)
10) Yuhang District / Wuchang Subdistrict / Wenyi West Road (※)
11) Yuhang District / Wenyi West Road / 969 (※)
12) Yuhang District / Wuchang Subdistrict / 969
13) Wuchang Subdistrict / Wenyi West Road / 969 (※)
14) Wuchang Subdistrict / Jingfeng Community / 969
15) Wuchang Subdistrict / 969 / No.
16) Wuchang Subdistrict / Jingfeng Community / No.
17) Jingfeng Community / 969 / No.
18) Jingfeng Community / Wenyi West Road / No.
19) Jingfeng Community / No. / Alibaba Xixi Campus
20) Jingfeng Community / Wenyi West Road / Alibaba Xixi Campus (※)
21) Wenyi West Road / No. / Alibaba Xixi Campus (※)
22) Wenyi West Road / 969 / Alibaba Xixi Campus (※)
23) Wenyi West Road / Alibaba Xixi Campus / Building 5 (※)
24) Wenyi West Road / 969 / Building 5
25) 969 / Alibaba Xixi Campus / Building 5 (※)
26) 969 / No. / Building 5
Among the 41 grams produced in addition to those of n-gram, many are highly discriminative. On a test corpus drawn from address applications, feature selection finds that 27 of these features (those marked (※)) are highly discriminative. It can be seen that the feature extraction method based on Conti-k-skip-n-gram can significantly improve the mining of address text.
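
The totals in Table 3 can be reproduced with the Python sketch below. Two things are assumptions rather than statements of the application: the segmentation of the example address into the ten words of `ADDRESS` (chosen because Table 3 reports ten 1-grams), and the reading that each skip gram consists of the words before the starting skip word followed by the words after a run of s skipped words. The function names are hypothetical.

```python
# Assumed 10-word segmentation of the example address (English rendering).
ADDRESS = [
    "Zhejiang Province", "Hangzhou City", "Yuhang District",
    "Wuchang Subdistrict", "Jingfeng Community", "Wenyi West Road",
    "969", "No.", "Alibaba Xixi Campus", "Building 5",
]


def ngrams(words, n):
    """Traditional n-grams: every run of n consecutive words."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def skip_grams(words, n, k):
    """Grams of n words with one run of s skipped words, for s = 1 .. k."""
    grams = []
    for s in range(1, k + 1):
        for i in range(len(words) - n - s + 1):   # second word string must exist
            for start in range(1, n):             # 2nd .. last word of the window
                gram = words[i:i + start] + words[i + start + s:i + n + s]
                grams.append(tuple(gram))
    return grams


def mixed_counts(words, max_n, k):
    """Per n: (number of n-grams, number of additional skip grams)."""
    return {n: (len(ngrams(words, n)), len(skip_grams(words, n, k)))
            for n in range(1, max_n + 1)}


counts = mixed_counts(ADDRESS, max_n=3, k=2)
traditional = sum(plain for plain, _ in counts.values())
extra = sum(skipped for _, skipped in counts.values())
print(counts, "total:", traditional + extra, "extra:", extra)
```

Under these assumptions the per-model counts are 10+0, 9+15 and 8+26, giving 27 traditional grams plus 41 additional ones.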
In addition, the embodiment of the present application applies the grams extracted based on n-gram and on Conti-k-skip-n-gram to the classification of text into address text and non-address text. Table 4 below gives the statistics of the classification accuracy.
Table 4
As Table 4 shows, with a small amount of experimental data, increasing the number of extracted feature word strings gives better text classification accuracy for Conti-k-skip-n-gram, and the improvement is greater for address text than for non-address text.
Based on the same inventive concept, the embodiment of the present application further provides a feature extraction device corresponding to the feature extraction method. Since the principle by which the device solves the problem is similar to that of the feature extraction method provided by the embodiment of the present application, repeated details are omitted.
As shown in Fig. 5, the feature extraction device provided by the embodiment of the present application comprises:
a determining module 51, configured to determine the address text after word segmentation; the address text after word segmentation contains N words, N being an integer greater than 1;
a word-taking module 52, configured to take words from the address text after word segmentation according to a preset word-taking number and a preset skip number, so as to form the feature word strings of the address text after word segmentation; wherein the number of words taken in each feature word string equals the word-taking number, and each feature word string contains two adjacent words that are separated in the address text by a number of words equal to the skip number.
Optionally, the word-taking module 52 is specifically configured to:
preset the word-taking number as n and preset the skip number as the integers from 1 to k, where n is an integer greater than 1 and less than N, and k is an integer greater than 1 and less than N-1; and, according to the current skip number s, choose n words in the address text after word segmentation, starting from the word at the current position, to obtain the feature word string; s is an integer greater than 0 and less than or equal to k.
Optionally, the word-taking module 52 is specifically configured to:
in the address text after word segmentation, choose n consecutive words starting from the word at the current position to obtain a first word string; in the address text after word segmentation, determine the remaining words that follow the n consecutive words chosen from the word at the current position; when the number of remaining words is greater than or equal to s, choose s consecutive words starting from the first of the remaining words to obtain a second word string; determine first target words among the words of the first word string other than its first word, and determine in the second word string second target words equal in number to the first target words; and determine the feature word string by replacing the first target words in the first word string with the second target words.
Optionally, the word-taking module 52 is specifically configured to:
take each word from the second word to the last word of the first word string in turn as the starting skip word, and perform the following operations:
when, in the first word string, the number q of words from the starting skip word to the n-th word of the first word string is greater than or equal to s, determine the s consecutive words beginning at the starting skip word as the first target words, and determine the words in the second word string as the second target words; q is an integer greater than 1 and less than n;
when, in the first word string, the number q of words from the starting skip word to the n-th word of the first word string is less than s, determine the q words from the starting skip word to the n-th word as the first target words, and choose q consecutive words from the second word string, starting at its last word and moving towards its first word, as the second target words.
Optionally, the word-taking module 52 is specifically configured to:
replace the first target words in the first word string with the second target words to obtain a third word string; and reorder the words in the third word string according to the order in which the N words are arranged in the address text after word segmentation, to obtain the feature word string.
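
The two modules of Fig. 5 can be pictured with the following Python sketch. It is a hypothetical arrangement for illustration only: the class names, the method names, the assumption that the address text arrives already segmented as a list of words, and the slice-based shortcut for steps B3 and B4 are not taken from the application.

```python
from dataclasses import dataclass
from typing import List, Tuple


class DeterminingModule:
    """Stands in for determining module 51: supplies the segmented address text."""

    def determine(self, segmented: List[str]) -> List[str]:
        # Assumption: segmentation was done upstream; only N > 1 is checked here.
        if len(segmented) <= 1:
            raise ValueError("the address text must contain more than one word")
        return list(segmented)


@dataclass
class WordTakingModule:
    """Stands in for word-taking module 52."""

    n: int  # preset word-taking number
    k: int  # skip numbers 1 .. k are used in turn

    def take(self, words: List[str]) -> List[Tuple[str, ...]]:
        features = []
        for s in range(1, self.k + 1):                        # current skip number
            for i in range(len(words) - self.n - s + 1):      # current position
                for start in range(1, self.n):                # starting skip word
                    # Words before the skip, then the words after s skipped words.
                    gram = words[i:i + start] + words[i + start + s:i + self.n + s]
                    features.append(tuple(gram))
        return features


@dataclass
class FeatureExtractionDevice:
    determining: DeterminingModule
    word_taking: WordTakingModule

    def extract(self, segmented: List[str]) -> List[Tuple[str, ...]]:
        return self.word_taking.take(self.determining.determine(segmented))


device = FeatureExtractionDevice(DeterminingModule(), WordTakingModule(n=2, k=1))
print(device.extract(["a", "b", "c", "d", "e"]))
```

Keeping the two modules separate mirrors the split in the description, so that a different segmenter could be swapped in without touching the word-taking logic.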
Those skilled in the art should understand that embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage and the like) containing computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of the method, device (system) and computer program product according to the embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps are performed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
Although preferred embodiments of the present application have been described, those skilled in the art, once they learn of the basic inventive concept, can make additional changes and modifications to these embodiments. The appended claims are therefore intended to be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of the present application.
Obviously, those skilled in the art can make various changes and variations to the present application without departing from its spirit and scope. Thus, if these modifications and variations fall within the scope of the claims of the present application and their equivalent technologies, the present application is also intended to include them.
Claims (12)
1. A feature extraction method, characterized by comprising:
determining an address text after word segmentation, the address text after word segmentation containing N words, N being an integer greater than 1;
taking words from the address text after word segmentation according to a preset word-taking number and a preset skip number, to form feature word strings of the address text after word segmentation; wherein the number of words taken in each feature word string equals the word-taking number, and each feature word string contains two adjacent words that are separated in the address text by a number of words equal to the skip number.
2. The method of claim 1, characterized in that taking words from the address text after word segmentation according to the preset word-taking number and skip number, to form the feature word strings of the address text after word segmentation, specifically comprises:
presetting the word-taking number as n and presetting the skip number as the integers from 1 to k, n being an integer greater than 1 and less than N, and k being an integer greater than 1 and less than N-1;
according to a current skip number s, choosing n words in the address text after word segmentation, starting from the word at the current position, to obtain the feature word string; s being an integer greater than 0 and less than or equal to k.
3. The method of claim 2, characterized in that choosing n words in the address text after word segmentation according to the current skip number s, starting from the word at the current position, to obtain the feature word string, comprises:
in the address text after word segmentation, choosing n consecutive words starting from the word at the current position to obtain a first word string;
in the address text after word segmentation, determining the remaining words that follow the n consecutive words chosen from the word at the current position; when the number of remaining words is greater than or equal to s, choosing s consecutive words starting from the first of the remaining words to obtain a second word string;
determining first target words among the words of the first word string other than its first word, and determining in the second word string second target words equal in number to the first target words;
determining the feature word string by replacing the first target words in the first word string with the second target words.
4. The method of claim 3, characterized in that determining first target words among the words of the first word string other than its first word, and determining in the second word string second target words equal in number to the first target words, comprises:
taking each word from the second word to the last word of the first word string in turn as a starting skip word, and performing the following operations:
when, in the first word string, the number q of words from the starting skip word to the n-th word of the first word string is greater than or equal to s, determining the s consecutive words beginning at the starting skip word as the first target words, and determining the words in the second word string as the second target words; q being an integer greater than 1 and less than n;
when, in the first word string, the number q of words from the starting skip word to the n-th word of the first word string is less than s, determining the q words from the starting skip word to the n-th word as the first target words, and choosing q consecutive words from the second word string, starting at its last word and moving towards its first word, as the second target words.
5. The method of claim 4, characterized in that determining the feature word string by replacing the first target words in the first word string with the second target words comprises:
replacing the first target words in the first word string with the second target words to obtain a third word string;
reordering the words in the third word string according to the order in which the N words are arranged in the address text after word segmentation, to obtain the feature word string.
6. The method of claim 1, characterized in that each feature word string containing two adjacent words that are separated in the address text by a number of words equal to the skip number comprises:
any two adjacent words in each feature word string being separated in the address text by a number of words equal to the skip number.
7. A feature extraction device, characterized by comprising:
a determining module, configured to determine an address text after word segmentation, the address text after word segmentation containing N words, N being an integer greater than 1;
a word-taking module, configured to take words from the address text after word segmentation according to a preset word-taking number and a preset skip number, to form feature word strings of the address text after word segmentation; wherein the number of words taken in each feature word string equals the word-taking number, and each feature word string contains two adjacent words that are separated in the address text by a number of words equal to the skip number.
8. The device of claim 7, characterized in that the word-taking module is specifically configured to:
preset the word-taking number as n and preset the skip number as the integers from 1 to k, n being an integer greater than 1 and less than N, and k being an integer greater than 1 and less than N-1; and, according to a current skip number s, choose n words in the address text after word segmentation, starting from the word at the current position, to obtain the feature word string; s being an integer greater than 0 and less than or equal to k.
9. The device of claim 8, characterized in that the word-taking module is specifically configured to:
in the address text after word segmentation, choose n consecutive words starting from the word at the current position to obtain a first word string; in the address text after word segmentation, determine the remaining words that follow the n consecutive words chosen from the word at the current position; when the number of remaining words is greater than or equal to s, choose s consecutive words starting from the first of the remaining words to obtain a second word string; determine first target words among the words of the first word string other than its first word, and determine in the second word string second target words equal in number to the first target words; and determine the feature word string by replacing the first target words in the first word string with the second target words.
10. The device of claim 9, characterized in that the word-taking module is specifically configured to:
take each word from the second word to the last word of the first word string in turn as a starting skip word, and perform the following operations:
when, in the first word string, the number q of words from the starting skip word to the n-th word of the first word string is greater than or equal to s, determine the s consecutive words beginning at the starting skip word as the first target words, and determine the words in the second word string as the second target words; q being an integer greater than 1 and less than n;
when, in the first word string, the number q of words from the starting skip word to the n-th word of the first word string is less than s, determine the q words from the starting skip word to the n-th word as the first target words, and choose q consecutive words from the second word string, starting at its last word and moving towards its first word, as the second target words.
11. The device of claim 10, characterized in that the word-taking module is specifically configured to:
replace the first target words in the first word string with the second target words to obtain a third word string; and reorder the words in the third word string according to the order in which the N words are arranged in the address text after word segmentation, to obtain the feature word string.
12. The device of claim 7, characterized in that any two adjacent words in each feature word string are separated in the address text by a number of words equal to the skip number.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201610202581.6A CN107291748B (en) | 2016-03-31 | 2016-03-31 | Feature extraction method and device |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN107291748A (en) | 2017-10-24 |
| CN107291748B (en) | 2021-01-15 |
Family
ID=60087452
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201610202581.6A Active CN107291748B (en) | 2016-03-31 | 2016-03-31 | Feature extraction method and device |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN107291748B (en) |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN101021838A (en) * | 2007-03-02 | 2007-08-22 | 华为技术有限公司 | Text handling method and system |
| US20130311452A1 (en) * | 2012-05-16 | 2013-11-21 | Daniel Jacoby | Media and location based social network |
| CN103714092A (en) * | 2012-09-29 | 2014-04-09 | 北京百度网讯科技有限公司 | Geographic position searching method and geographic position searching device |
| CN104142915A (en) * | 2013-05-24 | 2014-11-12 | 腾讯科技(深圳)有限公司 | Punctuation adding method and system |
Also Published As
| Publication number | Publication date |
|---|---|
| CN107291748B (en) | 2021-01-15 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | TA01 | Transfer of patent application right | Effective date of registration: 20180418; Address after: Four story 847 mailbox of the capital mansion of Cayman Islands, Cayman Islands, Cayman; Applicant after: CAINIAO SMART LOGISTICS HOLDING Ltd.; Address before: Cayman Islands Grand Cayman capital building, a four storey No. 847 mailbox; Applicant before: ALIBABA GROUP HOLDING Ltd. |
| | GR01 | Patent grant | |