CN107894979A - Compound word processing method, apparatus and device for semantic mining - Google Patents
Compound word processing method, apparatus and device for semantic mining
- Publication number
- CN107894979A (application number CN201711163429.2A)
- Authority
- CN
- China
- Prior art keywords
- dimensional
- compound word
- words
- word
- compound
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Machine Translation (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention proposes a compound word processing method, apparatus and device for semantic mining. The method includes: determining the M word segments of each sentence in a training corpus; selecting N word segments according to the order in which the M word segments appear and generating an N-dimensional compound word, where M is greater than or equal to 2 and N is greater than or equal to 2 and less than or equal to M; performing K hash operations on the character string of the N-dimensional compound word, querying a pre-established random hash dictionary space to obtain the position uniquely corresponding to each hash operation result, and generating the K-dimensional word vector of the N-dimensional compound word from the floating-point numbers at the K positions corresponding to the K hash operation results, where K is an integer greater than 1; and screening out, according to the K-dimensional word vectors of all N-dimensional compound words, the N-dimensional target compound words that satisfy a preset condition, and inputting the N-dimensional target compound words into a bag-of-words model for semantic mining. In this way, more semantic features of larger granularity are introduced into the bag-of-words model, further improving its effect.
Description
Technical field
The present invention relates to the technical field of information processing, and in particular to a compound word processing method, apparatus and device for semantic mining.
Background
Artificial intelligence (AI) is a new technical science that studies and develops theories, methods, technologies and application systems for simulating, extending and expanding human intelligence. It is a branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can respond in a manner similar to human intelligence. Research in the field includes robotics, speech recognition, image recognition, natural language processing and expert systems.
At present, the bag-of-words (Bag of Words) model is widely used in text semantic relevance matching tasks. In the related art, a Bigram (two-gram) model is used to count the occurrence probability of two adjacent words in the training corpus, and a T-statistics ranking method yields the likelihood that two given words occur together. Compound words obtained by binding two words that are likely to occur together are then embedded into the word vector space as new semantic features and input into the bag-of-words model.
However, for each new batch of training corpus, the T-statistics of its Bigrams must be counted again to form a Bigram vocabulary before training of the bag-of-words model can start, which causes a large training overhead. Moreover, embedding only compound words obtained by binding two words into the word vector space as new semantic features, and inputting them into the bag-of-words model, limits the effect of the bag-of-words model.
Summary of the invention
The present invention aims to solve, at least to some extent, one of the technical problems in the related art.
To this end, a first object of the present invention is to propose a compound word processing method for semantic mining, to solve the following problems of the prior art: corpus training is costly; improving the effect of the bag-of-words model requires introducing more two-word bound compounds, which affects memory performance; and embedding only compound words obtained by binding two words into the word vector space as new semantic features, and inputting them into the bag-of-words model, limits the effect of the bag-of-words model.
A second object of the present invention is to propose a compound word processing apparatus for semantic mining.
A third object of the present invention is to propose a computer device.
A fourth object of the present invention is to propose a non-transitory computer-readable storage medium.
A fifth object of the present invention is to propose a computer program product.
To achieve the above objects, an embodiment of the first aspect of the present invention proposes a compound word processing method for semantic mining. The method includes the following steps: determining the M word segments of each sentence in a training corpus; selecting N word segments according to the order in which the M word segments appear and generating an N-dimensional compound word, where M is greater than or equal to 2 and N is greater than or equal to 2 and less than or equal to M; performing K hash operations on the character string of the N-dimensional compound word, querying a pre-established random hash dictionary space to obtain the position uniquely corresponding to each hash operation result, and generating the K-dimensional word vector of the N-dimensional compound word from the floating-point numbers at the K positions corresponding to the K hash operation results, where K is an integer greater than 1; and screening out, according to the K-dimensional word vectors of all N-dimensional compound words, the N-dimensional target compound words that satisfy a preset condition, and inputting the N-dimensional target compound words into a bag-of-words model for semantic mining.
The compound word processing method for semantic mining of the embodiment of the present invention determines the M word segments of each sentence in the training corpus, then selects N word segments according to the order in which the M word segments appear to generate an N-dimensional compound word, performs K hash operations on the character string of the N-dimensional compound word, queries the pre-established random hash dictionary space to obtain the position uniquely corresponding to each hash operation result, generates the K-dimensional word vector of the N-dimensional compound word from the floating-point numbers at the K positions corresponding to the K hash operation results, and finally screens out, according to the K-dimensional word vectors of all N-dimensional compound words, the N-dimensional target compound words that satisfy the preset condition and inputs them into the bag-of-words model for semantic mining. The corpus can thus be trained on directly, which reduces the corpus training cost, and the N-dimensional target compound words can be obtained and input into the bag-of-words model for semantic mining, so that more semantic features of larger granularity are introduced into the bag-of-words model without affecting memory performance, further improving the effect of the bag-of-words model.
To achieve the above objects, an embodiment of the second aspect of the present invention proposes a compound word processing apparatus for semantic mining. The apparatus includes: a determining module, configured to determine the M word segments of each sentence in a training corpus; a first generation module, configured to select N word segments according to the order in which the M word segments appear and generate an N-dimensional compound word, where M is greater than or equal to 2 and N is greater than or equal to 2 and less than or equal to M; a first processing module, configured to perform K hash operations on the character string of the N-dimensional compound word, query a pre-established random hash dictionary space to obtain the position uniquely corresponding to each hash operation result, and generate the K-dimensional word vector of the N-dimensional compound word from the floating-point numbers at the K positions corresponding to the K hash operation results, where K is an integer greater than 1; a screening module, configured to screen out, according to the K-dimensional word vectors of all N-dimensional compound words, the N-dimensional target compound words that satisfy a preset condition; and a mining module, configured to input the N-dimensional target compound words into a bag-of-words model for semantic mining.
The compound word processing apparatus for semantic mining of the embodiment of the present invention determines the M word segments of each sentence in the training corpus, then selects N word segments according to the order in which the M word segments appear to generate an N-dimensional compound word, performs K hash operations on the character string of the N-dimensional compound word, queries the pre-established random hash dictionary space to obtain the position uniquely corresponding to each hash operation result, generates the K-dimensional word vector of the N-dimensional compound word from the floating-point numbers at the K positions corresponding to the K hash operation results, and finally screens out, according to the K-dimensional word vectors of all N-dimensional compound words, the N-dimensional target compound words that satisfy the preset condition and inputs them into the bag-of-words model for semantic mining. The corpus can thus be trained on directly, which reduces the corpus training cost, and the N-dimensional target compound words can be obtained and input into the bag-of-words model for semantic mining, so that more semantic features of larger granularity are introduced into the bag-of-words model without affecting memory performance, further improving the effect of the bag-of-words model.
To achieve the above objects, an embodiment of the third aspect of the present invention proposes a computer device, including a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the program, a compound word processing method for semantic mining is implemented. The method includes: determining the M word segments of each sentence in a training corpus; selecting N word segments according to the order in which the M word segments appear and generating an N-dimensional compound word, where M is greater than or equal to 2 and N is greater than or equal to 2 and less than or equal to M; performing K hash operations on the character string of the N-dimensional compound word, querying a pre-established random hash dictionary space to obtain the position uniquely corresponding to each hash operation result, and generating the K-dimensional word vector of the N-dimensional compound word from the floating-point numbers at the K positions corresponding to the K hash operation results, where K is an integer greater than 1; and screening out, according to the K-dimensional word vectors of all N-dimensional compound words, the N-dimensional target compound words that satisfy a preset condition, and inputting the N-dimensional target compound words into a bag-of-words model for semantic mining.
To achieve the above objects, an embodiment of the fourth aspect of the present invention proposes a non-transitory computer-readable storage medium. When the instructions in the storage medium are executed by a processor, a compound word processing method for semantic mining can be performed. The method includes: determining the M word segments of each sentence in a training corpus; selecting N word segments according to the order in which the M word segments appear and generating an N-dimensional compound word, where M is greater than or equal to 2 and N is greater than or equal to 2 and less than or equal to M; performing K hash operations on the character string of the N-dimensional compound word, querying a pre-established random hash dictionary space to obtain the position uniquely corresponding to each hash operation result, and generating the K-dimensional word vector of the N-dimensional compound word from the floating-point numbers at the K positions corresponding to the K hash operation results, where K is an integer greater than 1; and screening out, according to the K-dimensional word vectors of all N-dimensional compound words, the N-dimensional target compound words that satisfy a preset condition, and inputting the N-dimensional target compound words into a bag-of-words model for semantic mining.
To achieve the above objects, an embodiment of the fifth aspect of the present invention proposes a computer program product. When the instructions in the computer program product are executed by a processor, a compound word processing method for semantic mining is performed. The method includes: determining the M word segments of each sentence in a training corpus; selecting N word segments according to the order in which the M word segments appear and generating an N-dimensional compound word, where M is greater than or equal to 2 and N is greater than or equal to 2 and less than or equal to M; performing K hash operations on the character string of the N-dimensional compound word, querying a pre-established random hash dictionary space to obtain the position uniquely corresponding to each hash operation result, and generating the K-dimensional word vector of the N-dimensional compound word from the floating-point numbers at the K positions corresponding to the K hash operation results, where K is an integer greater than 1; and screening out, according to the K-dimensional word vectors of all N-dimensional compound words, the N-dimensional target compound words that satisfy a preset condition, and inputting the N-dimensional target compound words into a bag-of-words model for semantic mining.
Additional aspects and advantages of the present invention will be set forth in part in the following description, and in part will become apparent from the following description or be learned through practice of the present invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments in conjunction with the accompanying drawings, in which:
Fig. 1 is a schematic flowchart of a compound word processing method for semantic mining according to an embodiment of the present invention;
Fig. 2 is an example diagram of a layered feature extraction scheme according to an embodiment of the present invention;
Fig. 3 is an example diagram of a random hash dictionary space according to an embodiment of the present invention;
Fig. 4 is a schematic flowchart of a compound word processing method for semantic mining according to another embodiment of the present invention;
Fig. 5 is an example diagram of a random hash dictionary coexisting with an original word vector dictionary according to an embodiment of the present invention;
Fig. 6 is an example diagram of a custom layer applied in the compound word processing method for semantic mining according to an embodiment of the present invention;
Fig. 7 is an example diagram of linear regression model screening according to an embodiment of the present invention;
Fig. 8 is a schematic structural diagram of a compound word processing apparatus for semantic mining according to another embodiment of the present invention;
Fig. 9 is a schematic structural diagram of a computer device according to an embodiment of the present invention.
Detailed description
Embodiments of the present invention are described in detail below. Examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals denote, throughout, the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary; they are intended to explain the present invention and should not be construed as limiting it.
The compound word processing method, apparatus and device for semantic mining of the embodiments of the present invention are described below with reference to the accompanying drawings.
An embodiment of the present invention provides a compound word processing method for semantic mining that extends the Bigram feature statistics to any N adjacent words, which are merged into Ngram phrases. For these newly generated words there is no need to count parameters such as word frequency; they are introduced directly into training as text features, which reduces the corpus training cost. The N-dimensional target compound words can be obtained and input into the bag-of-words model for semantic mining, so that more semantic features of larger granularity are introduced into the bag-of-words model without affecting memory performance, further improving the effect of the bag-of-words model. The details are as follows:
Fig. 1 is a schematic flowchart of a compound word processing method for semantic mining according to an embodiment of the present invention.
As shown in Fig. 1, the compound word processing method for semantic mining includes:
Step 101: determine the M word segments of each sentence in the training corpus.
Step 102: select N word segments according to the order in which the M word segments appear and generate an N-dimensional compound word, where M is greater than or equal to 2 and N is greater than or equal to 2 and less than or equal to M.
In practical applications, the training corpus contains many sentences, and the M word segments of each sentence must first be determined. For example, the 7 word segments of the sentence "中华人民共和国" ("People's Republic of China") are determined to be "中", "华", "人", "民", "共", "和" and "国".
An N-dimensional compound word can then be generated by selecting N word segments according to the order in which the M word segments appear. Taking the sentence "中华人民共和国" as an example, the 7 word segments appear in the order "中", "华", "人", "民", "共", "和", "国". The 2 word segments "中" and "华" can be selected to generate a 2-dimensional compound word; the 4 word segments "中", "华", "人" and "民" can be selected to generate a 4-dimensional compound word; and the 5 word segments "人", "民", "共", "和" and "国" can be selected to generate a 5-dimensional compound word. In practice, two or more word segments can be selected as needed to generate multi-dimensional compound words.
It can be understood that the i-th word segment and the adjacent (i+1)-th word segment are combined, and the result is a unique compound word representation.
So that those skilled in the art can understand more clearly how an N-dimensional compound word is generated, the process is illustrated below with reference to Fig. 2:
As shown in Fig. 2, for the 6 word segments "A", "B", "C", "D", "E" and "F", N word segments can be selected according to the order in which the M word segments appear to generate N-dimensional compound words. For example, the adjacent word segments "A", "B" and "C" can be combined into "ABC", the adjacent word segments "B", "C" and "D" into "BCD", and so on, giving the three-dimensional compound word representations.
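The selection of adjacent word segments described above can be sketched as a sliding window. This is only an illustrative sketch: the function name `compound_words` and the `sep` parameter are assumptions made here, not names from the patent.

```python
from typing import List


def compound_words(segments: List[str], n: int, sep: str = "-") -> List[str]:
    """Join every run of n adjacent segments, in order of appearance.

    `sep` is a hypothetical separator; "-" gives strings shaped like "1-2-3".
    """
    m = len(segments)
    if not 2 <= n <= m:
        raise ValueError("n must satisfy 2 <= n <= M")
    return [sep.join(segments[i:i + n]) for i in range(m - n + 1)]


segments = ["A", "B", "C", "D", "E", "F"]
three_dimensional = compound_words(segments, 3, sep="")
print(three_dimensional)  # ['ABC', 'BCD', 'CDE', 'DEF']
```

With M word segments and a window of N, the sketch yields M - N + 1 compound words, one per starting position.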
Step 103: perform K hash operations on the character string of the N-dimensional compound word, query the pre-established random hash dictionary space to obtain the position uniquely corresponding to each hash operation result, and generate the K-dimensional word vector of the N-dimensional compound word from the floating-point numbers at the K positions corresponding to the K hash operation results, where K is an integer greater than 1.
The random hash dictionary space is used to store the newly generated semantic segments, and it effectively solves the problem of vocabulary explosion. The implementation of the random hash dictionary space is shown in Fig. 3.
Specifically, the layered feature extraction shown in Fig. 2 yields a unique feature representation of the compound words at each layer, generally a termID character string of the form "1-2-3" as in Fig. 3. A hash operation on this character string locates the unique position in the random hash dictionary corresponding to the hash operation result, and the value at that position is taken as one dimension of the K-dimensional word vector of the N-dimensional compound word. This process is repeated until the whole word vector of the N-dimensional compound word is obtained. The size of the vocabulary in the random hash dictionary space is independent of the number of newly added semantic features and can be configured freely according to online needs.
More specifically, each hash operation on the character string of the N-dimensional compound word yields a hash operation result whose uniquely corresponding position can be found, and the word vector of the N-dimensional compound word can be generated from the floating-point number at that position. Performing K hash operations yields the K positions uniquely corresponding to the K hash operation results, so that the K-dimensional word vector of the N-dimensional compound word is generated from the floating-point numbers at the K positions, where K is an integer greater than 1.
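A minimal sketch of this lookup follows, under stated assumptions: the table size, the use of seeded SHA-256 as the K hash functions, and the names `HASH_TABLE`, `hash_position` and `compound_vector` are all illustrative choices made here. The patent does not name a hash function or a table layout.

```python
import hashlib
import random
from typing import List

TABLE_SIZE = 1 << 16  # hypothetical size, chosen independently of the vocabulary

# Hypothetical stand-in for the pre-established random hash dictionary space:
# a fixed table of floats (in a trained model these would be learned values).
_rng = random.Random(0)
HASH_TABLE = [_rng.uniform(-0.1, 0.1) for _ in range(TABLE_SIZE)]


def hash_position(term: str, seed: int, table_size: int = TABLE_SIZE) -> int:
    """Map a compound-word string to one table position, deterministically."""
    digest = hashlib.sha256(f"{seed}:{term}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % table_size


def compound_vector(term: str, k: int) -> List[float]:
    """K hash operations give K positions; the K floats found there form the vector."""
    return [HASH_TABLE[hash_position(term, seed)] for seed in range(k)]


vec = compound_vector("1-2-3", k=8)
print(len(vec))  # 8
```

Because the table has a fixed size, adding new compound words does not grow it; different compound words may share individual positions, which is the usual trade-off of hashing-based embeddings.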
It should be noted that k consecutive floating-point numbers can be extracted at the position obtained by each hash operation to form the K-dimensional word vector of the N-dimensional compound word. This reduces the number of hash operations and improves computing performance without affecting precision. The hash function must be chosen so that the mapping is both random and reproducible.
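One way to read this variant is sketched below: each hash operation supplies a block of consecutive floats, so fewer hash operations are needed for the same vector length. The block size, the wrap-around at the end of the table, and the names `hash_position` and `block_vector` are assumptions for illustration only.

```python
import hashlib
import random
from typing import List


def hash_position(term: str, seed: int, table_size: int) -> int:
    """Seeded, reproducible hash of the compound-word string to a table position."""
    digest = hashlib.sha256(f"{seed}:{term}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % table_size


def block_vector(term: str, k_dim: int, block: int, table: List[float]) -> List[float]:
    """Build a k_dim vector from ceil(k_dim / block) hashes.

    Each hash yields `block` consecutive floats starting at its position;
    reads wrap around the end of the table (an assumption made here).
    """
    num_hashes = -(-k_dim // block)  # ceiling division
    out: List[float] = []
    for seed in range(num_hashes):
        start = hash_position(term, seed, len(table))
        out.extend(table[(start + j) % len(table)] for j in range(block))
    return out[:k_dim]


rng = random.Random(0)
table = [rng.uniform(-0.1, 0.1) for _ in range(1024)]
vec = block_vector("1-2-3", k_dim=8, block=4, table=table)  # 2 hashes instead of 8
```

A seeded cryptographic digest is used here only because it is deterministic across runs and spreads strings evenly; any hash with those two properties would serve the same purpose.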
Step 104: screen out, according to the K-dimensional word vectors of all N-dimensional compound words, the N-dimensional target compound words that satisfy the preset condition, and input the N-dimensional target compound words into the bag-of-words model for semantic mining.
Specifically, the bag-of-words model may combine the vectors of the different text features, for example by simple summation, project the result into a low-dimensional space, and then perform semantic similarity matching. To further improve the effect of the bag-of-words model, the N-dimensional target compound words that satisfy the preset condition need to be screened out and input into the bag-of-words model for semantic mining.
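The summation, projection and matching described above can be sketched with plain lists. The projection matrix, the toy vectors and the use of cosine similarity as the matching score are assumptions for illustration; the patent does not specify them.

```python
import math
from typing import List, Sequence

Vector = List[float]


def sum_vectors(vectors: Sequence[Vector]) -> Vector:
    """Element-wise sum of equally sized feature vectors."""
    return [sum(column) for column in zip(*vectors)]


def project(vec: Vector, weights: Sequence[Vector]) -> Vector:
    """Linear projection: one output value per row of `weights`."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity, used here as a stand-in matching score."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical 4 -> 2 projection and two toy texts with two features each.
weights = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
query = project(sum_vectors([[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]), weights)
title = project(sum_vectors([[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]), weights)
score = cosine(query, title)
```

Compound word vectors enter this pipeline exactly like single-word vectors: they are extra rows in the list passed to `sum_vectors`.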
As one possible implementation, the K-dimensional word vector of each N-dimensional compound word is input into a preset linear regression model to obtain a weight representing the importance of that N-dimensional compound word; the K-dimensional weighted word vector of each N-dimensional compound word is obtained from its K-dimensional word vector and the corresponding weight; and the N-dimensional target compound words that satisfy the preset condition are screened out according to the K-dimensional weighted word vectors of all N-dimensional compound words.
Here, obtaining the K-dimensional weighted word vector of each N-dimensional compound word from its K-dimensional word vector and the corresponding weight may consist of computing the product of the K-dimensional word vector and the corresponding weight.
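The weighting step can be sketched as a scalar product per compound word. The preset condition is not specified in the text, so the `min_weight` threshold below is purely a hypothetical stand-in, as are the function names and the toy values.

```python
from typing import Dict, List


def weighted_vector(vec: List[float], weight: float) -> List[float]:
    """K-dimensional weighted word vector = weight * K-dimensional word vector."""
    return [weight * x for x in vec]


def select_targets(
    vectors: Dict[str, List[float]],
    weights: Dict[str, float],
    min_weight: float,
) -> Dict[str, List[float]]:
    """Keep compound words whose weight reaches `min_weight`.

    The threshold is a hypothetical preset condition chosen for this sketch.
    """
    return {
        term: weighted_vector(vec, weights[term])
        for term, vec in vectors.items()
        if weights[term] >= min_weight
    }


vectors = {"A-B-C": [0.5, -1.0], "B-C-D": [1.0, 2.0]}
weights = {"A-B-C": 0.8, "B-C-D": 0.25}  # e.g. outputs of the regression model
targets = select_targets(vectors, weights, min_weight=0.5)
```

In this toy run only "A-B-C" survives, and its vector is scaled by its weight of 0.8.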
As another possible implementation, the K-dimensional word vector of each N-dimensional compound word is input into a preset algorithm for processing, which directly yields the N-dimensional target compound words that satisfy the preset condition.
Thus, each word in each sentence is mapped into a low-dimensional word vector space and used as a feature vector, each vector representing one word. Adding representations of language fragments of larger granularity on this basis clearly improves the effect of the bag-of-words model.
Further, the N-dimensional target compound words are input into the bag-of-words model for semantic mining. As an example, the N-dimensional target compound words in the bag-of-words model are used to perform semantic detection on a text, and compound words that do not fit the semantics of the text are screened out according to the detection result.
In summary, the compound word processing method for semantic mining of the embodiment of the present invention determines the M word segments of each sentence in the training corpus, then selects N word segments according to the order in which the M word segments appear to generate an N-dimensional compound word, performs K hash operations on the character string of the N-dimensional compound word, queries the pre-established random hash dictionary space to obtain the position uniquely corresponding to each hash operation result, generates the K-dimensional word vector of the N-dimensional compound word from the floating-point numbers at the K positions corresponding to the K hash operation results, and finally screens out, according to the K-dimensional word vectors of all N-dimensional compound words, the N-dimensional target compound words that satisfy the preset condition and inputs them into the bag-of-words model for semantic mining. The corpus can thus be trained on directly, which reduces the corpus training cost, and the N-dimensional target compound words can be obtained and input into the bag-of-words model for semantic mining, so that more semantic features of larger granularity are introduced into the bag-of-words model without affecting memory performance, further improving the effect of the bag-of-words model.
Based on the above embodiment, it can be understood that the random hash dictionary can coexist with the original word vector dictionary, further expanding the representation of semantic segments while retaining the Bigrams. This is described in detail below with reference to Fig. 4:
Fig. 4 is a schematic flowchart of a compound word processing method for semantic mining according to another embodiment of the present invention. As shown in Fig. 4, the compound word processing method for semantic mining includes:
The model to which the compound word processing method for semantic mining of this embodiment is applied is shown in Fig. 5: the random hash dictionary coexists with the original word vector dictionary, further expanding the representation of semantic segments while retaining the Bigrams.
Step 201: determine the M word segments of each sentence in the training corpus.
It should be noted that the description of step 201 corresponds to that of step 101 above; for step 201, refer to the description of step 101, which is not repeated here.
Step 202: select 2 word segments according to the order in which the M word segments appear and generate a two-dimensional compound word.
Step 203: perform a calculation on the character string of the two-dimensional compound word to obtain a calculation result, query the original word vector dictionary space to obtain the unique position corresponding to the calculation result, and generate the K-dimensional word vector of the two-dimensional compound word from the numbers at that position, where K is an integer greater than 1.
Specifically, a two-dimensional compound word can be generated by selecting 2 word segments according to the order in which the M word segments appear. Taking the sentence "中华人民共和国" ("People's Republic of China") as an example, the 7 word segments appear in the order "中", "华", "人", "民", "共", "和", "国". The 2 word segments "中" and "华" can be selected to generate a two-dimensional compound word; the 2 word segments "人" and "民" can also be selected to generate a two-dimensional compound word; and so can the 2 word segments "共" and "和". In practice, two word segments can be selected as needed to generate a two-dimensional compound word.
Step 204: select N word segments according to the order in which the M word segments appear and generate an N-dimensional compound word, where M is greater than or equal to 2 and N is greater than or equal to 2 and less than or equal to M.
Step 205: perform K hash operations on the character string of the N-dimensional compound word, query the pre-established random hash dictionary space to obtain the position uniquely corresponding to each hash operation result, and generate the K-dimensional word vector of the N-dimensional compound word from the floating-point numbers at the K positions corresponding to the K hash operation results, where K is an integer greater than 1.
It should be noted that the description of steps 204-205 corresponds to that of steps 102-103 above; for steps 204-205, refer to the description of steps 102-103, which is not repeated here.
Step 206: sum the K-dimensional word vector of the two-dimensional compound word with the K-dimensional weighted word vectors of all N-dimensional compound words, and screen out the N-dimensional target compound words that satisfy the preset condition according to the summation result.
Step 207: perform semantic detection on a text using the N-dimensional target compound words in the bag-of-words model.
Step 208: screen out the compound words that do not fit the semantics of the text according to the detection result.
Specifically, as shown in Fig. 6, the word vectors can be screened by a linear regression model (Logistic Regression). The K-dimensional word vectors of all N-dimensional compound words are input into the Logistic Regression model to obtain a score characterizing the importance of each word, and this score is applied as a weight to the original word vector to obtain a feature vector weighted by importance. Finally, these weighted feature vectors are summed with the K-dimensional word vector of the two-dimensional compound word, and the N-dimensional target compound words that meet the preset condition are filtered out according to the summation result.
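The weighting and summation described above can be sketched as follows. This is a minimal illustration, assuming the importance score is the sigmoid output of a logistic regression with coefficients `coef` and intercept `bias`; those parameters and the function names are hypothetical. The preset filtering condition is not specified here, so the sketch stops at the summation result.

```python
import math
from typing import List, Sequence


def importance(vec: Sequence[float], coef: Sequence[float], bias: float) -> float:
    """Logistic-regression score in (0, 1), used as the weight of a word vector."""
    z = sum(c * v for c, v in zip(coef, vec)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def weighted_vector(vec: Sequence[float], weight: float) -> List[float]:
    """K-dimensional weighted word vector: the product of vector and weight."""
    return [weight * v for v in vec]


def combine(bigram_vec: Sequence[float],
            ngram_vecs: Sequence[Sequence[float]],
            coef: Sequence[float],
            bias: float) -> List[float]:
    """Sum the two-dimensional compound word vector with all weighted vectors."""
    total = list(bigram_vec)
    for vec in ngram_vecs:
        scaled = weighted_vector(vec, importance(vec, coef, bias))
        total = [t + s for t, s in zip(total, scaled)]
    return total


# Toy values: zero coefficients and zero bias give every vector weight 0.5.
summed = combine([1.0, 1.0], [[2.0, 4.0]], coef=[0.0, 0.0], bias=0.0)
```

In practice the coefficients would be learned during training; the toy values only make the arithmetic easy to follow.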
Further, semantic detection is performed on text using the N-dimensional target compound words in the bag-of-words model, and the compound words that do not satisfy the text semantics are screened out according to the detection result.
Thus, the newly added model structure can coexist with the original structure, further improving model performance.
In order to implement the above embodiments, the present invention also proposes a compound word processing apparatus for semantic mining. Fig. 7 is a schematic structural diagram of a compound word processing apparatus for semantic mining according to an embodiment of the present invention. As shown in Fig. 7, the compound word processing apparatus for semantic mining includes: a determining module 11, a first generation module 12, a first processing module 13, a screening module 14 and a mining module 15.
The determining module 11 is configured to determine the M segments of each sentence in the training corpus.
The first generation module 12 is configured to select N segments according to the order of appearance of the M segments to generate an N-dimensional compound word, where M is greater than or equal to 2, and N is greater than or equal to 2 and less than or equal to M.
The first processing module 13 is configured to perform K hash operations on the character string of the N-dimensional compound word, query the pre-established random hash dictionary space to obtain the position uniquely corresponding to each hash operation result, and generate the K-dimensional word vector of the N-dimensional compound word from the floating-point numbers at the K positions corresponding to the K hash operation results, where K is an integer greater than 1.
The screening module 14 is configured to filter out the N-dimensional target compound words that meet a preset condition according to the K-dimensional word vectors of all N-dimensional compound words.
The mining module 15 is configured to input the N-dimensional target compound words into the bag-of-words model for semantic mining.
In one embodiment of the present invention, the screening module 14 is specifically configured to: input the K-dimensional word vector of each N-dimensional compound word into a preset linear regression model to obtain a weight representing the importance of each N-dimensional compound word; obtain the K-dimensional weighted word vector of each N-dimensional compound word according to its K-dimensional word vector and the corresponding weight; and filter out the N-dimensional target compound words that meet the preset condition according to the K-dimensional weighted word vectors of all N-dimensional compound words.
Obtaining the K-dimensional weighted word vector of each N-dimensional compound word according to its K-dimensional word vector and the corresponding weight includes: calculating the product of the K-dimensional word vector of each N-dimensional compound word and the corresponding weight to obtain the K-dimensional weighted word vector of that N-dimensional compound word.
In one embodiment of the present invention, the mining module 15 is specifically configured to: perform semantic detection on text using the N-dimensional target compound words in the bag-of-words model; and screen out the compound words that do not satisfy the text semantics according to the detection result.
It should be noted that the foregoing explanation of the embodiments of the compound word processing method for semantic mining also applies to the compound word processing apparatus for semantic mining of this embodiment, and is not repeated here.
In summary, the compound word processing apparatus for semantic mining of the embodiments of the present invention determines the M segments of each sentence in the training corpus, selects N segments according to the order of appearance of the M segments to generate an N-dimensional compound word, performs K hash operations on the character string of the N-dimensional compound word, queries the pre-established random hash dictionary space to obtain the position uniquely corresponding to each hash operation result, generates the K-dimensional word vector of the N-dimensional compound word from the floating-point numbers at the K positions corresponding to the K hash operation results, and finally filters out the N-dimensional target compound words that meet the preset condition according to the K-dimensional word vectors of all N-dimensional compound words and inputs them into the bag-of-words model for semantic mining. Thus, training can be performed directly on the corpus, which reduces the corpus training cost, and the N-dimensional target compound words can be obtained and input into the bag-of-words model for semantic mining, so that more semantic features of larger granularity are introduced into the bag-of-words model without affecting memory performance, further improving the effect of the bag-of-words model.
Fig. 8 is a schematic structural diagram of a compound word processing apparatus for semantic mining according to another embodiment of the present invention. As shown in Fig. 8, on the basis of Fig. 7 the apparatus further includes: a second generation module 16 and a second processing module 17.
The second generation module 16 is configured to select 2 segments according to the order of appearance of the M segments to generate a two-dimensional compound word.
The second processing module 17 is configured to perform a calculation on the character string of the two-dimensional compound word to obtain a calculation result, query the original word vector dictionary space to obtain the unique position corresponding to the calculation result, and generate the K-dimensional word vector of the two-dimensional compound word using the numbers corresponding to that position, where K is an integer greater than 1.
The screening module 14 is further specifically configured to: sum the K-dimensional word vector of the two-dimensional compound word and the K-dimensional weighted word vectors of all N-dimensional compound words, and filter out the N-dimensional target compound words that meet the preset condition according to the summation result.
Thus, the newly added model structure can coexist with the original structure, further improving model performance.
The present invention proposes a computer device. Fig. 9 is a schematic structural diagram of a computer device according to an embodiment of the present invention. As shown in Fig. 9, the computer device includes a memory 21, a processor 22, and a computer program stored on the memory 21 and executable on the processor 22.
When executing the program, the processor 22 implements the compound word processing method for semantic mining provided in the above embodiments.
Further, the computer device also includes:
A communication interface 23, used for communication between the memory 21 and the processor 22.
The memory 21, used for storing the computer program executable on the processor 22.
The memory 21 may include a high-speed RAM memory, and may also include a non-volatile memory, for example at least one magnetic disk memory.
The processor 22, used for implementing the compound word processing method for semantic mining described in the above embodiments when executing the program.
If the memory 21, the processor 22 and the communication interface 23 are implemented independently, the communication interface 23, the memory 21 and the processor 22 can be connected to one another through a bus and communicate with one another. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus can be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is shown in Fig. 9, but this does not mean that there is only one bus or one type of bus.
Optionally, in a specific implementation, if the memory 21, the processor 22 and the communication interface 23 are integrated on a single chip, the memory 21, the processor 22 and the communication interface 23 can communicate with one another through an internal interface.
The processor 22 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention.
In order to implement the above embodiments, the present invention also proposes a non-transitory computer-readable storage medium. When the instructions in the storage medium are executed by a processor, a compound word processing method for semantic mining can be performed, the method including: determining the M segments of each sentence in the training corpus; selecting N segments according to the order of appearance of the M segments to generate an N-dimensional compound word, where M is greater than or equal to 2, and N is greater than or equal to 2 and less than or equal to M; performing K hash operations on the character string of the N-dimensional compound word, querying the pre-established random hash dictionary space to obtain the position uniquely corresponding to each hash operation result, and generating the K-dimensional word vector of the N-dimensional compound word from the floating-point numbers at the K positions corresponding to the K hash operation results, where K is an integer greater than 1; and filtering out the N-dimensional target compound words that meet a preset condition according to the K-dimensional word vectors of all N-dimensional compound words, and inputting the N-dimensional target compound words into the bag-of-words model for semantic mining.
In order to implement the above embodiments, the present invention also proposes a computer program product. When the instructions in the computer program product are executed by a processor, a compound word processing method for semantic mining is performed, the method including: determining the M segments of each sentence in the training corpus; selecting N segments according to the order of appearance of the M segments to generate an N-dimensional compound word, where M is greater than or equal to 2, and N is greater than or equal to 2 and less than or equal to M; performing K hash operations on the character string of the N-dimensional compound word, querying the pre-established random hash dictionary space to obtain the position uniquely corresponding to each hash operation result, and generating the K-dimensional word vector of the N-dimensional compound word from the floating-point numbers at the K positions corresponding to the K hash operation results, where K is an integer greater than 1; and filtering out the N-dimensional target compound words that meet a preset condition according to the K-dimensional word vectors of all N-dimensional compound words, and inputting the N-dimensional target compound words into the bag-of-words model for semantic mining.
In the description of this specification, a description referring to the terms "one embodiment", "some embodiments", "example", "specific example" or "some examples" means that a specific feature, structure, material or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic uses of the above terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, provided they do not conflict, those skilled in the art may combine the different embodiments or examples described in this specification and the features of the different embodiments or examples.
In addition, the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. Thus, a feature defined with "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "multiple" means at least two, for example two or three, unless specifically defined otherwise.
Any process or method description in a flowchart or otherwise described herein may be understood as representing a module, segment or portion of code that includes one or more executable instructions for implementing the steps of a custom logic function or process, and the scope of the preferred embodiments of the present invention includes other implementations in which functions may be performed out of the order shown or discussed, including in a substantially simultaneous manner or in the reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present invention belong.
The logic and/or steps represented in a flowchart or otherwise described herein, for example an ordered list of executable instructions that may be considered to implement logic functions, may be embodied in any computer-readable medium for use by, or in combination with, an instruction execution system, apparatus or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from the instruction execution system, apparatus or device and execute them). For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate or transmit a program for use by, or in combination with, an instruction execution system, apparatus or device. More specific examples (a non-exhaustive list) of computer-readable media include: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). In addition, the computer-readable medium may even be paper or another suitable medium on which the program is printed, because the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting or, if necessary, processing it in another suitable manner, and then stored in a computer memory.
It should be understood that each part of the present invention may be implemented in hardware, software, firmware or a combination thereof. In the above embodiments, multiple steps or methods may be implemented in software or firmware that is stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one of the following techniques well known in the art, or a combination thereof, may be used: a discrete logic circuit having logic gate circuits for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gate circuits, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.
Those of ordinary skill in the art can understand that all or part of the steps carried by the methods of the above embodiments can be completed by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, performs one of the steps of the method embodiments or a combination thereof.
In addition, the functional units in the embodiments of the present invention may be integrated in one processing module, each unit may exist physically on its own, or two or more units may be integrated in one module. The integrated module may be implemented in the form of hardware or in the form of a software function module. If the integrated module is implemented in the form of a software function module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like. Although embodiments of the present invention have been shown and described above, it should be understood that the above embodiments are exemplary and are not to be construed as limiting the present invention; those of ordinary skill in the art may change, modify, replace and vary the above embodiments within the scope of the present invention.
Claims (10)
1. A compound word processing method for semantic mining, characterized by comprising the following steps:
determining M segments of each sentence in a training corpus;
selecting N segments according to the order of appearance of the M segments to generate an N-dimensional compound word, wherein M is greater than or equal to 2, and N is greater than or equal to 2 and less than or equal to M;
performing K hash operations on the character string of the N-dimensional compound word, querying a pre-established random hash dictionary space to obtain the position uniquely corresponding to each hash operation result, and generating the K-dimensional word vector of the N-dimensional compound word from the floating-point numbers at the K positions corresponding to the K hash operation results, wherein K is an integer greater than 1;
filtering out N-dimensional target compound words that meet a preset condition according to the K-dimensional word vectors of all N-dimensional compound words, and inputting the N-dimensional target compound words into a bag-of-words model for semantic mining.
2. The method according to claim 1, characterized in that filtering out the N-dimensional target compound words that meet the preset condition according to the K-dimensional word vectors of all N-dimensional compound words comprises:
inputting the K-dimensional word vector of each N-dimensional compound word into a preset linear regression model to obtain a weight representing the importance of each N-dimensional compound word;
obtaining the K-dimensional weighted word vector of each N-dimensional compound word according to the K-dimensional word vector of each N-dimensional compound word and the corresponding weight;
filtering out the N-dimensional target compound words that meet the preset condition according to the K-dimensional weighted word vectors of all N-dimensional compound words.
3. The method according to claim 2, characterized in that obtaining the K-dimensional weighted word vector of each N-dimensional compound word according to the K-dimensional word vector of each N-dimensional compound word and the corresponding weight comprises:
calculating the product of the K-dimensional word vector of each N-dimensional compound word and the corresponding weight to obtain the K-dimensional weighted word vector of each N-dimensional compound word.
4. The method according to claim 2, characterized in that, after determining the M segments of each sentence in the training corpus, the method further comprises:
selecting 2 segments according to the order of appearance of the M segments to generate a two-dimensional compound word;
performing a calculation on the character string of the two-dimensional compound word to obtain a calculation result, querying an original word vector dictionary space to obtain the unique position corresponding to the calculation result, and generating the K-dimensional word vector of the two-dimensional compound word using the numbers corresponding to the position, wherein K is an integer greater than 1;
wherein filtering out the N-dimensional target compound words that meet the preset condition according to the K-dimensional weighted word vectors of all N-dimensional compound words comprises:
summing the K-dimensional word vector of the two-dimensional compound word and the K-dimensional weighted word vectors of all N-dimensional compound words, and filtering out the N-dimensional target compound words that meet the preset condition according to the summation result.
5. The method according to any one of claims 1-4, characterized in that inputting the N-dimensional target compound words into the bag-of-words model for semantic mining comprises:
performing semantic detection on text using the N-dimensional target compound words in the bag-of-words model;
screening out the compound words that do not satisfy the text semantics according to the detection result.
6. A compound word processing apparatus for semantic mining, characterized by comprising:
a determining module, configured to determine M segments of each sentence in a training corpus;
a first generation module, configured to select N segments according to the order of appearance of the M segments to generate an N-dimensional compound word, wherein M is greater than or equal to 2, and N is greater than or equal to 2 and less than or equal to M;
a first processing module, configured to perform K hash operations on the character string of the N-dimensional compound word, query a pre-established random hash dictionary space to obtain the position uniquely corresponding to each hash operation result, and generate the K-dimensional word vector of the N-dimensional compound word from the floating-point numbers at the K positions corresponding to the K hash operation results, wherein K is an integer greater than 1;
a screening module, configured to filter out N-dimensional target compound words that meet a preset condition according to the K-dimensional word vectors of all N-dimensional compound words;
a mining module, configured to input the N-dimensional target compound words into a bag-of-words model for semantic mining.
7. The apparatus according to claim 6, characterized by further comprising:
a second generation module, configured to select 2 segments according to the order of appearance of the M segments to generate a two-dimensional compound word;
a second processing module, configured to perform a calculation on the character string of the two-dimensional compound word to obtain a calculation result, query an original word vector dictionary space to obtain the unique position corresponding to the calculation result, and generate the K-dimensional word vector of the two-dimensional compound word using the numbers corresponding to the position, wherein K is an integer greater than 1;
wherein the screening module is specifically configured to: sum the K-dimensional word vector of the two-dimensional compound word and the K-dimensional weighted word vectors of all N-dimensional compound words, and filter out the N-dimensional target compound words that meet the preset condition according to the summation result.
8. A computer device, characterized by comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein when the processor executes the program, the compound word processing method for semantic mining according to any one of claims 1-5 is implemented.
9. A non-transitory computer-readable storage medium on which a computer program is stored, characterized in that when the program is executed by a processor, the compound word processing method for semantic mining according to any one of claims 1-5 is implemented.
10. A computer program product, characterized in that when the instructions in the computer program product are executed by a processor, the compound word processing method for semantic mining according to any one of claims 1-5 is performed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711163429.2A CN107894979B (en) | 2017-11-21 | 2017-11-21 | Compound word processing method, device and equipment for semantic mining |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711163429.2A CN107894979B (en) | 2017-11-21 | 2017-11-21 | Compound word processing method, device and equipment for semantic mining |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107894979A true CN107894979A (en) | 2018-04-10 |
CN107894979B CN107894979B (en) | 2021-09-17 |
Family
ID=61805758
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711163429.2A Active CN107894979B (en) | 2017-11-21 | 2017-11-21 | Compound word processing method, device and equipment for semantic mining |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107894979B (en) |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2007219778A (en) * | 2006-02-16 | 2007-08-30 | Murata Mach Ltd | Document processor |
CN101093504A (en) * | 2006-03-24 | 2007-12-26 | 国际商业机器公司 | System for extracting new compound word |
CN102200984A (en) * | 2010-03-24 | 2011-09-28 | 深圳市腾讯计算机系统有限公司 | Search method based on compound words and search engine server |
US8046212B1 (en) * | 2003-10-31 | 2011-10-25 | Access Innovations | Identification of chemical names in text-containing documents |
CN102859515A (en) * | 2010-02-12 | 2013-01-02 | 谷歌公司 | Compound splitting |
CN103646080A (en) * | 2013-12-12 | 2014-03-19 | 北京京东尚科信息技术有限公司 | Microblog duplication-eliminating method and system based on reverse-order index |
CN104657350A (en) * | 2015-03-04 | 2015-05-27 | 中国科学院自动化研究所 | Hash learning method for short text integrated with implicit semantic features |
CN105843960A (en) * | 2016-04-18 | 2016-08-10 | 上海泥娃通信科技有限公司 | Semantic tree based indexing method and system |
CN106687952A (en) * | 2014-09-26 | 2017-05-17 | 甲骨文国际公司 | Techniques for similarity analysis and data enrichment using knowledge sources |
CN107193802A (en) * | 2017-05-25 | 2017-09-22 | 上海耐相智能科技有限公司 | A kind of smart field concept auto acquisition system |
Non-Patent Citations (3)
Title |
---|
DAN SVENSTRUP等: "Hash Embeddings for Efficient Word Representations", 《HTTPS://ARXIV.ORG/ABS/1709.03933?CONTEXT=CS.CL》 * |
LEI SHI等: "Functional Hashing for Compressing Neural Networks", 《HTTPS://ARXIV.ORG/PDF/1605.06560.PDF》 * |
欧阳柳波等: "基于位置标签与词性结合的组合词抽取方法", 《计算机应用研究》 * |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110569498A (en) * | 2018-12-26 | 2019-12-13 | 东软集团股份有限公司 | Compound word recognition method and related device |
CN110569498B (en) * | 2018-12-26 | 2022-12-09 | 东软集团股份有限公司 | Compound word recognition method and related device |
CN110059183A (en) * | 2019-03-22 | 2019-07-26 | 重庆邮电大学 | A kind of automobile industry User Perspective sensibility classification method based on big data |
CN110059183B (en) * | 2019-03-22 | 2022-08-23 | 重庆邮电大学 | Automobile industry user viewpoint emotion classification method based on big data |
CN110457692A (en) * | 2019-07-26 | 2019-11-15 | 清华大学 | Compound word indicates learning method and device |
CN114548115A (en) * | 2022-02-23 | 2022-05-27 | 北京三快在线科技有限公司 | Method and device for explaining compound nouns and electronic equipment |
CN114548115B (en) * | 2022-02-23 | 2023-01-06 | 北京三快在线科技有限公司 | Method and device for explaining compound nouns and electronic equipment |
Also Published As
Publication number | Publication date |
---|---|
CN107894979B (en) | 2021-09-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110232183B (en) | Keyword extraction model training method, keyword extraction device and storage medium | |
CN107894979A (en) | The compound process method, apparatus and its equipment excavated for semanteme | |
CN111063410B (en) | Method and device for generating medical image text report | |
CN110705294A (en) | Named entity recognition model training method, named entity recognition method and device | |
CN110287480A (en) | Named entity recognition method, device, storage medium and terminal device | |
CN108334499A (en) | Text label tagging device, method and computing device | |
CN110083700A (en) | Enterprise public opinion sentiment classification method and system based on convolutional neural networks | |
CN108763445A (en) | Construction method, device, computer equipment and storage medium for a patent knowledge base | |
CN108280064A (en) | Joint processing method for word segmentation, part-of-speech tagging, entity recognition and syntactic analysis | |
CN112084331A (en) | Text processing method, text processing device, model training method, model training device, computer equipment and storage medium | |
CN109492666A (en) | Image recognition model training method, device and storage medium | |
CN106155686A (en) | Interface creating method, device and system | |
CN106202010A (en) | Method and apparatus for building a legal text syntax tree based on a deep neural network | |
CN108763555A (en) | Representation data acquisition methods and device based on demand word | |
CN109344404A (en) | Context-aware dual-attention natural language inference method | |
CN110442840A (en) | Sequence labelling network update method, electronic health record processing method and relevant apparatus | |
CN110232123A (en) | Text sentiment analysis method and device, computing equipment and readable medium | |
CN107977363A (en) | Title generation method, device and electronic equipment | |
CN110555084A (en) | Distant supervision relation classification method based on PCNN and multi-layer attention | |
CN107122492A (en) | Lyric generation method and device based on picture content | |
CN113220876B (en) | Multi-label classification method and system for English text | |
CN109766553A (en) | Chinese word segmentation method based on a capsule model combined with multiple regularizations | |
CN110399488A (en) | File classification method and device | |
CN106951413A (en) | Segmenting method and device based on artificial intelligence | |
CN106844340A (en) | News in brief generation and display methods, apparatus and system based on artificial intelligence |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||