CN107894979A - Compound word processing method, apparatus and device for semantic mining - Google Patents

Compound word processing method, apparatus and device for semantic mining

Info

Publication number
CN107894979A
Authority
CN
China
Prior art keywords
dimensional
compound word
words
word
compound
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201711163429.2A
Other languages
Chinese (zh)
Other versions
CN107894979B (en
Inventor
陈徐屹
冯仕堃
朱志凡
何径舟
朱丹翔
曹宇慧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201711163429.2A
Publication of CN107894979A
Application granted
Publication of CN107894979B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention proposes a compound word processing method, apparatus and device for semantic mining. The method includes: determining M word segments of each sentence in a training corpus; selecting N word segments according to the order of appearance of the M word segments to generate an N-dimensional compound word, where M is greater than or equal to 2, and N is greater than or equal to 2 and less than or equal to M; performing K hash operations on the character string of the N-dimensional compound word, querying a pre-established random hash dictionary space to obtain the position uniquely corresponding to each hash operation result, and generating a K-dimensional word vector of the N-dimensional compound word from the floating-point numbers at the K positions corresponding to the K hash operation results, where K is an integer greater than 1; and screening out, according to the K-dimensional word vectors of all N-dimensional compound words, N-dimensional target compound words that satisfy a preset condition, and inputting the N-dimensional target compound words into a bag-of-words model for semantic mining. In this way, more semantic features of larger granularity are introduced into the bag-of-words model, further improving its effect.

Description

Compound word processing method, apparatus and device for semantic mining
Technical field
The present invention relates to the technical field of information processing, and in particular to a compound word processing method, apparatus and device for semantic mining.
Background
Artificial intelligence (AI) is a new technical science that studies and develops theories, methods, technologies and application systems for simulating, extending and expanding human intelligence. Artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can respond in a manner similar to human intelligence. Research in this field includes robotics, speech recognition, image recognition, natural language processing and expert systems.
At present, the bag-of-words (Bag of Words) model is widely used in text semantic relevance matching tasks. In the related art, Bigram (two-gram) statistics are used to count the probability of two adjacent words occurring in the training corpus, and a T-statistics ranking method is used to obtain the likelihood that two given words occur together. The compound word obtained by binding two words that are likely to occur together is then embedded into the word vector space as a new semantic feature and input into the bag-of-words model.
However, for each new batch of training corpus, the Bigram T-statistics must be recounted to form a Bigram vocabulary before training of the bag-of-words model can start, which results in a large training overhead. Moreover, only compound words obtained by binding two words are embedded into the word vector space as new semantic features and input into the bag-of-words model, which limits the effect of the bag-of-words model.
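To make the cost of the Bigram baseline concrete, the following is a minimal sketch of ranking adjacent word pairs by a t-statistic. It assumes that the ranking method mentioned above refers to the standard collocation t-score, (observed - expected) / sqrt(observed / n); the function name `bigram_t_scores` and the toy corpus are hypothetical and are not taken from the patent.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

def bigram_t_scores(tokens: List[str]) -> Dict[Tuple[str, str], float]:
    """Score every adjacent word pair with a collocation t-statistic (assumed form)."""
    n = len(tokens)
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    scores = {}
    for (w1, w2), count in bigrams.items():
        observed = count / n                                # relative frequency of the pair
        expected = (unigrams[w1] / n) * (unigrams[w2] / n)  # frequency if w1, w2 were independent
        scores[(w1, w2)] = (observed - expected) / math.sqrt(observed / n)
    return scores

# Toy corpus: the pair ("a", "b") occurs twice.
scores = bigram_t_scores(["a", "b", "a", "b", "c"])
ranked = sorted(scores, key=scores.get, reverse=True)
```

Because the counts depend on the whole corpus, a table like this has to be rebuilt whenever a new batch of training data arrives, which is the overhead the background section describes.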
Summary of the invention
The present invention aims to solve, at least to some extent, one of the technical problems in the related art.
To this end, a first object of the present invention is to propose a compound word processing method for semantic mining, which addresses the following problems in the prior art: the cost of corpus training is high; improving the effect of the bag-of-words model requires introducing more two-word bound compounds, which affects memory performance; and only compound words obtained by binding two words are embedded into the word vector space as new semantic features and input into the bag-of-words model, which limits its effect.
A second object of the present invention is to propose a compound word processing apparatus for semantic mining.
A third object of the present invention is to propose a computer device.
A fourth object of the present invention is to propose a non-transitory computer-readable storage medium.
A fifth object of the present invention is to propose a computer program product.
To achieve the above objects, an embodiment of the first aspect of the present invention proposes a compound word processing method for semantic mining, the method including the following steps: determining M word segments of each sentence in a training corpus; selecting N word segments according to the order of appearance of the M word segments to generate an N-dimensional compound word, where M is greater than or equal to 2, and N is greater than or equal to 2 and less than or equal to M; performing K hash operations on the character string of the N-dimensional compound word, querying a pre-established random hash dictionary space to obtain the position uniquely corresponding to each hash operation result, and generating a K-dimensional word vector of the N-dimensional compound word from the floating-point numbers at the K positions corresponding to the K hash operation results, where K is an integer greater than 1; and screening out, according to the K-dimensional word vectors of all N-dimensional compound words, N-dimensional target compound words that satisfy a preset condition, and inputting the N-dimensional target compound words into a bag-of-words model for semantic mining.
The compound word processing method for semantic mining of the embodiment of the present invention determines M word segments of each sentence in a training corpus, then selects N word segments according to the order of appearance of the M word segments to generate an N-dimensional compound word, performs K hash operations on the character string of the N-dimensional compound word, queries a pre-established random hash dictionary space to obtain the position uniquely corresponding to each hash operation result, generates a K-dimensional word vector of the N-dimensional compound word from the floating-point numbers at the K positions corresponding to the K hash operation results, and finally screens out, according to the K-dimensional word vectors of all N-dimensional compound words, N-dimensional target compound words that satisfy a preset condition and inputs them into a bag-of-words model for semantic mining. In this way, the corpus can be trained on directly, which reduces the cost of corpus training, and N-dimensional target compound words can be obtained and input into the bag-of-words model for semantic mining, so that more semantic features of larger granularity are introduced into the bag-of-words model without affecting memory performance, further improving the effect of the bag-of-words model.
To achieve the above objects, an embodiment of the second aspect of the present invention proposes a compound word processing apparatus for semantic mining, the apparatus including: a determining module, configured to determine M word segments of each sentence in a training corpus; a first generation module, configured to select N word segments according to the order of appearance of the M word segments to generate an N-dimensional compound word, where M is greater than or equal to 2, and N is greater than or equal to 2 and less than or equal to M; a first processing module, configured to perform K hash operations on the character string of the N-dimensional compound word, query a pre-established random hash dictionary space to obtain the position uniquely corresponding to each hash operation result, and generate a K-dimensional word vector of the N-dimensional compound word from the floating-point numbers at the K positions corresponding to the K hash operation results, where K is an integer greater than 1; a screening module, configured to screen out, according to the K-dimensional word vectors of all N-dimensional compound words, N-dimensional target compound words that satisfy a preset condition; and a mining module, configured to input the N-dimensional target compound words into a bag-of-words model for semantic mining.
The compound word processing apparatus for semantic mining of the embodiment of the present invention determines M word segments of each sentence in a training corpus, then selects N word segments according to the order of appearance of the M word segments to generate an N-dimensional compound word, performs K hash operations on the character string of the N-dimensional compound word, queries a pre-established random hash dictionary space to obtain the position uniquely corresponding to each hash operation result, generates a K-dimensional word vector of the N-dimensional compound word from the floating-point numbers at the K positions corresponding to the K hash operation results, and finally screens out, according to the K-dimensional word vectors of all N-dimensional compound words, N-dimensional target compound words that satisfy a preset condition and inputs them into a bag-of-words model for semantic mining. In this way, the corpus can be trained on directly, which reduces the cost of corpus training, and N-dimensional target compound words can be obtained and input into the bag-of-words model for semantic mining, so that more semantic features of larger granularity are introduced into the bag-of-words model without affecting memory performance, further improving the effect of the bag-of-words model.
To achieve the above objects, an embodiment of the third aspect of the present invention proposes a computer device, including a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the program, a compound word processing method for semantic mining is implemented, the method including: determining M word segments of each sentence in a training corpus; selecting N word segments according to the order of appearance of the M word segments to generate an N-dimensional compound word, where M is greater than or equal to 2, and N is greater than or equal to 2 and less than or equal to M; performing K hash operations on the character string of the N-dimensional compound word, querying a pre-established random hash dictionary space to obtain the position uniquely corresponding to each hash operation result, and generating a K-dimensional word vector of the N-dimensional compound word from the floating-point numbers at the K positions corresponding to the K hash operation results, where K is an integer greater than 1; and screening out, according to the K-dimensional word vectors of all N-dimensional compound words, N-dimensional target compound words that satisfy a preset condition, and inputting the N-dimensional target compound words into a bag-of-words model for semantic mining.
To achieve the above objects, an embodiment of the fourth aspect of the present invention proposes a non-transitory computer-readable storage medium. When instructions in the storage medium are executed by a processor, a compound word processing method for semantic mining can be performed, the method including: determining M word segments of each sentence in a training corpus; selecting N word segments according to the order of appearance of the M word segments to generate an N-dimensional compound word, where M is greater than or equal to 2, and N is greater than or equal to 2 and less than or equal to M; performing K hash operations on the character string of the N-dimensional compound word, querying a pre-established random hash dictionary space to obtain the position uniquely corresponding to each hash operation result, and generating a K-dimensional word vector of the N-dimensional compound word from the floating-point numbers at the K positions corresponding to the K hash operation results, where K is an integer greater than 1; and screening out, according to the K-dimensional word vectors of all N-dimensional compound words, N-dimensional target compound words that satisfy a preset condition, and inputting the N-dimensional target compound words into a bag-of-words model for semantic mining.
To achieve the above objects, an embodiment of the fifth aspect of the present invention proposes a computer program product. When instructions in the computer program product are executed by a processor, a compound word processing method for semantic mining is performed, the method including: determining M word segments of each sentence in a training corpus; selecting N word segments according to the order of appearance of the M word segments to generate an N-dimensional compound word, where M is greater than or equal to 2, and N is greater than or equal to 2 and less than or equal to M; performing K hash operations on the character string of the N-dimensional compound word, querying a pre-established random hash dictionary space to obtain the position uniquely corresponding to each hash operation result, and generating a K-dimensional word vector of the N-dimensional compound word from the floating-point numbers at the K positions corresponding to the K hash operation results, where K is an integer greater than 1; and screening out, according to the K-dimensional word vectors of all N-dimensional compound words, N-dimensional target compound words that satisfy a preset condition, and inputting the N-dimensional target compound words into a bag-of-words model for semantic mining.
Additional aspects and advantages of the present invention will be set forth in part in the following description, and in part will become apparent from the following description or be learned through practice of the present invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a schematic flowchart of a compound word processing method for semantic mining according to an embodiment of the present invention;
Fig. 2 is an example diagram of a layered feature extraction manner according to an embodiment of the present invention;
Fig. 3 is an example diagram of a random hash dictionary space according to an embodiment of the present invention;
Fig. 4 is a schematic flowchart of a compound word processing method for semantic mining according to another embodiment of the present invention;
Fig. 5 is an example diagram in which the random hash dictionary coexists with the original word vector dictionary according to an embodiment of the present invention;
Fig. 6 is an example diagram of applying a custom layer in the compound word processing method for semantic mining according to an embodiment of the present invention;
Fig. 7 is an example diagram of linear regression model screening according to an embodiment of the present invention;
Fig. 8 is a schematic structural diagram of a compound word processing apparatus for semantic mining according to another embodiment of the present invention;
Fig. 9 is a schematic structural diagram of a computer device according to an embodiment of the present invention.
Detailed description
Embodiments of the present invention are described in detail below. Examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary and are intended to explain the present invention; they should not be construed as limiting it.
The compound word processing method, apparatus and device for semantic mining according to embodiments of the present invention are described below with reference to the accompanying drawings.
Embodiments of the present invention provide a compound word processing method for semantic mining, which extends the Bigram feature statistics to any N adjacent words and merges them into Ngram phrases. These newly generated words do not require parameters such as word frequency to be counted in advance; they are introduced directly into training as text features, which reduces the cost of corpus training. N-dimensional target compound words can also be obtained and input into the bag-of-words model for semantic mining, so that more semantic features of larger granularity are introduced into the bag-of-words model without affecting memory performance, further improving the effect of the bag-of-words model. The details are as follows:
Fig. 1 is a schematic flowchart of a compound word processing method for semantic mining according to an embodiment of the present invention. As shown in Fig. 1, the compound word processing method for semantic mining includes:
Step 101: determine M word segments of each sentence in the training corpus.
Step 102: select N word segments according to the order of appearance of the M word segments to generate an N-dimensional compound word, where M is greater than or equal to 2, and N is greater than or equal to 2 and less than or equal to M.
In practical applications, a training corpus contains many sentences, and the M word segments of each sentence must be determined first. For example, the 7 word segments of the sentence "中华人民共和国" (People's Republic of China) are determined to be "中", "华", "人", "民", "共", "和" and "国".
An N-dimensional compound word is then generated by selecting N word segments according to the order of appearance of the M word segments. Taking the sentence "中华人民共和国" as an example, its 7 word segments appear in the order "中", "华", "人", "民", "共", "和", "国". The 2 segments "中" and "华" may be selected to generate a 2-dimensional compound word; the 4 segments "中", "华", "人" and "民" may be selected to generate a 4-dimensional compound word; or the 5 segments "人", "民", "共", "和" and "国" may be selected to generate a 5-dimensional compound word. In practice, two or more word segments may be selected as needed to generate multi-dimensional compound words.
It can be understood that combining the i-th and (i+1)-th adjacent words yields a unique compound word representation.
So that those skilled in the art can more clearly understand how N-dimensional compound words are generated, the process is illustrated below with reference to Fig. 2:
As shown in Fig. 2, for the 6 word segments "A", "B", "C", "D", "E" and "F", N word segments can be selected according to the order of appearance of the M word segments to generate N-dimensional compound words. For example, the adjacent words "A", "B" and "C" can be combined into "ABC", the adjacent words "B", "C" and "D" can be combined into "BCD", and so on, yielding three-dimensional compound word representations.
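The generation of compound words from adjacent word segments can be sketched as a sliding window over the segment list. This is an illustrative sketch only: the function name `compound_words` and the `sep` parameter are hypothetical, and it assumes that the N selected segments are always adjacent, as in the "ABC" and "BCD" example.

```python
from typing import List

def compound_words(segments: List[str], n: int, sep: str = "") -> List[str]:
    """Join every run of n adjacent segments, keeping their order of appearance."""
    if n < 2 or n > len(segments):
        raise ValueError("n must satisfy 2 <= n <= len(segments)")
    return [sep.join(segments[i:i + n]) for i in range(len(segments) - n + 1)]

segments = ["A", "B", "C", "D", "E", "F"]   # M = 6 word segments
trigrams = compound_words(segments, 3)      # three-dimensional compound words
bigrams = compound_words(segments, 2)       # two-dimensional compound words
```

With a separator such as `sep="-"` and numeric segment identifiers, the same function would produce strings of the "1-2-3" form that the description later hashes.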
Step 103: perform K hash operations on the character string of the N-dimensional compound word, query the pre-established random hash dictionary space to obtain the position uniquely corresponding to each hash operation result, and generate the K-dimensional word vector of the N-dimensional compound word from the floating-point numbers at the K positions corresponding to the K hash operation results, where K is an integer greater than 1.
The random hash dictionary space is used to store the newly generated semantic segments, and it effectively solves the problem of vocabulary explosion. An implementation of the random hash dictionary space is shown in Fig. 3.
Specifically, a unique feature representation of the compound words in each layer is obtained through the layered feature extraction shown in Fig. 2; in general it is a termID character string of the form "1-2-3" in Fig. 3. A hash operation is performed on this character string to find the unique position in the random hash dictionary corresponding to each hash operation result, and the value at that position is taken as one component of the K-dimensional word vector of the N-dimensional compound word. This process is repeated until the whole word vector of the N-dimensional compound word is obtained. The size of the vocabulary in the random hash dictionary space is independent of the number of newly added semantic features and can be configured arbitrarily according to online requirements.
More specifically, each hash operation performed on the character string of the N-dimensional compound word yields a position uniquely corresponding to the hash operation result, and the word vector of the N-dimensional compound word can be generated from the floating-point number at that position. Performing K hash operations yields K positions uniquely corresponding to the K hash operation results, so that the K-dimensional word vector of the N-dimensional compound word is generated from the floating-point numbers at the K positions, where K is an integer greater than 1.
It should be noted that k consecutive floating-point numbers may instead be extracted at the position obtained by a hash operation and used as the K-dimensional word vector of the N-dimensional compound word, which reduces the number of hash operations and improves computational performance without affecting accuracy. The hash function must be chosen so that the mapping is both random and reproducible.
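The lookup described in step 103 can be sketched as follows. Everything concrete here is an assumption made for illustration: the table size, the use of seeded MD5 digests as the K hash functions, the random initialisation of the table, and the names `position`, `word_vector` and `word_vector_contiguous`. The patent only requires that the mapping be random and reproducible.

```python
import hashlib
import random
from typing import List

TABLE_SIZE = 1 << 16   # hypothetical size; independent of how many compound words exist
K = 4                  # dimension of the word vector

# Random hash dictionary space: a fixed-size table of floating-point numbers.
_rng = random.Random(0)
hash_table = [_rng.uniform(-0.1, 0.1) for _ in range(TABLE_SIZE)]

def position(term_id: str, seed: int, size: int = TABLE_SIZE) -> int:
    """Map (term_id, seed) to one table position; deterministic across runs."""
    digest = hashlib.md5(f"{seed}:{term_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % size

def word_vector(term_id: str, k: int = K) -> List[float]:
    """K hash operations, one table lookup per vector component."""
    return [hash_table[position(term_id, seed)] for seed in range(k)]

def word_vector_contiguous(term_id: str, k: int = K) -> List[float]:
    """Variant: one hash operation, then k consecutive floats from that position."""
    start = position(term_id, 0, TABLE_SIZE - k + 1)
    return hash_table[start:start + k]

vec = word_vector("1-2-3")
```

Because the table has a fixed size, adding more compound words never grows memory use; different compound words may share some table positions, which is the price of the bounded vocabulary.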
Step 104: screen out, according to the K-dimensional word vectors of all N-dimensional compound words, the N-dimensional target compound words that satisfy a preset condition, and input the N-dimensional target compound words into the bag-of-words model for semantic mining.
Specifically, the bag-of-words model can project the vectors of different text features into a low-dimensional space after simple operations such as summation, and then perform semantic similarity matching. To further improve the effect of the bag-of-words model, the N-dimensional target compound words that satisfy the preset condition need to be screened out, so that they can be input into the bag-of-words model for semantic mining.
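As a rough illustration of the summation and matching performed by a bag-of-words model, the sketch below sums the feature vectors of a text and compares two texts by cosine similarity. The projection to a low-dimensional space is omitted, and the use of cosine similarity as the matching score is an assumption; the names `bow_vector` and `cosine` are hypothetical.

```python
import math
from typing import List, Sequence

def bow_vector(feature_vectors: Sequence[Sequence[float]]) -> List[float]:
    """Sum the vectors of all text features, ignoring their order."""
    return [sum(components) for components in zip(*feature_vectors)]

def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

query = bow_vector([[1.0, 0.0], [0.0, 1.0]])   # two features of one text
title = bow_vector([[2.0, 2.0]])               # one feature of another text
similarity = cosine(query, title)
```

Compound word vectors enter this model simply as additional rows in the list passed to `bow_vector`.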
In one possible implementation, the K-dimensional word vector of each N-dimensional compound word is input into a preset linear regression model to obtain a weight representing the importance of that compound word; a K-dimensional weighted word vector of each N-dimensional compound word is obtained from its K-dimensional word vector and the corresponding weight; and the N-dimensional target compound words that satisfy the preset condition are screened out according to the K-dimensional weighted word vectors of all N-dimensional compound words.
Here, obtaining the K-dimensional weighted word vector of each N-dimensional compound word from its K-dimensional word vector and the corresponding weight may consist of computing the product of the K-dimensional word vector and the corresponding weight.
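The weighting step can be sketched as below. The sketch assumes the scoring model has the logistic form that the description names later (Logistic Regression), with coefficients `coef` and `bias` that would come from training; these parameters and the function names are hypothetical. The preset screening condition is not specified in the text, so it is not modelled here.

```python
import math
from typing import List, Sequence

def importance(vec: Sequence[float], coef: Sequence[float], bias: float = 0.0) -> float:
    """Score in (0, 1) expressing how important a compound word is (assumed logistic form)."""
    z = sum(w * x for w, x in zip(coef, vec)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def weighted_vector(vec: Sequence[float], coef: Sequence[float], bias: float = 0.0) -> List[float]:
    """Multiply the K-dimensional word vector by its importance weight."""
    score = importance(vec, coef, bias)
    return [score * x for x in vec]

# With all-zero coefficients every word gets weight 0.5.
half = weighted_vector([2.0, 4.0], coef=[0.0, 0.0])
```

A compound word whose weight is close to zero contributes almost nothing to the summed representation, which is one way a low-importance compound word could be suppressed.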
In another possible implementation, the K-dimensional word vector of each N-dimensional compound word is input into a preset algorithm for processing, so as to directly obtain the N-dimensional target compound words that satisfy the preset condition.
Thus, each word in each sentence is mapped into a low-dimensional word vector space and used as a feature vector, with each vector representing one word. Adding language fragments of larger granularity on this basis clearly improves the effect of the bag-of-words model.
Further, the N-dimensional target compound words are input into the bag-of-words model for semantic mining. As an example, the N-dimensional target compound words in the bag-of-words model are used to perform semantic detection on text, and compound words that do not fit the semantics of the text are screened out according to the detection result.
In summary, the compound word processing method for semantic mining of the embodiment of the present invention determines M word segments of each sentence in a training corpus, then selects N word segments according to the order of appearance of the M word segments to generate an N-dimensional compound word, performs K hash operations on the character string of the N-dimensional compound word, queries a pre-established random hash dictionary space to obtain the position uniquely corresponding to each hash operation result, generates a K-dimensional word vector of the N-dimensional compound word from the floating-point numbers at the K positions corresponding to the K hash operation results, and finally screens out, according to the K-dimensional word vectors of all N-dimensional compound words, N-dimensional target compound words that satisfy a preset condition and inputs them into a bag-of-words model for semantic mining. In this way, the corpus can be trained on directly, which reduces the cost of corpus training, and N-dimensional target compound words can be obtained and input into the bag-of-words model for semantic mining, so that more semantic features of larger granularity are introduced into the bag-of-words model without affecting memory performance, further improving the effect of the bag-of-words model.
Based on the above embodiment, it can be understood that the random hash dictionary can coexist with the original word vector dictionary, further extending the representation of semantic segments while retaining Bigrams. This is described in detail below with reference to Fig. 4:
Fig. 4 is a schematic flowchart of a compound word processing method for semantic mining according to another embodiment of the present invention. As shown in Fig. 4, the compound word processing method for semantic mining includes:
The model to which the compound word processing method for semantic mining of this embodiment is applied is shown in Fig. 5: the random hash dictionary and the original word vector dictionary coexist, further extending the representation of semantic segments while retaining Bigrams.
Step 201: determine M word segments of each sentence in the training corpus.
It should be noted that step 201 corresponds to step 101 above; for a description of step 201, refer to the description of step 101, which is not repeated here.
Step 202: select 2 word segments according to the order of appearance of the M word segments to generate a two-dimensional compound word.
Step 203: perform a calculation on the character string of the two-dimensional compound word to obtain a calculation result, query the original word vector dictionary space to obtain the unique position corresponding to the calculation result, and generate the K-dimensional word vector of the two-dimensional compound word from the numbers at that position, where K is an integer greater than 1.
Specifically, a two-dimensional compound word can be generated by selecting 2 word segments according to the order of appearance of the M word segments. Taking the sentence "中华人民共和国" (People's Republic of China) as an example, its 7 word segments appear in the order "中", "华", "人", "民", "共", "和", "国". The 2 segments "中" and "华" may be selected to generate a two-dimensional compound word; the 2 segments "人" and "民" may be selected to generate a two-dimensional compound word; or the 2 segments "共" and "和" may be selected to generate a two-dimensional compound word. In practice, two word segments may be selected as needed to generate a two-dimensional compound word.
Step 204: select N segmented words according to the order of appearance of the M segmented words to generate an N-dimensional compound word, where M is greater than or equal to 2, and N is greater than or equal to 2 and less than or equal to M.
Step 205: perform K hash operations on the character string of the N-dimensional compound word, query the pre-established random hash dictionary space to obtain the position uniquely corresponding to each hash operation result, and generate the K-dimensional word vector of the N-dimensional compound word from the floating-point numbers at the K positions corresponding to the K hash operation results, where K is an integer greater than 1.
It should be noted that the description of steps S204-S205 corresponds to that of steps S102-S103 above; for details of steps S204-S205, refer to the description of steps S102-S103, which is not repeated here.
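The hash lookup of step 205 can be sketched as follows. This is an illustration under stated assumptions, not the implementation of the invention: the table size, the value of K, the use of SHA-256 with a seed prefix to obtain K different hash operations, and the names `HASH_DICTIONARY`, `hash_position` and `compound_word_vector` are all hypothetical. The description only states that K hash operations each yield one position in a pre-established random hash dictionary space and that the floating-point numbers at those K positions form the K-dimensional word vector.

```python
import hashlib
import random
from typing import List

TABLE_SIZE = 1000  # hypothetical number of positions in the random hash dictionary
K = 4              # word vector dimension; the description only requires K > 1

# Pre-established random hash dictionary: one floating-point number per position.
_rng = random.Random(0)
HASH_DICTIONARY: List[float] = [_rng.uniform(-1.0, 1.0) for _ in range(TABLE_SIZE)]


def hash_position(text: str, seed: int) -> int:
    """Map a string to one table position; each seed acts as a distinct hash operation."""
    digest = hashlib.sha256(f"{seed}:{text}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % TABLE_SIZE


def compound_word_vector(compound_word: str) -> List[float]:
    """Run K hash operations and read one floating-point number per resulting position."""
    positions = [hash_position(compound_word, seed) for seed in range(K)]
    return [HASH_DICTIONARY[p] for p in positions]


vector = compound_word_vector("中华人民")
```

Because the table has a fixed size, its memory footprint does not grow with the number of distinct compound words; the trade-off in this sketch is that two different strings may occasionally share a position.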
Step 206: sum the K-dimensional word vector of the two-dimensional compound word and the K-dimensional weighted word vectors of all N-dimensional compound words, and screen out the N-dimensional target compound words that meet a preset condition according to the summation result.
Step 207: perform semantic detection on text using the N-dimensional target compound words in the bag-of-words model.
Step 208: screen out, according to the detection result, the compound words that do not satisfy the semantics of the text.
Specifically, the word vectors can be screened by a linear regression model (Logistic Regression), as shown in Fig. 6. The K-dimensional word vectors of all N-dimensional compound words are input into Logistic Regression to obtain, for each compound word, a score characterizing its importance. This score is used as a weight and multiplied onto the original word vector, yielding a feature vector weighted by importance. Finally, these weighted feature vectors can be summed with the K-dimensional word vector of the two-dimensional compound word, and the N-dimensional target compound words that meet the preset condition are screened out according to the summation result.
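The weighting and summation described above can be sketched in a few lines of Python. The coefficients and bias of the Logistic Regression model, the toy vectors and all function names are assumptions made for illustration; the preset condition used for the final screening is not specified in the description and is therefore left out.

```python
import math
from typing import List

Vector = List[float]


def importance_score(vector: Vector, coefficients: Vector, bias: float) -> float:
    """Logistic Regression score in (0, 1) characterizing a compound word's importance."""
    z = sum(c * x for c, x in zip(coefficients, vector)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def weighted_vector(vector: Vector, weight: float) -> Vector:
    """Multiply the original word vector by its importance weight."""
    return [weight * x for x in vector]


def summed_features(bigram_vector: Vector, ngram_vectors: List[Vector],
                    coefficients: Vector, bias: float) -> Vector:
    """Add every weighted N-dimensional vector to the two-dimensional compound word vector."""
    total = list(bigram_vector)
    for vec in ngram_vectors:
        weight = importance_score(vec, coefficients, bias)
        total = [t + wx for t, wx in zip(total, weighted_vector(vec, weight))]
    return total


# Toy values with K = 2; zero coefficients give every compound word a weight of 0.5.
result = summed_features([1.0, 1.0], [[2.0, 0.0], [0.0, 4.0]], [0.0, 0.0], 0.0)
```

With these toy values each N-dimensional vector is halved before being added, so `result` is `[2.0, 3.0]`. In a trained model the coefficients would differ per dimension, so compound words would receive different weights.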
Further, semantic detection is performed on text using the N-dimensional target compound words in the bag-of-words model, and the compound words that do not satisfy the semantics of the text are screened out according to the detection result.
Thus, the newly added model structure can coexist with the original structure, further improving model performance.
To implement the above embodiments, the present invention further proposes a compound word processing device for semantic mining. Fig. 7 is a schematic structural diagram of a compound word processing device for semantic mining according to an embodiment of the present invention. As shown in Fig. 7, the compound word processing device for semantic mining includes: a determining module 11, a first generation module 12, a first processing module 13, a screening module 14 and a mining module 15.
The determining module 11 is configured to determine the M segmented words of each sentence in the training corpus.
The first generation module 12 is configured to select N segmented words according to the order of appearance of the M segmented words to generate an N-dimensional compound word, where M is greater than or equal to 2, and N is greater than or equal to 2 and less than or equal to M.
The first processing module 13 is configured to perform K hash operations on the character string of the N-dimensional compound word, query the pre-established random hash dictionary space to obtain the position uniquely corresponding to each hash operation result, and generate the K-dimensional word vector of the N-dimensional compound word from the floating-point numbers at the K positions corresponding to the K hash operation results, where K is an integer greater than 1.
The screening module 14 is configured to screen out the N-dimensional target compound words that meet a preset condition according to the K-dimensional word vectors of all N-dimensional compound words.
The mining module 15 is configured to input the N-dimensional target compound words into the bag-of-words model for semantic mining.
In one embodiment of the present invention, the screening module 14 is specifically configured to: input the K-dimensional word vector of each N-dimensional compound word into a preset linear regression model to obtain a weight representing the importance of each N-dimensional compound word; obtain the K-dimensional weighted word vector of each N-dimensional compound word according to its K-dimensional word vector and the corresponding weight; and screen out the N-dimensional target compound words that meet the preset condition according to the K-dimensional weighted word vectors of all N-dimensional compound words.
Obtaining the K-dimensional weighted word vector of each N-dimensional compound word according to its K-dimensional word vector and the corresponding weight includes: calculating the product of the K-dimensional word vector of each N-dimensional compound word and the corresponding weight to obtain the K-dimensional weighted word vector of that N-dimensional compound word.
In one embodiment of the present invention, the mining module 15 is specifically configured to: perform semantic detection on text using the N-dimensional target compound words in the bag-of-words model, and screen out the compound words that do not satisfy the semantics of the text according to the detection result.
It should be noted that the foregoing explanation of the embodiments of the compound word processing method for semantic mining also applies to the compound word processing device for semantic mining of this embodiment, and is not repeated here.
In summary, the compound word processing device for semantic mining of the embodiment of the present invention determines the M segmented words of each sentence in the training corpus, selects N segmented words according to the order of appearance of the M segmented words to generate N-dimensional compound words, performs K hash operations on the character string of each N-dimensional compound word, queries the pre-established random hash dictionary space to obtain the position uniquely corresponding to each hash operation result, and generates the K-dimensional word vector of the N-dimensional compound word from the floating-point numbers at the K positions corresponding to the K hash operation results. Finally, it screens out the N-dimensional target compound words that meet the preset condition according to the K-dimensional word vectors of all N-dimensional compound words and inputs the N-dimensional target compound words into the bag-of-words model for semantic mining. Thus, the corpus can be trained on directly, which reduces the cost of corpus training. Moreover, inputting the N-dimensional target compound words into the bag-of-words model for semantic mining introduces more semantic features of larger granularity into the bag-of-words model without affecting memory performance, which further improves the effect of the bag-of-words model.
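The generation of N-dimensional compound words performed by the first generation module can be sketched as below. The function name `n_dimensional_compound_words` is hypothetical, and two choices are assumptions: only runs of N consecutive segmented words are selected, and the compound word's character string is formed by concatenating them without a separator. The constraint 2 ≤ N ≤ M is taken from the description.

```python
from typing import List


def n_dimensional_compound_words(segments: List[str], n: int) -> List[str]:
    """Join every run of n consecutive segmented words into one compound-word string.

    Assumption: words are contiguous and concatenated without a separator.
    """
    m = len(segments)
    if not 2 <= n <= m:
        raise ValueError("n must satisfy 2 <= n <= M")
    return ["".join(segments[i:i + n]) for i in range(m - n + 1)]


# The 7 segmented words of the sentence 中华人民共和国, in order of appearance.
segments = ["中", "华", "人", "民", "共", "和", "国"]
trigrams = n_dimensional_compound_words(segments, 3)  # N = 3
whole = n_dimensional_compound_words(segments, 7)     # N = M
```

With M = 7, choosing N = 3 yields five compound words starting with "中华人", and choosing N = M yields the whole sentence as a single compound word. Each resulting string is what would then be hashed to obtain its K-dimensional word vector.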
Fig. 8 is a schematic structural diagram of a compound word processing device for semantic mining according to another embodiment of the present invention. As shown in Fig. 8, on the basis of Fig. 7, the device further includes: a second generation module 16 and a second processing module 17.
The second generation module 16 is configured to select 2 segmented words according to the order of appearance of the M segmented words to generate a two-dimensional compound word.
The second processing module 17 is configured to perform a calculation on the character string of the two-dimensional compound word to obtain a calculation result, query the original word vector dictionary space to obtain the unique position corresponding to the calculation result, and generate the K-dimensional word vector of the two-dimensional compound word from the numbers at that position, where K is an integer greater than 1.
The screening module 14 is further specifically configured to: sum the K-dimensional word vector of the two-dimensional compound word and the K-dimensional weighted word vectors of all N-dimensional compound words, and screen out the N-dimensional target compound words that meet the preset condition according to the summation result.
Thus, the newly added model structure can coexist with the original structure, further improving model performance.
The present invention further proposes a computer device. Fig. 9 is a schematic structural diagram of a computer device according to an embodiment of the present invention. As shown in Fig. 9, the computer device includes a memory 21, a processor 22 and a computer program stored on the memory 21 and executable on the processor 22.
When executing the program, the processor 22 implements the compound word processing method for semantic mining provided in the above embodiments.
Further, the computer device also includes:
a communication interface 23, used for communication between the memory 21 and the processor 22.
The memory 21 is used to store the computer program executable on the processor 22.
The memory 21 may include high-speed RAM and may also include non-volatile memory, for example at least one magnetic disk memory.
The processor 22 is used to implement, when executing the program, the compound word processing method for semantic mining described in the above embodiments.
If the memory 21, the processor 22 and the communication interface 23 are implemented independently, the communication interface 23, the memory 21 and the processor 22 can be connected to one another through a bus and communicate with one another. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus can be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is shown in Fig. 9, but this does not mean that there is only one bus or one type of bus.
Optionally, in a specific implementation, if the memory 21, the processor 22 and the communication interface 23 are integrated on a single chip, the memory 21, the processor 22 and the communication interface 23 can communicate with one another through an internal interface.
The processor 22 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention.
To implement the above embodiments, the present invention further proposes a non-transitory computer-readable storage medium. When instructions in the storage medium are executed by a processor, a compound word processing method for semantic mining can be performed, the method including: determining the M segmented words of each sentence in the training corpus; selecting N segmented words according to the order of appearance of the M segmented words to generate an N-dimensional compound word, where M is greater than or equal to 2, and N is greater than or equal to 2 and less than or equal to M; performing K hash operations on the character string of the N-dimensional compound word, querying the pre-established random hash dictionary space to obtain the position uniquely corresponding to each hash operation result, and generating the K-dimensional word vector of the N-dimensional compound word from the floating-point numbers at the K positions corresponding to the K hash operation results, where K is an integer greater than 1; and screening out the N-dimensional target compound words that meet a preset condition according to the K-dimensional word vectors of all N-dimensional compound words, and inputting the N-dimensional target compound words into the bag-of-words model for semantic mining.
To implement the above embodiments, the present invention further proposes a computer program product. When instructions in the computer program product are executed by a processor, a compound word processing method for semantic mining is performed, the method including: determining the M segmented words of each sentence in the training corpus; selecting N segmented words according to the order of appearance of the M segmented words to generate an N-dimensional compound word, where M is greater than or equal to 2, and N is greater than or equal to 2 and less than or equal to M; performing K hash operations on the character string of the N-dimensional compound word, querying the pre-established random hash dictionary space to obtain the position uniquely corresponding to each hash operation result, and generating the K-dimensional word vector of the N-dimensional compound word from the floating-point numbers at the K positions corresponding to the K hash operation results, where K is an integer greater than 1; and screening out the N-dimensional target compound words that meet a preset condition according to the K-dimensional word vectors of all N-dimensional compound words, and inputting the N-dimensional target compound words into the bag-of-words model for semantic mining.
In the description of this specification, a description referring to the terms "one embodiment", "some embodiments", "example", "specific example" or "some examples" means that the specific features, structures, materials or characteristics described in connection with that embodiment or example are included in at least one embodiment or example of the present invention. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, provided they do not conflict with one another, those skilled in the art may combine the different embodiments or examples described in this specification and the features of those different embodiments or examples.
In addition, the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. Thus, a feature defined by "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "multiple" means at least two, for example two or three, unless specifically defined otherwise.
Any process or method description in a flowchart, or otherwise described herein, may be understood as representing a module, segment or portion of code that includes one or more executable instructions for implementing the steps of a custom logic function or process. The scope of the preferred embodiments of the present invention includes other implementations in which functions may be performed out of the order shown or discussed, including in a substantially simultaneous manner or in the reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present invention belong.
The logic and/or steps represented in a flowchart or otherwise described herein, for example an ordered list of executable instructions that may be considered to implement logic functions, may be embodied in any computer-readable medium for use by, or in combination with, an instruction execution system, device or apparatus (such as a computer-based system, a system including a processor, or another system that can fetch instructions from an instruction execution system, device or apparatus and execute them). For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate or transmit a program for use by, or in combination with, an instruction execution system, device or apparatus. More specific examples (a non-exhaustive list) of computer-readable media include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and portable compact disc read-only memory (CD-ROM). In addition, the computer-readable medium may even be paper or another suitable medium on which the program is printed, because the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting or, if necessary, processing it in another suitable manner, and then stored in a computer memory.
It should be understood that each part of the present invention can be implemented in hardware, software, firmware or a combination thereof. In the above embodiments, multiple steps or methods can be implemented by software or firmware that is stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or a combination of the following techniques well known in the art can be used: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.
Those of ordinary skill in the art will understand that all or part of the steps carried by the methods of the above embodiments can be completed by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, includes one of the steps of the method embodiments or a combination thereof.
In addition, the functional units in the embodiments of the present invention may be integrated in one processing module, may each exist physically on their own, or two or more units may be integrated in one module. The integrated module can be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it can also be stored in a computer-readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like. Although embodiments of the present invention have been shown and described above, it should be understood that the above embodiments are exemplary and are not to be construed as limiting the present invention; those of ordinary skill in the art may change, modify, replace and vary the above embodiments within the scope of the present invention.

Claims (10)

1. A compound word processing method for semantic mining, characterized by comprising the following steps:
determining the M segmented words of each sentence in a training corpus;
selecting N segmented words according to the order of appearance of the M segmented words to generate an N-dimensional compound word, wherein M is greater than or equal to 2, and N is greater than or equal to 2 and less than or equal to M;
performing K hash operations on the character string of the N-dimensional compound word, querying a pre-established random hash dictionary space to obtain the position uniquely corresponding to each hash operation result, and generating the K-dimensional word vector of the N-dimensional compound word from the floating-point numbers at the K positions corresponding to the K hash operation results, wherein K is an integer greater than 1;
screening out the N-dimensional target compound words that meet a preset condition according to the K-dimensional word vectors of all N-dimensional compound words, and inputting the N-dimensional target compound words into a bag-of-words model for semantic mining.
2. The method according to claim 1, characterized in that the screening out of the N-dimensional target compound words that meet a preset condition according to the K-dimensional word vectors of all N-dimensional compound words comprises:
inputting the K-dimensional word vector of each N-dimensional compound word into a preset linear regression model to obtain a weight representing the importance of each N-dimensional compound word;
obtaining the K-dimensional weighted word vector of each N-dimensional compound word according to the K-dimensional word vector of that N-dimensional compound word and the corresponding weight;
screening out the N-dimensional target compound words that meet the preset condition according to the K-dimensional weighted word vectors of all N-dimensional compound words.
3. The method according to claim 2, characterized in that the obtaining of the K-dimensional weighted word vector of each N-dimensional compound word according to the K-dimensional word vector of that N-dimensional compound word and the corresponding weight comprises:
calculating the product of the K-dimensional word vector of each N-dimensional compound word and the corresponding weight to obtain the K-dimensional weighted word vector of that N-dimensional compound word.
4. The method according to claim 2, characterized in that, after the determining of the M segmented words of each sentence in the training corpus, the method further comprises:
selecting 2 segmented words according to the order of appearance of the M segmented words to generate a two-dimensional compound word;
performing a calculation on the character string of the two-dimensional compound word to obtain a calculation result, querying an original word vector dictionary space to obtain the unique position corresponding to the calculation result, and generating the K-dimensional word vector of the two-dimensional compound word from the numbers at that position, wherein K is an integer greater than 1;
the screening out of the N-dimensional target compound words that meet the preset condition according to the K-dimensional weighted word vectors of all N-dimensional compound words comprises:
summing the K-dimensional word vector of the two-dimensional compound word and the K-dimensional weighted word vectors of all N-dimensional compound words, and screening out the N-dimensional target compound words that meet the preset condition according to the summation result.
5. The method according to any one of claims 1-4, characterized in that the inputting of the N-dimensional target compound words into a bag-of-words model for semantic mining comprises:
performing semantic detection on text using the N-dimensional target compound words in the bag-of-words model;
screening out the compound words that do not satisfy the semantics of the text according to the detection result.
6. A compound word processing device for semantic mining, characterized by comprising:
a determining module, configured to determine the M segmented words of each sentence in a training corpus;
a first generation module, configured to select N segmented words according to the order of appearance of the M segmented words to generate an N-dimensional compound word, wherein M is greater than or equal to 2, and N is greater than or equal to 2 and less than or equal to M;
a first processing module, configured to perform K hash operations on the character string of the N-dimensional compound word, query a pre-established random hash dictionary space to obtain the position uniquely corresponding to each hash operation result, and generate the K-dimensional word vector of the N-dimensional compound word from the floating-point numbers at the K positions corresponding to the K hash operation results, wherein K is an integer greater than 1;
a screening module, configured to screen out the N-dimensional target compound words that meet a preset condition according to the K-dimensional word vectors of all N-dimensional compound words;
a mining module, configured to input the N-dimensional target compound words into a bag-of-words model for semantic mining.
7. The device according to claim 6, characterized by further comprising:
a second generation module, configured to select 2 segmented words according to the order of appearance of the M segmented words to generate a two-dimensional compound word;
a second processing module, configured to perform a calculation on the character string of the two-dimensional compound word to obtain a calculation result, query an original word vector dictionary space to obtain the unique position corresponding to the calculation result, and generate the K-dimensional word vector of the two-dimensional compound word from the numbers at that position, wherein K is an integer greater than 1;
the screening module is specifically configured to: sum the K-dimensional word vector of the two-dimensional compound word and the K-dimensional weighted word vectors of all N-dimensional compound words, and screen out the N-dimensional target compound words that meet the preset condition according to the summation result.
8. A computer device, characterized by comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the compound word processing method for semantic mining according to any one of claims 1-5.
9. A non-transitory computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the compound word processing method for semantic mining according to any one of claims 1-5.
10. A computer program product, characterized in that, when instructions in the computer program product are executed by a processor, the compound word processing method for semantic mining according to any one of claims 1-5 is performed.
CN201711163429.2A 2017-11-21 2017-11-21 Compound word processing method, device and equipment for semantic mining Active CN107894979B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711163429.2A CN107894979B (en) 2017-11-21 2017-11-21 Compound word processing method, device and equipment for semantic mining

Publications (2)

Publication Number Publication Date
CN107894979A true CN107894979A (en) 2018-04-10
CN107894979B CN107894979B (en) 2021-09-17

Family

ID=61805758

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711163429.2A Active CN107894979B (en) 2017-11-21 2017-11-21 Compound word processing method, device and equipment for semantic mining

Country Status (1)

Country Link
CN (1) CN107894979B (en)

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8046212B1 (en) * 2003-10-31 2011-10-25 Access Innovations Identification of chemical names in text-containing documents
JP2007219778A (en) * 2006-02-16 2007-08-30 Murata Mach Ltd Document processor
CN101093504A (en) * 2006-03-24 2007-12-26 国际商业机器公司 System for extracting new compound word
CN102859515A (en) * 2010-02-12 2013-01-02 谷歌公司 Compound splitting
CN102200984A (en) * 2010-03-24 2011-09-28 深圳市腾讯计算机系统有限公司 Search method based on compound words and search engine server
CN103646080A (en) * 2013-12-12 2014-03-19 北京京东尚科信息技术有限公司 Microblog duplication-eliminating method and system based on reverse-order index
CN106687952A (en) * 2014-09-26 2017-05-17 甲骨文国际公司 Techniques for similarity analysis and data enrichment using knowledge sources
CN104657350A (en) * 2015-03-04 2015-05-27 中国科学院自动化研究所 Hash learning method for short text integrated with implicit semantic features
CN105843960A (en) * 2016-04-18 2016-08-10 上海泥娃通信科技有限公司 Semantic tree based indexing method and system
CN107193802A (en) * 2017-05-25 2017-09-22 上海耐相智能科技有限公司 A kind of smart field concept auto acquisition system

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
DAN SVENSTRUP et al.: "Hash Embeddings for Efficient Word Representations", 《HTTPS://ARXIV.ORG/ABS/1709.03933?CONTEXT=CS.CL》 *
LEI SHI et al.: "Functional Hashing for Compressing Neural Networks", 《HTTPS://ARXIV.ORG/PDF/1605.06560.PDF》 *
OUYANG LIUBO et al.: "Compound word extraction method based on combining position tags with part of speech", 《APPLICATION RESEARCH OF COMPUTERS》 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110569498A (en) * 2018-12-26 2019-12-13 东软集团股份有限公司 Compound word recognition method and related device
CN110569498B (en) * 2018-12-26 2022-12-09 东软集团股份有限公司 Compound word recognition method and related device
CN110059183A (en) * 2019-03-22 2019-07-26 重庆邮电大学 A kind of automobile industry User Perspective sensibility classification method based on big data
CN110059183B (en) * 2019-03-22 2022-08-23 重庆邮电大学 Automobile industry user viewpoint emotion classification method based on big data
CN110457692A (en) * 2019-07-26 2019-11-15 清华大学 Compound word indicates learning method and device
CN114548115A (en) * 2022-02-23 2022-05-27 北京三快在线科技有限公司 Method and device for explaining compound nouns and electronic equipment
CN114548115B (en) * 2022-02-23 2023-01-06 北京三快在线科技有限公司 Method and device for explaining compound nouns and electronic equipment

Also Published As

Publication number Publication date
CN107894979B (en) 2021-09-17

Similar Documents

Publication Publication Date Title
CN110232183B (en) Keyword extraction model training method, keyword extraction device and storage medium
CN107894979A (en) Compound word processing method, apparatus and device for semantic mining
CN111063410B (en) Method and device for generating medical image text report
CN110705294A (en) Named entity recognition model training method, named entity recognition method and device
CN110287480A (en) Named entity recognition method, device, storage medium and terminal device
CN108334499A (en) Text label tagging device, method and computing device
CN110083700A (en) Enterprise public opinion sentiment classification method and system based on convolutional neural networks
CN108763445A (en) Construction method, device, computer equipment and storage medium for a patent knowledge base
CN108280064A (en) Joint processing method for word segmentation, part-of-speech tagging, entity recognition and syntactic analysis
CN112084331A (en) Text processing method, text processing device, model training method, model training device, computer equipment and storage medium
CN109492666A (en) Image recognition model training method, device and storage medium
CN106155686A (en) Interface creating method, device and system
CN106202010A (en) Method and apparatus for building a legal text syntax tree based on a deep neural network
CN108763555A (en) Representation data acquisition method and device based on demand words
CN109344404A (en) Context-aware dual-attention natural language inference method
CN110442840A (en) Sequence labelling network update method, electronic health record processing method and relevant apparatus
CN110232123A (en) Text sentiment analysis method and device, computing device and readable medium
CN107977363A (en) Title generation method, device and electronic equipment
CN110555084A (en) Remote supervision relation classification method based on PCNN and multi-layer attention
CN107122492A (en) Lyric generation method and device based on picture content
CN113220876B (en) Multi-label classification method and system for English text
CN109766553A (en) Chinese word segmentation method based on a capsule model combining multiple regularizations
CN110399488A (en) File classification method and device
CN106951413A (en) Segmenting method and device based on artificial intelligence
CN106844340A (en) News in brief generation and display methods, apparatus and system based on artificial intelligence

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant