CN110020431A - Feature extraction method, apparatus, computer device and storage medium for text information

Publication number
CN110020431A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910168231.6A
Other languages
Chinese (zh)
Other versions
CN110020431B (en
Inventor
赵峰
王健宗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN201910168231.6A (patent CN110020431B)
Publication of CN110020431A
Priority to PCT/CN2019/117424 (patent WO2020177378A1)
Application granted
Publication of CN110020431B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/30: Semantic analysis
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a feature extraction method, apparatus, computer device and storage medium for text information. The method comprises: setting up and training a metanetwork, the metanetwork being a network for generating a group of unique filters corresponding to the input text information; adjusting the length of the text information to be recognized to the input length of the metanetwork; passing the length-adjusted text information to the metanetwork, and generating, by the metanetwork, a group of unique filters corresponding to the text information, the unique filters being filters related to the context of the length-adjusted text information; and passing the length-adjusted text information to the unique filters, and extracting, by the unique filters, the feature vector matrix corresponding to the text information. The invention solves the problem that existing text recognition techniques cannot adapt to context and have poor recognition accuracy.

Description

Feature extraction method, apparatus, computer device and storage medium for text information
Technical field
The present invention relates to the field of information technology, and in particular to a feature extraction method, apparatus, computer device and storage medium for text information.
Background
Convolutional neural networks have recently become a basic building block of natural language processing. Despite their success, most existing convolutional neural networks apply the same learned static filters to all input sentences. The main shortcoming of a static filter is that it is not text-dependent: it treats all types of text equally. For example, people generally read a popular-science article differently from a piece of current political news, and the focus of reading also differs. For political news, the main information to extract is the time, place, people and events; for a popular-science article, greater weight should be given to concepts, logic and causal relationships. A static filter can only treat all contextual information with the same weights, which limits the accuracy of text recognition.
It follows that finding a method that adapts to context and improves text recognition accuracy has become a technical problem in urgent need of a solution in this field.
Summary of the invention
Embodiments of the present invention provide a feature extraction method, apparatus, computer device and storage medium for text information, to solve the problem that existing text recognition techniques cannot adapt to context and have poor recognition accuracy.
A feature extraction method for text information, comprising:
setting up and training a metanetwork, the metanetwork being a network for generating a group of unique filters corresponding to the input text information;
obtaining text information to be recognized;
adjusting the length of the text information to be recognized to the input length of the metanetwork;
passing the length-adjusted text information to the metanetwork as input, and generating, by the metanetwork, a group of unique filters corresponding to the text information, the unique filters being filters related to the context of the length-adjusted text information;
passing the length-adjusted text information to the unique filters as input, and extracting, by the unique filters, the feature vector matrix corresponding to the text information, each element of the feature vector matrix representing a feature of the text information.
Optionally, adjusting the length of the text information to be recognized to the input length of the metanetwork comprises:
obtaining the input length of the metanetwork, and determining whether the length of the text information to be recognized reaches the input length;
if not, appending preset characters to the end of the text information to be recognized, so that its length is adjusted to the input length.
Optionally, passing the length-adjusted text information to the metanetwork as input and generating, by the metanetwork, the group of unique filters corresponding to the text information comprises:
vectorizing the length-adjusted text information to obtain a vector matrix, the vector matrix comprising several word embedding vectors of equal length;
performing a convolution operation on the vector matrix through the metanetwork to obtain a hidden-layer vector of specified length;
performing a transposed convolution operation on the hidden-layer vector to obtain the group of unique filters corresponding to the length-adjusted text information.
Optionally, passing the length-adjusted text information to the unique filters as input and extracting, by the unique filters, the feature vector matrix corresponding to the text information comprises:
vectorizing the length-adjusted text information to obtain a vector matrix, the vector matrix comprising several word embedding vectors of equal length;
performing a convolution operation on the vector matrix through the unique filters to extract the feature map corresponding to the text information;
performing a pooling operation on the feature map, extracting the maximum value of each row of the feature map as a main feature, to obtain the feature vector matrix corresponding to the text information.
Optionally, after the feature vector matrix corresponding to the text information is generated by the unique filters, the method further comprises:
passing the feature vector matrix as input to a fully connected layer, and then passing the output of the fully connected layer as input to a preset Softmax classifier;
obtaining the category corresponding to the text information from the output of the Softmax classifier.
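The optional classification step can be sketched as follows. This is a minimal illustration only: the weights, the bias, the class labels and the three-element feature vector are hypothetical values chosen for the example, not values from the patent.

```python
import math

def fully_connected(features, weights, bias):
    """Affine layer: one output score per class."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, bias)]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(features, weights, bias, labels):
    """Return the most probable label and the full probability list."""
    probs = softmax(fully_connected(features, weights, bias))
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs

# Hypothetical pooled feature vector (K = 3) and two illustrative classes.
features = [2.0, 0.5, 1.0]
weights = [[1.0, 0.0, 0.0],   # row scoring the "news" class
           [0.0, 1.0, 1.0]]   # row scoring the "popular science" class
bias = [0.0, 0.0]
label, probs = classify(features, weights, bias, ["news", "popular science"])
```

Here the class scores are 2.0 and 1.5, so the first label is selected.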
A feature extraction apparatus for text information, comprising:
a training module, configured to set up and train a metanetwork, the metanetwork being a network for generating a group of unique filters corresponding to the input text information;
an information obtaining module, configured to obtain text information to be recognized;
a length adjustment module, configured to adjust the length of the text information to be recognized to the input length of the metanetwork;
a filter generation module, configured to pass the length-adjusted text information to the metanetwork as input and to generate, by the metanetwork, a group of unique filters corresponding to the text information, the unique filters being filters related to the context of the length-adjusted text information;
a feature extraction module, configured to pass the length-adjusted text information to the unique filters as input and to extract, by the unique filters, the feature vector matrix corresponding to the text information, each element of the feature vector matrix representing a feature of the text information.
Optionally, the length adjustment module comprises:
a length obtaining unit, configured to obtain the input length of the metanetwork and determine whether the length of the text information to be recognized reaches the input length;
a length adjustment unit, configured to, if the length of the text information to be recognized does not reach the input length, append preset characters to the end of the text information to be recognized so that its length is adjusted to the input length.
Optionally, the filter generation module comprises:
a first vectorization unit, configured to vectorize the length-adjusted text information to obtain a vector matrix, the vector matrix comprising several word embedding vectors of equal length;
a first convolution unit, configured to perform a convolution operation on the vector matrix through the metanetwork to obtain a hidden-layer vector of specified length;
a transposed convolution unit, configured to perform a transposed convolution operation on the hidden-layer vector to obtain the group of unique filters corresponding to the length-adjusted text information.
A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the above feature extraction method for text information when executing the computer program.
A computer-readable storage medium storing a computer program, wherein the computer program implements the above feature extraction method for text information when executed by a processor.
In the embodiments of the present invention, a metanetwork is set up and trained, the metanetwork being a network for generating a group of unique filters corresponding to the input text information. When text information is to be recognized, the metanetwork is obtained according to the text information to be recognized, and the length of that text information is adjusted to the input length of the metanetwork. The length-adjusted text information is then passed to the metanetwork as input, and the metanetwork generates a group of unique filters corresponding to the text information, the unique filters being filters related to the context of the length-adjusted text information. The length-adjusted text information is passed to the unique filters as input, and the unique filters extract the feature vector matrix corresponding to the text information, each element of which represents a feature of the text information. In this way a weight matrix of the convolutional neural network is learned specifically for the text information to be recognized, which solves the problem that existing text recognition techniques based on convolutional neural networks cannot adapt to context and have poor recognition accuracy, and greatly improves the accuracy of feature extraction for text information.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow chart of the feature extraction method for text information in one embodiment of the invention;
Fig. 2 is a flow chart of step S103 of the feature extraction method for text information in one embodiment of the invention;
Fig. 3 is a flow chart of step S104 of the feature extraction method for text information in one embodiment of the invention;
Fig. 4 is a flow chart of step S105 of the feature extraction method for text information in one embodiment of the invention;
Fig. 5 is a flow chart of the feature extraction method for text information in one embodiment of the invention;
Fig. 6 is a functional block diagram of the feature extraction apparatus for text information in one embodiment of the invention;
Fig. 7 is a schematic diagram of the computer device in one embodiment of the invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. Evidently, the described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort shall fall within the protection scope of the present invention.
The feature extraction method for text information provided by the embodiments of the present invention is applied to a server. The server may be implemented as an independent server or as a server cluster composed of multiple servers. In one embodiment, as shown in Fig. 1, a feature extraction method for text information is provided, comprising the following steps:
In step S101, a metanetwork is set up and trained, the metanetwork being a network for generating a group of unique filters corresponding to the input text information.
Here, to solve the problem that most existing convolutional neural networks apply the same learned static filters to all text information to be recognized, the embodiment of the present invention proposes a convolutional neural network that uses a metanetwork to learn the context, and applies it to text processing. The metanetwork is a network for generating a group of unique filters corresponding to the input text information: from the input text information it generates a weight matrix related to the context of that text, so it is a network about a network. The filters generated by the metanetwork are customized for different types of text information and suited to each of them, which changes the situation in previous convolutional neural networks where the same filters treat all types of text equally. Because the filters are related to the context of the text information, the extracted features are more accurate.
In the embodiment of the present invention, the metanetwork may be any differentiable deep network. The metanetwork is obtained in advance by training on a large training text set. When feature extraction is performed on text information, the pre-trained metanetwork is obtained and its parameters are fine-tuned to obtain a metanetwork suited to the text information at hand.
In step S102, text information to be recognized is obtained.
In the embodiment of the present invention, the text information is a sentence formed according to the cohesion and semantic coherence rules of a specified language, including but not limited to scripted-dialogue text information and literary text information. The server may obtain the text information to be recognized according to actual needs or the application scenario. For example, the server obtains the text information to be recognized from a preset database in which a large amount of text information has been collected in advance. Alternatively, the server obtains voice information input by the user through the microphone of a client and converts the voice information into text to obtain the text information to be recognized. Alternatively, the server obtains image information through the camera function of a client and performs OCR text recognition on the image information to obtain the text information to be recognized. It is understood that the server may also obtain the text information to be recognized in other ways, which are not described further here.
To improve the value of the metanetwork's output and the accuracy of text feature extraction, the text information to be recognized preferably has the same or a similar distribution and context as the training text of the metanetwork; that is, the text information to be recognized is identical to the training text or is part of it, and the two are the same or similar in style, type, semantics and so on.
In step S103, the length of the text information to be recognized is adjusted to the input length of the metanetwork.
Here, the length of the text information refers to its string length. So that the unique filters the metanetwork generates for different input texts have a uniform size, the embodiment of the present invention adjusts the length of the text information to be recognized to the input length of the metanetwork before inputting it into the metanetwork. The input length is the preset string length of the metanetwork's input parameter. Optionally, when training the metanetwork, the distribution of string lengths over all training texts may be computed in advance, and the maximum length chosen as the input length of the metanetwork for uniformity. As shown in Fig. 2, step S103 of adjusting the length of the text information to be recognized to the input length of the metanetwork comprises:
In step S1031, the input length of the metanetwork is obtained, and it is determined whether the length of the text information to be recognized reaches the input length.
Here, the embodiment of the present invention counts the characters of the text information to be recognized to obtain its string length. The string length is then compared with the input length to determine whether the length of the text information to be recognized reaches the input length.
If so, that is, if the length of the text information to be recognized reaches the input length, the method goes to step S104, in which the metanetwork generates the group of unique filters corresponding to the text information to be recognized.
In step S1032, if not, preset characters are appended to the end of the text information to be recognized so that its length is adjusted to the input length. The method then goes to step S104.
Here, for text information to be recognized whose string length does not reach the input length, the embodiment of the present invention pads it with preset characters to adjust it to the input length. The preset character is a special character, such as NUL, that represents a blank to the metanetwork and the convolutional neural network.
For ease of understanding, the adjustment of the length of text information to the input length of the metanetwork in step S103 is illustrated below. Suppose the distribution of string lengths of all training texts has been computed in advance and the maximum length, for example 7, is chosen as the input length. If the text information to be recognized is "The weather is very good today", step S1031 finds that its string length is 6 (the count refers to the characters of the sentence in its original language), which does not reach the input length 7. In step S1032, the preset character NUL is then used to pad the text information to be recognized, giving the length-adjusted text information "The weather is very good today NUL".
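The length adjustment of steps S1031 and S1032 can be sketched as below. This is a minimal illustration under stated assumptions: the function names are hypothetical, the string "NUL" stands in for the preset blank character, and a six-letter placeholder string stands in for the six-character sentence of the example. The source does not say how texts longer than the input length are handled, so the sketch leaves them unchanged.

```python
PAD = "NUL"  # placeholder token standing for the preset blank character

def max_training_length(training_texts):
    """Choose the input length as the longest training text (optional scheme)."""
    return max(len(text) for text in training_texts)

def adjust_length(chars, input_length, pad=PAD):
    """S1031: compare lengths. S1032: append pad tokens if the text is too short."""
    chars = list(chars)
    if len(chars) >= input_length:
        # Length already reached: go straight to step S104 (no change here).
        return chars
    return chars + [pad] * (input_length - len(chars))

# Hypothetical training set whose longest text has 7 characters.
input_length = max_training_length(["abcdefg", "abc", "abcde"])
# Six placeholder characters standing for the six-character example sentence.
adjusted = adjust_length("abcdef", input_length)
```

With these inputs the adjusted sequence has seven entries, the last being the padding token.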
In step S104, the length-adjusted text information is passed to the metanetwork as input, and the metanetwork generates a group of unique filters corresponding to the text information, the unique filters being filters related to the context of the length-adjusted text information.
As mentioned above, the metanetwork can learn to adjust the filters of the convolutional neural network according to its input parameter. In the embodiment of the present invention, the input parameter is the length-adjusted text information to be recognized, and the filters are the group of unique filters corresponding to that text information. The unique filters are related to the context of the text information to be recognized, so using them allows different features to be refined and extracted for different text information to be recognized.
Optionally, to handle variable input length, in the embodiment of the present invention the metanetwork generates unique filters of a predefined size. As shown in Fig. 3, step S104 of passing the length-adjusted text information to the metanetwork as input and generating, by the metanetwork, the group of unique filters corresponding to the text information comprises:
In step S1041, the length-adjusted text information is vectorized to obtain a vector matrix, the vector matrix comprising several word embedding vectors of equal length.
Here, the embodiment of the present invention vectorizes the length-adjusted text information to obtain its corresponding vector matrix, which serves as the vector matrix of the text information to be recognized. The vector matrix comprises several word embedding vectors. A word embedding vector is the word vector of a word obtained by segmenting the length-adjusted text information; that is, each word in the length-adjusted text information is mapped to one column vector of the vector matrix. In the embodiment of the present invention, the length of the word embedding vectors is specified in advance, so for text information to be recognized of any length the corresponding word embedding vectors all have the same length. Although preset characters are used to fill the text information to be recognized during length adjustment, to the metanetwork and the convolutional neural network the preset character is simply a special character representing a blank. Converting the length-adjusted text information into a vector matrix facilitates recognition and learning by the subsequent convolutional neural network, that is, the subsequent convolution and transposed convolution operations.
In step S1042, a convolution operation is performed on the vector matrix through the metanetwork to obtain a hidden-layer vector of specified length.
After the vector matrix corresponding to the length-adjusted text information is obtained, a convolution operation is performed on it by the preset convolutional layer in the metanetwork: the convolutional-layer filters are convolved with the vector matrix, computing the dot product between each filter and the vector matrix so as to extract higher-level features and obtain a hidden-layer vector of specified length. The parameters of the convolutional-layer filters can be optimized through a loss function.
In step S1043, a transposed convolution operation is performed on the hidden-layer vector to obtain the group of unique filters corresponding to the length-adjusted text information.
Here, the transposed convolution (transpose convolution) operation, also known as deconvolution, is similar to the inverse operation of convolution. The embodiment of the present invention stacks a transposed convolutional layer on the hidden layer described in step S1042. After the hidden-layer vector is obtained, it is passed through the transposed convolutional layer, which performs a transposed convolution operation and generates a group of convolution kernels. These kernels serve as the group of unique filters corresponding to the length-adjusted text information, that is, to the text information to be recognized. The parameters of the transposed convolutional layer can be optimized through a loss function. It is understood that the unique filters are filters related to the context of the text information to be recognized: they are customized for, and suited to, that text information.
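Steps S1042 and S1043 can be sketched as a toy filter generator. This is an illustration under stated assumptions rather than the patented network: the encoder keeps the largest ReLU response of each filter (a pooling choice not specified in the source), the transposed convolution is one-dimensional, and every name and number below is hypothetical.

```python
def encode(X_cols, enc_filters):
    """S1042 (sketch): slide each encoder filter over the columns of X and keep
    its largest ReLU response, giving one hidden value per filter."""
    hidden = []
    for filt in enc_filters:
        width = len(filt)
        responses = []
        for start in range(len(X_cols) - width + 1):
            window = X_cols[start:start + width]
            responses.append(sum(w * x
                                 for w_col, x_col in zip(filt, window)
                                 for w, x in zip(w_col, x_col)))
        hidden.append(max(0.0, max(responses)))
    return hidden

def transposed_conv1d(z, kernel, stride):
    """S1043 (sketch): each hidden value scatters a scaled copy of the kernel."""
    out = [0.0] * ((len(z) - 1) * stride + len(kernel))
    for i, value in enumerate(z):
        for j, k in enumerate(kernel):
            out[i * stride + j] += value * k
    return out

def generate_filters(X_cols, enc_filters, dec_kernel, stride, K, h, d):
    """Map an input matrix to K input-specific filters, each of shape h x d."""
    flat = transposed_conv1d(encode(X_cols, enc_filters), dec_kernel, stride)
    assert len(flat) == K * h * d, "decoder must emit exactly K*h*d weights"
    return [[flat[(k * h + r) * d:(k * h + r + 1) * d] for r in range(h)]
            for k in range(K)]

# Two hypothetical inputs with T = 3 columns of dimension d = 2.
X_a = [[1, 0], [0, 1], [1, 1]]
X_b = [[0, 1], [1, 0], [0, 0]]
enc_filters = [[[1, 0], [0, 1]], [[0, 1], [1, 0]]]  # two encoder filters, width 2
dec_kernel = [1.0, -1.0]
filters_a = generate_filters(X_a, enc_filters, dec_kernel, stride=2, K=1, h=2, d=2)
filters_b = generate_filters(X_b, enc_filters, dec_kernel, stride=2, K=1, h=2, d=2)
```

The two inputs produce filters of identical shape but different values, which is the property the metanetwork relies on: the filter size is fixed in advance while the weights depend on the input text.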
In the embodiment of the present invention, the text information to be recognized is adjusted to the specified input length in step S103 and then packed into a vector matrix of fixed size, from which the hidden-layer vector is obtained. The hidden-layer vector is therefore independent of the original length of the text information to be recognized, which guarantees that the filters generated by the metanetwork have the same dimensions and size for every text information to be recognized; that is, the size of the filters generated by the metanetwork stays consistent.
Optionally, in the embodiment of the present invention, the parameters of the convolutional layer and the transposed convolutional layer are jointly differentiable, so when training the metanetwork the parameters of both layers can be optimized and updated together by the gradient back-propagation algorithm. The idea of the back-propagation (BP) algorithm is to compute the error of the output obtained through the convolutional layer and the transposed convolutional layer and propagate that error backward layer by layer. It iterates over two phases, excitation propagation and weight update, until the feature vector matrix of the training text reaches the predetermined error target. Back-propagation further optimizes the parameters of the metanetwork and improves the accuracy with which the metanetwork generates unique filters.
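The joint update of both layers can be illustrated with a deliberately tiny scalar model. This is a sketch under strong assumptions, not the patented training procedure: a single weight `a` stands in for the convolutional layer, a single weight `c` for the transposed convolutional layer, the loss is a squared error, and all names and values are hypothetical.

```python
def forward(a, c, x):
    z = a * x   # stand-in for the convolution stage: hidden value
    w = c * z   # stand-in for the transposed convolution: generated filter weight
    y = w * x   # the generated filter applied to the same input
    return z, w, y

def train(a, c, x, target, lr=0.1, steps=50):
    """Update a and c together by back-propagating one squared-error loss."""
    losses = []
    for _ in range(steps):
        z, w, y = forward(a, c, x)
        losses.append((y - target) ** 2)
        dy = 2.0 * (y - target)   # dL/dy
        dw = dy * x               # dL/dw, since y = w * x
        grad_c = dw * z           # dL/dc, since w = c * z
        grad_a = dw * c * x       # dL/da, since w = c * z and z = a * x
        a, c = a - lr * grad_a, c - lr * grad_c
    return a, c, losses

a, c, losses = train(a=0.5, c=0.5, x=1.0, target=2.0)
```

Both parameters receive gradients from the same error signal, and the loss shrinks as the product `a * c` approaches the target.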
In step S105, the length-adjusted text information is passed to the unique filters as input, and the unique filters extract the feature vector matrix corresponding to the text information, each element of the feature vector matrix representing a feature of the text information.
After the unique filters corresponding to the length-adjusted text information are obtained through the metanetwork, they are used to recognize the length-adjusted text information. Specifically, the length-adjusted text information is passed to the unique filters as input, and the output of the unique filters is taken as the feature vector matrix corresponding to the text information to be recognized. The feature vector matrix contains the feature information, that is, the semantic information, of the text information to be recognized.
Optionally, as shown in Fig. 4, step S105 of passing the length-adjusted text information to the unique filters as input and extracting, by the unique filters, the feature vector matrix corresponding to the text information comprises:
In step S1051, the length-adjusted text information is vectorized to obtain a vector matrix, the vector matrix comprising several word embedding vectors of equal length.
Here, before the length-adjusted text information is passed to the unique filters, it may be vectorized: each word in the text information is mapped to a column vector of the vector matrix to obtain the word embedding vector of each word, and the word embedding vectors are combined to obtain the vector matrix corresponding to the length-adjusted text information. The length of the word embedding vectors is specified in advance, so for text information to be recognized of any length the corresponding word embedding vectors all have the same length. Converting the text information to be recognized into a vector matrix facilitates the subsequent convolution operation.
As an example, suppose the length of the length-adjusted text information is $T$ and the words composing it are $x_1, x_2, \ldots, x_T$. Vectorizing the length-adjusted text information yields a vector matrix $X \in \mathbb{R}^{d \times T}$, in which each column is the $d$-dimensional word embedding vector of one word of the length-adjusted text information.
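The vectorization step can be sketched as a table lookup that builds a matrix with d rows and T columns. The embedding table and tokens below are hypothetical, and mapping the padding token (or any unknown token) to a zero vector is an assumption of this sketch, not something the source specifies.

```python
def vectorize(tokens, embeddings, d):
    """Return X with d rows and T columns; column t is the embedding of tokens[t]."""
    # Look up each token; unknown or padding tokens fall back to a zero vector.
    columns = [embeddings.get(tok, [0.0] * d) for tok in tokens]
    # Transpose the T columns into d rows so that X[r][t] is row r, column t.
    return [[col[r] for col in columns] for r in range(d)]

# Hypothetical 2-dimensional embedding table and a padded token sequence (T = 3).
embeddings = {"a": [1.0, 2.0], "b": [3.0, 4.0]}
X = vectorize(["a", "b", "NUL"], embeddings, d=2)
```

The resulting matrix has two rows and three columns, with the last column all zeros for the padding token.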
In step S1052, convolution algorithm is executed to the vector matrix by unique filter, extracts the text The corresponding characteristic pattern of this information.
After the vector matrix corresponding to the length-adjusted text information is obtained, it is passed as input to the unique filters and a convolution operation is performed. That is, the unique filters are convolved with the vector matrix by computing the dot product between the filters and the vector matrix, so as to extract higher-level features and obtain the feature map corresponding to the text information.
Illustratively, assume that the weights of the unique filters are $W \in \mathbb{R}^{K \times h \times d}$. The unique filters are convolved with every window of size $h$ in the vector matrix to obtain the feature map $P$ of the vector matrix. Each element $p_i$ of the feature map $P$ is generated from the text fragment covered by a window of size $h$: $p_i = f(W \times x_{i:i+h-1} + b)$.
In the above formula, $i = 1, 2, \ldots, T-h+1$; $\times$ denotes the convolution operator, $b$ denotes a bias vector of dimension $K$, and $f$ denotes a nonlinear activation function such as ReLU.
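As a rough illustration of the formula above, the following pure-Python sketch slides K filters of window size h over a d x T vector matrix and applies ReLU. The nested-list layout of `W` (K filters, each h x d), the function names, and the numeric values are assumptions chosen for the example, not details given in the patent.

```python
def relu(v):
    """Nonlinear activation f; ReLU is the example named in the text."""
    return max(0.0, v)


def convolve(X, W, b):
    """Compute the K x (T-h+1) feature map P from a d x T matrix X.

    W holds K filters, each an h x d window of weights; b holds K biases.
    """
    d, T = len(X), len(X[0])
    K, h = len(W), len(W[0])
    P = []
    for k in range(K):
        row = []
        for i in range(T - h + 1):          # one output per window position
            s = b[k]
            for j in range(h):              # position inside the window
                for r in range(d):          # embedding dimension
                    s += W[k][j][r] * X[r][i + j]
            row.append(relu(s))
        P.append(row)
    return P


# Toy input: d = 2, T = 4; two filters with window size h = 2.
X = [[1.0, 2.0, 3.0, 4.0],
     [0.0, 1.0, 0.0, 1.0]]
W = [[[1.0, 0.0], [1.0, 0.0]],    # sums the first dimension over the window
     [[0.0, 1.0], [0.0, -1.0]]]   # difference of the second dimension
b = [0.0, 0.0]
P = convolve(X, W, b)
```

Each column of `P` corresponds to one element p_i (a K-dimensional value for window i), and each row collects the responses of one filter across all windows.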
In step S1053, a pooling operation is performed on the feature map, and the maximum value of each row of the feature map is extracted as a main feature, to obtain the feature vector matrix corresponding to the text information.
In this embodiment of the present invention, the feature map is then passed as input to a max pooling layer. The max pooling layer extracts the maximum value of each row of the feature map as a main feature, and all main features are combined into a K-dimensional vector. This K-dimensional vector is taken as the feature vector matrix corresponding to the length-adjusted text information, that is, to the text information to be identified. The max pooling layer discards unimportant features and retains only the most prominent ones, which on the one hand makes the feature map smaller and reduces computational complexity, and on the other hand can improve recognition accuracy.
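Row-wise max pooling is simple enough to show directly. In this sketch the function name `max_pool_rows` and the sample feature map are illustrative assumptions.

```python
def max_pool_rows(P):
    """Keep the largest value of each row of the K x (T-h+1) feature map."""
    return [max(row) for row in P]


# Example feature map with K = 2 rows (one row per filter).
feature_map = [[3.0, 5.0, 7.0],
               [0.0, 1.0, 0.0]]
features = max_pool_rows(feature_map)  # K-dimensional vector
```

The result has one entry per filter, so its length is K regardless of the text length T.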
Each element of the feature vector matrix represents a feature, that is, semantic information, of the text information. In this embodiment of the present invention, the unique filters generated by the meta network depend on the context of the text information to be identified, so different text information to be identified yields different unique filters, that is, different weight matrices in the convolutional neural network. Obtaining the feature vector matrix of the text information to be identified through the unique filters greatly improves the accuracy of feature extraction.
In summary, in this embodiment of the present invention a meta network is set up and trained, the meta network being a network for generating a set of unique filters corresponding to the input text information. When text information is to be identified, the text information to be identified is obtained and its length is adjusted to the input length of the meta network. The length-adjusted text information is then passed to the meta network as input, and the meta network generates the set of unique filters corresponding to the text information, the unique filters being filters that depend on the context of the length-adjusted text information. The length-adjusted text information is passed to the unique filters as input, and the unique filters extract the feature vector matrix corresponding to the text information, each element of which represents a feature of the text information. In this way, a set of filters is learned specifically for the text information to be identified and used to identify it, which addresses the problem that existing text recognition techniques cannot adapt to the linguistic context and therefore achieve poor recognition accuracy, and greatly improves the accuracy of feature extraction for text information.
In this embodiment of the present invention, the feature vector matrix corresponds to the text information to be identified and includes several elements, each of which represents a feature, that is, semantic information, extracted from the text information. Compared with the text information to be identified, the dimension of the feature vector matrix is substantially reduced. The text information to be identified can further be classified on the basis of the feature vector matrix. As shown in Fig. 5, the method may further include:
In step S106, the feature vector matrix is passed as input to a fully connected layer, and the output of the fully connected layer is then passed as input to a preset Softmax classifier.
In step S107, the category corresponding to the text information is obtained from the output of the Softmax classifier.
Here, the embodiment of the present invention refines the feature vector matrix through the fully connected layer, converting it into a vector of a specified dimension so that the subsequent softmax classifier can perform classification. According to the number N of classification categories, the fully connected layer is configured in advance with K*N weight coefficients and N bias values, where K is the size of the last dimension of the layer preceding the fully connected layer, that is, the dimension of the output feature vector matrix. The feature vector matrix is multiplied by the weight matrix of the fully connected layer and the bias values are added; the resulting sums form a one-dimensional vector, which is the output of the fully connected layer.
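The fully connected layer described above amounts to a matrix-vector product plus a bias. The sketch below assumes the K*N weight coefficients are laid out as a K x N nested list; the function name and the numbers are made up for illustration.

```python
def fully_connected(features, weights, biases):
    """Map a K-dimensional feature vector to N scores.

    weights is a K x N matrix (K*N coefficients); biases has N entries.
    """
    K, N = len(weights), len(biases)
    return [
        sum(features[k] * weights[k][n] for k in range(K)) + biases[n]
        for n in range(N)
    ]


features = [7.0, 1.0]                 # K = 2
weights = [[1.0, 0.0, -1.0],          # N = 3 categories
           [0.0, 2.0, 1.0]]
biases = [0.5, 0.0, 1.0]
scores = fully_connected(features, weights, biases)
```

The output `scores` is the one-dimensional vector of length N that is handed to the softmax classifier.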
The output of the fully connected layer is passed as input to the preset softmax classifier. The softmax classifier is used to handle multi-class classification problems, and its output is computed by the Softmax function, which is defined as follows:
$$S_n = \frac{e^{V_n}}{\sum_{j=1}^{N} e^{V_j}}$$
In the above formula, $V_n$ denotes an element of the one-dimensional vector output by the fully connected layer, $n$ denotes the category index, $n = 1, 2, 3, \ldots, N$, and the total number of categories is $N$. $S_n$ denotes the ratio of the exponential of the current element $V_n$ to the sum of the exponentials of all elements. All $S_n$ are combined into a one-dimensional vector to obtain the output of the softmax classifier. As the formula shows, the softmax classifier converts the multi-class output values of the fully connected layer into relative probabilities; each element characterizes the relative probability of one category, which makes the output easy to interpret and compare. Based on the output of the softmax classifier, the category corresponding to the element with the highest probability is the most likely one, so the text information to be identified can be predicted to belong to that category.
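A short sketch of the softmax step and the final category choice follows. Subtracting the maximum before exponentiating is a common numerical-stability measure that leaves the result unchanged; it is not mentioned in the patent. The function names and the label list are illustrative assumptions.

```python
import math


def softmax(V):
    """Convert N scores into relative probabilities that sum to 1."""
    m = max(V)  # shift for numerical stability; does not change the ratios
    exps = [math.exp(v - m) for v in V]
    total = sum(exps)
    return [e / total for e in exps]


def predict(V, labels):
    """Return the label whose softmax probability is largest."""
    S = softmax(V)
    return labels[S.index(max(S))]


scores = [7.5, 2.0, -5.0]
probabilities = softmax(scores)
category = predict(scores, ["agree", "refuse", "wait"])
```

Because the exponential is monotonic, the category with the largest score also has the largest probability, so `category` is the label at the position of the largest score.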
Optionally, in this embodiment of the present invention, the category may be an intent category, such as agree, refuse, or wait; it may also be a web page category, a sentiment category, a user comment category, and so on, without limitation here.
In this embodiment of the present invention, the unique filters generated by the meta network depend on the context of the text information to be identified, so different text information to be identified yields different unique filters, that is, different weight matrices. Obtaining the feature vector matrix of the text information to be identified through the unique filters greatly improves the accuracy of feature extraction, and classifying on the basis of that feature vector matrix further improves the accuracy of classification.
It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present invention.
In one embodiment, a feature extraction device for text information is provided, which corresponds one-to-one to the feature extraction method for text information in the above embodiments. As shown in Fig. 6, the feature extraction device for text information includes a training module 61, an information acquisition module 62, a length adjustment module 63, a filter generation module 64, and a feature extraction module 65. The functional modules are described in detail as follows:
Training module 61, configured to set up and train a meta network, the meta network being a network for generating a set of unique filters corresponding to the input text information;
Information acquisition module 62, configured to obtain text information to be identified;
Length adjustment module 63, configured to adjust the length of the text information to be identified to the input length of the meta network;
Filter generation module 64, configured to pass the length-adjusted text information to the meta network as input and to generate, by the meta network, the set of unique filters corresponding to the text information, the unique filters being filters that depend on the context of the length-adjusted text information;
Feature extraction module 65, configured to pass the length-adjusted text information to the unique filters as input and to extract, by the unique filters, the feature vector matrix corresponding to the text information, each element of the feature vector matrix representing a feature of the text information.
Optionally, the length adjustment module 63 includes:
Length acquisition unit, configured to obtain the input length of the meta network and to judge whether the length of the text information to be identified reaches the input length;
Length adjustment unit, configured to, if the length of the text information to be identified does not reach the input length, pad preset characters to the end of the text information to be identified, so as to adjust the length of the text information to be identified to the input length.
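The padding behaviour of the length adjustment unit can be sketched as follows. The pad token `"<pad>"` and the function name are hypothetical, and the handling of text that already reaches the input length (returned unchanged here) is an assumption, since this passage only describes the shorter case.

```python
def adjust_length(words, input_length, pad_token="<pad>"):
    """Pad the word list at the end until it reaches input_length."""
    if len(words) < input_length:
        return words + [pad_token] * (input_length - len(words))
    # Not shorter than the input length: left unchanged in this sketch.
    return list(words)


padded = adjust_length(["i", "agree"], 4)
```

After this step, every text handed to the meta network has the same number of tokens, which is what lets a fixed-size network consume texts of different lengths.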
Optionally, the filter generation module 64 includes:
First vector unit, configured to vectorize the length-adjusted text information to obtain a vector matrix, the vector matrix including several word embedding vectors, all of the same length;
First convolution unit, configured to perform a convolution operation on the vector matrix by the meta network to obtain a hidden layer vector of a specified length;
Transposed convolution unit, configured to perform a transposed convolution operation on the hidden layer vector to obtain the set of unique filters corresponding to the length-adjusted text information.
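To make the transposed convolution step more concrete, here is a minimal one-dimensional sketch in Python. It is not the patent's implementation: the kernel values, the stride, and in particular the way the flat output is reshaped into K filters of size h x d are assumptions introduced only for illustration, since this passage does not specify them.

```python
def transposed_conv1d(z, kernel, stride=1):
    """1-D transposed convolution: each input value scatters a scaled kernel copy."""
    out = [0.0] * ((len(z) - 1) * stride + len(kernel))
    for i, value in enumerate(z):
        for j, w in enumerate(kernel):
            out[i * stride + j] += value * w
    return out


def reshape_filters(flat, K, h, d):
    """Arrange a flat vector of K*h*d values as K filters of shape h x d."""
    if len(flat) != K * h * d:
        raise ValueError("flat vector does not match K*h*d")
    return [
        [flat[(k * h + j) * d:(k * h + j + 1) * d] for j in range(h)]
        for k in range(K)
    ]


# Hypothetical hidden layer vector produced by the meta network's convolution.
hidden = [1.0, 2.0]
kernel = [1.0, 0.5, 0.25, 0.0]     # weights that would be learned in training
flat = transposed_conv1d(hidden, kernel, stride=4)
filters = reshape_filters(flat, K=2, h=2, d=2)
```

Because `hidden` is computed from the input text, the generated `filters` differ from one text to another, which is the sense in which the filters are context-dependent.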
Optionally, the feature extraction module 65 includes:
Second vector unit, configured to vectorize the length-adjusted text information to obtain a vector matrix, the vector matrix including several word embedding vectors, all of the same length;
Second convolution unit, configured to perform a convolution operation on the vector matrix by the unique filters to extract the feature map corresponding to the text information;
Pooling unit, configured to perform a pooling operation on the feature map and to extract the maximum value of each row of the feature map as a main feature, to obtain the feature vector matrix corresponding to the text information.
Optionally, the device further includes, for use after the feature vector matrix corresponding to the text information has been extracted by the unique filters:
Classification module, configured to pass the feature vector matrix as input to a fully connected layer, then pass the output of the fully connected layer as input to a preset Softmax classifier, and obtain the category corresponding to the text information from the output of the Softmax classifier.
For the specific limitations of the feature extraction device for text information, refer to the limitations of the feature extraction method for text information above; details are not repeated here. The modules of the feature extraction device may be implemented wholly or partly in software, hardware, or a combination thereof. Each module may be embedded in, or independent of, a processor of the computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can invoke it and perform the operations corresponding to the module.
In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in Fig. 7. The computer device includes a processor, a memory, a network interface, and a database connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The network interface of the computer device is used to communicate with an external terminal through a network connection. When executed by the processor, the computer program implements a feature extraction method for text information.
In one embodiment, a computer device is provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor. When executing the computer program, the processor performs the following steps:
Setting up and training a meta network, the meta network being a network for generating a set of unique filters corresponding to the input text information;
Obtaining text information to be identified;
Adjusting the length of the text information to be identified to the input length of the meta network;
Passing the length-adjusted text information to the meta network as input, and generating, by the meta network, the set of unique filters corresponding to the text information, the unique filters being filters that depend on the context of the length-adjusted text information;
Passing the length-adjusted text information to the unique filters as input, and extracting, by the unique filters, the feature vector matrix corresponding to the text information, each element of the feature vector matrix representing a feature of the text information.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored. When executed by a processor, the computer program performs the following steps:
Setting up and training a meta network, the meta network being a network for generating a set of unique filters corresponding to the input text information;
Obtaining text information to be identified;
Adjusting the length of the text information to be identified to the input length of the meta network;
Passing the length-adjusted text information to the meta network as input, and generating, by the meta network, the set of unique filters corresponding to the text information, the unique filters being filters that depend on the context of the length-adjusted text information;
Passing the length-adjusted text information to the unique filters as input, and extracting, by the unique filters, the feature vector matrix corresponding to the text information, each element of the feature vector matrix representing a feature of the text information.
Those of ordinary skill in the art will understand that all or part of the processes of the methods in the above embodiments can be carried out by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. Any reference to memory, storage, a database, or other media used in the embodiments provided by the present invention may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
It will be clear to those skilled in the art that, for convenience and brevity of description, the division into the functional units and modules above is only an example. In practical applications, the functions may be allocated to different functional units or modules as required, that is, the internal structure of the device may be divided into different functional units or modules to perform all or part of the functions described above.
The embodiments described above merely illustrate the technical solutions of the present invention and do not limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some of their technical features replaced by equivalents. Such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and shall all fall within the protection scope of the present invention.

Claims (10)

1. A feature extraction method for text information, characterized by comprising:
setting up and training a meta network, the meta network being a network for generating a set of unique filters corresponding to input text information;
obtaining text information to be identified;
adjusting the length of the text information to be identified to the input length of the meta network;
passing the length-adjusted text information to the meta network as input, and generating, by the meta network, the set of unique filters corresponding to the text information, the unique filters being filters that depend on the context of the length-adjusted text information;
passing the length-adjusted text information to the unique filters as input, and extracting, by the unique filters, the feature vector matrix corresponding to the text information, each element of the feature vector matrix representing a feature of the text information.
2. The feature extraction method for text information according to claim 1, characterized in that adjusting the length of the text information to be identified to the input length of the meta network comprises:
obtaining the input length of the meta network, and judging whether the length of the text information to be identified reaches the input length;
if not, padding preset characters to the end of the text information to be identified, so as to adjust the length of the text information to be identified to the input length.
3. The feature extraction method for text information according to claim 1 or 2, characterized in that passing the length-adjusted text information to the meta network as input and generating, by the meta network, the set of unique filters corresponding to the text information comprises:
vectorizing the length-adjusted text information to obtain a vector matrix, the vector matrix including several word embedding vectors, all of the same length;
performing a convolution operation on the vector matrix by the meta network to obtain a hidden layer vector of a specified length;
performing a transposed convolution operation on the hidden layer vector to obtain the set of unique filters corresponding to the length-adjusted text information.
4. The feature extraction method for text information according to claim 1 or 2, characterized in that passing the length-adjusted text information to the unique filters as input and extracting, by the unique filters, the feature vector matrix corresponding to the text information comprises:
vectorizing the length-adjusted text information to obtain a vector matrix, the vector matrix including several word embedding vectors, all of the same length;
performing a convolution operation on the vector matrix by the unique filters to extract the feature map corresponding to the text information;
performing a pooling operation on the feature map, and extracting the maximum value of each row of the feature map as a main feature, to obtain the feature vector matrix corresponding to the text information.
5. The feature extraction method for text information according to claim 1 or 2, characterized in that, after the feature vector matrix corresponding to the text information is extracted by the unique filters, the method further comprises:
passing the feature vector matrix as input to a fully connected layer, and then passing the output of the fully connected layer as input to a preset Softmax classifier;
obtaining the category corresponding to the text information from the output of the Softmax classifier.
6. A feature extraction device for text information, characterized by comprising:
a training module, configured to set up and train a meta network, the meta network being a network for generating a set of unique filters corresponding to input text information;
an information acquisition module, configured to obtain text information to be identified;
a length adjustment module, configured to adjust the length of the text information to be identified to the input length of the meta network;
a filter generation module, configured to pass the length-adjusted text information to the meta network as input and to generate, by the meta network, the set of unique filters corresponding to the text information, the unique filters being filters that depend on the context of the length-adjusted text information;
a feature extraction module, configured to pass the length-adjusted text information to the unique filters as input and to extract, by the unique filters, the feature vector matrix corresponding to the text information, each element of the feature vector matrix representing a feature of the text information.
7. The feature extraction device for text information according to claim 6, characterized in that the length adjustment module comprises:
a length acquisition unit, configured to obtain the input length of the meta network and to judge whether the length of the text information to be identified reaches the input length;
a length adjustment unit, configured to, if the length of the text information to be identified does not reach the input length, pad preset characters to the end of the text information to be identified, so as to adjust the length of the text information to be identified to the input length.
8. The feature extraction device for text information according to claim 6 or 7, characterized in that the filter generation module comprises:
a first vector unit, configured to vectorize the length-adjusted text information to obtain a vector matrix, the vector matrix including several word embedding vectors, all of the same length;
a first convolution unit, configured to perform a convolution operation on the vector matrix by the meta network to obtain a hidden layer vector of a specified length;
a transposed convolution unit, configured to perform a transposed convolution operation on the hidden layer vector to obtain the set of unique filters corresponding to the length-adjusted text information.
9. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the feature extraction method for text information according to any one of claims 1 to 5.
10. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the feature extraction method for text information according to any one of claims 1 to 5.
CN201910168231.6A 2019-03-06 2019-03-06 Feature extraction method and device of text information, computer equipment and storage medium Active CN110020431B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910168231.6A CN110020431B (en) 2019-03-06 2019-03-06 Feature extraction method and device of text information, computer equipment and storage medium
PCT/CN2019/117424 WO2020177378A1 (en) 2019-03-06 2019-11-12 Text information feature extraction method and device, computer apparatus, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910168231.6A CN110020431B (en) 2019-03-06 2019-03-06 Feature extraction method and device of text information, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110020431A true CN110020431A (en) 2019-07-16
CN110020431B CN110020431B (en) 2023-07-18

Family

ID=67189329

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910168231.6A Active CN110020431B (en) 2019-03-06 2019-03-06 Feature extraction method and device of text information, computer equipment and storage medium

Country Status (2)

Country Link
CN (1) CN110020431B (en)
WO (1) WO2020177378A1 (en)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102073704A (en) * 2010-12-24 2011-05-25 华为终端有限公司 Text classification processing method, system and equipment
CN102541958A (en) * 2010-12-30 2012-07-04 百度在线网络技术(北京)有限公司 Method, device and computer equipment for identifying short text category information
CN105404899A (en) * 2015-12-02 2016-03-16 华东师范大学 Image classification method based on multi-directional context information and sparse coding model
US9659248B1 (en) * 2016-01-19 2017-05-23 International Business Machines Corporation Machine learning and training a computer-implemented neural network to retrieve semantically equivalent questions using hybrid in-memory representations
CN107066553A (en) * 2017-03-24 2017-08-18 北京工业大学 A kind of short text classification method based on convolutional neural networks and random forest
CN107169035A (en) * 2017-04-19 2017-09-15 华南理工大学 A kind of file classification method for mixing shot and long term memory network and convolutional neural networks
CN107301246A (en) * 2017-07-14 2017-10-27 河北工业大学 Chinese Text Categorization based on ultra-deep convolutional neural networks structural model
CN107766324A (en) * 2017-09-25 2018-03-06 浙江大学 A kind of text coherence analysis method based on deep neural network
US20180189559A1 (en) * 2016-12-29 2018-07-05 Ncsoft Corporation Apparatus and method for detecting debatable document
CN108536678A (en) * 2018-04-12 2018-09-14 腾讯科技(深圳)有限公司 Text key message extracting method, device, computer equipment and storage medium
US20180329982A1 (en) * 2017-05-09 2018-11-15 Apple Inc. Context-aware ranking of intelligent response suggestions
CN109299262A (en) * 2018-10-09 2019-02-01 中山大学 A kind of text implication relation recognition methods for merging more granular informations

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10163022B1 (en) * 2017-06-22 2018-12-25 StradVision, Inc. Method for learning text recognition, method for recognizing text using the same, and apparatus for learning text recognition, apparatus for recognizing text using the same
CN107797985B (en) * 2017-09-27 2022-02-25 百度在线网络技术(北京)有限公司 Method and device for establishing synonymous identification model and identifying synonymous text
CN108763319B (en) * 2018-04-28 2022-02-08 中国科学院自动化研究所 Social robot detection method and system fusing user behaviors and text information
CN110020431B (en) * 2019-03-06 2023-07-18 平安科技(深圳)有限公司 Feature extraction method and device of text information, computer equipment and storage medium

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
DINGHAN SHEN ET AL.: "Learning Context-Aware Convolutional Filters for Text Processing", 《ARXIV:1709.08294V2》, 29 August 2018 (2018-08-29), pages 1 - 10 *
WU QIONG ET AL.: "Sentiment classification technique based on multi-scale convolutional recurrent neural networks", 《Journal of Huaqiao University (Natural Science)》, vol. 38, no. 6, 30 November 2017 (2017-11-30), pages 875 - 879 *
LAI WENHUI ET AL.: "Spam SMS recognition method based on word vectors and convolutional neural networks", Journal of Computer Applications, no. 09, pages 27 - 34 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020177378A1 (en) * 2019-03-06 2020-09-10 Ping An Technology (Shenzhen) Co., Ltd. Text information feature extraction method and device, computer apparatus, and storage medium
CN110889290A (en) * 2019-11-13 2020-03-17 Beijing University of Posts and Telecommunications Text encoding method and apparatus, text encoding validity checking method and apparatus
CN110889290B (en) * 2019-11-13 2021-11-16 Beijing University of Posts and Telecommunications Text encoding method and apparatus, text encoding validity checking method and apparatus
CN116401381A (en) * 2023-06-07 2023-07-07 Digital China Health Technologies Co., Ltd. Method and device for accelerating extraction of medical relations
CN116401381B (en) * 2023-06-07 2023-08-04 Digital China Health Technologies Co., Ltd. Method and device for accelerating extraction of medical relations

Also Published As

Publication number Publication date
WO2020177378A1 (en) 2020-09-10
CN110020431B (en) 2023-07-18

Similar Documents

Publication Publication Date Title
Goyal et al. Deep learning for natural language processing
US11816439B2 (en) Multi-turn dialogue response generation with template generation
US10055391B2 (en) Method and apparatus for forming a structured document from unstructured information
CN110929515B (en) Reading understanding method and system based on cooperative attention and adaptive adjustment
CN111368996A (en) Retraining projection network capable of delivering natural language representation
CN110020431A (en) Feature extraction method and device of text information, computer equipment and storage medium
CN108733837A (en) Natural language structuring method and device for medical record text
CN112328900A (en) Deep learning recommendation method integrating scoring matrix and comment text
CN110688834B (en) Method and equipment for carrying out intelligent manuscript style rewriting based on deep learning model
CN112257841A (en) Data processing method, device and equipment in graph neural network and storage medium
CN113408706B (en) Method and device for training user interest mining model and user interest mining
Sadr et al. Convolutional neural network equipped with attention mechanism and transfer learning for enhancing performance of sentiment analysis
CN111625715A (en) Information extraction method and device, electronic equipment and storage medium
CN112632256A (en) Information query method and device based on question-answering system, computer equipment and medium
CN112699310A (en) Cold start cross-domain hybrid recommendation method and system based on deep neural network
Sugomori Java Deep Learning Essentials
CN113127604B (en) Comment text-based fine-grained item recommendation method and system
CN114996486A (en) Data recommendation method and device, server and storage medium
CN112632296A (en) Knowledge graph-based paper recommendation method and system with interpretability and terminal
CN116797248A (en) Data traceability management method and system based on block chain
CN109635303B (en) Method for recognizing meaning-changing words in specific field
Malakan et al. Vision transformer based model for describing a set of images as a story
CN114610989B (en) Personalized thesis recommendation method and system based on heterogeneous graph dynamic information compensation
CN114881038A (en) Chinese entity and relation extraction method and device based on span and attention mechanism
CN114861610A (en) Title generation method and device, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant