CN116070638A - Training updating method and system for Chinese sentence feature construction - Google Patents


Info

Publication number
CN116070638A
Authority
CN
China
Prior art keywords
feature
training
information
increment
information increment
Prior art date
Legal status
Granted
Application number
CN202310001746.3A
Other languages
Chinese (zh)
Other versions
CN116070638B (en)
Inventor
杜浩鹏
徐圣兵
王振友
谢锐
吴宇佳
Current Assignee
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date
Filing date
Publication date
Application filed by Guangdong University of Technology
Priority to CN202310001746.3A
Publication of CN116070638A
Application granted
Publication of CN116070638B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a training updating method and system for Chinese sentence feature construction. The method comprises the following steps: extracting features from a training set and an application set to obtain their respective feature matrices; taking the mean of the summed differences between the feature vector of every word and the feature vector of the current word as the information increment of the current word, and constructing an information increment model based on this increment; performing fine-tuning training of the information increment factor on the feature matrix of the training set to obtain a trained information increment model; and applying the trained information increment model to update the features of the application set, obtaining a new application-set feature matrix. The system comprises a feature extraction module, a model construction module, a fine-tuning training module and an application updating module. The invention can dynamically update the feature vector of a word and improve the accuracy with which new senses of a word are expressed in different sentences. The training updating method and system for Chinese sentence feature construction can be widely applied in the field of natural language processing.

Description

Training updating method and system for Chinese sentence feature construction
Technical Field
The invention relates to the technical field of natural language processing, in particular to a training updating method and system for Chinese sentence feature construction.
Background
Natural language processing is widely used in communication, spoken dialogue, document processing and many other aspects of daily life. With the development of technology, natural language processing has become an important technique for connecting human communication with computer data, and an important means of enabling humans to communicate directly with machines. How to make machines serve humans more precisely remains a constant concern of researchers.
Natural language processing today is based on pre-trained models. In processing text, the most important steps are feature extraction and representation. In Chinese, the meaning of a word varies with context, and sometimes varies over time and by region. Because a pre-trained model performs unsupervised learning over a large corpus, it forms a fixed feature-vector library: the feature vector obtained for each word is fixed, and when words are combined into sentences for feature extraction, the words are treated as independent of one another. When some words of one Chinese sentence are recombined into another Chinese sentence, the feature vectors of those words are still represented by the same fixed vectors. As time passes, certain words acquire new senses in different sentences; directly applying pre-trained feature extraction to such words causes the sentence to lose part of its original information, and the new senses cannot be captured accurately.
Disclosure of Invention
In order to solve the technical problems, the invention aims to provide a training updating method and a training updating system for the feature construction of Chinese sentences, which can dynamically update the feature vectors of words and improve the accuracy of new semantic expression of the words in different sentences.
The first technical scheme adopted by the invention is as follows: a training updating method for Chinese sentence characteristic construction comprises the following steps:
extracting features of the training set and the application set to obtain respective feature matrixes;
taking the mean of the summed differences between the feature vector of each word and the feature vector of the current word as the information increment of the current word, and constructing an information increment model based on the information increment;
performing fine tuning training on the information increment factors in the information increment model based on the feature matrix of the training set to obtain the trained information increment factors;
and updating the feature matrix of the application set by using the information increment model of the trained increment factor to obtain an updated application feature matrix.
Further, the step of extracting features from the training set and the application set to obtain respective feature matrices specifically includes:
splitting the Chinese character vectors in the training set and the application set according to the Chinese characters to obtain Chinese character coding vectors;
according to the coding number of the Chinese character coding vector, carrying out feature extraction on each word in the training set and the application set sentences by utilizing a pre-training feature word library to obtain a feature vector of each word;
and superposing the feature vector of each word in the training set and the application set according to the coding number to obtain feature matrixes of the training set and the application set.
Through this preferred step, the feature information of a sentence is represented as a matrix, the feature information of each word as a row vector, and each feature dimension of the words as a column vector. This representation expresses the feature information of a sentence clearly, so that the subsequent feature update neither introduces redundant, ambiguous information nor loses part of the information.
Further, in the step of taking the mean of the summed differences between the feature vector of each word and the feature vector of the current word as the information increment of the current word, and constructing an information increment model based on the information increment, the expressions are as follows:
The information increment is:
Δx_i = (1/m) Σ_{l=1}^{m} (α_l − α_i)
where α_i = (x_i1 x_i2 … x_iK) represents the feature vector of the Chinese character b_i (i = 1, 2, …, m), m represents the number of Chinese characters in the sentence, K represents the number of features of the current Chinese character b_i, and α_l (l = 1, 2, …, m) represents the feature vector of any Chinese character in the sentence.
The information increment model is:
α_i′ = α_i + λΔx_i
X_t′ = X_t + λΔX_t
where α_i is the feature vector of a word, λ is the information increment factor, Δx_i is the information increment of the word, α_i′ is the updated feature vector of the word after applying the information increment model, X_t is the feature matrix of the sentence, ΔX_t is the feature-information increment matrix of the sentence, X_t′ is the updated feature matrix of the sentence after applying the information increment model, and
ΔX_t = (Δx_1ᵀ Δx_2ᵀ … Δx_mᵀ)ᵀ, i.e. the m × K matrix whose i-th row is Δx_i.
through the optimization step, the feature vector of each word is updated in an increment, and then the feature matrix of the sentence is updated.
Further, the step of performing fine-tuning training on the information increment factor in the information increment model based on the feature matrix of the training set, to obtain the trained information increment factor, specifically comprises:
setting an initial value and a threshold for the information increment factor, an initial value of the accuracy, and an increment step;
updating the feature matrix of the training set with the information increment model to obtain an updated training feature matrix;
performing sample prediction on the new training feature matrix based on the fine-tuning function to obtain predicted labels;
comparing the predicted labels with the correct labels of the training set and calculating the accuracy;
fine-tuning the information increment factor in the information increment model to obtain a fine-tuned information increment model;
and repeating the steps of feature-matrix updating, sample prediction, accuracy calculation and increment-factor fine-tuning until the accuracy and the information increment factor satisfy the stopping conditions, obtaining the trained information increment factor.
Through this preferred step, a trained information increment factor that maximizes the accuracy is obtained; the trained information increment model constructed with this factor improves the accuracy of new-sense expression when the application-set features are used.
Further, in the step of updating the feature matrix of the application set with the information increment model using the trained increment factor to obtain an updated application feature matrix, the new application feature matrix is expressed as:
X_s′ = X_s + λ*ΔX_s
where X_s′ is the new application feature matrix, X_s is the feature matrix of the application set, λ* is the trained information increment factor, and ΔX_s is the information increment matrix of the application set.
Through this preferred step, real-time updating of the application-set features is achieved, and new senses are captured accurately.
The second technical scheme adopted by the invention is as follows: a training updating system for Chinese sentence feature construction, comprising:
the feature extraction module is used for extracting features of the training set and the application set to obtain respective feature matrixes;
the model construction module is used for taking the mean of the summed differences between the feature vector of each word and the feature vector of the current word as the information increment of the current word, to obtain an information increment model;
the fine tuning training module is used for carrying out fine tuning training on the information increment factors in the information increment model based on the feature matrix of the training set to obtain the trained information increment factors;
and the application updating module is used for updating the feature matrix of the application set by using the information increment model of the trained increment factor to obtain a new application feature matrix.
The method and the system have the following beneficial effects: the invention proposes an information increment matrix and an information increment model, which construct and fuse increments for words appearing in different sentences, dynamically updating the word vectors and capturing new information. The update yields a more accurate sentence feature representation, so that each word expresses its precise meaning in different contexts, and the accuracy of subsequent natural language processing tasks is improved.
Drawings
FIG. 1 is a flow chart showing the steps of a training update method for Chinese sentence feature construction according to the present invention;
FIG. 2 is a block diagram of a training update system for Chinese sentence feature construction according to the present invention;
FIG. 3 is a fine-tuning training flowchart of a first embodiment of the training updating method of Chinese sentence feature construction according to the present invention;
FIG. 4 is a fine-tuning training flowchart of a second embodiment of the training updating method of Chinese sentence feature construction according to the present invention.
Detailed Description
The invention will now be described in further detail with reference to the drawings and specific embodiments. The step numbers in the following embodiments are set for convenience of illustration only; the order between the steps is not limited in any way, and the execution order of the steps in the embodiments may be adaptively adjusted as understood by those skilled in the art.
Referring to fig. 1, the invention provides a training updating method for Chinese sentence feature construction, which comprises the following steps:
s1, acquiring a training set, a pre-training characteristic word library, an application set and a fine-tuning technical function of an existing pre-training model;
pre-training feature word stock: the character library matrix E comprises M character feature vectors, wherein the length of each feature vector is K.
Figure BDA0004035216650000041
Training set: contains n sentences { S } 1 S 2 … S n Each sentence corresponds to a length { m } 1 m 2 … m n -correct label Y { Y }, corresponding thereto 1 y 2 … y n }. Since the lengths of the sentences are different, it is necessary to perform 0-compensating or cutting operations for each sentence so that the lengths are identical, and the length is set to be m. If the training sets of different tasks are different, or a new training set is added, corresponding updating and retraining are carried out.
Application set: and according to the set given by the natural language processing task to be completed, a plurality of sentences are contained in the set.
Fine tuning the technical function: the downstream task technology of the pre-training model, the fine tuning technology according to different natural language processing tasks, is also different, and the process is described as a function, and the expression of the function is as follows:
{y 1 ′ y 2 ′ … y n ′}=f{X 1 X 2 … X n }
s2, extracting features of a training set and an application set based on a pre-training feature word stock to obtain respective feature matrixes;
s2.1 Chinese sentence S in given training set t T=1, 2, …, n, split Chinese character vector according to Chinese character
Figure BDA0004035216650000042
Wherein m represents S t Number of Chinese characters after cutting or supplementing 0, c i Representing the i-th Chinese character, i=1, 2, …, m, obtaining Chinese character coding vector, and using S for the Chinese character coding vector b Indicating (I)>
Figure BDA0004035216650000043
S2.2, extracting each character feature of the sentence by utilizing the pre-training feature word library according to the coding number of the coding vector, ifThe number of the ith word is b i B of word stock matrix i The lines being the eigenvectors of the word, i.e
Figure BDA0004035216650000044
X t Expressed as sentence S t Is m x K, x ij Representing Chinese character b i The j-th eigenvalue (i=1, 2, …, m, j=1, 2, …, K) of the training set chinese sentence S t Is expressed as: />
Figure BDA0004035216650000051
S2.3, performing the same feature extraction on the application set as on the training set to obtain the feature matrix of the application set.
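By way of illustration only, steps S2.1 and S2.2 can be sketched as follows; the word-stock matrix E, the character-to-number mapping `vocab`, and all values are invented placeholders, not the patent's actual pre-trained word stock:

```python
import numpy as np

def extract_features(sentence, vocab, E, m):
    """S2.1: split a Chinese sentence into characters and map each to its
    coding number b_i; S2.2: take row b_i of the word-stock matrix E (M x K)
    as the character's feature vector, truncating or zero-padding so that
    every sentence yields an m x K feature matrix X_t."""
    codes = [vocab[c] for c in list(sentence)[:m]]   # coding vector S_b, cut to m
    K = E.shape[1]
    X = E[codes] if codes else np.empty((0, K))      # stack the rows alpha_i
    pad = np.zeros((m - X.shape[0], K))              # zero-padding to length m
    return np.vstack([X, pad])

# Toy word stock: M = 4 characters, K = 3 features (random placeholder rows).
rng = np.random.default_rng(0)
E = rng.normal(size=(4, 3))
vocab = {"我": 0, "爱": 1, "你": 2, "。": 3}
X_t = extract_features("我爱你", vocab, E, m=5)      # 3 real rows + 2 zero rows
```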
S3, taking the mean of the summed differences between the feature vector of each word and the feature vector of the current word as the information increment of the current word, and constructing an information increment model based on the information increment;
Define α_i = (x_i1 x_i2 … x_iK) as the feature vector of the Chinese character b_i (i = 1, 2, …, m), so that
X_t = (α_1ᵀ α_2ᵀ … α_mᵀ)ᵀ, the m × K matrix whose i-th row is α_i.
Define
β_j = (x_1j x_2j … x_mj)ᵀ
representing the j-th feature of sentence S_t (j = 1, 2, …, K); then X_t = (β_1 β_2 … β_K).
The information increment of each word is defined as:
Δx_i = (1/m) Σ_{l=1}^{m} (α_l − α_i)
The information increment model is constructed as:
α_i′ = α_i + λΔx_i
X_t′ = X_t + λΔX_t
where α_i is the feature vector of a word, m represents the number of Chinese characters in the sentence, α_l (l = 1, 2, …, m) represents the feature vector of any Chinese character in the sentence, λ is the information increment factor, Δx_i is the information increment of the word, α_i′ is the updated feature vector of the word after applying the information increment model, X_t is the feature matrix of the sentence, ΔX_t is the feature-information increment matrix of the sentence, X_t′ is the updated feature matrix of the sentence after applying the information increment model, and
ΔX_t = (Δx_1ᵀ Δx_2ᵀ … Δx_mᵀ)ᵀ
Applying the information increment model to all sentences of the training set gives the new feature representation {X_1′, X_2′, …, X_n′} of each sentence in the training set.
S4, performing fine adjustment training on the information increment factors in the information increment model based on the feature matrix of the training set to obtain the trained information increment factors;
setting an initial value, a threshold value, an initial value of accuracy and an increment value of the information increment factor;
updating the feature matrix of the training set by using the information increment model to obtain an updated training feature matrix;
sample prediction is carried out on the new training feature matrix based on the fine tuning technology function, and a prediction label is obtained;
comparing the predicted label with the correct label of the training set, and calculating the accuracy;
finely adjusting the information increment factors in the information increment model to obtain a finely adjusted information increment model;
and (3) circulating the steps of feature matrix updating, sample prediction, accuracy calculation and information increment factor fine adjustment until the accuracy and the information increment factor meet the conditions, and obtaining the trained information increment factor.
A first embodiment of the judgment conditions is shown in FIG. 3:
setting the initial value of the information increment factor to 0, the maximum threshold to λ_max, the initial maximum accuracy to Acc_max, and the increment step to Δλ;
updating the feature matrix of each sentence of the training set with the information increment model to obtain the new feature representation set {X_1′, X_2′, …, X_n′};
performing sample prediction on the new training feature matrices based on the fine-tuning function {y_1′, y_2′, …, y_n′} = f{X_1, X_2, …, X_n} to obtain the predicted labels Y′ = {y_1′, y_2′, …, y_n′};
comparing the predicted labels with the correct labels of the training set and calculating the accuracy, whose expression is:
Acc = n_correct / n
where n_correct is the number of correctly predicted labels and n represents the total number of labels in the training set;
judging whether the accuracy is greater than the current maximum accuracy. If so, replacing the current maximum accuracy with this accuracy and then judging the information increment factor: if the factor is greater than or equal to the maximum threshold, outputting it as the trained information increment factor; otherwise, adding one increment step to it. If the accuracy is not greater than the current maximum accuracy, judging the information increment factor directly: if the factor is greater than or equal to the maximum threshold, outputting it as the trained information increment factor; otherwise, adding one increment step to it;
substituting the incremented information increment factor into the information increment model in place of the original factor to obtain a fine-tuned information increment model;
and repeating the above steps until the trained information increment factor is obtained.
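A hedged sketch of this first embodiment's search loop follows; `predict_fn` stands in for the task-specific fine-tuning function f, and all names and values are assumptions of this sketch rather than the patent's implementation:

```python
import numpy as np

def increment_update(X, lam):
    """X' = X + lam * DeltaX, where row i of DeltaX is mean_l(alpha_l) - alpha_i."""
    return X + lam * (X.mean(axis=0) - X)

def tune_increment_factor(sentences, labels, predict_fn, lam_max=1.0, d_lam=0.1):
    """Sweep the increment factor lambda from 0 to lam_max in steps of d_lam,
    tracking the highest accuracy seen; return the best lambda and accuracy."""
    best_lam, acc_max = 0.0, -1.0
    lam = 0.0
    while lam <= lam_max + 1e-9:                      # stop once lambda exceeds lam_max
        preds = [predict_fn(increment_update(X, lam)) for X in sentences]
        acc = float(np.mean([p == y for p, y in zip(preds, labels)]))
        if acc > acc_max:                             # keep the best accuracy so far
            acc_max, best_lam = acc, lam
        lam += d_lam
    return best_lam, acc_max

# Toy downstream "task": predict label 1 iff the mean feature of the matrix is >= 0.
sentences = [np.array([[1.0, -1.0], [2.0, 0.0]]),
             np.array([[-1.0, -2.0], [0.0, -1.0]])]
labels = [1, 0]
best_lam, best_acc = tune_increment_factor(
    sentences, labels, predict_fn=lambda X: int(X.mean() >= 0.0))
```

Because the increment update preserves each sentence's mean feature vector, this toy predictor is insensitive to λ; a real downstream classifier would not be.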
A second embodiment of the judgment conditions is shown in FIG. 4:
setting the initial value of the information increment factor to 0, the maximum threshold to λ_max, the initial maximum accuracy to Acc_max, and the increment step to Δλ;
updating the feature matrix of each sentence of the training set with the information increment model to obtain the new feature representation set {X_1′, X_2′, …, X_n′};
performing sample prediction on the new training feature matrices based on the fine-tuning function {y_1′, y_2′, …, y_n′} = f{X_1, X_2, …, X_n} to obtain the predicted labels Y′ = {y_1′, y_2′, …, y_n′};
comparing the predicted labels with the correct labels of the training set and calculating the accuracy, whose expression is:
Acc = n_correct / n
where n_correct is the number of correctly predicted labels and n represents the total number of labels in the training set;
judging whether the information increment factor is greater than or equal to the maximum threshold: if so, outputting it as the trained information increment factor; if not, judging whether the accuracy is greater than or equal to the current maximum accuracy, and if so, replacing the current maximum accuracy with this accuracy and adding one increment step to the information increment factor; if that condition is not met, adding one increment step to the information increment factor directly, leaving the current maximum accuracy unchanged;
substituting the incremented information increment factor into the information increment model in place of the original factor to obtain a fine-tuned information increment model;
and repeating the above steps until the trained information increment factor is obtained.
S5, updating the feature matrix of the application set with the information increment model using the trained increment factor to obtain a new application feature matrix;
the new application feature matrix is expressed as:
X_s′ = X_s + λ*ΔX_s
where X_s′ is the new application feature matrix, X_s is the feature matrix of the application set, λ* is the trained information increment factor, and ΔX_s is the information increment matrix of the application set.
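For illustration, step S5 reduces to a single broadcast update; λ* = 0.3 and the matrix values below are invented for this sketch:

```python
import numpy as np

# Step S5: X_s' = X_s + lambda* . DeltaX_s, using the trained factor lambda*.
lam_star = 0.3                              # made-up trained increment factor
X_s = np.array([[0.0, 2.0],
                [4.0, 0.0]])                # application-set sentence (m=2, K=2)
delta_X_s = X_s.mean(axis=0) - X_s          # row i: (1/m) sum_l (alpha_l - alpha_i)
X_s_new = X_s + lam_star * delta_X_s        # updated application feature matrix
```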
As shown in FIG. 2, the present invention provides a training update system for Chinese sentence feature construction, the system comprising:
the feature extraction module is used for extracting features of the training set and the application set to obtain respective feature matrixes;
the model construction module is used for taking the mean of the summed differences between the feature vector of each word and the feature vector of the current word as the information increment of the current word, to obtain an information increment model;
the fine tuning training module is used for carrying out fine tuning training on the information increment factors in the information increment model based on the feature matrix of the training set to obtain the information increment factors after training;
and the application updating module is used for updating the feature matrix of the application set by using the information increment model of the trained increment factor to obtain a new application feature matrix.
The content of the method embodiment is applicable to the system embodiment; the functions specifically realized by the system embodiment are the same as those of the method embodiment, and the beneficial effects achieved are the same as those of the method embodiment.
While the preferred embodiment of the present invention has been described in detail, the invention is not limited to the embodiment, and various equivalent modifications and substitutions can be made by those skilled in the art without departing from the spirit of the invention, and these modifications and substitutions are intended to be included in the scope of the present invention as defined in the appended claims.

Claims (8)

1. A training updating method for Chinese sentence feature construction, characterized by comprising the following steps:
extracting features of the training set and the application set to obtain respective feature matrixes;
taking the mean of the summed differences between the feature vector of each word and the feature vector of the current word as the information increment of the current word, and constructing an information increment model based on the information increment;
performing fine tuning training on the information increment factors in the information increment model based on the feature matrix of the training set to obtain the trained information increment factors;
and updating the feature matrix of the application set by using the information increment model of the trained increment factor to obtain an updated application feature matrix.
2. The training updating method of Chinese sentence feature construction according to claim 1, wherein the step of extracting features from the training set and the application set to obtain respective feature matrices comprises:
splitting the Chinese character vectors in the training set and the application set according to the Chinese characters to obtain Chinese character coding vectors;
according to the coding number of the Chinese character coding vector, carrying out feature extraction on each word of the sentence of the training set and the application set by utilizing a pre-training feature word library to obtain a feature vector of each word;
and superposing the feature vector of each word in the training set and the application set according to the coding number to obtain feature matrixes of the training set and the application set.
3. The training updating method of Chinese sentence feature construction according to claim 1, wherein the expression of the information increment is as follows:
Δx_i = (1/m) Σ_{l=1}^{m} (α_l − α_i)
where α_i = (x_i1 x_i2 … x_iK) represents the feature vector of the current Chinese character b_i (i = 1, 2, …, m), m represents the number of Chinese characters in the sentence, K represents the number of features of the current Chinese character b_i, and α_l (l = 1, 2, …, m) represents the feature vector of any Chinese character in the sentence.
4. The training updating method of Chinese sentence feature construction according to claim 1, wherein the expression of the information increment model is:
α_i′ = α_i + λΔx_i
X_t′ = X_t + λΔX_t
where α_i is the feature vector of a word, λ is the information increment factor, Δx_i is the information increment of the word, α_i′ is the updated feature vector of the word after applying the information increment model, X_t is the feature matrix of the sentence, ΔX_t is the feature-information increment matrix of the sentence, X_t′ is the updated feature matrix of the sentence after applying the information increment model, and
ΔX_t = (Δx_1ᵀ Δx_2ᵀ … Δx_mᵀ)ᵀ
5. the method for updating the feature structure of a chinese sentence according to claim 1, wherein the step of performing fine tuning training on the information increment factor in the information increment model based on the feature matrix of the training set to obtain the trained information increment factor specifically comprises:
setting an initial value, a threshold value, an initial value of accuracy and an increment value of the information increment factor;
updating the feature matrix of the training set by using the information increment model to obtain an updated training feature matrix;
sample prediction is carried out on the new training feature matrix based on the fine tuning technology function, and a prediction label is obtained;
comparing the predicted label with the correct label of the training set, and calculating the accuracy;
finely adjusting the information increment factors in the information increment model to obtain a finely adjusted information increment model;
and (3) circulating the steps of feature matrix updating, sample prediction, accuracy calculation and information increment factor fine adjustment until the accuracy and the information increment factor meet the conditions, and obtaining the trained information increment factor.
6. The training updating method of Chinese sentence feature construction according to claim 5, wherein the accuracy is expressed as follows:
Acc = n_correct / n
where n_correct is the number of correctly predicted labels among the correct labels of the training set, and n represents the total number of labels in the training set.
7. The training updating method of Chinese sentence feature construction according to claim 1, wherein the step of updating the feature matrix of the application set by the information increment model applying the trained increment factor to obtain an updated application feature matrix is expressed by the following formula:

X_s′ = X_s + λ*ΔX_s

wherein X_s′ is the updated application feature matrix, X_s is the feature matrix of the application set, λ* is the trained information increment factor, and ΔX_s is the information increment matrix of the application set.
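The application-set update of claim 7 is an element-wise operation once ΔX_s is available; a minimal sketch, assuming the matrices are lists of equal-length rows (names and shapes are illustrative assumptions):

```python
def apply_trained_factor(X_s, delta_X_s, lam_star):
    """Claim 7 update X_s' = X_s + lam_star * Delta_X_s, applied element-wise."""
    return [[x + lam_star * d for x, d in zip(row_x, row_d)]
            for row_x, row_d in zip(X_s, delta_X_s)]
```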
8. A training updating system for Chinese sentence feature construction, comprising:
the feature extraction module is used for extracting features of the training set and the application set to obtain respective feature matrixes;
the model construction module is used for taking the average value of the sum vector of the differences between the feature vector of each word and the feature vector of the current word as the increment information of the current word to obtain an information increment model;
the fine tuning training module is used for carrying out fine tuning training on the information increment factors in the information increment model based on the feature matrix of the training set to obtain the trained information increment factors;
and the application updating module is used for updating the feature matrix of the application set by using the information increment model with the trained increment factor to obtain an updated application feature matrix.
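The four modules of the system claim map naturally onto one class. The skeleton below is purely illustrative: the class and method names, the toy bag-of-characters feature extractor, and the small λ grid are assumptions, not the patented implementation; only the module responsibilities follow claim 8.

```python
class TrainingUpdateSystem:
    """Illustrative skeleton of the claimed four-module system."""

    def __init__(self, lam=0.1):
        self.lam = lam  # information increment factor

    def extract_features(self, sentences):
        # feature extraction module (toy bag-of-characters stand-in)
        vocab = sorted({ch for s in sentences for ch in s})
        return [[s.count(ch) for ch in vocab] for s in sentences]

    def increment_model(self, X):
        # model construction module: alpha_i' = alpha_i + lam * (mean - alpha_i)
        n = len(X)
        mean = [sum(col) / n for col in zip(*X)]
        return [[a + self.lam * (m - a) for a, m in zip(row, mean)] for row in X]

    def fine_tune(self, X, y, predict, grid=(0.0, 0.1, 0.2)):
        # fine-tuning training module: keep the factor with best training accuracy
        best = max(grid, key=lambda lam: sum(
            p == t for p, t in zip(
                [predict(r) for r in TrainingUpdateSystem(lam).increment_model(X)],
                y)))
        self.lam = best
        return best

    def update_application(self, X_s):
        # application updating module: reuse the trained factor on the application set
        return self.increment_model(X_s)
```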
CN202310001746.3A 2023-01-03 2023-01-03 Training updating method and system for Chinese sentence feature construction Active CN116070638B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310001746.3A CN116070638B (en) 2023-01-03 2023-01-03 Training updating method and system for Chinese sentence feature construction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310001746.3A CN116070638B (en) 2023-01-03 2023-01-03 Training updating method and system for Chinese sentence feature construction

Publications (2)

Publication Number Publication Date
CN116070638A true CN116070638A (en) 2023-05-05
CN116070638B CN116070638B (en) 2023-09-08

Family

ID=86178054

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310001746.3A Active CN116070638B (en) 2023-01-03 2023-01-03 Training updating method and system for Chinese sentence feature construction

Country Status (1)

Country Link
CN (1) CN116070638B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104504015A (en) * 2014-12-11 2015-04-08 中国科学院遥感与数字地球研究所 Learning algorithm based on dynamic incremental dictionary update
CN105068996A (en) * 2015-09-21 2015-11-18 哈尔滨工业大学 Chinese participle increment learning method
CN106547735A (en) * 2016-10-25 2017-03-29 复旦大学 The structure and using method of the dynamic word or word vector based on the context-aware of deep learning
US20190325029A1 (en) * 2018-04-18 2019-10-24 HelpShift, Inc. System and methods for processing and interpreting text messages
CN110705302A (en) * 2019-10-11 2020-01-17 掌阅科技股份有限公司 Named entity recognition method, electronic device and computer storage medium
CN111488423A (en) * 2020-03-05 2020-08-04 北京一览群智数据科技有限责任公司 Index data-based natural language processing method and system
CN112464674A (en) * 2020-12-16 2021-03-09 四川长虹电器股份有限公司 Word-level text intention recognition method
CN114398855A (en) * 2022-01-13 2022-04-26 北京快确信息科技有限公司 Text extraction method, system and medium based on fusion pre-training

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
范宇中 (Fan Yuzhong), 张玉峰 (Zhang Yufeng): "A Preliminary Study on Automatic Classification Methods for Textual Knowledge", 情报科学 (Information Science), no. 01, pages 103 - 105 *

Also Published As

Publication number Publication date
CN116070638B (en) 2023-09-08

Similar Documents

Publication Publication Date Title
CN112270379A (en) Training method of classification model, sample classification method, device and equipment
CN111062217B (en) Language information processing method and device, storage medium and electronic equipment
CN110705592A (en) Classification model training method, device, equipment and computer readable storage medium
CN114091460A (en) Multitask Chinese entity naming identification method
CN110598210B (en) Entity recognition model training, entity recognition method, entity recognition device, entity recognition equipment and medium
CN114153971B (en) Error correction recognition and classification equipment for Chinese text containing errors
CN109086265A (en) A kind of semanteme training method, multi-semantic meaning word disambiguation method in short text
CN112084336A (en) Entity extraction and event classification method and device for expressway emergency
CN113486175B (en) Text classification method, text classification device, computer device, and storage medium
Moeng et al. Canonical and surface morphological segmentation for nguni languages
CN111597807B (en) Word segmentation data set generation method, device, equipment and storage medium thereof
CN116258137A (en) Text error correction method, device, equipment and storage medium
CN115827819A (en) Intelligent question and answer processing method and device, electronic equipment and storage medium
CN112016300A (en) Pre-training model processing method, pre-training model processing device, downstream task processing device and storage medium
WO2023134085A1 (en) Question answer prediction method and prediction apparatus, electronic device, and storage medium
CN117115564B (en) Cross-modal concept discovery and reasoning-based image classification method and intelligent terminal
US20220138425A1 (en) Acronym definition network
CN113779994A (en) Element extraction method and device, computer equipment and storage medium
CN113158678A (en) Identification method and device applied to electric power text named entity
CN111368056B (en) Ancient poetry generating method and device
CN113297374A (en) Text classification method based on BERT and word feature fusion
CN112307749A (en) Text error detection method and device, computer equipment and storage medium
CN116070638B (en) Training updating method and system for Chinese sentence feature construction
CN117033961A (en) Multi-mode image-text classification method for context awareness
CN114239575B (en) Statement analysis model construction method, statement analysis method, device, medium and computing equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant