CN115422932A - Word vector training method and device, electronic equipment and storage medium - Google Patents

Word vector training method and device, electronic equipment and storage medium

Info

Publication number
CN115422932A
Authority
CN
China
Prior art keywords
word vector
word
vector
training
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211352892.2A
Other languages
Chinese (zh)
Inventor
Inventor not announced
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Moore Threads Technology Co Ltd
Original Assignee
Moore Threads Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Moore Threads Technology Co Ltd filed Critical Moore Threads Technology Co Ltd
Priority to CN202211352892.2A
Publication of CN115422932A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/12Use of codes for handling textual entities
    • G06F40/126Character encoding
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/237Lexical tools
    • G06F40/242Dictionaries

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

The disclosure relates to the technical field of computers, and discloses a word vector training method and device, electronic equipment and a storage medium, wherein the method comprises the following steps: determining a word vector training text corresponding to a target word, wherein the word vector training text is a dictionary definition text of the target word; and performing word vector training by using a preset word vector model based on the word vector training text, and determining a target word vector corresponding to the target word. The method and the device can accurately generate the target word vector corresponding to the target word, and effectively improve the word vector accuracy of different target words that are similar in form but different in semantics.

Description

Word vector training method and device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a word vector training method and apparatus, an electronic device, and a storage medium.
Background
Converting natural language that humans can understand into numerical vectors that computers can process is a necessary step of natural language processing. This method of representing words with vectors is called word vectors (or word embedding). By establishing a mapping relationship between words and word vectors, words can be represented as vectors that carry semantic information. However, for words that are similar in form but different in semantics, the related art offers no method for accurately training their word vectors.
Disclosure of Invention
The disclosure provides a word vector training method and device, electronic equipment and a storage medium.
According to an aspect of the present disclosure, there is provided a word vector training method, including: determining a word vector training text corresponding to a target word, wherein the word vector training text is a dictionary definition text of the target word; and performing word vector training by using a preset word vector model based on the word vector training text, and determining a target word vector corresponding to the target word.
In a possible implementation manner, the performing word vector training by using a preset word vector model based on the word vector training text to determine a target word vector corresponding to the target word includes: segmenting the dictionary definition text to obtain a plurality of training segmented words; initializing the training segmented words, and determining an initial word vector corresponding to each training segmented word; and inputting the plurality of initial word vectors into the preset word vector model, and performing word vector training by using the preset word vector model to determine the target word vector.
In one possible implementation, before the segmenting of the dictionary definition text, the method further includes: inputting the target word into a word segmentation dictionary under the condition that the target word is a compound word, wherein the word segmentation dictionary is used for performing word segmentation; and the segmenting of the dictionary definition text to obtain a plurality of training segmented words includes: segmenting the dictionary definition text based on the word segmentation dictionary to obtain the plurality of training segmented words.
In one possible implementation, the method further includes: under the condition that the target word is included in the dictionary definition text, taking the target word as a whole as one training segmented word.
In a possible implementation manner, the inputting the plurality of initial word vectors into the preset word vector model, performing word vector training using the preset word vector model, and determining the target word vector includes: sequentially encoding the plurality of initial word vectors by using the preset word vector model to determine a text vector corresponding to the dictionary definition text; decoding the text vector by using the preset word vector model to obtain a predicted word vector corresponding to the target word; determining an error loss between the text vector and the predicted word vector; and performing word vector training by minimizing the error loss to determine the target word vector.
In one possible implementation manner, the number of the plurality of initial word vectors is n, where n is a positive integer; the sequentially encoding the plurality of initial word vectors by using the preset word vector model to determine the text vector corresponding to the dictionary definition text includes: encoding the 1st initial word vector based on the weight corresponding to the 1st initial word vector to obtain a hidden layer vector corresponding to the 1st initial word vector; encoding the ith initial word vector based on the weight corresponding to the ith initial word vector and the hidden layer vector corresponding to the (i-1)th initial word vector to obtain the hidden layer vector corresponding to the ith initial word vector, where i is a positive integer greater than or equal to 2 and less than or equal to n; and encoding the hidden layer vector corresponding to the nth initial word vector based on the encoding weight to obtain the text vector.
In a possible implementation manner, the decoding the text vector by using the preset word vector model to obtain a predicted word vector corresponding to the target word includes: decoding the text vector based on the decoding weight to obtain a decoding vector; and carrying out normalization processing on the decoding vector to obtain the predicted word vector.
In one possible implementation, the method further includes: executing a word vector evaluation task by using the target word vector to obtain an evaluation result; under the condition that the evaluation result does not accord with the preset condition, adjusting the structural parameters of the preset word vector model; and performing word vector training by using the adjusted preset word vector model based on the word vector training text to optimize the target word vector.
In one possible implementation, the word vector evaluation task includes: clustering task and text classification task.
In one possible implementation, the method further includes: and generating a target dictionary based on the target words and the target word vectors corresponding to the target words.
According to an aspect of the present disclosure, there is provided a word vector training apparatus including: the first determining module is used for determining a word vector training text corresponding to a target word, wherein the word vector training text is a dictionary definition text of the target word; and the training module is used for carrying out word vector training by using a preset word vector model based on the word vector training text and determining a target word vector corresponding to the target word.
According to an aspect of the present disclosure, there is provided an electronic device including: a processor; a memory for storing processor-executable instructions; wherein the processor is configured to invoke the memory-stored instructions to perform the above-described method.
According to an aspect of the present disclosure, there is provided a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the above-described method.
In the embodiment of the present disclosure, for different target words with similar forms, the corresponding dictionary definition texts are inevitably different, so that the dictionary definition text of the target word is determined as the word vector training text corresponding to the target word, and further, based on the word vector training text, word vector training is performed by using the preset word vector model, so that the target word vector corresponding to the target word can be accurately generated, and the word vector accuracy of different target words that are similar in form but different in semantics is effectively improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure. Other features and aspects of the present disclosure will become apparent from the following detailed description of exemplary embodiments, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure.
Fig. 1 shows a flow diagram of a word vector training method according to an embodiment of the present disclosure.
Fig. 2 shows a schematic diagram of a preset word vector model according to an embodiment of the present disclosure.
FIG. 3 shows a flow diagram of a method of word vector training in accordance with an embodiment of the present disclosure.
FIG. 4 shows a block diagram of a word vector training apparatus according to an embodiment of the present disclosure.
Fig. 5 shows a block diagram of an electronic device in accordance with an embodiment of the disclosure.
Fig. 6 illustrates a block diagram of an electronic device in accordance with an embodiment of the disclosure.
Detailed Description
Various exemplary embodiments, features and aspects of the present disclosure will be described in detail below with reference to the accompanying drawings. In the drawings, like reference numbers indicate functionally identical or similar elements. While the various aspects of the embodiments are presented in drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
The word "exemplary" is used exclusively herein to mean "serving as an example, embodiment, or illustration. Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
The term "and/or" herein is merely an association relationship describing an associated object, and means that there may be three relationships, for example, a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone. In addition, the term "at least one" herein means any one of a plurality or any combination of at least two of a plurality, for example, including at least one of a, B, and C, and may mean including any one or more elements selected from the group consisting of a, B, and C.
Furthermore, in the following detailed description, numerous specific details are set forth in order to provide a better understanding of the present disclosure. It will be understood by those skilled in the art that the present disclosure may be practiced without some of these specific details. In some instances, methods, means, elements and circuits that are well known to those skilled in the art have not been described in detail so as not to obscure the present disclosure.
In the related art, word vectors are divided into static word vectors and dynamic word vectors. Static word vectors are represented by Word2Vec, and dynamic word vectors are represented by BERT and GPT-3.
Word2Vec has two training methods. The first is called CBOW; its core idea is to mask a word in a sentence and predict the masked word from the text before and after it. The second is called Skip-gram, which is the reverse of CBOW: given an input word, the model is required to predict its context words. The weight parameters the model learns for a word constitute the Word2Vec word vector of that word. Word2Vec is a static word vector algorithm; so-called static means that, after the model is trained on a data set, the word vector of each word is fixed. When the word vector is used subsequently, it is the same regardless of the input sentence. For example, in Chinese, "小米" ("millet") in the sentence "I like eating millet" refers to a food, while "小米" (Xiaomi) in the sentence "Xiaomi makes mobile phones" refers to a mobile phone brand. However, the word vector of "小米" obtained by Word2Vec is fixed, and a more accurate word vector cannot be given according to different contexts. In order to solve this problem of word ambiguity, dynamic word vector algorithms are used.
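In an example, the two Word2Vec training methods described above may be sketched as follows; the gensim library, the toy sentences, and the parameter values are merely illustrative assumptions and are not limited by the present disclosure:

    # Minimal sketch of the CBOW and Skip-gram training modes (gensim assumed).
    from gensim.models import Word2Vec

    sentences = [["i", "like", "eating", "millet"],
                 ["millet", "makes", "mobile", "phones"]]

    # CBOW (sg=0): predict a masked word from its surrounding context words.
    cbow = Word2Vec(sentences, vector_size=300, window=5, min_count=1, sg=0)

    # Skip-gram (sg=1): given an input word, predict its context words.
    skipgram = Word2Vec(sentences, vector_size=300, window=5, min_count=1, sg=1)

    # The learned weight parameters for a word are its static word vector;
    # the same vector is returned regardless of the sentence it appears in.
    vector = cbow.wv["millet"]

Whichever mode is used, the resulting vector is fixed after training, which is precisely the static-word-vector limitation discussed above.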
The idea of a dynamic word vector algorithm is to train a model on a large corpus. When a downstream task needs the word vector of a word, the whole sentence is input into the model, and the output of the model is used as the word vector of the word, so that the word vector contains context information. With a dynamic word vector algorithm, different word vectors can be obtained for the same word under different contexts, thereby solving the problem of word ambiguity. Common models for dynamic word vector algorithms include GPT, ELMo, BERT, and the like.
Proper nouns exist in many fields, and most of them are compound words. For a proper noun formed from a compound word, if Word2Vec is adopted to generate its word vector, in many cases the compound word needs to be segmented first, and then the word vectors of the segmented words are added and averaged to serve as the word vector of the compound word.
For example, if Word2Vec is used to generate a word vector of the proper noun "collective land use certificate", the steps are as follows. First, word segmentation: collective/land/use certificate. Second, the word vector v_1 of the segmented word "collective", the word vector v_2 of the segmented word "land", and the word vector v_3 of the segmented word "use certificate" are calculated by the Word2Vec word vector algorithm. Third, the word vectors of the three segmented words are added and averaged to obtain the word vector v = (v_1 + v_2 + v_3)/3 of the proper noun "collective land use certificate".
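In an example, the averaging step may be sketched as follows; the numerical values of the three word vectors are hypothetical placeholders:

    # Sketch of averaging segmented-word vectors into a compound-word vector.
    import numpy as np

    v1 = np.array([0.12, -0.30, 0.08])   # hypothetical vector of "collective"
    v2 = np.array([0.10, -0.28, 0.05])   # hypothetical vector of "land"
    v3 = np.array([0.31,  0.22, -0.14])  # hypothetical vector of "use certificate"

    # Word vector of the compound proper noun "collective land use certificate".
    v = (v1 + v2 + v3) / 3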
But there are proper nouns whose words resemble each other yet have different semantics. For example, "collective land use certificate" and "national land use certificate" are two proper nouns with different semantics although they differ by only two characters. If Word2Vec is used to generate their word vectors, the two proper nouns are first segmented: "collective land use certificate" is segmented into collective/land/use right/certificate, and "national land use certificate" is segmented into national/land/use right/certificate. The word vectors of the segmented words are then added and averaged. The word vectors of the two proper nouns thus differ only through the two segmented words "collective" and "national". Because the word vectors of these two segmented words are relatively close, and they occupy a low weight in the whole phrase, the word vectors generated for the two proper nouns are very close. When such word vectors are used as features of proper nouns for text classification, the classification effect is often poor because there is no discrimination.
In addition, because such proper nouns usually belong to the same industry field, their contexts differ little in practical applications, so word vectors generated by a dynamic word vector algorithm also have little discrimination, and the effect is often poor when these word vectors are used for fine-grained secondary classification.
In order to solve the technical problem that the word vectors of proper nouns with similar words but different semantics are inaccurate, the present disclosure provides a word vector training method. Each proper noun has its corresponding dictionary definition text, and even if the words are similar, the corresponding dictionary definition texts are naturally different because the semantics are different. Therefore, for proper nouns, generating the corresponding word vectors from their dictionary definition texts can effectively improve the word vector accuracy of proper nouns that are similar in form but different in semantics. The word vector training method provided in the embodiments of the present disclosure is described in detail below.
FIG. 1 shows a flow diagram of a method of word vector training in accordance with an embodiment of the present disclosure. The method may be performed by an electronic device such as a terminal device or a server, where the terminal device may be a User Equipment (UE), a mobile device, a User terminal, a cellular phone, a cordless phone, a Personal Digital Assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, and the like, and the method may be implemented by a processor calling a computer readable instruction stored in a memory. Alternatively, the method may be performed by a server. As shown in fig. 1, the method includes:
in step S11, a word vector training text corresponding to the target word is determined, where the word vector training text is a dictionary definition text of the target word.
A target word in a certain field and the dictionary definition text corresponding to the target word are acquired. The target word here may refer to a defined word for which dictionary definition text exists in the field.
In an example, defined words in a certain field and their corresponding dictionary definition texts may be obtained from data sources such as encyclopedias and Wikipedia; the data sources and the specific field are not limited by the present disclosure.
For example, based on encyclopedia, defined words in the computer domain and their corresponding dictionary definition text in encyclopedia are obtained.
For any target word (defined word), determining the corresponding dictionary definition text as the word vector training text corresponding to the target word, that is, using the dictionary definition text corresponding to the target word to train the model to obtain the corresponding target word vector of the target word.
In step S12, based on the word vector training text, word vector training is performed by using a preset word vector model, and a target word vector corresponding to the target word is determined.
The preset word vector model may be a conventional language model, such as a Seq2Seq model, an LSTM model, a GPT model, etc., or may be a pre-trained model pre-trained on a general data set, which is not limited in this disclosure.
Word vector training is performed on the preset word vector model by using the dictionary definition text corresponding to the target word as the word vector training text, so that word vectors of different target words that are similar in form but different in semantics can be effectively distinguished.
In an example, word vector training of the preset word vector model may be performed on a graphics processing unit (GPU) to improve training speed.
Hereinafter, a process of performing word vector training on a preset word vector model based on a word vector training text to determine a target word vector corresponding to a target word will be described in detail with reference to possible implementation manners of the present disclosure, and will not be described herein again.
In the embodiment of the disclosure, for different target words with similar forms, the corresponding dictionary definition texts are inevitably different, so that the dictionary definition text of the target word is determined as the word vector training text corresponding to the target word, and further, based on the word vector training text, word vector training is performed by using a preset word vector model, so that the target word vector corresponding to the target word can be accurately generated, and the word vector accuracy of different target words that are similar in form but different in semantics is effectively improved.
In one possible implementation manner, performing word vector training by using a preset word vector model based on the word vector training text to determine a target word vector corresponding to the target word includes: segmenting the dictionary definition text to obtain a plurality of training segmented words; initializing the plurality of training segmented words, and determining an initial word vector corresponding to each training segmented word; and inputting the plurality of initial word vectors into the preset word vector model, and performing word vector training by using the preset word vector model to determine the target word vector.
The dictionary definition text corresponding to the target word is segmented to obtain a plurality of training segmented words; an initial word vector corresponding to each training segmented word is then determined; the plurality of initial word vectors are input into the preset word vector model, word vector training of the preset word vector model is started, and the target word vector corresponding to the target word is obtained after training is finished.
In one possible implementation, before segmenting the dictionary definition text, the method further includes: inputting the target word into a word segmentation dictionary under the condition that the target word is a compound word; and segmenting the dictionary definition text to obtain a plurality of training segmented words includes: segmenting the dictionary definition text based on the word segmentation dictionary to obtain the plurality of training segmented words.
In one possible implementation, the method further includes: in the case that the target word is included in the dictionary definition text, taking the target word as a whole as one training segmented word.
After target words in a certain field and the dictionary definition texts corresponding to the target words are obtained, target words formed from compound words are screened out and input into the word segmentation dictionary, so that in subsequent word segmentation processing each such target word is treated as one word without further segmentation.
For example, the target word is the compound word "national land use certificate", and its corresponding dictionary definition text is: "The national land use certificate is a legal certificate proving that a land user (a unit or an individual) uses national land, and is protected by law." After the dictionary definition text corresponding to the target word is segmented and punctuation, special characters, and the like are removed, a plurality of training segmented words corresponding to the target word are obtained: national land use certificate/is/proving/land/user/unit/or/individual/use/national land/legal/voucher/subject to/law/protection. The target word "national land use certificate" as a whole is taken as one training segmented word without further segmentation.
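In an example, the word segmentation dictionary may be sketched as follows; the jieba segmenter is merely an illustrative assumption, and the present disclosure does not limit the segmentation tool:

    # Sketch of registering a compound target word so it is kept whole.
    import jieba

    # Register the compound word "national land use certificate" (国有土地使用证).
    jieba.add_word("国有土地使用证")

    definition = "国有土地使用证是证明土地使用者（单位或个人）使用国有土地的合法凭证，受法律保护"
    tokens = jieba.lcut(definition)
    # The target word now appears as a single training segmented word,
    # e.g. ["国有土地使用证", "是", "证明", ...]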
The plurality of training segmented words obtained by segmenting the dictionary definition text corresponding to the target word are initialized, and an initial word vector corresponding to each training segmented word is determined, in preparation for subsequent word vector training.
In one example, the initial word vector corresponding to each training segmented word may be determined by one-hot encoding.
For example, the initial word vector of each training segmented word corresponding to the target word "national land use certificate" is determined by one-hot encoding:

national land use certificate: v_1, [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1];
is: v_2, [0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0];
proving: v_3, [0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0];
land: v_4, [0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0];
user: v_5, [0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0];
unit: v_6, [0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0];
or: v_7, [0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0];
individual: v_8, [0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0];
use: v_9, [0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0];
national land: v_10, [0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0];
legal: v_11, [0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0];
voucher: v_12, [0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0];
subject to: v_13, [0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0];
law: v_14, [0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0];
protection: v_15, [0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0];
<SOS>: v_16, [1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0].

Here <SOS> is a start-of-sequence flag bit with no actual semantic meaning.
The initial word vector corresponding to each training segmented word may also be determined by initialization schemes other than the one-hot encoding described above, which is not specifically limited in this disclosure.
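In an example, the one-hot initialization may be sketched as follows; the helper function one_hot_init is hypothetical:

    # Sketch of one-hot initialization of the training segmented words.
    import numpy as np

    def one_hot_init(tokens):
        # Map each distinct training segmented word to a one-hot vector,
        # placing the 1 from the high end as in the example above.
        vocab = list(dict.fromkeys(tokens))
        n = len(vocab)
        vectors = {}
        for idx, tok in enumerate(vocab):
            v = np.zeros(n)
            v[n - 1 - idx] = 1.0
            vectors[tok] = v
        return vectors

    tokens = ["national land use certificate", "is", "proving", "<SOS>"]
    initial_vectors = one_hot_init(tokens)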
The plurality of initial word vectors are input into the preset word vector model, word vector training is performed by using the preset word vector model, and the target word vector corresponding to the target word is determined.
Fig. 2 shows a schematic diagram of a preset word vector model according to an embodiment of the present disclosure. As shown in fig. 2, the preset word vector model is a Seq2Seq model, and includes an Encoder and a Decoder. The Encoder and the Decoder may adopt a recurrent neural network (RNN) structure, or may adopt other network structures according to the actual situation, which is not specifically limited in this disclosure.
Before the word vector training, the structural parameters of the preset word vector model are set. For example, the input length is 64, units is 128, the word vector dimension is 300, and so on. The specific types and values of the structural parameters can be adjusted according to the actual situation, which is not specifically limited by the present disclosure.
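In an example, the structural parameters named above may be set as follows; the Keras-style build is merely an illustrative assumption, and the present disclosure does not limit the framework:

    # Sketch of a Seq2Seq build with input length 64, units 128, and word
    # vector dimension 300; the vocabulary size 16 matches the example above.
    import tensorflow as tf

    INPUT_LEN, UNITS, EMBED_DIM, VOCAB = 64, 128, 300, 16

    inputs = tf.keras.Input(shape=(INPUT_LEN, VOCAB))              # one-hot inputs
    c = tf.keras.layers.SimpleRNN(UNITS)(inputs)                   # Encoder -> text vector c
    h_d = tf.keras.layers.Dense(EMBED_DIM, activation="tanh")(c)   # Decoder
    v_0 = tf.keras.layers.Dense(VOCAB, activation="softmax")(h_d)  # predicted word vector
    model = tf.keras.Model(inputs, v_0)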
As shown in fig. 2, the plurality of initial word vectors v_1 to v_n are input into the Encoder of the preset word vector model. Taking the above target word "national land use certificate" as an example, the plurality of initial word vectors v_1 to v_16 are input into the Encoder of the preset word vector model.
In one possible implementation manner, inputting the plurality of initial word vectors into the preset word vector model, performing word vector training using the preset word vector model, and determining the target word vector includes: sequentially encoding the plurality of initial word vectors by using the preset word vector model, and determining a text vector corresponding to the dictionary definition text; decoding the text vector by using the preset word vector model to obtain a predicted word vector corresponding to the target word; determining an error loss between the text vector and the predicted word vector; and performing word vector training by minimizing the error loss to determine the target word vector.
The Encoder of the preset word vector model sequentially encodes the plurality of initial word vectors to determine the text vector of the dictionary definition text corresponding to the target word, namely the text vector c output by the Encoder; the Decoder of the preset word vector model then decodes the text vector c to obtain the predicted word vector v_0 corresponding to the target word; the error loss between the text vector c and the predicted word vector v_0 is determined; and by minimizing the error loss, the preset word vector model is optimized, word vector training is realized, and the final target word vector is determined.
The preset word vector model may be optimized by stochastic gradient descent (SGD), by adaptive moment estimation (Adam), or by other optimization methods.
In one possible implementation, the number of the plurality of initial word vectors is n, where n is a positive integer; sequentially encoding the plurality of initial word vectors by using the preset word vector model and determining the text vector corresponding to the dictionary definition text includes: encoding the 1st initial word vector based on the weight corresponding to the 1st initial word vector to obtain a hidden layer vector corresponding to the 1st initial word vector; encoding the ith initial word vector based on the weight corresponding to the ith initial word vector and the hidden layer vector corresponding to the (i-1)th initial word vector to obtain the hidden layer vector corresponding to the ith initial word vector, where i is a positive integer greater than or equal to 2 and less than or equal to n; and encoding the hidden layer vector corresponding to the nth initial word vector based on the encoding weight to obtain the text vector.
Specifically, the Encoder may sequentially encode the plurality of initial word vectors through the following formulas (1) to (3) to obtain the text vector c:

h_1 = f(w_1 · v_1 + b)    (1),
h_i = f(w_i · v_i + h_{i-1} + b)    (2),
c = f(w_c · h_n + b)    (3),

where f denotes the activation function of the Encoder.
wherein w_1 is the weight corresponding to the 1st initial word vector v_1, and w_i is the weight corresponding to the ith initial word vector v_i; h_1 is the hidden layer vector corresponding to the 1st initial word vector v_1, and h_i is the hidden layer vector corresponding to the ith initial word vector v_i; w_c is the encoding weight of the hidden layer vector h_n corresponding to the nth initial word vector; and b is a preset offset, whose value may be 0 or any other preset value, which is not specifically limited by the present disclosure.
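In an example, formulas (1) to (3) may be sketched as follows; the tanh activation and the scalar weights are simplifying assumptions (in practice the weights may be matrices):

    # Sketch of the Encoder recurrence of formulas (1)-(3).
    import numpy as np

    def encode(initial_vectors, weights, w_c, b=0.0):
        f = np.tanh
        h = f(weights[0] * initial_vectors[0] + b)          # formula (1)
        for i in range(1, len(initial_vectors)):
            h = f(weights[i] * initial_vectors[i] + h + b)  # formula (2)
        return f(w_c * h + b)                               # formula (3): text vector c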
In a possible implementation manner, decoding the text vector by using the preset word vector model to obtain the predicted word vector corresponding to the target word includes: decoding the text vector based on the decoding weight to obtain a decoded vector; and normalizing the decoded vector to obtain the predicted word vector.
Specifically, the Decoder may decode the text vector c through the following formula (4) to obtain the decoded vector h_d, and further normalize the decoded vector h_d through the following formula (5) to obtain the predicted word vector v_0 corresponding to the target word:

h_d = f(w_t · c + b)    (4),
v_0 = softmax(w_0 · h_d + b)    (5),

wherein w_t is the decoding weight corresponding to the text vector c; w_0 is the normalization weight corresponding to the decoded vector h_d; and b is a preset offset, whose value may be 0 or any other preset value, which is not specifically limited by the present disclosure.
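In an example, formulas (4) and (5) may be sketched as follows, under the same simplifying assumptions as the Encoder sketch above:

    # Sketch of the Decoder of formulas (4)-(5).
    import numpy as np

    def decode(c, w_t, w_0, b=0.0):
        h_d = np.tanh(w_t * c + b)       # formula (4): decoded vector
        z = w_0 * h_d + b
        e = np.exp(z - np.max(z))        # numerically stable softmax
        return e / e.sum()               # formula (5): predicted word vector v_0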
Before the word vector training is performed, the weight corresponding to each initial word vector, the encoding weight, and the decoding weight are preset; the specific preset value of each weight can be adjusted according to the actual situation, which is not specifically limited by the present disclosure.
In the process of executing the word vector training, the word vector training is executed iteratively by adjusting the weight corresponding to each initial word vector, the encoding weight, and the decoding weight, so as to minimize the error loss. After the error loss is minimized, the target word vector corresponding to the target word is determined according to each hidden layer vector h_i in the Encoder of the preset word vector model.
In one possible implementation, the method further includes: and generating a target dictionary based on the target words and the corresponding target word vectors thereof.
For a plurality of target words in a certain field, the target word vector corresponding to each target word is determined by the above method, and each target word and its corresponding target word vector are stored in a dictionary format to generate the target dictionary.
For example, the target dictionary includes: the target word vector [0.0234, 0.003435, 0.345, …] corresponding to "national land use certificate", and the target word vector [0.0445, 0.0045, 0.4566, …] corresponding to "collective land use certificate".
In practical application, each target word can be converted into a corresponding target word vector by using the mapping relation in the target dictionary.
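In an example, the mapping may be sketched as follows; the vector values are illustrative only:

    # Sketch of converting target words into target word vectors via the
    # target dictionary.
    target_dictionary = {
        "national land use certificate":   [0.0234, 0.003435, 0.345],
        "collective land use certificate": [0.0445, 0.0045,   0.4566],
    }

    def to_vector(target_word):
        return target_dictionary[target_word]

    vec = to_vector("national land use certificate")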
In one possible implementation, the method further includes: executing a word vector evaluation task by using the target word vector to obtain an evaluation result; adjusting the structural parameters of the preset word vector model under the condition that the evaluation result does not meet a preset condition; and performing word vector training again by using the adjusted preset word vector model based on the word vector training text to optimize the target word vector.
After word vector training is completed by using the preset word vector model and the target word vector corresponding to the target word is obtained, a word vector evaluation task can be executed by using the target word vector, and the accuracy of the target word vector is judged according to the evaluation result. If the evaluation result does not meet the preset condition, the accuracy of the target word vector is low. In this case, the structural parameters of the preset word vector model are adjusted to optimize the preset word vector model, and then word vector training is performed again by using the adjusted preset word vector model based on the word vector training text, the target word vector being optimized according to the change of the error loss.
In an example, the adjusting of the structural parameter of the preset word vector model may be adjusting parameters such as units and dropout, or adjusting other structural parameters, which is not specifically limited in this disclosure.
In one possible implementation, the word vector evaluation task includes: clustering task and text classification task.
A clustering task is executed by using the target word vectors; if the clustering result indicates that synonymous target words can be clustered into the same cluster, the accuracy of the target word vectors is high, and if the clustering result indicates that synonymous target words cannot be clustered into the same cluster, the accuracy of the target word vectors is low.
A text classification task is executed by using the target word vectors; if the classification effect is good, the accuracy of the target word vectors is high, and if the classification effect is poor, the accuracy of the target word vectors is low.
The word vector evaluation task for evaluating the accuracy of the target word vector may be the above clustering task and text classification task, and other evaluation tasks capable of evaluating the accuracy of the target word vector may also be selected according to the actual situation, which is not specifically limited by the present disclosure.
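In an example, the clustering evaluation may be sketched as follows; the scikit-learn library and the vector values are illustrative assumptions:

    # Sketch of a clustering evaluation task: synonymous target words should
    # fall into the same cluster if the target word vectors are accurate.
    import numpy as np
    from sklearn.cluster import KMeans

    words = ["national land use certificate", "collective land use certificate"]
    vectors = np.array([[0.0234, 0.0034, 0.345],
                        [0.0445, 0.0045, 0.4566]])

    labels = KMeans(n_clusters=2, n_init=10).fit_predict(vectors)
    # If synonymous words do not share a cluster label, the structural
    # parameters of the preset word vector model are adjusted.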
In an example, the above operations may be repeatedly performed until the accuracy of the optimized target word vector meets a preset condition.
FIG. 3 shows a flow diagram of a method of word vector training in accordance with an embodiment of the present disclosure. As shown in fig. 3, the word vector training method includes:
in step S301, data is collected. Based on a target data source, a target word (defined word) of a certain field and a dictionary definition text corresponding to the target word are obtained.
In step S302, a word segmentation dictionary is configured. Target words formed from compound words in the data set are screened out and input into the word segmentation dictionary, so that in subsequent word segmentation processing each such target word is treated as one word without further segmentation.
In step S303, the data is segmented and cleaned: the dictionary definition text corresponding to the target word is segmented, and redundant data such as punctuation and special characters are removed.
In step S304, it is determined whether the data preprocessing is completed. If yes, jump to step S305; if not, jump to step S301.
In step S305, the input, output, and target are defined. The dictionary definition text corresponding to the target word is segmented to obtain the initial word vector of each training segmented word, and the initial word vectors are used as the input of the Encoder in the preset word vector model; the predicted word vector v_0 corresponding to the target word is used as the output of the Decoder in the preset word vector model; the error loss is determined from the text vector c output by the Encoder and the predicted word vector v_0, and the task goal of the word vector training is to minimize the error loss.
In step S306, the model is selected and its structural parameters are set. For the specific selection of the model and the specific setting of the structural parameters, reference may be made to the above description, which is not repeated herein.
In step S307, the model is trained: word vector training is performed based on the preset word vector model.
In step S308, the target word vector is determined: after word vector training is performed based on the preset word vector model, the target word vector corresponding to the target word is determined.
In step S309, the word vector evaluation task is executed, and the word vectors are evaluated based on the word vector evaluation task. For the specific process of the word vector evaluation task, reference may be made to the above description, which is not repeated herein.
In step S310, it is determined whether the evaluation result meets the preset condition. If yes, the whole process ends; if not, jump to step S311.
In step S311, the model is tuned, and the process jumps back to step S307. The structural parameters of the preset word vector model are adjusted to optimize the preset word vector model.
According to the embodiment of the disclosure, for different target words with similar forms, the corresponding dictionary definition texts are necessarily different, so that the dictionary definition text of the target word is determined as the word vector training text corresponding to the target word, and further, based on the word vector training text, word vector training is performed by using the preset word vector model, so that the target word vector corresponding to the target word can be accurately generated, and the word vector accuracy of different target words that are similar in form but different in semantics is effectively improved.
When the word vector training method is applied to a file cataloging and classification task, the classification effect can be improved by 15 percentage points, and the classification accuracy exceeds 90%.
It can be understood that the above method embodiments of the present disclosure can be combined with each other to form combined embodiments without departing from the principle and logic; owing to space limitations, details are not repeated in this disclosure. Those skilled in the art will appreciate that, in the above methods of the specific embodiments, the specific order of execution of the steps should be determined by their functions and possible inherent logic.
In addition, the present disclosure also provides a word vector training apparatus, an electronic device, a computer-readable storage medium, and a program, all of which can be used to implement any word vector training method provided in the present disclosure; for the corresponding technical solutions and descriptions, refer to the corresponding descriptions in the method section, which are not repeated herein.
FIG. 4 shows a block diagram of a word vector training apparatus according to an embodiment of the present disclosure. As shown in fig. 4, the apparatus 40 includes:
a determining module 41, configured to determine a word vector training text corresponding to a target word, where the word vector training text is a dictionary definition text of the target word;
and the training module 42 is configured to perform word vector training by using a preset word vector model based on the word vector training text, and determine a target word vector corresponding to the target word.
In one possible implementation, the training module 42 includes:
the word segmentation sub-module is used for segmenting the dictionary definition text to obtain a plurality of training segmented words;
the initialization sub-module is used for initializing the plurality of training segmented words and determining an initial word vector corresponding to each training segmented word;
and the training sub-module is used for inputting the plurality of initial word vectors into the preset word vector model, performing word vector training by using the preset word vector model, and determining the target word vector.
In a possible implementation manner, the apparatus 40 further includes:
the word segmentation dictionary building module is used for inputting the target word into a word segmentation dictionary, before the dictionary definition text is segmented and under the condition that the target word is a compound word, wherein the word segmentation dictionary is used for performing word segmentation;
the word segmentation submodule is specifically used for:
segment the dictionary definition text based on the word segmentation dictionary to obtain the plurality of training segmented words.
In one possible implementation, the word segmentation sub-module is further configured to:
in the case that the target word is included in the dictionary definition text, take the target word as a whole as one training segmented word.
In one possible implementation, the training submodule includes:
the encoding unit is used for sequentially encoding the plurality of initial word vectors by using the preset word vector model and determining the text vector corresponding to the dictionary definition text;
the decoding unit is used for decoding the text vector by using a preset word vector model to obtain a predicted word vector corresponding to the target word;
a loss determination unit for determining an error loss between the text vector and the predicted word vector;
and the training unit is used for performing word vector training by minimizing error loss and determining a target word vector.
In one possible implementation, the number of the plurality of initial word vectors is n, where n is a positive integer;
an encoding unit, specifically configured to:
encode the 1st initial word vector based on the weight corresponding to the 1st initial word vector to obtain a hidden layer vector corresponding to the 1st initial word vector;
encode the ith initial word vector based on the weight corresponding to the ith initial word vector and the hidden layer vector corresponding to the (i-1)th initial word vector to obtain the hidden layer vector corresponding to the ith initial word vector, where i is a positive integer greater than or equal to 2 and less than or equal to n;
and encode the hidden layer vector corresponding to the nth initial word vector based on the encoding weight to obtain the text vector.
In a possible implementation manner, the decoding unit is specifically configured to:
decoding the text vector based on the decoding weight to obtain a decoded vector;
and normalize the decoded vector to obtain the predicted word vector.
In a possible implementation manner, the apparatus 40 further includes:
the evaluation module is used for executing a word vector evaluation task by using the target word vector to obtain an evaluation result;
the structure parameter adjusting module is used for adjusting the structure parameters of the preset word vector model under the condition that the evaluation result does not accord with the preset condition;
the training module 42 is further configured to perform word vector training by using the adjusted preset word vector model based on the word vector training text, to optimize the target word vector.
In one possible implementation, the word vector evaluation task includes: clustering task and text classification task.
In a possible implementation manner, the apparatus 40 further includes:
and the dictionary generating module is used for generating a target dictionary based on the target words and the corresponding target word vectors.
The method has specific technical relevance to the internal structure of a computer system, and can solve technical problems of improving hardware operation efficiency or execution effect (including reducing data storage capacity, reducing data transmission capacity, improving hardware processing speed, and the like), thereby obtaining the technical effect of improving the internal performance of the computer system in accordance with the laws of nature.
In some embodiments, functions of or modules included in the apparatus provided in the embodiments of the present disclosure may be used to execute the method described in the above method embodiments, and specific implementation thereof may refer to the description of the above method embodiments, and for brevity, will not be described again here.
Embodiments of the present disclosure also provide a computer-readable storage medium, on which computer program instructions are stored, and when executed by a processor, the computer program instructions implement the above method. The computer readable storage medium may be a volatile or non-volatile computer readable storage medium.
An embodiment of the present disclosure further provides an electronic device, including: a processor; a memory for storing processor-executable instructions; wherein the processor is configured to invoke the memory-stored instructions to perform the above-described method.
The disclosed embodiments also provide a computer program product comprising computer readable code or a non-transitory computer readable storage medium carrying computer readable code, which when run in a processor of an electronic device, the processor in the electronic device performs the above method.
The electronic device may be provided as a terminal, server, or other form of device.
Fig. 5 shows a block diagram of an electronic device according to an embodiment of the disclosure. Referring to fig. 5, the electronic device 800 may be a User Equipment (UE), a mobile device, a User terminal, a cellular phone, a cordless phone, a Personal Digital Assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, or other terminal device.
Referring to fig. 5, electronic device 800 may include one or more of the following components: a processing component 802, a memory 804, a power component 806, a multimedia component 808, an audio component 810, an input/output interface 812, a sensor component 814, and a communication component 816.
The processing component 802 generally controls overall operation of the electronic device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing components 802 may include one or more processors 820 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 802 can include one or more modules that facilitate interaction between the processing component 802 and other components. For example, the processing component 802 can include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operations at the electronic device 800. Examples of such data include instructions for any application or method operating on the electronic device 800, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 804 may be implemented by any type or combination of volatile or non-volatile memory devices, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
The power supply component 806 provides power to the various components of the electronic device 800. The power components 806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the electronic device 800.
The multimedia component 808 includes a screen that provides an output interface between the electronic device 800 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front facing camera and/or a rear facing camera. The front camera and/or the rear camera may receive external multimedia data when the electronic device 800 is in an operation mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a Microphone (MIC) configured to receive external audio signals when the electronic device 800 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may further be stored in the memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 also includes a speaker for outputting audio signals.
The input/output interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
Sensor assembly 814 includes one or more sensors for providing various aspects of state assessment for electronic device 800. For example, the sensor assembly 814 may detect an open/closed state of the electronic device 800, the relative positioning of components, such as a display and keypad of the electronic device 800, the sensor assembly 814 may also detect a change in the position of the electronic device 800 or a component of the electronic device 800, the presence or absence of user contact with the electronic device 800, orientation or acceleration/deceleration of the electronic device 800, and a change in the temperature of the electronic device 800. Sensor assembly 814 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor assembly 814 may also include a light sensor, such as a Complementary Metal Oxide Semiconductor (CMOS) or Charge Coupled Device (CCD) image sensor, for use in imaging applications. In some embodiments, the sensor assembly 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate wired or wireless communication between the electronic device 800 and other devices. The electronic device 800 may access a wireless network based on a communication standard, such as a wireless network (Wi-Fi), a second generation mobile communication technology (2G), a third generation mobile communication technology (3G), a fourth generation mobile communication technology (4G), a long term evolution of universal mobile communication technology (LTE), a fifth generation mobile communication technology (5G), or a combination thereof. In an exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, ultra Wideband (UWB) technology, bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the electronic device 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), digital Signal Processors (DSPs), digital Signal Processing Devices (DSPDs), programmable Logic Devices (PLDs), field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.
In an exemplary embodiment, a non-transitory computer-readable storage medium, such as the memory 804, is also provided that includes computer program instructions executable by the processor 820 of the electronic device 800 to perform the above-described methods.
The disclosure relates to the field of augmented reality. By acquiring image information of a target object in a real environment and applying various vision-related algorithms, relevant features, states, and attributes of the target object can be detected or identified, so as to obtain an AR effect combining the virtual and the real that is matched with a specific application. For example, the target object may be a face, a limb, a gesture, or an action associated with a human body, or a marker or sign associated with an object, or a sand table, a display area, or a display item associated with a venue or place. The vision-related algorithms may involve visual localization, SLAM, three-dimensional reconstruction, image registration, background segmentation, key point extraction and tracking of objects, pose or depth detection of objects, and the like. The specific application may involve not only interactive scenes such as navigation, explanation, reconstruction, and virtual effect overlay display related to real scenes or articles, but also special effect processing related to people, such as makeup beautification, limb beautification, special effect display, and virtual model display. The detection or identification of the relevant features, states, and attributes of the target object can be realized by a convolutional neural network, which is a network model obtained through model training based on a deep learning framework.
Fig. 6 illustrates a block diagram of an electronic device in accordance with an embodiment of the disclosure. Referring to fig. 6, the electronic device 1900 may be provided as a server or a terminal device. The electronic device 1900 includes a processing component 1922, which further includes one or more processors, and memory resources, represented by the memory 1932, for storing instructions, such as application programs, that are executable by the processing component 1922. The application programs stored in the memory 1932 may include one or more modules, each of which corresponds to a set of instructions. Further, the processing component 1922 is configured to execute the instructions to perform the methods described above.
The electronic device 1900 may further include a power component 1926 configured to perform power management of the electronic device 1900, a wired or wireless network interface 1950 configured to connect the electronic device 1900 to a network, and an input/output interface 1958. The electronic device 1900 may operate based on an operating system stored in the memory 1932, such as the Microsoft server operating system (Windows Server™), the Apple graphical user interface operating system (Mac OS X™), the multi-user, multi-process computer operating system (Unix™), the free and open-source Unix-like operating system (Linux™), the open-source Unix-like operating system (FreeBSD™), or the like.
In an exemplary embodiment, a non-transitory computer readable storage medium, such as the memory 1932, is also provided that includes computer program instructions executable by the processing component 1922 of the electronic device 1900 to perform the above-described methods.
The present disclosure may be systems, methods, and/or computer program products. The computer program product may include a computer-readable storage medium having computer-readable program instructions embodied thereon for causing a processor to implement various aspects of the present disclosure.
The computer readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium include the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable Compact Disc Read-Only Memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer readable storage medium, as used herein, is not to be construed as a transitory signal per se, such as a radio wave or other freely propagating electromagnetic wave, an electromagnetic wave propagating through a waveguide or other transmission medium (e.g., a light pulse through a fiber optic cable), or an electrical signal transmitted through a wire.
The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or to an external computer or external storage device via a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. The network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.
The computer program instructions for carrying out operations of the present disclosure may be assembler instructions, Instruction Set Architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk or C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry, such as a programmable logic circuit, a Field Programmable Gate Array (FPGA), or a Programmable Logic Array (PLA), may execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, thereby implementing aspects of the present disclosure.
Various aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The computer program product may be embodied in hardware, software, or a combination thereof. In an alternative embodiment, the computer program product is embodied in a computer storage medium; in another alternative embodiment, the computer program product is embodied in a software product, such as a Software Development Kit (SDK).
The foregoing description of the various embodiments emphasizes the differences between the embodiments; for the same or similar parts, the embodiments may be referred to one another, and, for brevity, these parts are not described again herein.
It will be understood by those skilled in the art that, in the methods described above, the order in which the steps are written does not imply a strict order of execution or impose any limitation on the implementation; the specific order in which the steps are executed should be determined by their functions and possible inherent logic.
If the technical solution of the present application involves personal information, any product applying the technical solution clearly informs users of the personal information processing rules and obtains their individual consent before processing the personal information. If the technical solution involves sensitive personal information, any product applying the technical solution obtains individual consent and additionally satisfies the requirement of "explicit consent" before processing the sensitive personal information. For example, a personal information collection device such as a camera may be provided with a clear and prominent sign informing users that they are entering the personal information collection range and that personal information will be collected; a person who voluntarily enters the collection range is then regarded as consenting to the collection of his or her personal information. Alternatively, on a device that processes personal information, personal authorization may be obtained, while the personal information processing rules are communicated by means of prominent signs or notices, through a pop-up window, or by asking the person to upload his or her personal information voluntarily. The personal information processing rules may include information such as the identity of the personal information processor, the purpose of processing, the processing method, and the types of personal information processed.
Having described embodiments of the present disclosure, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, the practical application, or the technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (13)

1. A method for word vector training, comprising:
determining a word vector training text corresponding to a target word, wherein the word vector training text is a dictionary definition text of the target word;
and performing word vector training by using a preset word vector model based on the word vector training text, and determining a target word vector corresponding to the target word.
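By way of non-limiting illustration only, the determination of the word vector training text in claim 1 can be sketched as a lookup from a dictionary resource; the dictionary content and the function name below are hypothetical and are not part of the claim:

    # Minimal sketch (hypothetical names and data): the dictionary
    # definition of the target word serves as the training text.
    dictionary = {
        "bank": "an organization that provides various financial services",
    }

    def get_training_text(target_word):
        # Returns the dictionary definition text of the target word.
        return dictionary[target_word]

    training_text = get_training_text("bank")  # word vector training text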
2. The method according to claim 1, wherein the performing word vector training by using a preset word vector model based on the word vector training text, and determining a target word vector corresponding to the target word, comprises:
performing word segmentation on the dictionary definition text to obtain a plurality of training participles;
initializing the training participles, and determining an initial word vector corresponding to each training participle;
and inputting the plurality of initial word vectors into the preset word vector model, and performing word vector training by using the preset word vector model to determine the target word vector.
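A minimal sketch of claim 2, assuming whitespace word segmentation and random initialization (both are illustrative assumptions; the claim does not fix either choice):

    import numpy as np

    def segment(definition_text):
        # Illustrative word segmentation; a practical system would use a
        # word segmentation dictionary (see claim 3).
        return definition_text.split()

    def init_word_vectors(participles, dim=128):
        # One randomly initialized initial word vector per training participle.
        rng = np.random.default_rng(0)
        return {w: rng.normal(size=dim) for w in participles}

    participles = segment("an organization that provides various financial services")
    initial_vectors = init_word_vectors(participles)
    # The initial word vectors would then be input into the preset word
    # vector model for training (claims 5 to 7).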
3. The method of claim 2, wherein before performing word segmentation on the dictionary definition text, the method further comprises:
inputting the target word into a word segmentation dictionary under the condition that the target word is a combined word, wherein the word segmentation dictionary is used for performing word segmentation;
wherein the performing word segmentation on the dictionary definition text to obtain a plurality of training participles comprises:
and performing word segmentation on the dictionary definition text based on the word segmentation dictionary to obtain the plurality of training participles.
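For Chinese text, one possible realization of claims 3 and 4 is the open-source jieba tokenizer, whose user dictionary plays the role of the claimed word segmentation dictionary; this pairing is an assumption made for illustration and is not required by the claims:

    import jieba

    # If the target word is a combined word, input it into the word
    # segmentation dictionary so that it is kept as a single training
    # participle rather than being split apart.
    jieba.add_word("机器学习")  # hypothetical combined target word

    # Segment the dictionary definition text with the updated dictionary;
    # the target word, where it appears, now stays whole (claim 4).
    participles = jieba.lcut("机器学习是人工智能的一个分支")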
4. The method according to claim 2 or 3, wherein the method further comprises:
and under the condition that the target word is included in the dictionary definition text, taking the target word as a whole as one training participle.
5. The method according to claim 2 or 3, wherein the inputting the plurality of initial word vectors into the preset word vector model, performing word vector training using the preset word vector model, and determining the target word vector comprises:
sequentially encoding the plurality of initial word vectors by using the preset word vector model to determine a text vector corresponding to the dictionary definition text;
decoding the text vector by using the preset word vector model to obtain a predicted word vector corresponding to the target word;
determining an error loss between the text vector and the predicted word vector;
and performing word vector training by minimizing the error loss to determine the target word vector.
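Claim 5, read as a whole, describes an encoder-decoder pass trained against an error loss between the text vector and the predicted word vector. A minimal sketch under two assumptions that the claim itself does not fix, namely a recurrent encoder and a mean-squared-error loss:

    import torch
    import torch.nn as nn

    class DefinitionEncoderDecoder(nn.Module):
        # Illustrative preset word vector model: an RNN encoder yields the
        # text vector; a linear decoder yields the predicted word vector.
        def __init__(self, dim=128):
            super().__init__()
            self.encoder = nn.RNN(dim, dim, batch_first=True)
            self.decoder = nn.Linear(dim, dim)

        def forward(self, initial_word_vectors):
            # initial_word_vectors: shape (1, n, dim), the n participle vectors
            _, hidden = self.encoder(initial_word_vectors)
            text_vector = hidden[-1]               # text vector of the definition
            predicted = self.decoder(text_vector)  # predicted word vector
            return text_vector, predicted

    model = DefinitionEncoderDecoder()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    inputs = torch.randn(1, 6, 128)  # stand-in initial word vectors

    for _ in range(100):
        text_vector, predicted = model(inputs)
        # Error loss between the text vector and the predicted word vector.
        loss = nn.functional.mse_loss(predicted, text_vector.detach())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    target_word_vector = predicted.detach()  # target word vector after training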
6. The method of claim 5, wherein the number of the plurality of initial word vectors is n;
the sequentially encoding the plurality of initial word vectors by using the preset word vector model to determine the text vector corresponding to the dictionary definition text includes:
encoding the 1st initial word vector based on the weight corresponding to the 1st initial word vector to obtain a hidden layer vector corresponding to the 1st initial word vector;
encoding the i-th initial word vector based on the weight corresponding to the i-th initial word vector and the hidden layer vector corresponding to the (i-1)-th initial word vector to obtain the hidden layer vector corresponding to the i-th initial word vector, wherein i is a positive integer greater than or equal to 2 and less than or equal to n;
and encoding the hidden layer vector corresponding to the n-th initial word vector based on the encoding weight to obtain the text vector.
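In symbols, one consistent reading of claim 6 is the recurrence below, where x_i is the i-th initial word vector, h_i its hidden layer vector, W_i the weight corresponding to the i-th initial word vector, and c the text vector; the additive combination through U h_{i-1} and the activation functions f and g are assumptions, since the claim states only that both inputs are used:

    h_1 = f(W_1 x_1)
    h_i = f(W_i x_i + U h_{i-1}),    2 <= i <= n
    c   = g(W_enc h_n)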
7. The method according to claim 5, wherein the decoding the text vector by using the preset word vector model to obtain a predicted word vector corresponding to the target word comprises:
decoding the text vector based on the decoding weight to obtain a decoding vector;
and carrying out normalization processing on the decoding vector to obtain the predicted word vector.
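A minimal sketch of claim 7, assuming the normalization is a softmax (the claim requires only some normalization of the decoding vector):

    import numpy as np

    def decode(text_vector, decoding_weight):
        # Decode the text vector based on the decoding weight.
        decoding_vector = decoding_weight @ text_vector
        # Normalize the decoding vector to obtain the predicted word vector;
        # softmax is an illustrative choice.
        e = np.exp(decoding_vector - decoding_vector.max())
        return e / e.sum()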
8. The method according to any one of claims 1 to 3, further comprising:
executing a word vector evaluation task by using the target word vector to obtain an evaluation result;
under the condition that the evaluation result does not meet the preset condition, adjusting the structural parameters of the preset word vector model;
and performing word vector training by using the adjusted preset word vector model based on the word vector training text to optimize the target word vector.
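An illustrative control loop for claim 8, in which every name is hypothetical and the preset condition is assumed to take the form of a score threshold:

    def train_until_acceptable(build_model, params, training_text, evaluate):
        # Train, evaluate the target word vector, and retrain with adjusted
        # structural parameters until the evaluation result meets the
        # preset condition.
        while True:
            model = build_model(params)
            target_vector = model.train_word_vector(training_text)
            result = evaluate(target_vector)   # e.g. clustering or classification score
            if result >= params["threshold"]:  # preset condition (assumed form)
                return target_vector
            params["hidden_dim"] *= 2          # one possible structural adjustment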
9. The method of claim 8, wherein the word vector evaluation task comprises a clustering task and a text classification task.
10. The method according to any one of claims 1 to 3, further comprising:
and generating a target dictionary based on the target words and the corresponding target word vectors.
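A sketch of claim 10, assuming the target dictionary is a plain mapping from each target word to its trained target word vector:

    def build_target_dictionary(target_words, train_target_vector):
        # train_target_vector: any procedure implementing claims 1 to 8.
        return {word: train_target_vector(word) for word in target_words}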
11. A word vector training apparatus, comprising:
the first determining module is used for determining a word vector training text corresponding to a target word, wherein the word vector training text is a dictionary definition text of the target word;
and the training module is used for training word vectors by using a preset word vector model based on the word vector training text and determining the target word vectors corresponding to the target words.
12. An electronic device, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to invoke the memory-stored instructions to perform the method of any one of claims 1 to 10.
13. A computer readable storage medium having computer program instructions stored thereon, which when executed by a processor implement the method of any one of claims 1 to 10.
CN202211352892.2A 2022-11-01 2022-11-01 Word vector training method and device, electronic equipment and storage medium Pending CN115422932A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211352892.2A CN115422932A (en) 2022-11-01 2022-11-01 Word vector training method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211352892.2A CN115422932A (en) 2022-11-01 2022-11-01 Word vector training method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115422932A true CN115422932A (en) 2022-12-02

Family

ID=84207493

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211352892.2A Pending CN115422932A (en) 2022-11-01 2022-11-01 Word vector training method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115422932A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210042472A1 (en) * 2018-03-02 2021-02-11 Nippon Telegraph And Telephone Corporation Vector generation device, sentence pair learning device, vector generation method, sentence pair learning method, and program
CN112949255A (en) * 2019-12-11 2021-06-11 中国移动通信有限公司研究院 Word vector training method and device
CN114861673A (en) * 2022-05-13 2022-08-05 阳光保险集团股份有限公司 Semantic analysis method, device and equipment

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116341537A (en) * 2023-05-23 2023-06-27 中债金科信息技术有限公司 Multi-granularity word vector evaluation method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN111310616B (en) Image processing method and device, electronic equipment and storage medium
CN111524521B (en) Voiceprint extraction model training method, voiceprint recognition method, voiceprint extraction model training device and voiceprint recognition device
CN111612070B (en) Image description generation method and device based on scene graph
CN110909815B (en) Neural network training method, neural network training device, neural network processing device, neural network training device, image processing device and electronic equipment
CN111191715A (en) Image processing method and device, electronic equipment and storage medium
CN110458218B (en) Image classification method and device and classification network training method and device
CN109615006B (en) Character recognition method and device, electronic equipment and storage medium
CN111242303B (en) Network training method and device, and image processing method and device
US11416703B2 (en) Network optimization method and apparatus, image processing method and apparatus, and storage medium
CN111931844B (en) Image processing method and device, electronic equipment and storage medium
CN109145150B (en) Target matching method and device, electronic equipment and storage medium
CN110532956B (en) Image processing method and device, electronic equipment and storage medium
CN111539410B (en) Character recognition method and device, electronic equipment and storage medium
CN113515942A (en) Text processing method and device, computer equipment and storage medium
JP2022522551A (en) Image processing methods and devices, electronic devices and storage media
CN109920016B (en) Image generation method and device, electronic equipment and storage medium
CN113792207A (en) Cross-modal retrieval method based on multi-level feature representation alignment
CN110781813A (en) Image recognition method and device, electronic equipment and storage medium
CN111259967A (en) Image classification and neural network training method, device, equipment and storage medium
CN111523599B (en) Target detection method and device, electronic equipment and storage medium
CN114332503A (en) Object re-identification method and device, electronic equipment and storage medium
CN113781518B (en) Neural network structure searching method and device, electronic equipment and storage medium
CN115422932A (en) Word vector training method and device, electronic equipment and storage medium
CN110070046B (en) Face image recognition method and device, electronic equipment and storage medium
CN114842404A (en) Method and device for generating time sequence action nomination, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination