WO2021047473A1 - Neural network training method and device, semantic classification method and device, and medium - Google Patents
- Publication number: WO2021047473A1
- Application: PCT/CN2020/113740
- Authority: WO (WIPO (PCT))
- Prior art keywords: training, comment, network, vector, representation
Classifications
- G06F18/2411—Classification techniques relating to the classification model based on the proximity to a decision surface, e.g. support vector machines
- G06N3/088—Non-supervised learning, e.g. competitive learning
- G06F40/30—Semantic analysis
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/045—Combinations of networks
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
Definitions
- The embodiments of the present disclosure relate to a training method of a neural network, a training device of a neural network, a semantic classification method, a semantic classification device, and a storage medium.
- Artificial Intelligence (AI) is a theory, method, technology, and application system that uses digital computers, or machines controlled by digital computers, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results.
- Artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and produce a new kind of intelligent machine that can react in a way similar to human intelligence.
- Artificial intelligence includes studying the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning, and decision-making.
- Artificial intelligence technology can be applied to the field of Natural Language Processing (NLP). NLP is the intersection of computer science, artificial intelligence, and information engineering. It involves knowledge of statistics, linguistics, etc., and its goal is to allow computers to process or "understand" natural language to perform tasks such as text classification, language translation, and question answering.
- At least one embodiment of the present disclosure provides a semantic classification method, including: inputting a first comment about a first object; processing the first comment with a common representation extractor to extract a first common representation vector used to characterize the common representation in the first comment; processing the first comment with a first representation extractor to extract a first single representation vector used to characterize the single representation in the first comment; splicing the first common representation vector and the first single representation vector to obtain a first representation vector; and processing the first representation vector with a first semantic classifier to obtain the semantic classification of the first comment.
- The common representation includes meaning representations used to comment on both the first object and the second object, the second object being an associated comment object different from the first object; the single representation of the first comment includes a meaning representation used only for commenting on the first object.
- The semantic classification method provided by some embodiments of the present disclosure further includes: mapping the first comment to a first original vector; wherein using the common representation extractor to process the first comment includes using the common representation extractor to process the first original vector, and using the first representation extractor to process the first comment includes using the first representation extractor to process the first original vector.
- Mapping the first comment to the first original vector includes: using a word vector algorithm to map each word in the first comment to a vector with a specified length, so as to obtain the first original vector.
- The common representation extractor and the first representation extractor each include one of a recurrent neural network, a long short-term memory network, and a bidirectional long short-term memory network.
- the first semantic classifier includes a softmax classifier.
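For reference, the softmax classifier mentioned in this claim maps a vector of scores to a probability distribution over the semantic categories. A minimal, dependency-free sketch (the example scores are illustrative, not from the disclosure):

```python
import math

def softmax(scores):
    # Subtract the maximum score before exponentiating, for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# A 3-way semantic classification (e.g. favorable / moderate / negative comment):
probs = softmax([2.0, 1.0, 0.1])
print(probs)  # three probabilities summing to 1, largest for the highest score
```

The output probabilities can then be compared against category identifiers during training or thresholded at inference time.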
- The semantic classification method provided by some embodiments of the present disclosure further includes: inputting a second comment about a second object; using the common representation extractor to process the second comment to extract a second common representation vector for characterizing the common representation in the second comment; using a second representation extractor to process the second comment to extract a second single representation vector for characterizing the single representation in the second comment; splicing the second common representation vector and the second single representation vector to obtain a second representation vector; and using a second semantic classifier to process the second representation vector to obtain the semantic classification of the second comment; wherein the single representation of the second comment includes a meaning representation used only for commenting on the second object.
- The semantic classification method provided by some embodiments of the present disclosure further includes: mapping the second comment to a second original vector; wherein using the common representation extractor to process the second comment includes using the common representation extractor to process the second original vector, and using the second representation extractor to process the second comment includes using the second representation extractor to process the second original vector.
- Mapping the second comment to the second original vector includes: using a word vector algorithm to map each word in the second comment to a vector with a specified length, so as to obtain the second original vector.
- The second representation extractor includes one of a recurrent neural network, a long short-term memory network, and a bidirectional long short-term memory network, and the second semantic classifier includes a softmax classifier.
- the corpus source of the first comment and the second comment includes at least one of text and voice.
- At least one embodiment of the present disclosure further provides a neural network training method, the neural network including a generation network, a first branch network, a first classification network, a second branch network, and a second classification network; the training method includes a semantic classification training stage. The semantic classification training stage includes: inputting a first training comment about a first object, using the generation network to process the first training comment to extract a first training common representation vector, using the first branch network to process the first training comment to extract a first training single representation vector, splicing the first training common representation vector with the first training single representation vector to obtain a first training representation vector, and using the first classification network to process the first training representation vector to obtain a prediction category identifier of the semantic classification of the first training comment; inputting a second training comment about a second object, using the generation network to process the second training comment to extract a second training common representation vector, using the second branch network to process the second training comment to extract a second training single representation vector, splicing the second training common representation vector with the second training single representation vector to obtain a second training representation vector, and using the second classification network to process the second training representation vector to obtain a prediction category identifier of the semantic classification of the second training comment.
- The semantic classification training stage further includes: mapping the first training comment to a first training original vector, and mapping the second training comment to a second training original vector; wherein using the generation network to process the first training comment includes using the generation network to process the first training original vector, using the first branch network to process the first training comment includes using the first branch network to process the first training original vector, using the generation network to process the second training comment includes using the generation network to process the second training original vector, and using the second branch network to process the second training comment includes using the second branch network to process the second training original vector.
- Mapping the first training comment to the first training original vector includes: using a word vector algorithm to map each word in the first training comment to a vector with a specified length, so as to obtain the first training original vector; mapping the second training comment to the second training original vector includes: using the word vector algorithm to map each word in the second training comment to a vector with the specified length, so as to obtain the second training original vector.
- The generation network, the first branch network, and the second branch network each include one of a recurrent neural network, a long short-term memory network, and a bidirectional long short-term memory network; both the first classification network and the second classification network include a softmax classifier.
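The data flow described above — a shared generation network, a per-object branch network, splicing of the two representation vectors, and a softmax classification network — can be sketched in a dependency-light way as follows. The mean-pooling projection here is only a stand-in for the recurrent extractors, and all names and dimensions are illustrative assumptions, not the patent's reference implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def extractor(weights, word_vectors):
    # Stand-in for an LSTM/Bi-LSTM extractor: a learned projection of the
    # mean word vector. A real implementation would use a recurrent network.
    return np.tanh(word_vectors.mean(axis=0) @ weights)

embed_dim, hidden_dim, num_classes = 8, 4, 3
W_common = rng.normal(size=(embed_dim, hidden_dim))     # generation network params
W_branch = rng.normal(size=(embed_dim, hidden_dim))     # first branch network params
W_cls = rng.normal(size=(2 * hidden_dim, num_classes))  # classification network params

comment = rng.normal(size=(10, embed_dim))           # 10 words mapped to vectors
common_rep = extractor(W_common, comment)            # training common representation vector
single_rep = extractor(W_branch, comment)            # training single representation vector
rep = np.concatenate([common_rep, single_rep])       # spliced training representation vector
scores = rep @ W_cls
probs = np.exp(scores) / np.exp(scores).sum()        # softmax over category identifiers
print(probs.shape)  # (3,)
```

The second training comment follows the same path, sharing `W_common` but using its own branch and classification parameters.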
- The system loss function may, based on the definitions below, be expressed as: L_obj = λ1·L(Y1, T1) + λ2·L(Y2, T2), where the cross-entropy loss function is L(Y, T) = -(1/N)·Σ_{i=1..N} Σ_{k=1..K} T_ik·log(Y_ik).
- L_obj represents the system loss function
- L(·,·) represents the cross-entropy loss function
- Y1 represents the prediction category identifier of the first training comment
- T1 represents the true category identifier of the first training comment
- L(Y1, T1) represents the cross-entropy loss function of the first training comment
- λ1 represents the weight of the cross-entropy loss function L(Y1, T1) of the first training comment in the system loss function
- Y2 represents the prediction category identifier of the second training comment
- T2 represents the true category identifier of the second training comment
- L(Y2, T2) represents the cross-entropy loss function of the second training comment
- λ2 represents the weight of the cross-entropy loss function L(Y2, T2) of the second training comment in the system loss function
- Y and T are both formal parameters
- N represents the number of training comments
- K represents the number of category identifiers for semantic classification
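Putting the definitions above together, the system loss is plausibly a weighted sum of the two per-object cross-entropy losses. A sketch under that assumption (the averaging convention and the example weights are illustrative):

```python
import numpy as np

def cross_entropy(Y, T):
    # Y: (N, K) predicted category probabilities; T: (N, K) one-hot true identifiers.
    # L(Y, T) = -(1/N) * sum_i sum_k T_ik * log(Y_ik)
    N = Y.shape[0]
    return -np.sum(T * np.log(Y + 1e-12)) / N

def system_loss(Y1, T1, Y2, T2, lam1=0.5, lam2=0.5):
    # L_obj = lam1 * L(Y1, T1) + lam2 * L(Y2, T2)
    return lam1 * cross_entropy(Y1, T1) + lam2 * cross_entropy(Y2, T2)

Y1 = np.array([[0.7, 0.2, 0.1]]); T1 = np.array([[1.0, 0.0, 0.0]])
Y2 = np.array([[0.1, 0.8, 0.1]]); T2 = np.array([[0.0, 1.0, 0.0]])
print(system_loss(Y1, T1, Y2, T2))  # ~0.29: 0.5*(-ln 0.7) + 0.5*(-ln 0.8)
```

The weights λ1 and λ2 let the training procedure balance how strongly each object's classification error drives the shared parameters.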
- The neural network further includes a discriminant network; the training method further includes a generative adversarial training phase, and the generative adversarial training phase and the semantic classification training phase are performed alternately. The generative adversarial training phase includes: training the discriminant network based on the generation network; training the generation network based on the discriminant network; and performing the above training processes alternately to complete the training of the generative adversarial training phase.
- Training the discriminant network based on the generation network includes: inputting a third training comment about the first object, using the generation network to process the third training comment to extract a third training common representation vector, and using the discriminant network to process the third training common representation vector to obtain a third training output; inputting a fourth training comment about the second object, using the generation network to process the fourth training comment to extract a fourth training common representation vector, and using the discriminant network to process the fourth training common representation vector to obtain a fourth training output; calculating a discriminant network adversarial loss value through the discriminant network adversarial loss function based on the third training output and the fourth training output; and correcting the parameters of the discriminant network according to the discriminant network adversarial loss value.
- the discriminant network includes a two-class softmax classifier.
- The discriminant network adversarial loss function may be expressed, for example in the standard binary cross-entropy form, as: L_D = -E_{z1~P_data(z1)}[log D(G(z1))] - E_{z2~P_data(z2)}[log(1 - D(G(z2)))].
- L_D represents the discriminant network adversarial loss function
- z1 represents the third training comment
- P_data(z1) represents the set of third training comments
- G(z1) represents the third training common representation vector
- D(G(z1)) represents the third training output
- z2 represents the fourth training comment
- P_data(z2) represents the set of fourth training comments
- G(z2) represents the fourth training common representation vector
- D(G(z2)) represents the fourth training output
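With the variables listed above, a discriminant loss in the standard binary cross-entropy form (an assumption — the original formula is given as an image in the source) can be sketched as:

```python
import numpy as np

def discriminator_loss(d_out_third, d_out_fourth):
    # d_out_third:  D(G(z1)) for third training comments (first object, target label 1)
    # d_out_fourth: D(G(z2)) for fourth training comments (second object, target label 0)
    # L_D = -E[log D(G(z1))] - E[log(1 - D(G(z2)))]
    eps = 1e-12  # avoid log(0)
    return (-np.mean(np.log(d_out_third + eps))
            - np.mean(np.log(1.0 - d_out_fourth + eps)))

# A confident, correct discriminant network yields a small loss:
print(discriminator_loss(np.array([0.9, 0.95]), np.array([0.05, 0.1])))
```

Minimizing this loss trains the discriminant network to tell which comment object a common representation vector came from, while the generation network's parameters stay fixed.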
- Training the generation network based on the discriminant network includes: inputting a fifth training comment about the first object, using the generation network to process the fifth training comment to extract a fifth training common representation vector, and using the discriminant network to process the fifth training common representation vector to obtain a fifth training output; inputting a sixth training comment about the second object, using the generation network to process the sixth training comment to extract a sixth training common representation vector, and using the discriminant network to process the sixth training common representation vector to obtain a sixth training output; calculating a generation network adversarial loss value through the generation network adversarial loss function based on the fifth training output and the sixth training output; and correcting the parameters of the generation network according to the generation network adversarial loss value.
- The generation network adversarial loss function may be expressed, for example by flipping the labels used for the discriminant network, as: L_G = -E_{z3~P_data(z3)}[log(1 - D(G(z3)))] - E_{z4~P_data(z4)}[log D(G(z4))].
- L_G represents the generation network adversarial loss function
- z3 represents the fifth training comment
- P_data(z3) represents the set of fifth training comments
- G(z3) represents the fifth training common representation vector
- D(G(z3)) represents the fifth training output
- z4 represents the sixth training comment
- P_data(z4) represents the set of sixth training comments
- G(z4) represents the sixth training common representation vector
- D(G(z4)) represents the sixth training output
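Symmetrically, one standard form of the generation network's adversarial loss flips the targets the discriminant network was trained on, so that minimizing it pushes the common representations toward being indistinguishable by comment object. A hedged sketch under that assumption:

```python
import numpy as np

def generator_loss(d_out_fifth, d_out_sixth):
    # d_out_fifth: D(G(z3)) for fifth training comments (first object)
    # d_out_sixth: D(G(z4)) for sixth training comments (second object)
    # Labels flipped relative to the discriminator's targets:
    # L_G = -E[log(1 - D(G(z3)))] - E[log D(G(z4))]
    eps = 1e-12  # avoid log(0)
    return (-np.mean(np.log(1.0 - d_out_fifth + eps))
            - np.mean(np.log(d_out_sixth + eps)))

# Loss when the discriminant network mislabels both sources:
print(generator_loss(np.array([0.1]), np.array([0.9])))
```

During this step only the generation network's parameters are updated; the discriminant network is held fixed, and the two steps alternate as described above.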
- At least one embodiment of the present disclosure further provides a semantic classification device, including: a memory, configured to store non-transitory computer-readable instructions; and a processor, configured to run the computer-readable instructions, wherein when the computer-readable instructions are run by the processor, the semantic classification method provided in any embodiment of the present disclosure is executed.
- At least one embodiment of the present disclosure further provides a neural network training device, including: a memory, configured to store non-transitory computer-readable instructions; and a processor, configured to run the computer-readable instructions, wherein when the computer-readable instructions are run by the processor, the training method provided in any embodiment of the present disclosure is executed.
- At least one embodiment of the present disclosure further provides a storage medium for non-transitory storage of computer-readable instructions; when the non-transitory computer-readable instructions are executed by a computer, the semantic classification method provided by any embodiment of the present disclosure can be executed.
- At least one embodiment of the present disclosure further provides a storage medium for non-transitory storage of computer-readable instructions; when the non-transitory computer-readable instructions are executed by a computer, the training method provided in any embodiment of the present disclosure can be executed.
- FIG. 1 is a flowchart of a semantic classification method provided by at least one embodiment of the present disclosure;
- FIG. 2 is an exemplary flowchart of the semantic classification method shown in FIG. 1;
- FIG. 3 is a flowchart of another semantic classification method provided by at least one embodiment of the present disclosure;
- FIG. 4 is an exemplary flowchart of the semantic classification method shown in FIG. 3;
- FIG. 5 is a schematic structural block diagram of a neural network provided by at least one embodiment of the present disclosure;
- FIG. 6 is a flowchart of a neural network training method provided by at least one embodiment of the present disclosure;
- FIG. 7 is a schematic block diagram of the training architecture of the discriminant network in the generative adversarial training stage corresponding to the training method shown in FIG. 6, provided by at least one embodiment of the present disclosure;
- FIG. 8 is a schematic flowchart of a process of training the discriminant network provided by at least one embodiment of the present disclosure;
- FIG. 9 is a schematic block diagram of the training architecture of the generation network in the generative adversarial training stage corresponding to the training method shown in FIG. 6, provided by at least one embodiment of the present disclosure;
- FIG. 10 is a schematic flowchart of a process of training the generation network provided by at least one embodiment of the present disclosure;
- FIG. 11 is a schematic block diagram of the training architecture corresponding to the semantic classification training phase of the training method shown in FIG. 6, provided by at least one embodiment of the present disclosure;
- FIG. 12 is a schematic flowchart of the training process in the semantic classification training phase of a training method provided by at least one embodiment of the present disclosure;
- FIG. 13 is a schematic block diagram of a semantic classification device provided by at least one embodiment of the present disclosure;
- FIG. 14 is a schematic block diagram of a neural network training device provided by at least one embodiment of the present disclosure;
- FIG. 15 is a schematic diagram of a storage medium provided by at least one embodiment of the present disclosure.
- Comments about hospitals and doctors can be divided into: comments used only to evaluate hospitals, such as "complete departments"; comments used only to evaluate doctors, such as "excellent medical skills"; and comments that can be used to evaluate both hospitals and doctors, such as "good service".
- Meaning expressions that can be used to evaluate different comment objects are called common expressions; meaning expressions that are used only to evaluate a single comment object are called single expressions.
- Comments about hospitals and doctors can be semantically classified according to their content, for example into favorable comments, moderate comments, and negative comments.
- When performing semantic classification on comments about hospitals and doctors, if the common representation and the single representation in the comments can be extracted so that semantic classification is performed based on more effective information, this will help improve the objectivity and accuracy of the comment analysis.
- The two comment objects, hospital and doctor, are defined as associated comment objects; that is, the hospital is the associated comment object of the doctor, and the doctor is the associated comment object of the hospital. Similarly, other mutually associated comment objects may include schools and teachers, take-out platforms and take-out merchants, etc.
- There may be a certain interdependent relationship between two associated comment objects, but the relationship is not limited to this.
- For example, one comment object may be a component (such as an employee), a service provider, or a supplier (such as a takeaway service) of the other comment object; for another example, the quality of the comments on one of the two associated comment objects may, to a certain extent, reflect the quality of the comments on the other.
- The semantic classification method includes: inputting a first comment about a first object; using a common representation extractor to process the first comment to extract a first common representation vector used to characterize the common representation in the first comment; using a first representation extractor to process the first comment to extract a first single representation vector used to characterize the single representation in the first comment; splicing the first common representation vector and the first single representation vector to obtain a first representation vector; and using a first semantic classifier to process the first representation vector to obtain the semantic classification of the first comment. The common representation includes meaning representations used to comment on both the first object and the second object, the second object is an associated comment object different from the first object, and the single representation of the first comment includes a meaning representation used only for commenting on the first object.
- Some embodiments of the present disclosure also provide a semantic classification device corresponding to the above-mentioned semantic classification method, a training method of a neural network, a device corresponding to a training method of a neural network, and a storage medium.
- The semantic classification method provided by at least one embodiment of the present disclosure can extract the common representation and the single representation in a first comment about a first object and perform semantic classification on the first comment based on both, which helps to improve the objectivity and accuracy of the comment analysis.
- FIG. 1 is a flowchart of a semantic classification method provided by at least one embodiment of the present disclosure
- FIG. 2 is an exemplary flowchart of the semantic classification method shown in FIG. 1.
- The semantic classification method includes steps S110 to S150.
- The semantic classification method shown in FIG. 1 will be described in detail below in conjunction with FIG. 2.
- Step S110: Input the first comment about the first object.
- The first object may be any type of comment object, such as a hospital, a doctor, a school, a teacher, a takeaway platform, a takeaway merchant, etc., which is not limited in the embodiments of the present disclosure.
- The first comment may come from a forum related to the first object, etc.
- The corpus source of the first comment may include text, speech, pictures (such as emoticons), etc.; for example, speech and pictures can be converted into text manually or by artificial-intelligence methods.
- The language of the first comment may include Chinese, English, Japanese, German, Korean, etc., which is not limited in the embodiments of the present disclosure.
- The semantic classification method can process one or more predetermined languages, and a first comment in another language (not belonging to the one or more predetermined languages) can be translated into a predetermined language before processing.
- In step S110, inputting the first comment about the first object may include mapping the first comment to a first original vector P1; therefore, processing the first comment in the subsequent steps means processing the first original vector P1.
- The word vector algorithm may be, for example, a deep neural network, the word2vec program, etc.
- The first original vector P1 includes all the vectors obtained after all the words in the first comment are mapped.
- The length of the vector corresponding to each word is the same; it should be noted that, in the embodiments of the present disclosure, the length of a vector refers to the number of elements included in the vector.
- For example, the word vector algorithm can be used to map the n words in the first comment to the vectors Vx1, Vx2, ..., Vxn, respectively.
- The vectors Vx1, Vx2, ..., Vxn have the same length.
- the first original vector has a matrix form.
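The mapping described above — n words, each mapped to an equal-length vector, stacked into a matrix — can be sketched as follows. The toy vocabulary, random vectors, and dimension are assumptions for illustration; a real system would use word2vec or another trained embedding:

```python
import numpy as np

rng = np.random.default_rng(42)
embedding_dim = 6  # the specified vector length, identical for every word
vocab = {}         # word -> vector lookup, standing in for a trained word2vec model

def word_vector(word):
    # Assign each new word a fixed random vector of the specified length.
    if word not in vocab:
        vocab[word] = rng.normal(size=embedding_dim)
    return vocab[word]

comment = ["good", "service", "clean"]  # a tokenized comment of n = 3 words
P1 = np.stack([word_vector(w) for w in comment])  # first original vector, matrix form
print(P1.shape)  # (3, 6): n rows, one equal-length vector per word
```

Each row of `P1` is one word's vector, so the matrix can be fed row by row to the recurrent extractors described below.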
- Step S120: Use the common representation extractor to process the first comment to extract a first common representation vector for characterizing the common representation in the first comment.
- The common representation extractor can adopt a model based on the relationship of samples in a time series, including but not limited to a Recurrent Neural Network (RNN), a Long Short-Term Memory network (LSTM), a Bi-directional Long Short-Term Memory network (Bi-LSTM), etc.
- The common representation extractor EE0 is used to process the first original vector P1 to extract the first common representation vector P01.
- The LSTM includes multiple processing units (cells) connected in sequence; the n vectors Vx1, Vx2, ..., Vxn in the first original vector P1 are respectively used as the inputs of the first n processing units of the LSTM, and the output of the nth processing unit of the LSTM is the first common representation vector P01.
- The number of processing units included in the LSTM here is greater than or equal to the number of words of the longest first comment processed by it.
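The scheme above feeds the n word vectors into the first n processing units in sequence and keeps the nth unit's output as the representation vector. A dependency-free sketch with a plain tanh recurrent cell standing in for a full LSTM cell (weights and dimensions are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
embed_dim, hidden_dim = 6, 4
W_in = rng.normal(scale=0.1, size=(embed_dim, hidden_dim))
W_rec = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))

def extract_representation(word_vectors):
    # Feed the n word vectors to the first n processing units in sequence;
    # the nth unit's hidden state is the extracted representation vector.
    h = np.zeros(hidden_dim)
    for v in word_vectors:            # Vx1, Vx2, ..., Vxn
        h = np.tanh(v @ W_in + h @ W_rec)
    return h                          # e.g. the first common representation vector P01

P1 = rng.normal(size=(10, embed_dim))  # first original vector: 10 word vectors
P01 = extract_representation(P1)
print(P01.shape)  # (4,)
```

A real LSTM cell adds input, forget, and output gates plus a cell state, but the sequential hand-off of hidden state from unit to unit is the same.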
- The common representation includes meaning representations used to comment on both a first object and a second object, where the second object is an associated comment object different from the first object.
- For example, the first object is a hospital and the second object is a doctor.
- For example, the common expressions include "good service", "clean", etc., which can be used either to evaluate the hospital or to evaluate the doctor, or which, without referring to the context, cannot be distinguished as evaluating the hospital or the doctor.
- The common representation extractor EE0 can be obtained through the training methods that will be introduced later, so as to achieve the function of extracting the common representation in the first comment and the second comment; it should be noted that the embodiments of the present disclosure include but are not limited to this.
- Step S130: Use the first representation extractor to process the first comment to extract a first single representation vector for characterizing the single representation in the first comment.
- The first representation extractor may also adopt a model based on the relationship of samples in a time series, such as a recurrent neural network (RNN), a long short-term memory network (LSTM), a bi-directional long short-term memory network (Bi-LSTM), etc.
- The first representation extractor may adopt the same type of model as the common representation extractor.
- the first original vector P1 is processed by the first representation extractor EE1 to extract the first single representation vector P11.
- the process of processing the first original vector P1 by the first representation extractor EE1 can refer to the process of processing the first original vector P1 by the common representation extractor EE0, which will not be repeated here.
- the single representation in the first comment includes a meaning representation used only for commenting on the first object, that is, the meaning representation is not used for commenting on a second object (that is, an associated comment object that is different from the first object).
- the first object is a hospital and the second object is a doctor.
- the single expressions in the first comment include "complete departments", "advanced equipment", and so on, which are used in comments evaluating hospitals and cannot be used in comments evaluating doctors.
- the first single representation vector P11 includes the information of the single representation in the first comment; in addition, the first single representation vector P11 may also include (or, of course, may exclude) the information of the common representation in the first comment; it should be noted that the embodiments of the present disclosure do not limit this.
- the first representation extractor EE1 can be obtained through training methods that will be introduced later, so as to achieve the function of extracting a single representation in the first comment. It should be noted that the embodiments of the present disclosure include but are not limited to this.
- Step S140 concatenate the first common representation vector and the first single representation vector to obtain the first representation vector.
- the first common representation vector P01 and the first single representation vector P11 are spliced to obtain the first representation vector P10.
- for example, the first common representation vector P01 includes s elements (a1, a2, ..., as) and the first single representation vector P11 includes t elements (b1, b2, ..., bt); splicing the first common representation vector P01 and the first single representation vector P11 means arranging the s + t elements in a predetermined order.
- for example, they can be spliced into (a1, ..., as, b1, ..., bt) or (b1, ..., bt, a1, ..., as) or other forms to obtain the first representation vector P10.
- the embodiments of the present disclosure do not limit the arrangement order of the elements in the first representation vector P10, as long as the first representation vector P10 includes all the elements of the first common representation vector P01 and the first single representation vector P11.
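The splicing of step S140 amounts to a simple concatenation. A minimal NumPy sketch, assuming illustrative sizes s = 3 and t = 4:

```python
import numpy as np

# First common representation vector P01 (s = 3 elements) and first single
# representation vector P11 (t = 4 elements); values are illustrative.
P01 = np.array([0.1, 0.2, 0.3])            # (a1, a2, a3)
P11 = np.array([0.4, 0.5, 0.6, 0.7])       # (b1, b2, b3, b4)

# Either splicing order yields a valid first representation vector P10,
# as long as all s + t elements are included.
P10_a = np.concatenate([P01, P11])         # (a1, ..., as, b1, ..., bt)
P10_b = np.concatenate([P11, P01])         # (b1, ..., bt, a1, ..., as)
print(P10_a.shape)                          # (7,)
```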
- Step S150 Use the first semantic classifier to process the first representation vector to obtain the semantic classification of the first comment.
- the first semantic classifier CC1 is used to process the first representation vector P10 to obtain the semantic classification of the first comment.
- the first semantic classifier CC1 may include a softmax classifier, and the softmax classifier includes, for example, a fully connected layer.
- for example, after processing by the fully connected layer, a K-dimensional vector z (that is, a vector including K elements, corresponding to K category identifiers) is obtained.
- the elements in the vector z can be any real numbers; the softmax classifier compresses the K-dimensional vector z into another K-dimensional vector σ(z) whose elements lie in the range (0, 1) and sum to 1.
- the formula of the softmax classifier is as follows: σ(z)_j = exp(z_j) / Σ_{k=1}^{K} exp(z_k), for j = 1, ..., K
- where z_j represents the j-th element in the K-dimensional vector z, σ(z)_j represents the predicted probability of the j-th category label, each σ(z)_j is a real number whose range is (0, 1), and the elements of the K-dimensional vector σ(z) sum to 1.
- thus, each of the K category identifiers corresponding to the elements of the K-dimensional vector z is assigned a prediction probability, and the category identifier with the largest prediction probability is selected as the semantic classification result.
- the number of categories of the semantic classification category identifiers is K, where K is an integer greater than or equal to 2.
- the embodiments of the present disclosure include but are not limited to this.
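The softmax computation described above can be sketched in NumPy as follows; the vector z and the three category identifiers (good/moderate/negative review, as in the comment sample sets) are illustrative stand-ins, not trained outputs.

```python
import numpy as np

def softmax(z):
    """Compress a K-dimensional vector of arbitrary real numbers into a
    K-dimensional vector of probabilities in (0, 1) that sum to 1."""
    e = np.exp(z - np.max(z))   # subtract max for numerical stability
    return e / e.sum()

# K = 3 category identifiers, e.g. good review / moderate review / negative review.
labels = ["good", "moderate", "negative"]
z = np.array([2.0, 0.5, -1.0])             # output of the fully connected layer
sigma = softmax(z)
predicted = labels[int(np.argmax(sigma))]  # category with largest predicted probability
print(predicted)                            # good
```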
- the first semantic classifier CC1 can be obtained through training methods that will be introduced later, so as to realize the above-mentioned semantic classification function. It should be noted that the embodiments of the present disclosure include but are not limited to this.
- FIG. 3 is a flowchart of another semantic classification method provided by at least one embodiment of the present disclosure
- FIG. 4 is an exemplary flowchart of the semantic classification method shown in FIG. 3.
- the semantic classification method shown in FIG. 3 further includes steps S160 to S200.
- the operations in steps S160 to S200 of the semantic classification method shown in FIG. 3 are basically similar to the operations in steps S110 to S150; the main difference is that steps S110 to S150 are used to perform semantic classification processing on the first comment on the first object, while
- steps S160 to S200 are used to perform semantic classification processing on the second comment on the second object, wherein the first object and the second object are related comment objects to each other. Therefore, the details of step S160 to step S200 may correspond to the relevant description of step S110 to step S150.
- steps S160 to S200 of the semantic classification method shown in FIG. 3 will be described in detail with reference to FIG. 4.
- Step S160 Input a second comment on the second object.
- the second object is an associated comment object different from the first object.
- the second object can be a comment object associated with the hospital, such as a doctor or medicine; or, when the first object is a doctor, the second object can be a comment object associated with the doctor, such as a hospital or medicine.
- the embodiments of the present disclosure include but are not limited to this.
- one of the first object and the second object may also be a school, a takeaway platform, etc.
- the other of the first object and the second object may correspondingly be a teacher, a takeaway business, etc.; in other words, it suffices that the first object and the second object are comment objects associated with each other.
- the second comment may originate from a forum related to the second object.
- the first comment and the second comment may originate from the same forum or the like.
- the corpus source of the second comment may also include text, voice, pictures, etc.; for example, voice, pictures, etc. can be converted into text (for example, manually or by machine).
- the language of the second comment may include Chinese, English, Japanese, German, Korean, etc., which is not limited in the embodiment of the present disclosure.
- the semantic classification method can process one or more predetermined languages, and a second comment in another language (not belonging to the one or more predetermined languages) can be translated into a predetermined language before processing.
- similar to step S110 of inputting the first comment on the first object, step S160 may include: mapping the second comment to a second original vector P2; therefore, processing the second comment in subsequent steps means processing the second original vector P2.
- the mapping can adopt a word vector algorithm, for example, a deep neural network, the word2vec program, etc.
- the second original vector P2 includes all the vectors obtained after mapping all the words in the second comment.
- the length of the vector corresponding to each word in the second comment is the same as the length of the vector corresponding to each word in the first comment.
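A minimal sketch of the word-to-vector mapping, assuming a word2vec-style lookup table with a fixed vector length shared by both comments; the vocabulary and random vectors are hypothetical stand-ins for trained embeddings.

```python
import numpy as np

# Hypothetical embedding table mapping each word to a fixed-length vector;
# in practice these vectors would come from word2vec or a deep neural network.
DIM = 4
rng = np.random.default_rng(1)
vocab = ["good", "service", "clean", "excellent", "medical", "skills"]
embedding = {w: rng.standard_normal(DIM) for w in vocab}

def map_comment(words):
    """Map a tokenized comment to its original vector: one row per word."""
    return np.stack([embedding[w] for w in words])

P1 = map_comment(["good", "service", "clean"])        # first comment
P2 = map_comment(["excellent", "medical", "skills"])  # second comment
# Each word in either comment maps to a vector of the same length DIM.
print(P1.shape, P2.shape)   # (3, 4) (3, 4)
```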
- Step S170 Use the common representation extractor to process the second comment to extract a second common representation vector used to characterize the common representation in the second comment.
- the common representation extractor EE0 used in step S120 can also be used in step S170; that is, the common representation extractor EE0 can also process the second comment to extract the second common representation vector P02 used to characterize the common representation in the second comment.
- the common representation extractor EE0 is used to process the second original vector P2 to extract the second common representation vector P02.
- the process of processing the second original vector P2 by the common representation extractor EE0 can refer to the process of processing the first original vector P1 by the common representation extractor EE0, which will not be repeated here.
- the number of processing units included in the LSTM is also greater than or equal to the number of words of the longest second comment processed by it.
- Step S180 Use a second representation extractor to process the second comment to extract a second single representation vector for representing a single representation in the second comment.
- the second representation extractor may also adopt a model based on the relationship of samples in a time series, such as a recurrent neural network (RNN), a long short-term memory network (LSTM), a bidirectional long short-term memory network (Bi-LSTM), etc.
- the second representation extractor may adopt the same type of model as the common representation extractor.
- the second original vector P2 is processed by the second representation extractor EE2 to extract the second single representation vector P22.
- the process of processing the second original vector P2 by the second representation extractor EE2 can refer to the process of processing the first original vector P1 by the common representation extractor EE0, which will not be repeated here.
- the single representation in the second comment includes a meaning representation used only to comment on the second object, that is, the meaning representation is not used to comment on the first object (that is, an associated comment object different from the second object).
- the first object is a hospital and the second object is a doctor.
- the single expressions in the second comment include "excellent medical skills", "kindness", and so on, which are used in comments evaluating doctors and cannot be used in comments evaluating hospitals.
- the second single representation vector P22 includes the information of the single representation in the second comment; in addition, the second single representation vector P22 may also include (or, of course, may exclude) the information of the common representation in the second comment; it should be noted that the embodiments of the present disclosure do not limit this.
- the second representation extractor EE2 can be obtained through training methods that will be introduced later, so as to achieve the function of extracting a single representation in the second comment. It should be noted that the embodiments of the present disclosure include but are not limited to this.
- Step S190 concatenate the second common representation vector and the second single representation vector to obtain a second representation vector.
- the second common representation vector P02 and the second single representation vector P22 are spliced to obtain the second representation vector P20.
- the splicing process and details in step S190 can refer to the splicing process and details in step S140, which will not be repeated here.
- Step S200 Use the second semantic classifier to process the second representation vector to obtain the semantic classification of the second comment.
- the second semantic classifier CC2 is used to process the second representation vector P20 to obtain the semantic classification of the second comment.
- the second semantic classifier CC2 may also include a softmax classifier, and the softmax classifier includes, for example, a fully connected layer; the processing procedure and details of the second semantic classifier CC2 can refer to those of the first semantic classifier CC1 and will not be repeated here.
- the common representation extractor EE0, the first representation extractor EE1, and the second representation extractor EE2 perform similar functions, and the three can have the same or similar structures, but the parameters included in the three can be different.
- the first semantic classifier CC1 and the second semantic classifier CC2 perform similar functions, and the two may have the same or similar structure, but the parameters included in the two may be different.
- the common representation extractor EE0, the first representation extractor EE1, the second representation extractor EE2, the first semantic classifier CC1, and the second semantic classifier CC2 can all be implemented by software, hardware, firmware, or any combination thereof, so that the corresponding processing procedures can be executed respectively.
- the flow of the above-mentioned semantic classification method may include more or fewer operations (for example, in the semantic classification method shown in FIG. 3, only steps S110 to S150 may be executed, or only steps S160 to S200 may be executed), and these operations may be performed sequentially or in parallel (for example, step S120 and step S130 can be performed in parallel, or sequentially in any order).
- although the flow of the semantic classification method described above includes multiple operations appearing in a specific order, it should be clearly understood that the order of these operations is not limited.
- the semantic classification method described above can be executed once or multiple times according to predetermined conditions.
- for example, when the first comment/second comment is mapped to the first original vector/second original vector, words irrelevant to semantic classification (for example, stop words) can first be filtered out of the first comment/second comment, and then the remaining words related to semantic classification in the first comment/second comment can be mapped to the first original vector/second original vector.
- alternatively, the common representation extractor EE0, the first representation extractor EE1, and the second representation extractor EE2 trained by a specific training method can themselves filter out words that are not related to semantic classification when extracting meaning representations; it should be noted that the embodiments of the present disclosure do not limit this.
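The pre-filtering described above can be sketched as a simple list filter applied before the word-to-vector mapping; the stop-word list below is hypothetical, not part of the disclosure.

```python
# Hypothetical stop-word list; a real system would use a full list for the
# target language of the comments.
STOP_WORDS = {"the", "is", "a", "and", "very"}

def filter_comment(words):
    """Filter out words irrelevant to semantic classification (stop words)
    before mapping the remaining words to the original vector."""
    return [w for w in words if w not in STOP_WORDS]

comment = ["the", "service", "is", "very", "good"]
kept = filter_comment(comment)
print(kept)   # ['service', 'good']
```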
- the semantic classification method provided by the embodiments of the present disclosure can extract the common representation and the single representation in the first comment about the first object, and perform semantic classification on the first comment based on the common representation and the single representation, which helps to improve the objectivity and accuracy of comment analysis.
- FIG. 5 is a schematic structural block diagram of a neural network provided by at least one embodiment of the present disclosure
- FIG. 6 is a flowchart of a training method of a neural network provided by at least one embodiment of the present disclosure.
- the neural network includes a generation network G, a discriminant network D, a first branch network SN1, a first classification network CN1, a second branch network SN2, and a second classification network CN2.
- the training method includes: a generation confrontation training stage S300 and a semantic classification training stage S400, and these two stages of training are performed alternately to obtain the trained neural network.
- the generation network G, the first branch network SN1, the first classification network CN1, the second branch network SN2, and the second classification network CN2 can be used to respectively implement the functions of the common representation extractor EE0, the first representation extractor EE1, the first semantic classifier CC1, the second representation extractor EE2, and the second semantic classifier CC2 in the aforementioned semantic classification method; that is, the trained neural network can be used to perform the aforementioned semantic classification method.
- the generating confrontation training stage S300 includes:
- Step S310 Training the discriminant network based on the generation network
- Step S320 Training the generation network based on the discriminant network.
- the above-mentioned training processes (that is, step S310 and step S320) are performed alternately to complete the training of the generation confrontation training stage S300.
- the construction of the generating network G may be the same as the construction of the aforementioned common representation extractor EE0, and the construction details and working principles of the generating network G can refer to the related description of the aforementioned common representation extractor EE0, which will not be repeated here.
- the generation network G is used to process comments on the first object and also to process comments on the second object to extract the meaning representations in the comments, where the first object and the second object are comment objects associated with each other.
- the discriminant network D can adopt a two-class softmax classifier.
- the discriminating network D is used to determine whether the meaning extracted by the generating network G is used to comment on the first object or the second object.
- FIG. 7 is a schematic training architecture block diagram of a discriminant network in the generated adversarial training phase corresponding to the training method shown in FIG. 6 provided by at least one embodiment of the present disclosure.
- FIG. 8 is a schematic flowchart of a process of training the discriminant network provided by at least one embodiment of the present disclosure.
- step S310 includes step S311 to step S314, as follows:
- Step S311 Input a third training comment about the first object, use the generation network to process the third training comment to extract a third training common representation vector, and use the discriminant network to process the third training common representation vector to obtain a third training output;
- Step S312 Input a fourth training comment about the second object, use the generation network to process the fourth training comment to extract a fourth training common representation vector, and use the discriminant network to process the fourth training common representation vector to obtain a fourth training output;
- Step S313 Based on the third training output and the fourth training output, calculate the discriminant network confrontation loss value through the discriminant network confrontation loss function;
- Step S314 Correct the parameters of the discriminant network according to the discriminant network confrontation loss value.
- training the discriminant network based on the generation network may also include: judging whether the training of the discriminant network meets a predetermined condition; if the predetermined condition is not met, repeating the training process of the discriminant network; if the predetermined condition is met, stopping the training process of the discriminant network at this stage to obtain the discriminant network trained at this stage.
- for example, the aforementioned predetermined condition is that the discriminant network confrontation loss values corresponding to two consecutive pairs of comments (for example, in the process of training the discriminant network, each pair of comments includes one third training comment and one fourth training comment) are no longer significantly reduced.
- the foregoing predetermined condition is that the number of training times or training periods of the discriminating network reaches a predetermined number. The embodiment of the present disclosure does not limit this.
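One possible way to implement the predetermined condition "the confrontation loss values for two consecutive pairs of comments are no longer significantly reduced" is a simple check over the loss history; the threshold eps is an assumed hyperparameter, not specified by the disclosure.

```python
def loss_no_longer_decreasing(losses, eps=1e-2):
    """Predetermined condition: the confrontation loss values for the last
    two consecutive pairs of comments are no longer significantly reduced."""
    if len(losses) < 3:
        return False
    return (losses[-3] - losses[-2] < eps) and (losses[-2] - losses[-1] < eps)

# Loss history over successive pairs of training comments (illustrative values).
history = [0.90, 0.60, 0.40, 0.399, 0.3985]
stop = loss_no_longer_decreasing(history)
print(stop)   # True
```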
- the above example is only a schematic illustration of the training process of the discriminant network; in the training phase, sample comments (that is, comments on the first object and comments on the second object) are used to train the discriminant network, and the training for each batch of sample comments can include multiple iterations to modify the parameters of the discriminant network.
- for example, the training process of the discriminant network also includes fine-tuning the parameters of the discriminant network to obtain more optimized parameters.
- the initial parameter of the discriminant network D may be a random number, for example, the random number conforms to a Gaussian distribution.
- the initial parameters of the discriminant network D can also be trained parameters in databases commonly used in this field. The embodiment of the present disclosure does not limit this.
- the training process of the discriminant network D may also include an optimization function (not shown in FIG. 7).
- the optimization function can calculate the error values of the parameters of the discriminant network D according to the discriminant network confrontation loss value calculated by the discriminant network confrontation loss function, and correct the parameters of the discriminant network D according to the error values.
- the optimization function may use a stochastic gradient descent (SGD) algorithm, a batch gradient descent (BGD) algorithm, etc., to calculate the error values of the parameters of the discriminant network D.
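As a minimal illustration of how such an optimization function corrects a parameter, the basic gradient-descent update can be sketched on a toy quadratic loss; the learning rate and the loss function are assumptions for illustration, not part of the disclosure.

```python
def sgd_step(param, grad, lr=0.1):
    """One gradient-descent correction of a parameter: move it against the
    gradient of the loss (the 'error value' computed by the optimizer)."""
    return param - lr * grad

# Toy loss L(w) = (w - 3)^2 with gradient 2 * (w - 3); repeated steps
# drive the parameter w toward the minimizer w = 3.
w = 0.0
for _ in range(100):
    grad = 2.0 * (w - 3.0)
    w = sgd_step(w, grad)
print(round(w, 4))   # 3.0
```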
- the third training comment comes from the comment sample set of the first object; for example, each comment in the comment sample set of the first object has been semantically classified in advance (for example, manually) and carries a determined semantic classification category identifier; for example, the category identifiers of semantic classification in the comment sample set of the first object include good review, moderate review, and negative review.
- the embodiments of the present disclosure include but are not limited to this.
- the fourth training comment comes from the comment sample set of the second object; for example, each comment in the comment sample set of the second object has been semantically classified in advance (for example, manually) and carries a determined semantic classification category identifier; for example, the category identifiers of semantic classification in the comment sample set of the second object include good review, moderate review, and negative review.
- the embodiments of the present disclosure include but are not limited to these.
- for example, a word vector algorithm may be used to map the third training comment and the fourth training comment to original vectors, and the original vectors corresponding to the third training comment and the fourth training comment are processed by the generation network G.
- the processing and details of generating the network G can be referred to the processing and details of the aforementioned common representation extractor EE0, which will not be repeated here.
- the discriminant network confrontation loss function can be expressed as: L_D = -E_{z1~Pdata(z1)}[log D(G(z1))] - E_{z2~Pdata(z2)}[log(1 - D(G(z2)))]
- where L_D represents the discriminant network confrontation loss function; z1 represents a third training comment, Pdata(z1) represents the set of third training comments, G(z1) represents the third training common representation vector, D(G(z1)) represents the third training output, and E_{z1~Pdata(z1)}[·] represents the expectation over the set of third training comments; z2, Pdata(z2), G(z2), D(G(z2)), and E_{z2~Pdata(z2)}[·] represent the corresponding quantities for the fourth training comments.
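A hedged NumPy sketch of a discriminant confrontation loss of the usual cross-entropy form, matching the training goal described here (the third training output should approach 1 and the fourth training output should approach 0); the discriminant outputs below are stand-in numbers, not values produced by a trained network.

```python
import numpy as np

def discriminant_loss(d_third, d_fourth):
    """Discriminant network confrontation loss: outputs for third training
    comments (object label 1) should approach 1; outputs for fourth
    training comments (object label 0) should approach 0."""
    d_third = np.asarray(d_third)
    d_fourth = np.asarray(d_fourth)
    return -np.mean(np.log(d_third)) - np.mean(np.log(1.0 - d_fourth))

# Stand-in discriminant outputs D(G(z1)) and D(G(z2)) for a small batch.
good = discriminant_loss([0.9, 0.95], [0.1, 0.05])   # D discriminates well -> small loss
bad = discriminant_loss([0.5, 0.5], [0.5, 0.5])      # D cannot discriminate -> larger loss
print(good < bad)   # True
```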
- a batch gradient descent algorithm can be used to optimize the parameters of the discriminant network D.
- the discriminant network confrontation loss function expressed by the above formula is exemplary, and the embodiments of the present disclosure include but are not limited to this.
- the training goal of the discriminant network D is to minimize the discriminant network confrontation loss value.
- the object label of the third training comment is set to 1, that is, the discriminant network D needs to identify that the third training common representation vector comes from a comment about the first object; at the same time, the object label of the fourth training comment is set to 0, that is, the discriminant network D needs to identify that the fourth training common representation vector comes from a comment about the second object.
- the training goal of the discriminant network D is to enable the discriminant network D to accurately determine the true source of the meaning expression extracted by the generating network G (that is, from the comment on the first object or the comment on the second object), that is, The discrimination network D can accurately determine whether the meaning representation extracted by the generation network G is used to comment on the first object or the second object.
- the parameters of the discriminant network D are constantly revised, so that the discriminant network D after parameter correction can accurately identify the sources of the third training common representation vector and the fourth training common representation vector; that is, the output of the discriminant network D corresponding to the third training comment constantly approaches 1, and the output of the discriminant network D corresponding to the fourth training comment constantly approaches 0, thereby continuously reducing the discriminant network confrontation loss value.
- FIG. 9 is a schematic training architecture block diagram of a generative network in the generative confrontation training phase corresponding to the training method shown in FIG. 6 provided by at least one embodiment of the present disclosure
- FIG. 10 is a schematic flowchart of a process of training the generation network provided by at least one embodiment of the present disclosure.
- step S320 includes step S321 to step S324, as shown below:
- Step S321 Input a fifth training comment about the first object, use the generation network to process the fifth training comment to extract a fifth training common representation vector, and use the discriminant network to process the fifth training common representation vector to obtain a fifth training output;
- Step S322 Input a sixth training comment about the second object, use the generation network to process the sixth training comment to extract a sixth training common representation vector, and use the discriminant network to process the sixth training common representation vector to obtain a sixth training output;
- Step S323 Based on the fifth training output and the sixth training output, calculate the generation network confrontation loss value through the generation network confrontation loss function;
- Step S324 Correct the parameters of the generation network according to the generation network confrontation loss value.
- training the generation network based on the discriminant network may also include: judging whether the training of the generation network satisfies a predetermined condition; if the predetermined condition is not met, repeating the above-mentioned training process of the generation network; if the predetermined condition is satisfied, stopping the training process of the generation network at this stage to obtain the generation network trained at this stage.
- for example, the foregoing predetermined condition is that the generation network confrontation loss values corresponding to two consecutive pairs of comments (for example, in the process of training the generation network, each pair of comments includes one fifth training comment and one sixth training comment) are no longer significantly reduced.
- the foregoing predetermined condition is that the number of training times or training periods of the generated network reaches a predetermined number. The embodiment of the present disclosure does not limit this.
- the above example is only a schematic illustration of the training process of the generation network; in the training phase, sample comments (that is, comments on the first object and comments on the second object) are used to train the generation network, and the training for each batch of sample comments can include multiple iterations to modify the parameters of the generation network.
- for example, the training process of the generation network also includes fine-tuning the parameters of the generation network to obtain more optimized parameters.
- the initial parameter of the generating network G may be a random number, for example, the random number conforms to a Gaussian distribution.
- the initial parameters of the generating network G can also be trained parameters in databases commonly used in this field. The embodiment of the present disclosure does not limit this.
- the training process of the generation network G may also include an optimization function (not shown in FIG. 9); the optimization function can calculate the error values of the parameters of the generation network G according to the generation network confrontation loss value calculated by the generation network confrontation loss function, and correct the parameters of the generation network G according to the error values.
- the optimization function may use a stochastic gradient descent (SGD) algorithm, a batch gradient descent (BGD) algorithm, etc. to calculate the error value of the parameters of the generated network G.
- the fifth training comment is also derived from the comment sample set of the first object, and the embodiments of the present disclosure include but are not limited to this.
- the sixth training comment is also derived from the comment sample set of the second object, and the embodiments of the present disclosure include but are not limited to this.
- the generation network confrontation loss function can be expressed as: L_G = -E_{z3~Pdata(z3)}[log(1 - D(G(z3)))] - E_{z4~Pdata(z4)}[log D(G(z4))]
- where L_G represents the generation network confrontation loss function; z3 represents a fifth training comment, Pdata(z3) represents the set of fifth training comments, G(z3) represents the fifth training common representation vector, D(G(z3)) represents the fifth training output, and E_{z3~Pdata(z3)}[·] represents the expectation over the set of fifth training comments; z4, Pdata(z4), G(z4), D(G(z4)), and E_{z4~Pdata(z4)}[·] represent the corresponding quantities for the sixth training comments. A batch gradient descent algorithm can be used to optimize the parameters of the generation network G.
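A hedged NumPy sketch of a generation confrontation loss with flipped object labels, matching the training goal described here (the discriminant output for fifth training comments should move toward 0 and the output for sixth training comments toward 1, i.e. the discriminant network is fooled); the discriminant outputs are stand-in numbers.

```python
import numpy as np

def generation_loss(d_fifth, d_sixth):
    """Generation network confrontation loss with flipped object labels:
    small when the discriminant output for fifth training comments is near 0
    and the output for sixth training comments is near 1."""
    d_fifth = np.asarray(d_fifth)
    d_sixth = np.asarray(d_sixth)
    return -np.mean(np.log(1.0 - d_fifth)) - np.mean(np.log(d_sixth))

# Stand-in discriminant outputs D(G(z3)) and D(G(z4)) for a small batch.
fooled = generation_loss([0.1, 0.05], [0.9, 0.95])      # D fooled -> small loss
not_fooled = generation_loss([0.9, 0.95], [0.1, 0.05])  # D accurate -> large loss
print(fooled < not_fooled)   # True
```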
- the generation network confrontation loss function expressed by the above formula is exemplary, and the embodiments of the present disclosure include but are not limited to this.
- the training goal of the generative network G is to minimize the counter-loss value of the generative network.
- the object label of the fifth training comment is set to 0, that is, the discriminant network D needs to identify that the fifth training common representation vector comes from a comment about the second object; at the same time, the object label of the sixth training comment is set to 1, that is, the discriminant network D needs to identify that the sixth training common representation vector comes from a comment about the first object.
- the training goal of the generation network G is to make the discriminant network D unable to accurately determine the true source of the meaning representation extracted by the generation network G (that is, whether it comes from a comment about the first object or a comment about the second object); in other words, the discriminant network D cannot determine whether the meaning representation extracted by the generation network G is used to comment on the first object or the second object.
- the discrimination network D cannot determine the true source of the meaning representation extracted by the generation network G.
- the parameters of the generation network G are continuously revised, so that the meaning representation extracted by the generation network G after parameter correction is a common representation of comments on the first object and comments on the second object, such that the discriminant network D cannot accurately identify the source of the fifth training common representation vector and the sixth training common representation vector; that is, the output of the discriminant network D corresponding to the fifth training comment keeps moving away from 1 (that is, keeps getting closer to 0), and the output of the discriminant network D corresponding to the sixth training comment keeps moving away from 0 (that is, keeps getting closer to 1), so as to continuously reduce the generation network adversarial loss value.
- the training of the generating network G and the training of the discriminant network D are performed alternately and iteratively.
- generally, the discriminant network D is first subjected to first-stage training to improve its discriminative ability (that is, its ability to identify the true source of the input to the discriminant network D), yielding the discriminant network D trained in the first stage; then, based on the discriminant network D trained in the first stage, the generation network G is subjected to first-stage training to improve its ability to extract the common representation of comments about the first object and comments about the second object, yielding the generation network G trained in the first stage.
- next, the discriminant network D trained in the first stage is subjected to second-stage training to further improve its discriminative ability, yielding the discriminant network D trained in the second stage; then, based on the discriminant network D trained in the second stage, the generation network G trained in the first stage is subjected to second-stage training to further improve its ability to extract the common representation of comments about the first object and comments about the second object, yielding the generation network G trained in the second stage. By analogy, third-stage training, fourth-stage training, and so on are then performed on the discriminant network D and the generation network G, until the output of the generation network G is a common representation of comments about the first object and comments about the second object, thereby completing the training of the generative adversarial training stage S300.
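The staged alternation described above can be sketched as a simple loop; the callback names and the convergence test are illustrative assumptions, not part of the disclosure:

```python
def adversarial_training(train_discriminator, train_generator, converged, max_stages=100):
    """Alternate discriminant-network and generation-network training stages
    until the generator's output is judged a common representation (or a
    stage cap is reached). Both callbacks update their network in place and
    return a loss value; `converged` inspects the loss history so far."""
    history = []
    for stage in range(max_stages):
        d_loss = train_discriminator()   # first: sharpen D's discriminative ability
        g_loss = train_generator()       # then: train G against the updated D
        history.append((d_loss, g_loss))
        if converged(history):
            break
    return history
```

Each call pair corresponds to one training stage (first stage, second stage, ...) of the generative adversarial training stage S300.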
- the confrontation between the generation network G and the discriminant network D is embodied as follows: for comments on the first object (that is, the third training comment and the fifth training comment), the output of the generation network G has different object labels in the two training processes (in the training process of the discriminant network D, the object label of the third training comment is 1, while in the training process of the generation network G, the object label of the fifth training comment is 0); similarly, for comments on the second object, the output of the generation network G has different object labels (in the training process of the discriminant network D, the object label of the fourth training comment is 0, while in the training process of the generation network G, the object label of the sixth training comment is 1).
- the confrontation between the generation network G and the discrimination network D is also reflected in the discrimination network confrontation loss function and the generation network confrontation loss function.
- when the meaning representation extracted by the trained generation network G is a common representation of comments on the first object and comments on the second object (regardless of whether the input of the generation network G is a comment about the first object or a comment about the second object), the output of the discriminant network D for the common representation is 0.5 in both cases; that is, the generation network G and the discriminant network D reach a Nash equilibrium through the adversarial game.
- the semantic classification training stage S400 includes: training the generation network, the first branch network, the first classification network, the second branch network, and the second classification network.
- the structure of the first branch network SN1 may be the same as the structure of the aforementioned first representation extractor EE1.
- the first branch network SN1 is used to process comments on the first object to extract a single representation in the comment (whether to extract the common representation in the comment is not limited).
- the structure of the second branch network SN2 may be the same as the structure of the aforementioned second representation extractor EE2.
- the second branch network SN2 is used to process comments on the second object to extract a single representation in the comment (whether to extract the common representation in the comment is not limited).
- the structures of the first classification network CN1 and the second classification network CN2 may be the same as those of the aforementioned first semantic classifier CC1 and the second semantic classifier CC2, respectively.
- for the structural details and working principle of the first classification network CN1 and the second classification network CN2, please refer to the related description of the first semantic classifier CC1 and the second semantic classifier CC2, which will not be repeated here.
- FIG. 11 is a block diagram of a schematic training architecture corresponding to the semantic classification training phase of the training method shown in FIG. 6 provided by at least one embodiment of the present disclosure
- FIG. 12 is a schematic flowchart of the training process of the semantic classification training stage of the training method provided by at least one embodiment of the present disclosure. The training process of the semantic classification training stage will be described in detail below with reference to FIG. 11 and FIG. 12.
- the semantic classification training stage S400 includes steps S401 to S405.
- Step S401: Input the first training comment about the first object, use the generation network to process the first training comment to extract the first training common representation vector, use the first branch network to process the first training comment to extract the first training single representation vector, splice the first training common representation vector and the first training single representation vector to obtain the first training representation vector, and use the first classification network to process the first training representation vector to obtain the predicted category identifier of the semantic classification of the first training comment.
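The splicing and classification of step S401 can be sketched as follows; the toy linear-plus-softmax classifier and its weights are illustrative stand-ins for the first classification network CN1, not the patent's implementation:

```python
import math

def splice(common_vec, single_vec):
    """Concatenate the training common representation vector and the
    training single representation vector into one training representation."""
    return common_vec + single_vec

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classification_network(representation, weights):
    """Toy stand-in for CN1: one linear layer followed by softmax, producing
    a predicted probability for each category identifier."""
    logits = [sum(w * x for w, x in zip(row, representation)) for row in weights]
    return softmax(logits)
```

A real CN1 would be a trained softmax classifier over the spliced representation; the placeholder weights here only illustrate the data flow.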
- the first training comment is also derived from the comment sample set of the first object, and the embodiments of the present disclosure include but are not limited to this.
- the first training comment has a category identifier T1 (that is, a true category identifier) of a certain semantic classification, for example, the true category identifier is represented in the form of a vector.
- the real category identifier is a K-dimensional vector; when the k-th element of the K-dimensional vector is 1 and the other elements are 0, the K-dimensional vector represents the k-th true category identifier, where k is an integer and 1 ≤ k ≤ K.
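The K-dimensional true category identifier described above can be built as a one-hot vector; a straightforward sketch:

```python
def true_category_identifier(k, K):
    """K-dimensional one-hot vector whose k-th element (1-based) is 1 and
    all other elements are 0, representing the k-th true category identifier."""
    if not (1 <= k <= K):
        raise ValueError("k must satisfy 1 <= k <= K")
    return [1 if i == k - 1 else 0 for i in range(K)]
```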
- inputting the first training comment on the first object may include: mapping the first training comment to the first training original vector TP1. Therefore, processing the first training comment in the subsequent operation is processing the first training original vector TP1.
- for example, a word vector algorithm (for example, a deep neural network, the word2vec program, etc.) may be used to map each word in the first training comment to a vector of a specified length, so that the first training original vector TP1 includes all vectors obtained by mapping the words of the first training comment. For example, the length of the vector corresponding to each word is the same.
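The word-to-vector mapping can be sketched as below; a real system would use word2vec or a learned embedding, so the deterministic hashing here is only an illustrative stand-in that guarantees a fixed length per word:

```python
import hashlib

def map_comment_to_original_vector(comment, length=4):
    """Map each word of a comment to a fixed-length vector; the list of
    per-word vectors forms the training original vector (e.g. TP1)."""
    vectors = []
    for word in comment.split():
        # Stand-in embedding: first `length` bytes of an MD5 digest, scaled to [0, 1].
        digest = hashlib.md5(word.encode("utf-8")).digest()
        vectors.append([b / 255.0 for b in digest[:length]])
    return vectors
```

Because the mapping is deterministic, the same word always maps to the same vector, mirroring a fixed embedding table.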
- step S401 can refer to the related descriptions of step S110 to step S150 of the aforementioned semantic classification method, which will not be repeated here.
- the predicted category identifier of the first training review is a vector with the same dimension as its real category identifier.
- the predicted category identifier of the first training review can be expressed in the form of the aforementioned vector, and each element in the vector represents the predicted probability of each category identifier.
- the category identifier with the largest prediction probability is selected as the category identifier of the semantic classification.
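Selecting the category identifier with the largest predicted probability amounts to an argmax over the prediction vector (1-based here, to match the k-th true category identifier):

```python
def predicted_category(probabilities):
    """Return the 1-based index of the category identifier with the largest
    predicted probability in the prediction vector."""
    return max(range(len(probabilities)), key=probabilities.__getitem__) + 1
```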
- Step S402: Input the second training comment about the second object, use the generation network to process the second training comment to extract the second training common representation vector, use the second branch network to process the second training comment to extract the second training single representation vector, splice the second training common representation vector and the second training single representation vector to obtain the second training representation vector, and use the second classification network to process the second training representation vector to obtain the predicted category identifier of the semantic classification of the second training comment.
- the second training comment is also derived from the comment sample set of the second object.
- the embodiments of the present disclosure include but are not limited to this.
- the second training review has a category identifier T2 (that is, a real category identifier) of a certain semantic classification.
- for the representation form of the real category identifier T2 of the second training review, reference may be made to the representation form of the real category identifier T1 of the first training review, which will not be repeated here.
- inputting the second training comment on the second object may include: mapping the second training comment to the second training original vector TP2. Therefore, processing the second training comment in the subsequent operation is processing the second training original vector TP2.
- for example, a word vector algorithm (for example, a deep neural network, the word2vec program, etc.) may be used to map each word in the second training comment to a vector of a specified length, so that the second training original vector TP2 includes all vectors obtained by mapping the words of the second training comment.
- the length of the vector corresponding to each word in the second training comment is the same as the length of the vector corresponding to each word in the first training comment.
- step S402 can refer to the related description of step S160 to step S200 of the aforementioned semantic classification method, which will not be repeated here.
- the predicted category identifier of the second training review is a vector with the same dimension as its real category identifier.
- the predicted category identifier of the second training review can also be expressed in the form of the aforementioned vector, and each element in the vector represents the prediction of each category identifier. Probability, for example, the category identifier with the largest predicted probability is selected as the category identifier of the semantic classification.
- Step S403 Based on the predicted category identification of the first training review and the predicted category identification of the second training review, a system loss value is calculated through the system loss function;
- the system loss function can be expressed as: L obj = λ1·L(Y1, T1) + λ2·L(Y2, T2)
- L obj represents the system loss function
- L(·, ·) represents the cross-entropy loss function
- Y1 represents the predicted category identifier of the first training review
- T1 represents the true category identifier of the first training review
- L(Y1, T1) represents the cross-entropy loss function of the first training review
- λ1 represents the weight of the cross-entropy loss function L(Y1, T1) of the first training review in the system loss function
- Y2 represents the predicted category identifier of the second training review
- T2 represents the true category identifier of the second training review
- L(Y2, T2) represents the cross-entropy loss function of the second training review
- λ2 represents the weight of the cross-entropy loss function L(Y2, T2) of the second training review in the system loss function.
- the cross-entropy loss function L(·, ·) can be expressed as: L(Y, T) = -(1/N)·Σ n=1..N Σ k=1..K T nk·log(Y nk)
- Y and T are both formal parameters
- N represents the number of training reviews (for example, the first training reviews or the second training reviews)
- K represents the number of category identifiers for semantic classification.
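Under these definitions, the system loss can be sketched as follows; the cross-entropy form is a reconstruction from the stated terms, and the λ weights shown are illustrative defaults:

```python
import math

def cross_entropy(Y, T):
    """L(Y, T) = -(1/N) * sum_n sum_k T[n][k] * log(Y[n][k]).

    Y: predicted probability vectors for N training comments.
    T: one-hot true category identifiers for the same N comments.
    """
    N = len(Y)
    return -sum(t * math.log(y)
                for yn, tn in zip(Y, T)
                for y, t in zip(yn, tn) if t > 0) / N

def system_loss(Y1, T1, Y2, T2, lam1=0.5, lam2=0.5):
    """L_obj = lam1 * L(Y1, T1) + lam2 * L(Y2, T2)."""
    return lam1 * cross_entropy(Y1, T1) + lam2 * cross_entropy(Y2, T2)
```

The `if t > 0` guard skips zero one-hot entries, which contribute nothing to the sum and would otherwise force evaluating log of a possibly zero prediction.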
- the training goal of the semantic classification training stage S400 is to minimize the system loss value. For example, the smaller the value of the cross-entropy loss function L(Y1, T1) of the first training comment, the closer the predicted category identifier of the first training comment is to its true category identifier, that is, the more accurate the semantic classification of the first training comment; similarly, the smaller the value of the cross-entropy loss function L(Y2, T2) of the second training comment, the closer the predicted category identifier of the second training comment is to its true category identifier, that is, the more accurate the semantic classification of the second training comment.
- Step S404 Correct the parameters of the generating network, the first branch network, the first classification network, the second branch network, and the second classification network according to the system loss value.
- the initial parameters of the first branch network SN1, the first classification network CN1, the second branch network SN2, and the second classification network CN2 may be random numbers, for example, the random numbers conform to a Gaussian distribution.
- the initial parameters of the first branch network SN1, the first classification network CN1, the second branch network SN2, and the second classification network CN2 may also be trained parameters in databases commonly used in the art. The embodiment of the present disclosure does not limit this.
- the training process of the semantic classification training stage S400 may also include an optimization function (not shown in FIG. 11).
- the optimization function may calculate error values of the parameters of the generation network G, the first branch network SN1, the first classification network CN1, the second branch network SN2, and the second classification network CN2 according to the system loss value calculated by the system loss function, and correct those parameters according to the error values.
- the optimization function may use a stochastic gradient descent (SGD) algorithm, a batch gradient descent (BGD) algorithm, etc. to calculate the error values of the parameters of the generation network G, the first branch network SN1, the first classification network CN1, the second branch network SN2, and the second classification network CN2.
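A single SGD-style parameter correction, as referenced above, can be sketched minimally; parameters and gradients are treated as flat lists and the learning rate is an illustrative choice:

```python
def sgd_step(params, grads, lr=0.01):
    """One stochastic-gradient-descent correction: each parameter moves
    against its error gradient, scaled by the learning rate."""
    return [p - lr * g for p, g in zip(params, grads)]
```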
- the semantic classification training stage S400 may also include: judging whether the training of the generation network, the first branch network, the first classification network, the second branch network, and the second classification network meets a predetermined condition; if the predetermined condition is not met, the training process of the semantic classification training stage S400 is repeated; if the predetermined condition is met, the current training process of the semantic classification training stage S400 is stopped, and the generation network, first branch network, first classification network, second branch network, and second classification network trained at the current stage are obtained.
- for example, the foregoing predetermined condition is that the system loss values corresponding to consecutive pairs of comments (for example, in the training process of the semantic classification training stage S400, each pair of comments includes a first training comment and a second training comment) no longer decrease significantly.
- the foregoing predetermined condition is that the number of training times or the training period of the semantic classification training stage S400 reaches a predetermined number. The embodiment of the present disclosure does not limit this.
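The "system loss no longer decreases significantly" condition can be sketched as a patience check on recent loss values; `patience` and `tolerance` are illustrative assumptions, not values from the disclosure:

```python
def should_stop(loss_history, patience=2, tolerance=1e-4):
    """Predetermined condition: stop when the system loss value has not
    decreased by more than `tolerance` for `patience` consecutive pairs."""
    if len(loss_history) <= patience:
        return False
    recent = loss_history[-(patience + 1):]
    return all(prev - cur <= tolerance for prev, cur in zip(recent, recent[1:]))
```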
- the above description only schematically illustrates the training process of the semantic classification training stage S400. In practice, the training on each pair of sample comments (that is, a comment on the first object and a comment on the second object) can include multiple iterations to correct the parameters of the networks.
- for example, the training process of the semantic classification training stage S400 also includes fine-tuning the parameters of the generation network, the first branch network, the first classification network, the second branch network, and the second classification network to obtain more optimized parameters.
- for example, the generative adversarial training stage S300 and the semantic classification training stage S400 are performed alternately and iteratively, wherein the generation network G participates in the training of both stages.
- for example, the generative adversarial training stage S300 can improve the ability of the generation network G to extract common representations, but at the same time the generation network G may also extract words in the first training comment and the second training comment that are not related to semantic classification; the semantic classification training stage S400 enables the generation network G to learn to filter out these words that are not related to semantic classification, thereby helping to improve the accuracy of semantic classification and the operating efficiency of the neural network.
- the neural network training method provided by the embodiments of the present disclosure can train the neural network, wherein the trained generation network G, first branch network SN1, second branch network SN2, first classification network CN1, and second classification network CN2 can be used to implement the functions of the common representation extractor EE0, the first representation extractor EE1, the second representation extractor EE2, the first semantic classifier CC1, and the second semantic classifier CC2 in the aforementioned semantic classification method, so as to perform the aforementioned semantic classification method.
- FIG. 13 is a schematic block diagram of a semantic classification device provided by at least one embodiment of the present disclosure.
- the semantic classification device 500 includes a memory 510 and a processor 520.
- the memory 510 is used for non-transitory storage of computer readable instructions
- the processor 520 is used for running the computer readable instructions.
- the semantic classification method provided by any embodiment of the present disclosure is executed.
- the neural network training method provided in any embodiment of the present disclosure may also be executed.
- the memory 510 and the processor 520 may directly or indirectly communicate with each other.
- components such as the memory 510 and the processor 520 may communicate through a network connection.
- the network may include a wireless network, a wired network, and/or any combination of a wireless network and a wired network.
- the network may include a local area network, the Internet, a telecommunication network, the Internet of Things (Internet of Things) based on the Internet and/or a telecommunication network, and/or any combination of the above networks, and so on.
- the wired network may, for example, use twisted pair, coaxial cable, or optical fiber transmission for communication, and the wireless network may use, for example, a 3G/4G/5G mobile communication network, Bluetooth, Zigbee, or WiFi.
- the present disclosure does not limit the types and functions of the network here.
- the processor 520 may control other components in the semantic classification apparatus to perform desired functions.
- the processor 520 may be a central processing unit (CPU), a tensor processing unit (TPU), a graphics processing unit (GPU), or another device with data processing capabilities and/or program execution capabilities.
- the central processing unit (CPU) can be an X86 or ARM architecture.
- the GPU can be directly integrated on the motherboard alone or built into the north bridge chip of the motherboard.
- the GPU can also be built into the central processing unit (CPU).
- the memory 510 may include any combination of one or more computer program products, and the computer program products may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory.
- Volatile memory may include random access memory (RAM) and/or cache memory (cache), for example.
- Non-volatile memory may include, for example, read only memory (ROM), hard disk, erasable programmable read only memory (EPROM), portable compact disk read only memory (CD-ROM), USB memory, flash memory, etc.
- one or more computer instructions may be stored in the memory 510, and the processor 520 may execute the computer instructions to implement various functions.
- Various application programs and various data can also be stored in the computer-readable storage medium, such as the comment sample set of the first object, the comment sample set of the second object, the first original vector, the second original vector, and application usage and data. / Or various data generated etc.
- one or more steps in the semantic classification method described above may be executed.
- one or more steps in the neural network training method described above may be executed.
- the semantic classification device provided by the embodiments of the present disclosure is exemplary rather than restrictive. According to actual application requirements, the semantic classification device may also include other conventional components or structures, for example, to achieve semantic classification. For the necessary functions of the device, those skilled in the art can set other conventional components or structures according to specific application scenarios, which are not limited in the embodiments of the present disclosure.
- FIG. 14 is a schematic block diagram of a neural network training device provided by at least one embodiment of the present disclosure.
- the neural network training device 500' includes a memory 510' and a processor 520'.
- the memory 510' is used for non-transitory storage of computer-readable instructions
- the processor 520' is used for running the computer-readable instructions
- when the computer-readable instructions are run by the processor 520', the neural network training method provided by any embodiment of the present disclosure is executed.
- the semantic classification method provided in any embodiment of the present disclosure can also be executed.
- the memory 510' and the processor 520' respectively have functions and settings similar to the above-mentioned memory 510 and the processor 520, which have been described in detail above and will not be repeated here.
- FIG. 15 is a schematic diagram of a storage medium provided by an embodiment of the present disclosure.
- the storage medium 600 non-transitory stores computer-readable instructions 601.
- when the computer-readable instructions 601 are executed by a computer, instructions of the semantic classification method provided by any embodiment of the present disclosure or instructions of the neural network training method provided by any embodiment of the present disclosure can be executed. It is also possible to execute the semantic classification method provided by any embodiment of the present disclosure after executing the instructions of the neural network training method provided by any embodiment of the present disclosure.
- one or more computer instructions may be stored on the storage medium 600.
- Some computer instructions stored on the storage medium 600 may be, for example, instructions for implementing one or more steps in the above semantic classification method.
- the other computer instructions stored on the storage medium may be, for example, instructions for implementing one or more steps in the above-mentioned neural network training method.
- the storage medium may include a storage component of a tablet computer, a hard disk of a personal computer, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), compact disc read-only memory (CD-ROM), USB memory, flash memory, any combination of the above storage media, or other suitable storage media.
Claims (24)
- A semantic classification method, comprising: inputting a first comment about a first object; processing the first comment with a common representation extractor to extract a first common representation vector characterizing the common representation in the first comment; processing the first comment with a first representation extractor to extract a first single representation vector characterizing the single representation in the first comment; splicing the first common representation vector and the first single representation vector to obtain a first representation vector; and processing the first representation vector with a first semantic classifier to obtain a semantic classification of the first comment; wherein the common representation includes meaning representations used both for commenting on the first object and for commenting on a second object, the second object is an associated comment object different from the first object, and the single representation of the first comment includes meaning representations used only for commenting on the first object.
- The semantic classification method according to claim 1, further comprising: mapping the first comment to a first original vector; wherein processing the first comment with the common representation extractor comprises: processing the first original vector with the common representation extractor; and processing the first comment with the first representation extractor comprises: processing the first original vector with the first representation extractor.
- The semantic classification method according to claim 2, wherein mapping the first comment to the first original vector comprises: using a word vector algorithm to map each word in the first comment to a vector of a specified length, so as to obtain the first original vector.
- The semantic classification method according to any one of claims 1-3, wherein the common representation extractor and the first representation extractor each include one of a recurrent neural network, a long short-term memory network, and a bidirectional long short-term memory network, and the first semantic classifier includes a softmax classifier.
- The semantic classification method according to any one of claims 1-3, further comprising: inputting a second comment about the second object; processing the second comment with the common representation extractor to extract a second common representation vector characterizing the common representation in the second comment; processing the second comment with a second representation extractor to extract a second single representation vector characterizing the single representation in the second comment; splicing the second common representation vector and the second single representation vector to obtain a second representation vector; and processing the second representation vector with a second semantic classifier to obtain a semantic classification of the second comment; wherein the single representation of the second comment includes meaning representations used only for commenting on the second object.
- The semantic classification method according to claim 5, further comprising: mapping the second comment to a second original vector; wherein processing the second comment with the common representation extractor comprises: processing the second original vector with the common representation extractor; and processing the second comment with the second representation extractor comprises: processing the second original vector with the second representation extractor.
- The semantic classification method according to claim 6, wherein mapping the second comment to the second original vector comprises: using a word vector algorithm to map each word in the second comment to a vector of a specified length, so as to obtain the second original vector.
- The semantic classification method according to any one of claims 5-7, wherein the second representation extractor includes one of a recurrent neural network, a long short-term memory network, and a bidirectional long short-term memory network, and the second semantic classifier includes a softmax classifier.
- The semantic classification method according to any one of claims 5-8, wherein the corpus source of the first comment and the second comment includes at least one of text and speech.
- A neural network training method, the neural network comprising: a generation network, a first branch network, a first classification network, a second branch network, and a second classification network; the training method comprising: a semantic classification training stage; wherein the semantic classification training stage includes: inputting a first training comment about a first object, processing the first training comment with the generation network to extract a first training common representation vector, processing the first training comment with the first branch network to extract a first training single representation vector, splicing the first training common representation vector and the first training single representation vector to obtain a first training representation vector, and processing the first training representation vector with the first classification network to obtain a predicted category identifier of the semantic classification of the first training comment; inputting a second training comment about a second object, processing the second training comment with the generation network to extract a second training common representation vector, processing the second training comment with the second branch network to extract a second training single representation vector, splicing the second training common representation vector and the second training single representation vector to obtain a second training representation vector, and processing the second training representation vector with the second classification network to obtain a predicted category identifier of the semantic classification of the second training comment; calculating a system loss value through a system loss function based on the predicted category identifier of the first training comment and the predicted category identifier of the second training comment; and correcting parameters of the generation network, the first branch network, the first classification network, the second branch network, and the second classification network according to the system loss value; wherein the first object and the second object are associated comment objects.
- The training method according to claim 10, wherein the semantic classification training stage further includes: mapping the first training comment to a first training original vector, and mapping the second training comment to a second training original vector; wherein processing the first training comment with the generation network comprises: processing the first training original vector with the generation network; processing the first training comment with the first branch network comprises: processing the first training original vector with the first branch network; processing the second training comment with the generation network comprises: processing the second training original vector with the generation network; and processing the second training comment with the second branch network comprises: processing the second training original vector with the second branch network.
- The training method according to claim 11, wherein mapping the first training comment to the first training original vector comprises: using a word vector algorithm to map each word in the first training comment to a vector of a specified length, so as to obtain the first training original vector; and mapping the second training comment to the second training original vector comprises: using the word vector algorithm to map each word in the second training comment to a vector of the specified length, so as to obtain the second training original vector.
- The training method according to any one of claims 10-12, wherein the generation network, the first branch network, and the second branch network each include one of a recurrent neural network, a long short-term memory network, and a bidirectional long short-term memory network, and the first classification network and the second classification network each include a softmax classifier.
- The training method according to any one of claims 10-12, wherein the system loss function is expressed as: L obj = λ1·L(Y1, T1) + λ2·L(Y2, T2), where L obj represents the system loss function, L(·, ·) represents the cross-entropy loss function, Y1 represents the predicted category identifier of the first training comment, T1 represents the true category identifier of the first training comment, L(Y1, T1) represents the cross-entropy loss function of the first training comment, λ1 represents the weight of the cross-entropy loss function L(Y1, T1) of the first training comment in the system loss function, Y2 represents the predicted category identifier of the second training comment, T2 represents the true category identifier of the second training comment, L(Y2, T2) represents the cross-entropy loss function of the second training comment, and λ2 represents the weight of the cross-entropy loss function L(Y2, T2) of the second training comment in the system loss function; the cross-entropy loss function L(·, ·) is expressed as: L(Y, T) = -(1/N)·Σ n=1..N Σ k=1..K T nk·log(Y nk).
- The training method according to any one of claims 10-12, wherein the neural network further includes a discriminant network; the training method further comprising: a generative adversarial training stage; and alternately performing the generative adversarial training stage and the semantic classification training stage; wherein the generative adversarial training stage includes: training the discriminant network based on the generation network; training the generation network based on the discriminant network; and alternately performing the above training processes to complete the training of the generative adversarial training stage.
- The training method according to claim 15, wherein training the discriminant network based on the generation network comprises: inputting a third training comment about the first object, processing the third training comment with the generation network to extract a third training common representation vector, and processing the third training common representation vector with the discriminant network to obtain a third training output; inputting a fourth training comment about the second object, processing the fourth training comment with the generation network to extract a fourth training common representation vector, and processing the fourth training common representation vector with the discriminant network to obtain a fourth training output; calculating a discriminant network adversarial loss value through a discriminant network adversarial loss function based on the third training output and the fourth training output; and correcting parameters of the discriminant network according to the discriminant network adversarial loss value.
- The training method according to claim 16, wherein the discriminant network includes a two-class softmax classifier.
- The training method according to any one of claims 15-18, wherein training the generation network based on the discriminant network comprises: inputting a fifth training comment about the first object, processing the fifth training comment with the generation network to extract a fifth training common representation vector, and processing the fifth training common representation vector with the discriminant network to obtain a fifth training output; inputting a sixth training comment about the second object, processing the sixth training comment with the generation network to extract a sixth training common representation vector, and processing the sixth training common representation vector with the discriminant network to obtain a sixth training output; calculating a generation network adversarial loss value through a generation network adversarial loss function based on the fifth training output and the sixth training output; and correcting parameters of the generation network according to the generation network adversarial loss value.
- A semantic classification apparatus, comprising: a memory for storing non-transitory computer-readable instructions; and a processor for running the computer-readable instructions, wherein, when run by the processor, the computer-readable instructions execute the semantic classification method according to any one of claims 1-9.
- A neural network training apparatus, comprising: a memory for storing non-transitory computer-readable instructions; and a processor for running the computer-readable instructions, wherein, when run by the processor, the computer-readable instructions execute the training method according to any one of claims 10-20.
- A storage medium non-transitorily storing computer-readable instructions, wherein, when executed by a computer, the non-transitory computer-readable instructions can execute instructions of the semantic classification method according to any one of claims 1-9.
- A storage medium non-transitorily storing computer-readable instructions, wherein, when executed by a computer, the non-transitory computer-readable instructions can execute instructions of the training method according to any one of claims 10-20.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/418,836 US11934790B2 (en) | 2019-09-09 | 2020-09-07 | Neural network training method and apparatus, semantic classification method and apparatus and medium |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910863457.8 | 2019-09-09 | ||
CN201910863457.8A CN110598786B (zh) | 2019-09-09 | 2019-09-09 | 神经网络的训练方法、语义分类方法、语义分类装置 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021047473A1 true WO2021047473A1 (zh) | 2021-03-18 |
Family
ID=68859161
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/113740 WO2021047473A1 (zh) | 2019-09-09 | 2020-09-07 | 神经网络的训练方法及装置、语义分类方法及装置和介质 |
Country Status (3)
Country | Link |
---|---|
US (1) | US11934790B2 (zh) |
CN (1) | CN110598786B (zh) |
WO (1) | WO2021047473A1 (zh) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110598786B (zh) * | 2019-09-09 | 2022-01-07 | 京东方科技集团股份有限公司 | 神经网络的训练方法、语义分类方法、语义分类装置 |
CN111858923A (zh) * | 2019-12-24 | 2020-10-30 | 北京嘀嘀无限科技发展有限公司 | 一种文本分类方法、系统、装置及存储介质 |
CN112164125B (zh) * | 2020-09-15 | 2022-07-26 | 华南理工大学 | 一种监督可控的人脸多属性分离生成的方法 |
CN117218693A (zh) * | 2022-05-31 | 2023-12-12 | 青岛云天励飞科技有限公司 | 人脸属性预测网络生成方法、人脸属性预测方法及装置 |
CN115618884B (zh) * | 2022-11-16 | 2023-03-10 | 华南师范大学 | 基于多任务学习的言论分析方法、装置以及设备 |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108229582A (zh) * | 2018-02-01 | 2018-06-29 | Zhejiang University | Multi-task named entity recognition adversarial training method for the medical field
CN108664589A (zh) * | 2018-05-08 | 2018-10-16 | Soochow University | Domain adaptation-based text information extraction method, apparatus, system and medium
CN109377448A (zh) * | 2018-05-20 | 2019-02-22 | Beijing University of Technology | Face image inpainting method based on a generative adversarial network
CN109447906A (zh) * | 2018-11-08 | 2019-03-08 | Beijing Institute of Graphic Communication | Image synthesis method based on a generative adversarial network
CN109783812A (zh) * | 2018-12-28 | 2019-05-21 | Institute of Automation, Chinese Academy of Sciences | Chinese named entity recognition method and apparatus based on a self-attention mechanism
US20190259474A1 (en) * | 2018-02-17 | 2019-08-22 | Regeneron Pharmaceuticals, Inc. | Gan-cnn for mhc peptide binding prediction |
CN110188776A (zh) * | 2019-05-30 | 2019-08-30 | BOE Technology Group Co., Ltd. | Image processing method and apparatus, neural network training method, and storage medium
CN110598786A (zh) * | 2019-09-09 | 2019-12-20 | BOE Technology Group Co., Ltd. | Neural network training method, semantic classification method, and semantic classification apparatus
Family Cites Families (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10360507B2 (en) | 2016-09-22 | 2019-07-23 | nference, inc. | Systems, methods, and computer readable media for visualization of semantic information and inference of temporal signals indicating salient associations between life science entities |
US10347244B2 (en) * | 2017-04-21 | 2019-07-09 | Go-Vivace Inc. | Dialogue system incorporating unique speech to text conversion method for meaningful dialogue response |
CN107679217B (zh) * | 2017-10-19 | 2021-12-07 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Method and apparatus for extracting associated content based on data mining
CN107766585B (zh) * | 2017-12-07 | 2020-04-03 | Suzhou Research Institute, Institute of Electronics, Chinese Academy of Sciences | Specific event extraction method for social networks
CN108363753B (zh) * | 2018-01-30 | 2020-05-19 | Nanjing University of Posts and Telecommunications | Comment text sentiment classification model training and sentiment classification method, apparatus and device
CN108763204A (zh) * | 2018-05-21 | 2018-11-06 | Zhejiang University | Multi-level text sentiment feature extraction method and model
CN109544524B (zh) * | 2018-11-15 | 2023-05-23 | Beijing Electronic Science and Technology Institute | Attention mechanism-based multi-attribute image aesthetics evaluation system
CN109740154B (zh) * | 2018-12-26 | 2021-10-26 | Xidian University | Fine-grained sentiment analysis method for online reviews based on multi-task learning
US11748613B2 (en) * | 2019-05-10 | 2023-09-05 | Baidu Usa Llc | Systems and methods for large scale semantic indexing with deep level-wise extreme multi-label learning
CN110222182B (zh) * | 2019-06-06 | 2022-12-27 | Tencent Technology (Shenzhen) Co., Ltd. | Sentence classification method and related device
2019
- 2019-09-09 CN CN201910863457.8A patent/CN110598786B/zh active Active
2020
- 2020-09-07 WO PCT/CN2020/113740 patent/WO2021047473A1/zh active Application Filing
- 2020-09-07 US US17/418,836 patent/US11934790B2/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN110598786B (zh) | 2022-01-07 |
CN110598786A (zh) | 2019-12-20 |
US20220075955A1 (en) | 2022-03-10 |
US11934790B2 (en) | 2024-03-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021047473A1 (zh) | Neural network training method and apparatus, semantic classification method and apparatus, and medium | |
Dharwadkar et al. | A medical chatbot | |
WO2022007823A1 (zh) | Text data processing method and apparatus | |
WO2021233112A1 (zh) | Multimodal machine learning-based translation method, apparatus, device and storage medium | |
CN109684445B (zh) | Colloquial medical question answering method and system | |
KR102424085B1 (ko) | Machine-assisted dialogue system, and apparatus and method for inquiring about a medical condition | |
CN111832312B (zh) | Text processing method, apparatus, device and storage medium | |
Liu et al. | Natural language inference in context-investigating contextual reasoning over long texts | |
CN110322959B (zh) | Knowledge-based deep medical question routing method and system | |
CN112100406A (zh) | Data processing method, apparatus, device and medium | |
CN113707299A (zh) | Auxiliary diagnosis method and apparatus based on consultation sessions, and computer device | |
CN113988013A (zh) | ICD coding method and apparatus based on multi-task learning and a graph attention network | |
CN114648032B (zh) | Training method and apparatus for a semantic understanding model, and computer device | |
CN117238437A (zh) | Knowledge-graph-based disease diagnosis assistance method and system | |
Biswas et al. | Symptom-based disease detection system in bengali using convolution neural network | |
CN111553140A (zh) | Data processing method, data processing device and computer storage medium | |
Shukla et al. | Optimization assisted bidirectional gated recurrent unit for healthcare monitoring system in big-data | |
CN117747087A (zh) | Training method for a medical-consultation large model, and large-model-based consultation method and apparatus | |
JP2022141191A (ja) | Machine learning program, machine learning method, and translation apparatus | |
CN111144134B (zh) | Automated translation engine evaluation system based on OpenKiWi | |
CN112259232A (zh) | Deep learning-based automatic VTE risk assessment system | |
CN115659987B (zh) | Dual-channel-based multimodal named entity recognition method, apparatus and device | |
CN116956934A (zh) | Task processing method, apparatus, device and storage medium | |
Shen et al. | Intelligent recognition of portrait sketch components for child autism assessment | |
Ismael et al. | Chatbot System for Mental Health in Bahasa Malaysia |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 20863899 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 20863899 Country of ref document: EP Kind code of ref document: A1 |
|
32PN | Ep: public notification in the ep bulletin as address of the addressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 14.02.2023) |
|