CN112214595A - Category determination method, device, equipment and medium - Google Patents

Category determination method, device, equipment and medium

Info

Publication number
CN112214595A
CN112214595A (application CN202010849763.9A)
Authority
CN
China
Prior art keywords
classification model
target
training
layer
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010849763.9A
Other languages
Chinese (zh)
Inventor
赖雅玲
黄德荣
吴楠
张彪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Bank Corp
Original Assignee
China Construction Bank Corp
CCB Finetech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Construction Bank Corp, CCB Finetech Co Ltd filed Critical China Construction Bank Corp
Priority to CN202010849763.9A priority Critical patent/CN112214595A/en
Publication of CN112214595A publication Critical patent/CN112214595A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367Ontology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/284Lexical analysis, e.g. tokenisation or collocates
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/049Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Animal Behavior & Ethology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a category determination method, apparatus, device and medium. The method includes: acquiring a target sentence; determining the target vocabulary corresponding to the target sentence, processing the target vocabulary with a classification model obtained by pre-training, and determining the classification category of the target sentence. The classification model consists of a two-layer long short-term memory (LSTM) network layer, a fully connected layer and an activation function layer, and is used to determine the category of each sentence. According to the technical solution of the embodiments of the invention, a classification model for processing each target sentence is obtained by training a pre-constructed model; the classification label of a target sentence can then be determined by the model, yielding the label category corresponding to the sentence.

Description

Category determination method, device, equipment and medium
Technical Field
The embodiment of the invention relates to the technical field of computers, in particular to a method, a device, equipment and a medium for determining categories.
Background
At present, when a user submits a consultation question, the category of the question must be determined manually before the corresponding label category can be assigned.
However, determining the label category and the corresponding solution for each sentence in this manner suffers from high labor cost and low accuracy.
Disclosure of Invention
The invention provides a category determination method, apparatus, device and storage medium that determine the classification category corresponding to a target sentence accurately and conveniently.
In a first aspect, an embodiment of the present invention provides a category determination method, applied to knowledge-graph question answering, including:
acquiring a target sentence;
determining a target vocabulary corresponding to the target sentence, processing the target vocabulary with a classification model obtained by pre-training, and determining the classification category of the target sentence;
wherein the classification model consists of a two-layer long short-term memory (LSTM) network layer, a fully connected layer and an activation function layer, and is used to determine the category of each sentence.
In a second aspect, an embodiment of the present invention further provides a category determination apparatus, configured for knowledge-graph question answering, including:
a target sentence acquisition module, used to acquire a target sentence;
a category determination module, used to determine a target vocabulary corresponding to the target sentence, process the target vocabulary with a classification model obtained by pre-training, and determine the classification category of the target sentence;
wherein the classification model consists of a two-layer long short-term memory (LSTM) network layer, a fully connected layer and an activation function layer, and is used to determine the category of each sentence.
In a third aspect, an embodiment of the present invention further provides an apparatus, where the apparatus includes:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the category determination method of any embodiment of the present invention.
In a fourth aspect, embodiments of the present invention further provide a storage medium containing computer-executable instructions, which when executed by a computer processor, are configured to perform the category determination method according to any one of the embodiments of the present invention.
According to the technical solution of the embodiments of the invention, a classification model for processing each target sentence is obtained by training a pre-constructed model; the classification label of a target sentence can be determined by the model, yielding the label category corresponding to the sentence and improving the data-processing effect.
Drawings
In order to more clearly illustrate the technical solutions of the exemplary embodiments of the present invention, the drawings used in describing the embodiments are briefly introduced below. The described figures cover only some of the embodiments of the invention; a person skilled in the art can derive other figures from them without inventive effort.
Fig. 1 is a schematic flowchart of a category determination method according to an embodiment of the present invention;
FIG. 2 is a diagram of a constructed classification model according to a second embodiment of the present invention;
FIG. 3 is a schematic flowchart of training a classification model according to a second embodiment of the present invention;
fig. 4 is a schematic diagram of a long short-term memory network according to a second embodiment of the present invention;
fig. 5 is a schematic diagram of the forget gate in a long short-term memory network according to a second embodiment of the present invention;
fig. 6 is a schematic diagram of the input gate in a long short-term memory network according to a second embodiment of the present invention;
FIG. 7 is a schematic diagram of the memory cell in a long short-term memory network according to a second embodiment of the present invention;
fig. 8 is a schematic diagram of the output gate in a long short-term memory network according to a second embodiment of the present invention;
fig. 9 is a flowchart illustrating a category determination method according to a second embodiment of the present invention;
fig. 10 is a schematic structural diagram of a category determining apparatus according to a third embodiment of the present invention;
fig. 11 is a schematic structural diagram of an apparatus according to a fourth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
Example one
Fig. 1 is a schematic flow chart of a category determination method according to an embodiment of the present invention. The embodiment is applicable to determining the category tag corresponding to a sentence in a specific application scenario; optionally, the scenario may be a knowledge-graph question-answering system. The method may be executed by a category determination device, which may be implemented in software and/or hardware. The hardware may be an electronic device such as a PC or a mobile terminal.
It should be noted that the category determination method provided by this embodiment differs from the prior art in that it is applied to knowledge-graph question answering. Conventional knowledge-graph question-answering systems based on keyword matching suffer from synonym and keyword-matching ambiguity; the category determination method provided by this embodiment can be used to solve that problem.
As shown in fig. 1, the method of this embodiment includes:
and S110, acquiring a target statement.
If the category label corresponding to a certain statement needs to be determined, the statement can be used as a target statement. Illustratively, the target sentence may be a label corresponding to the XX building project land use life, or the like.
After the target sentence is acquired, the method further includes: dividing the target sentence into at least one to-be-processed word with a preset word segmentation tool; and determining the target vocabulary corresponding to the target sentence from the to-be-processed words.
It should be noted that the target sentence may be divided into to-be-processed words based on a word segmentation dictionary, or with the jieba segmentation tool. The words that remain after auxiliary words, stop words and the like are removed from the to-be-processed words are taken as the target vocabulary. That is, after the target sentence is segmented, auxiliary words, stop words and the like are removed from the segmented words, and the remaining words are the target vocabulary corresponding to the target sentence.
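As a minimal sketch of this step (the token list, which stands in for the output of a segmentation tool such as jieba, and the stop-word set are illustrative assumptions, not taken from the patent), segmentation followed by stop-word removal might look like:

```python
# Sketch: derive the target vocabulary from a segmented sentence by
# filtering out auxiliary words and stop words. The stop-word set is a
# hypothetical example.
STOP_WORDS = {"的", "了", "是", "有", "吗", "?", "？"}

def target_vocabulary(tokens):
    """Remove auxiliary words / stop words, keeping token order."""
    return [t for t in tokens if t not in STOP_WORDS]

# e.g. a segmented consultation query about loan accounts
tokens = ["小额", "贷款", "账户", "有", "多少", "个", "？"]
print(target_vocabulary(tokens))  # → ['小额', '贷款', '账户', '多少', '个']
```

The remaining words are the target vocabulary that is subsequently vectorized for the classification model.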
It should be noted that, since the classification model processes input vectors, after the target vocabulary is obtained the target word vector corresponding to the target vocabulary must also be determined, so that the classification model can determine the classification identifier corresponding to the target word vector.
Optionally, the target word frequency corresponding to each target word is determined from a predetermined word-frequency dictionary; an initialization word vector of the target vocabulary is determined based on the target word frequencies, and the target word vector input to the classification model is determined based on the initialization word vector.
The words in the word frequency dictionary and the word frequencies corresponding to the words are related to specific application scenes. For example, the application scene is a real estate application scene, the words in the word frequency dictionary are words corresponding to real estate, and the word frequency corresponding to the words is determined according to the occurrence frequency of each word.
It should be noted that the numbers of target words corresponding to different sentences differ, and correspondingly so do the word-vector dimensions. To ensure that the classification model can process the word vectors, the word-vector dimension may be fixed in advance. For example, the word vector corresponding to each sentence may be fixed at 1 × 10, and three flags may be reserved at the head of the dictionary: [PAD], [UNK] and [CODE], denoting the padding word, the unknown word and the numbered word respectively. The dictionary is used for text vectorization.
Specifically, after the target vocabulary corresponding to the target sentence is determined, the word frequency corresponding to each target vocabulary may be determined according to a predetermined word frequency dictionary. According to the target word frequency corresponding to each target word, an initialization word vector corresponding to the target word can be determined, so that the target word vector input to the classification model can be determined according to the initialization word vector.
In this embodiment, the target word vector of the target vocabulary is determined from the initialization word vector, either by a word embedding layer or by a mapping word-vector submodel in the classification model. Optionally, the initialization word vector is processed into a target word vector containing sentence information by the word embedding layer or the mapping word-vector submodel.
The word embedding layer maps each sentence into a target word vector containing semantic information. The mapping word-vector submodel is a layer in the classification model that processes the initialization word vector into the corresponding target word vector.
S120, determining a target vocabulary corresponding to the target sentence, processing the target vocabulary with a classification model obtained by pre-training, and determining the classification category of the target sentence.
In this embodiment, the target sentence may be divided into at least one word, and the obtained words are taken as the target vocabulary. The classification model consists of a two-layer long short-term memory network, a fully connected layer and an activation function layer; the constructed model is trained to obtain a classification model that processes the target vocabulary and determines the corresponding classification category. The classification categories may be individual, cooperation agreement, enterprise, building, building project, and so on.
Specifically, the target sentence may be divided into at least one target word, the target vocabulary is input into the pre-trained classification model, and the classification category corresponding to the target sentence is determined from the model output.
In order to obtain a classification model for processing target sentences, before the model is trained the method further includes: constructing the classification model.
Optionally, constructing the classification model includes: a word embedding layer (or mapping word-vector submodel), a two-layer long short-term memory network layer, a fully connected layer and an activation function layer. The initial word vector corresponding to the target sentence is input into the word embedding layer (or word-vector submodel), and its output is taken as the input of the two-layer LSTM layer; the output of the LSTM layer is input to the fully connected layer; and the output of the fully connected layer is input to the activation function layer, whose output is taken as the classification category corresponding to the target sentence.
It is understood that in the present embodiment the classification model must be constructed in advance. The classification model comprises a word embedding layer, a two-layer long short-term memory network layer, a fully connected layer and an activation function layer. The word embedding layer is the first input layer of the model; it receives the initial word vector corresponding to the target sentence and processes it into the target word vector. The output of the word embedding layer is the input of the two-layer LSTM layer, whose output is input to the fully connected layer; the output of the fully connected layer feeds the activation function layer, completing the classification model.
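The layer stack just described can be sketched as follows. This is an illustrative PyTorch implementation under the layer dimensions quoted in the second embodiment (embedding_size = 128, hidden_size = 256, target_size = 5); the vocabulary size and class names are assumptions, and the patent does not specify an implementation framework.

```python
import torch
import torch.nn as nn

class SentenceClassifier(nn.Module):
    """Word embedding -> two-layer LSTM -> fully connected -> activation."""
    def __init__(self, vocab_size, embedding_size=128, hidden_size=256, target_size=5):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_size, padding_idx=0)
        self.lstm = nn.LSTM(embedding_size, hidden_size,
                            num_layers=2, batch_first=True)
        self.fc = nn.Linear(hidden_size, target_size)
        self.activation = nn.Softmax(dim=-1)  # converts outputs to 0-1 probabilities

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)   # (batch, seq_len, 128)
        _, (h_n, _) = self.lstm(embedded)      # h_n: (num_layers, batch, 256)
        logits = self.fc(h_n[-1])              # final hidden state of the top LSTM layer
        return self.activation(logits)         # (batch, 5) class probabilities

model = SentenceClassifier(vocab_size=1000)
probs = model(torch.tensor([[2, 5, 13, 1, 0, 0]]))  # one padded index sequence
print(probs.shape)
```

Taking the last hidden state of the top LSTM layer as the sentence representation is one common design choice; the patent's figures may differ in detail.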
After the classification model is built, it may be trained. Training the classification model includes: acquiring a plurality of training samples, each comprising a training sentence and the label category corresponding to it, where the label category is determined from a predetermined mapping table; vectorizing the text of each training sample to obtain initial text vectors of uniform input length; inputting the training samples into the pre-constructed classification model to be trained to obtain training results; correcting the loss function of the model based on the training results and the sample labels, with convergence of the loss function as the training target, to obtain the to-be-used classification model; and verifying the to-be-used model on verification sample data, taking it as the classification model when the verification result satisfies a preset condition.
To improve the accuracy of the model, as much training sample data as possible should be acquired. For each training sample, its text is vectorized to obtain an initial text vector of uniform input length. It should be noted that the number of words differs between training samples, and so therefore does the raw vector dimension; to avoid this problem, a common dimension covering every training sample is determined and used as the input dimension when training the classification model.
Before training, the model parameters of the classification model to be trained may be set to default values, so that they are corrected during training on the training sample data. The training sample data comprises an entity-classification training set and an attribute-classification training set.
Specifically, training samples are input into the classification model to be trained to obtain output values; from the standard values in the training samples and the training outputs, a loss value is computed, and the model parameters are updated accordingly. The training error of the loss function, i.e. the loss value, may serve as the condition for detecting convergence: for example, whether the training error is smaller than a preset error, whether the error trend has stabilized, or whether the current iteration count has reached a preset number. If a convergence condition is met, for example the training error is smaller than the preset error or the error change has stabilized, training of the model is complete and iteration may stop. If not, further sample data is acquired and training continues until the training error falls within the preset range. Once the loss function has converged, the classification model to be trained can be taken as the classification model.
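The stopping rule just described, stop when the error drops below a threshold, when the error trend stabilizes, or when an iteration budget is exhausted, can be sketched as a small helper. The threshold, window and budget values are illustrative assumptions, not taken from the patent.

```python
def converged(errors, eps=1e-3, window=5, stable_tol=1e-4, max_iters=1000):
    """Return True when training of the classification model may stop.

    errors: training errors of the loss function, one per iteration.
    Stops when the latest error is below eps, when the error has varied
    by less than stable_tol over the last `window` iterations (trend has
    stabilized), or when the iteration budget max_iters is exhausted.
    """
    if not errors:
        return False
    if errors[-1] < eps:
        return True
    recent = errors[-window:]
    if len(errors) >= window and max(recent) - min(recent) < stable_tol:
        return True
    return len(errors) >= max_iters

print(converged([0.5, 0.1, 0.0005]))  # → True (error below threshold)
print(converged([0.5, 0.2]))          # → False (still improving)
```

In practice this check would be called once per training iteration, with further sample data acquired whenever it returns False.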
On the basis of the above technical solution, verifying the to-be-used classification model on verification sample data, and taking it as the classification model when the verification result satisfies a preset condition, includes: acquiring a preset amount of verification sample data; inputting the verification samples into the to-be-used classification model and determining its accuracy from its outputs; when the accuracy reaches a preset accuracy threshold, taking the to-be-used model as the classification model; and if the accuracy does not reach the threshold, acquiring further training sample data and continuing to train the model until its accuracy reaches the preset threshold.
On the basis of the above technical solution, to improve the accuracy of the classification model, as much verification sample data as possible may be acquired to verify the to-be-used classification model obtained by training. The verification samples are input into the to-be-used model, which outputs a verification result for each; the accuracy is determined by comparing these outputs with the labels in the verification samples. If the accuracy reaches the preset accuracy threshold, the to-be-used model is taken as the usable classification model; if it is below the threshold, the accuracy of the to-be-used model is insufficient and it must be trained further. Optionally, training sample data is acquired continuously and the model is trained on it until its accuracy is detected to reach the preset threshold, at which point the to-be-used model is taken as the classification model.
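The accept-or-keep-training decision can be sketched as below; the threshold value and the label arrays are illustrative assumptions (the patent does not state a concrete accuracy threshold).

```python
def accuracy(predictions, labels):
    """Fraction of verification samples the model classifies correctly."""
    assert len(predictions) == len(labels) and labels
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)

def model_ready(predictions, labels, threshold=0.95):
    """True when the to-be-used model may be adopted as the classification model."""
    return accuracy(predictions, labels) >= threshold

preds  = [0, 1, 2, 3, 4, 0, 1, 2, 3, 4]   # model outputs on verification data
labels = [0, 1, 2, 3, 4, 0, 1, 2, 3, 0]   # ground-truth label categories
print(accuracy(preds, labels))             # → 0.9
print(model_ready(preds, labels))          # → False: keep training
```

When model_ready returns False, more training sample data is acquired and training resumes, mirroring the loop described above.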
According to the technical solution of this embodiment, a classification model for processing each target sentence is obtained by training the pre-constructed model; the classification label of the target sentence is determined by the model, yielding the corresponding label category, improving the data-processing effect, and achieving the technical effect of processing input target sentences rapidly.
Example two
As a preferred implementation of the foregoing embodiment, it should be noted that the classification model provided here comprises a word embedding layer, a two-layer long short-term memory network, a fully connected layer and an activation function layer. To explain the technical solution of this embodiment in further detail, a schematic diagram of the classification model and its training method are introduced. As shown in fig. 2, from input to output the classification model comprises a word embedding layer, a two-layer long short-term memory network (LSTM), a fully connected layer and an activation function layer.
After the classification model is constructed, it may be trained; the training flow of this embodiment is shown in fig. 3.
S310, obtaining training sample data.
Specifically, a training sample data set and data labels are acquired. The training sample data set is collected and labeled manually. Illustratively, 3000 data items covering five classifications may be acquired. Of course, to improve the accuracy of the model, as much training sample data as possible may be acquired.
S320, labeling the entities and attribute categories of the training data.
Each training sample comprises a question and a label, i.e. a sentence and its category. The question is one that may be covered by the real-estate knowledge graph, and may concern personal information, cooperation-agreement information, enterprise information, building information, building-project information, and so on. The label may be the knowledge-graph node involved in the question. Optionally, individual corresponds to label 0, cooperation agreement to 1, enterprise to 2, building to 3, and building project to 4. The labeled data comprises two parts: one labels the entity category in the question, the other the attribute category in the question.
S330, segmenting the data with jieba, vectorizing the text of each segmentation result, and determining a uniform input length.
After the training sample data is acquired, the sentence of each training item is divided into at least one word with jieba segmentation, and auxiliary words, stop words and the like are removed. The occurrence count of each word in the training sample data is tallied, and the words are sorted from high to low frequency to form the word-frequency dictionary. To ensure that the text-vector dimensions of the training sentences are uniform, three flags may be reserved at the head of the dictionary: [PAD], [UNK] and [CODE], denoting the padding word, the unknown word and the numbered word respectively. The purpose of the word-frequency dictionary is to determine the vector corresponding to each word in the training sample data.
It should be noted that, to enrich the vocabulary of the word-frequency dictionary, as much training sample data as possible may be acquired, and the dictionary then determined from it.
Based on the word-frequency dictionary, the text vector corresponding to each word can be determined.
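Building the word-frequency dictionary with its three reserved flags can be sketched as follows; the tiny corpus is an illustrative assumption standing in for the segmented training sentences.

```python
from collections import Counter

RESERVED = ["[PAD]", "[UNK]", "[CODE]"]  # padding, unknown, numbered word

def build_word_freq_dict(segmented_sentences):
    """Map each word to an index: the three reserved flags occupy the
    head of the dictionary, then words sorted from high to low frequency."""
    counts = Counter(w for sent in segmented_sentences for w in sent)
    ordered = [w for w, _ in counts.most_common()]  # high-to-low frequency
    return {w: i for i, w in enumerate(RESERVED + ordered)}

corpus = [["贷款", "账户"], ["贷款", "期限"], ["贷款", "账户", "余额"]]
d = build_word_freq_dict(corpus)
print(d["[PAD]"], d["[UNK]"], d["[CODE]"])  # → 0 1 2
print(d["贷款"])                             # → 3 (most frequent word)
```

The resulting index of each word serves as its entry in the initialization vector of any sentence containing it.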
It should be noted that, since the training data consists of sentences, each training sentence must be converted into a representation the computer can process.
Specifically, based on the word-frequency dictionary, the word indices corresponding to a training sentence can be determined. To keep the lengths of the training sentences input to the classification model consistent, the maximum input length among the training sentences is taken, and sentences whose word count falls short of it are completed with the predetermined flag. Moreover, if a word does not exist in the word-frequency dictionary, it is replaced with the index of [UNK] in the dictionary, and if the word is a string of digits, it is replaced with the index of [CODE].
Illustratively, take the training sentence "How many small-loan accounts are there?". The jieba segmentation result is [how many small loan accounts]; replacing each word with its dictionary index forms the array [2, 5, 13, 1]. Assuming the maximum query length is 6, the array is right-padded with the index corresponding to [PAD] (assumed to be 0) to give [2, 5, 13, 1, 0, 0]. This initialization vector of the training sentence can then be input into the classification model to obtain a word vector containing semantic information.
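The index-replacement and padding just walked through can be sketched as below. The dictionary entries and tokens are hypothetical, chosen only so the example reproduces the indices [2, 5, 13, 1, 0, 0]; the reserved indices 0/1/2 for [PAD]/[UNK]/[CODE] are as assumed in the worked example.

```python
PAD, UNK, CODE = 0, 1, 2  # reserved dictionary indices assumed from the example

def encode(tokens, word_index, max_len=6):
    """Replace tokens with dictionary indices: unknown words map to UNK,
    digit strings to CODE; then right-pad with PAD to max_len."""
    ids = []
    for t in tokens:
        if t.isdigit():
            ids.append(CODE)            # a string of digits -> numbered word
        else:
            ids.append(word_index.get(t, UNK))
    return (ids + [PAD] * max_len)[:max_len]

word_index = {"贷款": 5, "账户": 13}    # hypothetical dictionary fragment
print(encode(["2019", "贷款", "账户", "多少"], word_index))  # → [2, 5, 13, 1, 0, 0]
```

The fixed-length array is the initialization vector fed to the classification model's embedding layer.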
And S340, training the constructed classification model based on the training sample data to obtain the classification model to be used.
That is, based on steps S310 to S330, an initialization word vector corresponding to each training sample data may be determined. The constructed classification model may be trained based on the initialization word vectors.
The word embedding layer is configured to map the input vector into a word vector containing semantic information, that is, the word vector of the semantic information corresponding to each training sample data may be determined; the dimension of the word embedding layer may be embedding_size = 128. The double-layer long short-term memory network, i.e. a double-layer LSTM, is used for learning the features of the input vectors and performing classification, with a hidden layer dimension hidden_size = 256. The full connection layer converges the rich feature representation of the network into the number of output target classes, target_size = 5. The activation function layer converts the output information into a probability value between 0 and 1.
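The dimension flow through these layers can be sketched framework-agnostically (pure Python; the shapes follow the hyperparameters above, and the softmax shown is one common choice for the activation function layer when target_size = 5):

```python
import math
import random

EMBEDDING_SIZE, HIDDEN_SIZE, TARGET_SIZE = 128, 256, 5

def softmax(logits):
    """Activation function layer: convert logits into probabilities in (0, 1)."""
    m = max(logits)                      # subtract max for numeric stability
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

# Dimension flow for one token sequence of length 6:
#   word embedding:    (6,) subscripts      -> (6, 128) word vectors
#   double-layer LSTM: (6, 128)             -> final hidden state (256,)
#   full connection:   (256,)               -> logits (5,)
#   activation layer:  (5,)                 -> class probabilities summing to 1
random.seed(0)
logits = [random.uniform(-1.0, 1.0) for _ in range(TARGET_SIZE)]
probs = softmax(logits)
```

The class with the largest probability is then taken as the predicted category of the sentence.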
A Recurrent Neural Network (RNN) is a type of neural network used to process sequence data. Common sequence data such as speech and text need to be processed with dependence on time and memory. The long short-term memory network (LSTM) is a variant of the RNN: the plain RNN has only short-term memory because of the vanishing gradient, while the LSTM combines short-term and long-term memory through elaborate gate control and alleviates the vanishing-gradient problem to a certain extent.
The standard recurrent neural network has only a simple layer structure inside, while the LSTM has four interacting layers inside, as shown in fig. 4. The specific internal structure is briefly introduced as follows:
The first step, see fig. 5, is the forget gate, which decides what information to discard from the cell state. The gate reads h_{t-1} and x_t and outputs a value between 0 and 1 for each number in the cell state C_{t-1}. The second step, see fig. 6, determines what new information is stored in the cell state: (1) a sigmoid layer, called the "input gate layer", decides which values will be updated; (2) a tanh layer creates a vector of new candidate values, C̃_t, that may be added to the state. These two pieces of information together produce the update to the state. The third step, see fig. 7, updates the old cell state C_{t-1} to C_t: the old state is multiplied by f_t, discarding the information determined to be discarded, and then i_t · C̃_t, the new candidate values scaled by how much each state value is to be updated, is added. The fourth step, see fig. 8, determines what value to output: (1) a sigmoid layer decides which parts of the cell state will be output; (2) the cell state is processed by tanh (yielding a value between -1 and 1) and multiplied by the output of the sigmoid gate, so that only the determined parts are output. The LSTM model thus realizes the long-term memory function through the gating network of the input gate, the forget gate and the output gate.
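The four gating steps described above correspond to the standard textbook LSTM equations, stated here for reference (σ denotes the sigmoid function and ⊙ element-wise multiplication):

```latex
\begin{aligned}
f_t &= \sigma(W_f \cdot [h_{t-1}, x_t] + b_f) && \text{(forget gate)} \\
i_t &= \sigma(W_i \cdot [h_{t-1}, x_t] + b_i) && \text{(input gate)} \\
\tilde{C}_t &= \tanh(W_C \cdot [h_{t-1}, x_t] + b_C) && \text{(candidate values)} \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t && \text{(cell state update)} \\
o_t &= \sigma(W_o \cdot [h_{t-1}, x_t] + b_o) && \text{(output gate)} \\
h_t &= o_t \odot \tanh(C_t) && \text{(hidden state output)}
\end{aligned}
```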
Specifically, training sample data is obtained to train the constructed classification model, and when convergence of the loss function of the classification model being trained is detected, the obtained trained classification model can be used as the classification model to be used.
And S350, verifying the classification model to be used based on the verification sample data, and taking the classification model to be used as the classification model when the verification result meets the preset condition.
Specifically, verification sample data is obtained and processed into initialization vectors in the same way as the training sample data. The verification sample data is input into the classification model to be used, and when the verification result meets the preset condition, the classification model to be used is taken as the classification model.
Of course, if the accuracy of the classification model to be used is not within the preset accuracy range, training sample data can continue to be acquired to train the classification model to be used until its accuracy falls within the preset accuracy range.
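The verification step can be sketched as follows (pure Python; `classify` below is a toy stand-in for the classification model to be used, and the 90% threshold is an assumed value rather than one taken from this description):

```python
ACCURACY_THRESHOLD = 0.9  # assumed preset accuracy threshold

def accuracy(model, samples):
    """Fraction of verification samples whose predicted label matches."""
    correct = sum(1 for vec, label in samples if model(vec) == label)
    return correct / len(samples)

def verify(model, samples, threshold=ACCURACY_THRESHOLD):
    """True when the model to be used may serve as the classification model."""
    return accuracy(model, samples) >= threshold

# Toy stand-in model and verification set.
classify = lambda vec: 1 if sum(vec) > 0 else 0
samples = [([1, 2], 1), ([-3, 1], 0), ([0, 5], 1), ([-1, -1], 0)]
print(verify(classify, samples))  # → True (accuracy 1.0)
```

When `verify` returns False, more training sample data would be acquired and training continued, as described above.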
Fig. 9 is a flowchart illustrating a category determining method according to a second embodiment of the present invention. As a preferred embodiment of the above embodiment, as shown in fig. 9, the method includes:
it should be noted that the category determining method provided by the embodiment of the present invention may be applied to a knowledge-graph question-answering system, which may be a tree-shaped graph, and based on the category determining method provided by the embodiment, the technical effect of accurately determining the matching result from the knowledge graph may be achieved.
And S410, acquiring a target statement.
Specifically, when a tag corresponding to a certain statement needs to be determined, the statement may be used as a target statement.
And S420, processing the target sentence based on the deep learning retrieval sentence classification model to obtain a classification identifier corresponding to the target sentence.
Natural language processing and deep learning technology are used to perform semantic analysis on the retrieval sentence: by constructing a recurrent neural network, the entities and attributes of the retrieval sentence are extracted, the knowledge graph nodes to which the entities and attributes belong are predicted, and these nodes are provided to a result filtering module, so that ElasticSearch can accurately match the node positions corresponding to the entities and attributes of the retrieval sentence, reducing the uncertainty of fuzzy matching and improving the retrieval accuracy.
The deep learning retrieval sentence classification model is the classification model mentioned in the above embodiments.
Specifically, the target sentence is input into the deep learning retrieval sentence classification model, and the target sentence is processed based on the classification model, so that the classification mark corresponding to the target sentence can be obtained.
And S430, knowledge display.
It is to be understood that, after determining the classification identifier corresponding to the target sentence, the classification identifier may be displayed at a preset position.
According to the technical scheme of the embodiment of the invention, the classification model for processing each target sentence can be obtained by training the pre-constructed classification model, the classification label of the target sentence can be determined based on the classification model, and the label category corresponding to the target sentence is further obtained.
EXAMPLE III
Fig. 10 is a schematic structural diagram of a category determining apparatus according to a third embodiment of the present invention, where the apparatus includes: a target sentence acquisition module 510 and a category determination module 520.
The target statement acquisition module is used for acquiring a target statement; the category determination module is used for determining a target vocabulary corresponding to the target statement, processing the target vocabulary based on a classification model obtained by pre-training, and determining the classification category of the target statement; the classification model is composed of a double-layer long short-term memory network layer, a full connection layer and an activation function layer, and is used for determining the category of each statement.
On the basis of the above technical solutions, after the target sentence acquisition module acquires the target sentence, the target sentence acquisition module is further configured to:
dividing the target sentence into at least one vocabulary to be processed based on a preset word segmentation tool;
and determining a target word of the target sentence according to the at least one word to be processed.
On the basis of the above technical solutions, the apparatus further includes:
determining target word frequency corresponding to each target vocabulary according to a predetermined word frequency dictionary;
and determining an initialization word vector of the target vocabulary based on the target word frequency so as to determine a target word vector input to the classification model based on the initialization word vector.
On the basis of the above technical solutions, the classification model further includes a word embedding layer or a mapping word vector submodel, and the determining a target word vector input into the classification model based on the initialization word vector includes:
and processing the initialization word vector into a target word vector containing semantic information based on the word embedding layer or the mapping word vector submodel.
On the basis of the above technical solutions, the method further comprises: constructing the classification model. Constructing the classification model includes: the word embedding layer or the mapping word vector submodel, the double-layer long short-term memory network layer, the full connection layer and the activation function layer;
inputting an initial word vector corresponding to a target sentence into the word embedding layer or the mapping word vector submodel, and taking the output result as the input of the double-layer long short-term memory network layer;
taking the output of the double-layer long short-term memory network layer as the input of the full connection layer, and inputting it to the full connection layer;
and taking the output result of the full connection layer as the input of the activation function layer, inputting it into the activation function layer, and taking the output result of the activation function layer as the classification category corresponding to the target statement.
On the basis of the above technical solutions, the apparatus is further configured to train the classification model; the training of the classification model includes:
obtaining a plurality of training sample data, wherein the training sample data comprises training sentences and label types corresponding to the training sentences; wherein the label category is determined based on a predetermined mapping relation table;
aiming at each training sample data, performing text vectorization on the training sample data to obtain initial text vectors with uniform input lengths;
inputting the training sample data into a pre-constructed classification model to be trained to obtain a training result;
correcting a loss function in the classification model to be trained based on a training result and a sample result of the training sample data, and training to obtain the classification model to be used by taking the convergence of the loss function as a training target;
and verifying the classification model to be used based on verification sample data, and taking the classification model to be used as the classification model when a verification result meets a preset condition.
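The loss-convergence training loop described above can be sketched on a toy problem (pure Python; a single logistic unit stands in for the classification model, and the learning rate and convergence tolerance are assumed values):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, lr=0.5, tol=1e-4, max_epochs=1000):
    """Minimize cross-entropy loss by gradient descent; stop when the
    loss change between epochs falls below tol (loss convergence)."""
    w, b = 0.0, 0.0
    prev_loss = float("inf")
    for _ in range(max_epochs):
        loss = 0.0
        for x, y in samples:
            p = sigmoid(w * x + b)
            p = min(max(p, 1e-12), 1.0 - 1e-12)   # numeric safety for the log
            loss += -(y * math.log(p) + (1 - y) * math.log(1 - p))
            grad = p - y                          # dLoss/dlogit
            w -= lr * grad * x
            b -= lr * grad
        if abs(prev_loss - loss) < tol:           # training target: convergence
            break
        prev_loss = loss
    return w, b

# Toy labelled samples: negative inputs are class 0, positive are class 1.
samples = [(-2, 0), (-1, 0), (1, 1), (2, 1)]
w, b = train(samples)
```

In the method itself, the model being trained is the embedding/double-layer-LSTM/full-connection network rather than this single unit, but the convergence criterion plays the same role.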
On the basis of the above technical solutions, the verifying the classification model to be used based on verification sample data, and when a verification result satisfies a preset condition, using the classification model to be used as a classification model includes:
acquiring a preset amount of verification sample data;
inputting the verification sample data into the classification model to be used, and determining the accuracy of the classification model to be used based on the output result of the classification model to be used;
when the accuracy reaches a preset accuracy threshold, taking the classification model to be used as a classification model;
if the accuracy rate does not reach the accuracy rate threshold value, obtaining training sample data to continue training the classification model to be used until the accuracy rate of the classification model to be used reaches the preset accuracy rate threshold value.
According to the technical scheme of the embodiment of the invention, the classification model for processing each target sentence can be obtained by training the pre-constructed classification model, the classification label of the target sentence can be determined based on the classification model, and the label category corresponding to the target sentence is further obtained.
The category determining device provided by the embodiment of the invention can execute the category determining method provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the executing method.
It should be noted that, the units and modules included in the apparatus are merely divided according to functional logic, but are not limited to the above division as long as the corresponding functions can be implemented; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the embodiment of the invention.
EXAMPLE III
Fig. 11 is a schematic structural diagram of an apparatus according to a third embodiment of the present invention. FIG. 11 illustrates a block diagram of an exemplary device 60 suitable for use in implementing embodiments of the present invention. The device 60 shown in fig. 11 is only an example and should not bring any limitation to the function and scope of use of the embodiments of the present invention.
As shown in FIG. 11, device 60 is embodied in a general purpose computing device. The components of the device 60 may include, but are not limited to: one or more processors or processing units 601, a system memory 602, and a bus 603 that couples various system components including the system memory 602 and the processing unit 601.
Bus 603 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, Industry Standard Architecture (ISA) bus, micro-channel architecture (MAC) bus, enhanced ISA bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus.
Device 60 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by device 60 and includes both volatile and nonvolatile media, removable and non-removable media.
The system memory 602 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM)604 and/or cache memory 605. The device 60 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 606 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 11, and commonly referred to as a "hard drive"). Although not shown in FIG. 11, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to the bus 603 by one or more data media interfaces. Memory 602 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the invention.
A program/utility 608 having a set (at least one) of program modules 607 may be stored, for example, in memory 602, such program modules 607 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each of which examples or some combination thereof may comprise an implementation of a network environment. The program modules 607 generally perform the functions and/or methods of the described embodiments of the invention.
Device 60 may also communicate with one or more external devices 609 (e.g., keyboard, pointing device, display 610, etc.), with one or more devices that enable a user to interact with device 60, and/or with any devices (e.g., network card, modem, etc.) that enable device 60 to communicate with one or more other computing devices. Such communication may occur via an input/output (I/O) interface 611. Also, device 60 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the internet) via network adapter 612. As shown, a network adapter 612 communicates with the other modules of device 60 via bus 603. It should be appreciated that although not shown in FIG. 11, other hardware and/or software modules may be used in conjunction with device 60, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
The processing unit 601 executes various functional applications and data processing by executing programs stored in the system memory 602, for example, to implement the category determination method provided by the embodiment of the present invention.
Example four
A storage medium containing computer-executable instructions for performing a method for category determination when executed by a computer processor is also provided in a fourth embodiment of the present invention.
The method comprises the following steps:
acquiring a target statement;
determining a target vocabulary corresponding to the target statement, processing the target vocabulary based on a classification model obtained by pre-training, and determining the classification category of the target statement;
the classification model is composed of a double-layer long short-term memory network layer, a full connection layer and an activation function layer and is used for determining the category of each statement.
Computer storage media for embodiments of the invention may employ any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for embodiments of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C + + or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (11)

1. A category determination method applied to knowledge-graph questions and answers comprises the following steps:
acquiring a target statement;
determining a target vocabulary corresponding to the target statement, processing the target vocabulary based on a classification model obtained by pre-training, and determining the classification category of the target statement;
the classification model is composed of a double-layer long short-term memory network layer, a full connection layer and an activation function layer and is used for determining the category of each statement.
2. The method of claim 1, further comprising, after the obtaining the target statement:
dividing the target sentence into at least one vocabulary to be processed based on a preset word segmentation tool;
and determining a target word of the target sentence according to the at least one word to be processed.
3. The method of claim 1, further comprising:
determining target word frequency corresponding to each target vocabulary according to a predetermined word frequency dictionary;
and determining an initialization word vector of the target vocabulary based on the target word frequency so as to determine a target word vector input to the classification model based on the initialization word vector.
4. The method of claim 3, further comprising a word embedding layer or a mapping word vector submodel in the classification model, wherein the determining a target word vector for input into the classification model based on the initialization word vector comprises:
and processing the initialization word vector into a target word vector containing semantic information based on the word embedding layer or the mapping word vector submodel.
5. The method of claim 1, further comprising: constructing the classification model; wherein constructing the classification model includes: the word embedding layer or the mapping word vector submodel, the double-layer long short-term memory network layer, the full connection layer and the activation function layer;
inputting an initial word vector corresponding to a target sentence into the word embedding layer or the mapping word vector submodel, and taking the output result as the input of the double-layer long short-term memory network layer;
taking the output of the double-layer long short-term memory network layer as the input of the full connection layer, and inputting it to the full connection layer;
and taking the output result of the full connection layer as the input of the activation function layer, inputting it into the activation function layer, and taking the output result of the activation function layer as the classification category corresponding to the target statement.
6. The method of claim 5, further comprising: training the classification model;
the training the classification model includes:
obtaining a plurality of training sample data, wherein the training sample data comprises training sentences and label types corresponding to the training sentences; wherein the label category is determined based on a predetermined mapping relation table;
aiming at each training sample data, performing text vectorization on the training sample data to obtain initial text vectors with uniform input lengths;
inputting the training sample data into a pre-constructed classification model to be trained to obtain a training result;
correcting a loss function in the classification model to be trained based on a training result and a sample result of the training sample data, and training to obtain the classification model to be used by taking the convergence of the loss function as a training target;
and verifying the classification model to be used based on verification sample data, and taking the classification model to be used as the classification model when a verification result meets a preset condition.
7. The method according to claim 6, wherein the verifying the classification model to be used based on verification sample data, and when a verification result satisfies a preset condition, using the classification model to be used as the classification model comprises:
acquiring a preset amount of verification sample data;
inputting the verification sample data into the classification model to be used, and determining the accuracy of the classification model to be used based on the output result of the classification model to be used;
when the accuracy reaches a preset accuracy threshold, taking the classification model to be used as a classification model;
if the accuracy rate does not reach the accuracy rate threshold value, obtaining training sample data to continue training the classification model to be used until the accuracy rate of the classification model to be used reaches the preset accuracy rate threshold value.
8. The method according to any one of claims 1-7, wherein the method is applied to each sentence in a knowledge-graph question-answer.
9. A class determination device configured in a knowledge-graph question-answer, comprising:
the target statement acquisition module is used for acquiring a target statement;
the category determination module is used for determining a target vocabulary corresponding to the target statement, processing the target vocabulary based on a classification model obtained by pre-training, and determining the classification category of the target statement;
the classification model is composed of a double-layer long short-term memory network layer, a full connection layer and an activation function layer and is used for determining the category of each statement.
10. An apparatus, characterized in that the apparatus comprises:
one or more processors;
a storage device for storing one or more programs,
when executed by the one or more processors, cause the one or more processors to implement the category determination method of any of claims 1-8.
11. A storage medium containing computer-executable instructions for performing the category determination method of any one of claims 1-8 when executed by a computer processor.
CN202010849763.9A 2020-08-21 2020-08-21 Category determination method, device, equipment and medium Pending CN112214595A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010849763.9A CN112214595A (en) 2020-08-21 2020-08-21 Category determination method, device, equipment and medium

Publications (1)

Publication Number Publication Date
CN112214595A true CN112214595A (en) 2021-01-12

Family

ID=74059367

Country Status (1)

Country Link
CN (1) CN112214595A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11010692B1 (en) * 2020-12-17 2021-05-18 Exceed AI Ltd Systems and methods for automatic extraction of classification training data
CN113392642A (en) * 2021-06-04 2021-09-14 北京师范大学 System and method for automatically labeling child-bearing case based on meta-learning
CN113569024A (en) * 2021-07-19 2021-10-29 上海明略人工智能(集团)有限公司 Card category identification method and device, electronic equipment and computer storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110019782A (en) * 2017-09-26 2019-07-16 北京京东尚科信息技术有限公司 Method and apparatus for exporting text categories
US10528866B1 (en) * 2015-09-04 2020-01-07 Google Llc Training a document classification neural network
CN110705225A (en) * 2019-08-15 2020-01-17 平安信托有限责任公司 Contract marking method and device
CN111159366A (en) * 2019-12-05 2020-05-15 重庆兆光科技股份有限公司 Question-answer optimization method based on orthogonal theme representation
CN111538823A (en) * 2020-04-26 2020-08-14 支付宝(杭州)信息技术有限公司 Information processing method, model training method, device, equipment and medium


CN116385937A (en) Method and system for solving video question and answer based on multi-granularity cross-mode interaction framework
CN111125550B (en) Point-of-interest classification method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20220916

Address after: 25 Financial Street, Xicheng District, Beijing 100033

Applicant after: CHINA CONSTRUCTION BANK Corp.

Address before: 25 Financial Street, Xicheng District, Beijing 100033

Applicant before: CHINA CONSTRUCTION BANK Corp.

Applicant before: Jianxin Financial Science and Technology Co., Ltd.
