CN112214595A - Category determination method, device, equipment and medium - Google Patents

Category determination method, device, equipment and medium

Info

Publication number
CN112214595A
CN112214595A (application CN202010849763.9A)
Authority
CN
China
Prior art keywords
classification model
target
training
layer
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010849763.9A
Other languages
Chinese (zh)
Inventor
赖雅玲
黄德荣
吴楠
张彪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Bank Corp
Original Assignee
China Construction Bank Corp
CCB Finetech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Construction Bank Corp, CCB Finetech Co Ltd filed Critical China Construction Bank Corp
Priority to CN202010849763.9A priority Critical patent/CN112214595A/en
Publication of CN112214595A publication Critical patent/CN112214595A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367Ontology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/284Lexical analysis, e.g. tokenisation or collocates
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/049Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Animal Behavior & Ethology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a category determination method, apparatus, device and medium. The method includes: acquiring a target sentence; determining the target vocabulary corresponding to the target sentence, processing the target vocabulary with a classification model obtained by pre-training, and determining the classification category of the target sentence. The classification model consists of a two-layer long short-term memory (LSTM) network layer, a fully connected layer and an activation function layer, and is used to determine the category of each sentence. According to the technical solution of the embodiments of the invention, a classification model for processing each target sentence is obtained by training a pre-constructed model; the classification label of a target sentence can then be determined by the model, yielding the label category corresponding to the sentence.

Description

Category determination method, device, equipment and medium
Technical Field
The embodiment of the invention relates to the technical field of computers, in particular to a method, a device, equipment and a medium for determining categories.
Background
At present, when a user submits a consultation question, the category of the question must be determined manually before the corresponding label category can be assigned.
However, determining the label category and the corresponding solution for each sentence in this manner suffers from high labor cost and low accuracy.
Disclosure of Invention
The invention provides a category determination method, apparatus, device and storage medium that determine the classification category corresponding to a target sentence accurately and conveniently.
In a first aspect, an embodiment of the present invention provides a category determination method, applied to knowledge-graph question answering, including:
acquiring a target sentence;
determining a target vocabulary corresponding to the target sentence, processing the target vocabulary with a classification model obtained by pre-training, and determining the classification category of the target sentence;
wherein the classification model consists of a two-layer long short-term memory (LSTM) network layer, a fully connected layer and an activation function layer, and is used to determine the category of each sentence.
In a second aspect, an embodiment of the present invention further provides a category determination apparatus, configured for knowledge-graph question answering, including:
a target sentence acquisition module, used to acquire a target sentence;
a category determination module, used to determine a target vocabulary corresponding to the target sentence, process the target vocabulary with a classification model obtained by pre-training, and determine the classification category of the target sentence;
wherein the classification model consists of a two-layer long short-term memory (LSTM) network layer, a fully connected layer and an activation function layer, and is used to determine the category of each sentence.
In a third aspect, an embodiment of the present invention further provides an apparatus, where the apparatus includes:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the category determination method of any embodiment of the present invention.
In a fourth aspect, embodiments of the present invention further provide a storage medium containing computer-executable instructions, which when executed by a computer processor, are configured to perform the category determination method according to any one of the embodiments of the present invention.
According to the technical solution of the embodiments of the invention, a classification model for processing each target sentence is obtained by training a pre-constructed model; the classification label of a target sentence can be determined by the model, yielding the label category corresponding to the sentence and improving the data-processing effect.
Drawings
In order to more clearly illustrate the technical solutions of the exemplary embodiments of the present invention, the drawings used in describing the embodiments are briefly introduced below. The described figures cover only some of the embodiments of the invention; a person skilled in the art can derive other figures from them without inventive effort.
Fig. 1 is a schematic flowchart of a category determination method according to an embodiment of the present invention;
FIG. 2 is a diagram of a constructed classification model according to a second embodiment of the present invention;
FIG. 3 is a schematic flowchart of training a classification model according to a second embodiment of the present invention;
fig. 4 is a schematic diagram of a long short-term memory network according to a second embodiment of the present invention;
fig. 5 is a schematic diagram of the forget gate in a long short-term memory network according to a second embodiment of the present invention;
fig. 6 is a schematic diagram of the input gate in a long short-term memory network according to a second embodiment of the present invention;
FIG. 7 is a schematic diagram of the memory cell in a long short-term memory network according to a second embodiment of the present invention;
fig. 8 is a schematic diagram of the output gate in a long short-term memory network according to a second embodiment of the present invention;
fig. 9 is a flowchart illustrating a category determination method according to a second embodiment of the present invention;
fig. 10 is a schematic structural diagram of a category determining apparatus according to a third embodiment of the present invention;
fig. 11 is a schematic structural diagram of an apparatus according to a fourth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
Example one
Fig. 1 is a schematic flow chart of a category determination method according to an embodiment of the present invention. The embodiment is applicable to determining the category tag corresponding to a sentence in a specific application scenario; optionally, the scenario may be a knowledge-graph question-answering system. The method may be executed by a category determination device, which may be implemented in software and/or hardware. The hardware may be an electronic device such as a PC or a mobile terminal.
It should be noted that the category determination method provided by this embodiment differs from the prior art in that it is applied to knowledge-graph question answering. Conventional knowledge-graph question-answering systems based on keyword matching suffer from synonym and keyword-matching ambiguity; the category determination method provided by this embodiment can be used to solve that problem.
As shown in fig. 1, the method of this embodiment includes:
and S110, acquiring a target statement.
If the category label corresponding to a certain statement needs to be determined, the statement can be used as a target statement. Illustratively, the target sentence may be a label corresponding to the XX building project land use life, or the like.
After the target sentence is acquired, the method further includes: dividing the target sentence into at least one to-be-processed word with a preset word segmentation tool; and determining the target vocabulary corresponding to the target sentence from the to-be-processed words.
It should be noted that the target sentence may be divided into to-be-processed words based on a word segmentation dictionary, or with the jieba segmentation tool. The words that remain after auxiliary words, stop words and the like are removed from the to-be-processed words are taken as the target vocabulary. That is, after the target sentence is segmented, auxiliary words, stop words and the like are removed from the segmented words, and the remaining words are the target vocabulary corresponding to the target sentence.
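As a minimal sketch of this step (the token list, which stands in for the output of a segmentation tool such as jieba, and the stop-word set are illustrative assumptions, not taken from the patent), segmentation followed by stop-word removal might look like:

```python
# Sketch: derive the target vocabulary from a segmented sentence by
# filtering out auxiliary words and stop words. The stop-word set is a
# hypothetical example.
STOP_WORDS = {"的", "了", "是", "有", "吗", "?", "？"}

def target_vocabulary(tokens):
    """Remove auxiliary words / stop words, keeping token order."""
    return [t for t in tokens if t not in STOP_WORDS]

# e.g. a segmented consultation query about loan accounts
tokens = ["小额", "贷款", "账户", "有", "多少", "个", "？"]
print(target_vocabulary(tokens))  # → ['小额', '贷款', '账户', '多少', '个']
```

The remaining words are the target vocabulary that is subsequently vectorized for the classification model.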
It should be noted that, since the classification model processes input vectors, after the target vocabulary is obtained the target word vector corresponding to the target vocabulary must also be determined, so that the classification model can determine the classification identifier corresponding to the target word vector.
Optionally, the target word frequency corresponding to each target word is determined from a predetermined word-frequency dictionary; an initialization word vector of the target vocabulary is determined based on the target word frequencies, and the target word vector input to the classification model is determined based on the initialization word vector.
The words in the word frequency dictionary and the word frequencies corresponding to the words are related to specific application scenes. For example, the application scene is a real estate application scene, the words in the word frequency dictionary are words corresponding to real estate, and the word frequency corresponding to the words is determined according to the occurrence frequency of each word.
It should be noted that the numbers of target words corresponding to different sentences differ, and correspondingly so do the word-vector dimensions. To ensure that the classification model can process the word vectors, the word-vector dimension may be fixed in advance. For example, the word vector corresponding to each sentence may be fixed at 1 × 10, and three flags may be reserved at the head of the dictionary: [PAD], [UNK] and [CODE], denoting the padding word, the unknown word and the numbered word respectively. The dictionary is used for text vectorization.
Specifically, after the target vocabulary corresponding to the target sentence is determined, the word frequency corresponding to each target vocabulary may be determined according to a predetermined word frequency dictionary. According to the target word frequency corresponding to each target word, an initialization word vector corresponding to the target word can be determined, so that the target word vector input to the classification model can be determined according to the initialization word vector.
In this embodiment, the target word vector of the target vocabulary is determined from the initialization word vector, either by a word embedding layer or by a mapping word-vector submodel in the classification model. Optionally, the initialization word vector is processed into a target word vector containing sentence information by the word embedding layer or the mapping word-vector submodel.
The word embedding layer maps each sentence into a target word vector containing semantic information. The mapping word-vector submodel is a layer in the classification model that processes the initialization word vector into the corresponding target word vector.
S120, determining a target vocabulary corresponding to the target sentence, processing the target vocabulary with a classification model obtained by pre-training, and determining the classification category of the target sentence.
In this embodiment, the target sentence may be divided into at least one word, and the obtained words are taken as the target vocabulary. The classification model consists of a two-layer long short-term memory network, a fully connected layer and an activation function layer; the constructed model is trained to obtain a classification model that processes the target vocabulary and determines the corresponding classification category. The classification categories may be individual, cooperation agreement, enterprise, building, building project, and so on.
Specifically, the target sentence may be divided into at least one target word, the target vocabulary is input into the pre-trained classification model, and the classification category corresponding to the target sentence is determined from the model output.
In order to obtain a classification model for processing target sentences, before the model is trained the method further includes: constructing the classification model.
Optionally, constructing the classification model includes: a word embedding layer (or mapping word-vector submodel), a two-layer long short-term memory network layer, a fully connected layer and an activation function layer. The initial word vector corresponding to the target sentence is input into the word embedding layer (or word-vector submodel), and its output is taken as the input of the two-layer LSTM layer; the output of the LSTM layer is input to the fully connected layer; and the output of the fully connected layer is input to the activation function layer, whose output is taken as the classification category corresponding to the target sentence.
It is understood that in the present embodiment the classification model must be constructed in advance. The classification model comprises a word embedding layer, a two-layer long short-term memory network layer, a fully connected layer and an activation function layer. The word embedding layer is the first input layer of the model; it receives the initial word vector corresponding to the target sentence and processes it into the target word vector. The output of the word embedding layer is the input of the two-layer LSTM layer, whose output is input to the fully connected layer; the output of the fully connected layer feeds the activation function layer, completing the classification model.
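The layer stack just described can be sketched as follows. This is an illustrative PyTorch implementation under the layer dimensions quoted in the second embodiment (embedding_size = 128, hidden_size = 256, target_size = 5); the vocabulary size and class names are assumptions, and the patent does not specify an implementation framework.

```python
import torch
import torch.nn as nn

class SentenceClassifier(nn.Module):
    """Word embedding -> two-layer LSTM -> fully connected -> activation."""
    def __init__(self, vocab_size, embedding_size=128, hidden_size=256, target_size=5):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_size, padding_idx=0)
        self.lstm = nn.LSTM(embedding_size, hidden_size,
                            num_layers=2, batch_first=True)
        self.fc = nn.Linear(hidden_size, target_size)
        self.activation = nn.Softmax(dim=-1)  # converts outputs to 0-1 probabilities

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)   # (batch, seq_len, 128)
        _, (h_n, _) = self.lstm(embedded)      # h_n: (num_layers, batch, 256)
        logits = self.fc(h_n[-1])              # final hidden state of the top LSTM layer
        return self.activation(logits)         # (batch, 5) class probabilities

model = SentenceClassifier(vocab_size=1000)
probs = model(torch.tensor([[2, 5, 13, 1, 0, 0]]))  # one padded index sequence
print(probs.shape)
```

Taking the last hidden state of the top LSTM layer as the sentence representation is one common design choice; the patent's figures may differ in detail.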
After the classification model is built, it may be trained. Training the classification model includes: acquiring a plurality of training samples, each comprising a training sentence and the label category corresponding to it, where the label category is determined from a predetermined mapping table; vectorizing the text of each training sample to obtain initial text vectors of uniform input length; inputting the training samples into the pre-constructed classification model to be trained to obtain training results; correcting the loss function of the model based on the training results and the sample labels, with convergence of the loss function as the training target, to obtain the to-be-used classification model; and verifying the to-be-used model on verification sample data, taking it as the classification model when the verification result satisfies a preset condition.
To improve the accuracy of the model, as much training sample data as possible should be acquired. For each training sample, its text is vectorized to obtain an initial text vector of uniform input length. It should be noted that the number of words differs between training samples, and so therefore does the raw vector dimension; to avoid this problem, a common dimension covering every training sample is determined and used as the input dimension when training the classification model.
Before training, the model parameters of the classification model to be trained may be set to default values, so that they are corrected during training on the training sample data. The training sample data comprises an entity-classification training set and an attribute-classification training set.
Specifically, training samples are input into the classification model to be trained to obtain output values; from the standard values in the training samples and the training outputs, a loss value is computed, and the model parameters are updated accordingly. The training error of the loss function, i.e. the loss value, may serve as the condition for detecting convergence: for example, whether the training error is smaller than a preset error, whether the error trend has stabilized, or whether the current iteration count has reached a preset number. If a convergence condition is met, for example the training error is smaller than the preset error or the error change has stabilized, training of the model is complete and iteration may stop. If not, further sample data is acquired and training continues until the training error falls within the preset range. Once the loss function has converged, the classification model to be trained can be taken as the classification model.
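The stopping rule just described, stop when the error drops below a threshold, when the error trend stabilizes, or when an iteration budget is exhausted, can be sketched as a small helper. The threshold, window and budget values are illustrative assumptions, not taken from the patent.

```python
def converged(errors, eps=1e-3, window=5, stable_tol=1e-4, max_iters=1000):
    """Return True when training of the classification model may stop.

    errors: training errors of the loss function, one per iteration.
    Stops when the latest error is below eps, when the error has varied
    by less than stable_tol over the last `window` iterations (trend has
    stabilized), or when the iteration budget max_iters is exhausted.
    """
    if not errors:
        return False
    if errors[-1] < eps:
        return True
    recent = errors[-window:]
    if len(errors) >= window and max(recent) - min(recent) < stable_tol:
        return True
    return len(errors) >= max_iters

print(converged([0.5, 0.1, 0.0005]))  # → True (error below threshold)
print(converged([0.5, 0.2]))          # → False (still improving)
```

In practice this check would be called once per training iteration, with further sample data acquired whenever it returns False.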
On the basis of the above technical solution, verifying the to-be-used classification model on verification sample data, and taking it as the classification model when the verification result satisfies a preset condition, includes: acquiring a preset amount of verification sample data; inputting the verification samples into the to-be-used classification model and determining its accuracy from its outputs; when the accuracy reaches a preset accuracy threshold, taking the to-be-used model as the classification model; and if the accuracy does not reach the threshold, acquiring further training sample data and continuing to train the model until its accuracy reaches the preset threshold.
On the basis of the above technical solution, to improve the accuracy of the classification model, as much verification sample data as possible may be acquired to verify the to-be-used classification model obtained by training. The verification samples are input into the to-be-used model, which outputs a verification result for each; the accuracy is determined by comparing these outputs with the labels in the verification samples. If the accuracy reaches the preset accuracy threshold, the to-be-used model is taken as the usable classification model; if it is below the threshold, the accuracy of the to-be-used model is insufficient and it must be trained further. Optionally, training sample data is acquired continuously and the model is trained on it until its accuracy is detected to reach the preset threshold, at which point the to-be-used model is taken as the classification model.
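The accept-or-keep-training decision can be sketched as below; the threshold value and the label arrays are illustrative assumptions (the patent does not state a concrete accuracy threshold).

```python
def accuracy(predictions, labels):
    """Fraction of verification samples the model classifies correctly."""
    assert len(predictions) == len(labels) and labels
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)

def model_ready(predictions, labels, threshold=0.95):
    """True when the to-be-used model may be adopted as the classification model."""
    return accuracy(predictions, labels) >= threshold

preds  = [0, 1, 2, 3, 4, 0, 1, 2, 3, 4]   # model outputs on verification data
labels = [0, 1, 2, 3, 4, 0, 1, 2, 3, 0]   # ground-truth label categories
print(accuracy(preds, labels))             # → 0.9
print(model_ready(preds, labels))          # → False: keep training
```

When model_ready returns False, more training sample data is acquired and training resumes, mirroring the loop described above.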
According to the technical solution of this embodiment, a classification model for processing each target sentence is obtained by training the pre-constructed model; the classification label of the target sentence is determined by the model, yielding the corresponding label category, improving the data-processing effect, and achieving the technical effect of processing input target sentences rapidly.
Example two
As a preferred implementation of the foregoing embodiment, it should be noted that the classification model provided here comprises a word embedding layer, a two-layer long short-term memory network, a fully connected layer and an activation function layer. To explain the technical solution of this embodiment in further detail, a schematic diagram of the classification model and its training method are introduced. As shown in fig. 2, from input to output the classification model comprises a word embedding layer, a two-layer long short-term memory network (LSTM), a fully connected layer and an activation function layer.
After the classification model is constructed, it may be trained; the training flow of this embodiment is shown in fig. 3.
S310, obtaining training sample data.
Specifically, a training sample data set and data labels are acquired. The training sample data set is collected and labeled manually. Illustratively, 3000 data items covering five classifications may be acquired. Of course, to improve the accuracy of the model, as much training sample data as possible may be acquired.
S320, labeling the entities and attribute categories of the training data.
Each training sample comprises a question and a label, i.e. a sentence and its category. The question is one that may be covered by the real-estate knowledge graph, and may concern personal information, cooperation-agreement information, enterprise information, building information, building-project information, and so on. The label may be the knowledge-graph node involved in the question. Optionally, individual corresponds to label 0, cooperation agreement to 1, enterprise to 2, building to 3, and building project to 4. The labeled data comprises two parts: one labels the entity category in the question, the other the attribute category in the question.
S330, segmenting the data with jieba, vectorizing the text of each segmentation result, and determining a uniform input length.
After the training sample data is acquired, the sentence of each training item is divided into at least one word with jieba segmentation, and auxiliary words, stop words and the like are removed. The occurrence count of each word in the training sample data is tallied, and the words are sorted from high to low frequency to form the word-frequency dictionary. To ensure that the text-vector dimensions of the training sentences are uniform, three flags may be reserved at the head of the dictionary: [PAD], [UNK] and [CODE], denoting the padding word, the unknown word and the numbered word respectively. The purpose of the word-frequency dictionary is to determine the vector corresponding to each word in the training sample data.
It should be noted that, to enrich the vocabulary of the word-frequency dictionary, as much training sample data as possible may be acquired, and the dictionary then determined from it.
Based on the word-frequency dictionary, the text vector corresponding to each word can be determined.
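Building the word-frequency dictionary with its three reserved flags can be sketched as follows; the tiny corpus is an illustrative assumption standing in for the segmented training sentences.

```python
from collections import Counter

RESERVED = ["[PAD]", "[UNK]", "[CODE]"]  # padding, unknown, numbered word

def build_word_freq_dict(segmented_sentences):
    """Map each word to an index: the three reserved flags occupy the
    head of the dictionary, then words sorted from high to low frequency."""
    counts = Counter(w for sent in segmented_sentences for w in sent)
    ordered = [w for w, _ in counts.most_common()]  # high-to-low frequency
    return {w: i for i, w in enumerate(RESERVED + ordered)}

corpus = [["贷款", "账户"], ["贷款", "期限"], ["贷款", "账户", "余额"]]
d = build_word_freq_dict(corpus)
print(d["[PAD]"], d["[UNK]"], d["[CODE]"])  # → 0 1 2
print(d["贷款"])                             # → 3 (most frequent word)
```

The resulting index of each word serves as its entry in the initialization vector of any sentence containing it.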
It should be noted that, since the training data consists of sentences, each training sentence must be converted into a representation the computer can process.
Specifically, based on the word-frequency dictionary, the word indices corresponding to a training sentence can be determined. To keep the lengths of the training sentences input to the classification model consistent, the maximum input length among the training sentences is taken, and sentences whose word count falls short of it are completed with the predetermined flag. Moreover, if a word does not exist in the word-frequency dictionary, it is replaced with the index of [UNK] in the dictionary, and if the word is a string of digits, it is replaced with the index of [CODE].
Illustratively, take the training sentence "How many small-loan accounts are there?". The jieba segmentation result is [how many small loan accounts]; replacing each word with its dictionary index forms the array [2, 5, 13, 1]. Assuming the maximum query length is 6, the array is right-padded with the index corresponding to [PAD] (assumed to be 0) to give [2, 5, 13, 1, 0, 0]. This initialization vector of the training sentence can then be input into the classification model to obtain a word vector containing semantic information.
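The index-replacement and padding just walked through can be sketched as below. The dictionary entries and tokens are hypothetical, chosen only so the example reproduces the indices [2, 5, 13, 1, 0, 0]; the reserved indices 0/1/2 for [PAD]/[UNK]/[CODE] are as assumed in the worked example.

```python
PAD, UNK, CODE = 0, 1, 2  # reserved dictionary indices assumed from the example

def encode(tokens, word_index, max_len=6):
    """Replace tokens with dictionary indices: unknown words map to UNK,
    digit strings to CODE; then right-pad with PAD to max_len."""
    ids = []
    for t in tokens:
        if t.isdigit():
            ids.append(CODE)            # a string of digits -> numbered word
        else:
            ids.append(word_index.get(t, UNK))
    return (ids + [PAD] * max_len)[:max_len]

word_index = {"贷款": 5, "账户": 13}    # hypothetical dictionary fragment
print(encode(["2019", "贷款", "账户", "多少"], word_index))  # → [2, 5, 13, 1, 0, 0]
```

The fixed-length array is the initialization vector fed to the classification model's embedding layer.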
And S340, training the constructed classification model based on the training sample data to obtain the classification model to be used.
That is, based on steps S310 to S330, an initialization word vector corresponding to each training sample data may be determined. The constructed classification model may be trained based on the initialization word vectors.
The word embedding layer is configured to map the input vector into a word vector containing semantic information, that is, the word vector of the semantic information corresponding to each training sample data may be determined; the dimension of the word embedding layer may be embedding_size = 128. The double-layer long short-term memory network, i.e. a double-layer LSTM, is used for learning the features of the input vectors and performing classification, with a hidden layer dimension hidden_size = 256. The full connection layer converges the rich feature representation of the network into the number of output target classes, target_size = 5. The activation function layer converts the output information into a probability value between 0 and 1.
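The dimension flow through these layers can be sketched framework-agnostically (pure Python; the shapes follow the hyperparameters above, and the softmax shown is one common choice for the activation function layer when target_size = 5):

```python
import math
import random

EMBEDDING_SIZE, HIDDEN_SIZE, TARGET_SIZE = 128, 256, 5

def softmax(logits):
    """Activation function layer: convert logits into probabilities in (0, 1)."""
    m = max(logits)                      # subtract max for numeric stability
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

# Dimension flow for one token sequence of length 6:
#   word embedding:    (6,) subscripts      -> (6, 128) word vectors
#   double-layer LSTM: (6, 128)             -> final hidden state (256,)
#   full connection:   (256,)               -> logits (5,)
#   activation layer:  (5,)                 -> class probabilities summing to 1
random.seed(0)
logits = [random.uniform(-1.0, 1.0) for _ in range(TARGET_SIZE)]
probs = softmax(logits)
```

The class with the largest probability is then taken as the predicted category of the sentence.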
A Recurrent Neural Network (RNN) is a type of neural network used to process sequence data. Common sequence data such as speech and text need to be processed with dependence on time and memory. The long short-term memory network (LSTM) is a variant of the RNN: the plain RNN has only short-term memory because of the vanishing gradient, while the LSTM combines short-term and long-term memory through elaborate gate control and alleviates the vanishing-gradient problem to a certain extent.
The standard recurrent neural network has only a simple layer structure inside, while the LSTM has four interacting layers inside, as shown in fig. 4. The specific internal structure is briefly introduced as follows:
The first step, see fig. 5, is the forget gate, which decides what information to discard from the cell state. The gate reads h_{t-1} and x_t and outputs a value between 0 and 1 for each number in the cell state C_{t-1}. The second step, see fig. 6, determines what new information is stored in the cell state: (1) a sigmoid layer, called the "input gate layer", decides which values will be updated; (2) a tanh layer creates a vector of new candidate values, C̃_t, that may be added to the state. These two pieces of information together produce the update to the state. The third step, see fig. 7, updates the old cell state C_{t-1} to C_t: the old state is multiplied by f_t, discarding the information determined to be discarded, and then i_t · C̃_t, the new candidate values scaled by how much each state value is to be updated, is added. The fourth step, see fig. 8, determines what value to output: (1) a sigmoid layer decides which parts of the cell state will be output; (2) the cell state is processed by tanh (yielding a value between -1 and 1) and multiplied by the output of the sigmoid gate, so that only the determined parts are output. The LSTM model thus realizes the long-term memory function through the gating network of the input gate, the forget gate and the output gate.
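The four gating steps described above correspond to the standard textbook LSTM equations, stated here for reference (σ denotes the sigmoid function and ⊙ element-wise multiplication):

```latex
\begin{aligned}
f_t &= \sigma(W_f \cdot [h_{t-1}, x_t] + b_f) && \text{(forget gate)} \\
i_t &= \sigma(W_i \cdot [h_{t-1}, x_t] + b_i) && \text{(input gate)} \\
\tilde{C}_t &= \tanh(W_C \cdot [h_{t-1}, x_t] + b_C) && \text{(candidate values)} \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t && \text{(cell state update)} \\
o_t &= \sigma(W_o \cdot [h_{t-1}, x_t] + b_o) && \text{(output gate)} \\
h_t &= o_t \odot \tanh(C_t) && \text{(hidden state output)}
\end{aligned}
```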
Specifically, training sample data is obtained to train the constructed classification model, and when convergence of the loss function of the classification model being trained is detected, the obtained trained classification model can be used as the classification model to be used.
And S350, verifying the classification model to be used based on the verification sample data, and taking the classification model to be used as the classification model when the verification result meets the preset condition.
Specifically, verification sample data is obtained and processed into initialization vectors in the same way as the training sample data. The verification sample data is input into the classification model to be used, and when the verification result meets the preset condition, the classification model to be used is taken as the classification model.
Of course, if the accuracy of the classification model to be used is not within the preset accuracy range, training sample data can continue to be acquired to train the classification model to be used until its accuracy falls within the preset accuracy range.
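The verification step can be sketched as follows (pure Python; `classify` below is a toy stand-in for the classification model to be used, and the 90% threshold is an assumed value rather than one taken from this description):

```python
ACCURACY_THRESHOLD = 0.9  # assumed preset accuracy threshold

def accuracy(model, samples):
    """Fraction of verification samples whose predicted label matches."""
    correct = sum(1 for vec, label in samples if model(vec) == label)
    return correct / len(samples)

def verify(model, samples, threshold=ACCURACY_THRESHOLD):
    """True when the model to be used may serve as the classification model."""
    return accuracy(model, samples) >= threshold

# Toy stand-in model and verification set.
classify = lambda vec: 1 if sum(vec) > 0 else 0
samples = [([1, 2], 1), ([-3, 1], 0), ([0, 5], 1), ([-1, -1], 0)]
print(verify(classify, samples))  # → True (accuracy 1.0)
```

When `verify` returns False, more training sample data would be acquired and training continued, as described above.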
Fig. 9 is a flowchart illustrating a category determining method according to a second embodiment of the present invention. As a preferred embodiment of the above embodiment, as shown in fig. 9, the method includes:
it should be noted that the category determining method provided by the embodiment of the present invention may be applied to a knowledge-graph question-answering system, which may be a tree-shaped graph, and based on the category determining method provided by the embodiment, the technical effect of accurately determining the matching result from the knowledge graph may be achieved.
And S410, acquiring a target statement.
Specifically, when a tag corresponding to a certain statement needs to be determined, the statement may be used as a target statement.
And S420, processing the target sentence based on the deep learning retrieval sentence classification model to obtain a classification identifier corresponding to the target sentence.
Natural language processing and deep learning technology are used to perform semantic analysis on the retrieval sentence: by constructing a recurrent neural network, the entities and attributes of the retrieval sentence are extracted, the knowledge graph nodes to which the entities and attributes belong are predicted, and these nodes are provided to a result filtering module, so that ElasticSearch can accurately match the node positions corresponding to the entities and attributes of the retrieval sentence, reducing the uncertainty of fuzzy matching and improving the retrieval accuracy.
The deep learning retrieval sentence classification model is the classification model mentioned in the above embodiments.
Specifically, the target sentence is input into the deep learning retrieval sentence classification model, and the target sentence is processed based on the classification model, so that the classification mark corresponding to the target sentence can be obtained.
And S430, knowledge display.
It is to be understood that, after determining the classification identifier corresponding to the target sentence, the classification identifier may be displayed at a preset position.
According to the technical scheme of the embodiment of the invention, the classification model for processing each target sentence can be obtained by training the pre-constructed classification model, the classification label of the target sentence can be determined based on the classification model, and the label category corresponding to the target sentence is further obtained.
EXAMPLE III
Fig. 10 is a schematic structural diagram of a category determining apparatus according to a third embodiment of the present invention, where the apparatus includes: a target sentence acquisition module 510 and a category determination module 520.
The target statement acquisition module is used for acquiring a target statement; the category determination module is used for determining a target vocabulary corresponding to the target statement, processing the target vocabulary based on a classification model obtained by pre-training, and determining the classification category of the target statement; the classification model is composed of a double-layer long short-term memory network layer, a full connection layer and an activation function layer, and is used for determining the category of each statement.
On the basis of the above technical solutions, after the target sentence acquisition module acquires the target sentence, the target sentence acquisition module is further configured to:
dividing the target sentence into at least one vocabulary to be processed based on a preset word segmentation tool;
and determining a target word of the target sentence according to the at least one word to be processed.
On the basis of the above technical solutions, the apparatus further includes:
determining target word frequency corresponding to each target vocabulary according to a predetermined word frequency dictionary;
and determining an initialization word vector of the target vocabulary based on the target word frequency so as to determine a target word vector input to the classification model based on the initialization word vector.
On the basis of the above technical solutions, the classification model further includes a word embedding layer or a mapping word vector submodel, and the determining a target word vector input into the classification model based on the initialization word vector includes:
and processing the initialization word vector into a target word vector containing semantic information based on the word embedding layer or the mapping word vector submodel.
On the basis of the above technical solutions, the method further comprises: constructing the classification model. Constructing the classification model includes: the word embedding layer or the mapping word vector submodel, the double-layer long short-term memory network layer, the full connection layer and the activation function layer;
inputting an initial word vector corresponding to a target sentence into the word embedding layer or the mapping word vector submodel, and taking the output result as the input of the double-layer long short-term memory network layer;
taking the output of the double-layer long short-term memory network layer as the input of the full connection layer, and inputting it to the full connection layer;
and taking the output result of the full connection layer as the input of the activation function layer, inputting it into the activation function layer, and taking the output result of the activation function layer as the classification category corresponding to the target statement.
On the basis of the above technical solutions, the apparatus is further configured to train the classification model; the training of the classification model includes:
obtaining a plurality of training sample data, wherein the training sample data comprises training sentences and label types corresponding to the training sentences; wherein the label category is determined based on a predetermined mapping relation table;
aiming at each training sample data, performing text vectorization on the training sample data to obtain initial text vectors with uniform input lengths;
inputting the training sample data into a pre-constructed classification model to be trained to obtain a training result;
correcting a loss function in the classification model to be trained based on a training result and a sample result of the training sample data, and training to obtain the classification model to be used by taking the convergence of the loss function as a training target;
and verifying the classification model to be used based on verification sample data, and taking the classification model to be used as the classification model when a verification result meets a preset condition.
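The loss-convergence training loop described above can be sketched on a toy problem (pure Python; a single logistic unit stands in for the classification model, and the learning rate and convergence tolerance are assumed values):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, lr=0.5, tol=1e-4, max_epochs=1000):
    """Minimize cross-entropy loss by gradient descent; stop when the
    loss change between epochs falls below tol (loss convergence)."""
    w, b = 0.0, 0.0
    prev_loss = float("inf")
    for _ in range(max_epochs):
        loss = 0.0
        for x, y in samples:
            p = sigmoid(w * x + b)
            p = min(max(p, 1e-12), 1.0 - 1e-12)   # numeric safety for the log
            loss += -(y * math.log(p) + (1 - y) * math.log(1 - p))
            grad = p - y                          # dLoss/dlogit
            w -= lr * grad * x
            b -= lr * grad
        if abs(prev_loss - loss) < tol:           # training target: convergence
            break
        prev_loss = loss
    return w, b

# Toy labelled samples: negative inputs are class 0, positive are class 1.
samples = [(-2, 0), (-1, 0), (1, 1), (2, 1)]
w, b = train(samples)
```

In the method itself, the model being trained is the embedding/double-layer-LSTM/full-connection network rather than this single unit, but the convergence criterion plays the same role.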
On the basis of the above technical solutions, the verifying the classification model to be used based on verification sample data, and when a verification result satisfies a preset condition, using the classification model to be used as a classification model includes:
acquiring a preset amount of verification sample data;
inputting the verification sample data into the classification model to be used, and determining the accuracy of the classification model to be used based on the output result of the classification model to be used;
when the accuracy reaches a preset accuracy threshold, taking the classification model to be used as a classification model;
if the accuracy rate does not reach the accuracy rate threshold value, obtaining training sample data to continue training the classification model to be used until the accuracy rate of the classification model to be used reaches the preset accuracy rate threshold value.
According to the technical scheme of the embodiment of the invention, the classification model for processing each target sentence can be obtained by training the pre-constructed classification model, the classification label of the target sentence can be determined based on the classification model, and the label category corresponding to the target sentence is further obtained.
The category determining device provided by the embodiment of the invention can execute the category determining method provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the executing method.
It should be noted that, the units and modules included in the apparatus are merely divided according to functional logic, but are not limited to the above division as long as the corresponding functions can be implemented; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the embodiment of the invention.
EXAMPLE III
Fig. 11 is a schematic structural diagram of an apparatus according to a third embodiment of the present invention. FIG. 11 illustrates a block diagram of an exemplary device 60 suitable for use in implementing embodiments of the present invention. The device 60 shown in fig. 11 is only an example and should not bring any limitation to the function and scope of use of the embodiments of the present invention.
As shown in FIG. 11, device 60 is embodied in a general purpose computing device. The components of the device 60 may include, but are not limited to: one or more processors or processing units 601, a system memory 602, and a bus 603 that couples various system components including the system memory 602 and the processing unit 601.
Bus 603 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, Industry Standard Architecture (ISA) bus, micro-channel architecture (MAC) bus, enhanced ISA bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus.
Device 60 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by device 60 and includes both volatile and nonvolatile media, removable and non-removable media.
The system memory 602 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM)604 and/or cache memory 605. The device 60 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 606 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 11, and commonly referred to as a "hard drive"). Although not shown in FIG. 11, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to the bus 603 by one or more data media interfaces. Memory 602 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the invention.
A program/utility 608 having a set (at least one) of program modules 607 may be stored, for example, in memory 602, such program modules 607 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each of which examples or some combination thereof may comprise an implementation of a network environment. The program modules 607 generally perform the functions and/or methods of the described embodiments of the invention.
Device 60 may also communicate with one or more external devices 609 (e.g., keyboard, pointing device, display 610, etc.), with one or more devices that enable a user to interact with device 60, and/or with any devices (e.g., network card, modem, etc.) that enable device 60 to communicate with one or more other computing devices. Such communication may occur via an input/output (I/O) interface 611. Also, device 60 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the internet) via network adapter 612. As shown, a network adapter 612 communicates with the other modules of device 60 via bus 603. It should be appreciated that although not shown in FIG. 11, other hardware and/or software modules may be used in conjunction with device 60, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
The processing unit 601 executes various functional applications and data processing by executing programs stored in the system memory 602, for example, to implement the category determination method provided by the embodiment of the present invention.
Example four
A storage medium containing computer-executable instructions for performing a method for category determination when executed by a computer processor is also provided in a fourth embodiment of the present invention.
The method comprises the following steps:
acquiring a target statement;
determining a target vocabulary corresponding to the target statement, processing the target vocabulary based on a classification model obtained by pre-training, and determining the classification category of the target statement;
the classification model is composed of a double-layer long short-term memory network layer, a full connection layer and an activation function layer and is used for determining the category of each statement.
Computer storage media for embodiments of the invention may employ any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for embodiments of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C + + or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (11)

1. A category determination method applied to knowledge-graph questions and answers comprises the following steps:
acquiring a target statement;
determining a target vocabulary corresponding to the target statement, processing the target vocabulary based on a classification model obtained by pre-training, and determining the classification category of the target statement;
the classification model is composed of a double-layer long short-term memory network layer, a full connection layer and an activation function layer and is used for determining the category of each statement.
2. The method of claim 1, further comprising, after the obtaining the target statement:
dividing the target sentence into at least one vocabulary to be processed based on a preset word segmentation tool;
and determining a target word of the target sentence according to the at least one word to be processed.
3. The method of claim 1, further comprising:
determining target word frequency corresponding to each target vocabulary according to a predetermined word frequency dictionary;
and determining an initialization word vector of the target vocabulary based on the target word frequency so as to determine a target word vector input to the classification model based on the initialization word vector.
4. The method of claim 3, further comprising a word embedding layer or a mapping word vector submodel in the classification model, wherein the determining a target word vector for input into the classification model based on the initialization word vector comprises:
and processing the initialization word vector into a target word vector containing semantic information based on the word embedding layer or the mapping word vector submodel.
5. The method of claim 1, further comprising: constructing the classification model; wherein constructing the classification model includes: the word embedding layer or the mapping word vector submodel, the double-layer long short-term memory network layer, the full connection layer and the activation function layer;
inputting an initial word vector corresponding to a target sentence into the word embedding layer or the mapping word vector submodel, and taking the output result as the input of the double-layer long short-term memory network layer;
taking the output of the double-layer long short-term memory network layer as the input of the full connection layer, and inputting it to the full connection layer;
and taking the output result of the full connection layer as the input of the activation function layer, inputting it into the activation function layer, and taking the output result of the activation function layer as the classification category corresponding to the target statement.
6. The method of claim 5, further comprising: training the classification model;
the training the classification model includes:
obtaining a plurality of training sample data, wherein the training sample data comprises training sentences and label types corresponding to the training sentences; wherein the label category is determined based on a predetermined mapping relation table;
aiming at each training sample data, performing text vectorization on the training sample data to obtain initial text vectors with uniform input lengths;
inputting the training sample data into a pre-constructed classification model to be trained to obtain a training result;
correcting a loss function in the classification model to be trained based on a training result and a sample result of the training sample data, and training to obtain the classification model to be used by taking the convergence of the loss function as a training target;
and verifying the classification model to be used based on verification sample data, and taking the classification model to be used as the classification model when a verification result meets a preset condition.
7. The method according to claim 6, wherein the verifying the classification model to be used based on verification sample data, and when a verification result satisfies a preset condition, using the classification model to be used as the classification model comprises:
acquiring a preset amount of verification sample data;
inputting the verification sample data into the classification model to be used, and determining the accuracy of the classification model to be used based on the output result of the classification model to be used;
when the accuracy reaches a preset accuracy threshold, taking the classification model to be used as a classification model;
if the accuracy rate does not reach the accuracy rate threshold value, obtaining training sample data to continue training the classification model to be used until the accuracy rate of the classification model to be used reaches the preset accuracy rate threshold value.
8. The method according to any one of claims 1-7, wherein the method is applied to each sentence in a knowledge-graph question-answer.
9. A class determination device configured in a knowledge-graph question-answer, comprising:
the target statement acquisition module is used for acquiring a target statement;
the category determination module is used for determining a target vocabulary corresponding to the target statement, processing the target vocabulary based on a classification model obtained by pre-training, and determining the classification category of the target statement;
the classification model is composed of a double-layer long short-term memory network layer, a full connection layer and an activation function layer and is used for determining the category of each statement.
10. An apparatus, characterized in that the apparatus comprises:
one or more processors;
a storage device for storing one or more programs,
when executed by the one or more processors, cause the one or more processors to implement the category determination method of any of claims 1-8.
11. A storage medium containing computer-executable instructions for performing the category determination method of any one of claims 1-8 when executed by a computer processor.
CN202010849763.9A 2020-08-21 2020-08-21 Category determination method, device, equipment and medium Pending CN112214595A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010849763.9A CN112214595A (en) 2020-08-21 2020-08-21 Category determination method, device, equipment and medium

Publications (1)

Publication Number Publication Date
CN112214595A true CN112214595A (en) 2021-01-12

Family

ID=74059367

Country Status (1)

Country Link
CN (1) CN112214595A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11010692B1 (en) * 2020-12-17 2021-05-18 Exceed AI Ltd Systems and methods for automatic extraction of classification training data
CN113392642A (en) * 2021-06-04 2021-09-14 北京师范大学 System and method for automatically labeling child-bearing case based on meta-learning
CN113569024A (en) * 2021-07-19 2021-10-29 上海明略人工智能(集团)有限公司 Card category identification method and device, electronic equipment and computer storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110019782A (en) * 2017-09-26 2019-07-16 北京京东尚科信息技术有限公司 Method and apparatus for exporting text categories
US10528866B1 (en) * 2015-09-04 2020-01-07 Google Llc Training a document classification neural network
CN110705225A (en) * 2019-08-15 2020-01-17 平安信托有限责任公司 Contract marking method and device
CN111159366A (en) * 2019-12-05 2020-05-15 重庆兆光科技股份有限公司 Question-answer optimization method based on orthogonal theme representation
CN111538823A (en) * 2020-04-26 2020-08-14 支付宝(杭州)信息技术有限公司 Information processing method, model training method, device, equipment and medium


CN116385937A (en) Method and system for solving video question and answer based on multi-granularity cross-mode interaction framework
CN111125550B (en) Point-of-interest classification method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20220916

Address after: 25 Financial Street, Xicheng District, Beijing 100033

Applicant after: CHINA CONSTRUCTION BANK Corp.

Address before: 25 Financial Street, Xicheng District, Beijing 100033

Applicant before: CHINA CONSTRUCTION BANK Corp.

Applicant before: Jianxin Financial Science and Technology Co., Ltd.
