WO2020253052A1 - Behavior recognition method based on natural semantic understanding, and related device

Info

Publication number: WO2020253052A1
Application number: PCT/CN2019/117867
Authority: WO
Inventors: 沈越; 苏宇; 王小鹏
Original assignee: 平安普惠企业管理有限公司
Priority date: 2019-06-18
Filing date: 2019-11-13
Publication date: 2020-12-24
Also published as: CN110321558B; CN110321558A

Abstract

A behavior recognition method based on natural semantic understanding, and a related device. The method comprises: extracting, by means of a word segmentation algorithm in an auto-encoding model, text features from a plurality of sentences in a first document so as to constitute a plurality of first vectors, wherein the text features in each sentence constitute a first vector; training, by means of an attention network in the auto-encoding model, the plurality of first vectors so as to obtain an attention weight of each first vector in the plurality of first vectors; inputting the plurality of first vectors and the attention weight of each first vector in the plurality of first vectors into an LSTM for training so as to generate a first semantic vector; decoding, by means of the LSTM, the first semantic vector so as to obtain a plurality of first decoding vectors; and if the plurality of first decoding vectors and the plurality of first vectors satisfy a pre-set similarity condition, comparing the first semantic vector with a second semantic vector of a second document so as to determine whether there is a target behavior. By means of the method, a target behavior can be determined more accurately.

Description

A behavior recognition method and related equipment based on natural semantic understanding

This application claims priority to Chinese patent application No. 201910529267.2, filed with the Chinese Patent Office on June 18, 2019 and entitled "An anti-cheating method and related equipment based on natural semantic understanding", the entire contents of which are incorporated herein by reference.

Technical field

This application relates to the field of computer technology, in particular to a behavior recognition method and related equipment based on natural semantic understanding.

Background technique

At present, many recruitment processes include written examinations, and non-compliant behavior (such as cheating) in written examinations is common. Many companies currently identify non-compliant behavior by manual screening and comparison. Manual screening is feasible when the number of applicants is small, but not when the number of applicants is large. With the development of artificial intelligence, some companies have tried to identify non-compliant behavior by computer. The current principle of computer recognition is to compare two documents directly: if the contents of the two documents are the same, non-compliance is deemed to exist; if they are not the same, non-compliance is deemed not to exist. An offender can easily evade this kind of detection, for example by changing the answer slightly (such as substituting synonyms) or by slightly changing the order of the statements in the document. After keywords are changed and the sentence order is adjusted, the computer no longer detects non-compliance, although non-compliance has in fact occurred. How to identify non-compliant behavior more accurately and efficiently by computer is a technical problem being studied by those skilled in the art.

Summary of the invention

The embodiments of the present application disclose a behavior recognition method and device based on natural semantic understanding, which can more accurately determine cheating behavior.

In the first aspect, the embodiments of the present application provide a behavior recognition method based on natural semantic understanding, and the method includes:

Extracting text features in multiple sentences in the first document through a word segmentation algorithm in the self-encoding model to form multiple first vectors, where the text features in each sentence form a first vector;

Training the plurality of first vectors through the attention network in the auto-encoding model to obtain the attention weight of each first vector in the plurality of first vectors;

Inputting the plurality of first vectors and the attention weight of each first vector of the plurality of first vectors into the Long Short-Term Memory (LSTM) network in the self-encoding model for training, to generate the first semantic vector;

Decoding the first semantic vector by the LSTM to obtain a plurality of first decoding vectors;

If the plurality of first decoding vectors and the plurality of first vectors satisfy a preset similarity condition, the first semantic vector is compared with the second semantic vector of the second document to determine whether there is a target behavior.

By implementing the above method, the extracted text features better reflect the semantics of the sentence itself. The encoding layer also uses an LSTM to generate the semantic vector, which can better describe the semantics of the document.

In the second aspect, an embodiment of the present application provides a behavior recognition device based on natural semantic understanding, and the device includes:

A first extraction unit, configured to extract the text features in multiple sentences in the first document through the word segmentation algorithm in the self-encoding model to form multiple first vectors, wherein the text features in each sentence form a first vector;

A first training unit, configured to train the multiple first vectors through the attention network in the self-encoding model to obtain the attention weight of each first vector in the multiple first vectors;

A first generating unit, configured to input the plurality of first vectors and the attention weight of each first vector of the plurality of first vectors into the long short-term memory network (LSTM) in the self-encoding model for training, to generate the first semantic vector;

A first decoding unit, configured to decode the first semantic vector through the LSTM to obtain multiple first decoding vectors;

A comparison unit, configured to compare the first semantic vector with the second semantic vector of the second document if the plurality of first decoded vectors and the plurality of first vectors satisfy a preset similarity condition, to determine whether there is a target behavior.

By running the above units, the extracted text features better reflect the semantics of the sentence itself. The encoding layer also uses an LSTM to generate the semantic vector, which can better describe the semantics of the document.

In a third aspect, an embodiment of the present application provides a device that includes a processor and a memory, wherein the memory is used to store instructions, and when the instructions run on the processor, the method described in the first aspect or in any possible implementation of the first aspect is implemented.

In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium storing instructions which, when run on a processor, implement the method described in the first aspect or in any possible implementation of the first aspect.

In a fifth aspect, an embodiment of the present application provides a non-volatile computer-readable storage medium storing instructions which, when run on a processor, implement the method described in the first aspect or in any possible implementation of the first aspect.

In a sixth aspect, the embodiments of the present application provide a computer program product, which, when the computer program product runs on a processor, implements the first aspect or the method described in any possible implementation manner of the first aspect.

Description of the drawings

In order to explain the embodiments of the present application or the technical solutions in the prior art more clearly, the following will briefly introduce the drawings that need to be used in the embodiments of the present application or the background technology.

FIG. 1 is a schematic flowchart of a behavior recognition method based on natural semantic understanding provided by an embodiment of the present application;

FIG. 2 is a schematic structural diagram of a device provided by an embodiment of the present application;

FIG. 3 is a schematic structural diagram of another device provided by an embodiment of the present application.

Detailed description

The technical solutions in the embodiments of the present application will be described below in conjunction with the drawings.

The main idea of the embodiments of this application is to obtain the semantic vector of a document through an autoencoder (AE), and then compare the semantic vectors of two documents. If the two semantic vectors are relatively close, the two documents are similar, from which the existence of the target behavior is determined. The self-encoding model includes an encoding layer and a decoding layer. The encoding layer includes a word segmentation algorithm (for example, a Convolutional Neural Network (CNN)), an attention network (Attention), and a Long Short-Term Memory network (LSTM); the decoding layer includes an LSTM.

The word segmentation algorithm is used to extract text features from a document in sentence units to form text vectors. The attention network is used to train multiple text vectors to obtain the attention weight of each text vector. Generally speaking, the more important the words represented by the text features are, the higher the attention weight obtained. At the encoding layer, the LSTM is used to train a semantic vector according to each text vector and the attention weight of each text vector. At the decoding layer, the LSTM is used to decode the semantic vector; a vector obtained after decoding can be called a decoding vector. The goal of the self-encoding model is to make the finally decoded decoding vectors converge as closely as possible to the text vectors from the encoding stage. If the convergence reaches a certain degree, the semantic vector obtained by LSTM encoding in the self-encoding model can basically represent the semantics of the corresponding text.

Identifying the target behavior (such as cheating) usually involves comparing two documents (for example, the respective answers of two applicants, or one applicant's answer and a standard answer). For ease of description, these two documents are referred to below as the first document and the second document.

Please refer to FIG. 1, which shows a behavior recognition method based on natural semantic understanding provided by an embodiment of the present application. The method can be implemented based on the self-encoding model shown in FIG. 1, and the device that executes the method can be a hardware device (such as a server) or a cluster composed of multiple hardware devices (such as a server cluster). The method includes but is not limited to the following steps:

Step S101: The device extracts text features in multiple sentences in the first document through the word segmentation algorithm in the self-encoding model to form multiple first vectors.

Specifically, the word segmentation algorithm can be a convolutional neural network (CNN), which is good at denoising sentences and removing redundancy (filtering out characters or words that have little impact on the sentence). In addition, the model parameters of the word segmentation algorithm may include parameters obtained in advance by training on a large number of other documents, or may include manually configured parameters.

In the embodiment of this application, text features are extracted from the first document in sentence units to form feature vectors. For example, if the first document includes 20 sentences, text features can be extracted from each sentence, and the text features in each sentence constitute one feature vector. To distinguish them from the feature vectors extracted from the second document, a feature vector composed of text features extracted from the first document is called a first vector, and a feature vector composed of text features extracted from the second document is called a second vector. Optionally, if the first document contains 20 sentences, text features can also be extracted from only some of the sentences (for example, 18 of them, which can be selected from the 20 sentences by a predefined algorithm); the text features in each selected sentence still constitute one feature vector.
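As a rough illustration of working in sentence units, the following sketch splits a document into sentences and optionally keeps only some of them. The helper names `split_sentences` and `select_sentences`, the punctuation set, and the length-based selection rule are all assumptions made for illustration; the application does not specify which predefined selection algorithm is used.

```python
import re
from typing import List


def split_sentences(document: str) -> List[str]:
    """Split a document after common English/Chinese sentence-ending marks."""
    parts = re.split(r"(?<=[.!?。！？])\s*", document)
    return [p.strip() for p in parts if p.strip()]


def select_sentences(sentences: List[str], min_chars: int = 5) -> List[str]:
    """Hypothetical 'predefined algorithm': drop very short sentences."""
    return [s for s in sentences if len(s) >= min_chars]


doc = "I like basketball. I also play table tennis! Ok."
sentences = split_sentences(doc)
kept = select_sentences(sentences, min_chars=5)
print(sentences)
print(kept)
```

Each sentence that is kept would then be passed to the word segmentation algorithm to produce one feature vector.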

For example, suppose the first document contains the sentence "My hobby is playing basketball and table tennis", and the text features extracted from this sentence through the word segmentation algorithm are "I", "of" (的), "hobby", "is", "play", "basketball", "and", and "table tennis". When determining the first vector from these text features (i.e., words), all of the words or only some of them can be used. The words can be converted to vectors using one-hot encoding or pre-trained word vectors. Optionally, if all words are converted, the feature vector obtained from these 8 text features can be a first vector X11 = (t1, t2, t3, t4, t5, t6, t7, t8), where t1 represents "I", t2 represents "of", t3 represents "hobby", t4 represents "is", t5 represents "play", t6 represents "basketball", t7 represents "and", and t8 represents "table tennis". In this way, multiple first vectors can be obtained.
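The one-hot option mentioned above can be sketched as follows. The vocabulary here is built only from the eight words of the example sentence (given in English gloss), which is an assumption for illustration; a real system would use a vocabulary built from a much larger corpus, or pre-trained word vectors instead.

```python
from typing import Dict, List


def build_vocab(words: List[str]) -> Dict[str, int]:
    """Assign each distinct word an index in order of first appearance."""
    vocab: Dict[str, int] = {}
    for word in words:
        if word not in vocab:
            vocab[word] = len(vocab)
    return vocab


def one_hot(word: str, vocab: Dict[str, int]) -> List[int]:
    """Return a vector with a single 1 at the word's vocabulary index."""
    vec = [0] * len(vocab)
    vec[vocab[word]] = 1
    return vec


# Words of the example sentence, in segmentation order (t1..t8).
words = ["I", "of", "hobby", "is", "play", "basketball", "and", "table tennis"]
vocab = build_vocab(words)

# X11 as a sequence of one-hot word vectors, one per text feature.
x11 = [one_hot(w, vocab) for w in words]
print(x11[5])  # one-hot vector for "basketball"
```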

Step S102: The device trains the multiple first vectors through the attention network in the self-encoding model to obtain the attention weight of each first vector in the multiple first vectors.

Specifically, the attention network is used to characterize the importance of different first vectors. The model parameters of the attention network may include parameters obtained by training on a large number of other vectors (including important vectors and unimportant vectors), or may include manually set parameters. Therefore, when the multiple first vectors are input to the attention network, the attention weight of each first vector can be obtained. The greater the attention weight of a first vector, the more effective that first vector is in embodying semantics.

For example, if the multiple first vectors are X11, X12, X13, X14, X15, X16, X17, X18, X19, and X10, the attention weights of these first vectors obtained through attention network training are as shown in Table 1:

Table 1

First vector	Attention weight
X11	0.01
X12	0.05
X13	0.1
X14	0.2
X15	0.05
X16	0.09
X17	0.091
X18	0.009
X19	0.3
X10	0.1

It can be seen from Table 1 that the attention weights of the first vectors X19, X14, X13, and X10 are larger. These first vectors are therefore expected to carry more information for expressing the semantics of the first document than the other first vectors.
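The sketch below reproduces the Table 1 weights, checks that they sum to 1, and ranks the first vectors by weight. It also shows a softmax, which is one common way to turn raw importance scores into weights that sum to 1; the application does not state how the attention network normalizes its output, so the softmax and the example scores are assumptions.

```python
import math
from typing import Dict, List


def softmax(scores: List[float]) -> List[float]:
    """Normalize raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(weights: Dict[str, float], k: int) -> List[str]:
    """Names of the k vectors with the largest attention weights."""
    return sorted(weights, key=weights.get, reverse=True)[:k]


# Attention weights from Table 1.
table1 = {
    "X11": 0.01, "X12": 0.05, "X13": 0.1, "X14": 0.2, "X15": 0.05,
    "X16": 0.09, "X17": 0.091, "X18": 0.009, "X19": 0.3, "X10": 0.1,
}

print(round(sum(table1.values()), 6))
print(top_k(table1, 4))

# Hypothetical raw scores, only to show the normalization step.
print(softmax([1.0, 2.0, 3.0]))
```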

Step S103: The device inputs the plurality of first vectors and the attention weight of each first vector of the plurality of first vectors into the long short-term memory network (LSTM) in the self-encoding model for training, to generate a first semantic vector.

Specifically, the LSTM can generate a semantic vector from feature vectors that represent words. In the embodiment of the present application, the LSTM generates the first semantic vector based not only on each input first vector but also on the attention weight of each first vector. When describing semantics, it tends to focus more on the first vectors with greater attention weights. For example, if the first vector X19 mainly expresses the meaning of "like", the first vector X15 mainly expresses the meaning of "hate", and the attention weight of X19 is much greater than that of X15, then the generated first semantic vector is more inclined to express the meaning of "like".

The LSTM obtaining the first semantic vector from the multiple first vectors and their corresponding attention weights can be regarded as an encoding process: before encoding there are multiple vectors, and after encoding a single vector is obtained. Table 2 exemplarily shows the vectors before and after encoding.
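The effect of the attention weights can be illustrated with a deliberately simplified stand-in. The sketch below is not an LSTM: it combines the vectors by an attention-weighted sum, which is only meant to show why a vector with a larger weight dominates the result. The two-dimensional "like"/"hate" vectors are hypothetical, and the weights 0.3 and 0.05 are those of X19 and X15 in Table 1.

```python
from typing import List


def weighted_sum(vectors: List[List[float]], weights: List[float]) -> List[float]:
    """Attention-weighted sum of equally sized vectors (not an LSTM)."""
    dim = len(vectors[0])
    return [
        sum(w * v[i] for v, w in zip(vectors, weights))
        for i in range(dim)
    ]


# Hypothetical 2-D encodings: axis 0 = "like", axis 1 = "hate".
x19 = [1.0, 0.0]  # mainly expresses "like"
x15 = [0.0, 1.0]  # mainly expresses "hate"

semantic = weighted_sum([x19, x15], [0.3, 0.05])
print(semantic)  # the "like" component is larger than the "hate" component
```

In the method itself the combination is learned by the LSTM, so the actual semantic vector is not a plain weighted sum.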

Table 2

Step S104: The device decodes the first semantic vector through the LSTM to obtain multiple first decoded vectors.

Specifically, after the encoding layer obtains the first semantic vector through the LSTM, the first semantic vector is decoded through the LSTM in the decoding layer. A vector obtained by decoding may be called a first decoding vector to facilitate subsequent description. Before decoding there is a single vector, and after decoding there are multiple vectors. Table 3 exemplarily shows the vectors before and after decoding.

Table 3

The goal of the self-encoder in the embodiment of this application is to make the multiple first decoded vectors obtained by LSTM decoding in the decoding layer converge to the multiple first vectors obtained by the word segmentation algorithm, that is, to make the multiple first decoded vectors as close as possible to the multiple first vectors (a loss function can be defined in advance to specify the degree of convergence). Generally speaking, steps S101-S104 need to be performed multiple times. After each execution of steps S101-S104, if the multiple first decoded vectors and the multiple first vectors do not meet the expected similarity condition, the model parameters of at least one of the word segmentation algorithm, the attention network, and the LSTM in the self-encoding model are optimized, and steps S101-S104 are executed again after the optimization. This loop is repeated until the multiple first decoded vectors and the multiple first vectors meet the expected similarity condition.

The expected similarity condition (also called the preset similarity condition) can be configured in the self-encoding model in advance, so that the self-encoding model is able to judge whether the expected similarity condition has been reached. A simple case is used below to illustrate how to judge whether the multiple first decoded vectors and the multiple first vectors meet the expected similarity condition (more complicated rules can be configured in practical applications).

For example, suppose it is defined that if more than 70% of the first decoded vectors obtained by decoding are the same as the corresponding first vectors, the multiple first decoded vectors and the multiple first vectors are considered to meet the expected similarity condition. Then, if there are 10 first vectors and 10 first decoded vectors after decoding, of which 8 first decoded vectors are the same as 8 of the first vectors in one-to-one correspondence and only the remaining 2 first decoded vectors have no identical first vector, the same rate is 80%, which is greater than the prescribed 70%. The 10 first decoded vectors and the 10 first vectors are therefore considered to meet the expected similarity condition.
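The loop structure described above (encode and decode, check the condition, adjust parameters, repeat) can be sketched with a toy model. Everything model-specific here is an assumption for illustration: the "encoder" and "decoder" are single scalar multipliers `a` and `b` rather than a CNN, attention network, and LSTM, the loss is mean squared error, and the update rule is plain gradient descent.

```python
from typing import List, Tuple


def train_until_similar(
    inputs: List[float],
    tol: float = 1e-6,
    lr: float = 0.1,
    max_rounds: int = 1000,
) -> Tuple[float, float, int]:
    """Repeat encode/decode and parameter updates until the loss is below tol."""
    a, b = 0.5, 0.5  # hypothetical encoder / decoder parameters
    n = len(inputs)
    for rounds in range(1, max_rounds + 1):
        # Encode then decode every input (stands in for steps S101-S104).
        decoded = [b * (a * x) for x in inputs]
        loss = sum((d - x) ** 2 for d, x in zip(decoded, inputs)) / n
        if loss < tol:  # expected similarity condition is met
            return a, b, rounds
        # Gradients of the mean squared error with respect to a and b.
        grad_a = sum(2 * (d - x) * b * x for d, x in zip(decoded, inputs)) / n
        grad_b = sum(2 * (d - x) * a * x for d, x in zip(decoded, inputs)) / n
        a -= lr * grad_a
        b -= lr * grad_b
    raise RuntimeError("similarity condition not met within max_rounds")


a, b, rounds = train_until_similar([1.0, 2.0, -1.5])
print(a * b, rounds)  # a * b approaches 1, so decoding reproduces the input
```

The loop exits only when the reconstruction is close enough to the input, which mirrors the stopping rule in the text.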
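The 70% rule in this example can be written down directly. In the sketch below, the function names, the element-wise tolerance used to decide that two vectors are "the same", and the sample vectors are assumptions; only the strict "more than 70%" threshold and the 8-out-of-10 case come from the text.

```python
from typing import List


def same_rate(
    decoded: List[List[float]],
    originals: List[List[float]],
    tol: float = 1e-6,
) -> float:
    """Fraction of decoded vectors equal to their original, within tol."""
    matches = sum(
        1
        for d, x in zip(decoded, originals)
        if all(abs(di - xi) <= tol for di, xi in zip(d, x))
    )
    return matches / len(originals)


def meets_condition(
    decoded: List[List[float]],
    originals: List[List[float]],
    threshold: float = 0.7,
) -> bool:
    """True if more than `threshold` of the decoded vectors match."""
    return same_rate(decoded, originals) > threshold


# 10 hypothetical first vectors; 8 are reconstructed exactly, 2 are not.
originals = [[float(i), float(i + 1)] for i in range(10)]
decoded = [list(v) for v in originals]
decoded[8] = [0.0, 0.0]
decoded[9] = [0.0, 0.0]

print(same_rate(decoded, originals))
print(meets_condition(decoded, originals))
```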

Step S105: The device extracts the text features in multiple sentences in the second document through the word segmentation algorithm in the self-encoding model to form multiple second vectors.

Specifically, the embodiment of the present application extracts text features from the second document in sentence units to form feature vectors. For example, if the second document includes 16 sentences, text features can be extracted from each sentence, and the text features in each sentence constitute one feature vector. To distinguish them from the feature vectors extracted from the first document, a feature vector composed of text features extracted from the second document is called a second vector, and a feature vector composed of text features extracted from the first document is called a first vector. Optionally, if the second document contains 16 sentences, text features can also be extracted from only some of the sentences (for example, 15 of them, which can be selected from the 16 sentences by a predefined algorithm); the text features in each selected sentence still constitute one feature vector.

For example, if the second document contains the sentence "My hobby is playing basketball and badminton", and the text features extracted from this sentence through the word segmentation algorithm are "hobby", "basketball", and "badminton", then the feature vector obtained from these three text features can be a second vector X21 = (t1, t2, t4), where t1 represents "hobby", t2 represents "basketball", and t4 represents "badminton". In this way, multiple second vectors can be obtained.

Step S106: Train the multiple second vectors through the attention network in the self-encoding model to obtain the attention weight of each of the multiple second vectors.

Specifically, the attention network is used to characterize the importance of different second vectors. The model parameters of the attention network may include parameters obtained by training on a large number of other vectors (including important vectors and unimportant vectors), or may include manually set parameters. Therefore, when the multiple second vectors are input to the attention network, the attention weight of each second vector can be obtained. The greater the attention weight of a second vector, the more effective that second vector is in embodying semantics.

For example, if the multiple second vectors are X21, X22, X23, X24, X25, X26, X27, X28, X29, and X20, the attention weights of these second vectors obtained through attention network training are as shown in Table 4:

Table 4

Second vector	Attention weight
X21	0.02
X22	0.04
X23	0.15
X24	0.15
X25	0.04
X26	0.1
X27	0.09
X28	0.01
X29	0.3
X20	0.1

It can be seen from Table 4 that the attention weights of the second vectors X29, X24, X23, and X20 are larger. These second vectors are therefore expected to carry more information for expressing the semantics of the second document than the other second vectors.

Step S107: Input the plurality of second vectors and the attention weight of each second vector in the plurality of second vectors into the long short-term memory network (LSTM) in the self-encoding model for training, to generate a second semantic vector.

Specifically, the LSTM can generate a semantic vector from feature vectors that represent words. In the embodiment of this application, the LSTM generates the second semantic vector based not only on each input second vector but also on the attention weight of each second vector. When describing semantics, it tends to focus more on the second vectors with greater attention weights. For example, if the second vector X29 mainly expresses the meaning of "happy", the second vector X25 mainly expresses the meaning of "irritable", and the attention weight of X29 is much greater than that of X25, then the generated second semantic vector is more inclined to express the meaning of "happy".

The LSTM obtaining the second semantic vector from the multiple second vectors and their corresponding attention weights can be regarded as an encoding process: before encoding there are multiple vectors, and after encoding a single vector is obtained. Table 5 exemplarily shows the vectors before and after encoding.

Table 5

Step S108: Decode the second semantic vector through the LSTM to obtain multiple second decoded vectors.

Specifically, after the encoding layer obtains the second semantic vector through the LSTM, the second semantic vector is decoded through the LSTM in the decoding layer. A vector obtained by decoding may be called a second decoding vector to facilitate subsequent description. Before decoding there is a single vector, and after decoding there are multiple vectors. Table 6 exemplarily shows the vectors before and after decoding.

Table 6

The goal of the self-encoder in the embodiment of this application is to make the multiple second decoded vectors obtained by LSTM decoding in the decoding layer converge to the multiple second vectors obtained through the word segmentation algorithm, that is, to make the multiple second decoded vectors as close as possible to the multiple second vectors. Generally speaking, steps S105-S108 need to be performed multiple times. After each execution of steps S105-S108, if the multiple second decoded vectors and the multiple second vectors do not meet the expected similarity condition, the model parameters of at least one of the word segmentation algorithm, the attention network, and the LSTM in the self-encoding model are optimized, and steps S105-S108 are executed again after the optimization. This loop is repeated until the multiple second decoded vectors and the multiple second vectors meet the expected similarity condition.

The expected similarity condition (also called the preset similarity condition) can be configured in the self-encoding model in advance, so that the self-encoding model is able to judge whether the expected similarity condition has been reached. A simple case is used below to illustrate how to judge whether the multiple second decoded vectors and the multiple second vectors meet the expected similarity condition (more complicated rules can be configured in practical applications).

For example, suppose it is defined that if more than 70% of the second decoded vectors obtained by decoding are the same as the corresponding second vectors, the multiple second decoded vectors and the multiple second vectors are considered to meet the expected similarity condition. Then, if there are 10 second vectors and 10 second decoded vectors after decoding, of which 8 second decoded vectors are the same as 8 of the second vectors in one-to-one correspondence and only the remaining 2 second decoded vectors have no identical second vector, the same rate is 80%, which is greater than the prescribed 70%. The 10 second decoded vectors and the 10 second vectors are therefore considered to meet the expected similarity condition.

Step S109: The device compares the first semantic vector with the second semantic vector of the second document to determine whether there is a target behavior (such as a cheating behavior).

Specifically, when the plurality of first decoding vectors and the plurality of first vectors satisfy the preset similarity condition, the first semantic vector reflects the semantics of the first document well; and when the plurality of second decoding vectors and the plurality of second vectors satisfy the preset similarity condition, the second semantic vector reflects the semantics of the second document well. Therefore, when both conditions are satisfied, the similarity between the first semantic vector and the second semantic vector can be compared to reflect the similarity between the first document and the second document. There are many ways to compare the similarity between the first semantic vector and the second semantic vector; an example is given below.

For example, comparing the first semantic vector with the second semantic vector of the second document to determine whether there is a target behavior may specifically be: determining the cosine value of the first semantic vector and the second semantic vector; if the cosine value is greater than or equal to a preset threshold, the semantics of the first document and the second document are considered very similar, so it is determined that there is a target behavior. The preset threshold can be set according to actual needs, and optionally can be set to a value between 0.6 and 0.9.
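A minimal sketch of the cosine comparison follows. The function names, the default threshold of 0.8 (chosen from within the 0.6 to 0.9 range mentioned above), the handling of zero vectors, and the sample vectors are assumptions for illustration.

```python
import math
from typing import List


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine of the angle between two equally sized vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: treat a zero vector as dissimilar
    return dot / (norm_u * norm_v)


def has_target_behavior(
    first_semantic: List[float],
    second_semantic: List[float],
    threshold: float = 0.8,
) -> bool:
    """True if the cosine value is greater than or equal to the threshold."""
    return cosine(first_semantic, second_semantic) >= threshold


# Hypothetical semantic vectors of two documents.
print(has_target_behavior([1.0, 2.0, 3.0], [2.0, 4.0, 6.1]))  # nearly parallel
print(has_target_behavior([1.0, 0.0], [0.0, 1.0]))            # orthogonal
```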

In an optional solution, keyword replacement has been performed on the first document before step S101 is performed, and keyword replacement has been performed on the second document before step S102 is performed. It should be noted that replacing synonymous keywords makes it easier for the device to perform extraction and word segmentation, and also makes comparison between different documents easier. For example, if the first document contains the sentence "I am proficient in front-end development" and the second document contains the sentence "I am good at front-end development", then "good at" and "proficient in" in these two sentences are essentially synonyms, and the semantics of the two sentences are the same. If no synonym replacement is performed, there is a certain risk that the device will recognize the two sentences as having different meanings.

It should be noted that the first document and the second document above can be the application answer sheets of two different applicants, or the answer sheets of two different candidates in an examination, or two comparable documents in other scenarios.

By implementing the above method, text features are extracted from the document in sentence units, thereby generating a feature vector for each sentence. In this way, the important semantics of each sentence can be retained as much as possible, so that the subsequently generated semantic vector can better reflect the semantics of the document.
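The keyword replacement can be sketched as a dictionary-based normalization applied to each document before feature extraction. The synonym table and the function name `normalize` are hypothetical; the application does not say how the synonym list is obtained.

```python
import re
from typing import Dict

# Hypothetical synonym table: variant phrase -> canonical phrase.
SYNONYMS: Dict[str, str] = {
    "good at": "proficient in",
}


def normalize(text: str, synonyms: Dict[str, str] = SYNONYMS) -> str:
    """Replace each variant phrase with its canonical form."""
    for variant, canonical in synonyms.items():
        pattern = r"\b" + re.escape(variant) + r"\b"
        text = re.sub(pattern, canonical, text, flags=re.IGNORECASE)
    return text


first = "I am proficient in front-end development"
second = "I am good at front-end development"
print(normalize(first) == normalize(second))
```

After normalization the two example sentences are identical, so later steps no longer need to learn that the two phrasings mean the same thing.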

The foregoing describes the method of the embodiment of the present application in detail, and the device of the embodiment of the present application is provided below.

Please refer to FIG. 2, which is a schematic structural diagram of a device 20 according to an embodiment of the present application. The device 20 may include a first extraction unit 201, a first training unit 202, a first generation unit 203, a first decoding unit 204, and a comparison unit 205. The detailed description of each unit is as follows.

The first extraction unit 201 is used for extracting text features in multiple sentences in the first document through the word segmentation algorithm in the self-encoding model to form multiple first vectors, wherein the text features in each sentence form a first vector;

The first training unit 202 is configured to train the multiple first vectors through the attention network in the self-encoding model to obtain the attention weight of each first vector in the multiple first vectors;

The first generating unit 203 is configured to input the plurality of first vectors and the attention weight of each first vector of the plurality of first vectors into the long short-term memory network (LSTM) in the self-encoding model for training, to generate the first semantic vector;

The first decoding unit 204 is configured to decode the first semantic vector through the LSTM to obtain multiple first decoding vectors;

The comparing unit 205 is configured to compare the first semantic vector with the second semantic vector of the second document if the plurality of first decoded vectors and the plurality of first vectors satisfy a preset similarity condition, to determine whether there is a target behavior.

By running the above units, text features are extracted from the document in sentence units, thereby generating a feature vector for each sentence instead of constructing a single feature vector from the text features of the entire document. This retains the important semantics of each sentence as much as possible, so that the subsequently generated semantic vector better reflects the semantics of the document. In addition, the encoding layer of the self-encoding model uses a CNN to extract text features. A CNN performs well at noise reduction and redundancy removal, so the extracted text features better reflect the semantics of the sentence itself. Furthermore, the attention network of the encoding layer trains an attention weight for each feature vector rather than for each individual text feature, which significantly reduces the training burden for attention weights, improves their training efficiency, and makes the trained attention weights more valuable. The encoding layer also uses an LSTM to generate the semantic vector, which can better describe the semantics of the document.

In a possible implementation manner, the device 20 further includes:

The second extraction unit is configured to extract text features in multiple sentences in the second document through the word segmentation algorithm in the self-encoding model to form multiple second vectors, wherein the text features in each sentence constitute a second vector;

A second training unit, configured to train the multiple second vectors through the attention network in the self-encoding model to obtain the attention weight of each second vector in the multiple second vectors;

The second generating unit is configured to input the plurality of second vectors and the attention weight of each second vector of the plurality of second vectors into the long short-term memory (LSTM) network in the self-encoding model for training, so as to generate the second semantic vector;

The second decoding unit is configured to decode the second semantic vector through the LSTM to obtain a plurality of second decoding vectors, wherein the plurality of second decoding vectors and the plurality of second vectors satisfy a preset similarity condition.

In another possible implementation manner, the comparing unit compares the first semantic vector with the second semantic vector of the second document to determine whether there is a target behavior, including:

Determining the cosine values of the first semantic vector and the second semantic vector;

If the cosine value is greater than or equal to the preset threshold, it is determined that there is a target behavior.
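The cosine comparison above can be sketched as follows. The function names and the threshold value of 0.9 are hypothetical; the application only states that a preset threshold exists.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors (an assumption)
    return dot / (norm_a * norm_b)

def has_target_behavior(first_semantic, second_semantic, threshold=0.9):
    """True when the cosine value is greater than or equal to the threshold.

    The default threshold is a placeholder, not a value from the application.
    """
    return cosine_similarity(first_semantic, second_semantic) >= threshold

parallel = has_target_behavior([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = has_target_behavior([1.0, 0.0], [0.0, 1.0])
```

Cosine similarity ignores vector length, so two documents whose semantic vectors point in the same direction are treated as similar even if the magnitudes differ.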

In yet another possible implementation manner, the device 20 further includes:

The adjustment unit is configured to, before the first extraction unit extracts the word features in the multiple sentences in the first document through the word segmentation algorithm in the self-encoding model to form multiple first vectors, adjust the parameters of at least one of the word segmentation algorithm, the attention network, and the LSTM in the self-encoding model, so that the output of the self-encoding model converges to the input of the self-encoding model.
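The idea of adjusting parameters until the output of the self-encoding model converges to its input can be illustrated with a deliberately tiny stand-in. The sketch below is not the CNN, attention, and LSTM model of the application: it is a toy scalar autoencoder with one encoder weight and one decoder weight, trained by plain gradient descent on the mean squared reconstruction error. All names, the learning rate, and the stopping tolerance are assumptions.

```python
def train_toy_autoencoder(samples, learning_rate=0.01, max_steps=500,
                          tolerance=1e-8):
    """Adjust two weights until decode(encode(x)) is close to x.

    Toy model: encode(x) = w_enc * x, decode(z) = w_dec * z.
    Returns the final weights and the final reconstruction loss.
    """
    w_enc, w_dec = 0.5, 0.5
    n = len(samples)
    loss = float("inf")
    for _ in range(max_steps):
        # Forward pass: reconstruct every sample and measure the error.
        errors = [w_dec * (w_enc * x) - x for x in samples]
        loss = sum(e * e for e in errors) / n
        if loss < tolerance:
            break  # output has converged to the input
        # Gradients of the mean squared error with respect to each weight.
        grad_dec = sum(2 * e * (w_enc * x) for e, x in zip(errors, samples)) / n
        grad_enc = sum(2 * e * (w_dec * x) for e, x in zip(errors, samples)) / n
        w_enc -= learning_rate * grad_enc
        w_dec -= learning_rate * grad_dec
    return w_enc, w_dec, loss

w_enc, w_dec, final_loss = train_toy_autoencoder([1.0, 2.0, 3.0])
```

In this toy case the reconstruction is exact when the product of the two weights equals 1, which is what training drives toward.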

In another possible implementation manner, the first extraction unit is configured to extract text features in multiple sentences in the first document through a word segmentation algorithm in the self-encoding model to form multiple first vectors, specifically:

The text features in the multiple sentences in the first document are extracted through the convolutional neural network CNN in the self-encoding model to form multiple first vectors.
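One common way to turn a sentence into a fixed-length feature vector with a CNN is a one-dimensional convolution over the word embeddings followed by max-over-time pooling. The sketch below assumes that design, which the application does not spell out; the function name, the ReLU activation, and the toy filters are hypothetical.

```python
def conv1d_max_pool(word_embeddings, filters):
    """Produce one feature per filter for a single sentence.

    word_embeddings: list of word vectors for the sentence.
    filters: list of kernels; each kernel is a list of `width` weight
             rows with the same dimension as a word vector.
    """
    features = []
    for kernel in filters:
        width = len(kernel)
        responses = []
        # Slide the kernel over every window of `width` consecutive words.
        for start in range(len(word_embeddings) - width + 1):
            window = word_embeddings[start:start + width]
            activation = sum(
                w * x
                for kernel_row, word in zip(kernel, window)
                for w, x in zip(kernel_row, word)
            )
            responses.append(max(0.0, activation))  # ReLU
        # Max-over-time pooling keeps the strongest response per filter.
        features.append(max(responses) if responses else 0.0)
    return features

sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_filters = [
    [[1.0, 0.0], [0.0, 1.0]],  # width-2 kernel
    [[-1.0, -1.0]],            # width-1 kernel
]
first_vector = conv1d_max_pool(sentence, toy_filters)
```

Because pooling keeps only the strongest response of each filter, sentences of different lengths map to vectors of the same length, one entry per filter.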

It should be noted that the implementation of each unit may also correspond to the corresponding description of the method embodiment shown in FIG. 1.

Please refer to FIG. 3, which shows a device 30 provided by an embodiment of the present application. The device 30 includes a processor 301, a memory 302, and a communication interface 303. The processor 301, the memory 302, and the communication interface 303 are connected to each other through a bus.

The memory 302 includes but is not limited to random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), or compact disc read-only memory (CD-ROM). The memory 302 is used to store related instructions and data. The communication interface 303 is used to receive and send data.

The processor 301 may be one or more central processing units (CPUs). When the processor 301 is a CPU, the CPU may be a single-core CPU or a multi-core CPU.

The processor 301 in the device 30 is configured to read the program code stored in the memory 302 and perform the following operations:

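Extracting text features in multiple sentences in the first document through a word segmentation algorithm in the self-encoding model to form multiple first vectors, wherein the text features in each sentence form a first vector;

Training the plurality of first vectors through the attention network in the self-encoding model to obtain the attention weight of each first vector in the plurality of first vectors;

Inputting the plurality of first vectors and the attention weight of each first vector of the plurality of first vectors into the long short-term memory (LSTM) network in the self-encoding model for training, so as to generate a first semantic vector;

Decoding the first semantic vector through the LSTM to obtain a plurality of first decoding vectors;

If the plurality of first decoding vectors and the plurality of first vectors satisfy a preset similarity condition, comparing the first semantic vector with the second semantic vector of the second document to determine whether there is a target behavior.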

Through the implementation of the above method, word features are extracted sentence by sentence, so that one feature vector is generated for each sentence rather than a single feature vector being built from the word features of the entire document. This retains as much as possible of the important semantics in each sentence, so that the semantic vector generated subsequently better reflects the semantics of the document. In addition, the coding layer of the self-encoding model uses a CNN to extract word features. A CNN has good noise reduction and de-redundancy performance, so the extracted text features better reflect the semantics of the sentence itself. Furthermore, the attention network of the coding layer trains one attention weight per feature vector rather than one attention weight per word feature, which significantly reduces the training burden for the attention weights, improves their training efficiency, and makes the trained attention weights more valuable. The coding layer also uses an LSTM to generate the semantic vector, which can better describe the semantics of the document.
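To make the LSTM step more concrete, the following is a minimal sketch of an LSTM encoder that consumes attention-weighted sentence vectors and returns its final hidden state as a document-level semantic vector. Several things here are assumptions rather than statements from the application: that the attention weight simply scales its sentence vector before the LSTM sees it, that the final hidden state serves as the semantic vector, and that the weights are random and untrained. The class name `TinyLSTMEncoder` is hypothetical.

```python
import math
import random

def _sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def _affine(weights, vector, bias):
    """Compute weights @ vector + bias for a list-of-rows matrix."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]

class TinyLSTMEncoder:
    """Single-layer LSTM with random, untrained weights (illustration only)."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)
        concat = input_size + hidden_size
        self.hidden_size = hidden_size
        self.w = {
            gate: [[rng.uniform(-0.1, 0.1) for _ in range(concat)]
                   for _ in range(hidden_size)]
            for gate in ("input", "forget", "output", "cell")
        }
        self.b = {gate: [0.0] * hidden_size for gate in self.w}

    def encode(self, vectors, weights):
        """Return the final hidden state after reading all weighted vectors."""
        h = [0.0] * self.hidden_size
        c = [0.0] * self.hidden_size
        for vector, weight in zip(vectors, weights):
            # Assumption: the attention weight scales the sentence vector.
            x = [weight * value for value in vector] + h
            i = [_sigmoid(z) for z in _affine(self.w["input"], x, self.b["input"])]
            f = [_sigmoid(z) for z in _affine(self.w["forget"], x, self.b["forget"])]
            o = [_sigmoid(z) for z in _affine(self.w["output"], x, self.b["output"])]
            g = [math.tanh(z) for z in _affine(self.w["cell"], x, self.b["cell"])]
            c = [f_k * c_k + i_k * g_k for f_k, c_k, i_k, g_k in zip(f, c, i, g)]
            h = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c)]
        return h

encoder = TinyLSTMEncoder(input_size=2, hidden_size=3)
semantic_vector = encoder.encode([[1.0, 0.0], [0.0, 1.0]], [0.4, 0.6])
```

The output length depends only on the hidden size, so documents with different numbers of sentences yield semantic vectors that can be compared directly.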

In a possible implementation manner, before the processor compares the first semantic vector with the second semantic vector of the second document to determine whether there is a target behavior, it is also used to:

Extracting text features in multiple sentences in the second document by the word segmentation algorithm in the self-encoding model to form multiple second vectors, wherein the text features in each sentence form a second vector;

Training the plurality of second vectors through the attention network in the self-encoding model to obtain the attention weight of each second vector in the plurality of second vectors;

Inputting the plurality of second vectors and the attention weight of each second vector of the plurality of second vectors into the long short-term memory (LSTM) network in the self-encoding model for training, so as to generate a second semantic vector;

The second semantic vector is decoded by the LSTM to obtain a plurality of second decoding vectors, wherein the plurality of second decoding vectors and the plurality of second vectors satisfy a preset similarity condition.
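The application does not define the preset similarity condition between the decoding vectors and the original vectors. One possible reading, shown below purely as an assumption, is that the mean cosine similarity between each decoding vector and its corresponding input vector must reach a threshold. The function names and the threshold of 0.95 are hypothetical.

```python
import math

def _cosine(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def satisfies_similarity_condition(decoded_vectors, input_vectors,
                                   threshold=0.95):
    """Assumed condition: mean pairwise cosine similarity >= threshold."""
    if not input_vectors or len(decoded_vectors) != len(input_vectors):
        return False
    similarities = [_cosine(d, v)
                    for d, v in zip(decoded_vectors, input_vectors)]
    return sum(similarities) / len(similarities) >= threshold

good_reconstruction = satisfies_similarity_condition(
    [[1.0, 2.0], [3.0, 4.0]], [[1.0, 2.0], [3.0, 4.0]])
bad_reconstruction = satisfies_similarity_condition(
    [[1.0, 0.0]], [[0.0, 1.0]])
```

A check of this kind acts as a gate: the semantic vector is only used for comparison once the model has shown that it can reconstruct the sentence vectors it was built from.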

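In another possible implementation manner, the processor compares the first semantic vector with the second semantic vector of the second document to determine whether there is a target behavior, specifically:

Determining the cosine value of the first semantic vector and the second semantic vector;

If the cosine value is greater than or equal to the preset threshold, it is determined that there is a target behavior.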

In another possible implementation manner, before the processor extracts the text features in the multiple sentences in the first document through the word segmentation algorithm in the self-encoding model to form multiple first vectors, it is also used to:

Adjusting the parameters of at least one of the word segmentation algorithm, the attention network and the LSTM in the self-encoding model, so that the output of the self-encoding model converges to the input of the self-encoding model.

In another possible implementation manner, the processor extracts the text features in the multiple sentences in the first document through the word segmentation algorithm in the self-encoding model to form multiple first vectors, specifically:

The text features in the multiple sentences in the first document are extracted through the convolutional neural network CNN in the self-encoding model to form multiple first vectors.

It should be noted that the implementation of each operation may also correspond to the corresponding description of the method embodiment shown in FIG. 1.

An embodiment of the present application also provides a computer-readable storage medium that stores instructions. When the instructions run on a processor, the method flow shown in FIG. 1 is implemented.

An embodiment of the present application also provides a non-volatile computer-readable storage medium that stores instructions. When the instructions run on a processor, the method flow shown in FIG. 1 is implemented.

The embodiment of the present application also provides a computer program product. When the computer program product runs on a processor, the method flow shown in FIG. 1 is realized.

A person of ordinary skill in the art can understand that all or part of the processes in the above-mentioned embodiments can be implemented by a computer program instructing relevant hardware. The program can be stored in a computer-readable storage medium. When the program is executed, it may include the processes of the above-mentioned method embodiments. The aforementioned storage media include ROM, RAM, magnetic disks, optical disks, and other media that can store program code.

Claims

A behavior recognition method based on natural semantic understanding, which is characterized in that it includes:

Extracting text features in multiple sentences in the first document through a word segmentation algorithm in the self-encoding model to form multiple first vectors, where the text features in each sentence form a first vector;

Training the plurality of first vectors through the attention network in the auto-encoding model to obtain the attention weight of each first vector in the plurality of first vectors;

Inputting the plurality of first vectors and the attention weight of each first vector of the plurality of first vectors into the long short-term memory network in the self-encoding model for training, so as to generate a first semantic vector;

Decoding the first semantic vector through the long short-term memory network to obtain a plurality of first decoding vectors;

If the plurality of first decoding vectors and the plurality of first vectors satisfy a preset similarity condition, comparing the first semantic vector with the second semantic vector of the second document to determine whether there is a target behavior; wherein the second document is a document used for reference comparison, and the second semantic vector is used to characterize the semantics of the second document.
The method according to claim 1, wherein before comparing the first semantic vector with the second semantic vector of the second document to determine whether there is a target behavior, the method further comprises:

Extracting text features in multiple sentences in the second document by the word segmentation algorithm in the self-encoding model to form multiple second vectors, wherein the text features in each sentence form a second vector;

Training the plurality of second vectors through the attention network in the self-encoding model to obtain the attention weight of each second vector in the plurality of second vectors;

Inputting the plurality of second vectors and the attention weight of each second vector of the plurality of second vectors into the long short-term memory network in the self-encoding model for training, so as to generate a second semantic vector;

Decoding the second semantic vector through the long short-term memory network to obtain a plurality of second decoding vectors, wherein the plurality of second decoding vectors and the plurality of second vectors satisfy a preset similarity condition.
The method according to claim 1 or 2, wherein the comparing the first semantic vector with the second semantic vector of the second document to determine whether there is a target behavior comprises:

Determining the cosine values of the first semantic vector and the second semantic vector;

If the cosine value is greater than or equal to the preset threshold, it is determined that there is a target behavior.
The method according to claim 1 or 2, characterized in that, before extracting the text features in the multiple sentences in the first document by the word segmentation algorithm in the self-encoding model to form multiple first vectors, the method further comprises:

Adjusting the parameters of at least one of the word segmentation algorithm, the attention network, and the long short-term memory network in the self-encoding model, so that the output of the self-encoding model converges to the input of the self-encoding model.
The method according to claim 1 or 2, characterized in that, extracting text features in multiple sentences in the first document through a word segmentation algorithm in the self-encoding model to form multiple first vectors, comprising:

The text features in the multiple sentences in the first document are extracted through the convolutional neural network in the self-encoding model to form multiple first vectors.
The method according to claim 1 or 2, wherein the model parameters in the word segmentation algorithm include parameters obtained in advance by training on a large number of other documents.
A behavior recognition device based on natural semantic understanding, which is characterized in that it includes:

The first extraction unit is configured to extract the text features in multiple sentences in the first document through the word segmentation algorithm in the self-encoding model to form multiple first vectors, wherein the text features in each sentence form a first vector;

A first training unit, configured to train the multiple first vectors through the attention network in the self-encoding model to obtain the attention weight of each first vector in the multiple first vectors;

The first generating unit is configured to input the plurality of first vectors and the attention weight of each first vector of the plurality of first vectors into the long short-term memory network in the self-encoding model for training, so as to generate a first semantic vector;

A first decoding unit, configured to decode the first semantic vector through the long short-term memory network to obtain multiple first decoding vectors;

The comparison unit is configured to, if the plurality of first decoding vectors and the plurality of first vectors satisfy a preset similarity condition, compare the first semantic vector with the second semantic vector of the second document to determine whether there is a target behavior; wherein the second document is a document used for reference comparison, and the second semantic vector is used to characterize the semantics of the second document.
The device according to claim 7, further comprising:

The second extraction unit is configured to extract text features in multiple sentences in the second document through the word segmentation algorithm in the self-encoding model to form multiple second vectors, wherein the text features in each sentence constitute a second vector;

A second training unit, configured to train the multiple second vectors through the attention network in the self-encoding model to obtain the attention weight of each second vector in the multiple second vectors;

The second generating unit is configured to input the plurality of second vectors and the attention weight of each second vector of the plurality of second vectors into the long short-term memory network in the self-encoding model for training, so as to generate a second semantic vector;

The second decoding unit is configured to decode the second semantic vector through the long short-term memory network to obtain a plurality of second decoding vectors, wherein the plurality of second decoding vectors and the plurality of second vectors satisfy a preset similarity condition.
The device according to claim 7 or 8, wherein the comparison unit compares the first semantic vector with the second semantic vector of the second document to determine whether there is a target behavior, comprising:

Determining the cosine values of the first semantic vector and the second semantic vector;

If the cosine value is greater than or equal to the preset threshold, it is determined that there is a target behavior.
The device according to claim 7 or 8, further comprising: an adjustment unit configured to, before the first extraction unit extracts the text features in multiple sentences in the first document through the word segmentation algorithm in the self-encoding model to form multiple first vectors, adjust the parameters of at least one of the word segmentation algorithm, the attention network, and the long short-term memory network in the self-encoding model, so that the output of the self-encoding model converges to the input of the self-encoding model.
The device according to claim 7 or 8, wherein the first extraction unit is configured to extract text features in multiple sentences in the first document through a word segmentation algorithm in the self-encoding model to form multiple first vectors, specifically: the text features in multiple sentences in the first document are extracted through the convolutional neural network in the self-encoding model to form multiple first vectors.
A behavior recognition device based on natural semantic understanding, which is characterized by comprising a processor, a memory and a transceiver, the memory is used to store a computer program, and the processor calls the computer program to perform the following operations:

Extracting text features in multiple sentences in the first document through a word segmentation algorithm in the self-encoding model to form multiple first vectors, where the text features in each sentence form a first vector;

Training the plurality of first vectors through the attention network in the auto-encoding model to obtain the attention weight of each first vector in the plurality of first vectors;

Inputting the plurality of first vectors and the attention weight of each first vector of the plurality of first vectors into the long short-term memory network in the self-encoding model for training, so as to generate a first semantic vector;

Decoding the first semantic vector through the long short-term memory network to obtain a plurality of first decoding vectors;

If the plurality of first decoding vectors and the plurality of first vectors satisfy a preset similarity condition, comparing the first semantic vector with the second semantic vector of the second document to determine whether there is a target behavior; wherein the second document is a document used for reference comparison, and the second semantic vector is used to characterize the semantics of the second document.
The device according to claim 12, wherein before the processor compares the first semantic vector with the second semantic vector of the second document to determine whether there is a target behavior, it is further used for:

Extracting text features in multiple sentences in the second document by the word segmentation algorithm in the self-encoding model to form multiple second vectors, wherein the text features in each sentence form a second vector;

Training the plurality of second vectors through the attention network in the self-encoding model to obtain the attention weight of each second vector in the plurality of second vectors;

Inputting the plurality of second vectors and the attention weight of each second vector of the plurality of second vectors into the long short-term memory network in the self-encoding model for training, so as to generate a second semantic vector;

Decoding the second semantic vector through the long short-term memory network to obtain a plurality of second decoding vectors, wherein the plurality of second decoding vectors and the plurality of second vectors satisfy a preset similarity condition.
The device according to claim 12 or 13, wherein the processor compares the first semantic vector with the second semantic vector of the second document to determine whether there is a target behavior, specifically:

Determining the cosine values of the first semantic vector and the second semantic vector;

If the cosine value is greater than or equal to the preset threshold, it is determined that there is a target behavior.
The device according to claim 12 or 13, characterized in that, before the processor extracts the text features in the multiple sentences in the first document through the word segmentation algorithm in the self-encoding model to form the multiple first vectors, the processor is further used for:

Adjusting the parameters of at least one of the word segmentation algorithm, the attention network, and the long short-term memory network in the self-encoding model, so that the output of the self-encoding model converges to the input of the self-encoding model.
The device according to claim 12 or 13, wherein the processor uses a word segmentation algorithm in the self-encoding model to extract text features in multiple sentences in the first document to form multiple first vectors, specifically:

The text features in the multiple sentences in the first document are extracted through the convolutional neural network in the self-encoding model to form multiple first vectors.
A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, and when the computer program is executed on a processor, the following operations are realized:

Extracting text features in multiple sentences in the first document through a word segmentation algorithm in the self-encoding model to form multiple first vectors, where the text features in each sentence form a first vector;

Training the plurality of first vectors through the attention network in the auto-encoding model to obtain the attention weight of each first vector in the plurality of first vectors;

Inputting the plurality of first vectors and the attention weight of each first vector of the plurality of first vectors into the long short-term memory network in the self-encoding model for training, so as to generate a first semantic vector;

Decoding the first semantic vector through the long short-term memory network to obtain a plurality of first decoding vectors;

If the plurality of first decoding vectors and the plurality of first vectors satisfy a preset similarity condition, comparing the first semantic vector with the second semantic vector of the second document to determine whether there is a target behavior; wherein the second document is a document used for reference comparison, and the second semantic vector is used to characterize the semantics of the second document.
The computer-readable storage medium according to claim 17, wherein before comparing the first semantic vector with the second semantic vector of the second document to determine whether there is a target behavior, the method further comprises:

Extracting text features in multiple sentences in the second document by the word segmentation algorithm in the self-encoding model to form multiple second vectors, wherein the text features in each sentence form a second vector;

Training the plurality of second vectors through the attention network in the self-encoding model to obtain the attention weight of each second vector in the plurality of second vectors;

Inputting the plurality of second vectors and the attention weight of each second vector of the plurality of second vectors into the long short-term memory network in the self-encoding model for training, so as to generate a second semantic vector;

Decoding the second semantic vector through the long short-term memory network to obtain a plurality of second decoding vectors, wherein the plurality of second decoding vectors and the plurality of second vectors satisfy a preset similarity condition.
The computer-readable storage medium according to claim 17 or 18, wherein the comparing the first semantic vector with the second semantic vector of the second document to determine whether there is a target behavior comprises:

Determining the cosine values of the first semantic vector and the second semantic vector;

If the cosine value is greater than or equal to the preset threshold, it is determined that there is a target behavior.
The computer-readable storage medium according to claim 17 or 18, wherein before the text features in the multiple sentences in the first document are extracted by the word segmentation algorithm in the self-encoding model to form multiple first vectors, the operations further comprise:

Adjusting the parameters of at least one of the word segmentation algorithm, the attention network, and the long short-term memory network in the self-encoding model, so that the output of the self-encoding model converges to the input of the self-encoding model.