CN111898365A - Method and device for detecting text - Google Patents

Method and device for detecting text

Info

Publication number
CN111898365A
CN111898365A (application CN202010260122.XA)
Authority
CN
China
Prior art keywords
word
text
words
description text
sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010260122.XA
Other languages
Chinese (zh)
Inventor
李银锋
黄明星
赖晨东
周彬
刘婷婷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd, Beijing Wodong Tianjun Information Technology Co Ltd filed Critical Beijing Jingdong Century Trading Co Ltd
Priority to CN202010260122.XA
Publication of CN111898365A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30: Information retrieval of unstructured textual data
    • G06F 16/35: Clustering; Classification
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the application discloses a method and a device for detecting a text. One embodiment of the method comprises: acquiring an article description text, and segmenting the article description text to obtain a word set; generating a word sequence based on the word set according to the positions of the words in the article description text; inputting the word sequence into a pre-trained text convolutional neural network to obtain a text detection result for indicating whether the article description text contains target content; and outputting the text detection result. The embodiment improves the detection efficiency of the article description text.

Description

Method and device for detecting text
Technical Field
The embodiment of the application relates to the technical field of computers, in particular to a method and a device for detecting texts.
Background
With the development of e-commerce platforms, online shopping has become a main consumption channel for consumers. To attract consumers' attention, many merchants edit misleading commodity descriptions, such as "best refrigerator", "star same style", and so on. Such item description information often violates advertising regulations by containing fraudulent or false promotional claims. To improve the accuracy of commodity information and better standardize how merchants publish it, an e-commerce platform needs to audit commodity description information. In the related art, this auditing of commodity description information is usually performed manually.
Disclosure of Invention
The embodiment of the application provides a method and a device for detecting a text.
In a first aspect, an embodiment of the present application provides a method for detecting a text, including: acquiring an article description text, and segmenting the article description text to obtain a word set; generating a word sequence based on the word set according to the positions of the words in the article description text; inputting the word sequence into a pre-trained text convolutional neural network to obtain a text detection result for indicating whether the article description text contains target content; and outputting the text detection result.
In some embodiments, the text convolutional neural network comprises a first convolutional layer for extracting text feature vectors of the article description text and a second convolutional layer for integrating the text feature vectors.
In some embodiments, inputting the word sequence into a pre-trained text convolutional neural network to obtain a text detection result for indicating whether the article description text contains the target content includes: inputting the word sequence into an embedding layer of the text convolutional neural network to obtain a word vector matrix; inputting the word vector matrix into the first convolutional layer to obtain a text feature vector of the article description text; inputting the text feature vector into the second convolutional layer to obtain an integrated text feature vector; and inputting the integrated text feature vector into a fully connected layer of the text convolutional neural network to obtain the probability that the article description text contains the target content, and determining, based on the probability, a text detection result for indicating whether the article description text contains the target content.
In some embodiments, before inputting the sequence of words into an embedding layer of the text convolutional neural network, resulting in a word vector matrix, the method further comprises: and initializing the embedded layer by utilizing a preset corresponding relation table, wherein the corresponding relation table is used for representing the corresponding relation between the training words and the training word vectors, and the corresponding relation table comprises preset first characters and/or preset second characters.
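The embedding-layer initialization described above can be sketched as follows: a correspondence table maps each training word, plus the preset first and second characters, to its training word vector, and the table's rows become the layer's weights. A minimal NumPy sketch, not the patented implementation; the names `<pad>` and `<unk>` for the preset characters, the 4-dimensional vectors, and the sample word are all illustrative assumptions:

```python
import numpy as np

# Hypothetical correspondence table: training word -> training word vector,
# including the preset first and second characters (names are illustrative).
dim = 4   # the description uses 125-dimensional vectors; small here
table = {
    "<pad>": np.zeros(dim),                      # preset first character
    "<unk>": np.full(dim, 0.01),                 # preset second character
    "jeans": np.array([0.2, -0.1, 0.4, 0.3]),
}

index = {word: i for i, word in enumerate(table)}   # word -> embedding row
emb = np.stack([table[word] for word in index])     # embedding layer weights

def embed(sequence):
    """Look up one row per word; unseen words were already replaced by <unk>."""
    return np.stack([emb[index[word]] for word in sequence])

M = embed(["jeans", "<unk>", "<pad>"])   # word vector matrix, shape (3, dim)
```

The lookup is deliberately trivial: initializing an embedding layer from pre-trained vectors amounts to copying this matrix into the layer's weight tensor.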
In some embodiments, generating a word sequence based on the word set according to the position of each word in the word set in the item description text includes: comparing the number of words in the word set with a preset word number threshold; based on the comparison result, generating an initial word sequence according to the position of each word in the word set in the article description text; based on the initial word sequence, a word sequence is generated.
In some embodiments, based on the comparison, generating an initial word sequence by the position of each word in the word set in the item description text includes: if the number of the words in the word set is smaller than the word number threshold, the words in the word set are sorted according to the sequence of the positions of the words in the article description text from front to back, a first number of preset first characters are added after the sorting result, and an initial word sequence is obtained, wherein the first number is the difference value between the word number threshold and the number of the words in the word set.
In some embodiments, based on the comparison, generating an initial word sequence by the position of each word in the word set in the item description text includes: and if the number of the words in the word set is greater than the threshold value of the number of the words, selecting the words with the threshold value of the number of the words from the word set, and sequencing the selected words according to the sequence of the positions of the words in the article description text from front to back to obtain an initial word sequence.
In some embodiments, generating the sequence of words based on the initial sequence of words comprises: and determining whether the word exists in a preset training word set or not aiming at each word in the initial word sequence, and replacing the word by using a preset second character if the word does not exist in the training word set.
In a second aspect, an embodiment of the present application provides an apparatus for detecting text, including: an acquisition unit configured to acquire an article description text and perform word segmentation on the article description text to obtain a word set; a generating unit configured to generate a word sequence based on the word set according to the position of each word in the word set in the article description text; an input unit configured to input the word sequence into a pre-trained text convolutional neural network to obtain a text detection result for indicating whether the article description text contains the target content; and an output unit configured to output the text detection result.
In a third aspect, an embodiment of the present application provides an electronic device, including: one or more processors; a storage device, on which one or more programs are stored, which, when executed by the one or more processors, cause the one or more processors to implement the method as described in any implementation manner of the first aspect.
In a fourth aspect, the present application provides a computer-readable medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the method as described in any implementation manner of the first aspect.
According to the method and the device for detecting the text, provided by the embodiment of the application, a word set is obtained by segmenting the obtained article description text; then, based on the word set, generating a word sequence according to the position of each word in the word set in the article description text; and then, inputting the word sequence into a pre-trained text convolution neural network to obtain and output a text detection result for indicating whether the article description text contains the target content. The method adopts the text convolution neural network to detect the article description text, and compared with the traditional manual auditing method, the method improves the detection efficiency of the article description text.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is an exemplary system architecture diagram in which various embodiments of the present application may be applied;
FIG. 2 is a flow diagram of one embodiment of a method for detecting text according to the present application;
FIG. 3 is a schematic diagram of an application scenario of a method for detecting text according to the present application;
FIG. 4 is a flow diagram of yet another embodiment of a method for detecting text in accordance with the present application;
FIG. 5 is a schematic block diagram illustrating one embodiment of an apparatus for detecting text in accordance with the present application;
FIG. 6 is a schematic block diagram of a computer system suitable for use in implementing an electronic device according to embodiments of the present application.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
Fig. 1 shows an exemplary system architecture 100 to which embodiments of the method for detecting text of the present application may be applied.
As shown in fig. 1, the system architecture 100 may include user terminals 1011, 1012, 1013, networks 1021, 1022, a server 103, and an output terminal 104. Network 1021 is the medium used to provide communication links between user terminals 1011, 1012, 1013 and server 103. Network 1022 is the medium used to provide communication links between server 103 and output terminals 104. The networks 1021, 1022 may include various types of connections, such as wire, wireless communication links, or fiber optic cables, to name a few.
A user may interact with the server 103 via the network 1021 using user terminals 1011, 1012, 1013 to send or receive messages or the like (e.g., the user terminals 1011, 1012, 1013 may send item description text to the server 103), etc. Various communication client applications, such as shopping applications, search applications, instant messaging software, etc., may be installed on the user terminals 1011, 1012, 1013.
The text auditor may use the output terminal 104 to interact with the server 103 via the network 1022 to send or receive messages and the like (e.g., the output terminal 104 receives text detection results output by the server 103). The output terminal 104 may be installed with various communication client applications, such as an audit management application, instant messaging software, and the like.
The user terminals 1011, 1012, 1013 and the output terminal 104 may be hardware or software. When the user terminals 1011, 1012, 1013 and the output terminal 104 are hardware, they may be various electronic devices supporting information interaction, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like. When the user terminals 1011, 1012, 1013 and the output terminal 104 are software, they may be installed in the electronic devices listed above. It may be implemented as multiple pieces of software or software modules, or as a single piece of software or software module. And is not particularly limited herein.
The server 103 may be a server that provides various services. For example, it may be a background server that analyzes the item description text. The server 103 may first obtain an item description text from the user terminals 1011, 1012, 1013, and perform word segmentation on the item description text to obtain a word set; then, based on the word set, generate a word sequence according to the position of each word in the word set in the article description text; then, the word sequence can be input into a pre-trained text convolutional neural network to obtain a text detection result for indicating whether the article description text contains target content; finally, the text detection result may be output, for example, sent to the output terminal 104 for output.
The server 103 may be hardware or software. When the server 103 is hardware, it may be implemented as a distributed server cluster composed of a plurality of servers, or may be implemented as a single server. When the server 103 is software, it may be implemented as multiple pieces of software or software modules (e.g., to provide distributed services), or as a single piece of software or software module. And is not particularly limited herein.
It should be noted that the method for detecting text provided in the embodiment of the present application is generally performed by the server 103.
It should be noted that the item description text may be stored locally in the server 103, and the server 103 may obtain the item description text locally. The exemplary system architecture 100 may not have user terminals 1011, 1012, 1013 and the network 1021 at this time.
It should be further noted that the server 103 may be connected to a display device (e.g., a display screen) to display the output text detection result. The exemplary system architecture 100 may not have a network 1022 and an output terminal 104 present at this time.
It should be understood that the number of user terminals, networks, servers and output terminals in fig. 1 is merely illustrative. There may be any number of user terminals, networks, servers, and output terminals, as desired for implementation.
With continued reference to FIG. 2, a flow 200 of one embodiment of a method for detecting text in accordance with the present application is shown. The method for detecting the text comprises the following steps:
step 201, obtaining an article description text, and performing word segmentation on the article description text to obtain a word set.
In this embodiment, an execution subject (e.g., a server shown in fig. 1) of the method for detecting a text may acquire an item description text. The above item description text may include, but is not limited to, at least one of: item title text, descriptive text identified from the item picture, and descriptive text identified from the item video.
Then, the execution subject may perform word segmentation on the article description text to obtain a word set. The word segmentation may be performed with, for example, a dictionary-based word segmentation algorithm, a statistics-based machine learning algorithm, or the jieba word segmentation tool.
In this embodiment, the dictionary-based word segmentation algorithm may also be referred to as a character-string-matching word segmentation algorithm: it matches the character string to be segmented against terms in a pre-established dictionary according to a certain policy, and when a term is found the match succeeds and the word is identified. The statistics-based machine learning approach typically applies algorithms such as HMM (Hidden Markov Model), CRF (Conditional Random Field), SVM (Support Vector Machine), or deep learning. Taking CRF as an example, the basic idea is to train a character-level labeler for Chinese text that considers not only the frequency of words but also their context. The jieba word segmentation tool can load self-defined words, such as "star same style" and other platform-specific promotional terms; the purpose of adding self-defined words is to ensure that such specific words can be segmented correctly.
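As an illustration of the dictionary-based (string-matching) approach, the following is a minimal forward-maximum-matching segmenter. The policy (longest match first, left to right) and the `max_len` parameter are illustrative assumptions, not the patent's concrete algorithm:

```python
def fmm_segment(text, dictionary, max_len=4):
    """Forward maximum matching: at each position, try the longest candidate
    substring found in the dictionary first, falling back to one character."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                words.append(candidate)   # matched a dictionary term (or 1 char)
                i += size
                break
    return words
```

Adding a custom term to `dictionary` is the analogue of loading a self-defined word: it guarantees that term is segmented as one unit.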
Before the article description text is participled, the execution main body may clean the article description text, for example, may delete non-text symbols such as punctuation marks and special symbols in the article description text.
Step 202, based on the word set, generating a word sequence according to the position of each word in the word set in the article description text.
In this embodiment, the execution subject may generate a word sequence, based on the word set obtained in step 201, according to the position of each word in the word set in the item description text. Specifically, the execution subject may sort the words in the word set in the order of their positions in the article description text from front to back, so as to obtain a word sequence.
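The sorting step can be sketched in a couple of lines; this sketch assumes each word occurs exactly once in the text, which the patent does not guarantee:

```python
def to_sequence(word_set, text):
    # order the segmented words by their first occurrence in the description text
    return sorted(word_set, key=text.find)
```

For example, `to_sequence({"jeans", "black"}, "black jeans")` orders "black" before "jeans" because it appears earlier in the text.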
Step 203, inputting the word sequence into a pre-trained text convolutional neural network to obtain a text detection result for indicating whether the article description text contains the target content.
In this embodiment, the executing agent may input the word sequence generated in step 202 into a pre-trained text convolutional neural network, so as to obtain a text detection result indicating whether the article description text contains the target content. The text convolutional neural network applies the convolutional neural network to a text classification task, and extracts key information in a text by using a plurality of convolutional kernels with different sizes, so that local correlation can be better captured.
In particular, the above text convolutional neural network may generally include an embedding layer, a convolutional layer, a pooling layer, and a fully connected layer. The execution body may input the word sequence into the embedding layer of the text convolutional neural network to obtain a word vector matrix. The embedding layer may encode the word sequence through indexes so as to represent the positional relationship between words in the text. Then, the execution body may input the word vector matrix into the convolutional layer of the text convolutional neural network to obtain a text feature vector. For example, the word vector matrix may be fed into one-dimensional convolution kernels of sizes 2, 3, and 4, respectively, where each convolution kernel may have two output channels. Next, the text feature vector is input into the pooling layer of the text convolutional neural network to obtain a pooled feature vector. Since convolution kernels of different sizes produce features of different sizes, a pooling function may be applied to each feature to bring them to the same dimension; features may be extracted using maximum pooling and average pooling. Then, the pooled feature vector may be input into the fully connected layer of the text convolutional neural network to obtain the probability that the article description text contains the target content. Here, the target content may include content that does not comply with a preset rule, for example, content with fraudulent or false promotion that does not comply with advertising law, such as "best", "surprise", "first brand", and the like. The target content may be preset, for example, content present in a preset target content table. Finally, a text detection result indicating whether the article description text contains the target content may be determined based on the probability.
The text detection result may include a result indicating that the item description text contains the target content, which may be represented by "1" or "T", and a result indicating that it does not, which may be represented by "0" or "F". Specifically, the probability may be compared with a preset probability threshold: if the probability is greater than or equal to the threshold, it may be determined that the item description text includes the target content, and a text detection result indicating that is generated; if the probability is smaller than the threshold, it may be determined that the item description text does not include the target content, and the corresponding text detection result is generated.
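The forward pass and thresholding described in the two paragraphs above can be sketched end to end in NumPy. The dimensions, random weights, and 0.5 threshold are illustrative assumptions; in the real network the weights are learned and the threshold is a preset value:

```python
import numpy as np

rng = np.random.default_rng(0)

S, N = 10, 8            # sequence length and embedding dim (125 in the text)
widths = [2, 3, 4]      # one-dimensional convolution kernel sizes, as described
channels = 2            # output channels per kernel size

X = rng.normal(size=(S, N))    # word vector matrix from the embedding layer

pooled = []
for w in widths:
    K = rng.normal(size=(channels, w, N))    # `channels` kernels of size w x N
    # each kernel slides down the matrix: S - w + 1 positions, one sum each
    feats = np.array([[np.sum(K[c] * X[i:i + w]) for i in range(S - w + 1)]
                      for c in range(channels)])
    pooled.append(feats.max(axis=1))         # max pooling over positions
h = np.concatenate(pooled)                   # pooled feature vector, length 6

W = rng.normal(size=h.size)
p = 1.0 / (1.0 + np.exp(-(h @ W)))           # fully connected layer + sigmoid
result = 1 if p >= 0.5 else 0                # compare with probability threshold
```

Note how the max-pooling step turns variable-length convolution outputs (S - w + 1 differs per width) into a fixed-length vector, which is what allows kernels of different sizes to feed one fully connected layer.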
It should be noted that the size of the convolution kernel in the convolution layer of the above text convolution neural network is generally related to the length of the input word sequence.
In this embodiment, the execution subject may optimize the text convolutional neural network using a cross-entropy loss function and a stochastic gradient descent algorithm, so as to obtain the network parameters of the text convolutional neural network. In the process of updating the network parameters, the word vector matrix may be treated as a parameter of the text convolutional neural network so as to fine-tune the network.
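The optimization described here, cross-entropy loss minimized by gradient descent, can be illustrated on a single logistic output layer standing in for the full network. The data, learning rate, and iteration count are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
h = rng.normal(size=(4, 6))          # batch of pooled text feature vectors
y = np.array([1.0, 0.0, 1.0, 0.0])   # labels: 1 = contains target content
W = np.zeros(6)
b = 0.0
lr = 0.1

for _ in range(200):
    p = 1.0 / (1.0 + np.exp(-(h @ W + b)))                    # forward pass
    loss = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))  # cross entropy
    grad = p - y                      # d(loss)/d(logit) for this loss/sigmoid pair
    W -= lr * (h.T @ grad) / len(y)   # gradient descent update
    b -= lr * grad.mean()
```

Starting from zero weights the loss is ln 2 (about 0.693); each update moves it downward, which is the whole content of the optimization step. In stochastic gradient descent the batch `h` would be a random subsample per step.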
And step 204, outputting a text detection result.
In this embodiment, the execution subject may output the text detection result obtained in step 203. Thereafter, the article information of the article indicated by the corresponding article description text can be further processed according to the text detection result. For example, if the text detection result indicates that the item description text includes the target content, the corresponding item may be removed from the shelf.
In some optional implementations of the embodiment, the text convolutional neural network may include a first convolutional layer and a second convolutional layer, the first convolutional layer may be used to extract text feature vectors of the article description text, and the second convolutional layer may be used to integrate the text feature vectors.
In some optional implementations of this embodiment, the execution subject may generate a word sequence based on the word set according to a position of each word in the word set in the item description text by: the execution subject may compare the number of words in the word set with a preset threshold number of words. The term number threshold may be used to limit the number of the participles included in the article description text. Then, based on the comparison result, an initial word sequence can be generated according to the position of each word in the word set in the article description text. Specifically, if the number of words in the word set is equal to the word number threshold, the words in the word set may be sorted according to the order of the positions of the words in the article description text from front to back, so as to obtain an initial word sequence. A word sequence may then be generated based on the initial word sequence. Here, the initial word sequence may be determined as a word sequence.
In some optional implementations of this embodiment, the executing entity may generate the initial word sequence according to the position of each word in the word set in the item description text based on the comparison result in the following manner: if the number of words in the word set is smaller than the word number threshold, the execution subject may sort the words in the word set according to a sequence of positions of the words in the article description text from front to back. Then, a first number of preset first characters may be added after the sorting result to obtain an initial word sequence. The first number may be a difference between the threshold number of words and the number of words in the set of words. For example, if the threshold number of words is 10 and the number of words in the set of words is 8, the first number may be 2. Here, the first character may be a 'null' character such that the number of words in the initial word sequence is the word number threshold number.
In some optional implementations of this embodiment, the executing entity may generate the initial word sequence according to the position of each word in the word set in the item description text based on the comparison result in the following manner: if the number of words in the word set is greater than the threshold number of words, the execution main body may select a threshold number of words from the word set. As an example, the execution subject may select the word number threshold number of words from the word set in an order from front to back of positions of the words in the item description text. As another example, the execution subject may also select the word number threshold number of words from the word set in an order from the back to the front of the positions of the words in the item description text. Then, the selected words can be sequenced according to the order of the positions of the words in the article description text from front to back, so as to obtain an initial word sequence.
In some optional implementations of this embodiment, the execution subject may generate the word sequence based on the initial word sequence by: for each word in the initial word sequence, the execution subject may determine whether the word is present in a preset training word set. The training words in the training Word set may be used to train a Word2vec (Word to vector) model. Word2vec is a tool for Word vector computation. Word2vec can not only train on million orders of magnitude of dictionary and billions of data sets with high efficiency, but also can obtain training result Word vectors (Word embedding), and can well measure similarity between words. Here, the Word2vec Model is usually CBOW (Continuous Bag-of-Words Model), which is a Model for predicting the occurrence probability of a current Word according to a context Word. Inputting a plurality of training words into a Word2vec model to obtain a Word vector corresponding to each training Word as a Word vector table, and then using the Word vector table for initializing an Embedding (Embedding) layer of a Text Convolutional Neural network (TextCNN). If the word does not exist in the training word set, the execution main body can replace the word by using a preset second character. Here, the second character may be a 'pos' character.
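The optional implementations above (padding short word sets with the first character, truncating long ones to the word-number threshold, and replacing words absent from the training word set with the second character) combine into one sequence-building step. A sketch, with `<pad>` and `<unk>` as illustrative stand-ins for the preset first and second characters:

```python
PAD, UNK = "<pad>", "<unk>"   # stand-ins for the preset first/second characters

def build_word_sequence(words, training_words, max_len):
    """Pad short sequences with PAD, truncate long ones, then replace
    words absent from the training word set with UNK."""
    if len(words) < max_len:
        # first number = word-number threshold minus the number of words
        words = words + [PAD] * (max_len - len(words))
    else:
        words = words[:max_len]   # keep the first `max_len` words
    return [w if (w in training_words or w == PAD) else UNK for w in words]
```

With a threshold of 4, `build_word_sequence(["a", "b"], {"a"}, 4)` pads with two `<pad>` characters and maps the out-of-vocabulary "b" to `<unk>`, so every output sequence has exactly the threshold length and only known symbols.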
With continued reference to fig. 3, fig. 3 is a schematic diagram of an application scenario of the method for detecting text according to the present embodiment. In the application scenario of fig. 3, the server 301 may first obtain an item description text 303 from the user terminal 302. Here, the item description text 303 may be "star same style black jeans high waist slim small foot joker broken hole ninth pants". Then, the server 301 may perform word segmentation on the item description text 303 to obtain a word set 304. Here, the word set 304 includes "star same style", "black", "jeans", "high waist", "slim", "small foot", "joker", "broken hole", and "ninth pants". Server 301 may then generate a word sequence 305 based on the word set 304, according to the position of each word in the word set 304 in the item description text 303. Specifically, the server 301 may sort the words in the word set 304 in the order of their positions in the item description text 303 from front to back, so as to obtain the word sequence 305. Here, the word sequence 305 is: star same style - black - jeans - high waist - slim - small foot - joker - broken hole - ninth pants. Server 301 may then input the word sequence 305 into the pre-trained text convolutional neural network 306, resulting in a text detection result 307 indicating whether the item description text 303 contains the target content. Here, the target content may be content with fraudulent promotion that does not comply with advertising law. The obtained text detection result 307 is "1", representing that the item description text 303 contains the target content. Finally, the server 301 may transmit the text detection result 307 to the terminal device 308 to output the text detection result 307.
According to the method provided by the embodiment of the application, the article description text is detected by adopting the text convolution neural network, and compared with a traditional manual auditing mode, the mode improves the detection efficiency of the article description text.
With further reference to FIG. 4, there is shown a flow 400 of yet another embodiment of a method for detecting text according to the present application, comprising the steps of:
step 401, obtaining an article description text, and performing word segmentation on the article description text to obtain a word set.
Step 402, based on the word set, generating a word sequence according to the position of each word in the word set in the article description text.
In the present embodiment, the steps 401-402 can be performed in a similar manner to the steps 201-202, and are not described herein again.
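Steps 401-402 can be sketched in a few lines: given the word set produced by segmentation, order the words by their first position in the item description text to obtain the word sequence. This is a minimal illustration; the helper name and the toy English text are our own, not from the patent.

```python
# Order the word set by each word's first occurrence in the item
# description text, i.e. by front-to-back position (step 402).

def to_word_sequence(word_set, text):
    # str.index gives the first position of each word in the text
    return sorted(word_set, key=text.index)

text = "black jeans high-waist slim"
word_set = {"slim", "black", "high-waist", "jeans"}
assert to_word_sequence(word_set, text) == ["black", "jeans", "high-waist", "slim"]
```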
Step 403, inputting the word sequence into the embedding layer of the text convolutional neural network to obtain a word vector matrix.
In this embodiment, the execution body may input the word sequence into an embedding layer of the text convolutional neural network to obtain a word vector matrix. The embedding layer can encode the word sequences through indexes so as to represent the position relation between words in the text.
As an example, the word sequence may have a length of S, that is, may include S words, and a word vector matrix formed by word vectors of the words may be obtained by inputting the word sequence of S words into the embedding layer of the text convolutional neural network. The word vector for each word may be N-dimensional, e.g., may be 125-dimensional. At this time, the size of the word vector matrix is S × 125.
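The embedding lookup above can be sketched as follows: each word index selects one row of an embedding table, so a sequence of S words becomes an S × N word-vector matrix (N = 125 in the example). The table values here are random stand-ins; the patent initializes them from a Word2vec model, and the vocabulary below is illustrative.

```python
import numpy as np

# Toy embedding layer: index-encode the word sequence, then gather rows
# of the embedding table to form the S x N word vector matrix.
rng = np.random.default_rng(0)
vocab = {"black": 0, "jeans": 1, "high-waist": 2, "slim": 3}
N = 125
embedding_table = rng.normal(size=(len(vocab), N))

def embed(word_sequence):
    indices = [vocab[w] for w in word_sequence]   # index encoding
    return embedding_table[indices]               # shape (S, N)

matrix = embed(["black", "jeans", "high-waist", "slim"])
assert matrix.shape == (4, 125)
```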
Step 404, inputting the word vector matrix into the first convolution layer of the text convolution neural network to obtain the text feature vector of the article description text.
In this embodiment, the executing entity may input the word vector matrix obtained in step 403 into the first convolution layer of the text convolutional neural network to obtain text feature vectors of the item description text. Here, the first convolution layer may include K1 convolution kernels of size n1 × 125, K2 convolution kernels of size n2 × 125, and K3 convolution kernels of size n3 × 125. Note that n1, n2, and n3 are all smaller than the word sequence length S. Taking the K1 convolution kernels of size n1 × 125 as an example, the word vector matrix is convolved with each n1 × 125 kernel. Specifically, each convolution kernel slides down the word vector matrix in a sliding-window manner, performing one convolution operation per step and finally generating S-n1+1 elements. The generated S-n1+1 elements may be spliced to obtain the text feature vector C1.
Here, the result of convolving the convolution kernel with the word vector matrix at the current position may be determined by the following formula (1):

c_i = f(W_{k1} ⊗ S_{i:i+n1-1} + b_{k1})    (1)

where W_{k1} denotes a convolution kernel matrix of size n1 × 125, i is the starting row of the convolution kernel in the word vector matrix at the current position, i:i+n1-1 is the span of rows covered by the convolution kernel at the current position during sliding, S_{i:i+n1-1} is the part of the word vector matrix convolved with the kernel at the current position, ⊗ denotes the convolution operation between the n1 × 125 kernel matrix and the overlapped part of the word vector matrix, b_{k1} is a preset bias vector, f is the ReLU nonlinear activation function, and c_i is the output of the activation function at the current position.
Here, the word vector matrix may be convolved with the K2 convolution kernels of size n2 × 125 in the same manner as above to obtain S-n2+1 elements, which may be determined as the text feature vector C2. Likewise, convolving the word vector matrix with the K3 convolution kernels of size n3 × 125 yields S-n3+1 elements, which may be determined as the text feature vector C3.
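One filter of the first convolution layer, following formula (1), can be sketched as below: an n1 × N kernel slides down the S × N word-vector matrix one row at a time, and the S-n1+1 activation values are concatenated into the text feature vector C1. Function names and the random inputs are illustrative, not from the patent.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def conv_filter(word_matrix, kernel, bias=0.0):
    # Slide the kernel down the word vector matrix one row per step,
    # applying formula (1) at each position.
    S = word_matrix.shape[0]
    n1 = kernel.shape[0]
    outputs = []
    for i in range(S - n1 + 1):
        window = word_matrix[i:i + n1]        # S_{i:i+n1-1} in formula (1)
        outputs.append(relu(np.sum(window * kernel) + bias))
    return np.array(outputs)                  # length S - n1 + 1

rng = np.random.default_rng(1)
S, N, n1 = 10, 125, 3
C1 = conv_filter(rng.normal(size=(S, N)), rng.normal(size=(n1, N)))
assert C1.shape == (S - n1 + 1,)
```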
Step 405, inputting the text feature vector into a second convolution layer of the text convolution neural network to obtain an integrated text feature vector.
In this embodiment, the executing entity may input the text feature vectors obtained in step 404 into the second convolution layer of the text convolutional neural network to obtain an integrated text feature vector. The second convolution layer may include convolution kernels of size (S-n1+1) × 1, kernels of size (S-n2+1) × 1, and kernels of size (S-n3+1) × 1. Here, convolving a text feature vector with a kernel of the second convolution layer amounts to a fully connected weighted sum over the dimensions of that text feature vector.
Specifically, for each of the K1 text feature vectors, the executing entity may convolve the text feature vector C1 of size (S-n1+1) × 1 with a convolution kernel of size (S-n1+1) × 1 to obtain a text classification feature Z_n1. The K1 text classification features Z_n1 are spliced to obtain a text classification feature vector Z_k1n1.

Similarly, for each of the K2 text feature vectors, the executing entity may convolve the text feature vector C2 of size (S-n2+1) × 1 with a convolution kernel of size (S-n2+1) × 1 to obtain a text classification feature Z_n2. The K2 text classification features Z_n2 are spliced to obtain a text classification feature vector Z_k2n2.

For each of the K3 text feature vectors, the executing entity may convolve the text feature vector C3 of size (S-n3+1) × 1 with a convolution kernel of size (S-n3+1) × 1 to obtain a text classification feature Z_n3. The K3 text classification features Z_n3 are spliced to obtain a text classification feature vector Z_k3n3.
Thereafter, the executing entity may splice the text classification feature vector Z_k1n1, the text classification feature vector Z_k2n2, and the text classification feature vector Z_k3n3 to obtain the integrated text feature vector Z.
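The splicing above can be sketched as follows: convolving a length-(S-n1+1) feature vector with an (S-n1+1) × 1 kernel collapses it to one scalar (the fully connected weighted sum), and the K1 scalars are spliced into Z_k1n1; the K2 and K3 groups are handled the same way before everything is spliced into Z. Shapes and names below are illustrative.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def second_layer(feature_vectors, kernels, bias=0.0):
    # One scalar per (feature vector, kernel) pair -- a weighted sum over
    # all dimensions of the vector -- spliced into one classification vector.
    return np.array([relu(v @ k + bias) for v, k in zip(feature_vectors, kernels)])

rng = np.random.default_rng(2)
K1, length = 4, 8                             # K1 filters, vectors of length S-n1+1
C = [rng.normal(size=length) for _ in range(K1)]
W = [rng.normal(size=length) for _ in range(K1)]
Z_k1n1 = second_layer(C, W)
assert Z_k1n1.shape == (K1,)
```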
Here, the text classification feature Z_n1 may be determined by the following formula (2):

Z_n1 = f(W_{S-n1+1} ⊗ C_1 + b_{k2})    (2)

where W_{S-n1+1} is a convolution kernel matrix of size (S-n1+1) × 1, C_1 is a text feature vector of size (S-n1+1) × 1, ⊗ denotes the convolution operation between the kernel matrix and the text feature vector, b_{k2} is a preset bias vector, f is the ReLU nonlinear activation function, and Z_n1 is the text classification feature obtained from the convolution operation.
Here, the text classification features Z_n2 and Z_n3 may be determined in a manner similar to formula (2).
It should be noted that, when determining the text classification feature Z_n1, the preset bias vector b_{k2} may be omitted; that is, the text classification feature Z_n1 may also be determined by the following formula (3):

Z_n1 = f(W_{S-n1+1} ⊗ C_1)    (3)
It should also be noted that, when determining the text classification feature Z_n1, the result of the convolution operation need not be passed through the ReLU nonlinear activation function; that is, the text classification feature Z_n1 may also be determined by the following formula (4):

Z_n1 = W_{S-n1+1} ⊗ C_1 + b_{k2}    (4)
Step 406, inputting the integrated text feature vector into a full connection layer of the text convolutional neural network to obtain a probability that the article description text contains the target content, and determining a text detection result for indicating whether the article description text contains the target content based on the probability.
In this embodiment, the executing entity may input the integrated text feature vector Z obtained in step 405 into the fully connected layer of the text convolutional neural network to obtain the probability that the item description text contains the target content. Here, the target content may include content that violates a preset rule, for example exaggerated or false promotional content that does not comply with advertising law, such as "best", "astonishing", "top brand", and the like. The target content may be preset, for example as entries in a preset target content table.
Here, the probability that the target content is contained in the item description text may be determined by the following formula (5):
y=f(WZ+b) (5)
where W is the fully connected layer weight matrix, Z is the integrated text feature vector, b is the fully connected layer bias vector, the output function f may be a sigmoid function, and y is the probability that the item description text contains the target content.
Then, the executing entity may determine, based on the probability, a text detection result indicating whether the item description text contains the target content. The text detection result may indicate that the item description text contains the target content, represented for example by "1" or "T", or that it does not, represented for example by "0" or "F". Specifically, the probability may be compared with a preset probability threshold. If the probability is greater than or equal to the threshold, it may be determined that the item description text contains the target content, and a text detection result indicating as much is generated; if the probability is smaller than the threshold, it may be determined that the item description text does not contain the target content, and a corresponding text detection result is generated.
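Step 406 as a whole — formula (5) followed by thresholding — can be sketched as below. The weights, bias, and threshold are illustrative values, not trained parameters.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def detect(Z, W, b, threshold=0.5):
    # formula (5): y = f(WZ + b), then threshold into "1"/"0"
    y = sigmoid(W @ Z + b)                    # probability of target content
    return "1" if y >= threshold else "0"

Z = np.array([0.5, -1.2, 2.0])
W = np.array([1.0, 0.3, 0.8])
result = detect(Z, W, b=-0.5)                 # sigmoid(1.24) > 0.5 -> "1"
assert result == "1"
```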
Step 407, outputting a text detection result.
In this embodiment, the executing entity may output the text detection result obtained in step 406. Thereafter, the item information of the item indicated by the corresponding item description text may be further processed according to the text detection result. For example, if the text detection result indicates that the item description text contains the target content, the corresponding item may be taken off the shelf.
In some optional implementations of this embodiment, before the word sequence is input into the embedding layer of the text convolutional neural network to obtain the word vector matrix, the executing entity may initialize the embedding layer using a preset correspondence table. The correspondence table may be used to represent correspondences between training words and training word vectors. The training words in a training word set may be input into a Word2vec model to obtain a training word vector corresponding to each training word, and each training word together with its corresponding training word vector forms the correspondence table. Here, the word vector dimension of the Word2vec model may be 125 and the window may be set to 4, so that each training word vector obtained from the Word2vec model is 125-dimensional. The executing entity may load the correspondence table into the embedding layer, thereby initializing the embedding layer. The correspondence table may further include a preset first character and/or a preset second character. Here, the first character may be a 'null' character, and the second character may be a 'pos' character.
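The initialization above can be sketched as follows: build a correspondence table of words to vectors, add the preset 'null' (padding) and 'pos' (unseen-word) entries, and load the table as the embedding matrix. The random vectors below stand in for Word2vec output; names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 125
# correspondence table: training word -> training word vector
correspondence = {w: rng.normal(size=N) for w in ["black", "jeans", "slim"]}
correspondence["null"] = np.zeros(N)          # padding character maps to zeros
correspondence["pos"] = rng.normal(size=N)    # out-of-vocabulary character

word_to_index = {w: i for i, w in enumerate(correspondence)}
embedding_layer = np.stack(list(correspondence.values()))

assert embedding_layer.shape == (5, N)
assert np.allclose(embedding_layer[word_to_index["null"]], 0.0)
```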
As can be seen from fig. 4, compared with the embodiment corresponding to fig. 2, the flow 400 of the method for detecting text in this embodiment highlights the step of inputting the word sequence into an improved text convolutional neural network (including a first convolution layer and a second convolution layer, where the second convolution layer integrates the text feature vectors output by the first convolution layer) to obtain the text detection result. Therefore, in the scheme described in this embodiment, the text feature vectors are integrated by the second convolution layer of the improved text convolutional neural network, so that the text features output by the first convolution layer can be retained to the maximum extent, the loss of text features is reduced, and the accuracy of text detection is improved.
With further reference to fig. 5, as an implementation of the methods shown in the above figures, the present disclosure provides an embodiment of an apparatus for detecting a text, which corresponds to the method embodiment shown in fig. 2, and which is particularly applicable to various electronic devices.
As shown in fig. 5, the apparatus 500 for detecting text of the present embodiment includes: an acquisition unit 501, a generation unit 502, an input unit 503, and an output unit 504. The obtaining unit 501 is configured to obtain an article description text, and perform word segmentation on the article description text to obtain a word set; the generating unit 502 is configured to generate a word sequence according to the position of each word in the word set in the item description text based on the word set; the input unit 503 is configured to input the word sequence into a pre-trained text convolutional neural network, so as to obtain a text detection result for indicating whether the object description text contains the target content; the output unit 504 is configured to output the text detection result.
In the present embodiment, specific processing of the acquiring unit 501, the generating unit 502, the inputting unit 503 and the outputting unit 504 of the apparatus 500 for detecting a text may refer to step 201, step 202, step 203 and step 204 in the corresponding embodiment of fig. 2.
In some optional implementations of the embodiment, the text convolutional neural network may include a first convolutional layer and a second convolutional layer, the first convolutional layer may be used to extract text feature vectors of the article description text, and the second convolutional layer may be used to integrate the text feature vectors.
In some optional implementation manners of this embodiment, the input unit 503 may input the word sequence into a pre-trained text convolutional neural network, so as to obtain a text detection result indicating whether the article description text contains the target content, as follows: the input unit 503 may input the word sequence into the embedding layer of the text convolutional neural network to obtain a word vector matrix. The embedding layer can encode the word sequences through indexes so as to represent the position relation between words in the text.
Then, the input unit 503 may input the word vector matrix into the first convolution layer of the text convolutional neural network to obtain text feature vectors of the item description text. Here, the first convolution layer may include K1 convolution kernels of size n1 × 125, K2 convolution kernels of size n2 × 125, and K3 convolution kernels of size n3 × 125. Note that n1, n2, and n3 are all smaller than the word sequence length S. Taking the K1 convolution kernels of size n1 × 125 as an example, the word vector matrix is convolved with each n1 × 125 kernel. Specifically, each convolution kernel slides down the word vector matrix in a sliding-window manner, performing one convolution operation per step and finally generating S-n1+1 elements. The generated S-n1+1 elements may be spliced to obtain the text feature vector C1.
Here, the word vector matrix may be convolved with the K2 convolution kernels of size n2 × 125 in the same manner to obtain S-n2+1 elements, which may be determined as the text feature vector C2. Likewise, convolving the word vector matrix with the K3 convolution kernels of size n3 × 125 yields S-n3+1 elements, which may be determined as the text feature vector C3.
Then, the input unit 503 may input the text feature vectors into the second convolution layer of the text convolutional neural network to obtain an integrated text feature vector. The second convolution layer may include convolution kernels of size (S-n1+1) × 1, kernels of size (S-n2+1) × 1, and kernels of size (S-n3+1) × 1. Here, convolving a text feature vector with a kernel of the second convolution layer amounts to a fully connected weighted sum over the dimensions of that text feature vector.
Specifically, for each of the K1 text feature vectors, the input unit 503 may convolve the text feature vector C1 of size (S-n1+1) × 1 with a convolution kernel of size (S-n1+1) × 1 to obtain a text classification feature Z_n1. The K1 text classification features Z_n1 are spliced to obtain a text classification feature vector Z_k1n1.

Similarly, for each of the K2 text feature vectors, the input unit 503 may convolve the text feature vector C2 of size (S-n2+1) × 1 with a convolution kernel of size (S-n2+1) × 1 to obtain a text classification feature Z_n2. The K2 text classification features Z_n2 are spliced to obtain a text classification feature vector Z_k2n2.

For each of the K3 text feature vectors, the input unit 503 may convolve the text feature vector C3 of size (S-n3+1) × 1 with a convolution kernel of size (S-n3+1) × 1 to obtain a text classification feature Z_n3. The K3 text classification features Z_n3 are spliced to obtain a text classification feature vector Z_k3n3.
Thereafter, the input unit 503 may splice the text classification feature vector Z_k1n1, the text classification feature vector Z_k2n2, and the text classification feature vector Z_k3n3 to obtain the integrated text feature vector Z.
Finally, the input unit 503 may input the integrated text feature vector Z into the fully connected layer of the text convolutional neural network to obtain the probability that the item description text contains the target content. Here, the target content may include content that violates a preset rule, for example exaggerated or false promotional content that does not comply with advertising law, such as "best", "astonishing", "top brand", and the like. The target content may be preset, for example as entries in a preset target content table. Then, the input unit 503 may determine, based on the probability, a text detection result indicating whether the item description text contains the target content. The text detection result may indicate that the item description text contains the target content, represented for example by "1" or "T", or that it does not, represented for example by "0" or "F". Specifically, the probability may be compared with a preset probability threshold. If the probability is greater than or equal to the threshold, it may be determined that the item description text contains the target content, and a text detection result indicating as much is generated; if the probability is smaller than the threshold, it may be determined that the item description text does not contain the target content, and a corresponding text detection result is generated.
In some optional implementations of this embodiment, the apparatus 500 for detecting text may further include an initialization unit (not shown in the figure). The initialization unit may initialize the embedding layer using a preset correspondence table. The correspondence table may be used to represent correspondences between training words and training word vectors. The training words in a training word set may be input into a Word2vec model to obtain a training word vector corresponding to each training word, and each training word together with its corresponding training word vector forms the correspondence table. Here, the word vector dimension of the Word2vec model may be 125 and the window may be set to 4, so that each training word vector obtained from the Word2vec model is 125-dimensional. The initialization unit may load the correspondence table into the embedding layer, thereby initializing the embedding layer. The correspondence table may further include a preset first character and/or a preset second character. Here, the first character may be a 'null' character, and the second character may be a 'pos' character.
In some optional implementations of this embodiment, the generating unit 502 may generate the word sequence according to the position of each word of the word set in the item description text as follows: the generating unit 502 may compare the number of words in the word set with a preset word number threshold. The word number threshold may be used to limit the number of word segments of the item description text. Then, based on the comparison result, an initial word sequence may be generated according to the position of each word of the word set in the item description text. Specifically, if the number of words in the word set equals the word number threshold, the words in the word set may be sorted in order of their positions from front to back in the item description text to obtain the initial word sequence. A word sequence may then be generated based on the initial word sequence; here, the initial word sequence may be determined as the word sequence.
In some optional implementations of this embodiment, the generating unit 502 may generate the initial word sequence according to the position of each word of the word set in the item description text, based on the comparison result, as follows: if the number of words in the word set is smaller than the word number threshold, the generating unit 502 may sort the words in the word set in order of their positions from front to back in the item description text. Then, a first number of preset first characters may be appended after the sorting result to obtain the initial word sequence. The first number may be the difference between the word number threshold and the number of words in the word set. Here, the first character may be a 'null' character, so that the number of words in the initial word sequence equals the word number threshold.
In some optional implementations of this embodiment, the generating unit 502 may generate the initial word sequence according to the position of each word of the word set in the item description text, based on the comparison result, as follows: if the number of words in the word set is greater than the word number threshold, the generating unit 502 may select the threshold number of words from the word set. As an example, the generating unit 502 may select the words in order of their positions from front to back in the item description text; as another example, it may select them in order from back to front. Then, the selected words may be sorted in order of their positions from front to back in the item description text to obtain the initial word sequence.
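The padding and truncation behavior described above can be sketched in one small function: pad with the preset 'null' character when the word list is shorter than the word number threshold, and keep the first threshold-many words (front-to-back selection) when it is longer. The function name is illustrative.

```python
# Length normalization for the initial word sequence: pad with 'null'
# or truncate front-to-back so the result has exactly `threshold` words.

def normalize_length(words, threshold, pad="null"):
    if len(words) < threshold:
        return words + [pad] * (threshold - len(words))
    return words[:threshold]                  # front-to-back selection

assert normalize_length(["a", "b"], 4) == ["a", "b", "null", "null"]
assert normalize_length(["a", "b", "c", "d", "e"], 4) == ["a", "b", "c", "d"]
```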
In some optional implementations of this embodiment, the generating unit 502 may generate the word sequence based on the initial word sequence as follows: for each word in the initial word sequence, the generating unit 502 may determine whether the word exists in a preset training word set. The training words in the training word set may be used to train the Word2vec model. Word2vec is a tool for computing word vectors; it can train efficiently on dictionaries of millions of entries and datasets of hundreds of millions of words, and the resulting word vectors measure similarity between words well. Here, the Word2vec model is usually CBOW, a model that predicts the occurrence probability of the current word from its context words. A number of training words are input into the Word2vec model to obtain a word vector for each training word as a word vector table, which is then used to initialize the embedding layer of the text convolutional neural network. If a word does not exist in the training word set, the generating unit 502 may replace the word with a preset second character. Here, the second character may be a 'pos' character.
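The out-of-vocabulary handling above can be sketched as follows: any word absent from the training word set is replaced with the preset 'pos' character so it still maps to a row of the initialized embedding layer. The function name and example words are illustrative.

```python
# Replace words not found in the training word set with the 'pos' character.

def replace_unknown(word_sequence, training_words, unknown="pos"):
    return [w if w in training_words else unknown for w in word_sequence]

training = {"black", "jeans", "slim"}
seq = ["black", "jeans", "glittery", "slim"]
assert replace_unknown(seq, training) == ["black", "jeans", "pos", "slim"]
```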
Referring now to FIG. 6, a schematic diagram of an electronic device (e.g., the server of FIG. 1) 600 suitable for use in implementing embodiments of the present disclosure is shown. The server shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 6, the electronic device 600 may include a processing means (e.g., central processing unit, graphics processor, etc.) 601 that may perform various appropriate actions and processes in accordance with a program stored in a read-only memory (ROM) 602 or a program loaded from a storage means 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data necessary for the operation of the electronic device 600. The processing means 601, the ROM 602, and the RAM 603 are connected to one another via a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
Generally, the following devices may be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 607 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 608 including, for example, tape, hard disk, etc.; and a communication device 609. The communication means 609 may allow the electronic device 600 to communicate with other devices wirelessly or by wire to exchange data. While fig. 6 illustrates an electronic device 600 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided. Each block shown in fig. 6 may represent one device or may represent multiple devices as desired.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 609, or may be installed from the storage means 608, or may be installed from the ROM 602. The computer program, when executed by the processing device 601, performs the above-described functions defined in the methods of embodiments of the present disclosure. It should be noted that the computer readable medium described in the embodiments of the present disclosure may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In embodiments of the disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. 
In embodiments of the present disclosure, however, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquiring an article description text, and segmenting the article description text to obtain a word set; generating a word sequence according to the positions of all words in the word set in the article description text on the basis of the word set; inputting the word sequence into a pre-trained text convolution neural network to obtain a text detection result for indicating whether the object description text contains target content; and outputting a text detection result.
Computer program code for carrying out operations of embodiments of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" programming language or similar languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or hardware. The described units may also be provided in a processor, which may, for example, be described as: a processor comprising an acquisition unit, a generation unit, an input unit, and an output unit. The names of these units do not, in some cases, limit the units themselves; for example, the output unit may also be described as a "unit that outputs a text detection result".
The foregoing description presents only the preferred embodiments of the present disclosure and illustrates the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention in the embodiments of the present disclosure is not limited to technical solutions formed by the specific combination of the above-mentioned technical features, and also encompasses other technical solutions formed by any combination of the above-mentioned technical features or their equivalents without departing from the inventive concept, for example, a technical solution formed by replacing the above features with (but not limited to) technical features with similar functions disclosed in the embodiments of the present disclosure.

Claims (11)

1. A method for detecting text, comprising:
acquiring an item description text, and performing word segmentation on the item description text to obtain a word set;
generating, based on the word set, a word sequence according to the position of each word in the word set in the item description text;
inputting the word sequence into a pre-trained text convolutional neural network to obtain a text detection result indicating whether the item description text contains target content; and
outputting the text detection result.
2. The method of claim 1, wherein the text convolutional neural network comprises a first convolutional layer for extracting text feature vectors of the item description text and a second convolutional layer for integrating the text feature vectors.
3. The method of claim 2, wherein the inputting the word sequence into a pre-trained text convolutional neural network to obtain a text detection result indicating whether the item description text contains target content comprises:
inputting the word sequence into an embedding layer of the text convolutional neural network to obtain a word vector matrix;
inputting the word vector matrix into the first convolutional layer to obtain a text feature vector of the item description text;
inputting the text feature vector into the second convolutional layer to obtain an integrated text feature vector; and
inputting the integrated text feature vector into a fully connected layer of the text convolutional neural network to obtain a probability that the item description text contains the target content, and determining, based on the probability, a text detection result indicating whether the item description text contains the target content.
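The forward pass of claim 3 can be sketched numerically as below. All weights, dimensions, and the ReLU/max-pooling/sigmoid choices are illustrative assumptions; the claims specify only the layer order (embedding, first convolutional layer, second convolutional layer, fully connected layer, probability).

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d(x, w):
    # x: (seq_len, in_dim), w: (k, in_dim, out_dim) -> (seq_len - k + 1, out_dim)
    k = w.shape[0]
    return np.stack([np.tensordot(x[i:i + k], w, axes=([0, 1], [0, 1]))
                     for i in range(x.shape[0] - k + 1)])

def forward(word_ids, emb, w1, w2, w_fc):
    x = emb[word_ids]                    # embedding layer -> word vector matrix
    h1 = np.maximum(conv1d(x, w1), 0)    # first conv layer: text feature vectors
    h2 = np.maximum(conv1d(h1, w2), 0)   # second conv layer: integrated features
    pooled = h2.max(axis=0)              # pool over sequence positions
    logit = pooled @ w_fc                # fully connected layer
    return 1.0 / (1.0 + np.exp(-logit))  # probability of containing target content

emb = rng.normal(size=(100, 16))         # 100 hypothetical vocabulary entries
w1 = rng.normal(size=(3, 16, 32)) * 0.1  # kernel width 3 (assumed)
w2 = rng.normal(size=(2, 32, 8)) * 0.1   # kernel width 2 (assumed)
w_fc = rng.normal(size=(8,)) * 0.1

p = forward(np.array([5, 17, 42, 3, 99, 7]), emb, w1, w2, w_fc)
detected = p >= 0.5  # text detection result derived from the probability
```

The 0.5 decision threshold is likewise an assumption; the claims only say the result is determined "based on the probability".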
4. The method of claim 3, wherein, before the inputting the word sequence into an embedding layer of the text convolutional neural network to obtain a word vector matrix, the method further comprises:
initializing the embedding layer using a preset correspondence table, wherein the correspondence table represents correspondences between training words and training word vectors, and the correspondence table includes a preset first character and/or a preset second character.
5. The method of claim 1, wherein the generating, based on the word set, a word sequence according to the position of each word in the word set in the item description text comprises:
comparing the number of words in the word set with a preset word-count threshold;
generating, based on a result of the comparing, an initial word sequence according to the position of each word in the word set in the item description text; and
generating a word sequence based on the initial word sequence.
6. The method of claim 5, wherein the generating, based on a result of the comparing, an initial word sequence according to the position of each word in the word set in the item description text comprises:
if the number of words in the word set is smaller than the word-count threshold, sorting the words in the word set in order of their positions in the item description text from front to back, and appending a first number of preset first characters to the sorting result to obtain an initial word sequence, wherein the first number is the difference between the word-count threshold and the number of words in the word set.
7. The method of claim 5, wherein the generating, based on a result of the comparing, an initial word sequence according to the position of each word in the word set in the item description text comprises:
if the number of words in the word set is larger than the word-count threshold, selecting a number of words equal to the word-count threshold from the word set, and sorting the selected words in order of their positions in the item description text from front to back to obtain an initial word sequence.
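The padding and truncation of claims 6 and 7 together amount to forcing the sequence to a fixed length, as in the sketch below. The "&lt;pad&gt;" token stands in for the preset first character, and taking the first `threshold` words is one simple reading of claim 7's selection step, which does not say which words are selected.

```python
def to_initial_sequence(words, threshold, pad="<pad>"):
    # `words` are assumed already ordered front-to-back by position in the text.
    if len(words) < threshold:
        # Claim 6: append preset first characters up to the word-count threshold.
        return words + [pad] * (threshold - len(words))
    # Claim 7: keep only `threshold` words (here, the first ones in order).
    return words[:threshold]
```

For example, `to_initial_sequence(["a", "b"], 4)` pads to four entries, while `to_initial_sequence(["a", "b", "c"], 2)` truncates to two.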
8. The method of any of claims 5-7, wherein the generating a word sequence based on the initial word sequence comprises:
determining, for each word in the initial word sequence, whether the word exists in a preset training word set, and replacing the word with a preset second character if the word does not exist in the training word set.
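This out-of-vocabulary replacement step can be sketched in one line; the "&lt;unk&gt;" token standing in for the preset second character is a hypothetical choice.

```python
def replace_unknown(sequence, training_words, unk="<unk>"):
    # Claim 8: words absent from the training word set are replaced by the
    # preset second character ("<unk>" here is an illustrative stand-in).
    return [w if w in training_words else unk for w in sequence]

seq = replace_unknown(["phone", "blorp", "<pad>"], {"phone", "<pad>"})
```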
9. An apparatus for detecting text, comprising:
an acquisition unit configured to acquire an item description text, and perform word segmentation on the item description text to obtain a word set;
a generation unit configured to generate, based on the word set, a word sequence according to the position of each word in the word set in the item description text;
an input unit configured to input the word sequence into a pre-trained text convolutional neural network to obtain a text detection result indicating whether the item description text contains target content; and
an output unit configured to output the text detection result.
10. An electronic device, comprising:
one or more processors;
a storage device having one or more programs stored thereon,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-8.
11. A computer-readable medium, on which a computer program is stored which, when executed by a processor, implements the method according to any one of claims 1-8.
CN202010260122.XA 2020-04-03 2020-04-03 Method and device for detecting text Pending CN111898365A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010260122.XA CN111898365A (en) 2020-04-03 2020-04-03 Method and device for detecting text


Publications (1)

Publication Number Publication Date
CN111898365A true CN111898365A (en) 2020-11-06

Family

ID=73206207

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010260122.XA Pending CN111898365A (en) 2020-04-03 2020-04-03 Method and device for detecting text

Country Status (1)

Country Link
CN (1) CN111898365A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170011264A1 (en) * 2015-07-07 2017-01-12 Disney Enterprises, Inc. Systems and methods for automatic key frame extraction and storyboard interface generation for video
CN110134961A (en) * 2019-05-17 2019-08-16 北京邮电大学 Processing method, device and the storage medium of text
CN110825877A (en) * 2019-11-12 2020-02-21 中国石油大学(华东) Semantic similarity analysis method based on text clustering
CN110879832A (en) * 2019-10-23 2020-03-13 支付宝(杭州)信息技术有限公司 Target text detection method, model training method, device and equipment


Similar Documents

Publication Publication Date Title
CN109271521B (en) Text classification method and device
CN105426356B (en) A kind of target information recognition methods and device
CN110377740B (en) Emotion polarity analysis method and device, electronic equipment and storage medium
CN106776897B (en) User portrait label determination method and device
US8793201B1 (en) System and method for seeding rule-based machine learning models
CN113515942A (en) Text processing method and device, computer equipment and storage medium
CN112800919A (en) Method, device and equipment for detecting target type video and storage medium
CN107291774B (en) Error sample identification method and device
CN111950279A (en) Entity relationship processing method, device, equipment and computer readable storage medium
CN115293332A (en) Method, device and equipment for training graph neural network and storage medium
CN111767420A (en) Method and device for generating clothing matching data
CN110347786B (en) Semantic model tuning method and system
CN113779380B (en) Cross-domain recommendation and content recommendation methods, devices and equipment
CN113656699B (en) User feature vector determining method, related equipment and medium
Valizadegan et al. Learning to trade off between exploration and exploitation in multiclass bandit prediction
CN116911953A (en) Article recommendation method, apparatus, electronic device and computer readable storage medium
CN111898365A (en) Method and device for detecting text
US11556935B1 (en) Financial risk management based on transactions portrait
CN113780318B (en) Method, device, server and medium for generating prompt information
WO2021204039A1 (en) Method and apparatus for pushing information
CN113807436A (en) User mining method and device, computer equipment and readable storage medium
CN110704619B (en) Text classification method and device and electronic equipment
CN114359811A (en) Data authentication method and device, electronic equipment and storage medium
CN113779276A (en) Method and device for detecting comments
CN113792952A (en) Method and apparatus for generating a model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
Application publication date: 20201106