CN112232524A - Multi-label information identification method and device, electronic equipment and readable storage medium - Google Patents

Multi-label information identification method and device, electronic equipment and readable storage medium

Info

Publication number
CN112232524A
CN112232524A
Authority
CN
China
Prior art keywords
label
tag
labels
training
prediction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011464955.4A
Other languages
Chinese (zh)
Other versions
CN112232524B (en)
Inventor
于伟
王林芳
梅涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd, Beijing Wodong Tianjun Information Technology Co Ltd filed Critical Beijing Jingdong Century Trading Co Ltd
Priority to CN202011464955.4A priority Critical patent/CN112232524B/en
Publication of CN112232524A publication Critical patent/CN112232524A/en
Application granted granted Critical
Publication of CN112232524B publication Critical patent/CN112232524B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 - Machine learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Image Analysis (AREA)

Abstract

The disclosure provides a multi-label information identification method and device, an electronic device, and a readable storage medium, relating to the field of computer technologies. The multi-label information identification method includes the following steps: training a multi-label recognition model through training labels, wherein a recurrent network is introduced at the input end of the prediction branches of the multi-label recognition model and is used to learn the relationships among a plurality of training labels; constructing a label restriction relation set according to the results of the prediction branches, wherein the label restriction relation set contains label combinations that satisfy preset label rules; acquiring multi-label information to be identified; and determining the identification result of the multi-label information according to the trained multi-label recognition model and the label restriction relation set. The technical scheme of the disclosure improves the accuracy, reliability and self-consistency of identifying a plurality of labels.

Description

Multi-label information identification method and device, electronic equipment and readable storage medium
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a method and an apparatus for identifying multi-label information, an electronic device, and a readable storage medium.
Background
Multi-label identification is an important direction in the field of machine learning and has wide application scenarios in real life.
In the related art, most multi-label identification methods use a deep network model as the basic feature extractor. Taking the multi-label identification of images as an example, features are extracted by a convolutional neural network for identification, which leads to at least the following technical problems:
(1) The identification results of the individual labels suffer from poor self-consistency.
(2) In the post-processing of multi-label identification, the relationships among labels are introduced in the form of a matrix or a single-layer network, which has weak generalization ability, places higher demands on the acquisition of training data, and is unfriendly to long-tail labels.
It is to be noted that the information disclosed in the above background section is only for enhancement of understanding of the background of the present disclosure, and thus may include information that does not constitute prior art known to those of ordinary skill in the art.
Disclosure of Invention
The present disclosure provides a method and an apparatus for identifying multi-label information, an electronic device, and a readable storage medium, which overcome, at least to some extent, the problem in the related art that multi-label identification results have poor self-consistency.
Additional features and advantages of the disclosure will be set forth in the detailed description which follows, or in part will be obvious from the description, or may be learned by practice of the disclosure.
According to an aspect of the present disclosure, there is provided a method for identifying multi-label information, including: training a multi-label recognition model through training labels, wherein a recurrent network is introduced at the input end of the prediction branches of the multi-label recognition model, and the recurrent network is used to learn the relationships among a plurality of training labels; constructing a label restriction relation set according to the results of the prediction branches, wherein the label restriction relation set includes label combinations that satisfy preset label rules; acquiring multi-label information to be identified; and determining the identification result of the multi-label information according to the trained multi-label recognition model and the label restriction relation set.
In one embodiment of the present disclosure, training the multi-label recognition model through training labels includes: inputting a first training label among the training labels to the recurrent network to obtain a first output label, and inputting the first output label to a first fully-connected layer of the multi-label recognition model to obtain a first prediction result; and inputting the n-th output label and the (n+1)-th training label to the recurrent network to obtain an (n+1)-th output label, and inputting the (n+1)-th output label to an (n+1)-th fully-connected layer of the multi-label recognition model to obtain an (n+1)-th prediction result, where n is an integer greater than or equal to 1.
In one embodiment of the present disclosure, training the multi-label recognition model through training labels further includes: introducing a loss function at the output end of each fully-connected layer of the multi-label recognition model; and optimizing the multi-label recognition model according to the output of the loss function.
In one embodiment of the present disclosure, the loss function includes a cross-entropy function and/or a Focal Loss function.
In one embodiment of the present disclosure, determining the identification result of the multi-label information according to the trained multi-label recognition model and the label restriction relation set includes: inputting the multi-label information to be identified into the trained multi-label recognition model to obtain the prediction confidence of each label in the multi-label information; and determining the identification result of the multi-label information according to the prediction confidences and the label restriction relation set.
In one embodiment of the present disclosure, determining the identification result of the multi-label information according to the prediction confidences and the label restriction relation set includes: determining the prediction confidence of the z-th label of the m-th layer in the multi-label information according to the label restriction relation set; and solving for the optimal solution over the prediction confidences of the label combinations to obtain the identification result.
In one embodiment of the present disclosure, solving for the optimal solution over the prediction confidences of the label combinations to obtain the identification result includes: determining the magnitude of the prediction confidence of each label; and solving for the optimal solution over the prediction confidences of the label combinations according to the magnitudes to obtain the identification result.
In one embodiment of the present disclosure, solving for the optimal solution over the prediction confidences of the label combinations according to the magnitudes to obtain the identification result includes: determining a first magnitude difference between the prediction confidences of labels of different granularities; when the first magnitude difference is determined to be greater than a first preset magnitude, computing the product of the prediction confidences of each label combination; and determining the label combination with the largest product as the identification result.
In one embodiment of the present disclosure, solving for the optimal solution over the prediction confidences of the label combinations according to the magnitudes to obtain the identification result further includes: determining a second magnitude difference between the prediction confidences of labels of the same granularity; when the second magnitude difference is determined to be greater than a second preset magnitude, summing the prediction confidences of each label combination; and determining the label combination with the largest sum as the identification result.
In one embodiment of the present disclosure, the recurrent network includes one of a recurrent neural network, a bidirectional recurrent neural network, a long short-term memory network, and a gated recurrent unit network.
In one embodiment of the present disclosure, before training the multi-label recognition model through the training labels, the method further includes: acquiring multi-layer label features, wherein the granularity of the label features of the k-th layer is different from that of the (k+1)-th layer, and k is an integer greater than or equal to 1; and preprocessing the multi-layer label features to obtain the training labels.
In one embodiment of the present disclosure, the preprocessing includes at least one of random cropping, rotation, flipping, color adjustment, and adding noise.
According to another aspect of the present disclosure, there is provided an apparatus for identifying multi-label information, including: a training module, configured to train the multi-label recognition model through training labels, wherein a recurrent network is introduced at the input end of the prediction branches of the multi-label recognition model, and the recurrent network is used to learn the relationships among the training labels; a construction module, configured to construct a label restriction relation set according to the results of the prediction branches, wherein the label restriction relation set includes label combinations that satisfy preset label rules; an acquisition module, configured to acquire the multi-label information to be identified; and a determination module, configured to determine the identification result of the multi-label information according to the trained multi-label recognition model and the label restriction relation set.
According to still another aspect of the present disclosure, there is provided an electronic device, including: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to perform any of the above methods for identifying multi-label information via execution of the executable instructions.
According to yet another aspect of the present disclosure, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements any of the above methods for identifying multi-label information.
According to the multi-label information identification scheme provided by the embodiments of the present disclosure, a recurrent network is introduced at the input end of the prediction branches of the multi-label recognition model, and a label restriction relation set is constructed according to the results of the prediction branches; by restricting the identified labels with the label restriction relation set, the self-consistency, reliability and accuracy of the label identification results are greatly improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure. It is to be understood that the drawings in the following description are merely exemplary of the disclosure, and that other drawings may be derived from those drawings by one of ordinary skill in the art without the exercise of inventive faculty.
Fig. 1 shows a flow chart of a method for identifying multi-tag information in an embodiment of the present disclosure;
FIG. 2 is a flow chart illustrating another method of identifying multi-tag information in an embodiment of the present disclosure;
FIG. 3 is a flow chart illustrating a method for identifying multi-tag information in an embodiment of the present disclosure;
FIG. 4 is a flow chart illustrating a method for identifying multi-tag information in an embodiment of the present disclosure;
FIG. 5 is a flow chart illustrating a method for identifying multi-tag information in an embodiment of the present disclosure;
FIG. 6 is a flow chart illustrating a method for identifying multi-tag information in an embodiment of the present disclosure;
FIG. 7 is a flow chart illustrating a method for identifying multi-tag information in an embodiment of the present disclosure;
FIG. 8 is a flow chart illustrating a method for identifying multi-tag information in an embodiment of the present disclosure;
FIG. 9 is a flow chart illustrating a method for identifying multi-tag information in an embodiment of the present disclosure;
FIG. 10 is a schematic diagram of an apparatus for identifying multi-label information in an embodiment of the disclosure;
fig. 11 is a schematic diagram of an identification platform for multi-label information in an embodiment of the present disclosure;
fig. 12 shows a schematic diagram of an electronic device in an embodiment of the disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Furthermore, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus their repetitive description will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
According to the scheme provided by the present disclosure, a recurrent network is introduced at the input end of the prediction branches of the multi-label recognition model, and the label restriction relation set is constructed according to the results of the prediction branches; by restricting the identified labels with the label restriction relation set, the self-consistency, reliability and accuracy of the label identification results are greatly improved.
The identification scheme of the multi-label information can be realized through interaction of a plurality of terminals and a server cluster.
The terminal may be a mobile terminal such as a mobile phone, a game console, a tablet computer, an e-book reader, smart glasses, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a smart home device, an AR (Augmented Reality) device, or a VR (Virtual Reality) device, or a personal computer (PC) such as a laptop or a desktop computer.
An application program providing identification of multi-label information may be installed in the terminal.
The terminal is connected with the server cluster through a communication network. Optionally, the communication network is a wired network or a wireless network.
The server cluster is a single server, a cluster of multiple servers, a virtualization platform, or a cloud computing service center. The server cluster is used to provide background services for the application programs that provide identification of multi-label information. Optionally, the server cluster undertakes the primary computing work and the terminal the secondary computing work; or the server cluster undertakes the secondary computing work and the terminal the primary computing work; or the terminal and the server cluster cooperate using a distributed computing architecture.
Optionally, the clients of the applications installed in different terminals are the same, or the clients installed on two terminals are clients of the same type of application on different operating-system platforms. The specific form of the application client may also differ across terminal platforms; for example, the client may be a mobile-phone client, a PC client, or a web client.
Those skilled in the art will appreciate that the number of terminals described above may be greater or fewer. For example, the number of the terminals may be only one, or several tens or hundreds of the terminals, or more. The number of terminals and the type of the device are not limited in the embodiments of the present disclosure.
Optionally, the system may further include a management device, and the management device is connected to the server cluster through a communication network. Optionally, the communication network is a wired network or a wireless network.
Optionally, the wireless network or wired network described above uses standard communication techniques and/or protocols. The network is typically the Internet, but may be any network including, but not limited to, a Local Area Network (LAN), a Metropolitan Area Network (MAN), a Wide Area Network (WAN), a mobile, wired or wireless network, a private network, or any combination of virtual private networks. In some embodiments, data exchanged over the network is represented using techniques and/or formats including HyperText Markup Language (HTML), Extensible Markup Language (XML), and the like. All or some of the links may also be encrypted using conventional encryption techniques such as Secure Sockets Layer (SSL), Transport Layer Security (TLS), Virtual Private Network (VPN), and Internet Protocol Security (IPsec). In other embodiments, custom and/or dedicated data communication techniques may also be used in place of, or in addition to, the data communication techniques described above.
Hereinafter, each step of the identification method of multi-tag information in the present exemplary embodiment will be described in more detail with reference to the drawings and examples.
Fig. 1 shows a flowchart of an identification method for multi-tag information in an embodiment of the present disclosure. The method provided by the embodiment of the disclosure can be executed by any electronic equipment with computing processing capacity.
As shown in fig. 1, the electronic device performs a method for identifying multi-tag information, including the steps of:
and S102, training the multi-label recognition model through the training labels, wherein a circulation network is introduced into the input end of a prediction branch of the multi-label recognition model, and the circulation network is used for learning the relation among the training labels.
And step S104, constructing a tag restriction relation set according to the result of the prediction branch, wherein the tag restriction relation set comprises tag combinations meeting preset tag rules.
And step S106, acquiring the multi-label message to be identified.
And S108, determining the recognition result of the multi-label information according to the trained multi-label recognition model and the label limit relation set.
In one embodiment of the disclosure, the multi-label recognition model is trained through training labels, a recurrent network is introduced at the input end of the prediction branches of the multi-label recognition model, and the label restriction relation set is constructed according to the results of the prediction branches, thereby improving the self-consistency, accuracy and reliability of multi-label identification.
The label restriction relation set is used to store the feasible label combinations, and the recurrent network structure can make full use of the label relationships reflected in the label restriction relation set during network training, so as to improve label prediction accuracy and the self-consistency of label prediction.
Based on the steps shown in fig. 1, as shown in fig. 2, training the multi-label recognition model through training labels includes:
Step S2022, inputting a first training label among the training labels to the recurrent network to obtain a first output label, and inputting the first output label to the first fully-connected layer of the multi-label recognition model to obtain a first prediction result.
Step S2024, inputting the n-th output label and the (n+1)-th training label to the recurrent network to obtain an (n+1)-th output label, and inputting the (n+1)-th output label to the (n+1)-th fully-connected layer of the multi-label recognition model to obtain an (n+1)-th prediction result, where n is an integer greater than or equal to 1.
In an embodiment of the present disclosure, starting from the second-layer prediction branch, the output labels of the prediction branches are iterated through the recurrent network; that is, the n-th output label and the (n+1)-th training label are input to the recurrent network to obtain the (n+1)-th output label, and the (n+1)-th output label is input to the (n+1)-th fully-connected layer of the multi-label recognition model to obtain the (n+1)-th prediction result. In this way the identification results of all combinations of the first n+1 labels are predicted, and the accuracy and self-consistency of the combinations among the multiple labels can be trained and verified.
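By way of illustration only, the following PyTorch-style sketch shows one way such an iterative prediction head could be assembled; the class name MultiLabelHead, the choice of a GRU cell as the recurrent network, and all parameters are assumptions of this sketch rather than the patent's implementation:

```python
import torch
import torch.nn as nn

class MultiLabelHead(nn.Module):
    """Hypothetical recurrent prediction head: one fully-connected branch per label layer."""
    def __init__(self, feat_dim, hidden_dim, num_classes_per_layer):
        super().__init__()
        self.cell = nn.GRUCell(feat_dim, hidden_dim)           # the recurrent network
        self.fcs = nn.ModuleList(                              # FC_1 ... FC_n
            [nn.Linear(hidden_dim, c) for c in num_classes_per_layer])

    def forward(self, xs):
        # xs: per-layer input features x_1..x_n, each of shape (batch, feat_dim)
        h, logits = None, []
        for x, fc in zip(xs, self.fcs):
            h = self.cell(x, h)     # the n-th output feeds the (n+1)-th step
            logits.append(fc(h))    # the (n+1)-th prediction result
        return logits
```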
Based on the steps shown in fig. 1, as shown in fig. 3, training the multi-label recognition model through training labels further includes:
Step S3022, introducing a loss function at the output end of each fully-connected layer of the multi-label recognition model.
Step S3024, optimizing the multi-label recognition model according to the output of the loss function.
In one embodiment of the present disclosure, the loss function is used to measure the degree of inconsistency between the predicted value f(x) of the model and the true value Y. It is a non-negative real-valued function, usually written L(Y, f(x)); the smaller the loss function, the better the robustness of the model. In the disclosed embodiment, the loss function and the regularization term are balanced by a hyperparameter placed in front of the regularization term, which reduces the parameter scale, simplifies the model, and gives the recognition model better generalization ability.
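Stated as a standard regularized objective (written out here as an assumption for clarity, since the patent's own formulas appear only as figures), this balance can be expressed as:

$$\min_{\theta}\ \sum_{i} L\big(Y_i, f(x_i;\theta)\big) + \lambda\,\Omega(\theta),$$

where $\lambda$ is the hyperparameter placed in front of the regularization term $\Omega(\theta)$.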
In one embodiment of the present disclosure, the loss function includes a cross-entropy function and/or a Focal Loss function.
Cross entropy (CE) is often used as the loss for classification tasks in machine learning, and can be derived from the definition of the KL (Kullback-Leibler) divergence. When computing the cross entropy, only the probability of the target class is actually used; but since the predicted probabilities are produced by softmax, pushing that probability value up during optimization correspondingly pushes the other probability values down, so the whole probability distribution is optimized as a result.
Real data are likely to exhibit class imbalance, in which case the effect of cross entropy is limited; Focal Loss, discussed below, improves on CE to address the problem of sample imbalance.
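For illustration, the commonly used multi-class Focal Loss can be sketched on top of cross entropy as follows; gamma and alpha are the usual focusing and balancing hyperparameters, and the exact formulation is an assumption since the patent does not spell it out:

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, target, gamma=2.0, alpha=1.0):
    """Multi-class Focal Loss built on cross entropy (standard formulation)."""
    ce = F.cross_entropy(logits, target, reduction='none')  # per-sample -log p_t
    p_t = torch.exp(-ce)                                    # probability of the true class
    return (alpha * (1.0 - p_t) ** gamma * ce).mean()       # down-weights easy samples
```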
Based on the steps shown in fig. 1, as shown in fig. 4, determining the identification result of the multi-label information according to the trained multi-label recognition model and the label restriction relation set includes:
Step S4082, inputting the multi-label information to be identified into the trained multi-label recognition model to obtain the prediction confidence of each label in the multi-label information.
Step S4084, determining the identification result of the multi-label information according to the prediction confidences and the label restriction relation set.
In one embodiment of the disclosure, the identification result of the multi-label information is determined from the prediction confidences and the label restriction relation set, so as to improve the reliability, accuracy and self-consistency of multi-label identification.
For tasks where a label hierarchy has not been constructed, two principles are needed: on the one hand, labels within the same layer are mutually exclusive, that is, different labels of the same layer are not assigned to a sample at the same time (for example, the two labels male and female of a person image can serve as labels of the same layer); on the other hand, labels of different layers can be assigned to one sample at the same time, which keeps the number of layers to a minimum.
Based on the steps shown in fig. 1, as shown in fig. 5, determining the identification result of the multi-label information according to the prediction confidences and the label restriction relation set includes:
Step S5082, determining the prediction confidence of the z-th label of the m-th layer in the multi-label information according to the label restriction relation set.
Step S5084, solving for the optimal solution over the prediction confidences of the label combinations to obtain the identification result.
In one embodiment of the present disclosure, based on the obtained prediction results and the label restriction set T, for the i-th label $L_{n,i}$ of the n-th layer ($n \in [1, N]$) with prediction confidence $P_{n,i}$, the optimal prediction result is obtained according to the following formula:

$$\{L_{1,i_1^*},\ldots,L_{N,i_N^*}\} = \underset{\{L_{1,i_1},\ldots,L_{N,i_N}\} \in T}{\arg\max}\ D\big(P_{1,i_1},\ldots,P_{N,i_N}\big)$$
d () is a formula for calculating the overall prediction confidence of different label combinations, and different modes such as summation and product solving can be selected according to actual scenes. The process of finding the maximum value may also select various existing methods, such as the simplest conversion into a traversal problem of all possible combinations, or the conversion into a problem of searching the shortest path in the graph, but is not limited thereto.
Based on the steps shown in fig. 1, as shown in fig. 6, solving for the optimal solution over the prediction confidences of the label combinations to obtain the identification result includes:
Step S6082, determining the magnitude of the prediction confidence of each label.
Step S6084, solving for the optimal solution over the prediction confidences of the label combinations according to the magnitudes to obtain the identification result.
In one embodiment of the disclosure, by determining the magnitudes of the prediction confidences of the labels and solving for the optimal solution over the prediction confidences of the label combinations according to those magnitudes, the identification result is obtained, which greatly improves label prediction accuracy and the self-consistency of label prediction.
Based on the steps shown in fig. 1 and fig. 6, as shown in fig. 7, solving for the optimal solution over the prediction confidences of the label combinations according to the magnitudes includes:
Step S70842, determining a first magnitude difference between the prediction confidences of labels of different granularities.
Step S70844, when the first magnitude difference is determined to be greater than a first preset magnitude, computing the product of the prediction confidences of each label combination.
Step S70846, determining the label combination with the largest product as the identification result.
In one embodiment of the present disclosure, when the magnitudes of the output prediction confidences differ greatly between labels of different granularities (levels), the product form is selected, which better exposes the confidence differences between labels of different levels.
Based on the steps shown in fig. 1 and fig. 6, as shown in fig. 8, solving for the optimal solution over the prediction confidences of the label combinations according to the magnitudes further includes:
Step S80842, determining a second magnitude difference between the prediction confidences of labels of the same granularity.
Step S80844, when the second magnitude difference is determined to be greater than a second preset magnitude, summing the prediction confidences of each label combination.
Step S80846, determining the label combination with the largest sum as the identification result.
In an embodiment of the present disclosure, when the output prediction confidences of labels at the same level differ greatly, the summation form is selected, which is more conducive to correcting prediction errors.
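Combining the two heuristics, a hedged sketch of choosing the form of D() from the confidence magnitudes might look as follows; interpreting the magnitude difference as a ratio, and the thresholds t1 and t2 as the preset magnitudes, are assumptions of this sketch:

```python
def choose_mode(conf_by_level, t1=10.0, t2=10.0):
    """Pick 'product' when magnitudes differ across levels, 'sum' when they differ within a level."""
    flat = [c for level in conf_by_level for c in level]
    across = max(flat) / max(min(flat), 1e-12)        # magnitude spread across all levels
    within = max(max(lv) / max(min(lv), 1e-12)        # largest spread inside a single level
                 for lv in conf_by_level)
    if across > t1:
        return "product"
    if within > t2:
        return "sum"
    return "product"                                  # default choice (assumption)
```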
In one embodiment of the present disclosure, the recurrent network includes one of a recurrent neural network, a bidirectional recurrent neural network, a long short-term memory network, and a gated recurrent unit network.
Based on the steps shown in fig. 1, as shown in fig. 9, before the multi-label recognition model is trained through the training labels, the method further includes:
Step S902, acquiring multi-layer label features, wherein the granularity of the label features of the k-th layer is different from that of the (k+1)-th layer, and k is an integer greater than or equal to 1; and preprocessing the multi-layer label features to obtain the training labels.
In one embodiment of the present disclosure, the preprocessing includes at least one of random cropping, rotation, flipping, color adjustment, and adding noise.
The preprocessing is the data augmentation used in deep learning: data augmentation is introduced to expand the diversity of the data set to a certain extent and to add appropriate perturbations, thereby increasing the generalization ability of the model.
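For illustration only (the patent names the operations but prescribes no library), a torchvision-style augmentation pipeline covering the listed operations might look like this; the crop size and noise scale are assumed values:

```python
import torch
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomCrop(224),                        # random cropping (size assumed)
    transforms.RandomRotation(degrees=15),             # rotation
    transforms.RandomHorizontalFlip(),                 # flipping
    transforms.ColorJitter(0.2, 0.2, 0.2),             # color adjustment
    transforms.ToTensor(),
    transforms.Lambda(lambda t: t + 0.01 * torch.randn_like(t)),  # adding noise
])
```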
The apparatus 1000 for identifying multi-label information according to this embodiment of the present disclosure is described below with reference to fig. 10. The apparatus 1000 shown in fig. 10 is only an example and should not impose any limitation on the function and scope of use of the embodiments of the present disclosure.
The apparatus 1000 for identifying multi-label information is embodied in the form of hardware modules. Its components may include, but are not limited to: a training module 1002, a building module 1004, an obtaining module 1006, and a determining module 1008.
The training module 1002 is configured to train the multi-label recognition model through training labels, wherein a recurrent network is introduced at the input end of the prediction branches of the multi-label recognition model, and the recurrent network is used to learn the relationships among the plurality of training labels;
the building module 1004 is configured to build a label restriction relation set according to the results of the prediction branches, wherein the label restriction relation set includes label combinations that satisfy preset label rules;
the obtaining module 1006 is configured to obtain the multi-label information to be identified;
and the determining module 1008 is configured to determine the identification result of the multi-label information according to the trained multi-label recognition model and the label restriction relation set.
An identification platform for multi-label information according to such an embodiment of the present disclosure is described below with reference to fig. 11. The identification platform shown in fig. 11 is only an example and should not impose any limitation on the function and scope of use of the embodiments of the present disclosure.
As shown in fig. 11, the multi-label information identification platform includes the sample data 1102, a feature extraction network 1104, a recurrent network 1106, fully-connected layers, and the results of each prediction branch. Multi-label identification is specifically implemented by the following steps:
a) Constructing a label system:
Although the labels in a multi-label recognition task are relatively independent, they can be organized into a hierarchical structure, so the sample data 1102 requires hierarchical processing.
The logic of the hierarchy is to find the subordination relationships among labels of different granularities, thereby forming a tree-shaped label system.
The classification logic differs across recognition tasks; for example, an apparel-related label system can be constructed as follows:
First-level labels: upper garment and lower garment.
Second-level labels: T-shirt and coat (second-level labels under the first-level label upper garment); skirt and shorts (second-level labels under the first-level label lower garment).
Third-level labels: short-sleeved T-shirt and long-sleeved T-shirt (third-level labels under the second-level label T-shirt); short skirt and long skirt (third-level labels under the second-level label skirt).
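Such a tree-shaped label system can be represented, for example, as a nested mapping (an illustrative data structure, not one specified by the patent):

```python
label_tree = {
    "upper garment": {
        "T-shirt": ["short-sleeved T-shirt", "long-sleeved T-shirt"],
        "coat": [],
    },
    "lower garment": {
        "skirt": ["short skirt", "long skirt"],
        "shorts": [],
    },
}
```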
For tasks where a label hierarchy has not been constructed, two principles must be followed: on the one hand, labels within the same layer are mutually exclusive, that is, different labels of the same layer are not assigned to a sample at the same time (for example, the two labels male and female of a person image can serve as labels of the same layer); on the other hand, labels of different layers may be assigned to one sample at the same time, thereby ensuring that the number of layers is minimized, but this is not limiting.
b) Data collection and preprocessing:
marking and collecting the training data through the feature selection network 1104, and preprocessing the training data to obtain the training data x of the first-layer prediction branch1Second layer predictionTraining data x of branches2Training data x of third-level predicted branch3… … training data x for the nth level predicted branchn
c) Building a training network:
as a main improvement point of the present disclosure, this step introduces a loop network 1106 before the predicted branch of the labels when building the training network to learn the relationship between the labels.
(1) The sample data 1102 may come from different domains, such as text, audio, and images. The feature extraction network 1104 only needs to take the sample data as input, and any current mainstream deep learning network can be selected.
(2) Through the feature extraction network 1104, the input features x1~xn corresponding to the different levels can be extracted. For a multi-branch network structure, the input features x1~xn are different but have the same dimensions; for a single-branch structure, the input features may be identical.
(3) There are also various candidates for the recurrent network 1106, such as an RNN (recurrent neural network), a bidirectional RNN, an LSTM (long short-term memory network), or a GRU (gated recurrent unit network), as shown in fig. 11.
(4) After the input features are processed by the W function of the recurrent network 1106, the outputs include the feature y1 of the first prediction branch, the feature y2 of the second prediction branch, the feature y3 of the third prediction branch, ..., and the feature yn of the n-th prediction branch. For example, the feature y1 of the first prediction branch of the recurrent network 1106 may be iterated to the input end of the second prediction branch, so that the second branch's prediction result depends both on the training data x1 of the first-layer prediction branch and on the training data x2 of the second-layer prediction branch.
Behind the recurrent network 1106 are the corresponding fully-connected layers, for example the first fully-connected layer FC1, the second fully-connected layer FC2, the third fully-connected layer FC3, ..., and the n-th fully-connected layer FCn.
The fully-connected layers output the prediction results of the labels of the different layers, for example the first-layer prediction result, the second-layer prediction result, the third-layer prediction result, ..., and the n-th-layer prediction result.
During the training of the recognition model, classification loss functions may be applied after the fully-connected layers of the different layers, for example cross entropy or Focal Loss, but are not limited thereto.
d) Obtaining a prediction result:
this step collects the prediction results of the recognition model on the sample data 1102, i.e., the confidence levels normalized for each tag.
e) Constructing a label restriction relationship:
as a main improvement point of the present disclosure, the tag constraint relation set is constructed to relax the constraint of the post-processing, and realize a flexible post-processing flow. Based on statistics of training data, all possible situations that labels of different levels commonly appear can be obtained, namely a label limit relation set T is generated, the set comprises the probability of whether a label combination exists or not, the probability of the existing method that the label combination exists is not, the generalization capability of the recognition model is enhanced by means of the improvement of the label prediction accuracy, and the probability of unfriendly long-tail labels is reduced.
f) Outputting a final prediction result:
and limiting the set T based on the obtained prediction result and the label.
For the nth layer (N ∈ [1, N)]) Label L of the ithn,iWith a prediction confidence of Pn,iThen, the optimal prediction result is obtained according to the following formula:
Figure 726234DEST_PATH_IMAGE002
d () is a formula for calculating the overall prediction confidence of different label combinations, and different modes such as summation and product solving can be selected according to actual scenes. The process of finding the maximum value may also select various existing methods, such as the simplest conversion into a traversal problem of all possible combinations, or the conversion into a problem of searching the shortest path in the graph, but is not limited thereto.
In addition, D() can also be extended in other ways: for example, a threshold-based summation/product can reduce the influence of abnormal values (outliers) on the overall prediction result, while a weighted summation/product can raise or lower the influence of the confidences of labels at particular levels on the overall prediction result.
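Hedged sketches of those two extensions (the clipping floor and the per-level weights are assumed values, not given by the patent):

```python
import math

def D_thresholded(confidences, floor=0.05):
    """Product that clips abnormally low confidences to limit outlier influence."""
    return math.prod(max(c, floor) for c in confidences)

def D_weighted(confidences, weights):
    """Weighted sum raising or lowering the influence of particular levels."""
    return sum(w * c for w, c in zip(weights, confidences))
```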
In the above identification scheme for multi-label information, first, introducing a recurrent network allows the relationships among labels to be fully exploited during network training, which improves label prediction accuracy and the self-consistency of label prediction. Second, a flexible post-processing flow is constructed as a guarantee of the improved self-consistency of label prediction, and by avoiding over-fitting of the post-processing to the distribution of the current training data, unfriendliness to long-tail labels is reduced.
An electronic device 1200 according to this embodiment of the disclosure is described below with reference to fig. 12. The electronic device 1200 shown in fig. 12 is only an example and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 12, the electronic device 1200 is embodied in the form of a general purpose computing device. The components of the electronic device 1200 may include, but are not limited to: the at least one processing unit 1210, the at least one memory unit 1220, and a bus 1230 connecting the various system components including the memory unit 1220 and the processing unit 1210.
Where the memory unit stores program code, the program code may be executed by the processing unit 1210 such that the processing unit 1210 performs the steps according to various exemplary embodiments of the present disclosure described in the above-mentioned "exemplary methods" section of this specification. For example, the processing unit 1210 may perform the steps defined in the identification method of multi-tag information of the present disclosure.
The storage unit 1220 may include a readable medium in the form of a volatile memory unit, such as a random access memory unit (RAM) 12201 and/or a cache memory unit 12202, and may further include a read only memory unit (ROM) 12203.
Storage unit 1220 may also include a program/utility 12204 having a set (at least one) of program modules 12205, such program modules 12205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
Bus 1230 may be one or more of several types of bus structures, including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.
The electronic device 1200 may also communicate with one or more external devices 1240 (e.g., a keyboard, a pointing device, a Bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device 1200, and/or with any device (e.g., a router, a modem, etc.) that enables the electronic device 1200 to communicate with one or more other computing devices. Such communication may occur via input/output (I/O) interfaces 1250. Furthermore, the electronic device 1200 may communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through the network adapter 1110, which communicates with the other modules of the electronic device 1200 through the bus 1230. It should be appreciated that although not shown in the figures, other hardware and/or software modules may be used in conjunction with the electronic device, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a usb disk, a removable hard disk, etc.) or on a network, and includes several instructions to enable a computing device (which may be a personal computer, a server, a terminal device, or a network device, etc.) to execute the method according to the embodiments of the present disclosure.
In an exemplary embodiment of the present disclosure, there is also provided a computer-readable storage medium having stored thereon a program product capable of implementing the above-described method of the present specification. In some possible embodiments, various aspects of the disclosure may also be implemented in the form of a program product comprising program code for causing a terminal device to perform the steps according to various exemplary embodiments of the disclosure described in the above-mentioned "exemplary methods" section of this specification, when the program product is run on the terminal device.
A program product implementing the above method according to an embodiment of the present disclosure may employ a portable compact disc read-only memory (CD-ROM), include program code, and be run on a terminal device such as a personal computer. However, the program product of the present disclosure is not limited thereto; in this document, a readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, through the Internet using an Internet service provider).
It should be noted that although in the above detailed description several modules or units of the device for action execution are mentioned, such a division is not mandatory. Indeed, the features and functionality of two or more modules or units described above may be embodied in one module or unit, according to embodiments of the present disclosure. Conversely, the features and functions of one module or unit described above may be further divided into embodiments by a plurality of modules or units.
Moreover, although the steps of the methods of the present disclosure are depicted in the drawings in a particular order, this does not require or imply that the steps must be performed in this particular order, or that all of the depicted steps must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step execution, and/or one step broken down into multiple step executions, etc.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a usb disk, a removable hard disk, etc.) or on a network, and includes several instructions to enable a computing device (which may be a personal computer, a server, a mobile terminal, or a network device, etc.) to execute the method according to the embodiments of the present disclosure.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

Claims (15)

1. A method for identifying multi-label information, characterized by comprising the following steps:
training a multi-label recognition model through training labels, wherein a recurrent network is introduced at the input end of the prediction branches of the multi-label recognition model, and the recurrent network is used to learn the relationships among a plurality of training labels;
constructing a label restriction relation set according to the results of the prediction branches, wherein the label restriction relation set comprises label combinations that satisfy preset label rules;
acquiring multi-label information to be identified;
and determining the identification result of the multi-label information according to the trained multi-label recognition model and the label restriction relation set.
2. The method for identifying multi-label information according to claim 1, wherein training the multi-label recognition model through training labels comprises:
inputting a first training label among the training labels to the recurrent network to obtain a first output label, and inputting the first output label to a first fully-connected layer of the multi-label recognition model to obtain a first prediction result;
inputting the n-th output label and the (n+1)-th training label to the recurrent network to obtain an (n+1)-th output label, and inputting the (n+1)-th output label to an (n+1)-th fully-connected layer of the multi-label recognition model to obtain an (n+1)-th prediction result, wherein n is an integer greater than or equal to 1.
3. The method for identifying multi-label information according to claim 1, wherein training the multi-label recognition model through training labels further comprises:
introducing a loss function at the output end of the fully-connected layers of the multi-label recognition model;
and optimizing the multi-label recognition model according to the output of the loss function.
4. The method for identifying multi-label information according to claim 3, wherein
the loss function comprises a cross-entropy function and/or a Focal Loss function.
5. The method for identifying multi-label information according to claim 1, wherein determining the identification result of the multi-label information according to the trained multi-label recognition model and the label restriction relation set comprises:
inputting the multi-label information to be identified into the trained multi-label recognition model to obtain a prediction confidence for each label in the multi-label information;
and determining the identification result of the multi-label information according to the prediction confidences and the label restriction relation set.
6. The method for identifying multi-label information according to claim 5, wherein determining the identification result of the multi-label information according to the prediction confidences and the label restriction relation set comprises:
determining the prediction confidence of the z-th label of the m-th layer in the multi-label information according to the label restriction relation set;
and solving for the optimal solution over the prediction confidences of the label combinations to obtain the identification result.
7. The method for identifying multi-label information according to claim 6, wherein solving for the optimal solution over the prediction confidences of the label combinations to obtain the identification result comprises:
determining the magnitude of the prediction confidence of each label;
and solving for the optimal solution over the prediction confidences of the label combinations according to the magnitudes to obtain the identification result.
8. The method for identifying multi-label information according to claim 7, wherein solving for the optimal solution over the prediction confidences of the label combinations according to the magnitudes to obtain the identification result comprises:
determining a first magnitude difference between the prediction confidences of labels of different granularities;
when the first magnitude difference is determined to be greater than a first preset magnitude, computing the product of the prediction confidences of each label combination;
and determining the label combination with the largest product as the identification result.
9. The method for identifying multi-label information according to claim 7, wherein solving for the optimal solution over the prediction confidences of the label combinations according to the magnitudes to obtain the identification result further comprises:
determining a second magnitude difference between the prediction confidences of labels of the same granularity;
when the second magnitude difference is determined to be greater than a second preset magnitude, summing the prediction confidences of each label combination;
and determining the label combination with the largest sum as the identification result.
10. The method for identifying multi-label information according to any one of claims 1 to 9, wherein
the recurrent network comprises one of a recurrent neural network, a bidirectional recurrent neural network, a long short-term memory network, and a gated recurrent unit network.
11. The method for identifying multi-label information according to any one of claims 1 to 9, further comprising, before training the multi-label recognition model through the training labels:
acquiring multi-layer label features, wherein the granularity of the label features of the k-th layer differs from that of the (k + 1)-th layer, and k is an integer greater than or equal to 1;
and preprocessing the multi-layer label features to obtain the training labels.
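A toy example of multi-layer label features with differing granularity, using entirely hypothetical category names, and of how such a hierarchy could induce the label constraint relation set:

```python
# layer k (coarse) -> layer k+1 (fine); all names are hypothetical
label_hierarchy = {
    "clothing": ["dress", "coat", "T-shirt"],
    "footwear": ["sneaker", "boot", "sandal"],
}
# the constraint relation set would admit ("clothing", "dress")
# and rule out inconsistent pairs such as ("footwear", "dress")
valid_pairs = [(parent, child)
               for parent, children in label_hierarchy.items()
               for child in children]
```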
12. The method for identifying multi-label information according to claim 11, wherein
the preprocessing comprises at least one of random cropping, rotation, flipping, color adjustment, and noise addition.
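A minimal torchvision sketch of such a preprocessing pipeline; the crop size, rotation angle, jitter strengths, and noise scale are illustrative choices rather than parameters from this application.

```python
import torch
from torchvision import transforms

add_noise = transforms.Lambda(
    lambda img: (img + 0.05 * torch.randn_like(img)).clamp(0.0, 1.0)
)

preprocess = transforms.Compose([
    transforms.RandomResizedCrop(224),        # random cropping
    transforms.RandomRotation(degrees=15),    # rotation
    transforms.RandomHorizontalFlip(),        # flipping
    transforms.ColorJitter(0.4, 0.4, 0.4),    # color adjustment
    transforms.ToTensor(),
    add_noise,                                # noise addition
])
```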
13. An apparatus for identifying multi-label information, comprising:
a training module, which is used for training a multi-label recognition model through training labels, wherein a recurrent network is introduced at the input end of a prediction branch of the multi-label recognition model and is used for learning the relationships among the training labels;
a construction module, which is used for constructing a label constraint relation set according to the results of the prediction branch, wherein the label constraint relation set comprises label combinations satisfying preset label rules;
an acquisition module, which is used for acquiring the multi-label information to be identified;
and a determination module, which is used for determining the recognition result of the multi-label information according to the trained multi-label recognition model and the label constraint relation set.
14. An electronic device, comprising:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the method for identifying multi-label information of any one of claims 1 to 12 via execution of the executable instructions.
15. A computer-readable storage medium having a computer program stored thereon, wherein
the computer program, when executed by a processor, implements the method for identifying multi-label information of any one of claims 1 to 12.
CN202011464955.4A 2020-12-14 2020-12-14 Multi-label information identification method and device, electronic equipment and readable storage medium Active CN112232524B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011464955.4A CN112232524B (en) 2020-12-14 2020-12-14 Multi-label information identification method and device, electronic equipment and readable storage medium

Publications (2)

Publication Number Publication Date
CN112232524A (en) 2021-01-15
CN112232524B (en) 2021-06-29

Family

ID=74124879

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011464955.4A Active CN112232524B (en) 2020-12-14 2020-12-14 Multi-label information identification method and device, electronic equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN112232524B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130322743A1 * 2012-05-31 2013-12-05 Casio Computer Co., Ltd. Multi-class identifier, method, and computer-readable recording medium
CN107577785A (en) * 2017-09-15 2018-01-12 Nanjing University Hierarchical multi-label classification method suitable for legal-domain identification
CN110347839A (en) * 2019-07-18 2019-10-18 Hunan Shuding Intelligent Technology Co., Ltd. Text classification method based on a generative multi-task learning model
CN110688482A (en) * 2019-09-12 2020-01-14 New H3C Big Data Technologies Co., Ltd. Multi-label identification method, training method and device
CN111506732A (en) * 2020-04-20 2020-08-07 Beijing Zhongke Fanyu Technology Co., Ltd. Text multi-level label classification method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Pengcheng Yang et al., "SGM: Sequence Generation Model for Multi-Label Classification", https://arxiv.org/abs/1806.04822 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112884161A (en) * 2021-02-02 2021-06-01 Shandong Computer Science Center (National Supercomputing Center in Jinan) Cooperative learning method, device, equipment and medium for resisting label-flipping attacks
CN112884161B (en) * 2021-02-02 2021-11-02 Shandong Computer Science Center (National Supercomputing Center in Jinan) Cooperative learning method, device, equipment and medium for resisting label-flipping attacks
CN113204660A (en) * 2021-03-31 2021-08-03 Beijing Dajia Internet Information Technology Co., Ltd. Multimedia data processing method, label identification method, device and electronic equipment
CN113204660B (en) * 2021-03-31 2024-05-17 Beijing Dajia Internet Information Technology Co., Ltd. Multimedia data processing method, label identification device and electronic equipment
CN113837216A (en) * 2021-06-01 2021-12-24 Tencent Technology (Shenzhen) Co., Ltd. Data classification method, training method, device, medium and electronic equipment
CN113837216B (en) * 2021-06-01 2024-05-10 Tencent Technology (Shenzhen) Co., Ltd. Data classification method, training method, device, medium and electronic equipment
CN115795173A (en) * 2023-02-08 2023-03-14 Haikan Network Technology (Shandong) Co., Ltd. Method for improving the computation of related recommendations in a recommendation system
CN115983609A (en) * 2023-03-17 2023-04-18 Zhongguancun Science City Urban Brain Co., Ltd. Work order processing method and device, electronic equipment and computer-readable medium

Also Published As

Publication number Publication date
CN112232524B (en) 2021-06-29

Similar Documents

Publication Publication Date Title
CN112232524B (en) Multi-label information identification method and device, electronic equipment and readable storage medium
Zhong et al. A systematic survey of data mining and big data analysis in internet of things
Rhee et al. Active and semi-supervised learning for object detection with imperfect data
CN114912433B (en) Text-level multi-label classification method, apparatus, electronic device and storage medium
CN113011646B (en) Data processing method, device and readable storage medium
CN113806582B (en) Image retrieval method, image retrieval device, electronic equipment and storage medium
CN113779225B (en) Training method of entity link model, entity link method and device
CN113626612A (en) Prediction method and system based on knowledge graph reasoning
CN112949758A (en) Response model training method, response method, device, equipment and storage medium
CN113569657A (en) Pedestrian re-identification method, device, equipment and storage medium
CN114219971A (en) Data processing method, data processing equipment and computer readable storage medium
CN117349437A (en) Government information management system and method based on intelligent AI
CN113239799B (en) Training method, recognition method, device, electronic equipment and readable storage medium
CN115631008B (en) Commodity recommendation method, device, equipment and medium
CN114374655A (en) Network flow characteristic extraction method, device, equipment and storage medium
WO2023143570A1 (en) Connection relationship prediction method and related device
CN115495677B (en) Method and storage medium for spatio-temporal localization of video
Jiang et al. A massive multi-modal perception data classification method using deep learning based on internet of things
CN115828269A (en) Method, device, equipment and storage medium for constructing source code vulnerability detection model
CN115168609A (en) Text matching method and device, computer equipment and storage medium
CN114638308A (en) Method and device for acquiring object relationship, electronic equipment and storage medium
CN114821188A (en) Image processing method, training method of scene graph generation model and electronic equipment
CN114925210A (en) Knowledge graph construction method, device, medium and equipment
CN113553401A (en) Text processing method, device, medium and electronic equipment
CN114201199A (en) Protection upgrading method based on big data of information security and information security system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant