CN115344698A - Label processing method, label processing device, computer equipment, storage medium and program product

Info

Publication number: CN115344698A
Application number: CN202210962568.6A
Authority: CN
Inventors: 杨皓
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2022-08-11
Filing date: 2022-08-11
Publication date: 2022-11-15

Abstract

The application relates to a label processing method, a label processing device, a computer device, a storage medium and a program product. The method involves artificial intelligence and comprises: extracting, from media data containing a text, text features of the text and tag features of each candidate tag contained in the text; performing feature updating on each tag feature based on a first feature similarity among the tag features to obtain primary updated tag features; secondarily updating the primary updated tag features according to the co-occurrence relation of the candidate tags in historical media data to obtain secondary updated tag features; and determining, as the tags of the media data, the target tags among the candidate tags for which the second feature similarity between the secondary updated tag feature and the text features meets a similarity condition. The method improves the degree of association between the tags of the media data and the media data, and avoids the problem that accurate information cannot be obtained because of a low degree of association.

Description

Label processing method, label processing device, computer equipment, storage medium and program product

Technical Field

The present application relates to the field of artificial intelligence technologies, and in particular, to a tag processing method, apparatus, computer device, storage medium, and program product.

Background

With the development of artificial intelligence technology and the popularization of internet applications, application platforms such as news platforms and shopping platforms have to deal with massive data and interference information in practice. As a result, they cannot manage information efficiently and cannot accurately determine the information or data required by different users.

Taking a news platform as an example, the media data of the platform keeps growing and is not managed efficiently, which makes it harder for users to obtain effective information. For this reason, different content tags are added to the text of the media data, so that users can quickly and accurately obtain the required information according to the content tags.

However, the inventor finds that the conventional way of adding content tags and recommending information generally relies on a multi-class deep learning model, whose number of classes is usually fixed and small. For breaking news or hot events, such a conventional multi-classification model cannot label, identify and display the large number of newly added events, so the problem that effective information cannot be acquired quickly and accurately remains.

Disclosure of Invention

In view of the foregoing, it is desirable to provide a tag processing method, a tag processing apparatus, a computer device, a computer readable storage medium, and a computer program product, which can improve the accuracy of determining tags of obtained media data and the information recommendation effect.

In a first aspect, the present application provides a label processing method. The method comprises the following steps:

respectively extracting text features of a text and tag features of each candidate tag contained in the text from media data containing the text;

respectively performing feature updating on each label feature based on first feature similarity among the label features to obtain a primary updated label feature;

according to the co-occurrence relation of each candidate label in the historical media data, carrying out secondary updating on the primary updated label characteristic to obtain a secondary updated label characteristic;

and determining a target label, of the candidate labels, with the second feature similarity meeting a similarity condition, as the label of the media data based on the second feature similarity between each secondary updated label feature and the text feature.

In one embodiment, the method further comprises: determining a corresponding preset similarity threshold according to a preset similarity condition, and screening the second feature similarities to obtain a target similarity greater than the preset similarity threshold; or sequencing the second feature similarities according to the similarity values to obtain corresponding feature similarity sequences, and screening preset target similarities from the feature similarity sequences according to preset similarity conditions.
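As a non-limiting illustration, the two screening modes described above (keeping similarities greater than a preset threshold, or sorting the similarities and keeping a preset number of them) may be sketched in Python as follows. The function names, the example scores, the threshold value and the value of k are hypothetical and are not taken from the application.

```python
def select_by_threshold(similarities, threshold):
    """Keep tags whose second feature similarity is greater than the threshold."""
    return [tag for tag, score in similarities.items() if score > threshold]


def select_top_k(similarities, k):
    """Sort tags by similarity in descending order and keep the first k."""
    ranked = sorted(similarities.items(), key=lambda item: item[1], reverse=True)
    return [tag for tag, _ in ranked[:k]]


# Hypothetical second feature similarities for four candidate tags.
scores = {"tag_a": 0.91, "tag_b": 0.42, "tag_c": 0.77, "tag_d": 0.15}

above_threshold = select_by_threshold(scores, threshold=0.5)
top_two = select_top_k(scores, k=2)
```

With these example values both modes keep the same two tags; in general the threshold mode returns a variable number of tags, while the ranking mode returns a fixed number.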

In a second aspect, the application also provides a label processing device. The device comprises:

the extraction module is used for respectively extracting text features of the text and tag features of each candidate tag contained in the text from media data containing the text;

the initial updating module is used for respectively updating the characteristics of each label characteristic based on the first characteristic similarity among the label characteristics to obtain the initial updated label characteristics;

the secondary updating module is used for carrying out secondary updating on the primarily updated tag characteristics according to the co-occurrence relation of each candidate tag in the historical media data to obtain secondary updated tag characteristics;

and the label screening module is used for determining a target label of each candidate label, of which the second feature similarity meets the similarity condition, as the label of the media data based on the second feature similarity between each secondary updated label feature and the text feature.

In a third aspect, the present application also provides a computer device. The computer device comprises a memory storing a computer program and a processor which, when executing the computer program, implements the steps of the label processing method of the first aspect.

In a fourth aspect, the present application further provides a computer-readable storage medium. The computer-readable storage medium has stored thereon a computer program which, when executed by a processor, performs the steps of the label processing method of the first aspect.

In a fifth aspect, the present application further provides a computer program product. The computer program product comprises a computer program which, when executed by a processor, performs the steps of the label processing method of the first aspect.

In the tag processing method, the tag processing device, the computer device, the storage medium and the computer program product, the text features of the text and the tag features of each candidate tag included in the text are extracted from the media data including the text, and each tag feature is updated based on the first feature similarity among the tag features to obtain primary updated tag features, which improves the degree of correlation among the primary updated tag features. The primary updated tag features are then secondarily updated according to the co-occurrence relation of the candidate tags in the historical media data to obtain secondary updated tag features, which further improves the degree of correlation among the secondary updated tag features. Finally, based on the second feature similarity between each secondary updated tag feature and the text features, the target tags whose second feature similarity satisfies the similarity condition are determined among the candidate tags as the tags of the media data. This improves the association between the determined tags and the media data, avoids the situation in which users cannot accurately acquire effective information because of low tag association, and further improves the personalized recommendation effect when recommendations are made to users based on the media data.

Drawings

FIG. 1 is a diagram of an application environment of a tag processing method in one embodiment;

FIG. 2 is a schematic flow chart diagram of a tag processing method in one embodiment;

FIG. 3 is a schematic flow chart illustrating the process of updating the primarily updated tag feature a second time to obtain a secondary updated tag feature in one embodiment;

FIG. 4 is a schematic flow chart illustrating construction of a tag co-occurrence diagram according to co-occurrence relationships between candidate tags in historical media data according to an embodiment;

FIG. 5 is a diagram of initial co-occurrence relationships established based on candidate tags and co-occurrence probabilities between candidate tags, under an embodiment;

FIG. 6 is a graph of tag co-occurrence relationships obtained after screening based on probability conditions in an embodiment;

FIG. 7 is a schematic flow chart illustrating a training process for a label fusion model in one embodiment;

FIG. 8 is an architectural diagram of a tag fusion model in one embodiment;

FIG. 9 is a schematic flow chart diagram of a tag processing method in another embodiment;

FIG. 10 is a block diagram showing the construction of a tag processing apparatus according to an embodiment;

FIG. 11 is a diagram illustrating an internal structure of a computer device in one embodiment.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit it.

The tag processing method provided by the embodiments of the application relates to artificial intelligence technology. Artificial intelligence (AI) is the theory, methods, technology and application systems that use a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use that knowledge to obtain the best result. In other words, artificial intelligence is a comprehensive branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making. Artificial intelligence is a comprehensive discipline that covers a wide range of fields, including both hardware-level and software-level technologies. The basic artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems and mechatronics. Artificial intelligence software technology mainly comprises computer vision, speech processing, natural language processing and machine learning/deep learning.

Among these, natural language processing (NLP) is an important direction in the fields of computer science and artificial intelligence. It studies the theories and methods that enable efficient communication between humans and computers using natural language. Natural language processing is a science that integrates linguistics, computer science and mathematics. Research in this field involves natural language, i.e. the language people use every day, so it is closely related to the study of linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, question answering robots, knowledge graphs and the like. With the research and progress of artificial intelligence technology, it has been studied and applied in many fields, such as smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, unmanned driving, automatic driving, unmanned aerial vehicles, robots, smart medical services and smart customer service.

The tag processing method provided by the embodiment of the application specifically relates to a natural language processing technology in an artificial intelligence technology, and can be applied to an application environment shown in fig. 1. Wherein the terminal 102 communicates with the server 104 via a network. The data storage system may store data that the server 104 needs to process. The data storage system may be integrated on the server 104, or may be located on the cloud or other network server. The server 104 extracts text features of the text and respective tag features of each candidate tag included in the text from the media data including the text, and performs feature update on each tag feature based on a first feature similarity between the tag features, so as to obtain a first updated tag feature. The media data including the text may be stored in a local storage of the terminal 102, or may be stored in a data storage system or a cloud storage associated with the server 104, and when the tag processing is required, the server 104 may obtain the media data including the text from the local storage of the terminal 102, or the data storage system, or the cloud storage. Further, the server 104 performs secondary updating on the primary updated tag feature according to the co-occurrence relationship of each candidate tag in the historical media data to obtain a secondary updated tag feature, and determines, as the tag of the media data, a target tag in each candidate tag whose second feature similarity satisfies the similarity condition based on the second feature similarity of each secondary updated tag feature and the text feature. After the tags of the media data are obtained through screening, the server 104 may use the tags as explicit tags of the media data to be displayed on the terminal 102, so that each user of the terminal 102 can quickly obtain effective information. 
The terminal 102 may be, but not limited to, various personal computers, notebook computers, smart phones, tablet computers, internet of things devices, and portable wearable devices, and the internet of things devices may be smart speakers, smart televisions, smart air conditioners, smart car-mounted devices, and the like. The portable wearable device can be a smart watch, a smart bracelet, a head-mounted device, and the like. The server 104 may be implemented as a stand-alone server or a server cluster comprised of multiple servers.

In one embodiment, as shown in FIG. 2, a tag processing method is provided. The method is described by taking its application to the server in FIG. 1 as an example, and includes the following steps:

Step S202, extracting text features of the text and tag features of each candidate tag included in the text from the media data including the text.

The media data containing the text comprises at least two types of text, and the text features comprise at least two text features in one-to-one correspondence with the at least two types of text. For example, the media data may include a title text and a content text, in which case the text features include a title text feature corresponding to the title text and a content text feature corresponding to the content text.

Similarly, the candidate tags are specifically determined from texts included in the media data, and when the texts included in the media data are a title text and a content text, the candidate tags specifically include tags obtained by tag identification of the title text and tags obtained by tag identification of the content text.

For example, the media data may be news data, the text may be news headline text and news body text, and the candidate tags may be tags respectively identified from news headlines and news bodies. Similarly, the media data may also be other types of data, such as page sharing data of a communication application platform and a social sharing platform, that is, data that includes a title content and a text content and is capable of performing operations such as tag processing and tag display may be determined as media data that needs to be processed.

Specifically, by acquiring media data to be subjected to label processing and identifying a title text and a content text of the media data to be subjected to label processing, a title text feature of the title text and a content text feature of the content text are further extracted. When the label features are extracted, label identification is specifically performed on the title text and the content text respectively to obtain a label in the title text and a label in the content text, the label in the title text and the label in the content text are both used as candidate labels, and the respective label features of each candidate label are further extracted.

When the label identification is performed on the title text and the content text, the respective labels of the title text and the content text can be obtained by means of entity mining based on the title text and the content text, concept label generation and the like.

In one embodiment, a trained tag fusion model is used to extract the title text feature of the title text and the content text feature of the content text from the media data to be subjected to tag processing. Similarly, the trained tag fusion model is used to extract the tag feature of each candidate tag identified from the title text and the content text of the media data. When the trained tag fusion model extracts the title text feature, the content text feature and the tag features, the feature extraction is performed by the text feature extraction layer of the tag fusion model.

The text feature extraction layer in the tag fusion model may be a BERT network (Bidirectional Encoder Representations from Transformers), a fastText network (i.e., a fast text classification network), a TextCNN network (i.e., a network that classifies text using a convolutional neural network), or another network type, as long as the network can implement text feature extraction.

In one embodiment, the text includes at least two types of text, the text features include at least two text features corresponding to the at least two types of text one to one, and after extracting the text features of the text from the media data containing the text, the method further includes:

determining text feature similarity between at least two text features; and respectively updating the characteristics of the at least two text characteristics according to the similarity of the text characteristics to obtain a text characteristic combination comprising the updated at least two text characteristics.

Specifically, since the media data specifically includes a title text and a content text, the text feature includes a title text feature corresponding to the title text and a content text feature corresponding to the content text, and the text feature combination specifically includes the title text feature and the content text feature. Specifically, the method comprises the steps of determining text feature similarity between a title text feature and a content text feature, respectively updating the title text feature and the content text feature based on the text feature similarity between the title text feature and the content text feature to obtain an updated title text feature and an updated content text feature, and further obtaining a text feature combination comprising the updated title text feature and the updated content text feature.

In this embodiment, the feature update of the title text feature and the content text feature is performed by the first feature update layer in the trained tag fusion model. The first feature update layer in the tag fusion model may be a self-attention Transformer network (i.e., a Transformer neural network using a self-attention mechanism) that updates the title text feature and the content text feature.

In one embodiment, the updating the features of the title text feature and the content text feature based on the similarity of the text features between the title text feature and the content text feature to obtain an updated title text feature and an updated content text feature includes:

determining the text feature similarity between the title text feature and the content text feature; determining, based on the text feature similarity, a first feature weight matched with the title text feature and a second feature weight matched with the content text feature; performing feature updating on the title text feature according to the first feature weight to obtain an updated title text feature; and performing feature updating on the content text feature according to the second feature weight to obtain an updated content text feature.

Specifically, the first feature update layer in the trained tag fusion model, namely the self-attention Transformer network, is used to calculate the text feature similarity between the title text feature and the content text feature, and the first feature weight for updating the title text feature and the second feature weight for updating the content text feature are determined according to the text feature similarity.

Further, after determining a first feature weight matched with the title text feature and a second feature weight matched with the content text feature, using a self-attention transformer network to perform feature update on the title text feature according to the first feature weight to obtain an updated title text feature, and performing feature update on the content text feature according to the second feature weight to obtain an updated content text feature.

When the feature of the title text is updated according to the first feature weight, specifically, a product between the first feature weight and a title feature vector corresponding to the title text feature is calculated to obtain an updated title text feature, and similarly, when the feature of the content text is updated according to the second feature weight, a product between the second feature weight and a content text feature vector corresponding to the content text feature is calculated to obtain an updated content text feature.

Specifically, the following formula (1) is adopted to obtain the updated title text feature and the updated content text feature:

$$\mathrm{Emb\_T}_{title},\ \mathrm{Emb\_T}_{content} = \mathrm{Transformer}(\mathrm{Emb}_{title},\ \mathrm{Emb}_{content}) \tag{1}$$

where $\mathrm{Emb\_T}_{title}$ represents the updated title text feature, $\mathrm{Emb\_T}_{content}$ represents the updated content text feature, $\mathrm{Emb}_{title}$ represents the title text feature, and $\mathrm{Emb}_{content}$ represents the content text feature. $\mathrm{Transformer}(\mathrm{Emb}_{title},\ \mathrm{Emb}_{content})$ denotes performing self-attention interaction on the title text feature and the content text feature with a self-attention Transformer network, so that each feature is updated and the updated title text feature and the updated content text feature are obtained.
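The self-attention interaction between the title text feature and the content text feature can be illustrated with the following minimal Python sketch. It is only an approximation under stated assumptions: a single attention head, identity query/key/value projections (no learned parameters), and hypothetical four-dimensional feature vectors. A real Transformer layer as used by the tag fusion model would additionally contain learned projections, residual connections and feed-forward sublayers.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def self_attention(features):
    """Single-head scaled dot-product self-attention with identity projections.

    Each feature is replaced by a similarity-weighted sum of all features.
    """
    dim = len(features[0])
    updated = []
    for query in features:
        # Similarity of this feature to every feature, itself included.
        scores = [dot(query, key) / math.sqrt(dim) for key in features]
        weights = softmax(scores)
        updated.append([
            sum(w * value[i] for w, value in zip(weights, features))
            for i in range(dim)
        ])
    return updated


# Hypothetical title and content text features.
emb_title = [1.0, 0.0, 1.0, 0.0]
emb_content = [0.0, 1.0, 1.0, 0.0]

emb_t_title, emb_t_content = self_attention([emb_title, emb_content])
```

In this sketch each updated feature stays closest to its own input, because a vector is more similar to itself than to the other vector, while still absorbing part of the other feature.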

Step S204, based on the first feature similarity among the label features, respectively performing feature updating on each label feature to obtain a primary updated label feature.

Specifically, for each target tag feature to be updated in each tag feature, data fusion is performed on the target tag feature and the first feature similarity between each remaining tag feature in each tag feature to obtain a third feature weight matched with each target tag feature, and based on the third feature weight matched with the target tag feature, feature updating is performed on the target tag feature to obtain a primarily updated tag feature.

And performing data fusion according to the plurality of first feature similarities corresponding to the target label features, so as to obtain a third feature weight matched with each target label feature.

Further, in this embodiment, the second feature update layer in the trained tag fusion model, namely a self-attention Transformer network, is used to calculate the first feature similarity between each target tag feature and each of the remaining tag features. The self-attention Transformer network serving as the second feature update layer has the same network structure as the one serving as the first feature update layer, but the two networks have different network parameters.

Similarly, after the first feature similarities are calculated by the self-attention Transformer network, the network averages the plurality of first feature similarities corresponding to each target tag feature to achieve data fusion. The average is the third feature weight matched with that target tag feature, and each target tag feature is updated based on its matched third feature weight to obtain the primary updated tag features.

When the target label feature is updated according to the third feature weight, specifically, a product between the third feature weight and a target label feature vector corresponding to the target label feature is calculated to obtain an initial updated label feature.
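The weight-and-product update described above can be sketched in Python as follows. The sketch assumes cosine similarity as the first feature similarity, which the application does not specify, and the two-dimensional tag features are hypothetical. It follows the described procedure literally: the similarities between a target tag feature and the remaining tag features are averaged into a weight, and the target feature vector is multiplied by that weight.

```python
import math


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    numerator = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return numerator / norm if norm else 0.0


def primary_update(tag_features):
    """Scale each tag feature by its mean similarity to the other tag features."""
    updated = []
    for i, target in enumerate(tag_features):
        sims = [
            cosine(target, other)
            for j, other in enumerate(tag_features)
            if j != i
        ]
        # Averaging fuses the similarities into one weight for this tag.
        weight = sum(sims) / len(sims) if sims else 1.0
        updated.append([weight * x for x in target])
    return updated


# Hypothetical features: the first two tags are identical, the third is unrelated.
tags = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
primary = primary_update(tags)
```

Here the two related tags keep half of their magnitude, while the unrelated tag is suppressed to zero, which reflects the intent of strengthening mutually related candidate tags.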

Specifically, the following formula (2) is adopted to obtain the characteristics of the initial update tag:

$$\mathrm{Emb\_T}_{tag_1},\ \ldots,\ \mathrm{Emb\_T}_{tag_n} = \mathrm{Transformer}(\mathrm{Emb}_{tag_1},\ \ldots,\ \mathrm{Emb}_{tag_n}) \tag{2}$$

where $\mathrm{Emb}_{tag_1}$ to $\mathrm{Emb}_{tag_n}$ represent the n target tag features to be updated, and $\mathrm{Emb\_T}_{tag_1}$ to $\mathrm{Emb\_T}_{tag_n}$ represent the n primary updated tag features obtained after the feature update. $\mathrm{Transformer}(\mathrm{Emb}_{tag_1},\ \ldots,\ \mathrm{Emb}_{tag_n})$ denotes performing self-attention interaction on the n target tag features to be updated with a self-attention Transformer network, so that the target tag features are updated and the n primary updated tag features are obtained.

Step S206, secondarily updating the primary updated tag features according to the co-occurrence relation of each candidate tag in the historical media data to obtain secondary updated tag features.

Specifically, the co-occurrence relationship of each tag in the historical media data is determined by performing data analysis and statistics based on the historical media data including a plurality of tags. And further combining every two candidate tags to obtain a plurality of candidate tag groups, further determining the co-occurrence relation of the two candidate tags in each candidate tag group in the historical media data according to the co-occurrence relation of the tags in the historical media data, and further updating the primary updated tag characteristics for the second time according to the co-occurrence relation of the candidate tags in the historical media data to obtain secondary updated tag characteristics.

In one embodiment, the co-occurrence relationship in the historical media data of the two candidate tags included in a candidate tag group is determined according to the co-occurrence probability between those two candidate tags. For example, if candidate tag A and candidate tag B of a candidate tag group appear together in many pieces of sub-media data of the historical media data, the co-occurrence probability of candidate tag A and candidate tag B is high, and the co-occurrence relationship between them can be further determined from that probability. The co-occurrence probability between candidate tag A and candidate tag B is determined from the number of times the two tags occur together and the larger of their two individual occurrence counts.

Specifically, the co-occurrence probability between candidate tag A and candidate tag B is determined by using the following formula (3):

$$\mathrm{Corr\_prob}(A, B) = \frac{\#\mathrm{Corr\_num}(A, B)}{\max(\#A,\ \#B)} \tag{3}$$

where $\mathrm{Corr\_prob}(A, B)$ represents the co-occurrence probability between candidate tag A and candidate tag B, $\#\mathrm{Corr\_num}(A, B)$ represents the number of times candidate tag A and candidate tag B occur together, and $\max(\#A,\ \#B)$ represents the larger of the occurrence count of candidate tag A and the occurrence count of candidate tag B.

For example, suppose three candidate tags A, B and C appear in sub-media data Item1 of the historical media data, four candidate tags A, C, D and E appear in sub-media data Item2, and two candidate tags A and F appear in sub-media data Item3. The co-occurrence probability between candidate tags A and C is then represented by $\mathrm{Corr\_prob}(A, C)$.

By counting, candidate tags A and C appear together in 2 items (Item1 and Item2), so their co-occurrence count is 2. Candidate tag A appears in all 3 pieces of sub-media data, so its occurrence count is 3, and candidate tag C appears in Item1 and Item2, so its occurrence count is 2. The occurrence count of candidate tag A is therefore the larger one, and $\mathrm{Corr\_prob}(A, C) = 2 / \max(3, 2) = 2/3$.
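The counting in this example can be reproduced with a short Python sketch. It assumes, as the surrounding description indicates, that the co-occurrence probability is the co-occurrence count of two tags divided by the larger of their two occurrence counts. The function name and the data layout (one list of tags per piece of sub-media data) are hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_occurrence_probabilities(items):
    """Return {(tag_x, tag_y): probability} for every tag pair seen together."""
    tag_counts = Counter()
    pair_counts = Counter()
    for tags in items:
        unique = sorted(set(tags))  # count each tag once per item
        tag_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    return {
        pair: count / max(tag_counts[pair[0]], tag_counts[pair[1]])
        for pair, count in pair_counts.items()
    }


history = [
    ["A", "B", "C"],       # Item1
    ["A", "C", "D", "E"],  # Item2
    ["A", "F"],            # Item3
]

probs = co_occurrence_probabilities(history)
prob_a_c = probs[("A", "C")]  # 2 co-occurrences / max(3, 2) occurrences
```

Pairs that never appear together, such as B and F, are simply absent from the result, which corresponds to having no edge between them in a co-occurrence graph.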

Similarly, for other multiple candidate tag groups, the co-occurrence probability between two candidate tags included in each candidate tag group also needs to be determined, so that after the co-occurrence probability between two candidate tags in each candidate tag group is determined, the co-occurrence relationship between two candidate tags in each candidate tag group is further determined according to the co-occurrence probability between two candidate tags in each candidate tag group.

In one embodiment, the primary updated tag feature is updated secondarily by using a trained tag fusion model according to the co-occurrence relationship of each candidate tag in the historical media data, so that a secondary updated tag feature is obtained. When the primary updated label features are updated for the second time, the primary updated label features are updated for the second time according to the graph network feature updating layer of the trained label fusion model, and secondary updated label features are obtained.

The graph network feature updating layer of the label fusion model may specifically be a GAT (Graph Attention Network), a GCN (Graph Convolutional Network), or the like; that is, any network capable of graph-based feature extraction and updating may serve as the graph network feature updating layer of the label fusion model.

Further, the graph network feature updating layer of the label fusion model obtains the secondary updated tag features by using the following formula (4):

$$\mathrm{Emb\_G}_{tag1}, \ldots, \mathrm{Emb\_G}_{tagn} = \mathrm{GAT}\left(\mathrm{Emb\_T}_{tag1}, \ldots, \mathrm{Emb\_T}_{tagn}\right) \quad (4)$$

where $\mathrm{Emb\_T}_{tag1}$ to $\mathrm{Emb\_T}_{tagn}$ denote the n primary updated tag features obtained after the feature update, $\mathrm{Emb\_G}_{tag1}$ to $\mathrm{Emb\_G}_{tagn}$ denote the secondary updated tag features obtained after the secondary update, and $\mathrm{GAT}(\mathrm{Emb\_T}_{tag1}, \ldots, \mathrm{Emb\_T}_{tagn})$ denotes the graph network feature updating layer, here a GAT network, which performs correlation interaction processing on the primary updated tag features so as to update them a second time and obtain n secondary updated tag features.

Step S208, based on the second feature similarity between each secondary updated tag feature and the text feature, determining the target tags whose second feature similarity satisfies the similarity condition among the candidate tags as the tags of the media data.

Because the text comprises at least two types of texts and the text features comprise at least two text features which are in one-to-one correspondence with the at least two types of texts, when the second feature similarity of each secondary updated label feature and the text feature is calculated, the second feature similarity between each secondary updated label feature and the text feature combination is calculated.

Specifically, the text feature combination may include a plurality of text features, and further, for each secondary update label feature in the secondary update label features, it is necessary to determine a sub-similarity between each secondary update label feature and each text feature in the text feature combination, perform data fusion on the sub-similarities corresponding to the same secondary update label feature, and determine a result obtained by the data fusion as a second feature similarity between the secondary update label feature and the text feature combination.

In an embodiment, the media data may specifically include a title text and a content text, so the text features specifically include a title text feature of the title text and a content text feature of the content text. When the second feature similarity between a secondary updated tag feature and the text feature combination is calculated, for each secondary updated tag feature, data fusion is performed on a first sub-similarity between the secondary updated tag feature and the updated title text feature and a second sub-similarity between the secondary updated tag feature and the updated content text feature, so as to obtain the second feature similarity matched with that secondary updated tag feature.

After the second feature similarity of each secondary updated label feature and the text feature is determined, the target similarity meeting the similarity condition is screened from each second feature similarity, the target label corresponding to the target similarity is screened from each candidate label, and each target label is determined as the label of the media data.

Further, when data fusion is performed on the first sub-similarity and the second sub-similarity, specifically, the average value of the first sub-similarity and the second sub-similarity is calculated to obtain the second feature similarity matched with each secondary updated tag feature. The first sub-similarity and the second sub-similarity may be specifically understood as a first cosine similarity between the secondary updated tag feature and the updated title text feature and a second cosine similarity between the secondary updated tag feature and the updated content text feature, respectively.
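The averaging of the two cosine similarities can be illustrated with a short Python sketch. The helper names `cosine` and `second_feature_similarity` and the plain-list representation of feature vectors are assumptions made for illustration; the application does not prescribe an implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def second_feature_similarity(tag_feature, title_feature, content_feature):
    """Average of the tag/title and tag/content cosine similarities."""
    first_sub = cosine(tag_feature, title_feature)      # first sub-similarity
    second_sub = cosine(tag_feature, content_feature)   # second sub-similarity
    return (first_sub + second_sub) / 2.0
```

For instance, a tag feature identical in direction to the title feature and orthogonal to the content feature yields a second feature similarity of 0.5.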

In one embodiment, the step of screening the second feature similarities to obtain the target similarity satisfying the similarity condition includes:

determining a corresponding preset similarity threshold according to a preset similarity condition, and screening the second feature similarities to obtain the target similarities greater than the preset similarity threshold; or sorting the second feature similarities by similarity value to obtain a corresponding feature similarity sequence, and screening a preset number of target similarities out of the feature similarity sequence according to the preset similarity condition.

Specifically, the similarity condition may be a preset similarity threshold, in which case the target similarities greater than the preset similarity threshold are screened out. Alternatively, the similarity condition may select a preset number of top-ranked target similarities from the feature similarity sequence, for example the top M target similarities. M is an integer that can be set and adjusted according to actual requirements or the actual application scenario and is not specifically limited; the feature similarity sequence is obtained by arranging the similarity values from large to small.
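Both forms of the similarity condition can be sketched in a few lines of Python. The function names and the mapping from candidate tag to second feature similarity are hypothetical and serve only to illustrate the two screening modes described above.

```python
def screen_by_threshold(similarities, threshold):
    """Keep the tags whose second feature similarity exceeds the preset threshold."""
    return [tag for tag, sim in similarities.items() if sim > threshold]


def screen_top_m(similarities, m):
    """Sort from large to small and keep the tags of the top M similarities."""
    ranked = sorted(similarities.items(), key=lambda item: item[1], reverse=True)
    return [tag for tag, _ in ranked[:m]]
```

The threshold form returns a variable number of tags per piece of media data, whereas the top-M form always returns at most M tags.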

In one embodiment, the trained label fusion model is used to determine the second feature similarity between each secondary updated tag feature and the text feature. Specifically, the label fusion screening layer of the trained label fusion model screens out, from the candidate tags, the target tags whose second feature similarity satisfies the similarity condition, and the target tags are determined as the tags of the media data.

In one embodiment, after the tags of the media data are determined, the tags may further be added to a recommendation model as content-side features to improve the effect of personalized recommendation. The tags of the media data may also be displayed on the application platform to which the media data belongs, and click operations performed on the displayed tags by use objects of the application platform are detected to extract the points of interest of the use objects and predict their interests or preferences for personalized recommendation, so that the use objects are more likely to click, access, or purchase products or services related to the corresponding media data, further improving the benefit of the application platform.

In the above tag processing method, the text features of the text and the tag feature of each candidate tag contained in the text are extracted from the media data containing the text, and each tag feature is updated based on the first feature similarity among the tag features to obtain the primary updated tag features, which improves the correlation among the primary updated tag features. The primary updated tag features are then updated a second time according to the co-occurrence relationship of the candidate tags in the historical media data to obtain the secondary updated tag features, which further improves the correlation among the secondary updated tag features. Finally, based on the second feature similarity between each secondary updated tag feature and the text feature, the target tags whose second feature similarity satisfies the similarity condition are determined among the candidate tags as the tags of the media data. This improves the degree of association between the determined tags and the media data, avoids the situation in which a use object cannot accurately obtain effective information because the tags are weakly associated with the media data, and further improves the personalized recommendation effect when recommendations are made to the use object based on the media data.

In an embodiment, as shown in FIG. 3, the step of performing a secondary update on the primary updated tag features according to the co-occurrence relationship of each candidate tag in the historical media data to obtain secondary updated tag features specifically includes:

Step S302, constructing a tag co-occurrence relationship graph according to the co-occurrence relationship among the candidate tags in the historical media data.

Specifically, a plurality of candidate tag groups are obtained by pairwise combination of the candidate tags, and a co-occurrence relationship between two candidate tags in each candidate tag group is determined, specifically, the co-occurrence frequency of the two candidate tags in each candidate tag group in the historical media data is obtained, and the co-occurrence probability of the candidate tags in the candidate tag group is determined according to the co-occurrence frequency, so that the co-occurrence relationship between the two candidate tags in the candidate tag group is determined further according to the co-occurrence probability of the candidate tags in the candidate tag group and a preset probability condition.

The co-occurrence relationship between two candidate tags in the candidate tag group is determined according to the co-occurrence probability of the candidate tags in the candidate tag group and a preset probability condition, and for the tag co-occurrence relationship diagram, the connection relationship between the nodes in the diagram can be understood as the co-occurrence relationship between the two candidate tags in the candidate tag group, that is, after the co-occurrence relationship between the two candidate tags in the candidate tag group is determined, the nodes in the tag co-occurrence relationship diagram and the connection relationship between the nodes can be understood as being determined.

Further, after determining the co-occurrence relationship between two candidate tags in the candidate tag group, that is, determining each node in the tag co-occurrence relationship graph and the connection relationship between each node, the tag co-occurrence relationship graph is further constructed according to the nodes and the connection relationship between the nodes.

Step S304, aiming at each candidate label in the label co-occurrence relation graph, obtaining a connection candidate label which has a connection relation with the candidate label in the label co-occurrence relation graph, and determining third feature similarity between the label feature of the connection candidate label and the label feature of the candidate label.

Specifically, for each candidate tag in the tag co-occurrence relationship diagram, a connection candidate tag having a connection relationship with the candidate tag may be obtained according to a connection relationship between nodes, where for each candidate tag, the obtained connection candidate tag having a connection relationship with each candidate tag is different according to a difference in the connection relationship between the nodes.

Further, after a connection candidate tag having a connection relationship with the candidate tag is obtained, the third feature similarity between the tag feature of the connection candidate tag and the tag feature of the candidate tag is calculated from the two tag features. Since one candidate tag may have a plurality of connection candidate tags connected to it, one or more third feature similarities may be calculated accordingly.

Step S306, determining, according to the third feature similarity matched with the candidate tag, the fourth feature weight matched with the primary updated tag feature of the candidate tag.

Specifically, when one candidate tag has a plurality of connection candidate tags connected to it, a plurality of third feature similarities are matched with it. Data fusion is achieved by averaging these third feature similarities between the tag features of the connection candidate tags and the tag feature of the candidate tag, and the resulting average third feature similarity is determined as the fourth feature weight matched with the primary updated tag feature of the candidate tag.

Further, in this embodiment, the graph network feature updating layer of the trained label fusion model calculates the third feature similarity between the tag feature of each connection candidate tag and the tag feature of the candidate tag, averages these third feature similarities to obtain a third feature similarity mean value, and determines the third feature similarity mean value as the fourth feature weight matched with the primary updated tag feature of the candidate tag.

Step S308, performing a secondary update on the primary updated tag feature of the candidate tag based on the fourth feature weight matched with the primary updated tag feature to obtain a secondary updated tag feature.

Specifically, according to a graph network feature updating layer of a trained label fusion model, after a fourth feature weight matched with a primary updated label feature is determined, the primary updated label feature is further updated for the second time according to the fourth feature weight, and a secondary updated label feature is obtained.

Specifically, since the graph network feature updating layer may be a GAT network, in this embodiment the fourth feature weight matched with the primary updated tag feature is determined based on the GAT network, and the primary updated tag feature is updated a second time according to the fourth feature weight to obtain the secondary updated tag feature.

When the primary updated label feature is updated for the second time according to the fourth feature weight, specifically, a product between the fourth feature weight and the primary updated label feature vector corresponding to the primary updated label is calculated to obtain a secondary updated label feature.
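Steps S304 to S308 can be sketched in Python as follows. This is a simplified illustration rather than the GAT layer itself: it assumes cosine similarity as the third feature similarity, computes that similarity on the primary updated features, and leaves tags without any connection unchanged. The names `cosine` and `secondary_update` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def secondary_update(primary_features, edges):
    """Scale each primary updated feature by the mean similarity to its neighbours."""
    neighbours = {tag: set() for tag in primary_features}
    for a, b in edges:  # undirected connections of the co-occurrence graph
        neighbours[a].add(b)
        neighbours[b].add(a)

    updated = {}
    for tag, feature in primary_features.items():
        linked = neighbours[tag]
        if not linked:
            # Assumption: a tag with no connection keeps its primary updated feature.
            updated[tag] = list(feature)
            continue
        # Third feature similarities between the tag and each connected tag.
        sims = [cosine(feature, primary_features[other]) for other in linked]
        weight = sum(sims) / len(sims)  # fourth feature weight (mean value)
        updated[tag] = [weight * x for x in feature]  # product with the feature vector
    return updated
```

A trained GAT layer would instead learn its attention coefficients; the sketch only mirrors the averaging and product described in the text.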

In this embodiment, a tag co-occurrence relationship graph is constructed according to the co-occurrence relationship among the candidate tags in the historical media data; for each candidate tag in the graph, the connection candidate tags having a connection relationship with that candidate tag are obtained, and the third feature similarity between the tag feature of each connection candidate tag and the tag feature of the candidate tag is determined. The fourth feature weight matched with the primary updated tag feature of the candidate tag is then determined according to the third feature similarities matched with the candidate tag, and the primary updated tag feature of the candidate tag is updated a second time based on that fourth feature weight to obtain the secondary updated tag feature. Because the connection candidate tags are determined from the co-occurrence relationship graph and the secondary update uses the fourth feature weight derived from the third feature similarities, the correlation among the secondary updated tag features is further improved. When the tags of the media data are subsequently screened, the degree of association between the determined tags and the media data is therefore improved, which avoids the situation in which a use object cannot accurately obtain effective information because of weakly associated tags.

In an embodiment, as shown in FIG. 4, the step of constructing a tag co-occurrence relationship graph according to the co-occurrence relationships among the candidate tags in the historical media data specifically includes:

Step S402, combining the candidate tags in pairs to obtain a plurality of candidate tag groups.

Specifically, after the title text feature of the corresponding title text and the content text feature of the content text are extracted from the media data containing the text, label identification is further performed on the title text and the content text respectively to obtain a label in the title text and a label in the content text, and the label in the title text and the label in the content text are used as candidate labels.

Further, the candidate tags are combined in pairs to obtain a plurality of candidate tag groups. For example, if there are 4 candidate tags, namely tag A, tag B, tag C, and tag D, combining the 4 candidate tags in pairs yields the candidate tag groups AB, AC, AD, BC, BD, and CD.
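The pairwise combination of the four example tags can be reproduced with the Python standard library; the variable names are illustrative only.

```python
from itertools import combinations

candidate_tags = ["A", "B", "C", "D"]

# Every unordered pair of distinct candidate tags forms one candidate tag group.
candidate_tag_groups = ["".join(pair) for pair in combinations(candidate_tags, 2)]

print(candidate_tag_groups)  # ['AB', 'AC', 'AD', 'BC', 'BD', 'CD']
```

For n candidate tags this yields n(n-1)/2 candidate tag groups, here 6.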

Step S404, acquiring the co-occurrence frequency of the two candidate tags in each candidate tag group in the historical media data.

Specifically, data analysis and statistics are performed on historical media data containing a plurality of tags to determine the co-occurrence relationship of the tags in the historical media data, and the number of co-occurrences of the two candidate tags in each candidate tag group in the historical media data is determined on that basis.

Step S406, determining the co-occurrence probability of the candidate tags in the candidate tag group according to the co-occurrence times.

Specifically, the co-occurrence probability of the candidate tags in the candidate tag group is calculated according to the co-occurrence frequency of the two candidate tags in each candidate tag group in the historical media data and the respective occurrence frequency of the two candidate tags in each candidate tag group in the historical media data.

Specifically, the co-occurrence probability of the two candidate tags in a candidate tag group, for example candidate tag A and candidate tag B, is determined according to the number of co-occurrences of candidate tag A and candidate tag B and the larger of the occurrence counts of candidate tag A and candidate tag B.

For example, if candidate tag A occurs more often in the historical media data, the co-occurrence probability of candidate tag A and candidate tag B is calculated according to the number of co-occurrences of candidate tag A and candidate tag B and the number of occurrences of candidate tag A.

Similarly, if candidate tag B occurs more often in the historical media data, the co-occurrence probability of candidate tag A and candidate tag B is calculated according to the number of co-occurrences of candidate tag A and candidate tag B and the number of occurrences of candidate tag B.

Step S408, performing deduplication processing on the candidate tags contained in the target tag groups, that is, the candidate tag groups whose co-occurrence probability meets the probability condition, determining the candidate tags obtained after the deduplication processing as nodes of a tag co-occurrence relationship graph, and determining the connection relationship among the nodes based on the target tag groups.

Specifically, according to a preset probability condition, screening the co-occurrence probability between two candidate tags in each candidate tag group, screening a target tag group with the co-occurrence probability meeting the probability condition, performing deduplication processing on the candidate tags contained in the target tag group, obtaining the candidate tags obtained after the deduplication processing, and determining the candidate tags obtained after the deduplication processing as nodes of a tag co-occurrence relation graph.

Further, the connection relationship among the nodes is determined based on the target tag groups meeting the probability condition and the candidate tags obtained after the deduplication processing, that is, the nodes of the tag co-occurrence relationship graph. The probability condition can be set and adjusted according to actual requirements and is not limited to a specific value; for example, the probability condition may be that the co-occurrence probability is greater than 0, that is, the co-occurrence probability between the two candidate tags in a target tag group is greater than 0.

It can be understood that the co-occurrence probability between the two candidate tags in each target tag group is greater than 0, and each node in the tag co-occurrence relationship graph is obtained by performing deduplication processing on the candidate tags contained in the target tag groups. The connection relationship among the nodes can therefore be determined from the target tag groups meeting the probability condition: a connection is established between two candidate tags whose co-occurrence probability is greater than 0, and no connection is established between two candidate tags whose co-occurrence probability is not greater than 0.

Step S410, constructing a tag co-occurrence relationship graph based on the nodes and the connection relationship among the nodes.

Specifically, deduplication processing is performed on the candidate tags contained in the target tag groups, and the candidate tags obtained after the deduplication processing are determined as nodes of the tag co-occurrence relationship graph. The connection relationship among the nodes is determined based on the target tag groups, and the tag co-occurrence relationship graph is constructed based on the nodes and the connection relationship among the nodes.

In an embodiment, as shown in FIG. 5, an initial co-occurrence relationship graph established based on the candidate tags and the co-occurrence probabilities between the candidate tags is provided. As can be seen from FIG. 5, there are, for example, 5 candidate tags A, B, C, D, and E, the candidate tag groups are AB, BD, BC, EC, and ED, and the co-occurrence probability between the two candidate tags in each candidate tag group is calculated. Specifically, the co-occurrence probability between A and B in the candidate tag group AB is 0.6, the co-occurrence probability between B and C in the candidate tag group BC is 0, the co-occurrence probability between B and D in the candidate tag group BD is 0.4, the co-occurrence probability between E and C in the candidate tag group EC is 0.3, and the co-occurrence probability between E and D in the candidate tag group ED is 0.8.

Further, based on the candidate tag groups and the co-occurrence probability between the two candidate tags in each candidate tag group, the initial co-occurrence relationship graph shown in FIG. 5 is constructed.

In an embodiment, as shown in FIG. 6, a tag co-occurrence relationship graph obtained after screening based on a probability condition is provided. Referring to FIG. 6, the candidate tag groups need to be further screened according to a preset probability condition. For example, if the probability condition is that the co-occurrence probability is greater than 0, the candidate tag groups AB, BD, BC, EC, and ED are screened according to the probability condition to obtain the target tag groups AB, BD, EC, and ED, in which the co-occurrence probability between the contained candidate tags is greater than 0.

Further, according to the target tag groups AB, BD, EC, and ED, the connection relationship among the candidate tags A, B, C, D, and E is determined, and the edge weight between two candidate tags having a connection relationship is set to 1, so as to obtain the tag co-occurrence relationship graph shown in FIG. 6.
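The screening from FIG. 5 to FIG. 6 can be sketched in Python using the co-occurrence probabilities listed above. The function name `build_cooccurrence_graph`, the `min_prob` parameter, and the dictionary-based graph representation are assumptions made for illustration.

```python
# Co-occurrence probabilities of the candidate tag groups in the FIG. 5 example.
co_occurrence_prob = {
    ("A", "B"): 0.6,
    ("B", "C"): 0.0,
    ("B", "D"): 0.4,
    ("E", "C"): 0.3,
    ("E", "D"): 0.8,
}


def build_cooccurrence_graph(probabilities, min_prob=0.0):
    """Keep groups whose probability exceeds min_prob; return (nodes, edges)."""
    target_groups = [pair for pair, p in probabilities.items() if p > min_prob]
    # A set removes duplicate tags that occur in several target tag groups.
    nodes = sorted({tag for pair in target_groups for tag in pair})
    # Each retained group becomes an undirected edge with edge weight 1.
    edges = {frozenset(pair): 1 for pair in target_groups}
    return nodes, edges


nodes, edges = build_cooccurrence_graph(co_occurrence_prob)
```

With the probability condition "greater than 0", group BC is dropped while all five candidate tags remain as nodes, because each of them still belongs to at least one target tag group.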

In this embodiment, a plurality of candidate tag groups are obtained by combining the candidate tags in pairs, the number of co-occurrences of the two candidate tags in each candidate tag group in the historical media data is obtained, and the co-occurrence probability of the candidate tags in each candidate tag group is determined according to the number of co-occurrences. Deduplication processing is then performed on the candidate tags contained in the target tag groups, that is, the candidate tag groups whose co-occurrence probability meets the probability condition; the candidate tags obtained after the deduplication processing are determined as nodes of the tag co-occurrence relationship graph, the connection relationship among the nodes is determined based on the target tag groups, and the tag co-occurrence relationship graph is constructed based on the nodes and the connection relationship among the nodes. Because the nodes of the tag co-occurrence relationship graph and the target tag groups that determine the connection relationship among the nodes are screened according to the co-occurrence probability between candidate tags and the preset probability condition, the degree of association among the candidate tags is improved. The primary updated tag features of the candidate tags can subsequently be updated a second time on the basis of the tag co-occurrence relationship graph, which improves the correlation among the secondary updated tag features and further improves the degree of association between the subsequently determined tags of the media data and the media data.

In an embodiment, as shown in FIG. 7, a training process of a label fusion model is provided, which specifically includes the following steps:

Step S702, obtaining pre-labeled media data samples.

Specifically, to train the original label fusion model, pre-labeled media data samples need to be obtained; the original label fusion model is then trained with the media data samples as the training sample set to obtain the trained label fusion model.

The media data sample may be a news data sample, or may be other types of data, such as page sharing data of a communication application platform and a social sharing platform, that is, data that includes title content and text content and is capable of performing operations such as tag processing and tag display may be pre-labeled to obtain the media data sample.

Further, the pre-labeling represents that the original label correlation degree between each candidate label and the text feature in the media data sample is obtained through pre-calculation on the basis of each media data sample, and the original label correlation degree is added to the corresponding media data sample.

Step S704 is to extract sample text features of the text and sample label features of each candidate label included in the text from each media data sample including the text.

Specifically, the media data sample may specifically include a title text and a content text, and the sample text features include a sample title text feature corresponding to the title text and a sample content text feature corresponding to the content text. And the candidate tags are specifically determined from the texts included in the media data, and when the texts included in the media data are the title text and the content text, the candidate tags specifically include tags obtained by tag identification of the title text and tags obtained by tag identification of the content text, and further by feature extraction of the tags in the title text and the tags in the content text, respective sample tag features of each candidate tag can be obtained.

Specifically, the sample title text features, the sample content text features, and the sample tag features of each candidate tag contained in the text are extracted through a text feature extraction layer of the original label fusion model.

In one embodiment, after obtaining the sample title text feature and the sample content text feature of the text, the method further includes:

determining text feature similarity between the sample title text features and the sample content text features; determining a first feature weight matched with the text feature of the sample title and a second feature weight matched with the text feature of the sample content based on the text feature similarity; and according to the second characteristic weight, carrying out characteristic updating on the sample content text characteristic to obtain an updated sample content text characteristic.

Specifically, a first feature updating layer in the original label fusion model, namely a self-attention transformer network, is used for calculating text feature similarity between sample title text features and sample content text features, determining first feature weights for performing feature updating on the sample title text features and second feature weights for performing feature updating on the sample content text features according to the text feature similarity, further performing feature updating on the sample title text features according to the first feature weights to obtain updated sample title text features, and performing feature updating on the sample content text features according to the second feature weights to obtain updated sample content text features.

Step S706, based on the first feature similarity between the sample label features, respectively performing feature update on each sample label feature to obtain a primary updated sample label feature.

Specifically, for each target sample tag feature to be updated among the sample tag features, data fusion is performed on the first feature similarities between the target sample tag feature and each of the remaining sample tag features to obtain a third feature weight matched with the target sample tag feature, and the target sample tag feature is updated based on that third feature weight to obtain a primary updated sample tag feature.

Further, a second feature updating layer in the original label fusion model, that is, a self-attention Transformer network, is used to calculate the first feature similarities between each target sample tag feature and each of the remaining sample tag features. The self-attention Transformer network then averages the plurality of first feature similarities corresponding to the target sample tag feature to achieve data fusion, and the resulting average is the third feature weight matched with the target sample tag feature. Each target sample tag feature is then updated based on its matched third feature weight to obtain the primary updated sample tag features.

The network structure of the self-attention-using transform network characterized by the second feature update layer is consistent with that of the self-attention-using transform network characterized by the first feature update layer, but the network parameters of the two networks are different.
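A minimal sketch of the averaging step follows. It takes one literal reading of the paragraph above: the first feature similarities are cosine similarities, their mean is the third feature weight, and the weight rescales the target feature. The similarity function and the rescaling rule are assumptions; a real self-attention transformer layer would use learned projections and a weighted sum over all label features.

```python
import math


def cosine(a, b):
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den if den else 0.0


def primary_update(tag_features):
    """Return primary updated label features (assumed rescaling rule)."""
    updated = []
    for i, target in enumerate(tag_features):
        others = [f for j, f in enumerate(tag_features) if j != i]
        # First feature similarities between the target and each remaining label.
        sims = [cosine(target, other) for other in others]
        # Averaging is the data fusion step; the mean is the third feature weight.
        weight = sum(sims) / len(sims) if sims else 1.0
        # Assumption: the third feature weight rescales the target feature.
        updated.append([weight * x for x in target])
    return updated


# Toy stand-ins for Emb_tag1..Emb_tag3 (hypothetical values).
emb_tags = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
emb_t_tags = primary_update(emb_tags)
```

Under this reading, a label that resembles the other candidate labels keeps most of its magnitude, while a label unrelated to all others is damped.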

Step S708, performing secondary updating on the primary updated sample label features according to the co-occurrence relation of the candidate labels in the historical media data to obtain secondary updated sample label features.

Specifically, a label co-occurrence relation graph is constructed according to the co-occurrence relation among the candidate labels in the historical media data. For each candidate label in the label co-occurrence relation graph, the connection candidate labels that have a connection relation with the candidate label in the graph are obtained, and a third feature similarity between the sample label features of each connection candidate label and the sample label features of the candidate label is determined. A fourth feature weight matched with the primary updated sample label feature of the candidate label is then determined according to the third feature similarity matched with the candidate label, and the primary updated sample label feature of the candidate label is secondarily updated based on the matched fourth feature weight to obtain a secondary updated sample label feature.

Specifically, the fourth feature weight matched with the primary updated sample label feature is determined by the graph network feature updating layer of the original label fusion model, and the primary updated sample label feature is then secondarily updated according to the fourth feature weight to obtain the secondary updated sample label feature.
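The secondary update over the label co-occurrence relation graph can be sketched as a simplified graph-attention step. The sketch assumes dot-product similarity, softmax normalisation, and a self-loop on every node so that isolated labels keep their feature; none of these details are stated in the text, and the function name and toy data are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def secondary_update(tag_feats, primary_feats, neighbors):
    """Aggregate primary updated features over graph neighbours.

    tag_feats:     label -> original label feature (used for similarity)
    primary_feats: label -> primary updated label feature (aggregated)
    neighbors:     label -> list of connected labels in the co-occurrence graph
    """
    updated = {}
    for tag, primary in primary_feats.items():
        # Assumption: add a self-loop so an isolated label is left unchanged.
        linked = [tag] + [n for n in neighbors.get(tag, []) if n in primary_feats]
        # Third feature similarity between the label and each connected label.
        sims = [dot(tag_feats[tag], tag_feats[n]) for n in linked]
        peak = max(sims)
        exps = [math.exp(s - peak) for s in sims]
        total = sum(exps)
        weights = [e / total for e in exps]  # fourth feature weights
        updated[tag] = [
            sum(w * primary_feats[n][i] for w, n in zip(weights, linked))
            for i in range(len(primary))
        ]
    return updated


# Toy data (hypothetical labels and values).
emb_tag = {"nba": [1.0, 0.0], "basketball": [0.9, 0.1], "finance": [0.0, 1.0]}
emb_t_tag = {"nba": [0.5, 0.0], "basketball": [0.4, 0.1], "finance": [0.0, 0.6]}
graph = {"nba": ["basketball"], "basketball": ["nba"], "finance": []}
emb_g_tag = secondary_update(emb_tag, emb_t_tag, graph)
```

In this toy graph, "finance" has no neighbours and keeps its primary updated feature, while "nba" and "basketball" are pulled toward each other.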

Step S710, determining a second feature similarity between each secondary update sample label feature and the sample text feature.

Specifically, the media data sample may include a title text and a content text, and the sample text features include the sample title text feature of the title text and the sample content text feature of the content text. For each secondary updated sample label feature, data fusion is performed on a first sub-similarity between the secondary updated sample label feature and the updated sample title text feature and a second sub-similarity between the secondary updated sample label feature and the updated sample content text feature, so as to obtain the second feature similarity matched with each secondary updated sample label feature.
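The fusion of the two sub-similarities can be sketched as follows. The averaging matches the mean-based fusion described for the model of fig. 8; the use of cosine similarity for the sub-similarities is an assumption, and the function names and toy vectors are hypothetical.

```python
import math


def cosine(a, b):
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den if den else 0.0


def second_feature_similarity(tag_feat, title_feat, content_feat):
    """Mean of the label-title and label-content sub-similarities."""
    score_tt = cosine(tag_feat, title_feat)    # first sub-similarity
    score_ct = cosine(tag_feat, content_feat)  # second sub-similarity
    return (score_tt + score_ct) / 2.0


# Toy stand-ins for Emb_G_tag, Emb_T_title and Emb_T_content.
mean_score = second_feature_similarity([1.0, 0.0], [1.0, 0.0], [0.0, 1.0])
```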

Step S712, determining the prediction loss value corresponding to each second feature similarity.

Specifically, in the training process of the original label fusion model, the loss value calculation needs to be performed on the output result obtained by the prediction of the model, that is, the second feature similarity, so as to obtain the prediction loss value corresponding to the second feature similarity. And when the model convergence condition is determined to be reached according to the prediction loss value, finishing the training of the original label fusion model to obtain a trained label fusion model.

In one embodiment, the underlying model used to obtain the labels of the media data may be updated frequently, so underlying models of different versions may output different label sets for the same media data. For example, the label set that was annotated may be set A, output by version 1 of the underlying model; if the underlying model is updated to version 2 and its output becomes set B, set A and set B differ. The original label fusion model is usually trained with the latest set B, but newly added labels that are in set B and not in set A were never annotated in advance and therefore do not carry an original label correlation degree.

If the newly added labels that are in set B but not in set A were simply deleted during model training, their contribution to the features of the current sample labels would be lost when the features are updated. Therefore, two different ways of calculating the prediction loss value are provided, depending on whether newly added labels exist, so as to avoid errors in the prediction loss value caused by newly added labels that do not carry an original label correlation degree.

Specifically, under the condition that no newly added label is detected, the original label correlation degree pre-labeled based on each candidate label is obtained, and the prediction loss value is determined according to each second feature similarity degree and each original label correlation degree.

Specifically, the predicted loss value MSEloss is determined by using the following formula (5):

$$\mathrm{MSEloss} = (\mathrm{mean\_score} - \mathrm{label})^{2} \qquad (5)$$

wherein MSEloss represents the prediction loss value, mean_score represents the second feature similarity, and label represents the original label correlation degree pre-labeled for the candidate label.

Further, under the condition that the newly added tags are detected and do not carry the corresponding original tag correlation degrees, the preset fixed value is determined as the prediction loss value corresponding to each newly added tag.

It can be understood that, in order to solve the training problem caused by newly added labels that do not carry an original label correlation degree, when the prediction loss value MSEloss is calculated, a preset fixed value is determined as the prediction loss value corresponding to each newly added label that does not carry an original label correlation degree.

Specifically, the preset fixed value may be 0. That is, for a newly added label that does not carry an original label correlation degree, the prediction loss value MSEloss is not calculated and is set to 0, so that the complete set B is retained and the newly added labels are not deleted.
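The two loss cases can be sketched together. Formula (5) is applied to labels that carry an original label correlation degree, and the preset fixed value (0 here) is used for newly added labels. Representing a missing correlation degree as `None` is an assumption made for illustration, as are the function name and the toy values.

```python
def prediction_losses(mean_scores, labels, fixed_value=0.0):
    """Per-label prediction loss values.

    mean_scores: second feature similarity of each candidate label
    labels:      original label correlation degree, or None for a newly
                 added label that was never annotated (assumed encoding)
    """
    losses = []
    for score, label in zip(mean_scores, labels):
        if label is None:
            # Newly added label: use the preset fixed value instead of formula (5).
            losses.append(fixed_value)
        else:
            # Formula (5): MSEloss = (mean_score - label)^2
            losses.append((score - label) ** 2)
    return losses


# The third label is newly added and has no annotation.
losses = prediction_losses([0.9, 0.4, 0.7], [1.0, 0.0, None])
```

With a fixed value of 0, the newly added label still takes part in feature updating but contributes nothing to the loss, which is the behaviour the embodiment describes.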

Step S714, if the prediction loss value reaches the model convergence condition, finishing the training of the original label fusion model to obtain the trained label fusion model.

Specifically, whether the predicted loss value is smaller than the corresponding loss value threshold is judged by obtaining the loss value threshold corresponding to the model convergence condition and comparing the predicted loss value with the loss value threshold. When the predicted loss value is smaller than the loss value threshold value, the predicted loss value is shown to reach the model convergence condition, the training of the original label fusion model is completed, the trained label fusion model is obtained, otherwise, when the predicted loss value is larger than or equal to the loss value threshold value, the predicted loss value is shown to not reach the model convergence condition, and the model needs to be further trained.
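The convergence check can be sketched as a simple loop. The strict "smaller than the threshold" comparison follows the text; the `max_steps` cap and the fake training step are hypothetical additions for illustration only.

```python
def has_converged(predicted_loss, loss_threshold):
    # The convergence condition is met only when the loss is strictly smaller.
    return predicted_loss < loss_threshold


def train_until_converged(step_fn, loss_threshold, max_steps=1000):
    """Call step_fn until its returned loss meets the convergence condition.

    max_steps is an assumed safety cap that the text does not mention.
    """
    loss = None
    for step in range(1, max_steps + 1):
        loss = step_fn()
        if has_converged(loss, loss_threshold):
            return step, loss
    return max_steps, loss


def make_fake_step(start=1.0):
    """Stand-in for one training step: halves the loss on every call."""
    state = {"loss": start}

    def step():
        state["loss"] *= 0.5
        return state["loss"]

    return step


steps, final_loss = train_until_converged(make_fake_step(), loss_threshold=0.1)
```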

In one embodiment, the label fusion models obtained with different training strategies are evaluated for accuracy and recall rate on a pre-labeled evaluation data set, as shown in table 1 below (a model effect comparison table based on different training strategies):

TABLE 1 model effect comparison table based on different training strategies

In table 1, "both the content side and the label side only use the BERT network to obtain the original feature vectors" means that the content side does not use the transformer network and only uses the original feature vectors obtained by the BERT network (i.e., Emb_title and Emb_content), and the label side uses neither the transformer network nor the GAT network and only uses the original feature vectors obtained by the BERT network (i.e., Emb_tag); "only the content side adds the transformer network" means that the content side uses the feature vectors obtained by the transformer network (i.e., Emb_T_title and Emb_T_content), and the label side still uses the original feature vectors obtained by the BERT network (i.e., Emb_tag); "only the label side adds the transformer network" means that the content side still uses only the original feature vectors of the BERT network (i.e., Emb_title and Emb_content), and the label side uses the feature vectors obtained by the transformer network (i.e., Emb_T_tag); "the content side and the label side both add the transformer network" means that the content side and the label side use the feature vectors obtained by the transformer network (i.e., Emb_T_title, Emb_T_content and Emb_T_tag); "the content side and the label side both add the transformer network and the label side adds the GAT network" means that the content side uses the feature vectors obtained by the transformer network (i.e., Emb_T_title and Emb_T_content), and the label side uses the feature vectors obtained by transformer network processing followed by GAT network processing (i.e., Emb_G_tag).

Further, the accuracy is fixed at 85%, and the model recall rate and the average number of labels under the different training strategies are evaluated at that accuracy. As can be seen from table 1, when both the content side and the label side only use the BERT network to obtain the original feature vectors, the recall rate is the lowest and the average number of labels is the smallest. As transformer network processing and GAT network processing are successively added on the content side and the label side, the recall rate and the average number of labels gradually increase. That is, the more complete the training strategy, the larger the average number of labels and the higher the recall rate of the trained model, which indicates a better trained model. The labels that the trained model selects for media data are therefore more strongly associated with the media data, which avoids the situation where user objects cannot accurately obtain effective information because of low label association, and further improves the personalized recommendation effect when recommendations are made to each user object based on the media data.

In this embodiment, pre-labeled media data samples are obtained, and from each media data sample containing a text, the sample text features of the text and the sample label feature of each candidate label contained in the text are respectively extracted. Each sample label feature is then updated based on the first feature similarity between the sample label features to obtain primary updated sample label features. Further, the primary updated sample label features are secondarily updated according to the co-occurrence relation of the candidate labels in the historical media data to obtain secondary updated sample label features, and the second feature similarity between each secondary updated sample label feature and the sample text features is determined. When a newly added label is detected and does not carry a corresponding original label correlation degree, a preset fixed value is determined as the prediction loss value corresponding to that newly added label. By determining the prediction loss value differently depending on whether newly added labels exist, the complete label set can be retained and newly added labels need not be deleted, and the prediction loss value is not disturbed by newly added labels that lack an original label correlation degree, which reduces calculation error and safeguards the accuracy of the prediction loss value. When the prediction loss value reaches the model convergence condition, the training of the original label fusion model is completed and the trained label fusion model is obtained, which improves the training efficiency of the model and reduces resource consumption.

In an embodiment, as shown in fig. 8, an architecture diagram of a label fusion model is provided, and the label processing method in the embodiments of the present application is implemented by the label fusion model shown in fig. 8, that is, the labels of media data are determined by the label fusion model shown in fig. 8. As can be seen from fig. 8, the label fusion model has a two-tower network structure whose bottom layer takes two towers as input, namely a content side and a label side. The content side consists of the title text (i.e., title) and the content text (i.e., content) of the media data sample, and the label side consists of the plurality of candidate labels (i.e., tag_1, ..., tag_n) obtained by performing label identification on the title text and the content text respectively.

Firstly, in the training process of the model, the title text, the content text and the candidate labels in the media data sample are all input in text form, and the text feature extraction layer in the label fusion model, namely the BERT network, is used to extract the sample title text feature of the title text (i.e., Emb_title), the sample content text feature of the content text (i.e., Emb_content), and the sample label feature of each candidate label (i.e., Emb_tag1, ..., Emb_tagn).

Secondly, since the title text and the content text both come from the content side, the first feature updating layer of the label fusion model, namely a self-attention transformer network, performs self-attention interaction on the sample title text feature and the sample content text feature to update each of them, obtaining the updated sample title text feature (i.e., Emb_T_title) and the updated sample content text feature (i.e., Emb_T_content).

Specifically, a first feature weight matched with the sample title text feature and a second feature weight matched with the sample content text feature are determined by the self-attention transformer network. The sample title text feature is then updated according to the first feature weight to obtain the updated sample title text feature, and the sample content text feature is updated according to the second feature weight to obtain the updated sample content text feature.

Similarly, the sample label features of the candidate labels include a plurality of target sample label features that need to be updated, and the second feature updating layer in the label fusion model, namely a self-attention transformer network, is used to calculate the first feature similarity between each target sample label feature and each of the remaining sample label features. The self-attention transformer network serving as the second feature updating layer has the same network structure as the self-attention transformer network serving as the first feature updating layer, but the network parameters of the two networks are different.

Further, after the first feature similarities are calculated, the self-attention transformer network averages the plurality of first feature similarities corresponding to each target sample label feature to achieve data fusion, and the averaged result is the third feature weight matched with that target sample label feature. Each target sample label feature (i.e., Emb_tag1, ..., Emb_tagn) is then updated based on its matched third feature weight to obtain the primary updated sample label features (i.e., Emb_T_tag1, ..., Emb_T_tagn).

Thirdly, for the primary updated sample label features obtained on the label side, the graph network feature updating layer of the label fusion model, specifically a GAT network, performs secondary updating on the primary updated sample label features (i.e., Emb_T_tag1, ..., Emb_T_tagn) according to the co-occurrence relation of the candidate labels in the historical media data to obtain the secondary updated sample label features (i.e., Emb_G_tag1, ..., Emb_G_tagn).

Before the primary updated sample label features are secondarily updated by the graph network feature updating layer of the label fusion model, a label co-occurrence relation graph needs to be constructed. Based on the label co-occurrence relation graph, the connection candidate labels that have a connection relation with each candidate label in the graph are obtained, and a third feature similarity between the label features of the connection candidate labels and the label features of the candidate label is determined. A fourth feature weight matched with the primary updated label feature of the candidate label is then determined according to the third feature similarity matched with the candidate label. The fourth feature weight matched with the primary updated label feature is used to secondarily update the primary updated label feature of the candidate label to obtain the secondary updated label feature.

Specifically, data analysis and statistics are performed on the historical media data including a plurality of tags to determine the co-occurrence relationship of each tag in the historical media data. The co-occurrence relation of each label in the historical media data is used for determining the co-occurrence frequency of every two candidate labels in the historical media data.

Further, the candidate labels are combined in pairs to obtain a plurality of candidate label groups, the co-occurrence frequency in the historical media data of the two candidate labels in each candidate label group is obtained, and the co-occurrence probability of the candidate labels in the candidate label group is determined according to the co-occurrence frequency. Deduplication is performed on the candidate labels contained in the target label groups, namely the candidate label groups whose co-occurrence probability meets the probability condition, the candidate labels remaining after deduplication are determined as the nodes of the label co-occurrence relation graph, and the connection relation among the nodes is determined based on the target label groups, so that the label co-occurrence relation graph can be constructed based on the nodes and the connection relation among the nodes.

Fourthly, for each secondary updated sample label feature (i.e., Emb_G_tag1, ..., Emb_G_tagn), a first sub-similarity (i.e., score_tt) between the secondary updated sample label feature and the updated sample title text feature (i.e., Emb_T_title) and a second sub-similarity (i.e., score_ct) between the secondary updated sample label feature and the updated sample content text feature (i.e., Emb_T_content) need to be calculated separately.

Specifically, the data fusion between the first sub-similarity and the second sub-similarity is realized by calculating a mean value of the first sub-similarity and the second sub-similarity, so as to obtain a second feature similarity (mean _ score) matched with each secondary update label feature.

Fifthly, in the case where no newly added label is detected, the loss value determination layer of the label fusion model determines the prediction loss value corresponding to each candidate label (i.e., MSEloss_1, ..., MSEloss_n) based on the original label correlation degree pre-labeled for each candidate label (i.e., label_1, ..., label_n) and each second feature similarity (i.e., mean_score_1, ..., mean_score_n).

And determining a preset fixed value as a prediction loss value corresponding to each newly added label under the condition that the newly added label is detected and does not carry the corresponding original label correlation.

And sixthly, judging whether the predicted loss value is smaller than the corresponding loss value threshold value or not by obtaining the loss value threshold value corresponding to the model convergence condition and comparing the predicted loss value with the loss value threshold value. When the predicted loss value is smaller than the loss value threshold value, the predicted loss value is shown to reach the model convergence condition, the training of the original label fusion model is completed, and the trained label fusion model is obtained.

Further, label processing is performed on media data including a text through a trained label fusion model to obtain labels of the media data, specifically, text features of the text and label features of candidate labels included in the text are respectively extracted from the media data including the text through the trained label fusion model, and feature updating is performed on the label features respectively based on first feature similarity between the label features to obtain initial updated label features. And then updating the primary updated label features for the second time according to the co-occurrence relation of the candidate labels in the historical media data to obtain secondary updated label features, and determining the target label of which the second feature similarity meets the similarity condition in the candidate labels as the label of the media data based on the second feature similarity of each secondary updated label feature and the text feature.
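The final screening step at inference time can be sketched as follows. The text only requires that the second feature similarity "meets the similarity condition"; treating that condition as a fixed threshold of 0.5 is an assumption, and the function name and toy data are hypothetical.

```python
def select_tags(candidate_tags, second_similarities, threshold=0.5):
    """Keep candidate labels whose second feature similarity meets the condition.

    Assumption: the similarity condition is `similarity >= threshold`.
    """
    return [
        tag
        for tag, score in zip(candidate_tags, second_similarities)
        if score >= threshold
    ]


# Toy candidate labels with their mean_score values.
tags = select_tags(["nba", "basketball", "finance"], [0.82, 0.64, 0.21])
```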

In the label processing method, the trained label processing model is utilized to respectively extract the text features of the text and the respective label features of each candidate label contained in the text from the media data containing the text, so that each label feature is respectively subjected to feature updating based on the first feature similarity between the label features to obtain the initial updated label features, and the correlation degree between the obtained initial updated label features is improved. And performing secondary updating on the primary updated label features according to the co-occurrence relation of each candidate label in the historical media data to obtain secondary updated label features so as to further improve the correlation degree between the obtained secondary updated label features. Finally, based on the second feature similarity between each secondary updated tag feature and the text feature, the target tag, of which the second feature similarity satisfies the similarity condition, in each candidate tag is determined as the tag of the media data, so that the association between the determined tag of the media data and the media data is improved, the situation that the user object cannot accurately acquire effective information due to low tag association is avoided, and the personalized recommendation effect when the user object is recommended based on the media data is further improved.

In an embodiment, as shown in fig. 9, a tag processing method is provided, which specifically includes the following steps:

Step S901, identifying a title text and a content text of the media data, and extracting a title text feature of the title text and a content text feature of the content text.

Step S902, performing label identification on the title text and the content text, respectively, to obtain candidate labels, and extracting respective label features of each candidate label, where the candidate labels include a label in the title text and a label in the content text.

Step S903, determining the text feature similarity between the title text feature and the content text feature.

Step S904, determining a first feature weight matched with the title text feature and a second feature weight matched with the content text feature based on the text feature similarity.

Step S905, according to the first feature weight, performing feature updating on the title text feature to obtain an updated title text feature, and according to the second feature weight, performing feature updating on the content text feature to obtain an updated content text feature.

Step S906, aiming at each target label feature to be updated in each label feature, performing data fusion on the target label feature and the first feature similarity between each other label feature in each label feature to obtain a third feature weight matched with each target label feature.

Step S907, performing feature updating on the target label features based on the third feature weight matched with the target label features to obtain the primary updated label features.

Step S908, combining every two candidate tags to obtain a plurality of candidate tag groups, and obtaining the co-occurrence frequency of the two candidate tags in each candidate tag group in the historical media data.

Step S909, determining the co-occurrence probability of the candidate tags in the candidate tag group according to the co-occurrence frequency.

Step S910, performing deduplication processing on candidate tags contained in a target tag group of which the co-occurrence probability meets the probability condition in the candidate tag group, determining the candidate tags obtained after the deduplication processing as nodes of a tag co-occurrence relation graph, and determining the connection relation among the nodes based on the target tag group.

Step S911, constructing a label co-occurrence relation graph based on the nodes and the connection relation among the nodes.

Step S912, for each candidate tag in the tag co-occurrence relationship diagram, obtaining a connection candidate tag having a connection relationship with the candidate tag in the tag co-occurrence relationship diagram, and determining a third feature similarity between the tag feature of the connection candidate tag and the tag feature of the candidate tag.

Step S913, determining a fourth feature weight matched with the primary updated tag feature of the candidate tag according to the third feature similarity matched with the candidate tag.

Step S914, based on the fourth feature weight matched with the primary updated label feature, performing secondary updating on the primary updated label feature of the candidate label to obtain a secondary updated label feature.

Step S915, based on the second feature similarity of each secondary updated label feature and the text feature, determining the target label of each candidate label whose second feature similarity meets the similarity condition as the label of the media data.
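Steps S908 to S911 above can be sketched as follows. The text does not say how the co-occurrence probability is derived from the co-occurrence frequency or what the probability condition is; this sketch assumes the probability is the co-occurrence count divided by the number of historical items containing either label, and that the condition is a minimum threshold. All names and data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence_graph(history, candidates, min_prob=0.5):
    """Return (nodes, edges) of a label co-occurrence relation graph.

    history:    list of label lists, one per historical media item
    candidates: candidate labels of the current media data
    """
    cand = set(candidates)
    tag_count = Counter()
    pair_count = Counter()
    for tags in history:
        present = sorted(cand & set(tags))
        tag_count.update(present)
        # Co-occurrence frequency of every candidate pair in this item.
        pair_count.update(combinations(present, 2))

    nodes, edges = set(), set()
    for a, b in combinations(sorted(cand), 2):
        together = pair_count[(a, b)]
        either = tag_count[a] + tag_count[b] - together
        # Assumed probability: share of items with either label that have both.
        prob = together / either if either else 0.0
        if prob >= min_prob:
            nodes.update((a, b))  # the set removes duplicate labels
            edges.add((a, b))
    return nodes, edges


history = [
    ["nba", "basketball"],
    ["nba", "basketball", "finance"],
    ["finance"],
    ["nba"],
]
nodes, edges = build_cooccurrence_graph(history, ["nba", "basketball", "finance"])
```

Only the pair ("basketball", "nba") passes the assumed threshold here, so "finance" does not become a node of the graph.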

In the tag processing method, the text features of the text and the respective tag features of each candidate tag included in the text are respectively extracted from the media data including the text, so that each tag feature is respectively subjected to feature updating based on the first feature similarity between the tag features to obtain the primarily updated tag features, and the correlation degree between the obtained primarily updated tag features is improved. And performing secondary updating on the primary updated label features according to the co-occurrence relation of each candidate label in the historical media data to obtain secondary updated label features so as to further improve the correlation degree between the obtained secondary updated label features. Finally, the target label of which the second feature similarity meets the similarity condition in each candidate label is determined as the label of the media data based on the second feature similarity of each secondary update label feature and the text feature, so that the association degree between the determined label of the media data and the media data is improved, the condition that the effective information of the user cannot be accurately obtained due to low label association degree is avoided, and the personalized recommendation effect when the user is recommended based on the media data is further improved.

It should be understood that, although the steps in the flowcharts related to the above embodiments are shown in sequence as indicated by the arrows, the steps are not necessarily performed in that sequence. Unless explicitly stated otherwise herein, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least a part of the steps in the flowcharts related to the above embodiments may include multiple sub-steps or multiple stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed in sequence but may be performed in turn or alternately with other steps or with at least a part of the sub-steps or stages of other steps.

Based on the same inventive concept, the embodiment of the present application further provides a label processing apparatus for implementing the above-mentioned label processing method. The implementation scheme for solving the problem provided by the apparatus is similar to the implementation scheme described in the method, so specific limitations in one or more embodiments of the tag processing apparatus provided below may refer to the limitations on the tag processing method in the foregoing, and details are not described here.

In one embodiment, as shown in fig. 10, there is provided a label processing apparatus including: an extraction module 1002, a primary update module 1004, a secondary update module 1006, and a tag screening module 1008, wherein:

An extraction module 1002, configured to extract text features of the text and tag features of each candidate tag included in the text from media data including the text, respectively.

A primary update module 1004, configured to perform feature update on each tag feature respectively based on a first feature similarity between the tag features to obtain a primary updated tag feature.

A secondary update module 1006, configured to perform secondary updating on the primary updated tag feature according to a co-occurrence relationship of each candidate tag in the historical media data, so as to obtain a secondary updated tag feature.

A tag screening module 1008, configured to determine, based on the second feature similarity between each secondary updated tag feature and the text feature, a target tag of which the second feature similarity satisfies the similarity condition in each candidate tag as a tag of the media data.
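The way the four modules feed one another can be sketched as a small pipeline. The class name, the callable signatures and the stub implementations are all hypothetical; they only illustrate the data flow between modules 1002, 1004, 1006 and 1008 and are not the apparatus itself.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Vector = List[float]


@dataclass
class LabelProcessingApparatus:
    # Each field stands in for one module; the signatures are assumptions.
    extract: Callable[[str], Tuple[Vector, List[str], List[Vector]]]      # 1002
    primary_update: Callable[[List[Vector]], List[Vector]]                # 1004
    secondary_update: Callable[[List[str], List[Vector]], List[Vector]]   # 1006
    screen: Callable[[List[str], List[Vector], Vector], List[str]]        # 1008

    def process(self, media_text: str) -> List[str]:
        text_feat, tags, tag_feats = self.extract(media_text)
        primary = self.primary_update(tag_feats)
        secondary = self.secondary_update(tags, primary)
        return self.screen(tags, secondary, text_feat)


def stub_extract(text):
    # Fixed toy features instead of a real text feature extraction layer.
    return [1.0, 0.0], ["a", "b"], [[1.0, 0.0], [0.0, 1.0]]


def stub_screen(tags, feats, text_feat):
    # Dot product against the text feature with an assumed 0.5 threshold.
    scores = [sum(x * y for x, y in zip(f, text_feat)) for f in feats]
    return [t for t, s in zip(tags, scores) if s >= 0.5]


apparatus = LabelProcessingApparatus(
    extract=stub_extract,
    primary_update=lambda feats: feats,          # identity stub
    secondary_update=lambda tags, feats: feats,  # identity stub
    screen=stub_screen,
)
result = apparatus.process("some media text")
```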

In the tag processing device, the text features of the text and the respective tag features of each candidate tag included in the text are respectively extracted from the media data including the text, so that each tag feature is respectively subjected to feature updating based on the first feature similarity between the tag features to obtain the initial updated tag features, and the correlation degree between the obtained initial updated tag features is improved. And performing secondary updating on the primary updated label features according to the co-occurrence relation of each candidate label in the historical media data to obtain secondary updated label features so as to further improve the correlation degree between the obtained secondary updated label features. Finally, based on the second feature similarity between each secondary updated tag feature and the text feature, the target tag, of which the second feature similarity satisfies the similarity condition, in each candidate tag is determined as the tag of the media data, so that the association between the determined tag of the media data and the media data is improved, the situation that the user object cannot accurately acquire effective information due to low tag association is avoided, and the personalized recommendation effect when the user object is recommended based on the media data is further improved.

In one embodiment, the tag processing apparatus further includes a second feature similarity determination module, configured to: determine a text feature similarity between at least two text features; update each of the at least two text features according to the text feature similarity to obtain a text feature combination comprising the updated at least two text features; and determine a second feature similarity between each secondary updated tag feature and the text feature combination.

In one embodiment, the second feature similarity determination module is further configured to: for each secondary updated tag feature, determine a sub-similarity between the secondary updated tag feature and each text feature in the text feature combination; and perform data fusion on the sub-similarities corresponding to the same secondary updated tag feature, and determine the result of the data fusion as the second feature similarity between that secondary updated tag feature and the text feature combination.
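The sub-similarity fusion described above can be illustrated with a minimal sketch. The embodiment does not fix the similarity measure or the fusion rule here, so cosine similarity and a plain average are assumptions made only for illustration, and the function names `cosine` and `second_feature_similarity` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def second_feature_similarity(tag_feature, text_feature_combination):
    """Fuse the sub-similarities between one secondary updated tag feature
    and every text feature in the combination.

    Averaging is an assumed fusion rule; a weighted sum would fit equally well.
    """
    sub_similarities = [cosine(tag_feature, t) for t in text_feature_combination]
    return sum(sub_similarities) / len(sub_similarities)


# Example: one tag feature compared against a title feature and a content feature.
title_feature = [1.0, 0.0]
content_feature = [0.0, 1.0]
score = second_feature_similarity([1.0, 0.0], [title_feature, content_feature])
```

With a title text feature and a content text feature as the two entries of the combination, the same function also covers the two-sub-similarity case.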

In one embodiment, the second feature similarity determination module is further configured to: update the title text feature and the content text feature respectively, based on the text feature similarity between the title text feature and the content text feature, to obtain an updated title text feature and an updated content text feature; and, for each secondary updated tag feature, perform data fusion on a first sub-similarity between the secondary updated tag feature and the updated title text feature and a second sub-similarity between the secondary updated tag feature and the updated content text feature, to obtain the second feature similarity matched with each secondary updated tag feature.

In one embodiment, the extraction module is further configured to: identify the title text and the content text of the media data; extract a title text feature of the title text and a content text feature of the content text; perform tag identification on the title text and the content text respectively to obtain candidate tags, wherein the candidate tags comprise tags in the title text and tags in the content text; and extract the respective tag feature of each candidate tag.

In one embodiment, the second feature similarity determination module is further configured to: determine the text feature similarity between the title text feature and the content text feature; determine, based on the text feature similarity, a first feature weight matched with the title text feature and a second feature weight matched with the content text feature; and perform feature update on the content text feature according to the second feature weight to obtain an updated content text feature.

In one embodiment, the primary update module is further configured to: for each target tag feature to be updated among the tag features, perform data fusion on the first feature similarities between the target tag feature and each of the remaining tag features to obtain a third feature weight matched with the target tag feature; and perform feature update on the target tag feature based on the third feature weight matched with the target tag feature to obtain a primary updated tag feature.
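As a rough illustration of the primary update, the sketch below computes one weight per tag from its similarities to the other tags and scales the tag feature by that weight. The use of cosine similarity, averaging as the data fusion, multiplicative scaling as the update, and a default weight of 1.0 for a lone tag are all assumptions; the names `cosine` and `primary_update` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def primary_update(tag_features):
    """Return primary updated tag features for a list of tag feature vectors."""
    updated = []
    for i, target in enumerate(tag_features):
        # First feature similarities between the target and every other tag.
        sims = [cosine(target, other)
                for j, other in enumerate(tag_features) if j != i]
        # Assumed fusion: the mean similarity serves as the third feature weight.
        third_weight = sum(sims) / len(sims) if sims else 1.0
        # Assumed update: scale the target feature by its weight.
        updated.append([third_weight * x for x in target])
    return updated


features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
primary = primary_update(features)
```

Under these assumptions, a tag that resembles the other candidates keeps more of its magnitude, while an outlier tag is damped.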

In one embodiment, the secondary update module is further configured to: construct a tag co-occurrence relation graph according to the co-occurrence relationship among the candidate tags in the historical media data; for each candidate tag in the tag co-occurrence relation graph, obtain the connected candidate tags that have a connection relationship with the candidate tag in the graph, and determine a third feature similarity between the tag feature of each connected candidate tag and the tag feature of the candidate tag; determine a fourth feature weight matched with the primary updated tag feature of the candidate tag according to the third feature similarity matched with the candidate tag; and perform a secondary update on the primary updated tag feature of the candidate tag based on the fourth feature weight matched with that primary updated tag feature, to obtain a secondary updated tag feature.
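The graph-based secondary update can be sketched as follows, taking an already constructed co-occurrence graph as input. The adjacency-dictionary representation, cosine similarity, averaging of the third feature similarities into the fourth feature weight, and multiplicative scaling are assumptions chosen for illustration; `secondary_update` and its parameter names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def secondary_update(graph, tag_features, primary_features):
    """graph: tag -> set of connected tags; the other two map tag -> vector."""
    secondary = {}
    for tag, primary in primary_features.items():
        # Third feature similarities: candidate tag vs. each connected tag.
        sims = [cosine(tag_features[n], tag_features[tag])
                for n in graph.get(tag, ())]
        # Assumed fourth feature weight: mean similarity, 1.0 if unconnected.
        fourth_weight = sum(sims) / len(sims) if sims else 1.0
        secondary[tag] = [fourth_weight * x for x in primary]
    return secondary


graph = {"a": {"b"}, "b": {"a"}}
tag_features = {"a": [1.0, 0.0], "b": [3.0, 4.0], "c": [0.0, 1.0]}
primary_features = {"a": [5.0, 0.0], "b": [1.0, 1.0], "c": [0.0, 3.0]}
secondary = secondary_update(graph, tag_features, primary_features)
```

Here tag "c" has no connected tags, so under the assumed default its primary updated feature passes through unchanged.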

In one embodiment, the secondary update module is further configured to: combine the candidate tags in pairs to obtain a plurality of candidate tag groups; acquire the number of co-occurrences, in the historical media data, of the two candidate tags in each candidate tag group; determine the co-occurrence probability of the candidate tags in the candidate tag group according to the number of co-occurrences; perform de-duplication on the candidate tags contained in the target tag groups, namely the candidate tag groups whose co-occurrence probability satisfies the probability condition, determine the candidate tags obtained after de-duplication as the nodes of the tag co-occurrence relation graph, and determine the connection relationship among the nodes based on the target tag groups; and construct the tag co-occurrence relation graph based on the nodes and the connection relationship among the nodes.
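The graph construction steps above can be sketched as below. How the co-occurrence probability is derived from the co-occurrence count is not spelled out in this passage, so dividing the count by the number of historical media items is an assumption, as are the threshold form of the probability condition and the hypothetical names `build_cooccurrence_graph` and `min_prob`.

```python
from itertools import combinations


def build_cooccurrence_graph(candidate_tags, history, min_prob=0.5):
    """Build an undirected tag co-occurrence graph.

    candidate_tags: iterable of tag strings.
    history: list of tag sets, one per historical media item.
    Returns a dict mapping each retained tag to the set of connected tags.
    """
    graph = {}
    total = len(history)
    # Pairwise combination of the distinct candidate tags.
    for a, b in combinations(sorted(set(candidate_tags)), 2):
        count = sum(1 for tags in history if a in tags and b in tags)
        prob = count / total if total else 0.0  # assumed probability definition
        if prob >= min_prob:  # assumed probability condition
            # Dict keys de-duplicate the nodes; the pair defines the edge.
            graph.setdefault(a, set()).add(b)
            graph.setdefault(b, set()).add(a)
    return graph


history = [
    {"nba", "basketball"},
    {"nba", "basketball", "finals"},
    {"nba", "music"},
]
graph = build_cooccurrence_graph(["nba", "basketball", "music"], history)
```

In this example only the pair ("basketball", "nba") co-occurs in at least half of the historical items, so "music" does not become a node.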

In one embodiment, the tag processing apparatus further includes a target similarity screening module, configured to: determine a corresponding preset similarity threshold according to a preset similarity condition, and screen the second feature similarities to obtain the target similarities greater than the preset similarity threshold; or sort the second feature similarities by similarity value to obtain a corresponding feature similarity sequence, and screen the target similarities out of the feature similarity sequence according to the preset similarity condition.
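The two screening alternatives can be sketched as a threshold filter and a ranked selection. Interpreting the second alternative as keeping the first `k` entries of the sorted sequence is an assumption, and the function names and the dictionary layout (tag to second feature similarity) are hypothetical.

```python
def screen_by_threshold(similarities, threshold):
    """Keep tags whose second feature similarity exceeds the threshold."""
    return [tag for tag, sim in similarities.items() if sim > threshold]


def screen_top_k(similarities, k):
    """Sort by similarity in descending order and keep the first k tags
    (assumed reading of the sequence-based condition)."""
    ranked = sorted(similarities.items(), key=lambda item: item[1], reverse=True)
    return [tag for tag, _ in ranked[:k]]


similarities = {"nba": 0.9, "music": 0.2, "finals": 0.7}
by_threshold = screen_by_threshold(similarities, 0.5)
top_one = screen_top_k(similarities, 1)
```

A threshold adapts the number of output tags to each media item, whereas a ranked cut-off yields a fixed number of tags regardless of how strong the matches are.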

In one embodiment, the tag processing apparatus further includes a tag fusion model training module, configured to: acquire pre-labeled media data samples; extract, from each media data sample containing text, sample text features of the text and sample tag features of each candidate tag contained in the text; perform feature update on each sample tag feature based on the first feature similarity among the sample tag features to obtain primary updated sample tag features; perform a secondary update on the primary updated sample tag features according to the co-occurrence relationship of each candidate tag in the historical media data to obtain secondary updated sample tag features; determine a second feature similarity between each secondary updated sample tag feature and the sample text feature; determine a prediction loss value corresponding to each second feature similarity; and, if the prediction loss value satisfies the model convergence condition, finish training the original tag fusion model to obtain the trained tag fusion model.

In one embodiment, the tag fusion model training module is further configured to: when no newly added tag is detected, obtain the original tag relevance pre-labeled for each candidate tag, and determine the prediction loss value according to the second feature similarity and the original tag relevance; and when newly added tags are detected and the newly added tags do not carry a corresponding original tag relevance, determine a preset fixed value as the prediction loss value corresponding to each newly added tag.
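The two loss cases can be illustrated with a short sketch. The form of the loss is not given in this passage, so squared error between the second feature similarity and the original tag relevance is an assumption, as is the fixed value of 0.0; `prediction_losses` and `fixed_loss` are hypothetical names.

```python
def prediction_losses(similarities, relevance, fixed_loss=0.0):
    """Compute one prediction loss value per tag.

    similarities: tag -> second feature similarity predicted by the model.
    relevance: tag -> pre-labeled original tag relevance; newly added tags
        are absent from this mapping.
    """
    losses = {}
    for tag, sim in similarities.items():
        if tag in relevance:
            # Labeled tag: assumed squared-error loss against the label.
            losses[tag] = (sim - relevance[tag]) ** 2
        else:
            # Newly added tag without a relevance label: preset fixed value.
            losses[tag] = fixed_loss
    return losses


losses = prediction_losses({"nba": 0.5, "new_tag": 0.9}, {"nba": 1.0})
```

Assigning a fixed value keeps unlabeled new tags from contributing an arbitrary error signal during training.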

The modules in the above tag processing apparatus may be implemented wholly or partially by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can call them and execute the operations corresponding to the modules.

In one embodiment, a computer device is provided, which may be a server, and its internal structure diagram may be as shown in fig. 11. The computer device includes a processor, a memory, an Input/Output interface (I/O for short), and a communication interface. The processor, the memory and the input/output interface are connected through a system bus, and the communication interface is connected to the system bus through the input/output interface. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used for storing media data containing texts, text features of the texts, candidate tags, tag features of the candidate tags, first feature similarity, primary updated tag features, co-occurrence relations of the candidate tags in historical media data, secondary updated tag features, second feature similarity, target tags, tags of the media data and the like. The input/output interface of the computer device is used for exchanging information between the processor and an external device. The communication interface of the computer device is used for connecting and communicating with an external terminal through a network. The computer program is executed by a processor to implement a tag processing method.

Those skilled in the art will appreciate that the architecture shown in fig. 11 is merely a block diagram of part of the structure related to the solution of the present application and does not limit the computer devices to which the solution applies; a particular computer device may include more or fewer components than those shown, may combine certain components, or may have a different arrangement of components.

In one embodiment, a computer device is further provided, which includes a memory and a processor, the memory stores a computer program, and the processor implements the steps of the above method embodiments when executing the computer program.

In an embodiment, a computer-readable storage medium is provided, on which a computer program is stored which, when being executed by a processor, carries out the steps of the above-mentioned method embodiments.

In an embodiment, a computer program product is provided, comprising a computer program which, when being executed by a processor, carries out the steps of the above-mentioned method embodiments.

It should be noted that the object information (including but not limited to the device information of the object, the corresponding personal information, etc.) and data (including but not limited to the data for analysis, the stored data, the displayed data, etc.) referred to in the present application are information and data authorized by the object or sufficiently authorized by each party, and the collection, use and processing of the related data need to comply with the relevant laws and regulations and standards of the relevant country and region.

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above may be implemented by a computer program instructing the relevant hardware. The computer program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the method embodiments described above. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include a Read-Only Memory (ROM), a magnetic tape, a floppy disk, a flash memory, an optical memory, a high-density embedded non-volatile memory, a Resistive Random Access Memory (ReRAM), a Magnetoresistive Random Access Memory (MRAM), a Ferroelectric Random Access Memory (FRAM), a Phase Change Memory (PCM), a graphene memory, and the like. Volatile memory may include Random Access Memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM). The databases referred to in the embodiments provided herein may include at least one of relational and non-relational databases. The non-relational database may include, but is not limited to, a blockchain-based distributed database and the like. The processors referred to in the embodiments provided herein may be, without limitation, general-purpose processors, central processing units, graphics processors, digital signal processors, programmable logic devices, data processing logic devices based on quantum computing, and the like.

The technical features of the above embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features involves no contradiction, it should be considered to fall within the scope of this specification.

The above embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the present application. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, and these all fall within the scope of protection of the present application. Therefore, the protection scope of the present application shall be subject to the appended claims.

Claims

1. A method of tag processing, the method comprising:

2. The method of claim 1, wherein the text comprises at least two types of text, and wherein the text features comprise at least two text features that correspond one-to-one to the at least two types of text, the method further comprising:

determining text feature similarity between the at least two text features;

respectively updating the characteristics of the at least two text characteristics according to the text characteristic similarity to obtain a text characteristic combination comprising the updated at least two text characteristics;

and respectively determining second feature similarity between each secondary updated label feature and the text feature combination.

3. The method of claim 2, wherein the separately determining a second feature similarity between each of the secondary updated label features and the text feature combination comprises:

for each secondary update label feature in the secondary update label features, determining a sub-similarity between the label feature and each text feature in the text feature combination;

and performing data fusion on the sub-similarities corresponding to the same secondary updated label feature, and determining a result obtained by the data fusion as a second feature similarity between the secondary updated label feature and the text feature combination.

4. The method of claim 1, wherein the media data comprises a title text and a content text, and wherein the text features comprise a title text feature of the title text and a content text feature of the content text;

the method further comprises the following steps:

respectively updating the title text features and the content text features based on the text feature similarity between the title text features and the content text features to obtain updated title text features and updated content text features;

and for each secondary updated label feature in the secondary updated label features, performing data fusion on the first sub-similarity between the secondary updated label feature and the updated title text feature and the second sub-similarity between the secondary updated label feature and the updated content text feature to obtain a second feature similarity matched with each secondary updated label feature.

5. The method according to claim 4, wherein the extracting text features of the text and respective tag features of each candidate tag included in the text from the media data including the text respectively comprises:

identifying title text and content text of the media data;

extracting title text characteristics of the title text and content text characteristics of the content text;

respectively performing label identification on the title text and the content text to obtain candidate labels, wherein the candidate labels comprise labels in the title text and labels in the content text;

extracting the respective label features of each candidate label.

6. The method of claim 4, wherein the performing feature updates on the title text feature and the content text feature respectively based on the text feature similarity between the title text feature and the content text feature to obtain an updated title text feature and an updated content text feature comprises:

determining text feature similarity between the title text features and the content text features;

determining a first feature weight matched with the title text feature and a second feature weight matched with the content text feature based on the text feature similarity;

and according to the second characteristic weight, performing characteristic updating on the content text characteristic to obtain an updated content text characteristic.

7. The method according to any one of claims 1 to 6, wherein the performing feature update on each of the tag features respectively based on the first feature similarity between the tag features to obtain a primary updated tag feature comprises:

for each target label feature to be updated among the label features, performing data fusion on the first feature similarities between the target label feature and each of the remaining label features to obtain a third feature weight matched with the target label feature;

and performing feature updating on the target label feature based on the third feature weight matched with the target label feature to obtain a primary updated label feature.

8. The method as claimed in any one of claims 1 to 6, wherein performing a secondary update on the primarily updated tag feature according to a co-occurrence relationship of each candidate tag in historical media data to obtain a secondary updated tag feature, comprises:

constructing a tag co-occurrence relation graph according to the co-occurrence relation among the candidate tags in the historical media data;

for each candidate label in the label co-occurrence relationship graph, obtaining a connection candidate label which has a connection relationship with the candidate label in the label co-occurrence relationship graph, and determining a third feature similarity between the label features of the connection candidate label and the label features of the candidate label;

determining fourth feature weight matched with the primarily updated tag feature of the candidate tag according to the third feature similarity matched with the candidate tag;

and updating the primary updated label features of the candidate labels for the second time based on the fourth feature weight matched with the primary updated label features to obtain secondary updated label features.

9. The method of claim 8, wherein constructing a tag co-occurrence relationship graph according to co-occurrence relationships between the candidate tags in the historical media data comprises:

combining every two candidate tags to obtain a plurality of candidate tag groups;

acquiring the co-occurrence times of two candidate tags in each candidate tag group in the historical media data;

determining the co-occurrence probability of the candidate tags in the candidate tag group according to the co-occurrence times;

performing deduplication processing on candidate tags contained in a target tag group of which the co-occurrence probability meets probability conditions in the candidate tag group, determining the candidate tags obtained after the deduplication processing as nodes of the tag co-occurrence relation graph, and determining the connection relation among the nodes based on the target tag group;

and constructing a label co-occurrence relation graph based on the nodes and the connection relation among the nodes.

10. The method of claim 1, wherein the labels of the media data are determined according to a trained label fusion model; the training process of the label fusion model comprises the following steps:

acquiring various media data samples labeled in advance;

respectively extracting sample text features of a text and sample label features of each candidate label contained in the text from each media data sample containing the text;

respectively performing feature updating on each sample label feature based on first feature similarity among the sample label features to obtain a primary updated sample label feature;

according to the co-occurrence relation of each candidate label in the historical media data, carrying out secondary updating on the primary updating sample label characteristic to obtain a secondary updating sample label characteristic;

determining second feature similarity of each secondary updating sample label feature and the sample text feature;

determining a prediction loss value corresponding to each of the second feature similarities;

and if the predicted loss value reaches the model convergence condition, finishing the training of the original label fusion model to obtain the trained label fusion model.

11. The method of claim 10, wherein determining a predicted loss value corresponding to each of the second feature similarities comprises:

under the condition that a newly added label is not detected, obtaining the correlation degree of an original label pre-labeled based on each candidate label; determining a prediction loss value according to the second feature similarity and the original label correlation;

and under the condition that the newly added tags are detected and do not carry corresponding original tag correlation degrees, determining a preset fixed value as a prediction loss value corresponding to each newly added tag.

12. A label processing apparatus, characterized in that the apparatus comprises:

and the label screening module is used for determining a target label, of the candidate labels, of which the second feature similarity meets a similarity condition as the label of the media data based on the second feature similarity of each secondary updated label feature and the text feature.

13. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor realizes the steps of the method of any one of claims 1 to 11 when executing the computer program.

14. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 11.

15. A computer program product comprising a computer program, characterized in that the computer program, when being executed by a processor, realizes the steps of the method of any one of claims 1 to 11.