CN114297390A

CN114297390A - Aspect category identification method and system under long-tail distribution scene

Info

Publication number: CN114297390A
Application number: CN202111681644.8A
Authority: CN
Inventors: 陆恒杨; 方伟; 聂玮; 孙俊; 吴小俊
Original assignee: Jiangnan University
Current assignee: Jiangnan University
Priority date: 2021-12-30
Filing date: 2021-12-30
Publication date: 2022-04-08
Anticipated expiration: 2041-12-30
Also published as: CN114297390B

Abstract

The invention discloses an aspect category identification method and system under a long-tail distribution scene, and belongs to the technical field of natural language processing. The method is based on an aspect category identification system under a long-tail distribution scene, the system focuses on the long-tail distribution characteristics of data, firstly obtains fine-grained aspect feature vectors of sentences and provides additional context aspect-level semantic information; then, an attention mechanism fusing context aspect-level semantic information based on long-tail distribution is added, the capability of a model for capturing information most relevant to the aspect categories is enhanced, meanwhile, an improved distribution balance loss function is provided to solve the problems of label co-occurrence and negative category advantages in a long-tail multi-label text classification task, and the aspect category identification effect with the long-tail distribution characteristic is effectively improved.

Description

Aspect category identification method and system under long-tail distribution scene

Technical Field

The invention relates to an aspect category identification method and system under a long tail distribution scene, and belongs to the technical field of natural language processing.

Background

Aspect Category Detection (ACD), one of the important subtasks of Aspect level emotion analysis, aims to detect an Aspect Category contained in a sentence from a set of predefined Aspect categories. Aspect category identification is the fundamental task of whole aspect level sentiment analysis. The emotion analysis has wide application in various fields of life, for example, emotion analysis aiming at opinions of users on various topics expressed in social media, restaurant evaluation, online shopping and the like can help users to have better consumption experience, and can help merchants to know market demands.

However, in actual research, the aspect category distribution often presents the characteristics of unbalanced or even long-tail distribution, so that the model cannot sufficiently extract the features of the tail aspect category, which brings great challenges to the aspect category identification task.

Some existing work addresses this problem with classical machine learning models or deep learning models. For example, Ghadery, E et al (Ghadery, E., et al, MNCN: A Multilingual N gram-Based conditional Network for Aspect Category Detection in Online reviews.2019.33: p.6441-6448.) embed Multilingual words as input to the Network, extract features using a deep Convolutional neural Network, and then learn and identify different facet classes using different fully connected layers, respectively. Hu, M et al (Hu, M., et al, CAN: structured Attention Networks for Multi-Aspect Sentiment analysis.2018.) introduce sparse regularization and orthogonal regularization to compute Attention weights for multiple aspects. This allows the attention weight of multiple aspects to be focused on different parts, while the attention weight of each aspect is focused on only a few words. Movahedi, s. et al (Movahedi, s., et al, Aspect Category Detection view-Attention network.2019.) propose a Topic Attention network model that can detect Aspect categories of a given sentence by paying Attention to different parts of the sentence. Li, Y. et al (Li, Y., et al. Multi-instant Multi-Label Learning Networks for the Aspect-Category analysis. in the processes of the 2020 Conference on electronic Methods in Natural Language Processing (EMNLP).2020.) propose a joint model of Aspect emotion analysis for Multi-Instance Multi-Label Learning, where the attention-based ACD generates significant attention weights for different Aspect classes.

However, the distribution of data collected from an actual scene is often unbalanced, and even presents the feature of a long tail distribution, i.e. a few classes (also called head classes) occupy most of the data, while most of the classes (also called tail classes) have few samples. The above-mentioned prior art methods neglect such a sample number gap when training the model. Too large difference of the numbers of training samples of different classes can make the model unable to achieve good effect on the identification of the class with limited number of samples. The aspect categories are unbalanced, and even the caused long tail distribution can influence the learning process, so that the recognition effect is poor.

Disclosure of Invention

The invention provides an aspect type identification method and system under a long-tail distribution scene, and aims to solve the problem that identification of a model on an aspect type with a limited number of samples cannot achieve a good effect due to the fact that the number difference of training samples of different types is too large caused by long-tail distribution at present.

The first purpose of the invention is to provide an aspect category identification method under a long tail distribution scene, which is characterized in that the method is used for data sets

The N sentences of (1) performs aspect category identification, wherein S_l＝{w₁,w₂,...,w_nIs the ith sentence in the data set D, consisting of n words, w_nRepresents the l-th sentence S_lThe nth word;

is the l-th sentence S_lA corresponding aspect category label;

the method comprises the following steps:

step 1: defining m aspect classes in advance, and using A ═ a₁,a₂,…,a_mDenotes wherein a_mTo describe a word or phrase in the mth aspect,

step 2: constructing a word embedding matrix E₁∈R^|V|×dEach word w_iEmbedding matrix E by the words₁Is mapped as

Where | V | is the size of all words in the data set D, and D is the dimension of a word vector;

simultaneous construction of facet class embedding matrices E₂∈R^m×dEach word a_iEmbedding matrix E by the aspect class₂Is mapped as

Separately deriving text-embedded vectors

Sum aspect embedding vector

And step 3: embedding the text into a vector

Embedding vectors with the aspect

Inputting the sentence into a long-time memory network LSTM to obtain the network output hidden state of the sentence

And

and 4, step 4: the hidden state H_wAnd H_aInputting the data into an IAN-LoT mechanism to obtain a total aspect vector representation s fusing long-tail distribution characteristics;

and 5: inputting the total aspect vector s into an attention mechanism that fuses contextual aspect-level semantic information;

summing the total aspect vector s with

As input, a fusion vector is calculated

As shown in equation (1):

wherein W ∈ R^n×1Is a learnable weight parameter that blends each word with an aspect,

vector representing semantic information of the fusion context aspect level, will

Input to the attention mechanism, generating an attention weight vector for each predefined facet class; as shown in equation (2), for the jth aspect class:

wherein W_j∈R^d×d,b_j∈R^dAnd u_j∈R^dFor a learnable parameter, β ∈ R^1×mRepresenting the long tail distribution characteristic learned in advance, and is the reciprocal number of effective samples in the training set, alpha_j∈RⁿIs an attention weight vector;

step 6: using vectors

As a predicted sentence representation, for the jth aspect class, as shown in equation (3):

wherein, W_j∈R^d×1，b_jIs a scalar quantity of the input signals,

and when the prediction result of the jth aspect category is greater than the classification threshold value, the sentence is considered to contain the jth aspect category.

Optionally, the step of calculating a total aspect vector of the fused long-tail distribution features in the IAN-LoT mechanism includes:

step 41: hidden state for input H_wAnd H_aCalculating an interaction attention weight matrix I e R^n×mAs shown in equation (4):

step 42: performing softmax calculations for each row of the interaction attention weight matrix, as shown in equation (5):

wherein k is_ijFor the matrix k ∈ R^n×mI row and j column of (a), k represents the attention weight of the text to the aspect, I_ijIs the element of ith row and jth column in the matrix I;

step 43: and then introducing a data long tail distribution characteristic for the matrix k, as shown in formula (6):

wherein the content of the first and second substances,

weight information for each aspect for text introducing long tail distribution, beta epsilon R^1×mRepresenting the long tail distribution characteristics learned in advance, and being the reciprocal number of the effective samples in the training set, wherein m is the number of the aspect categories;

step 44: for the

Performing maximum pooling to obtain fine-grained text-to-aspect weight information I blended with long-tail distribution characteristics_LFurther, the weight information and the embedded vector representation of the aspect category are expressed

The multiplication results in the final overall aspect vector representation s, as shown in equation (7):

wherein s ∈ R^1×d。

Optionally, the method trains the recognition model by using an improved a-DB loss function, where the improved a-DB loss function improves a calculation method of the rebalancing weight and a smoothing function, and specifically includes:

first, without considering tag co-occurrence,

representing the number of samples containing the jth aspect category in the data set; the expected value of the sampling frequency for the jth aspect class is

The sample sampling frequency P is then estimated for each of the positive classes of repeated samples contained in the example^IAs shown in equation (8):

wherein the content of the first and second substances,

when in use

Indicating that the ith sentence contains the jth aspect category a_j，

If not, then not included;

weight balancing weights

The calculation is shown in equation (9):

wherein gamma is a coordination weight hyperparameter;

the smoothing function is to

The formula of the mapping is:

in order to avoid the over-suppression of a few classes caused by the advantages of negative labels, a negative class suppression hyper-parameter lambda and a specific bias tau are introduced_jAs shown in formula (11):

where ρ is_jIs the ratio of the jth category to the total number of samples, and eta is a proportion hyper-parameter;

the A-DB loss function is shown in equation (12):

wherein the content of the first and second substances,

a probability value output for the network.

Optionally, the classification threshold is 0.5.

A second object of the present invention is to provide an aspect category identification system in a long tail distribution scenario, wherein the system includes: the system comprises an input module, a text embedding module, an LSTM module, an IAN-LOT module, a fusion module, an attention mechanism module and a prediction module;

the input module, the text embedding module, the LSTM module, the IAN-LOT module, the fusion module, the attention mechanism module and the prediction module are sequentially connected;

the input module is used for inputting a predefined aspect category combination and a text to be recognized; the text aspect embedding module is used for constructing a word embedding matrix and an aspect category embedding matrix, and mapping an input predefined aspect category combination and a text to be recognized to a text embedding vector and an aspect embedding vector; the LSTM module is used for outputting the hidden states of the text embedding vector and the aspect embedding vector; the IAN-LOT module is used for obtaining a total aspect vector fusing long tail distribution characteristics according to the hidden state; the fusion module is used for fusing context level semantic information and generating a fusion vector; the attention mechanism module generates an attention weight vector for each predefined aspect category according to the fusion vector; the prediction module is used for completing classification and prediction of aspect class identification according to the attention weight vector.

Optionally, the system pairs the data sets

The work process of recognizing the aspect category of the N sentences comprises the following steps:

wherein S is_l＝{w₁,w₂,…,w_nIs the ith sentence in the data set D, consisting of n words, w_iRepresents the l-th sentence S_lThe ith word;

is the l-th sentence S_lA corresponding aspect category label;

Separately deriving text-embedded vectors

Sum aspect embedding vector

And step 3: embedding the text into a vector

Embedding vectors with the aspect

And

summing the total aspect vector s with

As input, a fusion vector is calculated

As shown in equation (1):

step 6: using vectors

wherein, W_j∈R^d×1，b_jIs a scalar quantity of the input signals,

wherein the content of the first and second substances,

step 44: for the

wherein s ∈ R^1×d。

Optionally, the system trains the recognition model by using an improved a-DB loss function, where the improved a-DB loss function improves a calculation method of the rebalancing weight and a smoothing function, and specifically includes:

first, without considering tag co-occurrence,

wherein the content of the first and second substances,

when in use

Indicating that the ith sentence contains the jth aspect category a_j，

If not, then not included;

weight balancing weights

The calculation is shown in equation (9):

wherein gamma is a coordination weight hyperparameter;

the smoothing function is to

The formula of the mapping is:

the A-DB loss function is shown in equation (12):

wherein the content of the first and second substances,

a probability value output for the network.

The invention has the beneficial effects that:

aiming at the aspect category identification problem which is characterized by long-tail distribution data, the invention models the aspect category identification into a multi-label classification problem and provides an aspect category identification model based on the fusion aspect vector of long-tail distribution;

1) an A-DB loss function suitable for the multi-label long-tail distribution problem is introduced to train a recognition model, so that the method can be effectively suitable for data with the long-tail distribution characteristic, and the aspect category recognition effect with the long-tail distribution characteristic is improved;

2) an interactive attention module with the characteristic of data long-tail distribution, namely an IAN-LoT mechanism is introduced, and the mechanism introduces the characteristic of data long-tail distribution, can obtain feature vectors in the aspect of fine granularity of sentences, provides additional context aspect-level semantic information, and can effectively improve the recognition effect of tail categories;

3) after the fine-grained aspect feature vectors are obtained, the invention also provides an attention mechanism for fusing the context aspect-level semantic information, the aspect vectors and the context information are fused, the relevant correct information of the aspect can be focused, the capability of capturing the most relevant information of the aspect category by the model is enhanced, and the model is more effective on the aspect category identification task of long-tail distribution.

Drawings

In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.

FIG. 1 is a model architecture diagram of the present invention.

Fig. 2 is a flow chart of the method of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail with reference to the accompanying drawings.

The first embodiment is as follows:

the embodiment provides an aspect category identification method under a long tail distribution scene, and the method is used for identifying the category of a data set

The N sentences of (1) performs aspect category identification, wherein S_l＝{w₁,w₂,…,w_nIs the ith sentence in the data set D, consisting of n words, w_nRepresents the l-th sentence S_lThe nth word;

is the l-th sentence S_lA corresponding aspect category label;

the method comprises the following steps:

Separately deriving text-embedded vectors

Sum aspect embedding vector

And step 3: embedding the text into a vector

Embedding vectors with the aspect

And

summing the total aspect vector s with

As input, a fusion vector is calculated

As shown in equation (1):

step 6: using vectors

wherein, W_j∈R^d×1，b_jIs a scalar quantity of the input signals,

Example two:

the method of the embodiment introduces an interactive attention module with the characteristic of data long-tail distribution, namely an IAN-LoT mechanism, and can obtain fine-grained aspect feature vectors of sentences and provide additional context aspect-level semantic information. After obtaining the fine-grained aspect feature vectors, the model also adds an attention mechanism based on long-tail distribution and fusing context-level semantic information, enhances the capability of the model to capture information most relevant to aspect categories, and is trained by using an improved multi-label classification loss function suitable for long-tail distribution.

This embodiment is for data sets

is the l-th sentence S_lA corresponding aspect category label;

the method comprises the following steps:

Separately deriving text-embedded vectors

Sum aspect embedding vector

And step 3: embedding the text into a vector

Embedding vectors with the aspect

And

step 41: hidden state for input H_wAnd H_aCalculating an interaction attention weight matrix I e R^n×mAs shown in equation (1):

step 42: performing softmax calculations for each row of the interaction attention weight matrix, as shown in equation (2):

step 43: and then introducing a data long tail distribution characteristic for the matrix k, as shown in formula (3):

wherein the content of the first and second substances,

step 44: for the

The multiplication results in the final overall aspect vector representation s, as shown in equation (4):

wherein s ∈ R^1×d。

summing the total aspect vector s with

As input, a fusion vector is calculated

As shown in equation (5):

Input to the attention mechanism, generating an attention weight vector for each predefined facet class; as shown in equation (6), for the jth aspect class:

step 6: using vectors

As a predicted sentence representation, for the jth aspect class, as shown in equation (7):

wherein, W_j∈R^d×1，b_jIs a scalar quantity of the input signals,

for the prediction result of the jth aspect category, when the prediction result is greater than the classification threshold value of 0.5, the sentence is considered to contain the jth aspect category.

The method of this embodiment trains the recognition model by using an improved a-DB loss function, which improves the calculation method of the rebalancing weight and the smoothing function, and specifically includes:

first, without considering tag co-occurrence,

wherein the content of the first and second substances,

when in use

Indicating that the ith sentence contains the jth partyClass a of surface_j，

If not, then not included;

weight balancing weights

The calculation is shown in equation (9):

wherein gamma is a coordination weight hyperparameter;

the smoothing function is to

The formula of the mapping is:

the A-DB loss function is shown in equation (12):

wherein the content of the first and second substances,

a probability value output for the network.

Example three:

the embodiment provides an aspect category identification system under a long tail distribution scene, the system comprising: the system comprises an input module, a text embedding module, an LSTM module, an IAN-LOT module, a fusion module, an attention mechanism module and a prediction module;

For example, in a restaurant review, "When we sat down, the waiter barly spoken in the outer direction and abrupply overlapped outer menu on the table," we will classify this sentence and a predefined set of aspects: food, staff, miscellaneous, place, service, menu, price, and ambiance are input into the model shown in fig. 1, and a set of results can be predicted by using the model, wherein 1 is that the sentence includes the corresponding aspect type, and 0 is that the sentence does not include the corresponding aspect type, as shown in table 1:

Category	food	staff	miscellaneous	place	service	menu	price	ambience
									label	0	1	0	0	0	1	0	0

to further illustrate the beneficial effects that the present invention can achieve, the following experiments were performed:

the present invention uses 6 baseline methods for comparison:

(1) aspect category identification model:

TextCNN [34 ]: the method for classifying the text by utilizing the convolutional neural network is a more basic model;

LSTM [24 ]: training by using an LSTM network, and classifying by taking the last hidden state as a final form;

SVR [35 ]: combining word vectors of a sentence into a vector as input, and classifying by using a machine learning classifier;

SCAN [36 ]: an aspect category emotion analysis method based on a sentence component perception network.

(2) The ACD task and ACSA task combined training model comprises the following steps:

AS-Capsules [37 ]: a method for performing aspect category sentiment analysis by sharing components using correlation between aspect categories and sentiments;

AC-MIMLLN [6 ]: a joint model of multi-instance multi-label learning aspect emotion analysis is presented, where an attention-based ACD generates valid attention weights for different aspect categories. We compare for such models using their predicted results of ACD.

Table 2 shows a comparison of the results of the Macro F1 experiments on the ACD task by the method of the present invention and other methods of comparison on MAMS-LT and SemEval2014-LT data sets.

Table 2 data comparison results

In order to better research the problem of Long-tail Distribution, the invention refers to a method for creating a Multi-Label image Dataset satisfying the Long-tail Distribution by referring to (Wu, T.et al., Distribution-Balanced Loss for Multi-Label Classification in Long-labeled Datasets.2020: Computer Vision-ECCV 2020.), and an existing SemEval-2014 Task 4(Pontiki, M.et al., SemEval-Task 4: assembled basic analysis.2014.) and a MAMS Dataset (Jiang, Q.A. Challege Dataset and influence for batch analysis in Long-tail Distribution of Distribution 2019. meeting the requirements of the Emulation-topic analysis of the Long-tail Distribution and the load of the national Distribution of the origin of the image Dataset (local, quality, A. Challege Dataset and influence) are respectively created to conform to the emotion Distribution of the Long-Tail Distribution of the local Distribution of the origin and the origin of the local Distribution (local Distribution). Table 1 shows the Macro F1 scores of the baseline method and the method of the invention in the MAMS-LT data set and the SemEval2014-LT data set, and the higher the score is, the better the classification effect is.

From the experimental results, we can conclude the following:

firstly, the method is superior to all baseline methods in the MAMS-LT data set and the SemEval2014-LT data set, and the method has better aspect class detection capability in the data set with the long tail distribution characteristic.

Secondly, compared with the best scoring baseline method AS-Capsules, the method of the present invention is 2.28% and 1.92% higher on both datasets, respectively, indicating that the Macro F1 score of the method of the present invention is a distinct advantage on the MAMS-LT dataset, demonstrating that the method of the present invention has a better effect on aspect category detection for sentences containing multiple aspects.

Thirdly, the reason that the effect of the method of the present invention on the SemEval2014-LT data set is not as good as that of the MAMS-LT data set may be that most of the sentences of the former data set only contain one aspect category, which may weaken the effect of the rebalancing weights designed for the tag co-occurrence problem, but it can be seen that the effect is significantly improved on the MAMS-LT data sets with each sentence containing two or more aspects.

Table 3 shows the Macro F1 score experimental results for each aspect of AS-Capsules and the method of the present invention in the SemEval2014-LT dataset.

Table 3 comparison of AS-Capsules on SemEval2014-LT dataset with the results of the method of the invention

The SemEval2014-LT data set constructed by the method comprises 1, 2 and 2 classes of a header class, a middle class and a tail class respectively. The results show that:

first, for the tail classes such as price and ambiance, the Macro F1 scores were 5.99% and 7.88% higher, respectively, which indicates that the method of the present invention has a significantly improved detection effect on the aspect classes of the tail classes. The prediction result of the tailing class 'price' is even better than that of the head class 'food', and the prediction of the AS-Capsules on the tailing class 'price' is reduced by 9.83% compared with that of the head class 'food'. This therefore also proves the effectiveness of the work of the long tail distribution proposed by the present invention from an experimental point of view.

Secondly, for a head class 'food', the method of the invention is lower than the AS-Capsules of a comparison method, and possible reasons are two, one is that the model of the invention is more in processing the tail class, and more weight is distributed to the tail class when the fine-grained aspect is fused, so more information of the tail class is concerned when an attention mechanism is used; secondly, the use of weight balancing weights: in the improved loss function, the weight of the head classes is reduced, and the suppression effect is more obvious when the number of the head classes is larger, so that the prediction effect of the head classes can be weakened to a certain extent.

Some steps in the embodiments of the present invention may be implemented by software, and the corresponding software program may be stored in a readable storage medium, such as an optical disc or a hard disk.

The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims

1. An aspect category identification method under a long tail distribution scene is characterized in that the method identifies a data set

is the l-th sentence S_lA corresponding aspect category label;

the method comprises the following steps:

Separately deriving text-embedded vectors

Sum aspect embedding vector

And step 3: embedding the text into a vector

Embedding vectors with the aspect

And

and 5: inputting the total aspect vector s into an attention mechanism for fusing context aspect-level semantic information, and calculating a fusion vector

Step 6: using the fused vector

As a predicted sentence representation, for the jth aspect class, as shown in equation (1):

wherein, W_j∈R^d×1，b_jIs a scalar quantity of the input signals,

2. The method of claim 1, wherein the step of calculating the total aspect vector of the fused long tail distribution features in the IAN-LoT mechanism comprises:

step 41: hidden state for input H_wAnd H_aCalculating an interaction attention weight matrix I e R^n×mAs shown in equation (2):

step 42: performing softmax calculations for each row of the interaction attention weight matrix, as shown in equation (3):

step 43: and then introducing a data long tail distribution characteristic for the matrix k, as shown in formula (4):

wherein the content of the first and second substances,

step 44: for the

The multiplication results in the final overall aspect vector representation s, as shown in equation (5):

wherein s ∈ R^1×d。

3. The method of claim 2, wherein the fused vector of fused context aspect-level semantic information

The calculation process of (2) includes:

summing the total aspect vector s with

As input, a fusion vector is calculated

As shown in equation (6):

Input to the attention mechanism, generating an attention weight vector for each predefined facet class; as shown in equation (7), for the jth aspect class:

wherein W_j∈R^d×d,b_j∈R^dAnd u_j∈R^dFor a learnable parameter, β ∈ R^1×mRepresenting the long tail distribution characteristic learned in advance, and is the reciprocal number of effective samples in the training set, alpha_j∈RⁿIs the attention weight vector.

4. The method of claim 3, wherein the method trains the recognition model using an improved A-DB loss function that improves the way the rebalancing weights are computed and the smoothing function, and comprises:

first, without considering tag co-occurrence,

wherein the content of the first and second substances,

when in use

Indicating that the ith sentence contains the jth aspect category a_j，

If not, then not included;

weight balancing weights

The calculation is shown in equation (9):

wherein gamma is a coordination weight hyperparameter;

the smoothing function is to

The formula of the mapping is:

the A-DB loss function is shown in equation (12):

wherein the content of the first and second substances,

a probability value output for the network.

5. The method of claim 1, wherein the classification threshold is 0.5.

6. An aspect category identification system under a long tail distribution scene, the system comprising: the system comprises an input module, a text embedding module, an LSTM module, an IAN-LOT module, a fusion module, an attention mechanism module and a prediction module;

7. The system of claim 6, wherein the system is to a data set

is the l-th sentence S_lA corresponding aspect category label;

Separately deriving text-embedded vectors

Sum aspect embedding vector

And step 3: embedding the text into a vector

Embedding vectors with the aspect

And

Step 6: using vectors

wherein, W_j∈R^d×1，b_jIs a scalar quantity of the input signals,

8. The system of claim 7, wherein the step of calculating the total aspect vector of the fused long tail distribution features in the IAN-LoT mechanism comprises:

wherein the content of the first and second substances,

step 44: for the

wherein s ∈ R^1×d。

9. The system of claim 8, wherein the vector representation of fused contextual aspect-level semantic information

The calculation process of (2) includes:

summing the total aspect vector s with

As input, a fusion vector is calculated

As shown in equation (6):

10. The system of claim 9, wherein the system trains the recognition model using an improved a-DB loss function that improves the way rebalancing weights are computed and the smoothing function, and further comprising:

first, without considering tag co-occurrence,