CN114880473A - Label classification method and device, storage medium and electronic equipment - Google Patents


Info

Publication number
CN114880473A
CN114880473A
Authority
CN
China
Prior art keywords
label
tag
classification
matrix
item
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210468316.8A
Other languages
Chinese (zh)
Other versions
CN114880473B (en
Inventor
檀彦超
李龙飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd filed Critical Alipay Hangzhou Information Technology Co Ltd
Priority to CN202210468316.8A priority Critical patent/CN114880473B/en
Priority claimed from CN202210468316.8A external-priority patent/CN114880473B/en
Publication of CN114880473A publication Critical patent/CN114880473A/en
Application granted granted Critical
Publication of CN114880473B publication Critical patent/CN114880473B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/35 - Clustering; Classification
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 - Handling natural language data
    • G06F 40/30 - Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The specification discloses a tag classification method and apparatus, a storage medium, and an electronic device. The method includes: obtaining an item tag matrix that contains a plurality of items, a plurality of tags, and the correspondences between the items and the tags; obtaining a tag embedding vector set based on all tags in the item tag matrix; and, based on a hyperbolic space model, performing hierarchical classification processing on all the tags using the item tag matrix and the tag embedding vector set to obtain a tag classification function.

Description

Label classification method and device, storage medium and electronic equipment
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a method and an apparatus for classifying tags, a storage medium, and an electronic device.
Background
In the prior art, recommendations can be made according to a user's interests and habits: the items and tags the user has accessed are classified with the help of a knowledge graph and then recommended. Because a knowledge graph emphasizes the semantic relationships among tags, its coverage is broad but its targeting is poor and its accuracy is low. A classification method that focuses on the hierarchical relationships among tags and offers high accuracy is therefore needed.
Disclosure of Invention
The embodiments of this specification provide a tag classification method and apparatus, a storage medium, and an electronic device. By using a hyperbolic model to obtain the hierarchical relationships between tags, a tag classification function can be obtained that embodies the hierarchy of tag classification, thereby improving the accuracy and personalization of tag recommendation for users. The technical scheme is as follows:
In a first aspect, an embodiment of this specification provides a tag classification method, the method including:
acquiring an item tag matrix, where the item tag matrix includes a plurality of items, a plurality of tags, and the correspondences between the items and the tags;
acquiring a tag embedding vector set based on all tags in the item tag matrix;
and, based on a hyperbolic space model, performing hierarchical classification processing on all the tags using the item tag matrix and the tag embedding vector set to obtain a tag classification function.
In a second aspect, embodiments of this specification provide a tag classification apparatus, the apparatus including:
a matrix acquisition module, configured to acquire an item tag matrix, where the item tag matrix includes a plurality of items, a plurality of tags, and the correspondences between the items and the tags;
an embedded vector acquisition module, configured to acquire a tag embedding vector set based on all tags in the item tag matrix;
and a classification function acquisition module, configured to perform, based on a hyperbolic space model, hierarchical classification processing on all the tags using the item tag matrix and the tag embedding vector set to obtain a tag classification function.
In a third aspect, embodiments of the present specification provide a computer storage medium storing a plurality of instructions adapted to be loaded by a processor and to perform the above-mentioned method steps.
In a fourth aspect, embodiments of the present specification provide an electronic device, which may include: a processor and a memory; wherein the memory stores a computer program adapted to be loaded by the processor and to perform the above-mentioned method steps.
The technical solutions provided by some embodiments of this specification bring at least the following beneficial effects:
In one or more embodiments of this specification, an item tag matrix is obtained that includes a plurality of items, a plurality of tags, and the correspondence between each item and each tag; a tag embedding vector set is obtained based on all tags in the item tag matrix; and, based on a hyperbolic space model, hierarchical classification processing is performed on all tags using the item tag matrix and the tag embedding vector set to obtain a tag classification function. Because the hyperbolic model is used to obtain the hierarchical relationships between tags, the resulting tag classification function embodies the hierarchy of tag classification, improving the accuracy and personalization of tag recommendation for the user.
Drawings
In order to more clearly illustrate the embodiments of the present specification or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present specification, and other drawings can be obtained by those skilled in the art without creative efforts.
Fig. 1 is an exemplary schematic diagram of tag classification provided in an embodiment of this specification;
Fig. 2 is a schematic flow chart of a tag classification method provided in an embodiment of this specification;
Fig. 3 is a schematic flow chart of a tag classification method provided in an embodiment of this specification;
Fig. 4 is an exemplary schematic diagram of hierarchical tag classification provided by an embodiment of this specification;
Fig. 5 is a schematic structural diagram of a tag classification apparatus provided in an embodiment of this specification;
Fig. 6 is a schematic structural diagram of a classification function acquisition module provided in an embodiment of this specification;
Fig. 7 is a schematic structural diagram of an electronic device provided in an embodiment of this specification.
Detailed Description
The technical solutions in the embodiments of the present specification will be clearly and completely described below with reference to the drawings in the embodiments of the present specification, and it is obvious that the described embodiments are only a part of the embodiments of the present specification, and not all of the embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present specification without any creative effort belong to the protection scope of the present specification.
The tag classification method provided by the embodiments of this specification may be implemented as a computer program and executed on a tag classification device based on a von Neumann architecture. The computer program may be integrated into an application, or may run as a separate tool-type application. The tag classification device in the embodiments of this specification may be, but is not limited to, a terminal device such as a mobile phone, a personal computer, a tablet computer, a handheld device, a vehicle-mounted device, a wearable device, a computing device, or another processing device connected to a wireless modem; it may also be a service device, such as a server, connected to the terminal device and capable of acquiring user history data from it. The user history data may be the historical browsing data generated when the user uses the terminal device, and may include user history item data and user history tag data: the former consists of the items the user has accessed and browsed, the latter of the tags the user has clicked and accessed. It can be understood that one item may correspond to a plurality of tags. An item may be an article title, a program name, or the like encountered when the user browses the web, whose content the user can open by clicking the item, for example "method of fitness training" or "local food recommendation". A tag may be a keyword describing the item; for example, the item "Japanese cuisine" may correspond to tags such as "sashimi" and "sushi", and the user can also query the items bearing a tag by clicking that tag.
Referring to fig. 1, an exemplary schematic diagram of tag classification is provided for the embodiments of this specification. The tag classification device may obtain user history data from the terminal device the user is logged into; from this data it may obtain all items the user has clicked and accessed and the tags corresponding to those items, obtain the embedded vector (embedding) of every tag, and form a tag embedding vector set, where the embedded vector represents the characterization data of a tag. It can be understood that each tag can be regarded as a point in a multidimensional space: the characterization data of the tag is represented as an embedded vector in that space and includes data information of the tag in multiple dimensions, such as the user's number of clicks and the similarity to other tags. Based on a hyperbolic space model, the tag classification device can perform hierarchical classification processing on all tags using the tag embedding vector set and the user history data to obtain a tag classification function; with this function it can hierarchically classify all tags related to the user and perform tag recommendation for the user.
The tag classification method provided in this specification will be described in detail below with reference to specific examples.
Referring to fig. 2, a schematic flow chart of a tag classification method is provided for an embodiment of the present disclosure. As shown in fig. 2, the method of embodiments of the present specification may include the following steps S102-S106.
S102: acquire an item tag matrix.
Specifically, the tag classification device may obtain user history data, which includes user history item data and user history tag data. The device may obtain the user history data according to the identity of the user; for example, the identity may be a social account, and the device may obtain the browsing records of the user's social account on the network and extract the user history data from them. From the user history data the device can obtain the user history item data and the user history tag data. Because of the correspondence between items and tags, one item may correspond to a plurality of tags; an item tag matrix can therefore be generated from all items in the user history data, all tags, and the correspondences between them. The item tag matrix may include a plurality of items, a plurality of tags, and the correspondence between each item and each tag.
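As an illustration of this step, the following is a minimal sketch of building such an item tag matrix from a user's browsing history. The item and tag names are hypothetical, and the patent does not prescribe any particular data layout:

```python
def build_item_tag_matrix(history):
    """Build a binary item tag matrix.

    `history` maps each item the user has browsed to the list of tags
    it carries; rows are items, columns are tags, and a 1 marks a
    correspondence between an item and a tag.
    """
    items = sorted(history)
    tags = sorted({t for ts in history.values() for t in ts})
    tag_idx = {t: j for j, t in enumerate(tags)}
    matrix = [[0] * len(tags) for _ in items]
    for i, item in enumerate(items):
        for t in history[item]:
            matrix[i][tag_idx[t]] = 1
    return items, tags, matrix
```

Rows and columns are sorted so the matrix layout is deterministic.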
S104: acquire a tag embedding vector set based on all tags in the item tag matrix.
Specifically, the tag classification device may obtain the embedded vector of each tag in the item tag matrix to form the tag embedding vector set.
S106: based on the hyperbolic space model, perform hierarchical classification processing on all tags using the item tag matrix and the tag embedding vector set to obtain a tag classification function.
Specifically, the hyperbolic space model captures the hierarchical relationship between tags well, so the tag classification device can, based on the hyperbolic space model, feed the item tag matrix and the tag embedding vector set into the model and perform hierarchical classification processing on the tags to obtain a tag classification function. For example, hierarchical classification processing may be performed on all tags to obtain a tree hierarchical classification structure; that is, all tags are divided into a plurality of tag levels, each tag level including a plurality of tag sets. In order to bring the points corresponding to tags of the same tag set or the same tag level closer together in the multidimensional space, classification-aware regularization may be applied to the tree hierarchical classification structure to generate the tag classification function. After the tag classification device obtains the tag classification function from the user history data, tags or items preferred by the user can be recommended using this function. For example, if the user browses content related to the tag "hot pot", the device can determine from the tag classification function that "hot pot" belongs to the tag set "Sichuan cuisine" and then recommend other tags in that set to the user, such as "Yuxiang eggplant" and "Mapo tofu".
In an embodiment of this specification, an item tag matrix is obtained that includes a plurality of items, a plurality of tags, and the correspondence between each item and each tag; a tag embedding vector set is obtained based on all tags in the item tag matrix; and, based on a hyperbolic space model, hierarchical classification processing is performed on all tags using the item tag matrix and the tag embedding vector set to obtain a tag classification function. Because the hyperbolic model is used to obtain the hierarchical relationships between tags, the resulting tag classification function embodies the hierarchy of tag classification, improving the accuracy and personalization of tag recommendation for the user.
Referring to fig. 3, a schematic flow chart of a tag classification method is provided for an embodiment of the present disclosure. As shown in fig. 3, the method of embodiments of the present specification may include the following steps S202-S216.
S202: acquire an item tag matrix.
Specifically, the tag classification device may obtain user history data, which includes user history item data and user history tag data. The device may obtain the user history data according to the identity of the user; for example, the identity may be a social account, and the device may obtain the browsing records of the user's social account on the network and extract the user history data from them. From the user history data the device can obtain the user history item data and the user history tag data. Because of the correspondence between items and tags, one item may correspond to a plurality of tags; an item tag matrix can be generated from all items in the user history data, all tags, and the correspondences between them, the item tag matrix including a plurality of items, a plurality of tags, and the correspondence between each item and each tag.
Optionally, the tag classification device obtains historical interaction data between users and items and then derives the user history data and the item tag matrix from it. The historical interaction data may be represented as a bipartite graph $\mathcal{G} = (\mathcal{U}, \mathcal{V}, \mathcal{E})$, where $\mathcal{U}$ is the set of users and $\mathcal{V}$ is the set of items. The tag classification device can obtain an implicit feedback matrix $X$, where $X_{uv} = 1$ denotes a positive sample $(u, v_p)$, i.e. user $u$ interacted with item $v_p$, and $X_{uv} = 0$ denotes a negative sample $(u, v_q)$, i.e. user $u$ did not interact with item $v_q$. The tag classification device can acquire the user history data from the implicit feedback matrix $X$ and then label each item $v$ with its tags to obtain the item tag matrix.
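The implicit feedback matrix $X$ described above can be sketched as a toy construction. The user and item identifiers are illustrative:

```python
def implicit_feedback(users, items, interactions):
    """Implicit feedback matrix X: X[u][v] = 1 for a positive sample
    (user u interacted with item v), and 0 for a negative sample."""
    pairs = set(interactions)
    return [[1 if (u, v) in pairs else 0 for v in items] for u in users]
```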
S204: acquire a tag embedding vector set based on all tags in the item tag matrix.
Specifically, the tag classification device may obtain the embedded vector of each tag in the item tag matrix to form the tag embedding vector set.
S206: based on the Poincaré model, perform hierarchical classification processing on all tags using the item tag matrix and the tag embedding vector set, and generate a tree hierarchical classification structure.
Specifically, the Poincaré model $\mathcal{B}^d = \{x \in \mathbb{R}^d : \lVert x \rVert < 1\}$ is the set of $d$-dimensional vectors with Euclidean norm less than 1, and the Poincaré metric is:

$$d_{\mathcal{B}}(x, y) = \operatorname{arcosh}\left(1 + 2\,\frac{\lVert x - y \rVert^2}{(1 - \lVert x \rVert^2)(1 - \lVert y \rVert^2)}\right)$$

It is understood that the Poincaré model is a hyperbolic space model, so it can capture the hierarchical relationship between tags. The tag classification device can therefore, based on the Poincaré model, perform hierarchical classification processing on all tags using the item tag matrix and the tag embedding vector set to obtain a tree hierarchical classification structure.
Optionally, the tag classification apparatus may establish the tag embedding vector set in the Poincaré model, expressed as $\Theta = \{\theta_t\}_{t=1}^{S}$, where $\theta_t \in \mathcal{B}^d$ and $S$ is the number of tags.
Optionally, based on the Poincaré model, the tag classification apparatus may divide all tags into at least one tag level according to similarity, each tag level including at least one tag set, and generate a tree hierarchical classification structure from the tag levels and the tag sets. The tag classification device can divide all tags into a plurality of tag sets using an adaptive clustering algorithm. It can be understood that each tag can be represented as a point in a multidimensional space, and points corresponding to highly similar tags lie close together; the adaptive clustering algorithm merges tags whose points are close into one tag set, dividing all tags into a plurality of tag sets in which the tags of one set share similar characteristics.
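The patent does not specify the adaptive clustering algorithm itself; the following greedy stand-in merely illustrates the idea of grouping tags whose embedded points lie close together. The threshold radius and the seed-based grouping are assumptions for illustration:

```python
def cluster_by_distance(points, dist, radius):
    """Greedy stand-in for the adaptive clustering step: a tag joins the
    first cluster whose seed point lies within `radius`, otherwise it
    starts a new cluster. `points` maps tag names to embedded points,
    `dist` is any distance function on those points."""
    clusters = []
    for name, p in points.items():
        for c in clusters:
            if dist(points[c[0]], p) <= radius:
                c.append(name)
                break
        else:
            clusters.append([name])
    return clusters
```

In the method described here, `dist` would be the hyperbolic (Poincaré) distance rather than the Euclidean one.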
The tag classification device can then calculate the characterization score of each tag in a tag set using the characterization-aware score function; tags whose characterization score is higher than a score threshold are placed into the next tag level, where they are again divided into a plurality of tag sets by the adaptive clustering algorithm. When the characterization scores of all tags fall below the score threshold, the tag classification device has obtained all tag levels and tag sets, and the tree hierarchical classification structure can be generated. It can be understood that the characterization-aware score function measures the relationships between the tags within a tag set and decides whether a tag needs to be placed into the next tag level: if a tag's characterization score is higher than the score threshold, the tag is passed to the adaptive clustering of the next tag level; otherwise it is not placed into the next level for further calculation and grouping.
Optionally, the score threshold may be set initially by the tag classification device, or may be set and stored by the relevant staff. The score threshold may be a fixed numerical value, or the tags in a tag set may be sorted by characterization score from large to small and the score at a preset top fraction taken as the threshold. For example, if the preset fraction is 60% and there are 30 tags in the tag set, the tags are sorted by characterization score from large to small and the characterization score of the 18th tag becomes the score threshold.
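The top-fraction variant of the score threshold can be sketched as follows, reproducing the example of 30 tags and a 60% preset, where the 18th-ranked score becomes the threshold:

```python
def score_threshold(scores, top_fraction=0.6):
    """Return the characterization score at the top_fraction cut-off
    when the scores are sorted in descending order. With 30 scores and
    top_fraction 0.6, the 18th-ranked score becomes the threshold."""
    ranked = sorted(scores, reverse=True)
    cut = max(1, int(len(ranked) * top_fraction))
    return ranked[cut - 1]
```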
Optionally, the tag classification device may obtain the characterization score of each tag in a first tag set at the first tag level using the characterization-aware score function, and then obtain at least one first tag in the first tag set whose characterization score is higher than the score threshold, i.e. the tags that need to be placed into the next tag level. The adaptive clustering algorithm divides these first tags into at least one second tag set; each second tag set is a subset of the first tag set and belongs to the second tag level, the level immediately below the first tag level. Referring also to fig. 4, an exemplary schematic diagram of hierarchical tag classification is provided for the embodiment of this specification. As shown in fig. 4, the first tag set at the first tag level is "Chinese cuisine", which includes tags such as "hot pot", "Mapo tofu", and "roast goose". The tag classification device calculates the characterization score of each tag in the first tag set with the characterization-aware score function, obtains the first tags whose characterization scores exceed the score threshold, and then divides all first tags into at least one second tag set with the adaptive clustering algorithm. For example, subsets of the first tag set "Chinese cuisine" may be "Sichuan cuisine", "Cantonese cuisine", "Hunan cuisine", and the like; all of these are second tag sets located at the second tag level, the next level below the first tag level.
Optionally, a node of the tree hierarchical classification structure may include the tag sets $\mathcal{G} = \{G_1, G_2, \ldots, G_K\}$. It will be appreciated that each tag set $G_k$ ($1 \le k \le K$) has the same semantic granularity as the other tag sets in $\mathcal{G}$. From the item tag matrix the tag classification device can obtain $\mathcal{E}_C = [E_1, E_2, \ldots, E_K]$, where each $E_k$ ($1 \le k \le K$) corresponds to the collection of items of tag set $G_k$.
A tag $t$ in tag set $G_k$ will often appear in $E_k$, so the tag classification device can first determine the normalized frequency of tag $t$ in $G_k$:

$$\mathrm{ntf}(t, E_k) = \frac{\mathrm{tf}(t, E_k)}{\mathrm{tf}(E_k)}$$

where $\mathrm{tf}(t, E_k)$ is the number of times tag $t$ appears in $E_k$ and $\mathrm{tf}(E_k)$ is the total number of items in $E_k$. The tag classification device can then obtain the characterization-aware score function $s(t, G_k)$, the characterization score of tag $t$ in tag set $G_k$, which combines the normalized frequency with a search function $\mathrm{rank}(t, E_k)$. Since avgdl, $k_1$, and $b$ below are the usual BM25 quantities, the search function plausibly takes the BM25 saturation form:

$$\mathrm{rank}(t, E_k) = \frac{\mathrm{tf}(t, E_k)\,(k_1 + 1)}{\mathrm{tf}(t, E_k) + k_1\left(1 - b + b\,\frac{\mathrm{tf}(E_k)}{\mathrm{avgdl}}\right)}$$

where avgdl is the average number of tags per item in $E_k$, and $k_1$ and $b$ are parameters; $k_1$ may be 1.3 and $b$ may be 0.5.
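Because avgdl, $k_1 = 1.3$, and $b = 0.5$ are the standard BM25 parameters, the search function can be sketched in the standard BM25 saturation form. This is an assumption about the exact formula, not the patent's verbatim equation:

```python
def normalized_frequency(tf_t, total_items):
    """ntf(t, E_k): occurrences of tag t in E_k over the item count."""
    return tf_t / total_items

def bm25_rank(tf_t, total_items, avgdl, k1=1.3, b=0.5):
    """BM25-style saturation of the raw tag frequency tf_t, using the
    collection size and the average tags-per-item avgdl (assumed form)."""
    return (tf_t * (k1 + 1.0)) / (tf_t + k1 * (1.0 - b + b * total_items / avgdl))
```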
S208: perform classification-aware regularization processing on the tree hierarchical classification structure to generate a classification structure function.
Specifically, in order to bring the points of tags in the same tag set closer together and make the hierarchical classification more accurate and distinct, the tag classification device can perform classification-aware regularization on the tree hierarchical classification structure to generate a classification structure function $\mathcal{L}_C(\Theta)$, defined over $\Theta = \{\theta_t\}_{t=1}^{S}$, where $S$ is the number of tags and $G_k \in T_{axo}$ denotes a subset of tags in a node of the tree hierarchical classification structure; the function penalizes the distances between the embedded points of tags belonging to the same subset $G_k$.
S210: based on the item tag matrix and the tag embedding vector set, acquire an item-related embedding vector, a user-related embedding vector, an item-independent embedding vector, and a user-independent embedding vector; generate tag-related embedding vectors from the item-related and user-related embedding vectors, and tag-independent embedding vectors from the item-independent and user-independent embedding vectors.
Specifically, the classification structure function can also be used to make recommendations to the user. For example, while the user is browsing content related to the tag "hot pot", the tag classification device may recommend content related to the tag "Mapo tofu", which belongs to the same tag set "Sichuan cuisine" as "hot pot". But the device can also recommend other tags the user prefers that do not belong to the same category, based on the user history data. To this end, the tag classification device may obtain tag-related embedding vectors and tag-independent embedding vectors, where the tag-related embedding vectors comprise an item-related embedding vector $u_{tg}$ and a user-related embedding vector $v_{tg}$, and the tag-independent embedding vectors comprise an item-independent embedding vector $u_{ir}$ and a user-independent embedding vector $v_{ir}$.
The item-related embedding vector contains information about items the user actively accesses and browses, for example items the user searches for in a web search bar; similarly, the tag-related embedding vector contains information about tags the user actively accesses and browses. It can be understood that the item-independent embedding vector contains information about items the user does not actively access or browse, such as items reached through a link shared by a friend; similarly, the tag-independent embedding vector contains information about tags the user does not actively access.
S212: based on the Lorentz model, combine the tag-related embedding vectors and the tag-independent embedding vectors to generate an initial similarity function.
Specifically, the Lorentz model $\mathcal{H}^d$ is the only unbounded hyperbolic model; its points satisfy $\langle x, x \rangle_{\mathcal{L}} = -1$ with $x_0 > 0$, where

$$\langle x, y \rangle_{\mathcal{L}} = -x_0 y_0 + \sum_{i=1}^{d} x_i y_i$$

is the Lorentzian inner product, with metric tensor $g_{\mathcal{L}} = \mathrm{diag}(-1, 1, \ldots, 1)$. The corresponding distance function in the Lorentz model is:

$$d_{\mathcal{L}}(x, y) = \operatorname{arcosh}\left(-\langle x, y \rangle_{\mathcal{L}}\right)$$
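The Lorentzian inner product and distance can be sketched directly. The `lift` helper, which places a Euclidean vector onto the hyperboloid, is a convenience not named in the text:

```python
import math

def lorentz_inner(x, y):
    """Lorentzian inner product <x, y>_L = -x0*y0 + sum_i xi*yi."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))

def lorentz_distance(x, y):
    """d_L(x, y) = arcosh(-<x, y>_L) for points on the hyperboloid."""
    return math.acosh(-lorentz_inner(x, y))

def lift(v):
    """Lift a Euclidean vector v onto the hyperboloid <x, x>_L = -1
    with x0 > 0, by solving for the time-like coordinate x0."""
    return [math.sqrt(1.0 + sum(c * c for c in v))] + list(v)
```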
it will be appreciated that the proportion of the tag-related embedded vector and the tag-independent embedded vector in the recommendation process is different, and that the tag-related embedded vector is more closely matched to the user's preferences. The tag classification means can obtain the relevant weight of the relevant embedded vector and obtain the irrelevant weight of the irrelevant embedded vector. And combining the related vector and the unrelated vector based on a Lorentz model to generate an initial similarity function after combining the label-related embedded vector and the label-unrelated embedded vector.
Optionally, the tag classification device may calculate the weights of the tag-related and tag-independent embedding vectors, obtaining for each user $u$ a weight $\alpha_u \in [0, 1]$ computed from $\mathcal{V}_u$, the set of items user $u$ has interacted with, $|\mathcal{V}_u|$, the number of those items, and $\mathcal{G}_v$, the tag set corresponding to item $v$. An initial similarity function can then be obtained that scores each user and item pair by combining, with weights $\alpha_u$ and $1 - \alpha_u$, the Lorentz distances between the tag-related embedding vectors and between the tag-independent embedding vectors.
s214, a large-interval nearest neighbor classification algorithm is adopted, and a label similarity measurement function is generated based on the initial similarity function.
Specifically, the tag classification device may optimize the initial similarity function by using a Large-interval Nearest Neighbor classification algorithm (Large region Neighbor, LMNN) to obtain a tag similarity measurement function:
Figure BDA0003625460640000101
wherein the content of the first and second substances,
Figure BDA0003625460640000102
is a set of positive samples derived from the matrix X, [ (X)] + Max (x,0) is the standard hinge loss function.
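A hinge-loss penalty in the LMNN spirit can be sketched as follows; the margin value and the per-pair decomposition are assumptions for illustration, not the patent's exact objective:

```python
def hinge(x):
    """Standard hinge [x]_+ = max(x, 0)."""
    return max(x, 0.0)

def lmnn_loss(d_pos, d_negs, margin=1.0):
    """Large-margin penalty for one positive pair: each negative that is
    not at least `margin` farther away than the positive contributes a
    hinge term. d_pos and d_negs are hyperbolic distances."""
    return sum(hinge(margin + d_pos - d_neg) for d_neg in d_negs)
```

Minimizing such a sum over all positive pairs pushes interacted user-item pairs together and non-interacted pairs apart in the embedding space.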
S216: generate a tag classification function based on the classification structure function and the tag similarity measurement function.
Specifically, the tag classification device may generate a tag classification function based on the classification constructor and the tag similarity measure function:
(formula image omitted)

where λ is a hyper-parameter controlling the strength of the label regularization.
The classification construction function can recommend tags to users according to the tag levels and tag sets in the tree-shaped hierarchical classification structure, while the tag similarity measurement function brings closer together tags that users frequently browse at the same time and are interested in. For example, when a user is browsing content related to the tag "hot pot", the tag classification device may determine from the classification construction function that the tag "wife bean curd" belongs to the same tag set "chuhai" as "hot pot", and may determine from the tag similarity measurement function that the user prefers to browse content related to the tag "milk tea" while browsing content related to "hot pot". The tag classification device may therefore use the tag classification function to recommend the tags "wife bean curd" and "milk tea" when the user browses content related to "hot pot".
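The joint ranking of candidate tags by the two functions can be sketched as a λ-weighted sum. The concrete scores and the additive combination are assumptions, since the patent's tag classification function is given only as a formula image:

```python
def tag_score(classification_score, similarity_score, lam=0.5):
    """Combined recommendation score: the hierarchy-based term from
    the classification construction function plus a lambda-weighted
    similarity term (lambda is the regularization hyper-parameter)."""
    return classification_score + lam * similarity_score

# Hypothetical scores for two candidate tags while the user is
# browsing content tagged "hot pot"
candidates = {
    "wife bean curd": tag_score(0.9, 0.2),  # same tag set as "hot pot"
    "milk tea": tag_score(0.3, 0.8),        # frequently co-browsed
}
ranked = sorted(candidates, key=candidates.get, reverse=True)
```

Both candidates surface in the final ranking: one because the hierarchy places it in the same tag set, the other because the similarity metric reflects co-browsing behavior.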
In the embodiment of this description, an item tag matrix is acquired based on user history data, which includes user history item data and user history tag data; a tag embedding vector set is acquired based on all tags in the item tag matrix; and all tags are hierarchically classified based on a Poincaré model to generate a tree-shaped hierarchical classification structure, the Poincaré model capturing the hierarchical relationships among the tags. A characterization-aware scoring function determines whether a tag can be placed into the next tag level, and an adaptive clustering algorithm divides the tags into a plurality of tag subsets. Classification-aware regularization is then applied to the tree-shaped hierarchical classification structure to generate a classification construction function, so that similar tags are drawn closer together, further improving the accuracy of tag classification and avoiding errors. The tag-related embedding vector and the tag-unrelated embedding vector are then combined based on a Lorentz model to generate an initial similarity function, and a tag similarity measurement function is generated from the initial similarity function using a large-margin nearest neighbor classification algorithm. Finally, a tag classification function is generated based on the classification construction function and the tag similarity measurement function. Because the hyperbolic models capture the hierarchical relationships among the tags, the resulting tag classification function reflects the hierarchy of tag classification, improving the accuracy and personalization of tag recommendation for the user.
In addition, tags in the same tag set can be recommended to the user according to the hierarchical classification result, and tags frequently browsed at the same time in the user's history can be recommended according to the user's preferences, further improving the intelligence of tag classification and tag recommendation.
The following describes in detail a tag classification apparatus provided in an embodiment of this specification with reference to fig. 5 and fig. 6. It should be noted that the tag classification apparatus in fig. 5 is used for executing the methods of the embodiments shown in fig. 2 and fig. 3 of this specification; for convenience of description, only the portions related to the embodiments of this specification are shown. For specific technical details not disclosed here, please refer to the embodiments shown in fig. 2 and fig. 3 of this specification.
Referring to fig. 5, a schematic structural diagram of a tag classification apparatus according to an exemplary embodiment of this specification is shown. The tag classification apparatus may be implemented as all or part of a device by software, hardware, or a combination of both. The apparatus 1 includes a matrix obtaining module 11, an embedded vector obtaining module 12, and a classification function obtaining module 13.
A matrix obtaining module 11, configured to obtain an item tag matrix, where the item tag matrix includes multiple items, multiple tags, and a correspondence between each item and each tag;
an embedded vector obtaining module 12, configured to obtain a set of tag embedded vectors based on all tags in the item tag matrix;
and the classification function obtaining module 13 is configured to perform hierarchical classification processing on all the tags by using the item tag matrix and the tag embedding vector set based on a hyperbolic space model to obtain a tag classification function.
Specifically, please refer to fig. 6, which provides a schematic structural diagram of a classification function obtaining module according to an embodiment of the present disclosure. As shown in fig. 6, the classification function obtaining module 13 may include:
a constructor obtaining unit 131, configured to perform hierarchical classification processing on all the tags by using the item tag matrix and the tag embedding vector set based on a poincare model to obtain a classification constructor;
an embedded vector acquisition unit 132 configured to acquire a tag-related embedded vector and a tag-unrelated embedded vector based on the item tag matrix and the set of tag embedded vectors;
a metric function obtaining unit 133, configured to combine the tag-related embedding vector and the tag-unrelated embedding vector based on a lorentz model to generate a tag similarity metric function;
a classification function obtaining unit 134, configured to generate a label classification function based on the classification constructor and the label similarity measure function.
Optionally, the constructor obtaining unit 131 is specifically configured to perform hierarchical classification processing on all the tags by using the item tag matrix and the tag embedding vector set based on a poincare model, so as to generate a tree-like hierarchical classification structure;
and carrying out classification perception regularization processing on the tree-shaped hierarchical classification structure to generate a classification construction function.
Optionally, the constructor obtaining unit 131 is specifically configured to divide all the tags into at least one tag level by using the item tag matrix and the tag embedding vector set based on a poincare model, where each tag level includes at least one tag set;
a tree-like hierarchical classification structure is generated based on the at least one label level and all label sets.
Optionally, the constructor obtaining unit 131 is specifically configured to obtain a characterization score of each tag in a first tag set at a first tag level by using a characterization-aware scoring function based on a Poincaré model, the item tag matrix, and the tag embedding vector set;
obtain at least one first tag in the first tag set whose characterization score is higher than a scoring threshold;
and divide the at least one first tag into at least one second tag set by using an adaptive clustering algorithm, where the second tag set is a subset of the first tag set and belongs to a second tag level, and the second tag level is the next tag level after the first tag level.
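The hierarchy-building steps above (Poincaré embedding, characterization scoring, level splitting) can be sketched as follows. The Poincaré ball distance is the standard one; using the embedding norm as the characterization score and a fixed scoring threshold are assumptions made for illustration, since in hyperbolic embeddings more specific items commonly sit nearer the boundary of the ball:

```python
import numpy as np

def poincare_dist(u, v):
    """Distance between two points in the Poincare ball model:
    arccosh(1 + 2*||u-v||^2 / ((1-||u||^2)(1-||v||^2)))."""
    diff = np.dot(u - v, u - v)
    denom = (1.0 - np.dot(u, u)) * (1.0 - np.dot(v, v))
    return np.arccosh(1.0 + 2.0 * diff / denom)

def split_next_level(tags, embeds, threshold=0.5):
    """Assumed characterization score: the norm of a tag's Poincare
    embedding. Tags scoring above the threshold move down to the
    next tag level (and would then be clustered into subsets)."""
    return [t for t in tags if np.linalg.norm(embeds[t]) > threshold]

embeds = {"food":     np.array([0.1, 0.0]),   # general: near the origin
          "hot pot":  np.array([0.6, 0.3]),   # specific: near the boundary
          "milk tea": np.array([0.0, 0.7])}
deeper = split_next_level(list(embeds), embeds)
d = poincare_dist(embeds["hot pot"], embeds["milk tea"])
```

An off-the-shelf clustering routine (e.g. k-means as a stand-in for the adaptive clustering algorithm) would then partition `deeper` into the second-level tag sets.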
Optionally, the embedded vector obtaining unit 132 is specifically configured to obtain an item-related embedded vector, a user-related embedded vector, an item-unrelated embedded vector, and a user-unrelated embedded vector based on the item tag matrix and the tag embedding vector set;
and to generate a tag-related embedded vector based on the item-related embedded vector and the user-related embedded vector, and a tag-unrelated embedded vector based on the item-unrelated embedded vector and the user-unrelated embedded vector.
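How the tag-level vectors are derived from the item-level and user-level vectors is not spelled out here; a minimal sketch under the assumption that they are simply averaged element-wise:

```python
import numpy as np

def tag_vector(item_vec, user_vec):
    """Assumed combination: element-wise mean of the item-level and
    user-level embedded vectors (the patent only states that the
    tag-level vector is generated from the two)."""
    return 0.5 * (np.asarray(item_vec, dtype=float)
                  + np.asarray(user_vec, dtype=float))

tag_rel = tag_vector([0.2, 0.4], [0.6, 0.0])    # tag-related vector
tag_irrel = tag_vector([0.0, 0.0], [0.2, 0.2])  # tag-unrelated vector
```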
Optionally, the metric function obtaining unit 133 is specifically configured to combine the tag-related embedded vector and the tag-unrelated embedded vector based on a Lorentz model to generate an initial similarity function;
and generating a label similarity measurement function based on the initial similarity function by adopting a large-interval nearest neighbor classification algorithm.
Optionally, the metric function obtaining unit 133 is specifically configured to obtain a related weight of the tag-related embedded vector and an unrelated weight of the tag-unrelated embedded vector;
and combining the label-related embedded vector and the label-unrelated embedded vector based on a Lorentz model and the related weight and the unrelated weight to generate an initial similarity function.
In this embodiment, an item tag matrix is obtained, where the item tag matrix includes a plurality of items, a plurality of tags, and the correspondence between each item and each tag; a tag embedding vector set is obtained based on all tags in the item tag matrix; and all tags are hierarchically classified based on a Poincaré model to generate a tree-shaped hierarchical classification structure, the Poincaré model capturing the hierarchical relationships among the tags. A characterization-aware scoring function determines whether a tag can be placed into the next tag level, and an adaptive clustering algorithm divides the tags into a plurality of tag subsets. Classification-aware regularization is then applied to the tree-shaped hierarchical classification structure to generate a classification construction function, so that similar tags are drawn closer together, further improving the accuracy of tag classification and avoiding errors. The tag-related embedding vector and the tag-unrelated embedding vector are then combined based on a Lorentz model to generate an initial similarity function, and a tag similarity measurement function is generated from the initial similarity function using a large-margin nearest neighbor classification algorithm. Finally, a tag classification function is generated based on the classification construction function and the tag similarity measurement function. Because the hyperbolic models capture the hierarchical relationships among the tags, the resulting tag classification function reflects the hierarchy of tag classification, improving the accuracy and personalization of tag recommendation for the user.
In addition, tags in the same tag set can be recommended to the user according to the hierarchical classification result, and tags frequently browsed at the same time in the user's history can be recommended according to the user's preferences, further improving the intelligence of tag classification and tag recommendation.
Referring to fig. 7, a block diagram of an electronic device according to an exemplary embodiment of the present disclosure is shown. The electronic device in this specification may include one or more of the following components: a processor 110, a memory 120, an input device 130, an output device 140, and a bus 150. The processor 110, memory 120, input device 130, and output device 140 may be connected by a bus 150.
Processor 110 may include one or more processing cores. The processor 110 connects various parts of the electronic device using various interfaces and lines, and performs various functions of the terminal 100 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 120 and by calling data stored in the memory 120. Alternatively, the processor 110 may be implemented in hardware using at least one of Digital Signal Processing (DSP), Field-Programmable Gate Array (FPGA), and Programmable Logic Array (PLA). The processor 110 may integrate one or more of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a modem, and the like. The CPU mainly handles the operating system, the user interface, application programs, and the like; the GPU is responsible for rendering and drawing display content; and the modem handles wireless communication. It is understood that the modem may also not be integrated into the processor 110 and may instead be implemented by a separate communication chip.
The Memory 120 may include a Random Access Memory (RAM) or a Read-Only Memory (ROM). Optionally, the memory 120 includes a non-transitory computer-readable storage medium. The memory 120 may be used to store instructions, programs, code, code sets, or instruction sets. The memory 120 may include a program storage area and a data storage area, where the program storage area may store instructions for implementing an operating system, instructions for implementing at least one function (such as a touch function, a sound playing function, or an image playing function), instructions for implementing the above method embodiments, and the like; the operating system may be an Android system (including systems based on deep development of the Android system), an iOS system developed by Apple (including systems based on deep development of the iOS system), or another system.
The memory 120 may be divided into an operating system space, in which an operating system runs, and a user space, in which native and third-party applications run. In order to ensure that different third-party application programs can achieve a better operation effect, the operating system allocates corresponding system resources for the different third-party application programs. However, the requirements of different application scenarios in the same third-party application program on system resources are different, for example, in a local resource loading scenario, the third-party application program has a higher requirement on the disk reading speed; in the animation rendering scene, the third-party application program has a high requirement on the performance of the GPU. The operating system and the third-party application program are independent from each other, and the operating system cannot sense the current application scene of the third-party application program in time, so that the operating system cannot perform targeted system resource adaptation according to the specific application scene of the third-party application program.
In order to enable the operating system to distinguish a specific application scenario of the third-party application program, data communication between the third-party application program and the operating system needs to be opened, so that the operating system can acquire current scenario information of the third-party application program at any time, and further perform targeted system resource adaptation based on the current scenario.
The input device 130 is used for receiving input instructions or data, and the input device 130 includes, but is not limited to, a keyboard, a mouse, a camera, a microphone, or a touch device. The output device 140 is used for outputting instructions or data, and the output device 140 includes, but is not limited to, a display device, a speaker, and the like. In one example, the input device 130 and the output device 140 may be combined, and the input device 130 and the output device 140 are touch display screens.
The touch display screen may be designed as a full screen, a curved screen, or a special-shaped screen. The touch display screen may also be designed as a combination of a full screen and a curved screen, or a combination of a special-shaped screen and a curved screen, which is not limited in this specification.
In addition, those skilled in the art will appreciate that the configurations of the electronic devices illustrated in the above-described figures do not constitute limitations on the electronic devices, which may include more or fewer components than illustrated, or some components may be combined, or a different arrangement of components. For example, the electronic device further includes a radio frequency circuit, an input unit, a sensor, an audio circuit, a Wireless Fidelity (WiFi) module, a power supply, a bluetooth module, and other components, which are not described herein again.
In the electronic device shown in fig. 7, the processor 110 may be configured to invoke the tag classification application stored in the memory 120 and specifically perform the following operations:
acquiring an item label matrix, wherein the item label matrix comprises a plurality of items, a plurality of labels, and correspondences between the items and the labels;
acquiring a label embedding vector set based on all labels in the item label matrix;
and based on a hyperbolic space model, performing hierarchical classification processing on all the labels by adopting the item label matrix and the label embedding vector set to obtain a label classification function.
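The first of these operations, building the item label matrix, can be sketched as a binary incidence matrix over items and labels; the sample items, tags, and random embedding initialization below are assumptions, not data from the patent:

```python
import numpy as np

# Hypothetical items and tags (the patent does not fix a dataset)
items = ["v1", "v2", "v3"]
tags = ["hot pot", "milk tea", "Sichuan"]
pairs = {("v1", "hot pot"), ("v1", "Sichuan"), ("v2", "milk tea")}

# item_tag[i, j] = 1 when item i carries tag j
item_tag = np.zeros((len(items), len(tags)), dtype=int)
for i, v in enumerate(items):
    for j, t in enumerate(tags):
        if (v, t) in pairs:
            item_tag[i, j] = 1

# One embedding vector per tag; randomly initialized here, whereas
# the method would learn these in hyperbolic space
rng = np.random.default_rng(0)
tag_embeds = {t: rng.normal(size=4) for t in tags}
```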
In an embodiment, when the processor 110 performs hierarchical classification processing on all the tags by using the item tag matrix and the tag embedding vector set based on the hyperbolic space model to obtain a tag classification function, the following operations are specifically performed:
based on a Poincare model, performing hierarchical classification processing on all the tags by using the item label matrix and the label embedding vector set to obtain a classification construction function;
acquiring a label-related embedded vector and a label-unrelated embedded vector based on the item label matrix and the label embedded vector set;
combining the tag-dependent embedding vector and the tag-independent embedding vector based on a Lorentz model to generate a tag similarity measure function;
and generating a label classification function based on the classification construction function and the label similarity measurement function.
In an embodiment, when the processor 110 performs hierarchical classification processing on all the tags by using the item tag matrix and the tag embedding vector set based on a poincare model to obtain a classification constructor, the following operations are specifically performed:
based on a Poincare model, adopting the item label matrix and the label embedding vector set to carry out hierarchical classification processing on all labels to generate a tree-shaped hierarchical classification structure;
and carrying out classification perception regularization processing on the tree-shaped hierarchical classification structure to generate a classification construction function.
In an embodiment, when the processor 110 performs hierarchical classification processing on all the tags by using the item tag matrix and the tag embedding vector set based on a poincare model to generate a tree-like hierarchical classification structure, the following operations are specifically performed:
based on a Poincare model, dividing all the labels into at least one label level by adopting the item label matrix and the label embedding vector set, wherein each label level comprises at least one label set;
a tree-like hierarchical classification structure is generated based on the at least one label level and all label sets.
In an embodiment, when dividing all the tags into at least one tag level by using the item tag matrix and the tag embedding vector set based on a Poincare model, the processor 110 specifically performs the following operations:
based on a Poincare model, the item label matrix and the label embedding vector set, adopting a characterization perception score function to obtain a characterization score of each label in a first label set at a first label level;
obtaining at least one first label in the first label set whose characterization score is higher than the scoring threshold;
and dividing the at least one first label into at least one second label set by adopting an adaptive clustering algorithm, wherein the second label set is a subset of the first label set and belongs to a second label level, and the second label level is the next label level of the first label level.
In one embodiment, when the processor 110 obtains the tag-related embedded vector and the tag-independent embedded vector based on the item tag matrix and the set of tag embedded vectors, the following operations are specifically performed:
acquiring an item-related embedded vector, a user-related embedded vector, an item-unrelated embedded vector, and a user-unrelated embedded vector based on the item tag matrix and the tag embedding vector set;
generating a tag-dependent embedded vector based on the item-dependent embedded vector and the user-dependent embedded vector, and generating a tag-independent embedded vector based on the item-independent embedded vector and the user-independent embedded vector.
In one embodiment, when combining the tag-related embedded vector and the tag-unrelated embedded vector based on a Lorentz model to generate the tag similarity measurement function, the processor 110 specifically performs the following operations:
combining the label-related embedding vector and the label-unrelated embedding vector based on a Lorentz model to generate an initial similarity function;
and generating a label similarity measurement function based on the initial similarity function by adopting a large-interval nearest neighbor classification algorithm.
In one embodiment, when combining the tag-related embedded vector and the tag-unrelated embedded vector based on a Lorentz model to generate the initial similarity function, the processor 110 specifically performs the following operations:
acquiring the relevant weight of the relevant embedded vector, and acquiring the irrelevant weight of the irrelevant embedded vector;
and combining the label-related embedded vector and the label-unrelated embedded vector based on a Lorentz model and the related weight and the unrelated weight to generate an initial similarity function.
In this embodiment, an item tag matrix is obtained, where the item tag matrix includes a plurality of items, a plurality of tags, and the correspondence between each item and each tag; a tag embedding vector set is obtained based on all tags in the item tag matrix; and all tags are hierarchically classified based on a Poincaré model to generate a tree-shaped hierarchical classification structure, the Poincaré model capturing the hierarchical relationships among the tags. A characterization-aware scoring function determines whether a tag can be placed into the next tag level, and an adaptive clustering algorithm divides the tags into a plurality of tag subsets. Classification-aware regularization is then applied to the tree-shaped hierarchical classification structure to generate a classification construction function, so that similar tags are drawn closer together, further improving the accuracy of tag classification and avoiding errors. The tag-related embedding vector and the tag-unrelated embedding vector are then combined based on a Lorentz model to generate an initial similarity function, and a tag similarity measurement function is generated from the initial similarity function using a large-margin nearest neighbor classification algorithm. Finally, a tag classification function is generated based on the classification construction function and the tag similarity measurement function. Because the hyperbolic models capture the hierarchical relationships among the tags, the resulting tag classification function reflects the hierarchy of tag classification, improving the accuracy and personalization of tag recommendation for the user.
In addition, tags in the same tag set can be recommended to the user according to the hierarchical classification result, and tags frequently browsed at the same time in the user's history can be recommended according to the user's preferences, further improving the intelligence of tag classification and tag recommendation.
An embodiment of this specification further provides a computer storage medium that may store a plurality of instructions suitable for being loaded by a processor to execute the tag classification method of the embodiments shown in fig. 2 to fig. 4; for the specific execution process, refer to the specific descriptions of the embodiments shown in fig. 2 to fig. 4, which are not repeated here.
This specification further provides a computer program product storing at least one instruction, the at least one instruction being loaded by a processor to execute the tag classification method of the embodiments shown in fig. 1 to fig. 4; for the specific execution process, refer to the specific descriptions of the embodiments shown in fig. 1 to fig. 4, which are not repeated here.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing relevant hardware; the program can be stored in a computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a read-only memory, or a random access memory.
The above disclosure describes only preferred embodiments of this specification and is not intended to limit its scope; equivalent variations made according to the claims of this specification therefore still fall within its scope.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.

Claims (11)

1. A method of tag classification, the method comprising:
acquiring an item label matrix, wherein the item label matrix comprises a plurality of items, a plurality of labels, and correspondences between the items and the labels;
acquiring a label embedding vector set based on all labels in the item label matrix;
and based on a hyperbolic space model, performing hierarchical classification processing on all the labels by adopting the item label matrix and the label embedding vector set to obtain a label classification function.
2. The method according to claim 1, wherein the hierarchical classification processing is performed on all the tags by using the item tag matrix and the tag embedding vector set based on the hyperbolic space model to obtain a tag classification function, and the method comprises:
based on a Poincare model, performing hierarchical classification processing on all the labels by using the item label matrix and the label embedding vector set to obtain a classification construction function;
acquiring a label-related embedded vector and a label-unrelated embedded vector based on the item label matrix and the label embedded vector set;
combining the tag-dependent embedding vector and the tag-independent embedding vector based on a Lorentz model to generate a tag similarity measure function;
and generating a label classification function based on the classification construction function and the label similarity measurement function.
3. The method according to claim 2, wherein the performing hierarchical classification processing on all the tags by using the item tag matrix and the tag embedding vector set based on the poincare model to obtain a classification constructor comprises:
based on a Poincare model, adopting the item label matrix and the label embedding vector set to carry out hierarchical classification processing on all labels to generate a tree-shaped hierarchical classification structure;
and carrying out classification perception regularization processing on the tree-shaped hierarchical classification structure to generate a classification construction function.
4. The method according to claim 3, wherein the hierarchical classification processing is performed on all the tags by using the item tag matrix and the tag embedding vector set based on the poincare model to generate a tree-like hierarchical classification structure, including:
based on a Poincare model, dividing all the labels into at least one label level by adopting the item label matrix and the label embedding vector set, wherein each label level comprises at least one label set;
a tree-like hierarchical classification structure is generated based on the at least one label level and all label sets.
5. The method according to claim 4, wherein the dividing all the labels into at least one label level by using the item label matrix and the label embedding vector set based on the Poincare model comprises:
based on a Poincare model, the item label matrix and the label embedding vector set, adopting a characterization perception score function to obtain a characterization score of each label in a first label set at a first label level;
obtaining at least one first label in the first label set whose characterization score is higher than the scoring threshold;
and dividing the at least one first label into at least one second label set by adopting an adaptive clustering algorithm, wherein the second label set is a subset of the first label set and belongs to a second label level, and the second label level is the next label level of the first label level.
6. The method according to claim 2, wherein the acquiring a label-related embedded vector and a label-unrelated embedded vector based on the item label matrix and the label embedding vector set comprises:
acquiring an item-related embedded vector, a user-related embedded vector, an item-unrelated embedded vector, and a user-unrelated embedded vector based on the item label matrix and the label embedding vector set;
generating a tag-dependent embedded vector based on the item-dependent embedded vector and the user-dependent embedded vector, and generating a tag-independent embedded vector based on the item-independent embedded vector and the user-independent embedded vector.
7. The method according to claim 2, wherein the combining the label-related embedded vector and the label-unrelated embedded vector based on the Lorentz model to generate a label similarity measurement function comprises:
combining the label-related embedding vector and the label-unrelated embedding vector based on a Lorentz model to generate an initial similarity function;
and generating a label similarity measurement function based on the initial similarity function by adopting a large-interval nearest neighbor classification algorithm.
8. The method according to claim 7, wherein the combining the label-related embedded vector and the label-unrelated embedded vector based on the Lorentz model to generate an initial similarity function comprises:
acquiring the relevant weight of the relevant embedded vector, and acquiring the irrelevant weight of the irrelevant embedded vector;
and combining the label-related embedded vector and the label-unrelated embedded vector based on a Lorentz model and the related weight and the unrelated weight to generate an initial similarity function.
9. A label classification apparatus, the apparatus comprising:
a matrix acquisition module, configured to acquire an item tag matrix, wherein the item tag matrix comprises a plurality of items, a plurality of tags, and correspondences between the items and the tags;
an embedding vector acquisition module, configured to acquire a tag embedding vector set based on all tags in the item tag matrix;
and a classification function acquisition module, configured to perform hierarchical classification on all the tags using the item tag matrix and the tag embedding vector set based on a hyperbolic space model, to obtain a tag classification function.
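The three modules of the apparatus in claim 9 can be sketched as a toy pipeline. Every concrete choice below (item-incidence vectors as tag embeddings, nearest-reference grouping in place of the hyperbolic-space hierarchical classification) is an illustrative stand-in, not the patented method.

```python
import math
from collections import defaultdict

class MatrixAcquisitionModule:
    """Builds the item tag matrix from raw (item, tag) pairs."""
    def acquire(self, pairs):
        matrix = defaultdict(set)
        for item, tag in pairs:
            matrix[item].add(tag)
        return dict(matrix)

class EmbeddingAcquisitionModule:
    """Toy tag embedding: each tag is represented by its item-incidence
    vector (which items carry the tag)."""
    def acquire(self, item_tag_matrix):
        items = sorted(item_tag_matrix)
        return {
            tag: [1.0 if tag in item_tag_matrix[i] else 0.0 for i in items]
            for tags in item_tag_matrix.values() for tag in tags
        }

class ClassificationFunctionModule:
    """Stand-in for the hyperbolic-space hierarchical classifier: returns
    a function mapping a tag to the group of its nearest reference tag."""
    def acquire(self, embeddings, groups):
        def classify(tag):
            vec = embeddings[tag]
            best = min(groups, key=lambda ref: math.dist(vec, embeddings[ref]))
            return groups[best]
        return classify
```

Wiring the modules together mirrors the claimed structure: the matrix module feeds the embedding module, whose output (plus reference groups) yields the classification function.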
10. A computer storage medium storing a plurality of instructions adapted to be loaded by a processor and to perform the method steps according to any of claims 1-8.
11. An electronic device, comprising: a processor and a memory; wherein the memory stores a computer program adapted to be loaded by the processor and to perform the method steps of any of claims 1 to 8.
CN202210468316.8A 2022-04-29 Label classification method and device, storage medium and electronic equipment Active CN114880473B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210468316.8A CN114880473B (en) 2022-04-29 Label classification method and device, storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210468316.8A CN114880473B (en) 2022-04-29 Label classification method and device, storage medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN114880473A true CN114880473A (en) 2022-08-09
CN114880473B CN114880473B (en) 2024-07-02

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150356457A1 (en) * 2014-06-05 2015-12-10 International Business Machines Corporation Labeling of data for machine learning
US9275347B1 (en) * 2015-10-09 2016-03-01 AlpacaDB, Inc. Online content classifier which updates a classification score based on a count of labeled data classified by machine deep learning
JP6223530B1 (en) * 2016-11-10 2017-11-01 ヤフー株式会社 Information processing apparatus, information processing method, and program
US20190236139A1 (en) * 2018-01-31 2019-08-01 Jungle Disk, L.L.C. Natural language generation using pinned text and multiple discriminators
US20190236478A1 (en) * 2018-01-29 2019-08-01 Slice Technologies, Inc. Quality of labeled training data
US20200311345A1 (en) * 2019-03-29 2020-10-01 Innoplexus Ag System and method for language-independent contextual embedding
CN111782768A (en) * 2020-06-30 2020-10-16 首都师范大学 Fine-grained entity identification method based on hyperbolic space representation and label text interaction
CN111782826A (en) * 2020-08-27 2020-10-16 清华大学 Knowledge graph information processing method, device, equipment and storage medium
CN112464100A (en) * 2020-12-14 2021-03-09 未来电视有限公司 Information recommendation model training method, information recommendation method, device and equipment
CN112685635A (en) * 2020-12-29 2021-04-20 深圳市金蝶天燕云计算股份有限公司 Item recommendation method, device, server and storage medium based on classification label
CN113779219A (en) * 2021-09-13 2021-12-10 内蒙古工业大学 Question-answering method for embedding multiple knowledge maps by combining hyperbolic segmented knowledge of text

Similar Documents

Publication Publication Date Title
CN110457581B (en) Information recommendation method and device, electronic equipment and storage medium
US20210191509A1 (en) Information recommendation method, device and storage medium
CN110162695B (en) Information pushing method and equipment
WO2020073507A1 (en) Text classification method and terminal
Lin et al. Hybrid real-time matrix factorization for implicit feedback recommendation systems
CN110909182B (en) Multimedia resource searching method, device, computer equipment and storage medium
US7979426B2 (en) Clustering-based interest computation
US20170255862A1 (en) Method and system for user profiling for content recommendation
CN107632984A (en) A kind of cluster data table shows methods, devices and systems
CN112052387B (en) Content recommendation method, device and computer readable storage medium
CN107832338B (en) Method and system for recognizing core product words
CN106951527B (en) Song recommendation method and device
CN110781307A (en) Target item keyword and title generation method, search method and related equipment
CN116089729B (en) Search recommendation method, device and storage medium
CN110955831A (en) Article recommendation method and device, computer equipment and storage medium
CN111310025B (en) Model training method, data processing device and related equipment
CN111523964A (en) Clustering-based recall method and apparatus, electronic device and readable storage medium
CN106708871A (en) Method and device for identifying social service characteristics user
Ben-Shimon et al. An ensemble method for top-N recommendations from the SVD
CN111292168B (en) Data processing method, device and equipment
CN114880473B (en) Label classification method and device, storage medium and electronic equipment
CN115618126A (en) Search processing method, system, computer readable storage medium and computer device
CN108694171B (en) Information pushing method and device
CN114880473A (en) Label classification method and device, storage medium and electronic equipment
CN115082844A (en) Similar crowd extension method and device, electronic equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant