WO2018188378A1 - Method and device for tagging label for application, terminal and computer readable storage medium

Info

Publication number: WO2018188378A1
Application number: PCT/CN2017/118709
Authority: WO
Inventors: 潘岸腾
Original assignee: 广州优视网络科技有限公司
Priority date: 2017-04-10
Filing date: 2017-12-26
Publication date: 2018-10-18
Also published as: CN106951571A; CN106951571B

Abstract

A method and device for tagging a label for an application, a terminal and a computer readable storage medium. The method comprises: extracting feature word information from application description information of each application in a preset application library (S1); merging the corresponding feature word information of a plurality of applications with the same label, and using the merged feature word information as feature word information of the label (S2); determining a first preference degree of each label on each feature word belonging to the label (S3); extracting feature word information from application description information of new applications which are to be tagged with labels (S4); determining a second preference degree of the new application on each label in the label library based on the first preference degree and the extracted feature word information of the new applications (S5); and according to the second preference degree, selecting corresponding one or more labels from the label library in a preset mode to tag the new application (S6).

Description

Method, device, terminal and computer readable storage medium for labeling applications

Cross-reference to related applications

The present application claims priority to Chinese Patent Application No. 201710227588.8, entitled "A Method and Apparatus for Labeling Applications", filed on April 10, 2017, the entire contents of which are incorporated herein by reference.

Technical field

The present invention relates to the field of information processing technologies, and in particular, to a method, an apparatus, a terminal, and a computer readable storage medium for labeling an application.

Background art

Applications provided in an app store or app market have one or more tags, which identify the category or content of each application so that users can find it. When operating an app store or app market, each application newly added to the application library must be labeled. For example, when the app store or app market launches a "snake" app, the app needs a tag, such as "casual game". The traditional method of labeling newly launched applications is for an operator to decide manually which tags suit the application. The drawbacks of this approach include:

1. The labor cost is high. For each new application that joins the application library, the operator needs to review all the tags and find the appropriate ones for the application.

2. Accuracy is difficult to guarantee and efficiency is low. Because many new applications are added to the application library, operators cannot spend time downloading, installing, and trying each one. Operators generally rely on the application name as the basis for judgment, which makes accuracy difficult to guarantee, and labeling applications one by one is inefficient.

Summary of the invention

In view of the above, an object of the present invention is to provide a method, apparatus, terminal and computer readable storage medium for labeling applications that address at least one of the above problems.

A first embodiment of the present invention provides a method for labeling an application, including:

extracting feature word information from application description information of each application in a preset application library;

merging the corresponding feature word information of a plurality of applications having the same tag, and using the merged feature word information as feature word information of the tag;

determining a first preference of each tag for each feature word belonging to the tag;

extracting feature word information from application description information of a new application to be labeled;

determining, according to the first preference and the extracted feature word information of the new application, a second preference of the new application for each tag in the tag library; and

selecting, according to the second preference, one or more corresponding tags from the tag library in a preset manner to label the new application.

A second embodiment of the present invention provides an apparatus for labeling an application, including:

a feature word information extracting unit, configured to extract feature word information from the application description information of each application in the preset application library, and to extract feature word information from the application description information of the new application to be labeled;

a feature word information determining unit of the tag, configured to merge the corresponding feature word information of a plurality of applications having the same tag and use the result as the feature word information of the tag;

a first preference determining unit, configured to determine a first preference of each tag for each feature word belonging to the tag;

a second preference determining unit, configured to determine, according to the first preference and the extracted feature word information of the new application, a second preference of the new application for each tag in the tag library; and

a label labeling unit, configured to select, according to the second preference, one or more corresponding tags from the tag library to label the new application.

Optionally, the feature word information extracting unit is configured to perform word segmentation on the application description information to extract the feature words, and to calculate the probability that each feature word appears, which is used as the weight of that feature word for the application to which it belongs, thereby obtaining the feature word information. The feature word information includes each feature word and its weight for the application to which it belongs.

Optionally, the method by which the feature word information determining unit of the tag merges the corresponding feature word information of the plurality of applications having the same tag to obtain the feature word information of the tag includes:

merging identical feature words in the feature word information of the applications having the same tag into one feature word, and using the feature words obtained after merging as the feature words of the tag;

determining the weight of each of the feature words on the tag; and

using the feature words obtained after merging, together with the weight of each feature word on the tag, as the feature word information of the tag.

Optionally, the feature word information determining unit of the tag calculates the weight of each feature word on the tag as follows:

$$f_{t,j} = \frac{\sum_{i=1}^{n} w_{i,j}}{\sum_{i=1}^{n} \sum_{k=1}^{m} w_{i,k}}, \quad i \in A,\ j, k \in W$$

where:

$f_{t,j}$ represents the weight of the feature word j on the tag t;

$w_{i,j}$ represents the weight of the feature word j for the application i having the tag t in the preset application library;

A represents the set of applications having the tag t in the preset application library;

W represents the set of feature words belonging to the applications in the application set A;

n represents the number of applications in the application set A;

m represents the number of feature words in the feature word set W.

Optionally, the method by which the first preference determining unit determines the first preference includes:

calculating the first preference of each tag for each feature word belonging to it from $f_{t,j}$ and $s_j$, where:

$p_{t,j}$ represents the first preference of the tag t for the feature word j;

$f_{t,j}$ represents the weight of the feature word j on the tag t;

$s_j$ represents the probability that the feature word j appears in the set of all feature words extracted from the application description information of all applications in the preset application library, given by:

$$s_j = \frac{\sum_{i=1}^{n} w_{i,j}}{\sum_{i=1}^{n} \sum_{k=1}^{m} w_{i,k}}, \quad i \in AA,\ j, k \in Aw$$

where:

$w_{i,j}$ represents the weight of the feature word j for the application i in the preset application library;

AA represents the set of all applications in the preset application library;

Aw represents the set of all feature words extracted from the application description information of all applications;

n represents the number of applications in the application set AA;

m represents the number of feature words in the feature word set Aw.

Optionally, the method by which the second preference determining unit determines the second preference includes:

calculating the second preference of the new application for each tag in the tag library by the following formula:

$$r_{i,t} = \sum_{j=1}^{m} p_{t,j}\, w_{i,j}, \quad j \in AM$$

where:

$r_{i,t}$ represents the second preference of the new application i for the tag t;

$p_{t,j}$ represents the first preference of the tag t for the feature word j;

$w_{i,j}$ represents the weight, for the new application i, of the feature word j extracted from the application description information of the new application i;

AM represents the set of all feature words attributed to the tag t;

m represents the number of feature words in the feature word set attributed to the tag t.

Optionally, the method by which the second preference determining unit determines the second preference of the new application for each tag in the tag library includes:

selecting, in a preset manner and according to the first preference of each tag for each feature word belonging to it, a certain number of feature words as the topic feature words of the corresponding tag; and

determining the second preference by the following formula:

$$r_{i,t} = \sum_{j=1}^{m} p_{t,j}\, w_{i,j}, \quad j \in topic_t$$

where:

$r_{i,t}$ represents the second preference of the new application i for the tag t;

$p_{t,j}$ represents the first preference of the tag t for the feature word j;

$w_{i,j}$ represents the weight, for the new application i, of the feature word j extracted from the application description information of the new application i;

$topic_t$ represents the selected set of topic feature words attributed to the tag t;

m represents the number of feature words in the set of topic feature words attributed to the tag t.

Optionally, the method by which the second preference determining unit selects, in a preset manner and according to the first preference of each tag for each feature word belonging to it, a certain number of feature words as the topic feature words of the corresponding tag includes:

selecting, in descending order of the first preference of the tag for each feature word belonging to it, a preset number of top-ranked feature words as the topic feature words; or

selecting the feature words whose first preference is greater than or equal to a first preset preference threshold as the topic feature words.

Optionally, the method by which the label labeling unit selects, according to the second preference, one or more corresponding tags from the tag library in a preset manner to label the new application includes:

selecting, in descending order of the second preference of the new application for each tag, the top-ranked one or more tags to label the new application; or

selecting the one or more tags whose second preference is greater than or equal to a second preset preference threshold to label the new application.

An embodiment of the present invention further provides a terminal, including a memory and a processor, where the memory stores computer readable instructions, and when the computer readable instructions are executed by the processor, the method for labeling an application provided by the embodiments of the present invention is implemented.

The embodiment of the present invention further provides a computer readable storage medium, on which a computer program is stored, and when the computer program is executed, a method for labeling an application provided by an embodiment of the present invention is implemented.

The method, device, terminal and computer readable storage medium for labeling an application according to the embodiments of the present invention use the applications already in the application library and their tags, together with the application description information that introduces each application's features and core functions and word segmentation technology, to establish an association between a new application to be labeled and the tags in the tag library. This makes it possible to automatically identify one or more tags suitable for the new application, which reduces labor costs and improves accuracy and efficiency.

Brief description of the drawings

FIG. 1 is a flowchart of a method for labeling an application according to an embodiment of the present invention;

FIG. 2 is a schematic block diagram of an apparatus for labeling an application according to an embodiment of the present invention.

Detailed description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The components of the embodiments of the invention, which are generally described and illustrated in the figures herein, may be arranged and designed in various different configurations. Therefore, the following detailed description of the embodiments of the invention is not intended to limit the scope of the claimed invention, but merely represents selected embodiments. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative effort fall within the scope of the present invention.

To automatically label a new application added to the application library, a correlation must be established between the new application to be labeled and the tags in the preset tag library. The method described in the embodiments below uses the applications already in the application library and their tags, together with the application description information that introduces each application's features and core functions and word segmentation techniques, to establish this correlation, thereby achieving automatic labeling of the application.

An embodiment of the present invention provides a terminal, where the terminal includes a memory, a processor, and a device for labeling an application. The memory, the processor, and the other components are electrically connected, directly or indirectly, to implement data transmission or interaction. The device for labeling an application includes at least one software functional module that can be stored in the memory in the form of software or firmware, or built into the operating system (OS) of the terminal. The processor is configured to execute the executable modules stored in the memory when an execution instruction is received, thereby implementing the corresponding functions, such as the method for labeling an application provided by this embodiment.

It should be understood that, in this embodiment, the terminal may include more, fewer, or entirely different components than those described above, which is not limited herein.

FIG. 1 is a flowchart of a method for labeling an application according to an embodiment of the present invention, and the method is applicable to the foregoing terminal. As shown in FIG. 1, the method for labeling an application of the present invention includes the following steps S1 to S6.

S1: Extract feature word information from application description information of each application in the preset application library.

The application library is usually preset when the app market or app store is developed, and the third-party applications available for download from the app market or app store are saved in the preset application library. In addition, the third-party applications provided by the app store or app market have one or more tags from the tag library, which is likewise preset when the app store or app market is developed. The tags identify the category or content of each application, making applications easy for users to find.

In addition, each application in the preset application library has application description information, which is used to introduce the characteristics and core functions of the application, so that the user can understand the application and generate interest in the application.

The method provided by the present invention may first perform word segmentation on the application description information to extract feature words, and then count the probability of occurrence of each feature word and use it as the weight of that feature word for the application to which it belongs. Thus, the feature word information described in step S1 includes each feature word and its weight for the application to which it belongs. Word segmentation technology is used to segment the application description information. The extracted feature words are the words obtained after word segmentation, or keywords.

For example, the feature word information extracted from the description information of an application i is written as $w_i$:

$w_i = \{w1: pci1,\ w2: pci2,\ w3: pci3,\ \ldots\}$

where w1: pci1, w2: pci2, w3: pci3, ... are pairs of a feature word and its weight; for example, w1 represents a feature word and pci1 represents the weight of that feature word for the application i.
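The weighting in step S1 can be sketched in Python as follows. This is a minimal illustration, assuming the description has already been segmented into tokens and that the "probability of occurrence" is a word's relative frequency among those tokens; the function name and the sample tokens are hypothetical and are not taken from the source.

```python
from collections import Counter


def extract_feature_words(tokens):
    """Return {feature_word: weight} for one application.

    Assumption: weight = relative frequency of the word among the tokens
    produced by word segmentation of the application description.
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: count / total for word, count in counts.items()}


# Hypothetical, already-segmented description (segmentation is out of scope here).
sample_tokens = ["game", "live", "game", "entertainment"]
w_i = extract_feature_words(sample_tokens)
```

The weights of one application sum to 1, so they can be read as a probability distribution over that application's feature words.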

For example, the application description information of the application "Sogou Pinyin Input Method" is: "The input method with precise typing and the most personalized interface, and a versatile input method". The feature words obtained after segmenting the description information are: "typing", "precision", "interface", "personality", "input method", "possession", "omnipotence", "input method". The feature word information of "Sogou Pinyin Input Method" is then:

S2: Combine the corresponding feature word information of the plurality of applications having the same tag as the feature word information of the tag.

In detail, the same feature words in the feature word information corresponding to each application having the same tag may be merged into one feature word, and the feature word obtained after the combination may be used as the feature word of the tag. The weight of each of the feature words on the tag is then determined. The feature words obtained after the combination and the weight of each of the feature words on the label are used as feature word information of the label.

Each existing application in the preset application library has one or more tags, and feature word information has been extracted from the description information of each application. The feature word information of the applications having the same tag is merged, and the merged feature word information is used as the feature word information of the tag.

The feature word information of the tag likewise includes feature words and the weight of each feature word on the tag. When merging the feature word information of multiple applications, identical feature words are merged into one feature word, and the weight of each feature word on the tag is calculated as follows:

$$f_{t,j} = \frac{\sum_{i=1}^{n} w_{i,j}}{\sum_{i=1}^{n} \sum_{k=1}^{m} w_{i,k}}, \quad i \in A,\ j, k \in W$$

where:

$f_{t,j}$ represents the weight of the feature word j on the tag t;

$w_{i,j}$ represents the weight of the feature word j for the application i having the tag t in the preset application library;

A represents the set of applications having the tag t in the preset application library;

W represents the set of feature words belonging to the applications in the application set A;

n represents the number of applications in the application set A;

m represents the number of feature words in the feature word set W.

It can be seen that the weight of a feature word on a tag is the probability that the feature word appears in the feature word sets of the applications in the application set having that tag.
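The merge in step S2 can be sketched in Python as follows. This is a minimal illustration, assuming the tag weight of a word is the sum of its per-application weights divided by the sum over all words of the tag, which matches the probability reading given above; the function name and the sample weights are hypothetical.

```python
from collections import defaultdict


def merge_tag_feature_words(app_feature_words):
    """Merge per-application {word: weight} dicts into {word: tag_weight}.

    Identical words are merged into one entry. Assumption: the tag weight is
    the summed weight of the word, normalised by the summed weight of all
    words across the applications that carry the tag.
    """
    summed = defaultdict(float)
    for weights in app_feature_words:
        for word, weight in weights.items():
            summed[word] += weight
    total = sum(summed.values())
    if total == 0:
        return {}
    return {word: value / total for word, value in summed.items()}


# Two hypothetical applications that both carry the same tag.
apps_with_tag = [
    {"game": 0.5, "live": 0.5},
    {"live": 0.5, "entertainment": 0.5},
]
f_t = merge_tag_feature_words(apps_with_tag)
```

In this sample, "live" appears in both applications, so it ends up with twice the tag weight of "game" or "entertainment".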

For example, take the tag "live" and assume that two applications have this tag, namely "Betta TV" and "YY". The feature word information of the application "Betta TV" is

The feature word information of the application "YY" is

After merging, the feature words of the tag "live" are ("game", "live", "entertainment"), and the feature word information of the tag "live" is:

S3: Determine a first preference of each tag for each feature word belonging to the tag.

After the feature word information of all the tags in the tag library has been obtained (the tags of the applications in the preset application library can be assumed to cover all the tags in the tag library), the degree of association between each tag and the feature words belonging to it must be established. Here, the first preference of each tag for each feature word belonging to it is used as the degree of association. The first preference is determined from $f_{t,j}$ and $s_j$, where:

$p_{t,j}$ represents the first preference of the tag t for the feature word j;

$f_{t,j}$ represents the weight of the feature word j on the tag t, that is, its probability of occurrence in the feature word sets of the applications in the application set having the tag t;

$s_j$ represents the probability that the feature word j appears in the set of all feature words extracted from the application description information of all applications in the preset application library, given by:

$$s_j = \frac{\sum_{i=1}^{n} w_{i,j}}{\sum_{i=1}^{n} \sum_{k=1}^{m} w_{i,k}}, \quad i \in AA,\ j, k \in Aw$$

where:

n represents the number of applications in the application set AA;

m represents the number of feature words in the feature word set Aw.
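Step S3 can be sketched in Python as follows. The formula for the first preference is not reproduced in this text, so the sketch ASSUMES a simple ratio of the tag weight to the library-wide probability; that ratio is a stand-in chosen for illustration and should not be read as the claimed formula. The function names and sample data are hypothetical.

```python
from collections import defaultdict


def global_word_probability(all_app_feature_words):
    """Probability of each word over all applications in the library (s_j)."""
    summed = defaultdict(float)
    for weights in all_app_feature_words:
        for word, weight in weights.items():
            summed[word] += weight
    total = sum(summed.values())
    if total == 0:
        return {}
    return {word: value / total for word, value in summed.items()}


def first_preference(tag_weights, global_probability):
    """ASSUMPTION: preference = tag weight / library-wide probability.

    The source defines the preference in terms of these two quantities but
    its exact formula is not available here.
    """
    return {
        word: weight / global_probability[word]
        for word, weight in tag_weights.items()
        if global_probability.get(word, 0.0) > 0.0
    }


# Hypothetical library of four applications and one tag's feature word weights.
library = [
    {"game": 0.5, "live": 0.5},
    {"live": 0.5, "entertainment": 0.5},
    {"game": 0.5, "tool": 0.5},
    {"tool": 1.0},
]
s = global_word_probability(library)
p_live = first_preference({"game": 0.25, "live": 0.5, "entertainment": 0.25}, s)
```

Under this assumed ratio, a word that is more frequent under the tag than in the library as a whole scores above 1, which is one way to express that the tag "prefers" the word.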

S4: Extract feature word information from application description information of the new application to be labeled.

This step is implemented in the same way as step S1. Feature word information is extracted from the application description information of the new application to be labeled; it includes each feature word and its weight for the new application. It can likewise be written as $w_i$:

$w_i = \{w1: pci1,\ w2: pci2,\ w3: pci3,\ \ldots\}$.

For other related descriptions, refer to the description of step S1, and the description is not repeated here.

S5: Determine, according to the first preference and the extracted feature word information of the new application, a second preference of the new application to each tag in the tag library.

Once the first preference of each tag for each feature word belonging to it and the feature word information extracted for the new application are available, the degree of association between the new application and each tag in the tag library can be established. Here, the second preference of the new application for each tag in the tag library is used as the degree of association, and it is determined as follows:

$$r_{i,t} = \sum_{j=1}^{m} p_{t,j}\, w_{i,j}, \quad j \in AM$$

where:

$r_{i,t}$ represents the second preference of the new application i for the tag t;

$p_{t,j}$ represents the first preference of the tag t for the feature word j;

AM represents the set of all feature words attributed to the tag t.

It can be seen from the formula that the new application i is regarded as a combination of the feature words j extracted from its application description information, and the second preference of the new application i for the tag t is obtained by accumulating the tag t's first preference for each feature word belonging to the new application i. Note that if a feature word attributed to the new application i is not in the feature word set belonging to the tag t, the first preference of the tag t for that feature word is zero.
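The accumulation described for step S5 can be sketched in Python as follows. This is a minimal illustration, assuming each first preference is multiplied by the word's weight in the new application before summing, and that words unknown to the tag contribute zero as stated above; the function name and the sample values are hypothetical.

```python
def second_preference(new_app_weights, tag_preferences):
    """Score one tag for a new application.

    new_app_weights: {word: weight} extracted from the new application.
    tag_preferences: {word: first_preference} for the tag.
    Words that do not belong to the tag have a first preference of zero.
    """
    return sum(
        weight * tag_preferences.get(word, 0.0)
        for word, weight in new_app_weights.items()
    )


# Hypothetical inputs: "chat" does not belong to the tag, so it adds nothing.
new_app = {"live": 0.5, "game": 0.25, "chat": 0.25}
live_tag = {"game": 1.0, "live": 2.0, "entertainment": 2.0}
score = second_preference(new_app, live_tag)  # 0.5 * 2.0 + 0.25 * 1.0 + 0
```

Iterating over the new application's words rather than the tag's words gives the same sum, because any word missing from either side contributes zero.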

In this embodiment, when the feature word set attributed to the tag t contains many feature words, the number of lookups and additions is also large, which makes the calculation expensive. A preferred embodiment is therefore described: the feature word set belonging to the tag t is filtered by first preference value, discarding the feature words with smaller first preference values. This reduces the number of feature words in the set belonging to the tag t and so reduces the amount of calculation.

Optionally, a certain number of feature words are selected in a preset manner, according to the first preference of each tag for each feature word belonging to it, as the topic feature words of the corresponding tag. The preset manner may be to sort the feature words belonging to the tag t in descending order of the tag's first preference for them and select a certain number of top-ranked feature words as the topic feature words; the number can be defined according to the data and the business scenario, for example 50, 100, 200 or another value. Alternatively, a first preset preference threshold may be set, and the feature words whose first preference is greater than or equal to that threshold are selected as the topic feature words. The second preference is then determined as follows:

$$r_{i,t} = \sum_{j=1}^{m} p_{t,j}\, w_{i,j}, \quad j \in topic_t$$

where:

$r_{i,t}$ represents the second preference of the new application i for the tag t;

$p_{t,j}$ represents the first preference of the tag t for the feature word j;
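The two ways of choosing topic feature words described above can be sketched in Python as follows. This is a minimal illustration; the function names, the sample preferences, and the chosen count and threshold are hypothetical.

```python
import heapq


def top_k_topic_words(tag_preferences, k):
    """Keep the k feature words with the highest first preference."""
    best = heapq.nlargest(k, tag_preferences.items(), key=lambda item: item[1])
    return dict(best)


def threshold_topic_words(tag_preferences, threshold):
    """Keep the feature words whose first preference is >= threshold."""
    return {
        word: preference
        for word, preference in tag_preferences.items()
        if preference >= threshold
    }


# Hypothetical first preferences of one tag.
preferences = {"live": 2.0, "entertainment": 1.5, "game": 1.0, "the": 0.2}
by_count = top_k_topic_words(preferences, 2)
by_threshold = threshold_topic_words(preferences, 1.0)
```

Either result can then be used in place of the full feature word set when computing the second preference, which shortens the sum.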

S6: Select the corresponding one or more tags from the tag library according to the second preference to label the new application.

There are several ways to select the one or more corresponding tags from the tag library. For example, the tags may be sorted in descending order of the new application's second preference for each tag, and the top-ranked one or more tags selected to label the new application. The number of tags to apply may be defined according to the data and the business scenario; it may be any number from 1 to 5, such as 1, 2 or 5, or more.

In addition, a second preset preference threshold may be set, and the one or more tags whose second preference is greater than or equal to that threshold are selected to label the new application.
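The selection in step S6 can be sketched in Python as follows. This is a minimal illustration that supports both the top-ranked and the threshold variants; the function name, its parameters, and the sample scores are hypothetical.

```python
def select_tags(second_preferences, top_n=None, threshold=None):
    """Pick tags for a new application from {tag: second_preference}.

    top_n: keep at most this many of the highest-scoring tags.
    threshold: keep only tags whose score is >= this value.
    Either option may be used alone, or both together.
    """
    ranked = sorted(
        second_preferences.items(), key=lambda item: item[1], reverse=True
    )
    if threshold is not None:
        ranked = [(tag, score) for tag, score in ranked if score >= threshold]
    if top_n is not None:
        ranked = ranked[:top_n]
    return [tag for tag, _ in ranked]


# Hypothetical second preferences of one new application.
scores = {"live": 1.25, "casual game": 0.4, "tool": 0.05}
best_one = select_tags(scores, top_n=1)
above_cutoff = select_tags(scores, threshold=0.3)
```

Combining both options caps the number of tags while still rejecting weak matches, which may suit libraries where some applications fit no tag well.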

The method for labeling an application according to the present invention uses the applications already in the application library and their tags, together with the description information that introduces each application's features and core functions and word segmentation techniques, to establish an association between a new application to be labeled and the tags in the preset tag library. It thereby automatically finds one or more tags suitable for the new application, reduces labor costs, and improves the accuracy and efficiency of labeling new applications.

FIG. 2 is a schematic block diagram of an apparatus for labeling an application according to an embodiment of the present invention. As shown in FIG. 2, the apparatus for labeling an application of the present invention includes:

The feature word information extracting unit is configured to extract feature word information from the application description information of each application in the preset application library, and extract feature word information from the application description information of the new application to be tagged.

In this embodiment, for a description of the feature word information extracting unit, refer to the detailed description of steps S1 and S4 shown in FIG. 1; that is, steps S1 and S4 may be performed by the feature word information extracting unit.

The feature word information determining unit of the tag is configured to merge the corresponding feature word information of the plurality of applications having the same tag and use the result as the feature word information of the tag.

In this embodiment, for a description of the feature word information determining unit of the tag, refer to the detailed description of step S2 shown in FIG. 1; that is, step S2 may be performed by the feature word information determining unit of the tag.

The first preference determining unit is configured to determine a first preference of each tag for each feature word belonging to the tag.

In this embodiment, for a description of the first preference determining unit, refer to the detailed description of step S3 shown in FIG. 1; that is, step S3 may be performed by the first preference determining unit.

The second preference determining unit is configured to determine a second preference of the new application for each tag in the tag library based on the first preference and the extracted feature word information of the new application.

In this embodiment, for a description of the second preference determining unit, refer to the detailed description of step S5 shown in FIG. 1; that is, step S5 may be performed by the second preference determining unit.

The label labeling unit is configured to select one or more corresponding tags from the tag library according to the second preference to label the new application.

In this embodiment, for a description of the label labeling unit, refer to the detailed description of step S6 shown in FIG. 1; that is, step S6 may be performed by the label labeling unit.

For the specific implementation manner of selecting one or more labels in a preset manner, reference may be made to the implementation method described in the foregoing method embodiments.

Optionally, the method for extracting the feature word information from the application description information of each application in the preset application library may include: first performing word segmentation on the application description information to extract the feature words, and then counting the probability of occurrence of each feature word and using it as the weight of that feature word for the application to which it belongs.

Optionally, the method by which the feature word information determining unit of the tag merges the corresponding feature word information of the plurality of applications having the same tag to obtain the feature word information of the tag may include: merging identical feature words in the feature word information of the applications having the same tag into one feature word, and using the feature words obtained after merging as the feature words of the tag; then determining the weight of each of the feature words on the tag; and then using the feature words obtained after merging, together with the weight of each feature word on the tag, as the feature word information of the tag.

Optionally, the feature word information determining unit of the tag is configured to merge identical feature words into one feature word during merging, and calculates the weight of each feature word on the tag as follows:

$$f_{t,j} = \frac{\sum_{i=1}^{n} w_{i,j}}{\sum_{i=1}^{n} \sum_{k=1}^{m} w_{i,k}}, \quad i \in A,\ j, k \in W$$

where:

$f_{t,j}$ represents the weight of the feature word j on the tag t;

n represents the number of applications in the application set A;

m represents the number of feature words in the feature word set W.

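Optionally, the first preference determining unit may determine the first preference as follows:

The first preference is calculated by the following formula:

where:

p_{t,j} represents the first preference of the tag t for the feature word j;

f_{t,j} represents the weight of the feature word j on the tag t;

s_j represents the probability that the feature word j appears in the set of all feature words extracted from the application description information of all applications in the preset application library, wherein:

with i ∈ AA and j ∈ Aw,

where:

w_{i,j} represents the weight of the feature word j for the application i in the preset application library;

AA represents the set of all applications in the preset application library;

Aw represents the set of all feature words extracted from the application description information of all applications;

n represents the number of applications in the application set AA;

m represents the number of feature words in the feature word set Aw.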
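A minimal Python sketch of the first preference follows. Since the formula is not reproduced in this text, the sketch makes two assumptions: the library-wide probability of a feature word is its summed weight over all applications divided by the summed weight of all feature words, and the first preference is the ratio of the tag weight to that library-wide probability, so that words unusually frequent under a tag score above 1. The ratio form and the function names `library_word_probability` and `first_preference` are hypothetical.

```python
def library_word_probability(app_features):
    """Return {word: s_j} over all applications in the library.

    Assumption: s_j = sum_i w_ij / sum_j sum_i w_ij over the whole library.
    """
    totals = {}
    for features in app_features.values():
        for word, weight in features.items():
            totals[word] = totals.get(word, 0.0) + weight
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {word: value / grand_total for word, value in totals.items()}


def first_preference(tag_features, word_probability):
    """Return {tag: {word: p_tj}}, assuming p_tj = f_tj / s_j."""
    return {
        tag: {
            word: weight / word_probability[word]
            for word, weight in words.items()
            # Words with no library-wide probability are skipped to avoid division by zero.
            if word_probability.get(word, 0.0) > 0
        }
        for tag, words in tag_features.items()
    }


if __name__ == "__main__":
    library = {
        "a1": {"chat": 0.5, "video": 0.5},
        "a2": {"chat": 1.0},
        "a3": {"video": 1.0},
    }
    s = library_word_probability(library)
    print(first_preference({"social": {"chat": 0.75, "video": 0.25}}, s))
```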

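Optionally, the second preference determining unit may determine the second preference as follows:

The second preference is calculated by the following formula:

with j ∈ AM,

where:

r_{i,t} represents the second preference of the new application i for the tag t;

p_{t,j} represents the first preference of the tag t for the feature word j;

w_{i,j} represents the weight, for the new application i, of the feature word j extracted from the application description information of the new application i;

AM represents the set of all feature words belonging to the tag t;

m represents the number of feature words in the set of feature words belonging to the tag t.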

To reduce the amount of calculation of the second preference, the second preference determining unit may determine the second preference of the new application for each tag in the tag library as follows: first, according to the first preference of each tag for each of its feature words, a certain number of feature words are selected in a preset manner as the topic feature words of the corresponding tag; then the second preference is determined from those topic feature words. In detail, the feature words belonging to the tag t may be sorted in descending order of first preference and a preset number of top-ranked feature words selected as the topic feature words; or a first preset preference threshold may be set, and the feature words whose first preference is greater than or equal to that threshold selected as the topic feature words. The preset number can be chosen according to the data and the business scenario, for example 50, 100, 200, or another value.
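The two ways of choosing topic feature words described above, a preset number of top-ranked words or a preference threshold, can be sketched in Python as follows. The function name `select_topic_words` and its parameters `top_k` and `threshold` are hypothetical names introduced for illustration.

```python
def select_topic_words(preferences, top_k=None, threshold=None):
    """Pick topic feature words for one tag.

    preferences: {word: p_tj} for a single tag t
    top_k:       keep only the top_k words by first preference, if given
    threshold:   keep only words with p_tj >= threshold, if given
    """
    # Sort in descending order of first preference.
    ranked = sorted(preferences.items(), key=lambda item: item[1], reverse=True)
    if threshold is not None:
        ranked = [(word, p) for word, p in ranked if p >= threshold]
    if top_k is not None:
        ranked = ranked[:top_k]
    return [word for word, _ in ranked]


if __name__ == "__main__":
    prefs = {"chat": 1.5, "video": 0.5, "photo": 0.9}
    print(select_topic_words(prefs, top_k=2))
    print(select_topic_words(prefs, threshold=1.0))
```

Restricting each tag to a short list of topic feature words bounds the number of terms examined per tag when scoring a new application, which is the reduction in calculation the paragraph refers to.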

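The second preference can then be determined by the following formula:

with j ∈ topic_t,

where:

r_{i,t} represents the second preference of the new application i for the tag t;

p_{t,j} represents the first preference of the tag t for the feature word j;

w_{i,j} represents the weight, for the new application i, of the feature word j extracted from the application description information of the new application i;

topic_t represents the selected set of topic feature words belonging to the tag t;

m represents the number of feature words in the set of topic feature words belonging to the tag t.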
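A minimal Python sketch of the second preference follows. The formula is not reproduced in this text, so the sketch assumes that the second preference of a new application for a tag is the sum, over the tag's feature words (or its topic feature words, if supplied), of the first preference multiplied by the weight of that word in the new application. The sum-of-products form and the function name `second_preference` are assumptions.

```python
def second_preference(new_app_features, tag_preferences, topic_words=None):
    """Score a new application against every tag.

    new_app_features: {word: w_ij} extracted from the new application's description
    tag_preferences:  {tag: {word: p_tj}}
    topic_words:      optional {tag: iterable of words}; restricts the sum to topic words
    Returns:          {tag: r_it}

    Assumption: r_it = sum over j of p_tj * w_ij.
    """
    scores = {}
    for tag, prefs in tag_preferences.items():
        words = prefs.keys() if topic_words is None else topic_words.get(tag, ())
        # Words missing from the new application contribute zero.
        scores[tag] = sum(
            prefs.get(word, 0.0) * new_app_features.get(word, 0.0) for word in words
        )
    return scores


if __name__ == "__main__":
    new_app = {"chat": 0.5, "game": 0.5}
    prefs = {"social": {"chat": 1.5, "video": 0.5}, "games": {"game": 2.0}}
    print(second_preference(new_app, prefs))
```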

After the second preference determining unit has determined the second preference for each tag, the label labeling unit may select one or more corresponding tags from the tag library according to the second preference and apply them to the new application.

In detail, the preset manner of selecting one or more tags from the tag library can take various forms. For example, the tags may be sorted in descending order of the new application's second preference for each tag, and the top-ranked one or more tags applied to the new application. The number of tags to apply may be set according to the data and the business scenario; it may be any number from 1 to 5, such as 1, 2, or 5, or more.

In addition, a second preset preference threshold may be set, and the one or more tags whose second preference is greater than or equal to the second preset preference threshold are selected to label the new application.
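Both selection rules, the top-ranked tags and the threshold on the second preference, can be sketched in Python as follows. The function name `select_labels` and its parameters `top_n` and `threshold` are hypothetical; the returned pairs keep the score alongside each tag so that a caller can inspect why a tag was chosen.

```python
def select_labels(scores, top_n=None, threshold=None):
    """Choose tags for a new application from its second-preference scores.

    scores:    {tag: r_it}
    top_n:     keep only the top_n tags by score, if given
    threshold: keep only tags with r_it >= threshold, if given
    Returns a list of (tag, score) pairs in descending score order.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if threshold is not None:
        ranked = [(tag, score) for tag, score in ranked if score >= threshold]
    if top_n is not None:
        ranked = ranked[:top_n]
    return ranked


if __name__ == "__main__":
    scores = {"social": 0.75, "games": 1.0, "tools": 0.1}
    print(select_labels(scores, top_n=2))
    print(select_labels(scores, threshold=0.5))
```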

A person skilled in the art can clearly understand that for the convenience and brevity of the description, the specific working process of the device described in the embodiment of the present invention can be referred to the corresponding process in the foregoing method embodiments, and the description is not repeated here.

The device for labeling an application according to the present invention uses the known applications in the application library and their tags, together with the application description information that introduces each application's characteristics and core functions and word segmentation technology, to establish an association between a new application to be tagged and the tags in the preset tag library. It thereby automatically finds one or more tags suitable for the new application, reducing labor costs and improving accuracy and work efficiency.

A computer program product for the method for labeling an application provided by an embodiment of the present invention comprises a computer readable storage medium storing program code, the program code comprising instructions for executing the method described in the foregoing method embodiments. For the specific implementation, refer to the method embodiments; details are not repeated here.

If the functions are implemented in the form of a software functional unit and sold or used as a standalone product, they may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product, which is stored in a storage medium and includes a number of instructions for causing a computer device (which may be a personal computer, smart tablet, smartphone, server, or network device, etc.) to perform all or part of the steps of the methods described in the various embodiments of the present invention. The foregoing storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

The above is only a specific embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive of within the technical scope disclosed by the present invention shall be covered by the scope of protection of the present invention. Therefore, the scope of protection of the invention shall be determined by the scope of the appended claims.

Industrial applicability

The method, device, terminal, and computer readable storage medium for labeling an application provided by the embodiments of the present invention use the known applications in the application library and their tags, together with the application description information that introduces each application's characteristics and core functions and word segmentation technology, to establish an association between a new application to be tagged and the tags in the preset tag library. This enables one or more tags suitable for the new application to be found automatically, reducing labor costs and improving accuracy and work efficiency.

Claims

1. A method for labeling an application, comprising:

extracting feature word information from application description information of each application in a preset application library;

merging corresponding feature word information of a plurality of applications having the same tag, and using the merged feature word information as feature word information of the tag;

determining a first preference of each tag for each feature word belonging to the tag;

extracting feature word information from application description information of a new application to be tagged;

determining, according to the first preference and the extracted feature word information of the new application, a second preference of the new application for each tag in a tag library; and

selecting one or more corresponding tags from the tag library according to the second preference to label the new application.
2. The method according to claim 1, wherein the step of extracting feature word information from the application description information of each application in the preset application library comprises:

performing word segmentation on the application description information to extract feature words; and

counting the probability of occurrence of each of the feature words as the weight of that feature word for the application to which it belongs, to obtain the feature word information, the feature word information including the feature words and the weight of each feature word for the application to which it belongs.
3. The method according to claim 1 or 2, wherein the step of merging the corresponding feature word information of the plurality of applications having the same tag as the feature word information of the tag comprises:

merging identical feature words in the feature word information of the applications having the same tag into one feature word, and using the feature words obtained after merging as the feature words of the tag;

determining a weight of each of the feature words on the tag; and

using the feature words obtained after merging and the weight of each of the feature words on the tag as the feature word information of the tag.
4. The method according to claim 3, wherein the weight of each feature word on the tag is calculated as follows:

with i ∈ A and j ∈ W,

where:

f_{t,j} represents the weight of the feature word j on the tag t;

w_{i,j} represents the weight of the feature word j for the application i having the tag t in the preset application library;

A represents the set of applications having the tag t in the preset application library;

W represents the set of feature words belonging to the applications in the application set A;

n represents the number of applications in the application set A;

m represents the number of feature words in the feature word set W.
5. The method according to any one of claims 1 to 4, wherein the step of determining the first preference of each tag for each feature word belonging to the tag comprises:

calculating the first preference of each tag for each feature word belonging to the tag by the following formula:

where:

p_{t,j} represents the first preference of the tag t for the feature word j;

f_{t,j} represents the weight of the feature word j on the tag t;

s_j represents the probability that the feature word j appears in the set of all feature words extracted from the application description information of all applications in the preset application library, wherein:

with i ∈ AA and j ∈ Aw,

where:

w_{i,j} represents the weight of the feature word j for the application i in the preset application library;

AA represents the set of all applications in the preset application library;

Aw represents the set of all feature words extracted from the application description information of all applications;

n represents the number of applications in the application set AA;

m represents the number of feature words in the feature word set Aw.
6. The method according to any one of claims 1 to 5, wherein the step of determining a second preference of the new application for each tag in the tag library comprises:

calculating the second preference of the new application for each tag in the tag library by the following formula:

with j ∈ AM,

where:

r_{i,t} represents the second preference of the new application i for the tag t;

p_{t,j} represents the first preference of the tag t for the feature word j;

w_{i,j} represents the weight, for the new application i, of the feature word j extracted from the application description information of the new application i;

AM represents the set of all feature words belonging to the tag t;

m represents the number of feature words in the set of feature words belonging to the tag t.
7. The method according to any one of claims 1 to 6, wherein the step of determining a second preference of the new application for each tag in the tag library comprises:

selecting, in a preset manner and according to the first preference of each tag for each feature word belonging to the tag, a certain number of feature words as the topic feature words of the corresponding tag; and

determining the second preference by the following formula:

with j ∈ topic_t,

where:

r_{i,t} represents the second preference of the new application i for the tag t;

p_{t,j} represents the first preference of the tag t for the feature word j;

w_{i,j} represents the weight, for the new application i, of the feature word j extracted from the application description information of the new application i;

topic_t represents the selected set of topic feature words belonging to the tag t;

m represents the number of feature words in the set of topic feature words belonging to the tag t.
8. The method according to claim 7, wherein the step of selecting, in a preset manner and according to the first preference of each tag for each feature word belonging to the tag, a certain number of feature words as the topic feature words of the corresponding tag comprises:

selecting, in descending order of the first preference of each feature word belonging to the tag, a preset number of top-ranked feature words as the topic feature words.
9. The method according to claim 7, wherein the step of selecting, in a preset manner and according to the first preference of each tag for each feature word belonging to the tag, a certain number of feature words as the topic feature words of the corresponding tag comprises:

selecting, as the topic feature words, the feature words whose first preference is greater than or equal to a first preset preference threshold.
10. The method according to any one of claims 1 to 7, wherein the step of selecting one or more corresponding tags from the tag library according to the second preference to label the new application comprises:

sorting the tags in descending order of the second preference of the new application for each tag, and applying the top-ranked one or more tags to the new application.
11. The method according to any one of claims 1 to 7, wherein the step of selecting one or more corresponding tags from the tag library according to the second preference to label the new application comprises:

selecting, for the new application, the one or more tags whose second preference is greater than or equal to a second preset preference threshold.
12. A device for labeling an application, comprising:

a feature word information extracting unit, configured to extract feature word information from application description information of each application in a preset application library, and to extract feature word information from application description information of a new application to be tagged;

a feature word information determining unit of the tag, configured to merge corresponding feature word information of a plurality of applications having the same tag, and to use the merged feature word information as feature word information of the tag;

a first preference determining unit, configured to determine a first preference of each tag for each feature word belonging to the tag;

a second preference determining unit, configured to determine, according to the first preference and the extracted feature word information of the new application, a second preference of the new application for each tag in a tag library; and

a label labeling unit, configured to select one or more corresponding tags from the tag library according to the second preference to label the new application.
13. The device according to claim 12, wherein the feature word information extracting unit is configured to perform word segmentation on the application description information to extract the feature words, and to count the probability of occurrence of each of the feature words as the weight of that feature word for the application to which it belongs, so as to obtain the feature word information, the feature word information including the feature words and the weight of each feature word for the application to which it belongs.
14. The device according to claim 12 or 13, wherein the manner in which the feature word information determining unit of the tag merges the corresponding feature word information of the plurality of applications having the same tag as the feature word information of the tag comprises:

merging identical feature words in the feature word information of the applications having the same tag into one feature word, and using the feature words obtained after merging as the feature words of the tag;

determining a weight of each of the feature words on the tag; and

using the feature words obtained after merging and the weight of each of the feature words on the tag as the feature word information of the tag.
15. The device according to claim 14, wherein the feature word information determining unit of the tag determines the weight of each feature word on the tag as follows:

with i ∈ A and j ∈ W,

where:

f_{t,j} represents the weight of the feature word j on the tag t;

w_{i,j} represents the weight of the feature word j for the application i having the tag t in the preset application library;

A represents the set of applications having the tag t in the preset application library;

W represents the set of feature words belonging to the applications in the application set A;

n represents the number of applications in the application set A;

m represents the number of feature words in the feature word set W.
16. The device according to any one of claims 12 to 15, wherein the manner in which the first preference determining unit determines the first preference comprises:

calculating the first preference of each tag for each feature word belonging to the tag by the following formula:

where:

p_{t,j} represents the first preference of the tag t for the feature word j;

f_{t,j} represents the weight of the feature word j on the tag t;

s_j represents the probability that the feature word j appears in the set of all feature words extracted from the application description information of all applications in the preset application library, wherein:

with i ∈ AA and j ∈ Aw,

where:

w_{i,j} represents the weight of the feature word j for the application i in the preset application library;

AA represents the set of all applications in the preset application library;

Aw represents the set of all feature words extracted from the application description information of all applications;

n represents the number of applications in the application set AA;

m represents the number of feature words in the feature word set Aw.
17. The device according to any one of claims 12 to 16, wherein the manner in which the second preference determining unit determines the second preference comprises:

calculating the second preference of the new application for each tag in the tag library by the following formula:

with j ∈ AM,

where:

r_{i,t} represents the second preference of the new application i for the tag t;

p_{t,j} represents the first preference of the tag t for the feature word j;

w_{i,j} represents the weight, for the new application i, of the feature word j extracted from the application description information of the new application i;

AM represents the set of all feature words belonging to the tag t;

m represents the number of feature words in the set of feature words belonging to the tag t.
18. The device according to any one of claims 12 to 17, wherein the manner in which the second preference determining unit determines the second preference of the new application for each tag in the tag library comprises:

selecting, in a preset manner and according to the first preference of each tag for each feature word belonging to the tag, a certain number of feature words as the topic feature words of the corresponding tag; and

determining the second preference by the following formula:

with j ∈ topic_t,

where:

r_{i,t} represents the second preference of the new application i for the tag t;

p_{t,j} represents the first preference of the tag t for the feature word j;

w_{i,j} represents the weight, for the new application i, of the feature word j extracted from the application description information of the new application i;

topic_t represents the selected set of topic feature words belonging to the tag t;

m represents the number of feature words in the set of topic feature words belonging to the tag t.
19. The device according to claim 18, wherein the manner in which the second preference determining unit selects, in a preset manner and according to the first preference of each tag for each feature word belonging to the tag, a certain number of feature words as the topic feature words of the corresponding tag comprises:

selecting, in descending order of the first preference of each feature word belonging to the tag, a preset number of top-ranked feature words as the topic feature words.
20. The device according to claim 18, wherein the manner in which the second preference determining unit selects, in a preset manner and according to the first preference of each tag for each feature word belonging to the tag, a certain number of feature words as the topic feature words of the corresponding tag comprises:

selecting, as the topic feature words, the feature words whose first preference is greater than or equal to a first preset preference threshold.
21. The device according to any one of claims 12 to 18, wherein the manner in which the label labeling unit selects, according to the second preference and in a preset manner, one or more tags from the tag library to label the new application comprises:

sorting the tags in descending order of the second preference of the new application for each tag, and applying the top-ranked one or more tags to the new application.
22. The device according to any one of claims 12 to 18, wherein the manner in which the label labeling unit selects, according to the second preference and in a preset manner, one or more tags from the tag library to label the new application comprises:

selecting, for the new application, the one or more tags whose second preference is greater than or equal to a second preset preference threshold.
23. A terminal, comprising a memory and a processor, wherein the memory stores computer readable instructions which, when executed by the processor, perform the method according to any one of claims 1 to 11.
24. A computer readable storage medium, wherein the computer readable storage medium stores executable instructions which, when executed by one or more processors, implement the method according to any one of claims 1 to 11.