WO2018188378A1 - Method and device for tagging label for application, terminal and computer readable storage medium - Google Patents

Info

Publication number
WO2018188378A1
Authority
WO
WIPO (PCT)
Prior art keywords
tag
application
feature word
preference
feature
Prior art date
Application number
PCT/CN2017/118709
Other languages
French (fr)
Chinese (zh)
Inventor
潘岸腾
Original Assignee
广州优视网络科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 广州优视网络科技有限公司
Publication of WO2018188378A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/14 Details of searching files based on file metadata
    • G06F16/148 File search processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices

Definitions

  • the present invention relates to the field of information processing technologies, and in particular, to a method, an apparatus, a terminal, and a computer readable storage medium for labeling an application.
  • a first embodiment of the present invention provides a method for labeling an application, including:
  • the corresponding one or more tags are selected from the tag library in a preset manner for the new application to be marked.
  • a second embodiment of the present invention provides an apparatus for labeling an application, including:
  • the feature word information extracting unit is configured to extract the feature word information from the application description information of each application in the preset application library, and extract the feature word information from the application description information of the new application to be labeled;
  • a feature word information determining unit of the tag, configured to merge the corresponding feature word information of the plurality of applications having the same tag and to use the merged information as the feature word information of the tag;
  • a first preference determining unit, configured to determine a first preference of each tag for each feature word belonging to that tag;
  • a second preference determining unit configured to determine, according to the first preference and the extracted feature word information of the new application, a second preference of the new application to each tag in the tag library;
  • the label labeling unit is configured to select a corresponding one or more labels from the label library according to the second preference to label the new application.
  • the feature word information extracting unit is configured to perform word segmentation on the application description information to extract the feature words, and to calculate the probability of occurrence of each feature word as the weight of that feature word for the application it belongs to.
  • the feature word information includes each feature word and its weight for the application it belongs to.
  • the method by which the feature word information determining unit of the tag merges the corresponding feature word information of the plurality of applications having the same tag, and uses it as the feature word information of the tag, includes:
  • the feature words obtained after the combination and the weight of each of the feature words on the label are used as feature word information of the label.
  • the method for calculating the weight of each feature word on the label by the feature word information determining unit of the tag is as follows:
  • f_{t,j} represents the weight of the feature word j on the tag t
  • w_{i,j} denotes the weight of the feature word j for the application i having the tag t in the preset application library
  • A represents the set of applications with the tag t in the preset application library
  • W represents the set of feature words belonging to the applications in the application set A
  • n represents the number of applications in the application set A
  • m represents the number of feature words in the feature word set W.
  • the method for determining, by the first preference determining unit, the first preference includes:
  • p_{t,j} represents the first preference of the tag t for the feature word j
  • f_{t,j} represents the weight of the feature word j on the tag t
  • s_j represents the probability that the feature word j appears in the set of all feature words extracted from the application description information of all applications in the preset application library, wherein:
  • w_{i,j} represents the weight of the feature word j for the application i in the preset application library
  • AA represents the set of all applications in the preset application library
  • Aw represents the set of all feature words extracted from the application description information of all applications
  • n represents the number of applications in the application set AA
  • m represents the number of feature words in the feature word set Aw.
  • the method for determining, by the second preference determining unit, the second preference includes:
  • the second preference of the new application for each tag in the tag library is calculated by the following formula:
  • r_{i,t} represents the second preference of the new application i for the tag t
  • p_{t,j} represents the first preference of the tag t for the feature word j
  • w_{i,j} represents the weight, for the new application i, of the feature word j extracted from the application description information of the new application i;
  • AM represents the set of all feature words attributed to the tag t
  • m represents the number of feature words in the feature word set attributed to the tag t.
  • the method for determining, by the second preference determining unit, the second preference of the new application to each tag in the tag library comprises:
  • the second preference is determined by the following formula:
  • r_{i,t} represents the second preference of the new application i for the tag t
  • p_{t,j} represents the first preference of the tag t for the feature word j
  • w_{i,j} represents the weight, for the new application i, of the feature word j extracted from the application description information of the new application i;
  • Topic_t represents the selected set of topic feature words attributed to the tag t;
  • m represents the number of feature words in the set of topic feature words belonging to the tag t.
  • the method by which the second preference determining unit selects, in a preset manner and according to the first preference of each tag for each of its feature words, a certain number of feature words as the topic feature words of the corresponding tag includes:
  • a plurality of feature words corresponding to the plurality of first preference degrees greater than or equal to the first preset preference threshold are selected as the topic feature words.
  • the method for the label labeling unit to select the corresponding one or more labels from the label library according to the second preference to mark the new application includes:
  • the tags are sorted by second preference from largest to smallest, and the one or more top-ranked tags are selected for the new application.
  • the method for the label labeling unit to select the corresponding one or more labels from the label library according to the second preference to mark the new application includes:
  • the one or more tags corresponding to one or more second preference degrees greater than or equal to the second preset preference threshold are selected for the new application.
  • the embodiment of the present invention further provides a terminal, including a memory and a processor, where the memory stores computer readable instructions, and when the computer readable instructions are executed by the processor, the method for labeling an application provided by the embodiment of the present invention is implemented.
  • the embodiment of the present invention further provides a computer readable storage medium, on which a computer program is stored, and when the computer program is executed, a method for labeling an application provided by an embodiment of the present invention is implemented.
  • in the method, device, terminal and computer readable storage medium for labeling an application according to an embodiment of the present invention, the known applications in the application library and their tags, together with the application description information that introduces each application's features and core functions and word segmentation technology, are used to establish an association between the new application to be tagged and the tags in the tag library. This enables automatic identification of one or more tags suitable for the new application, reducing labor costs and improving accuracy and work efficiency.
  • FIG. 1 is a flowchart of a method for labeling an application according to an embodiment of the present invention
  • FIG. 2 is a schematic block diagram of an apparatus for labeling an application according to an embodiment of the present invention.
  • An embodiment of the present invention provides a terminal, where the terminal includes a memory, a processor, and a device for labeling an application.
  • the memory, the processor, and other components are electrically connected directly or indirectly to implement data transmission or interaction.
  • the device for labeling an application includes at least one software functional module that can be stored in the memory, or built into the operating system (OS) of the terminal, in the form of software or firmware.
  • the processor is configured to execute the executable modules stored in the memory when an execution instruction is received, thereby implementing the corresponding functions, such as the method for labeling an application provided by this embodiment.
  • the terminal may further include more, fewer, or completely different components than the above, which is not limited herein.
  • FIG. 1 is a flowchart of a method for labeling an application according to an embodiment of the present invention, and the method is applicable to the foregoing terminal. As shown in FIG. 1, the method for labeling an application of the present invention includes the following steps S1 to S6.
  • S1 Extract feature word information from application description information of each application in the preset application library.
  • the application library is usually preset when developing the application market or the application store, and the third-party applications downloaded from the application market or the application store are saved in the preset application library.
  • third-party applications provided by the app store or app market each have one or more tags from the tag library, which is preset when the app store or app market is developed. The tags identify the category or content of each application, making applications easy for users to find.
  • each application in the preset application library has application description information, which introduces the characteristics and core functions of the application so that users can understand the application and become interested in it.
  • the method provided by the present invention may first perform word segmentation on the application description information to extract feature words, and then count the probability of occurrence of each feature word as the weight of that feature word for the application it belongs to.
  • the feature word information described in step S1 includes each feature word and its weight for the application it belongs to.
  • Word segmentation technology can be used to process word segmentation of application description information.
  • the extracted feature words are words obtained after word segmentation processing, or keywords.
  • the feature word information extracted from the description information of an application i is written as w_i, where w_i = {w1: pci1, w2: pci2, w3: pci3, ...}.
  • w1: pci1, w2: pci2, w3: pci3, ... represent the feature words and their corresponding weights; for example, w1 represents a feature word and pci1 represents the weight of that feature word for the application i.
  • for example, the application description information of an application is: "The input method with the precise typing and the most personalized interface, and the versatile input method".
  • the feature words obtained after segmenting the description information are: "typing, precision, interface, personality, input method, possession, omnipotence, input method".
  • the feature word information of "Sogou Pinyin Input Method" is:
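As a rough sketch of step S1, the snippet below assumes the description has already been segmented (the segmenter itself is out of scope) and takes the weight of a feature word to be its relative frequency among the extracted words. The function name `feature_word_weights` and the English tokens are illustrative assumptions, not part of the source.

```python
from collections import Counter


def feature_word_weights(tokens):
    """Map each feature word to its relative frequency among the tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    # An empty token list yields an empty dict (no division takes place).
    return {word: count / total for word, count in counts.items()}


# Feature words of the example description, assumed to be already segmented.
tokens = ["typing", "precision", "interface", "personality",
          "input method", "possession", "omnipotence", "input method"]
w_i = feature_word_weights(tokens)
```

Under these assumptions "input method", which occurs twice among the eight words, gets weight 0.25 and every other word gets 0.125.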
  • S2 Merge the corresponding feature word information of the plurality of applications having the same tag, and use the merged feature word information as the feature word information of the tag.
  • the same feature words in the feature word information of the applications having the same tag may be merged into one feature word, and the feature words obtained after merging are used as the feature words of the tag.
  • the weight of each of the feature words on the tag is then determined.
  • the merged feature words and the weight of each feature word on the tag are used as the feature word information of the tag.
  • each existing application in the preset application library has one or more tags, and the feature word information of each application is extracted from its description information; the feature word information of the multiple applications having the same tag is merged, and the merged feature word information is used as the feature word information of the tag.
  • the feature word information of the tag similarly includes the feature words and the weight of each feature word on the tag.
  • in the merging process, identical feature words are merged into one feature word, and the weight of each feature word on the tag is calculated as follows:
  • f_{t,j} represents the weight of the feature word j on the tag t
  • w_{i,j} denotes the weight of the feature word j for the application i having the tag t in the preset application library
  • A represents the set of applications with the tag t in the preset application library
  • W represents the set of feature words belonging to the applications in the application set A
  • n represents the number of applications in the application set A
  • m represents the number of feature words in the feature word set W.
  • that is, the weight of a feature word on a tag is the probability that the feature word appears among the feature words of all applications in the application set having that tag.
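The formula for f_{t,j} is not present in this text. One reading consistent with the definitions above is f_{t,j} = Σ_{i∈A} w_{i,j} / Σ_{i∈A} Σ_{k∈W} w_{i,k}, that is, the share of feature word j in the total weight of all applications carrying tag t. The sketch below implements that reading; the function name, the dictionary layout and the sample data are assumptions made for illustration.

```python
from collections import defaultdict


def tag_feature_weights(app_weights, app_tags, tag):
    """Return f[j]: share of feature word j in the summed weights of apps tagged `tag`."""
    merged = defaultdict(float)
    for app, weights in app_weights.items():
        if tag in app_tags.get(app, ()):
            for word, w in weights.items():
                merged[word] += w  # identical feature words are merged into one
    total = sum(merged.values())
    return {word: w / total for word, w in merged.items()} if total else {}


# Hypothetical per-application feature word weights and tags.
app_weights = {
    "app_a": {"input method": 0.5, "typing": 0.5},
    "app_b": {"input method": 0.25, "skin": 0.75},
    "app_c": {"input method": 0.5, "camera": 0.5},
}
app_tags = {"app_a": {"input"}, "app_b": {"input"}, "app_c": {"photo"}}
f_input = tag_feature_weights(app_weights, app_tags, "input")
```

Because the result is normalized over the tag's own applications, the weights of one tag sum to 1 and can be compared across tags of different sizes.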
  • S3 Determine the first preference of each tag for each feature word belonging to it.
  • here, the first preference of each tag for each of its feature words is used as the degree of association between the tag and the feature word, and the first preference is determined as follows:
  • p_{t,j} represents the first preference of the tag t for the feature word j
  • f_{t,j} denotes the weight of the feature word j on the tag t, that is, the probability of its occurrence among the feature words of the applications in the application set having the tag t;
  • s_j represents the probability that the feature word j appears in the set of all feature words extracted from the application description information of all applications in the preset application library, wherein:
  • w_{i,j} represents the weight of the feature word j for the application i in the preset application library
  • AA represents the set of all applications in the preset application library
  • Aw represents the set of all feature words extracted from the application description information of all applications
  • n represents the number of applications in the application set AA
  • m represents the number of feature words in the feature word set Aw.
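The expressions for p_{t,j} and s_j are likewise missing from this text. The sketch below computes s_j as the share of feature word j in the summed weights of all applications, and then assumes, purely for illustration, that the first preference is the ratio p_{t,j} = f_{t,j} / s_j, so that a word that is more frequent under a tag than in the library as a whole scores above 1. The ratio, the function names and the sample values are assumptions; the source may define the combination of f_{t,j} and s_j differently.

```python
def global_word_probability(app_weights):
    """s[j]: share of feature word j in the summed weights of all applications."""
    totals = {}
    for weights in app_weights.values():
        for word, w in weights.items():
            totals[word] = totals.get(word, 0.0) + w
    grand_total = sum(totals.values())
    return {word: w / grand_total for word, w in totals.items()} if grand_total else {}


def first_preference(f_t, s):
    """Assumed form: p[j] = f[j] / s[j] for every feature word of the tag."""
    return {word: f / s[word] for word, f in f_t.items() if s.get(word, 0.0) > 0}


# Hypothetical data: weights of all applications and the weights of one tag.
app_weights = {
    "app_a": {"input method": 0.5, "typing": 0.5},
    "app_b": {"input method": 0.25, "skin": 0.75},
    "app_c": {"input method": 0.5, "camera": 0.5},
}
f_input = {"input method": 0.375, "typing": 0.25, "skin": 0.375}
s = global_word_probability(app_weights)
p_input = first_preference(f_input, s)
```

With this data "typing" occurs only in applications carrying the tag and scores 1.5, while "input method" also occurs elsewhere and scores 0.9.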
  • S4 Extract feature word information from the application description information of the new application to be labeled.
  • this step is implemented in the same way as step S1.
  • the feature word information is extracted from the application description information of the new application to be labeled, and it includes each feature word and its weight for the new application. It can likewise be written as w_i, where w_i = {w1: pci1, w2: pci2, w3: pci3, ...}.
  • for other related details, refer to the description of step S1, which is not repeated here.
  • S5 Determine, according to the first preference and the extracted feature word information of the new application, a second preference of the new application to each tag in the tag library.
  • the degree of association between the new application and each tag in the tag library can then be established.
  • the second preference of the new application for each tag in the tag library is used as this degree of association, and the second preference is determined as follows:
  • r_{i,t} represents the second preference of the new application i for the tag t
  • p_{t,j} represents the first preference of the tag t for the feature word j
  • w_{i,j} represents the weight, for the new application i, of the feature word j extracted from the application description information of the new application i;
  • AM represents the set of all feature words attributed to the tag t
  • m represents the number of feature words in the feature word set attributed to the tag t.
  • the new application i is regarded as a combination of the different feature words j extracted from its application description information, and the second preference of the new application i for the tag t is obtained by accumulating the first preference values of the tag t for each feature word belonging to the new application i. Note that if a feature word of the new application i is not in the feature word set belonging to the tag t, the first preference of the tag t for that feature word is zero.
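A minimal sketch of this accumulation follows. It assumes each first preference is weighted by the feature word's weight in the new application, r_{i,t} = Σ_j p_{t,j} · w_{i,j}, which matches the symbols defined above, but the exact formula is not present in this text. Iterating over the new application's words gives the same sum as iterating over the tag's word set, since a word missing from either side contributes zero. The names and values are hypothetical.

```python
def second_preference(new_app_weights, p_t):
    """Sum p[j] * w[j] over the new application's feature words.

    Words that the tag has no first preference for contribute zero.
    """
    return sum(p_t.get(word, 0.0) * w for word, w in new_app_weights.items())


# Hypothetical first preferences of one tag and weights of a new application.
p_input = {"input method": 0.9, "typing": 1.5, "skin": 1.5}
w_new = {"typing": 0.5, "keyboard": 0.25, "input method": 0.25}
r_new_input = second_preference(w_new, p_input)  # 1.5*0.5 + 0 + 0.9*0.25
```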
  • when the number of feature words in the feature word set attributed to the tag t is large, the number of lookups and accumulations is also large, which results in a large amount of calculation.
  • a preferred embodiment is therefore described.
  • a part of the feature words may be filtered out of the feature word set belonging to the tag t according to the size of the first preference value, removing the feature words with the smaller first preference values.
  • the number of feature words in the feature word set belonging to the tag t is thereby reduced, which reduces the amount of calculation.
  • according to the first preference of each tag for each of its feature words, a certain number of feature words are selected in a preset manner as the topic feature words of the corresponding tag.
  • the preset manner may be: sorting the feature words belonging to the tag t by first preference from largest to smallest and selecting a certain number of top-ranked feature words as the topic feature words.
  • alternatively, a first preset preference threshold may be set, and the feature words whose first preference is greater than or equal to the first preset preference threshold are selected as the topic feature words.
  • the number of topic feature words may be defined according to the data situation and the business scenario, for example 50, 100, 200 or another value; the second preference is then determined as follows:
  • r_{i,t} represents the second preference of the new application i for the tag t
  • p_{t,j} represents the first preference of the tag t for the feature word j
  • w_{i,j} represents the weight, for the new application i, of the feature word j extracted from the application description information of the new application i;
  • Topic_t represents the selected set of topic feature words attributed to the tag t;
  • m represents the number of feature words in the set of topic feature words belonging to the tag t.
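The two selection rules for topic feature words (top-ranked words, or words at or above a threshold) can be sketched as follows. The second preference is then accumulated over the reduced set only, again assuming the weighted sum Σ_j p_{t,j} · w_{i,j}. Function names, the value of k and the threshold are illustrative assumptions.

```python
import heapq


def topic_words_top_k(p_t, k):
    """Keep the k feature words with the largest first preference."""
    return dict(heapq.nlargest(k, p_t.items(), key=lambda item: item[1]))


def topic_words_by_threshold(p_t, threshold):
    """Keep the feature words whose first preference is >= threshold."""
    return {word: p for word, p in p_t.items() if p >= threshold}


# Hypothetical first preferences of one tag.
p_input = {"input method": 0.9, "typing": 1.5, "skin": 1.4, "free": 0.2}
topic_top2 = topic_words_top_k(p_input, 2)
topic_thr = topic_words_by_threshold(p_input, 0.9)

# Second preference restricted to the topic feature words (assumed weighted sum).
w_new = {"typing": 0.5, "keyboard": 0.25, "input method": 0.25}
r_topic = sum(topic_top2.get(word, 0.0) * w for word, w in w_new.items())
```

Shrinking each tag's word set this way trades a little recall for fewer lookups per tag, which is the saving the preferred embodiment is after.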
  • S6 Select the corresponding one or more tags from the tag library according to the second preference to label the new application.
  • the tags may be sorted by the new application's second preference for each tag from largest to smallest, and the one or more top-ranked tags selected for the new application.
  • the number of labels to be labeled may be defined according to the data situation and the business scenario, and may be any number between 1-5, such as 1, 2, 5, etc., or more.
  • a second preset preference threshold may be set, and one or more labels corresponding to one or more second preferences equal to or greater than the second preset preference threshold are selected to mark the new application.
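Both selection rules of step S6 are easy to express over a mapping from tag to second preference. The sketch below is illustrative only; the function names, the sample scores, the count of 2 and the threshold of 0.5 are assumptions.

```python
def select_tags_top_n(r_i, n):
    """Return the n tags with the largest second preference, best first."""
    ranked = sorted(r_i.items(), key=lambda item: item[1], reverse=True)
    return [tag for tag, _ in ranked[:n]]


def select_tags_by_threshold(r_i, threshold):
    """Return every tag whose second preference is >= threshold."""
    return [tag for tag, r in r_i.items() if r >= threshold]


# Hypothetical second preferences of one new application.
r_new = {"input": 0.975, "tools": 0.4, "photo": 0.05}
top_tags = select_tags_top_n(r_new, 2)
threshold_tags = select_tags_by_threshold(r_new, 0.5)
```

The top-n rule always returns a fixed number of tags, while the threshold rule may return none when no tag is a good match.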
  • in the method for labeling an application according to the present invention, the known applications in the application library and their tags, together with the application description information that introduces each application's characteristics and core functions and word segmentation technology, are used to establish an association between the new application to be labeled and the tags in the preset tag library. One or more tags suitable for the new application can thus be found automatically, which reduces labor costs and improves the accuracy and efficiency of labeling new applications.
  • FIG. 2 is a schematic block diagram of an apparatus for labeling an application according to an embodiment of the present invention. As shown in FIG. 2, the apparatus for labeling an application of the present invention includes:
  • the feature word information extracting unit is configured to extract feature word information from the application description information of each application in the preset application library, and extract feature word information from the application description information of the new application to be tagged.
  • steps S1 and S4 may be performed by the feature word information extracting unit.
  • the feature word information determining unit of the tag is configured to merge the corresponding feature word information of the plurality of applications having the same tag as the feature word information of the tag.
  • the description of the feature word information determining unit of the tag may refer to the detailed description of step S2 shown in FIG. 1, that is, step S2 may be performed by the feature word information determining unit of the tag.
  • the first preference determination unit is configured to determine a first preference of each tag for each feature word belonging to that tag.
  • the description about the first preference determination unit may refer to the detailed description of step S3 shown in FIG. 1, that is, step S3 may be performed by the first preference determination unit.
  • the second preference determining unit is configured to determine a second preference of the new application for each tag in the tag library based on the first preference and the extracted feature word information of the new application.
  • step S5 may be performed by the second preference determination unit.
  • the label labeling unit is configured to select a corresponding one or more labels from the label library according to the second preference to mark the new application.
  • step S6 may be performed by the label labeling unit.
  • the method for extracting the feature word information from the application description information of each application in the preset application library may include: first performing word segmentation on the application description information to extract the feature words, and then counting the probability of occurrence of each feature word as the weight of that feature word for the application it belongs to.
  • the method by which the feature word information determining unit of the tag merges the corresponding feature word information of the plurality of applications having the same tag, and uses it as the feature word information of the tag, may include: merging the same feature words in the feature word information of the applications having the same tag into one feature word, and using the feature words obtained after merging as the feature words of the tag. The weight of each of the feature words on the tag is then determined. The merged feature words and the weight of each feature word on the tag are then used as the feature word information of the tag.
  • the feature word information determining unit of the tag is configured to merge identical feature words into one feature word in the merging process, and the weight of each feature word on the tag is calculated as follows:
  • f_{t,j} represents the weight of the feature word j on the tag t
  • w_{i,j} denotes the weight of the feature word j for the application i having the tag t in the preset application library
  • A represents the set of applications with the tag t in the preset application library
  • W represents the set of feature words belonging to the applications in the application set A
  • n represents the number of applications in the application set A
  • m represents the number of feature words in the feature word set W.
  • the method for determining, by the first preference determining unit, the first preference may include:
  • the first preference is determined by the following formula:
  • p_{t,j} represents the first preference of the tag t for the feature word j
  • f_{t,j} represents the weight of the feature word j on the tag t
  • s_j represents the probability that the feature word j appears in the set of all feature words extracted from the application description information of all applications in the preset application library, wherein:
  • w_{i,j} represents the weight of the feature word j for the application i in the preset application library
  • AA represents the set of all applications in the preset application library
  • Aw represents the set of all feature words extracted from the application description information of all applications
  • n represents the number of applications in the application set AA
  • m represents the number of feature words in the feature word set Aw.
  • the method for determining, by the second preference determining unit, the second preference may include:
  • the second preference is determined by the following formula:
  • r_{i,t} represents the second preference of the new application i for the tag t
  • p_{t,j} represents the first preference of the tag t for the feature word j
  • w_{i,j} represents the weight, for the new application i, of the feature word j extracted from the application description information of the new application i;
  • AM represents the set of all feature words attributed to the tag t
  • m represents the number of feature words in the feature word set attributed to the tag t.
  • the method for determining, by the second preference determining unit, the second preference of the new application for each tag in the tag library comprises: first selecting, in a preset manner and according to the first preference of each tag for each of its feature words, a certain number of feature words as the topic feature words of the corresponding tag, and then determining the second preference.
  • the preset manner may be: selecting a certain number of feature words ranked highest by first preference as the topic feature words, or presetting a first preset preference threshold and selecting the feature words whose first preference is greater than or equal to the first preset preference threshold as the topic feature words.
  • the number of topic feature words can be customized according to the data situation and the business scenario, for example 50, 100, 200 or another value.
  • the second preference can be determined by the following formula:
  • r_{i,t} represents the second preference of the new application i for the tag t
  • p_{t,j} represents the first preference of the tag t for the feature word j
  • w_{i,j} represents the weight, for the new application i, of the feature word j extracted from the application description information of the new application i;
  • Topic_t represents the selected set of topic feature words attributed to the tag t;
  • m represents the number of feature words in the set of topic feature words belonging to the tag t.
  • the label labeling unit may select the corresponding one or more labels from the label library according to the second preference to label the new application.
  • the preset manner of selecting one or more labels from the tag library can be implemented in various ways.
  • the labels may be sorted by the new application's second preference for each label from largest to smallest, and the one or more top-ranked labels selected for the new application.
  • the number of labels to be labeled may be defined according to the data situation and the business scenario, and may be any number between 1-5, such as 1, 2, 5, etc., or more.
  • a second preset preference threshold may be set, and one or more labels corresponding to one or more second preferences equal to or greater than the second preset preference threshold are selected to mark the new application.
  • the device for labeling an application uses the known applications in the application library and their tags, together with the description information that introduces each application's characteristics and core functions and word segmentation technology, to associate the new application to be tagged with the tags in the tag library.
  • a computer program product for the method for labeling an application provided by an embodiment of the present invention comprises a computer readable storage medium storing program code, the program code comprising instructions for executing the method described in the foregoing method embodiment.
  • the functions may be stored in a computer readable storage medium if implemented in the form of a software functional unit and sold or used as a standalone product.
  • the technical solution of the present invention, in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product, which is stored in a storage medium and includes a number of instructions for causing a computer device (which may be a personal computer, smart tablet, smartphone, server, or network device, etc.) to perform all or part of the steps of the methods described in various embodiments of the present invention.
  • the foregoing storage medium includes various media that can store program codes, such as a USB flash drive, a removable hard disk, a read only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
  • the method, the device, the terminal and the computer readable storage medium for labeling an application provided by the embodiment of the present invention use the applications already in the library and their tags, together with application description information (which introduces an application's features and core functions) and word segmentation technology, to establish an association between the new application to be tagged and the tags in the preset tag library, enabling suitable tags for a new application to be found automatically, reducing manual cost and improving accuracy and productivity

Abstract

A method and device for tagging a label for an application, a terminal and a computer readable storage medium. The method comprises: extracting feature word information from application description information of each application in a preset application library (S1); merging the corresponding feature word information of a plurality of applications with the same label, and using the merged feature word information as feature word information of the label (S2); determining a first preference degree of each label on each feature word belonging to the label (S3); extracting feature word information from application description information of new applications which are to be tagged with labels (S4); determining a second preference degree of the new application on each label in the label library based on the first preference degree and the extracted feature word information of the new applications (S5); and according to the second preference degree, selecting corresponding one or more labels from the label library in a preset mode to tag the new application (S6).

Description

Method, device, terminal, and computer-readable storage medium for tagging applications
Cross-reference to related applications
This application claims priority to Chinese Patent Application No. 201710227588.8, entitled "Method and Device for Tagging Applications", filed with the Chinese Patent Office on April 10, 2017, the entire contents of which are incorporated herein by reference.
Technical field
The present invention relates to the field of information processing technologies, and in particular to a method, a device, a terminal, and a computer-readable storage medium for tagging applications.
Background
Every application offered in an app store or app market has one or more tags. Tags identify the category or content of an application and make it easier for users to find. While an app store or app market is in operation, applications newly added to the application library need to be tagged. For example, when a "Snake" application has just gone live in an app store or app market, it needs to be tagged, for instance with the tag "casual game". The traditional way to tag a newly launched application is for operations staff to judge from experience which tags suit it. The drawbacks of this approach include:
1. It incurs a large labor cost. For every new application added to the application library, operations staff must look through all the tags and pick the suitable ones for the application.
2. Accuracy is hard to guarantee and efficiency is low. Because so many new applications are added to the application library, operations staff cannot spend time downloading, installing, and trying out each one; they generally judge by the application name alone, which makes accuracy hard to guarantee. Tagging applications manually one by one is also inefficient.
Summary of the invention
In view of this, an object of the present invention is to provide a method, a device, a terminal, and a computer-readable storage medium for tagging applications, so as to mitigate at least one of the above problems.
A first embodiment of the present invention provides a method for tagging an application, comprising:
extracting feature word information from the application description information of each application in a preset application library;
merging the corresponding feature word information of a plurality of applications having the same tag, and using the merged information as the feature word information of that tag;
determining a first preference degree of each tag for each feature word belonging to it;
extracting feature word information from the application description information of a new application to be tagged;
determining, based on the first preference degree and the extracted feature word information of the new application, a second preference degree of the new application for each tag in the tag library; and
selecting, according to the second preference degree and in a preset manner, one or more corresponding tags from the tag library and applying them to the new application.
A second embodiment of the present invention provides a device for tagging an application, comprising:
a feature word information extracting unit, configured to extract feature word information from the application description information of each application in a preset application library, and to extract feature word information from the application description information of a new application to be tagged;
a tag feature word information determining unit, configured to merge the corresponding feature word information of a plurality of applications having the same tag and use the merged information as the feature word information of that tag;
a first preference degree determining unit, configured to determine a first preference degree of each tag for each feature word belonging to it;
a second preference degree determining unit, configured to determine, based on the first preference degree and the extracted feature word information of the new application, a second preference degree of the new application for each tag in the tag library; and
a tagging unit, configured to select, according to the second preference degree and in a preset manner, one or more corresponding tags from the tag library and apply them to the new application.
Optionally, the feature word information extracting unit is configured to perform word segmentation on the application description information to extract the feature words, and to compute the probability of occurrence of each feature word as the weight of that feature word for the application it belongs to, thereby obtaining the feature word information, which comprises the feature words and the weight of each feature word for the application it belongs to.
Optionally, the tag feature word information determining unit merges the corresponding feature word information of a plurality of applications having the same tag into the feature word information of that tag by:
merging identical feature words in the feature word information of the applications having the same tag into a single feature word, and using the merged feature words as the feature words of the tag;
determining the weight of each feature word on the tag; and
using the merged feature words and the weight of each feature word on the tag as the feature word information of the tag.
Optionally, the tag feature word information determining unit calculates the weight of each feature word on the tag as follows:
Figure PCTCN2017118709-appb-000001
with i ∈ A and j ∈ W,
where:
f_{t,j} is the weight of feature word j on tag t;
w_{i,j} is the weight of feature word j for application i, which has tag t, in the preset application library;
A is the set of applications in the preset application library that have tag t;
W is the set of feature words belonging to the applications in set A;
n is the number of applications in set A;
m is the number of feature words in set W.
Optionally, the first preference degree determining unit determines the first preference degree by:
calculating the first preference degree of each tag for each feature word belonging to it using the following formula:
Figure PCTCN2017118709-appb-000002
where:
p_{t,j} is the first preference degree of tag t for feature word j;
f_{t,j} is the weight of feature word j on tag t;
s_j is the probability that feature word j occurs in the set of all feature words extracted from the application description information of all applications in the preset application library, where:
Figure PCTCN2017118709-appb-000003
with i ∈ AA and j ∈ Aw,
where:
w_{i,j} is the weight of feature word j for application i in the preset application library;
AA is the set of all applications in the preset application library;
Aw is the set of all feature words extracted from the application description information of all applications;
n is the number of applications in set AA;
m is the number of feature words in set Aw.
Optionally, the second preference degree determining unit determines the second preference degree by:
calculating the second preference degree of the new application for each tag in the tag library using the following formula:
Figure PCTCN2017118709-appb-000004
with j ∈ AM,
where:
r_{i,t} is the second preference degree of new application i for tag t;
p_{t,j} is the first preference degree of tag t for feature word j;
w_{i,j} is the weight, for new application i, of feature word j extracted from the application description information of new application i;
AM is the set of all feature words obtained as belonging to tag t;
m is the number of feature words in the set of feature words belonging to tag t.
Optionally, the second preference degree determining unit determines the second preference degree of the new application for each tag in the tag library by:
selecting, in a preset manner and according to the first preference degree of each tag for each feature word belonging to it, a certain number of feature words as the topic feature words of the corresponding tag; and
determining the second preference degree using the following formula:
Figure PCTCN2017118709-appb-000005
with j ∈ topic_t,
where:
r_{i,t} is the second preference degree of new application i for tag t;
p_{t,j} is the first preference degree of tag t for feature word j;
w_{i,j} is the weight, for new application i, of feature word j extracted from the application description information of new application i;
topic_t is the selected set of topic feature words belonging to tag t;
m is the number of feature words in the set of topic feature words belonging to tag t.
Optionally, the second preference degree determining unit selects, in a preset manner and according to the first preference degree of each tag for each feature word belonging to it, a certain number of feature words as the topic feature words of the corresponding tag by:
ranking the feature words belonging to the tag by first preference degree in descending order, and selecting a preset number of top-ranked feature words as the topic feature words.
Optionally, the second preference degree determining unit selects, in a preset manner and according to the first preference degree of each tag for each feature word belonging to it, a certain number of feature words as the topic feature words of the corresponding tag by:
selecting, as the topic feature words, the feature words whose first preference degree is greater than or equal to a first preset preference threshold.
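The two selection rules for topic feature words can be sketched as follows. This is a minimal illustration rather than the claimed implementation: the function name `select_topic_words`, its parameters, and the sample preference values are all hypothetical.

```python
def select_topic_words(p_t, top_k=None, threshold=None):
    """Prune a tag's feature words to its topic feature words.

    p_t maps each feature word of tag t to its first preference degree.
    Exactly one of the two rules from the text is applied:
      - top_k: keep the top_k words ranked by first preference degree;
      - threshold: keep words whose first preference degree >= threshold.
    With neither argument, all feature words are kept.
    """
    if top_k is not None:
        ranked = sorted(p_t.items(), key=lambda kv: kv[1], reverse=True)
        return dict(ranked[:top_k])
    if threshold is not None:
        return {word: p for word, p in p_t.items() if p >= threshold}
    return dict(p_t)


# Hypothetical first preference degrees for one tag.
p_live = {"live streaming": 1.5, "entertainment": 1.2, "game": 0.5}
by_rank = select_topic_words(p_live, top_k=2)
by_threshold = select_topic_words(p_live, threshold=1.0)
```

With these sample values both rules keep "live streaming" and "entertainment" and drop "game", which shrinks the set that has to be scanned when the second preference degree is computed.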
Optionally, the tagging unit selects, according to the second preference degree and in a preset manner, one or more corresponding tags from the tag library and applies them to the new application by:
ranking the tags by the new application's second preference degree for each tag in descending order, and applying the one or more top-ranked tags to the new application.
Optionally, the tagging unit selects, according to the second preference degree and in a preset manner, one or more corresponding tags from the tag library and applies them to the new application by:
selecting the one or more tags whose second preference degree is greater than or equal to a second preset preference threshold, and applying them to the new application.
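As a rough sketch of this final selection step, the function below returns tag names ordered by second preference degree. The name `select_tags`, the parameters `top_n` and `threshold`, and the scores are hypothetical. The text presents ranking and thresholding as alternatives; allowing both arguments at once is an added convenience, not something the source describes.

```python
def select_tags(scores, top_n=None, threshold=None):
    """Pick tags for a new application from its second preference degrees.

    scores maps each tag in the tag library to the new application's
    second preference degree for that tag.
    """
    # Rank tags from highest to lowest second preference degree.
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    if threshold is not None:
        # Threshold rule: keep tags at or above the second preset threshold.
        ranked = [(tag, r) for tag, r in ranked if r >= threshold]
    if top_n is not None:
        # Ranking rule: keep only the top_n tags.
        ranked = ranked[:top_n]
    return [tag for tag, _ in ranked]


# Hypothetical second preference degrees for one new application.
scores = {"casual game": 0.9, "live streaming": 0.2, "tools": 0.05}
top_two = select_tags(scores, top_n=2)
above = select_tags(scores, threshold=0.5)
```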
An embodiment of the present invention further provides a terminal comprising a memory and a processor, the memory storing computer-readable instructions which, when executed by the processor, implement the method for tagging an application provided by the embodiments of the present invention.
An embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored, the computer program, when executed, implementing the method for tagging an application provided by the embodiments of the present invention.
The method, device, terminal, and computer-readable storage medium for tagging applications according to the embodiments of the present invention use the applications already in the application library and the tags they carry, together with application description information (which introduces an application's features and core functions) and word segmentation technology, to establish an association between a new application to be tagged and the tags in the tag library. One or more suitable tags are thereby found automatically and applied to the new application, which reduces labor cost and improves accuracy and efficiency.
Brief description of the drawings
FIG. 1 is a flowchart of a method for tagging an application according to an embodiment of the present invention;
FIG. 2 is a schematic block diagram of a device for tagging an application according to an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the embodiments and the accompanying drawings. The components of the embodiments of the present invention, as generally described and illustrated in the drawings, may be arranged and designed in a variety of different configurations. The following detailed description of the embodiments provided in the drawings is therefore not intended to limit the scope of the claimed invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
To tag an application newly added to the application library automatically, an association must be established between the new application to be tagged and the tags in the preset tag library. The method described in the embodiments below uses the applications already in the application library and the tags they carry, together with application description information (which introduces an application's features and core functions) and word segmentation technology, to establish that association and thereby tag applications automatically.
An embodiment of the present invention provides a terminal comprising a memory, a processor, and a device for tagging an application. The memory, the processor, and the other components are electrically connected to one another, directly or indirectly, to transmit or exchange data. The device for tagging an application comprises at least one software functional module that can be stored in the memory in the form of software or firmware, or built into the operating system (OS) of the terminal. The processor is configured to execute, upon receiving an execution instruction, the executable modules stored in the memory, thereby implementing the corresponding functions, such as the method for tagging an application provided by this embodiment.
It should be understood that, in this embodiment, the terminal may also include more, fewer, or entirely different components than those described above, which is not limited here.
FIG. 1 is a flowchart of a method for tagging an application according to an embodiment of the present invention; the method is applicable to the terminal described above. As shown in FIG. 1, the method comprises the following steps S1 to S6.
S1: Extract feature word information from the application description information of each application in the preset application library.
When an app market or app store is developed, an application library is usually preset, and the third-party applications downloaded from the app market or app store are stored in that preset application library. In addition, every third-party application offered by the app store or app market has one or more tags, which come from a tag library preset when the app store or app market is developed. Tags identify the category or content of an application and make it easier for users to find.
Each application in the preset application library also has application description information, which introduces the features and core functions of the application so that users can understand it and become interested in it.
The method provided by the present invention may first perform word segmentation on the application description information to extract feature words, and then compute the probability of occurrence of each feature word as the weight of that feature word for the application it belongs to. The feature word information in step S1 therefore comprises the feature words and the weight of each feature word for the application it belongs to. Word segmentation technology may be used to segment the application description information; the extracted feature words are the words obtained after segmentation, also called keywords.
For example, the feature word information extracted from the description information of an application i is denoted w_i:
w_i = {w1: pci1, w2: pci2, w3: pci3, …}
where w1: pci1, w2: pci2, w3: pci3, … are feature words with their weights; for example, w1 is a feature word and pci1 is its weight for application i.
For example, the application description information of the application "Sogou Pinyin Input Method" is: "An input method with precise typing and the most personalized interface, with an all-round input method". Segmenting this description yields the feature words: "typing, precise, interface, personalized, input method, has, all-round, input method". The feature word information of "Sogou Pinyin Input Method" is then:
Figure PCTCN2017118709-appb-000006
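Step S1 can be sketched in a few lines, assuming the description has already been segmented into words (the segmenter itself is outside this sketch). The function name `feature_word_info` is hypothetical, and the token list uses English renderings of the segmented words from the Sogou example.

```python
from collections import Counter


def feature_word_info(tokens):
    """Return {feature word: weight}, where the weight is the word's
    probability of occurrence among the segmented tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: count / total for word, count in counts.items()}


# Segmented description of the input-method example (English renderings).
tokens = ["typing", "precise", "interface", "personalized",
          "input method", "has", "all-round", "input method"]
info = feature_word_info(tokens)
```

Under this reading, "input method" occurs twice among eight tokens and gets weight 0.25, while each of the other six words gets 0.125.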
S2: Merge the corresponding feature word information of a plurality of applications having the same tag, and use the merged information as the feature word information of that tag.
In detail, identical feature words in the feature word information of the applications having the same tag may be merged into a single feature word, and the merged feature words are used as the feature words of the tag. The weight of each feature word on the tag is then determined. The merged feature words and the weight of each feature word on the tag together form the feature word information of the tag.
Each existing application in the preset application library has one or more tags. After the feature words belonging to each application have been extracted from its description information, the feature word information of the applications that share a tag is merged, and the merged feature word information is used as the feature word information of that tag.
The feature word information of a tag likewise comprises feature words and the weight of each feature word on the tag. When the feature word information of several applications is merged, identical feature words are merged into one feature word, and the weight of each feature word on the tag is calculated as follows:
Figure PCTCN2017118709-appb-000007
with i ∈ A and j ∈ W,
where:
f_{t,j} is the weight of feature word j on tag t;
w_{i,j} is the weight of feature word j for application i, which has tag t, in the preset application library;
A is the set of applications in the preset application library that have tag t;
W is the set of feature words belonging to the applications in set A;
n is the number of applications in set A;
m is the number of feature words in set W.
It follows that the weight of a feature word on a tag is the probability that the feature word occurs in the set of feature words belonging to the applications that have that tag.
For example, take the tag "live streaming", and suppose two applications carry it: "Douyu TV" and "YY". The feature word information of the application "Douyu TV" is
Figure PCTCN2017118709-appb-000008
The feature word information of the application "YY" is
Figure PCTCN2017118709-appb-000009
After merging, the feature words of the tag "live streaming" are ("game", "live streaming", "entertainment"), and the feature word information of the tag "live streaming" is:
Figure PCTCN2017118709-appb-000010
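The merge in step S2 can be sketched as below. The sketch rests on an assumption: it reads "the probability that the feature word occurs in the feature word set of the tag's applications" as f_{t,j} = (Σ_{i∈A} w_{i,j}) / (Σ_{i∈A} Σ_{j∈W} w_{i,j}), which matches the variable definitions but is not copied from the formula image. The function name and the per-application weights are hypothetical and are not the values shown in the figures.

```python
from collections import defaultdict


def tag_feature_info(app_infos):
    """Merge the feature word information of all applications sharing a tag.

    app_infos is a list of {feature word: weight} dicts, one per application.
    Identical words are merged, and each merged weight is normalized by the
    total weight over all applications and words (an assumed reading of the
    formula, see the lead-in).
    """
    totals = defaultdict(float)
    for info in app_infos:
        for word, weight in info.items():
            totals[word] += weight
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {word: total / grand_total for word, total in totals.items()}


# Hypothetical weights for the two applications tagged "live streaming".
douyu_tv = {"game": 0.5, "live streaming": 0.5}
yy = {"live streaming": 0.5, "entertainment": 0.5}
live_streaming = tag_feature_info([douyu_tv, yy])
```

With these sample weights the merged tag has three feature words, and "live streaming", which appears in both applications, ends up with twice the weight of the other two.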
S3: Determine the first preference degree of each tag for each feature word belonging to it.
Once the feature word information of every tag in the tag library has been obtained (the tags carried by all applications in the preset application library can be taken to cover every tag in the tag library), the degree of association between each tag and the feature words belonging to it needs to be established. Here the first preference degree of each tag for each feature word belonging to it serves as that degree of association, and it is determined as follows:
Figure PCTCN2017118709-appb-000011
where:
p_{t,j} is the first preference degree of tag t for feature word j;
f_{t,j} is the weight of feature word j on tag t, that is, the probability that it occurs in the set of feature words belonging to the applications that have tag t;
s_j is the probability that feature word j occurs in the set of all feature words extracted from the application description information of all applications in the preset application library, where:
Figure PCTCN2017118709-appb-000012
with i ∈ AA and j ∈ Aw,
where:
w_{i,j} is the weight of feature word j for application i in the preset application library;
AA is the set of all applications in the preset application library;
Aw is the set of all feature words extracted from the application description information of all applications;
n is the number of applications in set AA;
m is the number of feature words in set Aw.
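The sketch below computes s_j as a normalized sum of weights over the whole library, which follows from its definition as a probability of occurrence. How f_{t,j} and s_j are combined into p_{t,j} is given only by the formula image, so the default ratio f_{t,j} / s_j used here is purely an illustrative assumption, exposed as a replaceable `combine` argument. All names and sample numbers are hypothetical.

```python
def global_word_probability(all_app_infos):
    """Return {feature word: s_j} over every application in the library."""
    totals = {}
    for info in all_app_infos:
        for word, weight in info.items():
            totals[word] = totals.get(word, 0.0) + weight
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {word: total / grand_total for word, total in totals.items()}


def first_preference(f_t, s, combine=lambda f, s_j: f / s_j):
    """Return {feature word: p_tj} for one tag.

    ASSUMPTION: the default `combine` (a plain ratio) is a placeholder for
    the formula in the figure, not a statement of what that formula is.
    Words with no global probability are skipped to avoid dividing by zero.
    """
    return {word: combine(f, s[word]) for word, f in f_t.items() if s.get(word)}


# Hypothetical library: two "live streaming" applications and one other.
library = [
    {"game": 0.5, "live streaming": 0.5},
    {"live streaming": 0.5, "entertainment": 0.5},
    {"game": 1.0},
]
s = global_word_probability(library)
f_live = {"game": 0.25, "live streaming": 0.5, "entertainment": 0.25}
p_live = first_preference(f_live, s)
```

Under the ratio assumption, "game" scores lower for the tag than "live streaming" or "entertainment", because it is also common in applications outside the tag.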
S4: Extract feature word information from the application description information of the new application to be tagged.
This step is implemented in the same way as step S1. Feature word information is extracted from the application description information of the new application to be tagged; it comprises the feature words and the weight of each feature word for the new application it belongs to. It can likewise be denoted w_i:
w_i = {w1: pci1, w2: pci2, w3: pci3, …}.
For other related details, see the description of step S1, which is not repeated here.
S5:基于所述第一偏好度和提取出的新应用的特征词信息,确定该新应用对标签库里的每个标签的第二偏好度。S5: Determine, according to the first preference and the extracted feature word information of the new application, a second preference of the new application to each tag in the tag library.
当有了每个标签对归属其的每个特征词的第一偏好度、以及提取的归属于新应用的特征词信息,就可以建立起该新应用与标签库里的每个标签的关联度,在这里以该新应用对标签库里的每个标签的第二偏好度作为关联度,确定所述第二偏好度的方法如下:When there is a first preference for each feature word assigned to each tag pair and the extracted feature word information attributed to the new application, the association degree of the new application with each tag in the tag library can be established. Here, the second preference of each tag in the tag library is used as the degree of association by the new application, and the method for determining the second preference is as follows:
Figure PCTCN2017118709-appb-000013
且j∈AM
Figure PCTCN2017118709-appb-000013
And j∈AM
where:
$r_{i,t}$ represents the second preference of new application i for tag t;
$p_{t,j}$ represents the first preference of tag t for feature word j;
$w_{i,j}$ represents the weight, for new application i, of feature word j extracted from the application description information of new application i;
AM represents the set of all feature words obtained for tag t;
m represents the number of feature words in the feature word set of tag t.
As the formula shows, new application i is treated as a combination of the different feature words j extracted from its application description information. The second preference of new application i for tag t is obtained by accumulating the first preference of tag t for each feature word belonging to new application i. Note that if a feature word belonging to new application i is not in the feature word set of tag t, the first preference of tag t for that feature word is 0.
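The accumulation described above can be sketched in Python. The formula itself is only available as an image, so this sketch assumes that each first preference $p_{t,j}$ is multiplied by the weight $w_{i,j}$ before summing; that product form is suggested by the symbol list but is an assumption, as are the function name and the sample data.

```python
def second_preference(app_weights, tag_prefs):
    """Accumulate a tag's first preferences over a new application's feature words.

    app_weights: {feature_word: w_ij} for new application i
    tag_prefs:   {feature_word: p_tj} for tag t (its feature word set AM)
    ASSUMPTION: each term is the product p_tj * w_ij.
    """
    # A word that is not in the tag's feature word set contributes 0,
    # as the text specifies.
    return sum(tag_prefs.get(word, 0.0) * weight
               for word, weight in app_weights.items())


# Hypothetical data for illustration only.
new_app = {"chat": 0.5, "video": 0.3, "puzzle": 0.2}
social_tag = {"chat": 2.0, "video": 1.0}
score = second_preference(new_app, social_tag)  # "puzzle" adds nothing
```

Storing each tag's preferences in a dictionary keeps the lookup for every feature word of the new application at constant time.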
In this embodiment, when the feature word set of tag t contains many feature words, many lookups and accumulations are needed, which makes the computation expensive. A preferred embodiment is described below: based on the first preference values, the feature words with smaller first preference values are filtered out of the feature word set of tag t in advance. This reduces the number of feature words in the set and therefore the amount of computation.
Optionally, a certain number of feature words are first selected in a preset manner, according to the first preference of each tag for each feature word belonging to it, as the topic feature words of that tag. The preset manner may be to sort the feature words of tag t by first preference in descending order and select a certain number of top-ranked feature words as topic feature words. Alternatively, a first preset preference threshold may be set, and the feature words whose first preference is greater than or equal to this threshold are selected as topic feature words. The number may also be defined according to the data and the business scenario, for example 50, 100, 200 or another value. The second preference is then determined as follows:
[Figure PCTCN2017118709-appb-000014: formula image], with $j \in topic_t$
where:
$r_{i,t}$ represents the second preference of new application i for tag t;
$p_{t,j}$ represents the first preference of tag t for feature word j;
$w_{i,j}$ represents the weight, for new application i, of feature word j extracted from the application description information of new application i;
$topic_t$ represents the selected set of topic feature words of tag t;
m represents the number of feature words in the topic feature word set of tag t.
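The pre-filtering of topic feature words might look like the following sketch. The function name `select_topic_words` and its parameters `top_k` and `threshold` are hypothetical; the text only states that either a count or a threshold on the first preference may be used.

```python
def select_topic_words(tag_prefs, top_k=None, threshold=None):
    """Keep the feature words of one tag that have the largest first preference.

    tag_prefs: {feature_word: p_tj} for tag t
    top_k:     keep at most this many top-ranked words (e.g. 50, 100, 200)
    threshold: keep only words whose first preference is >= this value
    """
    # Rank feature words by first preference, largest first.
    ranked = sorted(tag_prefs.items(), key=lambda kv: kv[1], reverse=True)
    if threshold is not None:
        ranked = [(word, p) for word, p in ranked if p >= threshold]
    if top_k is not None:
        ranked = ranked[:top_k]
    return dict(ranked)


# Hypothetical first preferences for one tag.
tag_prefs = {"chat": 2.0, "video": 1.0, "the": 0.1}
topic_words = select_topic_words(tag_prefs, top_k=2)
```

The result can replace the full feature word set of the tag when the second preference is computed, so fewer terms are accumulated per tag.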
S6: According to the second preference, select one or more corresponding tags from the tag library in a preset manner and label the new application with them.
The preset manner of selecting one or more corresponding tags from the tag library can take several forms. For example, the tags may be sorted by the new application's second preference for each tag in descending order, and the new application labeled with the one or more top-ranked tags. The number of tags to apply may be defined according to the data and the business scenario; it may be any number from 1 to 5, such as 1, 2 or 5, or it may be larger.
Alternatively, a second preset preference threshold may be set, and the new application labeled with the one or more tags whose second preference is greater than or equal to this threshold.
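Step S6 can be illustrated with a short sketch that supports both selection modes. The names `select_tags`, `max_tags` and `threshold`, and the sample scores, are illustrative assumptions rather than anything defined in the text.

```python
def select_tags(second_prefs, max_tags=None, threshold=None):
    """Pick the tags to apply to a new application.

    second_prefs: {tag: r_it}, the new application's second preference per tag
    max_tags:     keep at most this many top-ranked tags (e.g. 1 to 5)
    threshold:    keep only tags whose second preference is >= this value
    """
    # Rank tags by second preference, largest first.
    ranked = sorted(second_prefs.items(), key=lambda kv: kv[1], reverse=True)
    if threshold is not None:
        ranked = [(tag, r) for tag, r in ranked if r >= threshold]
    if max_tags is not None:
        ranked = ranked[:max_tags]
    return [tag for tag, _ in ranked]


# Hypothetical second preferences of one new application.
prefs = {"social": 1.3, "game": 0.2, "video": 0.9}
chosen = select_tags(prefs, max_tags=2)
```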
According to the method for labeling an application of the present invention, the known applications in the application library and their tags are used together with word segmentation and the application description information, which introduces each application's characteristics and core functions, to establish an association between the new application to be labeled and the tags in the preset tag library. One or more suitable tags are thereby found automatically and applied to the new application, which reduces labor cost and improves the accuracy and efficiency of labeling new applications.
FIG. 2 is a schematic block diagram of the apparatus for labeling an application provided by an embodiment of the present invention. As shown in FIG. 2, the apparatus includes:
A feature word information extraction unit, configured to extract feature word information from the application description information of each application in the preset application library, and to extract feature word information from the application description information of the new application to be labeled.
In this embodiment, for a description of the feature word information extraction unit, refer to the detailed description of steps S1 and S4 shown in FIG. 1; that is, steps S1 and S4 may be performed by the feature word information extraction unit.
A tag feature word information determination unit, configured to merge the corresponding feature word information of multiple applications having the same tag into the feature word information of that tag.
In this embodiment, for a description of the tag feature word information determination unit, refer to the detailed description of step S2 shown in FIG. 1; that is, step S2 may be performed by the tag feature word information determination unit.
A first preference determination unit, configured to determine the first preference of each tag for each feature word belonging to it.
In this embodiment, for a description of the first preference determination unit, refer to the detailed description of step S3 shown in FIG. 1; that is, step S3 may be performed by the first preference determination unit.
A second preference determination unit, configured to determine, based on the first preference and the extracted feature word information of the new application, the second preference of the new application for each tag in the tag library.
In this embodiment, for a description of the second preference determination unit, refer to the detailed description of step S5 shown in FIG. 1; that is, step S5 may be performed by the second preference determination unit.
A tagging unit, configured to select, according to the second preference, one or more corresponding tags from the tag library in a preset manner and label the new application with them.
In this embodiment, for a description of the tagging unit, refer to the detailed description of step S6 shown in FIG. 1; that is, step S6 may be performed by the tagging unit.
For the specific way of selecting one or more tags in a preset manner, refer to the implementation described in the foregoing method embodiment.
Optionally, the feature word information extraction unit may extract feature word information from the application description information of each application in the preset application library as follows: first perform word segmentation on the application description information to extract the feature words, then compute the probability of occurrence of each feature word as the weight of that feature word for the application to which it belongs.
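A minimal sketch of this extraction step is shown below. It assumes that the "probability of occurrence" is a word's count divided by the total number of extracted words, and it uses a simple regular-expression tokenizer as a stand-in for a real word segmenter (Chinese text would need a dedicated one). The function name and the `stop_words` parameter are hypothetical.

```python
import re
from collections import Counter


def extract_feature_words(description, stop_words=frozenset()):
    """Return {feature_word: weight} for one application description.

    ASSUMPTION: weight = occurrences of the word / total extracted words.
    The regex tokenizer is only a placeholder for proper word segmentation.
    """
    tokens = [tok for tok in re.findall(r"[a-z0-9]+", description.lower())
              if tok not in stop_words]
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: n / total for word, n in counts.items()}


# Hypothetical description text.
weights = extract_feature_words("Chat with friends. Video chat!")
# "chat" occurs 2 times out of 5 tokens, so its weight is 0.4
```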
Optionally, the tag feature word information determination unit may merge the corresponding feature word information of multiple applications having the same tag into the feature word information of that tag as follows: identical feature words in the feature word information of the applications having the same tag are merged into a single feature word, and the feature words obtained after merging serve as the feature words of the tag. The weight of each feature word on the tag is then determined. Finally, the merged feature words and their weights on the tag together form the feature word information of the tag.
Optionally, the tag feature word information determination unit is configured to merge multiple identical feature words into one feature word during merging, and the weight of each feature word on the tag is calculated as follows:
[Figure PCTCN2017118709-appb-000015: formula image], with $i \in A$, $j \in W$
where:
$f_{t,j}$ represents the weight of feature word j on tag t;
$w_{i,j}$ represents the weight of feature word j for application i, which has tag t, in the preset application library;
A represents the set of applications with tag t in the preset application library;
W represents the set of feature words belonging to the applications in application set A;
n represents the number of applications in application set A;
m represents the number of feature words in feature word set W.
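The merging step can be sketched as follows. Because the weight formula is only available as an image, this sketch assumes that $f_{t,j}$ is the weight of word j summed over the applications in A, divided by the total weight of all words in W. That normalization, the function name and the sample data are assumptions, not the confirmed formula.

```python
from collections import defaultdict


def merge_tag_feature_words(apps_with_tag):
    """Merge per-application feature words into one tag-level mapping.

    apps_with_tag: list of {feature_word: w_ij}, one dict per application in A
    ASSUMPTION: f_tj = sum_i w_ij / sum_j sum_i w_ij.
    """
    summed = defaultdict(float)
    for app_weights in apps_with_tag:
        for word, weight in app_weights.items():
            summed[word] += weight  # identical words collapse into one entry
    total = sum(summed.values())
    if total == 0:
        return {}
    return {word: value / total for word, value in summed.items()}


# Two hypothetical applications that share the same tag.
apps = [{"chat": 0.5, "video": 0.5}, {"chat": 1.0}]
tag_weights = merge_tag_feature_words(apps)
```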
Optionally, the first preference determination unit may determine the first preference using the following formula:
[Figure PCTCN2017118709-appb-000016: formula image]
where:
$p_{t,j}$ represents the first preference of tag t for feature word j;
$f_{t,j}$ represents the weight of feature word j on tag t;
$s_j$ represents the probability that feature word j appears in the set of all feature words extracted from the application description information of all applications in the preset application library, where:
[Figure PCTCN2017118709-appb-000017: formula image], with $i \in AA$, $j \in Aw$
where:
$w_{i,j}$ represents the weight of feature word j for application i in the preset application library;
AA represents the set of all applications in the preset application library;
Aw represents the set of all feature words extracted from the application description information of all applications;
n represents the number of applications in application set AA;
m represents the number of feature words in feature word set Aw.
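The relationship between $f_{t,j}$, $s_j$ and the first preference can be illustrated with a hedged sketch. Both formulas are only available as images, so two assumptions are made here: $s_j$ is taken as the weight of word j summed over all applications and divided by the total weight of all words, and the first preference is taken as the ratio $f_{t,j} / s_j$. The ratio is one plausible way to favor words that are more frequent under a tag than in the library overall; it is not confirmed by the text, and all names are hypothetical.

```python
def global_word_probability(all_apps):
    """Return {feature_word: s_j} over every application in the library.

    ASSUMPTION: s_j = sum_i w_ij / sum_j sum_i w_ij.
    """
    totals = {}
    for app_weights in all_apps:
        for word, weight in app_weights.items():
            totals[word] = totals.get(word, 0.0) + weight
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {word: value / grand_total for word, value in totals.items()}


def first_preference(tag_weights, word_probs):
    """Return {feature_word: p_tj} for one tag.

    ASSUMPTION: p_tj = f_tj / s_j (not confirmed by the source text).
    """
    return {word: f / word_probs[word]
            for word, f in tag_weights.items()
            if word_probs.get(word, 0.0) > 0}


# Hypothetical library of three applications and one tag's merged weights.
library = [{"chat": 0.5, "video": 0.5}, {"chat": 1.0}, {"video": 1.0, "game": 1.0}]
s = global_word_probability(library)
p = first_preference({"chat": 0.75, "video": 0.25}, s)
```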
Optionally, the second preference determination unit may determine the second preference using the following formula:
[Figure PCTCN2017118709-appb-000018: formula image], with $j \in AM$
where:
$r_{i,t}$ represents the second preference of new application i for tag t;
$p_{t,j}$ represents the first preference of tag t for feature word j;
$w_{i,j}$ represents the weight, for new application i, of feature word j extracted from the application description information of new application i;
AM represents the set of all feature words obtained for tag t;
m represents the number of feature words in the feature word set of tag t.
To reduce the computation needed for the second preference, the second preference determination unit may determine the second preference of the new application for each tag in the tag library as follows: first select, in a preset manner and according to the first preference of each tag for each feature word belonging to it, a certain number of feature words as the topic feature words of that tag, then determine the second preference. In detail, the feature words of tag t may be sorted by first preference in descending order and a certain number of top-ranked feature words selected as topic feature words. Alternatively, a first preset preference threshold may be set, and the feature words whose first preference is greater than or equal to this threshold are selected as topic feature words. The number may also be defined according to the data and the business scenario, for example 50, 100, 200 or another value.
The second preference can then be determined using the following formula:
[Figure PCTCN2017118709-appb-000019: formula image], with $j \in topic_t$
where:
$r_{i,t}$ represents the second preference of new application i for tag t;
$p_{t,j}$ represents the first preference of tag t for feature word j;
$w_{i,j}$ represents the weight, for new application i, of feature word j extracted from the application description information of new application i;
$topic_t$ represents the selected set of topic feature words of tag t;
m represents the number of feature words in the topic feature word set of tag t.
After the second preference determination unit has determined the second preference for each tag, the tagging unit may select, according to the second preference, one or more corresponding tags from the tag library in a preset manner and label the new application with them.
In detail, the preset manner of selecting one or more corresponding tags from the tag library can take several forms. For example, the tags may be sorted by the new application's second preference for each tag in descending order, and the new application labeled with the one or more top-ranked tags. The number of tags to apply may be defined according to the data and the business scenario; it may be any number from 1 to 5, such as 1, 2 or 5, or it may be larger.
Alternatively, a second preset preference threshold may be set, and the new application labeled with the one or more tags whose second preference is greater than or equal to this threshold.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working process of the apparatus described in the product embodiment may refer to the corresponding process in the foregoing method embodiment, and is not repeated here.
According to the apparatus for labeling an application of the present invention, the known applications in the application library and their tags are used together with word segmentation and the application description information, which introduces each application's characteristics and core functions, to establish an association between the new application to be labeled and the tags in the preset tag library. One or more suitable tags are thereby found automatically and applied to the new application, which reduces labor cost and improves accuracy and work efficiency.
The computer program product of the method for labeling an application provided by an embodiment of the present invention includes a computer-readable storage medium storing program code. The instructions in the program code can be used to execute the method for labeling an application described in the foregoing method embodiment. For the specific implementation, refer to the method embodiment, which is not repeated here.
If the functions are implemented as software functional units and sold or used as a standalone product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied as a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a smart tablet, a smartphone, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The above is only a specific embodiment of the present invention, but the scope of protection of the present invention is not limited to it. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be determined by the scope of protection of the claims.
Industrial applicability
The method, apparatus, terminal and computer-readable storage medium for labeling an application provided by the embodiments of the present invention use the known applications in the application library and their tags, together with word segmentation and the application description information that introduces each application's characteristics and core functions, to establish an association between the new application to be labeled and the tags in the preset tag library. One or more suitable tags are thereby found automatically and applied to the new application, which reduces labor cost and improves accuracy and work efficiency.

Claims (24)

  1. A method for labeling an application, characterized by comprising:
    extracting feature word information from the application description information of each application in a preset application library;
    merging the corresponding feature word information of multiple applications having the same tag into the feature word information of that tag;
    determining the first preference of each tag for each feature word belonging to it;
    extracting feature word information from the application description information of a new application to be labeled;
    determining, based on the first preference and the extracted feature word information of the new application, the second preference of the new application for each tag in a tag library;
    selecting, according to the second preference, one or more corresponding tags from the tag library in a preset manner and labeling the new application with them.
  2. The method according to claim 1, wherein the step of extracting feature word information from the application description information of each application in the preset application library comprises:
    performing word segmentation on the application description information to extract the feature words;
    computing the probability of occurrence of each feature word as the weight of that feature word for the application to which it belongs, so as to obtain the feature word information, the feature word information containing the feature words and the weight of each feature word for the application to which it belongs.
  3. The method according to claim 1 or 2, wherein the step of merging the corresponding feature word information of multiple applications having the same tag into the feature word information of that tag comprises:
    merging identical feature words in the feature word information of the applications having the same tag into a single feature word, and using the feature words obtained after merging as the feature words of the tag;
    determining the weight of each feature word on the tag;
    using the feature words obtained after merging and the weight of each feature word on the tag as the feature word information of the tag.
  4. The method according to claim 3, wherein the weight of each feature word on the tag is calculated as follows:
    [Figure PCTCN2017118709-appb-100001: formula image], with $i \in A$, $j \in W$
    where:
    $f_{t,j}$ represents the weight of feature word j on tag t;
    $w_{i,j}$ represents the weight of feature word j for application i, which has tag t, in the preset application library;
    A represents the set of applications with tag t in the preset application library;
    W represents the set of feature words belonging to the applications in application set A;
    n represents the number of applications in application set A;
    m represents the number of feature words in feature word set W.
  5. The method according to any one of claims 1 to 4, wherein the step of determining the first preference of each tag for each feature word belonging to it comprises:
    calculating the first preference of each tag for each feature word belonging to it using the following formula:
    [Figure PCTCN2017118709-appb-100002: formula image]
    where:
    $p_{t,j}$ represents the first preference of tag t for feature word j;
    $f_{t,j}$ represents the weight of feature word j on tag t;
    $s_j$ represents the probability that feature word j appears in the set of all feature words extracted from the application description information of all applications in the preset application library, where:
    [Figure PCTCN2017118709-appb-100003: formula image], with $i \in AA$, $j \in Aw$
    where:
    $w_{i,j}$ represents the weight of feature word j for application i in the preset application library;
    AA represents the set of all applications in the preset application library;
    Aw represents the set of all feature words extracted from the application description information of all applications;
    n represents the number of applications in application set AA;
    m represents the number of feature words in feature word set Aw.
  6. The method according to any one of claims 1 to 5, wherein the step of determining the second preference of the new application for each tag in the tag library comprises:
    calculating the second preference of the new application for each tag in the tag library using the following formula:
    [Figure PCTCN2017118709-appb-100004: formula image], with $j \in AM$
    where:
    $r_{i,t}$ represents the second preference of new application i for tag t;
    $p_{t,j}$ represents the first preference of tag t for feature word j;
    $w_{i,j}$ represents the weight, for new application i, of feature word j extracted from the application description information of new application i;
    AM represents the set of all feature words obtained for tag t;
    m represents the number of feature words in the feature word set of tag t.
  7. The method according to any one of claims 1 to 6, wherein the step of determining the second preference of the new application for each tag in the tag library comprises:
    selecting, in a preset manner and according to the first preference of each tag for each feature word belonging to it, a certain number of feature words as the topic feature words of the corresponding tag;
    determining the second preference using the following formula:
    [Figure PCTCN2017118709-appb-100005: formula image], with $j \in topic_t$
    其中:among them:
    r i,t表示新应用i对标签t的第二偏好度; r i,t represents a second preference of the new application i for the tag t;
    p t,j表示标签t对特征词j的第一偏好度; p t,j represents the first preference of the tag t for the feature word j;
    w i,j表示从新应用i的应用描述信息中提取的特征词j对该新应用i的权重; w i,j represents the weight of the feature word j extracted from the application description information of the new application i for the new application i;
    topic t表示所选取的归属于标签t的一定数量的主题特征词的集合; Topic t represents a selected set of subject feature words attributed to the tag t;
    m表示归属于标签t的主题特征词集合里的特征词数量。m represents the number of feature words in the set of subject feature words belonging to the tag t.
  8. 根据权利要求7所述的方法,其特征在于,所述根据每个所述标签对归属其的每个所述特征词的第一偏好度,按预设方式选取一定数量的特征词作为相应标签的主题特征词的步骤包括:The method according to claim 7, wherein the selecting, according to a first preference degree of each of the feature words belonging to each of the tags, a certain number of feature words as corresponding tags according to a preset manner The steps of the topic feature words include:
    根据所述标签对归属于其的每个特征词的第一偏好度的从大到小顺序,选取第一偏好度排名在前面的预设数量的特征词作为主题特征词,Selecting a preset number of feature words ranked first in the first preference degree as the topic feature words according to a descending order of the first preference degree of each feature word belonging to the tag.
  9. 根据权利要求7所述的方法,其特征在于,所述根据每个所述标签对归属其的每个所述特征词的第一偏好度,按预设方式选取一定数量的特征词作为相应标签的主题特征词的步骤包括:The method according to claim 7, wherein the selecting, according to a first preference degree of each of the feature words belonging to each of the tags, a certain number of feature words as corresponding tags according to a preset manner The steps of the topic feature words include:
    选取大于或等于第一预设偏好度阈值的多个第一偏好度所对应的多个特征词作为主题特征词。A plurality of feature words corresponding to the plurality of first preference degrees greater than or equal to the first preset preference threshold are selected as the topic feature words.
  10. 根据权利要求1至7任意一项所述的方法,其特征在于,根据该第二偏好度按预设方式从标签库里选取相应的1个或多个标签给该新应用标注上的步骤包括:The method according to any one of claims 1 to 7, wherein the step of selecting the corresponding one or more tags from the tag library according to the second preference to label the new application comprises: :
    根据所述新应用对每个标签的第二偏好度值从大到小的顺序,选取第二偏好度排名在前面的1个或多个标签给该新应用标注上。According to the order in which the new application applies the second preference value of each tag from large to small, the first preference is ranked by the first one or more tags to the new application.
  11. 根据权利要求1至7任意一项所述的方法,其特征在于,根据该第二偏好度按预 设方式从标签库里选取相应的1个或多个标签给该新应用标注上的步骤包括:The method according to any one of claims 1 to 7, wherein the step of selecting the corresponding one or more tags from the tag library according to the second preference to label the new application comprises: :
    选取大于或等于第二预设偏好度阈值的1个或多个第二偏好度所对应的1个或多个标签给该新应用标注上。The one or more tags corresponding to one or more second preference degrees greater than or equal to the second preset preference threshold are selected for the new application.
  12. A device for labeling an application, comprising:
    a feature word information extraction unit configured to extract feature word information from the application description information of each application in a preset application library, and to extract feature word information from the application description information of a new application to be labeled;
    a tag feature word information determining unit configured to merge the corresponding feature word information of multiple applications having the same tag, as the feature word information of that tag;
    a first preference determining unit configured to determine a first preference of each tag for each feature word belonging to it;
    a second preference determining unit configured to determine, based on the first preference and the extracted feature word information of the new application, a second preference of the new application for each tag in the tag library;
    a labeling unit configured to select, in a preset manner, one or more corresponding tags from the tag library according to the second preference and to label the new application with them.
  13. The device according to claim 12, wherein the feature word information extraction unit is configured to perform word segmentation on the application description information to extract the feature words, and to compute the probability of occurrence of each feature word as that feature word's weight for the application to which it belongs, so as to obtain the feature word information, the feature word information comprising the feature words and the weight of each feature word for the application to which it belongs.
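As a rough illustration of the extraction step in claim 13, the Python sketch below takes the weight of a word to be its relative frequency within one application description. The whitespace tokenizer is a stand-in assumption: real Chinese descriptions would need a proper word segmenter, and the claim does not say whether stop words are filtered. All names are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def tokenize(description: str) -> List[str]:
    # Placeholder segmentation (assumption): lowercase and split on whitespace.
    return description.lower().split()


def feature_word_weights(description: str) -> Dict[str, float]:
    """Map each word to its probability of occurrence in the description."""
    words = tokenize(description)
    if not words:
        return {}
    counts = Counter(words)
    total = len(words)
    return {word: count / total for word, count in counts.items()}


weights = feature_word_weights("puzzle game fun puzzle")
```

Because each weight is a relative frequency, the weights of one application sum to 1, which keeps long and short descriptions comparable.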
  14. The device according to claim 12 or 13, wherein the method by which the tag feature word information determining unit merges the corresponding feature word information of multiple applications having the same tag, as the feature word information of that tag, comprises:
    merging identical feature words in the feature word information of the applications having the same tag into a single feature word, and taking the merged feature words as the feature words of the tag;
    determining the weight of each feature word on the tag;
    taking the merged feature words and the weight of each feature word on the tag as the feature word information of the tag.
  15. The device according to claim 14, wherein the tag feature word information determining unit calculates the weight of each feature word on the tag as follows:
    [Formula: Figure PCTCN2017118709-appb-100006]
    with i ∈ A and j ∈ W,
    where:
    f_{t,j} denotes the weight of feature word j on tag t;
    w_{i,j} denotes the weight of feature word j for application i having tag t in the preset application library;
    A denotes the set of applications having tag t in the preset application library;
    W denotes the set of feature words belonging to the applications in application set A;
    n denotes the number of applications in application set A;
    m denotes the number of feature words in feature word set W.
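The merging step of claims 14 and 15 can be sketched in Python as follows. The formula itself is only available as an image, so the normalization used here is an assumption: the weight of word j on tag t is taken as the sum of w_{i,j} over the applications in A, divided by the total weight of all words in W over those applications. The function name and sample data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def merge_tag_feature_words(app_weights: List[Dict[str, float]]) -> Dict[str, float]:
    """Merge per-application word weights w[i][j] into tag-level weights f[t][j].

    Assumption (not confirmed by the text): f[t][j] is the summed weight of
    word j over the tag's applications, normalized by the total weight of all
    words over those applications.
    """
    summed: Dict[str, float] = defaultdict(float)
    for weights in app_weights:            # i ranges over application set A
        for word, w in weights.items():    # j ranges over feature word set W
            summed[word] += w              # identical words collapse into one entry
    total = sum(summed.values())
    if total == 0:
        return {}
    return {word: value / total for word, value in summed.items()}


# Two hypothetical applications that share the same tag.
apps_with_tag = [
    {"puzzle": 0.5, "game": 0.5},
    {"game": 0.5, "fun": 0.5},
]
tag_weights = merge_tag_feature_words(apps_with_tag)
```

Under this assumption a word that appears in many applications of the tag ("game" above) ends up with a larger tag-level weight than a word that appears in only one.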
  16. The device according to any one of claims 12 to 15, wherein the method by which the first preference determining unit determines the first preference comprises:
    calculating the first preference of each tag for each feature word belonging to it by the following formula:
    [Formula: Figure PCTCN2017118709-appb-100007]
    where:
    p_{t,j} denotes the first preference of tag t for feature word j;
    f_{t,j} denotes the weight of feature word j on tag t;
    s_j denotes the probability that feature word j occurs in the set of all feature words extracted from the application description information of all applications in the preset application library, where:
    [Formula: Figure PCTCN2017118709-appb-100008]
    with i ∈ AA and j ∈ Aw,
    where:
    w_{i,j} denotes the weight of feature word j for application i in the preset application library;
    AA denotes the set of all applications in the preset application library;
    Aw denotes the set of all feature words extracted from the application description information of all applications;
    n denotes the number of applications in application set AA;
    m denotes the number of feature words in feature word set Aw.
  17. The device according to any one of claims 12 to 16, wherein the method by which the second preference determining unit determines the second preference comprises:
    calculating the second preference of the new application for each tag in the tag library by the following formula:
    [Formula: Figure PCTCN2017118709-appb-100009]
    with j ∈ AM,
    where:
    r_{i,t} denotes the second preference of new application i for tag t;
    p_{t,j} denotes the first preference of tag t for feature word j;
    w_{i,j} denotes the weight, for new application i, of feature word j extracted from the application description information of new application i;
    AM denotes the obtained set of all feature words belonging to tag t;
    m denotes the number of feature words in the set of feature words belonging to tag t.
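The scoring step of claim 17 can be sketched in Python under a stated assumption. The formula is only available as an image, so the sketch assumes that r_{i,t} is the sum, over the feature words j belonging to tag t, of p_{t,j} multiplied by w_{i,j}; words absent from the new application contribute nothing. The function name and all numbers are hypothetical.

```python
from typing import Dict


def second_preference(new_app_weights: Dict[str, float],
                      first_pref: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Score every tag t for one new application i.

    Assumption (not confirmed by the text):
    r[i][t] = sum over words j belonging to tag t of p[t][j] * w[i][j].
    """
    scores: Dict[str, float] = {}
    for tag, word_prefs in first_pref.items():
        scores[tag] = sum(p * new_app_weights.get(word, 0.0)
                          for word, p in word_prefs.items())
    return scores


# Hypothetical first preferences p[t][j] for two tags.
first_pref = {
    "game": {"puzzle": 2.0, "play": 1.0},
    "music": {"song": 3.0, "play": 1.0},
}
# Hypothetical word weights w[i][j] of the new application.
new_app = {"puzzle": 0.5, "play": 0.5}
scores = second_preference(new_app, first_pref)
```

With these numbers the new application scores higher on "game" than on "music", because its description shares the strongly preferred word "puzzle" with the first tag.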
  18. The device according to any one of claims 12 to 17, wherein the method by which the second preference determining unit determines the second preference of the new application for each tag in the tag library comprises:
    selecting, in a preset manner, a certain number of feature words as the topic feature words of the corresponding tag according to the first preference of each tag for each feature word belonging to it;
    determining the second preference by the following formula:
    [Formula: Figure PCTCN2017118709-appb-100010]
    with j ∈ topic_t,
    where:
    r_{i,t} denotes the second preference of new application i for tag t;
    p_{t,j} denotes the first preference of tag t for feature word j;
    w_{i,j} denotes the weight, for new application i, of feature word j extracted from the application description information of new application i;
    topic_t denotes the selected set of topic feature words belonging to tag t;
    m denotes the number of feature words in the set of topic feature words belonging to tag t.
  19. The device according to claim 18, wherein the method by which the second preference determining unit selects, in a preset manner, a certain number of feature words as the topic feature words of the corresponding tag according to the first preference of each tag for each feature word belonging to it comprises:
    ranking the feature words belonging to the tag in descending order of the tag's first preference for each of them, and selecting a preset number of top-ranked feature words as the topic feature words.
  20. The device according to claim 18, wherein the method by which the second preference determining unit selects, in a preset manner, a certain number of feature words as the topic feature words of the corresponding tag according to the first preference of each tag for each feature word belonging to it comprises:
    selecting, as the topic feature words, the feature words whose first preferences are greater than or equal to a first preset preference threshold.
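Claims 19 and 20 restrict each tag to a subset of topic feature words before scoring. A minimal Python sketch of both variants follows; the function names, the sample words, and the preference values are hypothetical.

```python
from typing import Dict


def topic_words_top_n(word_prefs: Dict[str, float], n: int) -> Dict[str, float]:
    """Keep the n words with the highest first preference for one tag."""
    ranked = sorted(word_prefs.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:n])


def topic_words_by_threshold(word_prefs: Dict[str, float],
                             threshold: float) -> Dict[str, float]:
    """Keep every word whose first preference is >= threshold."""
    return {word: p for word, p in word_prefs.items() if p >= threshold}


# Hypothetical first preferences of one tag for its feature words.
prefs_for_tag = {"puzzle": 2.0, "play": 1.0, "the": 0.1}
top_two = topic_words_top_n(prefs_for_tag, 2)
strong = topic_words_by_threshold(prefs_for_tag, 1.0)
```

Either filter drops low-preference words such as "the" above, so that only words characteristic of the tag take part in the second-preference calculation.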
  21. The device according to any one of claims 12 to 18, wherein the method by which the labeling unit selects, in a preset manner, one or more corresponding tags from the tag library according to the second preference and labels the new application with them comprises:
    ranking the tags in descending order of the new application's second preference for each tag, and labeling the new application with the one or more top-ranked tags.
  22. The device according to any one of claims 12 to 18, wherein the method by which the labeling unit selects, in a preset manner, one or more corresponding tags from the tag library according to the second preference and labels the new application with them comprises:
    selecting the one or more tags whose second preferences are greater than or equal to a second preset preference threshold, and labeling the new application with them.
  23. A terminal, comprising a memory and a processor, wherein the memory stores computer-readable instructions which, when executed by the processor, perform the method according to any one of claims 1 to 11.
  24. A computer-readable medium, wherein the computer-readable storage medium stores executable instructions which, when executed by one or more processors, implement the method according to any one of claims 1 to 11.
PCT/CN2017/118709 2017-04-10 2017-12-26 Method and device for tagging label for application, terminal and computer readable storage medium WO2018188378A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710227588.8 2017-04-10
CN201710227588.8A CN106951571B (en) 2017-04-10 2017-04-10 Method and device for labeling application with label

Publications (1)

Publication Number Publication Date
WO2018188378A1 (en) 2018-10-18

Family

ID=59475645

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/118709 WO2018188378A1 (en) 2017-04-10 2017-12-26 Method and device for tagging label for application, terminal and computer readable storage medium

Country Status (2)

Country Link
CN (1) CN106951571B (en)
WO (1) WO2018188378A1 (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106951571B (en) * 2017-04-10 2021-06-22 阿里巴巴(中国)有限公司 Method and device for labeling application with label
CN107506398B (en) * 2017-08-02 2020-01-24 杭州东信北邮信息技术有限公司 Method for adding label attribute to book
CN108363550A (en) * 2017-12-28 2018-08-03 中兴智能交通股份有限公司 A kind of method and apparatus of data cached update and storage
CN108763194B (en) * 2018-04-27 2022-09-27 阿里巴巴(中国)有限公司 Method and device for applying label labeling, storage medium and computer equipment
CN108900922B (en) * 2018-07-20 2021-03-19 广州方硅信息技术有限公司 Method and device for setting label of live broadcast component
CN109522424B (en) * 2018-10-16 2020-04-24 北京达佳互联信息技术有限公司 Data processing method and device, electronic equipment and storage medium
CN110457464B (en) * 2019-07-08 2023-03-24 创新先进技术有限公司 Method and device for information processing and computing equipment
CN111880872A (en) * 2020-06-28 2020-11-03 华为技术有限公司 Method, terminal device, server and system for managing application program APP
CN111967518B (en) * 2020-08-18 2023-10-13 深圳市欢太科技有限公司 Application labeling method, application labeling device and terminal equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103810168A (en) * 2012-11-06 2014-05-21 深圳市世纪光速信息技术有限公司 Search application method, device and terminal
CN105069106A (en) * 2015-08-07 2015-11-18 小米科技有限责任公司 Application group recommendation method and device
US20160267165A1 (en) * 2015-03-14 2016-09-15 Hui Wang Automated Key Words (Phrases) Discovery In Document Stacks And Its Application To Document Classification, Aggregation, and Summarization
CN106951571A (en) * 2017-04-10 2017-07-14 广州优视网络科技有限公司 A kind of method and apparatus for giving application mark label
CN106980667A (en) * 2017-03-22 2017-07-25 广州优视网络科技有限公司 A kind of method and apparatus that label is marked to article

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9269078B2 (en) * 2011-04-22 2016-02-23 Verizon Patent And Licensing Inc. Method and system for associating a contact with multiple tag classifications
CN103927309B (en) * 2013-01-14 2017-08-11 阿里巴巴集团控股有限公司 A kind of method and device to business object markup information label
CN104133877B (en) * 2014-07-25 2017-09-29 百度在线网络技术(北京)有限公司 The generation method and device of software label

Also Published As

Publication number Publication date
CN106951571A (en) 2017-07-14
CN106951571B (en) 2021-06-22


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
  Ref document number: 17905122; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase
  Ref country code: DE
122 Ep: pct application non-entry in european phase
  Ref document number: 17905122; Country of ref document: EP; Kind code of ref document: A1