WO2020057301A1 - Decision tree generation method and apparatus - Google Patents

Decision tree generation method and apparatus

Info

Publication number
WO2020057301A1
Authority
WO
WIPO (PCT)
Prior art keywords
tree
decision tree
node
skeleton
split
Prior art date
Application number
PCT/CN2019/100682
Other languages
English (en)
French (fr)
Inventor
李龙飞
Original Assignee
阿里巴巴集团控股有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 阿里巴巴集团控股有限公司 (Alibaba Group Holding Limited)
Publication of WO2020057301A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/24: Classification techniques
    • G06F 18/243: Classification techniques relating to the number of classes
    • G06F 18/24323: Tree-organised classifiers

Definitions

  • This specification relates to the field of machine learning technology, and in particular, to a method and a device for generating a decision tree.
  • A decision tree is a basic supervised learning model that repeatedly cuts data so as to segment it.
  • The generation of a decision tree relies on a large number of labeled samples; when the number of samples is small, the trained decision tree often performs poorly.
  • In view of this, the present specification provides a method and an apparatus for generating a decision tree.
  • A method for generating a decision tree includes: obtaining a basic decision tree, which is generated based on a first type of sample data; extracting a tree skeleton of the basic decision tree, the tree skeleton including the split features of nodes and including either no split values or only some split values; and using a second type of sample data to train the split values missing from the tree skeleton to obtain a target decision tree.
  • A decision tree generating apparatus includes:
  • a basic acquisition unit that acquires a basic decision tree, which is generated based on the first type of sample data;
  • a skeleton extraction unit that extracts a tree skeleton of the basic decision tree, the tree skeleton including the split features of nodes and including either no split values or only some split values;
  • a target training unit that uses the second type of sample data to train the split values missing from the tree skeleton to obtain a target decision tree.
  • A decision tree generating apparatus includes:
  • a processor;
  • a memory for storing machine-executable instructions;
  • where, by reading and executing the machine-executable instructions stored in the memory that correspond to decision tree generation logic, the processor is caused to: obtain a basic decision tree generated based on the first type of sample data; extract a tree skeleton of the basic decision tree, the tree skeleton including the split features of nodes and including either no split values or only some split values; and use the second type of sample data to train the split values missing from the tree skeleton to obtain the target decision tree.
  • It can be seen from the above that this specification can extract a tree skeleton from a basic decision tree, migrate the tree skeleton to a scenario with less sample data, and train the tree skeleton based on the sample data of that scenario, thereby generating a credible decision tree for scenarios with less sample data and solving the model training problem of such scenarios.
  • FIG. 1 is a schematic flowchart of a method for generating a decision tree according to an exemplary embodiment of the present specification.
  • FIG. 2 is a schematic diagram of a basic decision tree according to an exemplary embodiment of the present specification.
  • FIG. 3 is a schematic diagram of a tree skeleton according to an exemplary embodiment of the present specification.
  • FIG. 4 is a schematic structural diagram of a server for a decision tree generating device according to an exemplary embodiment of the present specification.
  • FIG. 5 is a block diagram of a decision tree generating device according to an exemplary embodiment of the present specification.
  • Although the terms first, second, third, etc. may be used in this specification to describe various information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another.
  • For example, without departing from the scope of this specification, the first information may also be referred to as the second information, and similarly, the second information may also be referred to as the first information.
  • Depending on the context, the word "if" as used herein may be interpreted as "when" or "upon" or "in response to determining".
  • This specification provides a decision tree generation scheme: a tree skeleton is extracted from a decision tree of a scenario with a large sample size, the tree skeleton is then migrated to a scenario with a small sample size, and the tree skeleton is trained based on the sample data of that scenario, so as to train a relatively credible decision tree for the scenario with a small sample size.
  • FIG. 1 is a schematic flowchart of a method for generating a decision tree according to an exemplary embodiment of the present specification.
  • the method for generating a decision tree may include the following steps:
  • Step 102: Obtain a basic decision tree, which is generated based on the first type of sample data.
  • In this embodiment, the first type of sample data comes from a first scenario, which is usually a scenario with a large number of samples.
  • Based on the first type of sample data, a decision tree oriented to a specified topic can be generated; for ease of distinction, this decision tree can be called a basic decision tree.
  • For example, algorithms such as C4.5 and C5.0 can be used to generate the basic decision tree.
  • As another example, a GBDT (Gradient Boosting Decision Tree) algorithm can also be used to generate a basic decision tree containing a single tree.
  • In this embodiment, the topic is usually a classification determination topic, for example, cash-out determination, abnormal account determination, money laundering determination, etc., which is not specifically limited in this specification.
  • In this embodiment, since the first type of sample data is abundant, the generated basic decision tree is relatively credible.
  • Step 104: Extract a tree skeleton of the basic decision tree, where the tree skeleton includes the split features of nodes and includes either no split values or only some split values.
  • In this embodiment, some of the nodes and the fork paths between those nodes may be extracted downward from the root node of the basic decision tree, or all of the nodes and the fork paths between them may be extracted downward from the root node.
  • The tree skeleton may include the split features of the extracted nodes but not the split values of those split features, or it may include the split values of some of the split features, which is not specifically limited in this specification.
  • Step 106: Use the second type of sample data to train the split values missing from the tree skeleton to obtain a target decision tree.
  • In this embodiment, the second type of sample data comes from a second scenario.
  • The second scenario is usually a scenario with a small sample size that shares some features with the first scenario, for example, the total transaction amount in the last 3 days, the total number of transfer counterparties on the day, etc.
  • A decision tree generated from the second type of sample data alone tends to overfit and has poor credibility.
  • In this step, the tree skeleton extracted in the foregoing step 104 may be trained based on the second type of sample data to obtain the split values missing from the tree skeleton, and the tree skeleton may then be further extended so as to generate, for the second scenario, a target decision tree for the same topic.
  • As can be seen from the above, this specification can extract a tree skeleton from a basic decision tree, migrate the tree skeleton to a scenario with less sample data, and train the tree skeleton based on the sample data of that scenario, thereby generating a relatively credible decision tree for scenarios with less sample data and solving the model training problem of such scenarios.
  • The following takes cash-out determination as the specified topic. Cash-out refers to obtaining cash, and generally to obtaining cash benefits through illegal or false means of exchange, for example, credit card cash-out, credit product cash-out, etc.
  • In this embodiment, assume the first scenario is an O2O (Online to Offline) scenario, for example, offline scan-to-pay.
  • Assume the second scenario is a money-collection-code scenario, for example, a user scans a merchant's static QR code to make a payment.
  • In the basic decision tree of FIG. 2, node 1 is the root node, nodes 2 to 7 are ordinary tree nodes, and nodes 8 to 15 are leaf nodes.
  • The basic decision tree includes several fork paths, which connect the nodes; for example, path 12 connects root node 1 and ordinary tree node 2, path 13 connects root node 1 and ordinary tree node 3, and so on.
  • The maximum depth of the basic decision tree is 3, where depth can be understood as the distance from a node to the root node.
  • For example, the distance from ordinary tree node 2 to root node 1 is 1, i.e., the depth of ordinary tree node 2 is 1; the distance from leaf node 8 to root node 1 is 3, i.e., the depth of leaf node 8 is 3, and so on.
  • Table 1 (split features of the nodes in FIG. 2):
      Node                  | Split feature
      Root node 1           | Total transaction amount in the last 10 days
      Ordinary tree node 2  | Total transaction amount in the last 5 days
      Ordinary tree node 3  | Number of transfer counterparties in the last 5 days
      Ordinary tree node 4  | Number of transfer counterparties in the last 8 days
      Ordinary tree node 5  | Number of transfer counterparties in the last 3 days
      ...                   | ...
  • Each node in the basic decision tree other than the leaf nodes can represent a split feature; see the example in Table 1.
  • The split feature represented by root node 1 is the total transaction amount in the last 10 days, the split feature represented by ordinary tree node 2 is the total transaction amount in the last 5 days, the split feature represented by ordinary tree node 3 is the number of transfer counterparties in the last 5 days, and so on.
  • Table 2 (split values trained in the first scenario):
      Node                  | Split feature                                         | Split value
      Root node 1           | Total transaction amount in the last 10 days          | 1000
      Ordinary tree node 2  | Total transaction amount in the last 5 days           | 500
      Ordinary tree node 3  | Number of transfer counterparties in the last 5 days  | 8
      Ordinary tree node 4  | Number of transfer counterparties in the last 8 days  | 12
      Ordinary tree node 5  | Number of transfer counterparties in the last 3 days  | 5
      ...                   | ...                                                   | ...
  • Each split feature may correspond to a split value, and a unique fork path can be determined based on the split value and a fork-path selection strategy.
  • The fork-path selection strategy can be set in advance; for example, the left fork path corresponds to a determination result that is less than or equal to the split value, and the right fork path corresponds to a determination result that is greater than the split value.
  • Referring to the example in Table 2, the split value of root node 1's split feature, the total transaction amount in the last 10 days, is 1000.
  • When the total transaction amount in the last 10 days is less than or equal to 1000, the fork path is 12: jump to ordinary tree node 2 and continue to compare the total transaction amount in the last 5 days against the split value 500.
  • When the total transaction amount in the last 10 days is greater than 1000, the fork path is 13: jump to ordinary tree node 3 and continue to compare the number of transfer counterparties in the last 5 days against the split value 8, and so on.
  • For example, for an account whose total transaction amount in the last 10 days is 950 and whose total transaction amount in the last 5 days is 550, the path through the basic decision tree shown in FIG. 2 is root node 1 - ordinary tree node 2 - ordinary tree node 5 ..., and so on.
  • Note that FIG. 2 is only an exemplary illustration; in actual applications, the generated basic decision tree is usually more complicated than the one in FIG. 2.
  • After the basic decision tree is generated, the tree skeleton can be extracted.
  • In one example, the nodes at less than or equal to a specified depth and the fork paths between those nodes may be extracted downward from the root node.
  • The specified depth is usually smaller than the maximum depth of the basic decision tree and can be set in advance, for example, by business personnel based on experience.
  • Suppose the specified depth is 2 and take the basic decision tree shown in FIG. 2 as an example: starting from the root node, the nodes at depths 1 and 2 and the fork paths between them can be extracted, that is, nodes 1 to 7 and the fork paths between them (path 12, path 13, path 24, path 25, path 36, and path 37), yielding the tree skeleton shown in FIG. 3.
  • The tree skeleton includes the split features represented by the extracted nodes, i.e., the split feature of root node 1 (the total transaction amount in the last 10 days), the split feature of ordinary tree node 2 (the total transaction amount in the last 5 days), and so on.
  • The tree skeleton may include no split values at all, or it may include the split values of some split features; for example, it may include only the split values of the split features of root node 1, ordinary tree node 2, and ordinary tree node 3. This specification places no special restrictions on this.
  • In another example, all the nodes of the basic decision tree and the fork paths between them may be extracted downward from the root node of the basic decision tree to obtain the tree skeleton.
  • Here too, the tree skeleton may include no split values or only the split values of some split features, which is not particularly limited in this specification.
  • In another example, the tree skeleton may be extracted without reference to depth; taking FIG. 2 as an example, root node 1 and ordinary tree nodes 2 to 5 may be extracted.
  • After the tree skeleton is extracted, it can be trained using the second type of sample data from the money-collection-code scenario to obtain the split values the tree skeleton lacks.
  • Table 3 (split values retrained in the second scenario):
      Node                  | Split feature                                         | Split value
      Root node 1           | Total transaction amount in the last 10 days          | 800
      Ordinary tree node 2  | Total transaction amount in the last 5 days           | 400
      Ordinary tree node 3  | Number of transfer counterparties in the last 5 days  | 7
      Ordinary tree node 4  | Number of transfer counterparties in the last 8 days  | 10
      Ordinary tree node 5  | Number of transfer counterparties in the last 3 days  | 4
      ...                   | ...                                                   | ...
  • Taking a tree skeleton that includes no split values as an example, the split value of each split feature can be trained based on the second type of sample data from the money-collection-code scenario. Referring to the example in Table 3, the split value of root node 1's split feature, the total transaction amount in the last 10 days, is trained to be 800; according to the predetermined fork-path selection strategy, when the total transaction amount in the last 10 days is less than or equal to 800, the fork path is 12, and so on.
  • After the split values are obtained, the tree skeleton may be further fitted and extended based on the second type of sample data, with the split feature and split value of each extension node determined, until the model converges and the target decision tree is obtained, thereby completing the training of the cash-out determination decision tree for the money-collection-code scenario.
  • Generally, when the number of black samples reaching a leaf node is small, that leaf node is considered unreliable.
  • Optionally, the credibility of each leaf node in the target decision tree may be calculated using the second type of sample data from the second scenario, and leaf nodes whose credibility does not satisfy a credibility condition may then be filtered out to streamline the target decision tree.
  • For example, the leaf nodes of the target decision tree can be scored based on all of the second type of sample data; for each leaf node, the scoring results can be aggregated and used as the leaf node's credibility. Suppose the credibility condition is that the credibility ranks in the top 1%; then the leaf nodes whose credibility ranks in the top 1% can be retained and the remaining leaf nodes filtered out.
  • Optionally, for finance-related target decision trees with high interpretability requirements, this specification can automatically generate the model's decision rules.
  • For each leaf node, the complete path from the root node to that leaf node can be obtained bottom-up, and a decision rule corresponding to the target decision tree can then be generated from the split features and split values of the nodes on the complete path.
  • For instance, the target decision tree shown in FIG. 3 includes four complete paths: node 1 - node 2 - node 4, node 1 - node 2 - node 5, node 1 - node 3 - node 6, and node 1 - node 3 - node 7.
  • Assuming the split features and split values represented by these nodes are as shown in Table 2, each split feature and its split value can be joined with a logical AND. Taking node 1 - node 2 - node 4 as an example, the corresponding decision rule is: the total transaction amount in the last 10 days is greater than or equal to 1000 AND the total transaction amount in the last 5 days is greater than or equal to 500 AND the number of transfer counterparties in the last 8 days is greater than or equal to 12.
  • In this way, each decision rule of the target decision tree can be generated automatically.
  • Corresponding to the foregoing embodiments of the decision tree generation method, this specification also provides embodiments of a decision tree generating device.
  • The embodiments of the decision tree generating device of this specification can be applied on a server.
  • The device embodiments may be implemented by software, or by hardware or a combination of software and hardware. Taking software implementation as an example, as a device in the logical sense, it is formed by the processor of the server on which it resides reading the corresponding computer program instructions from non-volatile storage into memory and running them.
  • In terms of hardware, FIG. 4 is a hardware structure diagram of the server on which the decision tree generating device of this specification resides; in addition to the processor, memory, network interface, and non-volatile storage shown in FIG. 4, the server usually includes other hardware according to its actual functions, details of which are not repeated here.
  • FIG. 5 is a block diagram of a decision tree generating device according to an exemplary embodiment of the present specification.
  • Referring to FIG. 5, the decision tree generating device 400 may be applied in the server shown in FIG. 4 and includes a basic acquisition unit 401, a skeleton extraction unit 402, a target training unit 403, and a rule generation unit 404.
  • The basic acquisition unit 401 acquires a basic decision tree, which is generated based on the first type of sample data;
  • the skeleton extraction unit 402 extracts a tree skeleton of the basic decision tree, where the tree skeleton includes the split features of nodes and includes either no split values or only some split values;
  • the target training unit 403 uses the second type of sample data to train the split values missing from the tree skeleton to obtain a target decision tree.
  • Optionally, the skeleton extraction unit 402 extracts, downward from the root node of the basic decision tree, the nodes at less than or equal to a specified depth and the fork paths between those nodes, the specified depth being less than the depth of the basic decision tree.
  • Optionally, the skeleton extraction unit 402 extracts, downward from the root node of the basic decision tree, all the nodes of the basic decision tree and the fork paths between them.
  • Optionally, after the target training unit 403 obtains the split values missing from the tree skeleton by training with the second type of sample data, it extends the tree skeleton based on the second type of sample data and determines the split features and split values of the extension nodes until convergence.
  • The rule generation unit 404 obtains, for each leaf node of the target decision tree, the complete path from the root node to that leaf node;
  • and generates a decision rule corresponding to the target decision tree according to the split features and split values of the nodes on the complete path.
  • Since the device embodiments basically correspond to the method embodiments, the relevant parts may refer to the description of the method embodiments.
  • The device embodiments described above are only illustrative: the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units, i.e., they may be located in one place or distributed across multiple network elements. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this specification, which a person of ordinary skill in the art can understand and implement without creative effort.
  • The systems, devices, modules, or units described in the foregoing embodiments may be implemented by a computer chip or entity, or by a product with a certain function.
  • A typical implementation device is a computer, and the specific form of the computer may be a personal computer, a laptop computer, a cellular phone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email transceiver, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
  • Corresponding to the foregoing embodiments of the decision tree generation method, this specification also provides a decision tree generating device.
  • The device includes a processor and a memory for storing machine-executable instructions.
  • The processor and the memory are usually connected to each other through an internal bus.
  • In other possible implementations, the device may further include an external interface to enable communication with other devices or components.
  • In this embodiment, by reading and executing the machine-executable instructions stored in the memory that correspond to decision tree generation logic, the processor is caused to: obtain a basic decision tree generated based on the first type of sample data; extract a tree skeleton of the basic decision tree, the tree skeleton including the split features of nodes and including either no split values or only some split values; and use the second type of sample data to train the split values missing from the tree skeleton to obtain the target decision tree.
  • Optionally, when extracting the tree skeleton of the basic decision tree, the processor is caused to: extract, downward from the root node of the basic decision tree, the nodes at less than or equal to a specified depth and the fork paths between those nodes, the specified depth being less than the depth of the basic decision tree.
  • Optionally, when extracting the tree skeleton of the basic decision tree, the processor is caused to: extract, downward from the root node of the basic decision tree, all the nodes of the basic decision tree and the fork paths between them.
  • Optionally, the processor is further caused to: after obtaining the split values missing from the tree skeleton by training with the second type of sample data, extend the tree skeleton based on the second type of sample data and determine the split features and split values of the extension nodes until convergence.
  • Optionally, the processor is further caused to: for each leaf node of the target decision tree, obtain the complete path from the root node to that leaf node; and generate a decision rule corresponding to the target decision tree according to the split features and split values of the nodes on the complete path.
  • Corresponding to the foregoing method embodiments, this specification also provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the following steps: obtaining a basic decision tree generated based on the first type of sample data; extracting a tree skeleton of the basic decision tree, the tree skeleton including the split features of nodes and including either no split values or only some split values; and using the second type of sample data to train the split values missing from the tree skeleton to obtain the target decision tree.
  • Optionally, extracting the tree skeleton of the basic decision tree includes: extracting, downward from the root node of the basic decision tree, the nodes at less than or equal to a specified depth and the fork paths between those nodes, the specified depth being less than the depth of the basic decision tree.
  • Optionally, extracting the tree skeleton of the basic decision tree includes: extracting, downward from the root node of the basic decision tree, all the nodes of the basic decision tree and the fork paths between them.
  • Optionally, the steps further include: after obtaining the split values missing from the tree skeleton by training with the second type of sample data, extending the tree skeleton based on the second type of sample data and determining the split features and split values of the extension nodes until convergence.
  • Optionally, the steps further include: for each leaf node of the target decision tree, obtaining the complete path from the root node to that leaf node; and generating a decision rule corresponding to the target decision tree according to the split features and split values of the nodes on the complete path.

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

This specification discloses a decision tree generation method and apparatus. The method includes: obtaining a basic decision tree, the basic decision tree being generated based on a first type of sample data; extracting a tree skeleton of the basic decision tree, the tree skeleton including the split features of nodes and including either no split values or only some split values; and using a second type of sample data to train the split values missing from the tree skeleton to obtain a target decision tree.

Description

Decision tree generation method and apparatus
Technical Field
This specification relates to the field of machine learning technology, and in particular, to a decision tree generation method and apparatus.
Background
A decision tree is a basic supervised learning model that repeatedly cuts data so as to segment it. The generation of a decision tree relies on a large number of labeled samples; when the number of samples is small, the trained decision tree often performs poorly.
Summary
In view of this, this specification provides a decision tree generation method and apparatus.
Specifically, this specification is implemented through the following technical solutions:
A decision tree generation method includes:
obtaining a basic decision tree, the basic decision tree being generated based on a first type of sample data;
extracting a tree skeleton of the basic decision tree, the tree skeleton including the split features of nodes and including either no split values or only some split values;
using a second type of sample data to train the split values missing from the tree skeleton to obtain a target decision tree.
A decision tree generating apparatus includes:
a basic acquisition unit that acquires a basic decision tree, the basic decision tree being generated based on the first type of sample data;
a skeleton extraction unit that extracts a tree skeleton of the basic decision tree, the tree skeleton including the split features of nodes and including either no split values or only some split values;
a target training unit that uses the second type of sample data to train the split values missing from the tree skeleton to obtain a target decision tree.
A decision tree generating apparatus includes:
a processor;
a memory for storing machine-executable instructions;
where, by reading and executing the machine-executable instructions stored in the memory that correspond to decision tree generation logic, the processor is caused to:
obtain a basic decision tree, the basic decision tree being generated based on the first type of sample data;
extract a tree skeleton of the basic decision tree, the tree skeleton including the split features of nodes and including either no split values or only some split values;
use the second type of sample data to train the split values missing from the tree skeleton to obtain the target decision tree.
As can be seen from the above description, this specification can extract a tree skeleton from a basic decision tree, migrate the tree skeleton to a scenario with less sample data, and train the tree skeleton based on the sample data of that scenario, thereby generating a credible decision tree for scenarios with less sample data and solving the model training problem of such scenarios.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a decision tree generation method according to an exemplary embodiment of this specification.
FIG. 2 is a schematic diagram of a basic decision tree according to an exemplary embodiment of this specification.
FIG. 3 is a schematic diagram of a tree skeleton according to an exemplary embodiment of this specification.
FIG. 4 is a schematic structural diagram of a server for a decision tree generating device according to an exemplary embodiment of this specification.
FIG. 5 is a block diagram of a decision tree generating device according to an exemplary embodiment of this specification.
Detailed Description
Exemplary embodiments will be described here in detail, examples of which are shown in the accompanying drawings. Where the following description refers to the drawings, unless otherwise indicated, the same numbers in different drawings denote the same or similar elements. The implementations described in the following exemplary embodiments do not represent all implementations consistent with this specification; rather, they are merely examples of devices and methods consistent with some aspects of this specification as detailed in the appended claims.
The terms used in this specification are for the purpose of describing particular embodiments only and are not intended to limit this specification. The singular forms "a", "the", and "said" used in this specification and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and includes any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used in this specification to describe various information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of this specification, the first information may also be referred to as the second information, and similarly, the second information may also be referred to as the first information. Depending on the context, the word "if" as used herein may be interpreted as "when" or "upon" or "in response to determining".
This specification provides a decision tree generation scheme: a tree skeleton is extracted from a decision tree of a scenario with a large sample size, the tree skeleton is then migrated to a scenario with a small sample size, and the tree skeleton is trained based on the sample data of that scenario, so as to train a relatively credible decision tree for the scenario with a small sample size.
FIG. 1 is a schematic flowchart of a decision tree generation method according to an exemplary embodiment of this specification.
Referring to FIG. 1, the decision tree generation method may include the following steps:
Step 102: Obtain a basic decision tree, the basic decision tree being generated based on a first type of sample data.
In this embodiment, the first type of sample data comes from a first scenario, which is usually a scenario with a large number of samples. Based on the first type of sample data, a decision tree oriented to a specified topic can be generated; for ease of distinction, this decision tree can be called a basic decision tree.
For example, algorithms such as C4.5 and C5.0 can be used to generate the basic decision tree.
As another example, a GBDT (Gradient Boosting Decision Tree) algorithm can also be used to generate a basic decision tree containing a single tree.
In this embodiment, the topic is usually a classification determination topic, for example, cash-out determination, abnormal account determination, money laundering determination, etc., which is not specifically limited in this specification.
In this embodiment, since the first type of sample data is abundant, the generated basic decision tree is relatively credible.
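By way of a concrete illustration of step 102, the following is a minimal sketch of training a basic decision tree on abundant first-scenario samples. The data here is synthetic stand-in data, and scikit-learn is an assumption of this sketch; its trees are CART-based, so a C4.5/C5.0 or GBDT implementation would be substituted where those exact algorithms are intended.

```python
# Minimal sketch of step 102: train a basic decision tree on abundant
# (synthetic stand-in) first-scenario samples.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X1 = rng.random((5000, 10))                   # stand-in for abundant O2O samples
y1 = (X1[:, 0] + X1[:, 1] > 1.0).astype(int)  # stand-in cash-out labels

base_tree = DecisionTreeClassifier(max_depth=3, random_state=0)
base_tree.fit(X1, y1)

# The learned structure is exposed per node: split feature indices in
# tree_.feature (negative for leaves) and split values in tree_.threshold.
print(base_tree.tree_.feature)
print(base_tree.tree_.threshold)
```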
Step 104: Extract a tree skeleton of the basic decision tree, the tree skeleton including the split features of nodes and including either no split values or only some split values.
In this embodiment, some of the nodes and the fork paths between those nodes may be extracted downward from the root node of the basic decision tree, or all of the nodes of the basic decision tree and the fork paths between them may be extracted downward from the root node, so as to generate the tree skeleton.
The tree skeleton may include the split features of the extracted nodes but not the split values of those split features, or it may include the split values of some of the split features, which is not specifically limited in this specification.
Step 106: Use the second type of sample data to train the split values missing from the tree skeleton to obtain a target decision tree.
In this embodiment, the second type of sample data comes from a second scenario, which is usually a scenario with a small sample size and which shares some features with the first scenario, for example, the total transaction amount in the last 3 days, the total number of transfer counterparties on the day, etc. A decision tree generated from the second type of sample data alone tends to overfit and has poor credibility. In this step, the tree skeleton extracted in the foregoing step 104 may be trained based on the second type of sample data to obtain the split values missing from the tree skeleton, and the tree skeleton may then be further extended so as to generate, for the second scenario, a target decision tree for the same topic.
As can be seen from the above description, this specification can extract a tree skeleton from a basic decision tree, migrate the tree skeleton to a scenario with less sample data, and train the tree skeleton based on the sample data of that scenario, thereby generating a relatively credible decision tree for scenarios with less sample data and solving the model training problem of such scenarios.
The following describes the specific implementation process of this specification in detail, taking cash-out determination as the specified topic.
The above cash-out refers to obtaining cash, and generally to obtaining cash benefits through illegal or false means of exchange, for example, credit card cash-out, credit product cash-out, etc.
In this embodiment, assume the first scenario is an O2O (Online to Offline) scenario, for example, offline scan-to-pay. Assume the second scenario is a money-collection-code scenario, for example, a user scans a merchant's static QR code to make a payment.
In this embodiment, there are many cash-out determination samples in the O2O scenario; based on the first type of sample data from the O2O scenario, algorithms such as C4.5 and C5.0 can be used to generate a basic decision tree for cash-out determination.
Suppose the basic decision tree trained in the O2O scenario is as shown in FIG. 2. Referring to FIG. 2, node 1 is the root node of the basic decision tree, nodes 2 to 7 are ordinary tree nodes of the decision tree, and nodes 8 to 15 are leaf nodes of the basic decision tree.
The basic decision tree includes several fork paths, which connect the nodes; for example, path 12 connects root node 1 and ordinary tree node 2, path 13 connects root node 1 and ordinary tree node 3, and so on.
The maximum depth of the basic decision tree is 3, where depth can be understood as the distance from a node to the root node; for example, the distance from ordinary tree node 2 to root node 1 is 1, i.e., the depth of ordinary tree node 2 is 1, and the distance from leaf node 8 to root node 1 is 3, i.e., the depth of leaf node 8 is 3, and so on.
Node                  | Split feature
Root node 1           | Total transaction amount in the last 10 days
Ordinary tree node 2  | Total transaction amount in the last 5 days
Ordinary tree node 3  | Number of transfer counterparties in the last 5 days
Ordinary tree node 4  | Number of transfer counterparties in the last 8 days
Ordinary tree node 5  | Number of transfer counterparties in the last 3 days
Table 1
Each node in the basic decision tree other than the leaf nodes can represent a split feature. Referring to the example in Table 1, the split feature represented by root node 1 is the total transaction amount in the last 10 days, the split feature represented by ordinary tree node 2 is the total transaction amount in the last 5 days, the split feature represented by ordinary tree node 3 is the number of transfer counterparties in the last 5 days, and so on.
Node                  | Split feature                                         | Split value
Root node 1           | Total transaction amount in the last 10 days          | 1000
Ordinary tree node 2  | Total transaction amount in the last 5 days           | 500
Ordinary tree node 3  | Number of transfer counterparties in the last 5 days  | 8
Ordinary tree node 4  | Number of transfer counterparties in the last 8 days  | 12
Ordinary tree node 5  | Number of transfer counterparties in the last 3 days  | 5
Table 2
Each split feature may correspond to a split value, and a unique fork path can be determined based on the split value and a fork-path selection strategy. The fork-path selection strategy can be set in advance; for example, the left fork path corresponds to a determination result that is less than or equal to the split value, and the right fork path corresponds to a determination result that is greater than the split value.
Referring to the example in Table 2, the split value of root node 1's split feature, the total transaction amount in the last 10 days, is 1000. When the total transaction amount in the last 10 days is less than or equal to 1000, the fork path is determined to be 12: jump to ordinary tree node 2 and continue to compare the total transaction amount in the last 5 days against the split value 500. When the total transaction amount in the last 10 days is greater than 1000, the fork path is determined to be 13: jump to ordinary tree node 3 and continue to compare the number of transfer counterparties in the last 5 days against the split value 8, and so on.
For example, suppose an account's total transaction amount in the last 10 days is 950 and its total transaction amount in the last 5 days is 550; the account's path through the basic decision tree shown in FIG. 2 is then root node 1 - ordinary tree node 2 - ordinary tree node 5 ..., and so on.
It is worth noting that FIG. 2 is only an exemplary illustration; in actual applications, the generated basic decision tree is usually more complicated than the one in FIG. 2.
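The fork-path selection strategy above can be written as a small routing routine. The dict-based node table below is a toy mirror of Table 2 and FIG. 2; its layout and the feature names are assumptions of this sketch.

```python
# Sketch of the fork-path selection strategy: take the left child when the
# feature value is <= the split value, the right child otherwise.
# Node numbering follows FIG. 2; features and split values follow Table 2.
NODES = {
    1: ("total_amount_10d", 1000, 2, 3),  # (feature, split value, left, right)
    2: ("total_amount_5d",   500, 4, 5),
    3: ("transfer_cnt_5d",     8, 6, 7),
}

def route(sample: dict) -> list:
    """Return the node path an account follows through the tree."""
    node, path = 1, [1]
    while node in NODES:
        feature, split, left, right = NODES[node]
        node = left if sample[feature] <= split else right
        path.append(node)
    return path

# The account from the example: 950 in the last 10 days, 550 in the last 5.
print(route({"total_amount_10d": 950, "total_amount_5d": 550,
             "transfer_cnt_5d": 0}))   # -> [1, 2, 5]
```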
In this embodiment, after the basic decision tree is generated, the tree skeleton can be extracted.
In one example, the nodes at less than or equal to a specified depth and the fork paths between those nodes may be extracted downward from the root node of the basic decision tree.
The specified depth is usually smaller than the maximum depth of the basic decision tree and can be set in advance, for example, by business personnel based on experience.
Suppose the specified depth is 2, and still take the basic decision tree shown in FIG. 2 as an example: starting from the root node, the nodes at depths 1 and 2 and the fork paths between them can be extracted, that is, nodes 1 to 7 and the fork paths between nodes 1 to 7 (path 12, path 13, path 24, path 25, path 36, and path 37), yielding the tree skeleton shown in FIG. 3.
In this embodiment, the tree skeleton includes the split features represented by the extracted nodes, i.e., the split feature of root node 1 (the total transaction amount in the last 10 days), the split feature of ordinary tree node 2 (the total transaction amount in the last 5 days), and so on.
The tree skeleton may include no split values at all, or it may include the split values of some split features, for example, only the split values of the split features of root node 1, ordinary tree node 2, and ordinary tree node 3; this specification places no special restrictions on this.
In another example, all the nodes of the basic decision tree and the fork paths between them may be extracted downward from the root node of the basic decision tree to obtain the tree skeleton of the basic decision tree.
Here too, the tree skeleton may include no split values or only the split values of some split features, which is not particularly limited in this specification.
In another example, the tree skeleton may also be extracted without reference to depth. Still taking FIG. 2 as an example, root node 1 and ordinary tree nodes 2 to 5 may be extracted.
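The depth-bounded extraction in the first example can be sketched as follows. The Node dataclass is an assumed stand-in for however the basic decision tree is actually stored; only the split features survive the copy.

```python
# Sketch of step 104: copy the tree down to a specified depth, keeping each
# node's split feature and discarding every split value.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    feature: str                        # split feature (kept in the skeleton)
    threshold: Optional[float] = None   # split value (dropped in the skeleton)
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def extract_skeleton(node: Optional[Node], max_depth: int,
                     depth: int = 0) -> Optional[Node]:
    """Copy nodes at depth <= max_depth; split values are not copied."""
    if node is None or depth > max_depth:
        return None
    return Node(feature=node.feature, threshold=None,
                left=extract_skeleton(node.left, max_depth, depth + 1),
                right=extract_skeleton(node.right, max_depth, depth + 1))

# For the tree of FIG. 2, extract_skeleton(root, max_depth=2) keeps nodes 1-7
# and the fork paths between them, i.e. the skeleton of FIG. 3; passing the
# tree's own maximum depth instead keeps every node (the second example).
```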
In this embodiment, after the tree skeleton of the basic decision tree is extracted, the tree skeleton can be trained using the second type of sample data from the money-collection-code scenario to obtain the split values the tree skeleton lacks.
Node                  | Split feature                                         | Split value
Root node 1           | Total transaction amount in the last 10 days          | 800
Ordinary tree node 2  | Total transaction amount in the last 5 days           | 400
Ordinary tree node 3  | Number of transfer counterparties in the last 5 days  | 7
Ordinary tree node 4  | Number of transfer counterparties in the last 8 days  | 10
Ordinary tree node 5  | Number of transfer counterparties in the last 3 days  | 4
Table 3
Taking a tree skeleton that includes no split values as an example, the split value of each split feature can be trained based on the second type of sample data from the money-collection-code scenario. Referring to the example in Table 3, the split value of root node 1's split feature, the total transaction amount in the last 10 days, is trained to be 800; according to the predetermined fork-path selection strategy, when the total transaction amount in the last 10 days is less than or equal to 800, the fork path is determined to be 12, and so on.
In this embodiment, after the split values of the split features in the tree skeleton are obtained, the tree skeleton may be further fitted and extended based on the second type of sample data, with the split feature and split value of each extension node determined, until the model converges and the target decision tree is obtained, thereby completing the training of the cash-out determination decision tree for the money-collection-code scenario.
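One way to realize this training step is to fix each node's split feature and, top-down, pick the threshold that best separates the second-scenario samples reaching that node. The sketch below assumes binary labels, the Gini impurity as the criterion, and the Node dataclass from the extraction sketch (repeated here so the sketch is self-contained); the specification itself does not prescribe a particular criterion.

```python
# Sketch of step 106: with split features fixed by the skeleton, refit each
# missing split value on second-scenario data by scanning candidate thresholds.
from dataclasses import dataclass
from typing import Optional
import numpy as np

@dataclass
class Node:                             # same shape as the extraction sketch
    feature: str
    threshold: Optional[float] = None
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def gini(y: np.ndarray) -> float:
    """Gini impurity of a binary label vector."""
    if y.size == 0:
        return 0.0
    p = np.bincount(y, minlength=2) / y.size
    return 1.0 - float(np.sum(p ** 2))

def best_threshold(x: np.ndarray, y: np.ndarray) -> float:
    """Best split value for a fixed feature column, by weighted Gini."""
    best_t, best_score = float(x[0]), np.inf
    for t in np.unique(x):
        left, right = y[x <= t], y[x > t]
        score = (left.size * gini(left) + right.size * gini(right)) / y.size
        if score < best_score:
            best_t, best_score = float(t), score
    return best_t

def fit_missing_splits(node, X, y, col_of):
    """Fill in missing split values top-down, routing samples as we go.

    col_of maps a feature name to its column index in X.
    """
    if node is None or y.size == 0:
        return
    x = X[:, col_of[node.feature]]
    if node.threshold is None:          # only train values the skeleton lacks
        node.threshold = best_threshold(x, y)
    mask = x <= node.threshold
    fit_missing_splits(node.left, X[mask], y[mask], col_of)
    fit_missing_splits(node.right, X[~mask], y[~mask], col_of)
```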
Generally speaking, when the number of black samples reaching a leaf node is small, that leaf node is usually considered unreliable. Optionally, for the trained target decision tree, the credibility of each leaf node in the target decision tree may be calculated using the second type of sample data from the second scenario, and the leaf nodes whose credibility does not satisfy a credibility condition may then be filtered out to streamline the target decision tree.
Taking the GBDT algorithm as an example, the leaf nodes of the target decision tree can first be scored based on all of the second type of sample data; for each leaf node, the scoring results can be aggregated and used as the leaf node's credibility. Suppose the credibility condition is that the credibility ranks in the top 1%; then the leaf nodes whose credibility ranks in the top 1% can be retained and the remaining leaf nodes filtered out.
It is worth noting that, in actual applications, to ensure the integrity of the target decision tree, the leaf nodes that do not satisfy the credibility condition may be left unpruned and simply not used when the target decision tree is applied.
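The scoring-based filter can be sketched as follows. Taking credibility to be the aggregated score, here the count of black samples reaching each leaf, is an assumption of this sketch; matching the note above, untrusted leaves are only flagged, not pruned.

```python
# Sketch of the optional leaf-credibility filter. leaf_ids[i] is the leaf
# reached by second-scenario sample i; y[i] is its label (1 = black sample).
import numpy as np

def trusted_leaves(leaf_ids: np.ndarray, y: np.ndarray,
                   top_fraction: float = 0.01) -> set:
    """Return the leaves whose credibility ranks in the top fraction."""
    scores = {leaf: int(y[leaf_ids == leaf].sum())   # assumed scoring rule
              for leaf in np.unique(leaf_ids)}
    ranked = sorted(scores, key=scores.get, reverse=True)
    keep = max(1, int(len(ranked) * top_fraction))
    return set(ranked[:keep])

# Leaves outside the returned set stay in the tree but are skipped at
# prediction time, preserving the integrity of the target decision tree.
```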
Optionally, for finance-related target decision trees with high interpretability requirements, this specification can automatically generate the model's decision rules.
In this example, for each leaf node of the trained target decision tree, the complete path from the root node to that leaf node can be obtained bottom-up, and a decision rule corresponding to the target decision tree can then be generated from the split features and split values of the nodes on the complete path.
Continuing with FIG. 3, the target decision tree shown in FIG. 3 includes four complete paths: node 1 - node 2 - node 4, node 1 - node 2 - node 5, node 1 - node 3 - node 6, and node 1 - node 3 - node 7.
Assuming the split features and split values represented by the above nodes are as shown in Table 2, each split feature and its split value can be joined with a logical AND. Taking node 1 - node 2 - node 4 as an example, the corresponding decision rule is: the total transaction amount in the last 10 days is greater than or equal to 1000 AND the total transaction amount in the last 5 days is greater than or equal to 500 AND the number of transfer counterparties in the last 8 days is greater than or equal to 12.
In this way, each decision rule of the target decision tree can be generated automatically.
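Rule generation amounts to enumerating every complete root-to-leaf path and joining the node conditions with a logical AND. The sketch below assumes the Node shape from the earlier sketches and writes each branch's comparison per the left-≤/right-> fork-path convention; the worked example above instead states its comparison directions by hand and also folds in the terminal node's own split feature.

```python
# Sketch of decision-rule generation: one "cond AND cond AND ..." rule per
# complete root-to-leaf path. Node has feature/threshold/left/right fields
# as in the earlier sketches (an assumption of this sketch).
def decision_rules(node, prefix=(), out=None):
    """Collect the decision rules of a (sub)tree rooted at node."""
    if out is None:
        out = []
    if node is None:
        return out
    if node.left is None and node.right is None:   # leaf: emit one rule
        out.append(" AND ".join(prefix) if prefix else "(unconditional)")
        return out
    decision_rules(node.left,
                   prefix + (f"{node.feature} <= {node.threshold}",), out)
    decision_rules(node.right,
                   prefix + (f"{node.feature} > {node.threshold}",), out)
    return out
```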
Corresponding to the foregoing embodiments of the decision tree generation method, this specification also provides embodiments of a decision tree generating device.
The embodiments of the decision tree generating device of this specification can be applied on a server. The device embodiments may be implemented by software, or by hardware or a combination of software and hardware. Taking software implementation as an example, as a device in the logical sense, it is formed by the processor of the server on which it resides reading the corresponding computer program instructions from non-volatile storage into memory and running them. In terms of hardware, FIG. 4 is a hardware structure diagram of the server on which the decision tree generating device of this specification resides; in addition to the processor, memory, network interface, and non-volatile storage shown in FIG. 4, the server on which the device in the embodiments resides usually includes other hardware according to its actual functions, details of which are not repeated here.
FIG. 5 is a block diagram of a decision tree generating device according to an exemplary embodiment of this specification.
Referring to FIG. 5, the decision tree generating device 400 may be applied in the server shown in FIG. 4 and includes: a basic acquisition unit 401, a skeleton extraction unit 402, a target training unit 403, and a rule generation unit 404.
The basic acquisition unit 401 acquires a basic decision tree, the basic decision tree being generated based on the first type of sample data;
the skeleton extraction unit 402 extracts a tree skeleton of the basic decision tree, the tree skeleton including the split features of nodes and including either no split values or only some split values;
the target training unit 403 uses the second type of sample data to train the split values missing from the tree skeleton to obtain a target decision tree.
Optionally, the skeleton extraction unit 402 extracts, downward from the root node of the basic decision tree, the nodes at less than or equal to a specified depth and the fork paths between those nodes, the specified depth being less than the depth of the basic decision tree.
Optionally, the skeleton extraction unit 402 extracts, downward from the root node of the basic decision tree, all the nodes of the basic decision tree and the fork paths between them.
Optionally, after the target training unit 403 obtains the split values missing from the tree skeleton by training with the second type of sample data, it extends the tree skeleton based on the second type of sample data and determines the split features and split values of the extension nodes until convergence.
The rule generation unit 404 obtains, for each leaf node of the target decision tree, the complete path from the root node to that leaf node, and generates a decision rule corresponding to the target decision tree according to the split features and split values of the nodes on the complete path.
For details of the implementation process of the functions of each unit in the above device, refer to the implementation process of the corresponding steps in the above method, which is not repeated here.
Since the device embodiments basically correspond to the method embodiments, the relevant parts may refer to the description of the method embodiments. The device embodiments described above are only illustrative: the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units, i.e., they may be located in one place or distributed across multiple network elements. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this specification, which a person of ordinary skill in the art can understand and implement without creative effort.
The systems, devices, modules, or units described in the foregoing embodiments may be implemented by a computer chip or entity, or by a product with a certain function. A typical implementation device is a computer, and the specific form of the computer may be a personal computer, a laptop computer, a cellular phone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email transceiver, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
Corresponding to the foregoing embodiments of the decision tree generation method, this specification also provides a decision tree generating device including a processor and a memory for storing machine-executable instructions. The processor and the memory are usually connected to each other through an internal bus. In other possible implementations, the device may further include an external interface to enable communication with other devices or components.
In this embodiment, by reading and executing the machine-executable instructions stored in the memory that correspond to decision tree generation logic, the processor is caused to:
obtain a basic decision tree, the basic decision tree being generated based on the first type of sample data;
extract a tree skeleton of the basic decision tree, the tree skeleton including the split features of nodes and including either no split values or only some split values;
use the second type of sample data to train the split values missing from the tree skeleton to obtain the target decision tree.
Optionally, when extracting the tree skeleton of the basic decision tree, the processor is caused to:
extract, downward from the root node of the basic decision tree, the nodes at less than or equal to a specified depth and the fork paths between those nodes, the specified depth being less than the depth of the basic decision tree.
Optionally, when extracting the tree skeleton of the basic decision tree, the processor is caused to:
extract, downward from the root node of the basic decision tree, all the nodes of the basic decision tree and the fork paths between them.
Optionally, the processor is further caused to:
after obtaining the split values missing from the tree skeleton by training with the second type of sample data, extend the tree skeleton based on the second type of sample data and determine the split features and split values of the extension nodes until convergence.
Optionally, the processor is further caused to:
for each leaf node of the target decision tree, obtain the complete path from the root node to that leaf node;
generate a decision rule corresponding to the target decision tree according to the split features and split values of the nodes on the complete path.
Corresponding to the foregoing embodiments of the decision tree generation method, this specification also provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the following steps:
obtaining a basic decision tree, the basic decision tree being generated based on the first type of sample data;
extracting a tree skeleton of the basic decision tree, the tree skeleton including the split features of nodes and including either no split values or only some split values;
using the second type of sample data to train the split values missing from the tree skeleton to obtain the target decision tree.
Optionally, extracting the tree skeleton of the basic decision tree includes:
extracting, downward from the root node of the basic decision tree, the nodes at less than or equal to a specified depth and the fork paths between those nodes, the specified depth being less than the depth of the basic decision tree.
Optionally, extracting the tree skeleton of the basic decision tree includes:
extracting, downward from the root node of the basic decision tree, all the nodes of the basic decision tree and the fork paths between them.
Optionally, the steps further include:
after obtaining the split values missing from the tree skeleton by training with the second type of sample data, extending the tree skeleton based on the second type of sample data and determining the split features and split values of the extension nodes until convergence.
Optionally, the steps further include:
for each leaf node of the target decision tree, obtaining the complete path from the root node to that leaf node;
generating a decision rule corresponding to the target decision tree according to the split features and split values of the nodes on the complete path.
The specific embodiments of this specification have been described above. Other embodiments are within the scope of the appended claims. In some cases, the actions or steps recited in the claims may be performed in an order different from that in the embodiments and still achieve the desired results. In addition, the processes depicted in the drawings do not necessarily require the particular order shown, or a sequential order, to achieve the desired results. In certain implementations, multitasking and parallel processing are also possible or may be advantageous.
The above are only preferred embodiments of this specification and are not intended to limit this specification. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of this specification shall be included within the scope of protection of this specification.

Claims (11)

  1. A decision tree generation method, comprising:
    obtaining a basic decision tree, the basic decision tree being generated based on a first type of sample data;
    extracting a tree skeleton of the basic decision tree, the tree skeleton including the split features of nodes and including either no split values or only some split values;
    using a second type of sample data to train the split values missing from the tree skeleton to obtain a target decision tree.
  2. The method according to claim 1, wherein extracting the tree skeleton of the basic decision tree comprises:
    extracting, downward from the root node of the basic decision tree, the nodes at less than or equal to a specified depth and the fork paths between those nodes, the specified depth being less than the depth of the basic decision tree.
  3. The method according to claim 1, wherein extracting the tree skeleton of the basic decision tree comprises:
    extracting, downward from the root node of the basic decision tree, all the nodes of the basic decision tree and the fork paths between them.
  4. The method according to claim 2 or 3, further comprising:
    after obtaining the split values missing from the tree skeleton by training with the second type of sample data, extending the tree skeleton based on the second type of sample data and determining the split features and split values of the extension nodes until convergence.
  5. The method according to claim 1, further comprising:
    for each leaf node of the target decision tree, obtaining the complete path from the root node to the leaf node;
    generating a decision rule corresponding to the target decision tree according to the split features and split values of the nodes on the complete path.
  6. A decision tree generating apparatus, comprising:
    a basic acquisition unit that acquires a basic decision tree, the basic decision tree being generated based on a first type of sample data;
    a skeleton extraction unit that extracts a tree skeleton of the basic decision tree, the tree skeleton including the split features of nodes and including either no split values or only some split values;
    a target training unit that uses a second type of sample data to train the split values missing from the tree skeleton to obtain a target decision tree.
  7. The apparatus according to claim 6, wherein
    the skeleton extraction unit extracts, downward from the root node of the basic decision tree, the nodes at less than or equal to a specified depth and the fork paths between those nodes, the specified depth being less than the depth of the basic decision tree.
  8. The apparatus according to claim 6, wherein
    the skeleton extraction unit extracts, downward from the root node of the basic decision tree, all the nodes of the basic decision tree and the fork paths between them.
  9. The apparatus according to claim 7 or 8, wherein
    after obtaining the split values missing from the tree skeleton by training with the second type of sample data, the target training unit extends the tree skeleton based on the second type of sample data and determines the split features and split values of the extension nodes until convergence.
  10. The apparatus according to claim 6, further comprising:
    a rule generation unit that, for each leaf node of the target decision tree, obtains the complete path from the root node to the leaf node;
    and generates a decision rule corresponding to the target decision tree according to the split features and split values of the nodes on the complete path.
  11. A decision tree generating apparatus, comprising:
    a processor;
    a memory for storing machine-executable instructions;
    wherein, by reading and executing the machine-executable instructions stored in the memory that correspond to decision tree generation logic, the processor is caused to:
    obtain a basic decision tree, the basic decision tree being generated based on a first type of sample data;
    extract a tree skeleton of the basic decision tree, the tree skeleton including the split features of nodes and including either no split values or only some split values;
    use a second type of sample data to train the split values missing from the tree skeleton to obtain a target decision tree.
PCT/CN2019/100682 2018-09-21 2019-08-15 Decision tree generation method and apparatus WO2020057301A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811110423.3A CN109242034B (zh) 2018-09-21 2018-09-21 Decision tree generation method and apparatus
CN201811110423.3 2018-09-21

Publications (1)

Publication Number Publication Date
WO2020057301A1 true WO2020057301A1 (zh) 2020-03-26

Family

ID=65056548

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/100682 WO2020057301A1 (zh) 2018-09-21 2019-08-15 决策树生成方法和装置

Country Status (3)

Country Link
CN (2) CN109242034B (zh)
TW (1) TW202013266A (zh)
WO (1) WO2020057301A1 (zh)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112330054A (zh) * 2020-11-23 2021-02-05 大连海事大学 Decision-tree-based method, system and storage medium for solving the dynamic traveling salesman problem
CN112329874A (zh) * 2020-11-12 2021-02-05 京东数字科技控股股份有限公司 Data service decision method and apparatus, electronic device and storage medium
CN114218994A (zh) * 2020-09-04 2022-03-22 京东科技控股股份有限公司 Method and apparatus for processing information
CN114399000A (zh) * 2022-01-20 2022-04-26 中国平安人寿保险股份有限公司 Object-interpretability feature extraction method, apparatus, device and medium for tree models

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109242034B (zh) 2018-09-21 2020-09-15 阿里巴巴集团控股有限公司 Decision tree generation method and apparatus
CN111353600B (zh) * 2020-02-20 2023-12-12 第四范式(北京)技术有限公司 Abnormal behavior detection method and apparatus
CN111429282B (zh) * 2020-03-27 2023-08-25 中国工商银行股份有限公司 Transaction anti-money-laundering method and apparatus based on anti-money-laundering model migration
CN111401570B (zh) * 2020-04-10 2022-04-12 支付宝(杭州)信息技术有限公司 Interpretation method and apparatus for a privacy tree model

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104679777A (zh) * 2013-12-02 2015-06-03 中国银联股份有限公司 Method and system for detecting fraudulent transactions
WO2016090290A1 (en) * 2014-12-05 2016-06-09 Alibaba Group Holding Limited Method and apparatus for decision tree based search result ranking
CN106096748A (zh) * 2016-04-28 2016-11-09 武汉宝钢华中贸易有限公司 Loading man-hour prediction model based on cluster analysis and decision tree algorithms
CN107203774A (zh) * 2016-03-17 2017-09-26 阿里巴巴集团控股有限公司 Method and apparatus for predicting the category to which data belongs
CN109242034A (zh) * 2018-09-21 2019-01-18 阿里巴巴集团控股有限公司 Decision tree generation method and apparatus

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102982336B (zh) * 2011-09-02 2015-11-25 株式会社理光 Recognition model generation method and system
US9324040B2 (en) * 2013-01-30 2016-04-26 Technion Research & Development Foundation Limited Training ensembles of randomized decision trees
US9292599B2 (en) * 2013-04-30 2016-03-22 Wal-Mart Stores, Inc. Decision-tree based quantitative and qualitative record classification
CN105574544A (zh) * 2015-12-16 2016-05-11 平安科技(深圳)有限公司 Data processing method and apparatus
US20170221075A1 (en) * 2016-01-29 2017-08-03 Sap Se Fraud inspection framework
US11443224B2 (en) * 2016-08-10 2022-09-13 Paypal, Inc. Automated machine learning feature processing
US11100421B2 (en) * 2016-10-24 2021-08-24 Adobe Inc. Customized website predictions for machine-learning systems
CN106682414A (zh) * 2016-12-23 2017-05-17 中国科学院深圳先进技术研究院 Method and apparatus for establishing a time-series prediction model
US20180260531A1 (en) * 2017-03-10 2018-09-13 Microsoft Technology Licensing, Llc Training random decision trees for sensor data processing
CN107135061B (zh) * 2017-04-17 2019-10-22 北京科技大学 Distributed privacy-preserving machine learning method under the 5G communication standard
CN108304936B (zh) * 2017-07-12 2021-11-16 腾讯科技(深圳)有限公司 Machine learning model training method and apparatus, and expression image classification method and apparatus
CN108491891A (zh) * 2018-04-04 2018-09-04 桂林电子科技大学 Multi-source online transfer learning method based on decision tree local similarity

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104679777A (zh) * 2013-12-02 2015-06-03 中国银联股份有限公司 Method and system for detecting fraudulent transactions
WO2016090290A1 (en) * 2014-12-05 2016-06-09 Alibaba Group Holding Limited Method and apparatus for decision tree based search result ranking
CN107203774A (zh) * 2016-03-17 2017-09-26 阿里巴巴集团控股有限公司 Method and apparatus for predicting the category to which data belongs
CN106096748A (zh) * 2016-04-28 2016-11-09 武汉宝钢华中贸易有限公司 Loading man-hour prediction model based on cluster analysis and decision tree algorithms
CN109242034A (zh) * 2018-09-21 2019-01-18 阿里巴巴集团控股有限公司 Decision tree generation method and apparatus

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114218994A (zh) * 2020-09-04 2022-03-22 京东科技控股股份有限公司 Method and apparatus for processing information
CN112329874A (zh) * 2020-11-12 2021-02-05 京东数字科技控股股份有限公司 Data service decision method and apparatus, electronic device and storage medium
CN112330054A (zh) * 2020-11-23 2021-02-05 大连海事大学 Decision-tree-based method, system and storage medium for solving the dynamic traveling salesman problem
CN112330054B (zh) * 2020-11-23 2024-03-19 大连海事大学 Decision-tree-based method, system and storage medium for solving the dynamic traveling salesman problem
CN114399000A (zh) * 2022-01-20 2022-04-26 中国平安人寿保险股份有限公司 Object-interpretability feature extraction method, apparatus, device and medium for tree models

Also Published As

Publication number Publication date
CN109242034A (zh) 2019-01-18
TW202013266A (zh) 2020-04-01
CN109242034B (zh) 2020-09-15
CN112418274B (zh) 2024-09-17
CN112418274A (zh) 2021-02-26

Similar Documents

Publication Publication Date Title
WO2020057301A1 (zh) Decision tree generation method and apparatus
US10719763B2 (en) Image searching
US10726208B2 (en) Consumer insights analysis using word embeddings
US11929074B2 (en) Automatically generating a meeting summary for an information handling system
US10685183B1 (en) Consumer insights analysis using word embeddings
US11182806B1 (en) Consumer insights analysis by identifying a similarity in public sentiments for a pair of entities
WO2015135321A1 (zh) Method and apparatus for mining social relationships based on financial data
US10558759B1 (en) Consumer insights analysis using word embeddings
US20160225030A1 (en) Social data collection and automated social replies
EP3746934A1 (en) Face synthesis
TW201734893A (zh) Method and apparatus for obtaining a credit score and outputting feature vector values
WO2022105118A1 (zh) Image-based health status recognition method, apparatus, device and storage medium
US10509863B1 (en) Consumer insights analysis using word embeddings
US10803248B1 (en) Consumer insights analysis using word embeddings
US20170024388A1 (en) Methods and systems for determining query date ranges
US11030539B1 (en) Consumer insights analysis using word embeddings
WO2018119593A1 (zh) Sentence recommendation method and apparatus
WO2021093367A1 (zh) Model training and risk identification method, apparatus and device
JP7393475B2 (ja) Method, apparatus, system, electronic device, computer-readable storage medium and computer program for retrieving images
WO2021068613A1 (zh) Facial recognition method, apparatus, device and computer-readable storage medium
CN109447273A (zh) Model training method, advertisement recommendation method, related apparatus, device and medium
CN110046648A (zh) Method and apparatus for service classification based on at least one service classification model
US10778619B2 (en) Personality reply for digital content
WO2023192951A1 (en) Non-fungible token minting in a metaverse environment
CN109598513B (zh) Risk identification method and apparatus

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19861363

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19861363

Country of ref document: EP

Kind code of ref document: A1