CN115423540B - Financial model knowledge distillation method and device based on reinforcement learning - Google Patents


Info

Publication number
CN115423540B
Authority
CN
China
Prior art keywords
enterprise
model
teacher
student
reasoning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211373039.9A
Other languages
Chinese (zh)
Other versions
CN115423540A (en)
Inventor
韩柳
胡雪枫
朱威
郑宇晟
唐镇坤
黄文辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Post Consumer Finance Co ltd
Original Assignee
China Post Consumer Finance Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Post Consumer Finance Co ltd filed Critical China Post Consumer Finance Co ltd
Priority to CN202211373039.9A priority Critical patent/CN115423540B/en
Publication of CN115423540A publication Critical patent/CN115423540A/en
Application granted granted Critical
Publication of CN115423540B publication Critical patent/CN115423540B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Strategic Management (AREA)
  • Accounting & Taxation (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Marketing (AREA)
  • Game Theory and Decision Science (AREA)
  • Economics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Business, Economics & Management (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The invention relates to a financial model knowledge distillation method and device based on reinforcement learning, comprising the following steps: S1: designing models for an enterprise A and an enterprise B, and performing pre-training distillation and initialization on the student model of enterprise A; S2: deploying the pre-trained, distilled and initialized student model on a server of enterprise B and performing distillation training again; S3: performing inference prediction with the teacher inference model of enterprise A, and performing data enhancement on the student model of enterprise B with the inference results. The financial model knowledge distillation method and device based on reinforcement learning realize a cross-institution joint modeling scheme; the weak interpretability of the deep learning models used in knowledge distillation protects data privacy, and a customer segment with a high response rate can be obtained from the traffic-referral institutions a credit company relies on without revealing the credit company's risk-control strategy, thereby reducing marketing and customer-acquisition costs.

Description

Financial model knowledge distillation method and device based on reinforcement learning
Technical Field
The invention relates to the technical field of data enhancement in knowledge distillation training, and in particular to a financial model knowledge distillation method and device based on reinforcement learning.
Background
Customer referral between enterprises in the credit industry involves joint modeling, for which the following approaches may be adopted: desensitized data is taken out of the company, or federated learning techniques are used. Both encounter the following drawbacks:
(1) Desensitized data is limited to other-domain samples below the ten-thousand level for the counterparty's local modeling, and the effect is limited by the small volume of data that can be shared;
(2) For financial scenarios, there is no unified federated learning training platform trusted by all credit-industry institutions; moreover, the training and debugging cycle of federated learning is 2-6 times longer than that of local modeling, which hinders adoption;
(3) In cross-domain scenarios, the customer-segment labels of the financial company and the other-domain company differ significantly, and it is difficult for the financial company to select its risk customer segments while preserving privacy;
(4) When a financial scenario searches for a customer segment using risk-class features, the resulting marketing segment does not always have the willingness to borrow, so the conversion rate is low and operating and advertising costs rise.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a knowledge distillation method and device suitable for a credit company's financial customer models under cross-domain or same-domain cooperation, where the financial customer models cover response rate, post-default repayment willingness, application scorecards, behavior score classification, and the like.
In order to achieve the purpose of the invention, the invention provides a financial model knowledge distillation method based on reinforcement learning, which comprises the following steps:
s1: model design of enterprises A and enterprises B is carried out, and pre-training distillation and initialization are carried out on student models of the enterprises A;
s2: building the pre-trained, distilled and initialized student model in a server of an enterprise B, and performing distillation training again;
s3: and reasoning and forecasting are carried out through a teacher reasoning model of the enterprise A, and data enhancement is carried out on a student model of the enterprise B through a reasoning result.
Preferably, the model design of enterprise A in step S1 specifically includes:
the teacher model is designed as n layers of Transformers; each Transformer layer of the teacher model takes the hidden states of the previous layer as input and outputs the corresponding hidden states through multi-head attention.
Preferably, the model design of enterprise B in step S1 specifically includes:
a multi-head teacher model is designed for enterprise B; its teacher model is used only for distilling the local student model, and specifically includes a teacher model for risk-assessment classification.
Preferably, the model design of enterprise A in step S1 specifically further includes:
cleaning the behavior data of enterprise A, the behavior data specifically including the number of user clicks and page-element dwell time; when the volume of user behavior data is too large, statistical-level feature engineering is required.
Preferably, the specific steps of step S3 are:
performing inference prediction with the teacher inference model of enterprise A, and performing data enhancement on the student model of enterprise B in hard-label mode.
Preferably, the specific steps of step S2 include:
the pre-trained, distilled and initialized student model is deployed on a server of enterprise B; the student model of enterprise B is distillation-trained with the knowledge distilled from the teacher models of enterprise A and enterprise B, and the trained student model of enterprise B is deployed at enterprise A for prediction inference that balances enterprise B's approval pass rate and advertisement response rate.
Preferably, the specific step of performing distillation training again in step S2 is:
based on the Actor-Critic method, the student model of enterprise B is used as the Actor, which explores generated behavior sequences for different scenarios, the behavior sequences comprising financial risk strategies and marketing customer-acquisition strategies.
Preferably, the financial risk strategies and the marketing customer-acquisition strategies specifically include:
financial risk strategies: risk intervention strategies such as credit-limit decrease, credit-limit increase, rejection, and approval;
marketing customer-acquisition strategies: marketing strategies such as coupon issuance and promotional activities;
the financial risk strategies are used to probe enterprise B's local risk-control environment and collect status; the marketing customer-acquisition strategies are sent to enterprise A, which is responsible for collecting status.
Preferably, the teacher inference model of enterprise A and the teacher model of enterprise B are used as the Critic; the Critic scores the Actor's actions, a preset value is selected according to the resulting score, and the Critic and the Actor are updated simultaneously according to the score against the preset value.
Preferably, the invention also provides a financial model knowledge distillation device based on reinforcement learning, comprising:
a configuration module: used for designing the models of enterprise A and enterprise B;
a training module: used for performing pre-training distillation and initialization on the student model of enterprise A, deploying the pre-trained, distilled and initialized student model on a server of enterprise B, and performing distillation training again;
a data enhancement module: used for performing inference prediction with the teacher inference model of enterprise A, and performing data enhancement on the student model of enterprise B with the inference results.
The invention has the beneficial effects that: the financial model knowledge distillation method and device based on reinforcement learning realize a cross-institution joint modeling scheme; the weak interpretability of the deep learning models used in knowledge distillation protects data privacy, and a customer segment with a high response rate can be obtained from the traffic-referral institutions a credit company relies on without revealing the credit company's risk-control strategy, thereby reducing marketing and customer-acquisition costs.
Drawings
The foregoing and other objects, features and advantages of the invention will be apparent from the following more particular description of preferred embodiments of the invention, as illustrated in the accompanying drawings. Like reference numerals refer to like parts throughout the drawings, and the drawings are not necessarily drawn to scale, emphasis instead being placed upon illustrating the principles of the invention.
FIG. 1 is a schematic flow chart of a method and apparatus for distilling knowledge of financial models based on reinforcement learning according to an embodiment of the present invention;
fig. 2 is a schematic diagram illustrating specific steps of a financial model knowledge distillation method and apparatus based on reinforcement learning according to an embodiment of the present invention.
Detailed Description
The technical solutions of the present invention are further described in detail below with reference to the drawings and specific embodiments, so that those skilled in the art can better understand and implement the present invention; however, the present invention is not limited to these embodiments.
Referring to fig. 1-2, an embodiment of the invention provides a financial model knowledge distillation method based on reinforcement learning, including the following steps:
s1: model design of an enterprise A (other domain enterprises) and an enterprise B (financial enterprises) is carried out, and pre-training distillation and initialization are carried out on student models of the enterprise A;
s2: building the pre-trained, distilled and initialized student model in a server of an enterprise B, and performing distillation training again;
s3: and carrying out reasoning prediction through a teacher reasoning model of the enterprise A, and carrying out data enhancement on a student model of the enterprise B through a reasoning result.
The beneficial effects of the invention are as follows: the financial model knowledge distillation method and device based on reinforcement learning adopt a teacher-student paradigm and realize a cross-institution joint modeling scheme; the weak interpretability of the deep learning models used in knowledge distillation protects data privacy, and a customer segment with a high response rate can be obtained from the traffic-referral institutions a credit company relies on without revealing the credit company's risk-control strategy, thereby reducing marketing and customer-acquisition costs.
Referring to fig. 1-2, in a preferred embodiment, the model design of enterprise A in step S1 specifically includes:
the teacher model is designed as n layers of Transformers; each Transformer layer of the teacher model takes the hidden states of the previous layer as input and outputs the corresponding hidden states through multi-head attention (for privacy, more than 12 attention heads are suggested; since the student model must also preserve privacy, its number of layers and attention heads should likewise be kept at a certain size).
Referring to fig. 1-2, in a preferred embodiment, the model design of enterprise B in step S1 specifically includes:
a multi-head teacher model is designed for enterprise B; its teacher model is used only for distilling the local student model, and specifically includes a teacher model for risk-assessment classification. Enterprise B, as the user or advertiser of the traffic customer segment, must account for financial risk-control attributes, so a teacher model for risk-assessment classification is built locally at enterprise B. The teacher model built by enterprise B is also a deep learning model and is used only for local student-model distillation, so its weak interpretability is not a concern, and it is not used as the real scorecard model.
So-called knowledge distillation: generally, a large model is often a single complex network or an ensemble of networks with good performance and generalization capability, while a small model has limited expressive capability because of its small size. The knowledge learned by the large model can therefore be used to guide the training of the small model, so that the small model attains performance comparable to the large model with a greatly reduced number of parameters, achieving model compression and acceleration.
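The teacher-to-student knowledge transfer described above is conventionally implemented as a temperature-softened distillation loss; a minimal sketch follows, where the temperature T and weight alpha are illustrative defaults, not values from the patent:

```python
import math

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; higher T yields a softer distribution.
    exps = [math.exp(z / T) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def distillation_loss(student_logits, teacher_logits, hard_label, T=2.0, alpha=0.5):
    """Weighted sum of a soft-target KL term (teacher knowledge) and a
    hard-label cross-entropy term (ground truth)."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    # KL(teacher || student) on temperature-softened distributions,
    # scaled by T^2 as is conventional to keep gradient magnitudes comparable.
    soft = (T * T) * sum(pt * math.log(pt / ps) for pt, ps in zip(p_t, p_s))
    hard = -math.log(softmax(student_logits)[hard_label])
    return alpha * soft + (1.0 - alpha) * hard
```

When the student's logits already match the teacher's, the soft term vanishes and only the hard-label cross-entropy remains.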
Referring to fig. 1-2, in a further preferred embodiment, the model design of enterprise A in step S1 further includes:
cleaning the behavior data of enterprise A (enterprise A, as the advertising operator, must fully consider the characteristics of users on its platform), the behavior data specifically including, but not limited to, the number of user clicks and page-element dwell time; when the volume of user behavior data is too large, statistical-level feature engineering is required.
Referring to fig. 1-2, in a further preferred embodiment, the specific steps of step S3 are:
performing inference prediction with the teacher inference model of enterprise A, and performing data enhancement on the student model of enterprise B in hard-label mode. That is: the customer-segment behavior predicted by enterprise A's inference model reflects real response performance; conditional features can be added at the activation function during student-model training, and using the ground truth effectively reduces the possibility that the teacher model's errors are propagated to Net-S (the student network) during distillation.
Referring to fig. 1-2, in a preferred embodiment, the specific steps of step S2 include:
the pre-trained, distilled and initialized student model is deployed on a server of enterprise B; the student model of enterprise B is distillation-trained with the knowledge distilled from the teacher models of enterprise A and enterprise B, and the trained student model of enterprise B is deployed at enterprise A for prediction inference that balances enterprise B's approval pass rate and advertisement response rate.
The financial model knowledge distillation method and device based on reinforcement learning provided by the invention also have the following characteristics:
1. For enterprise B, the teacher model at enterprise A considers two factors in its design: it is a large model pre-trained on knowledge from enterprise A's domain, such as a recommendation model, using data such as user behavior data or local labels; in the initial phase, enterprise B negotiates with enterprise A the base customer-segment scope of the initial referral with respect to risk-related requirements.
2. The output of the teacher inference model is used as a data-enhancement means during student-model training;
3. Regarding data privacy, enterprise B is subject to financial regulation, and the student model uses a deep learning structure, so even if the student inference model is later deployed at enterprise A, the risk-scoring knowledge learned by enterprise B while training the student model is not exposed; conversely, in enterprise B's fine-tuning stage, the distilled knowledge of enterprise A's teacher model received by enterprise B cannot easily be reverse-engineered into enterprise A's raw data.
Referring to fig. 1-2, in a further preferred embodiment, the specific steps of performing distillation training again in step S2 are:
based on the Actor-Critic method, the student model of enterprise B is used as the Actor, which explores different generated behaviors for different scenarios (assuming the Actor explores N times, N behavior sequences are obtained); the behavior sequences comprise financial risk strategies and marketing customer-acquisition strategies.
Referring to fig. 1-2, in a preferred embodiment, the financial risk strategies and the marketing customer-acquisition strategies specifically include:
financial risk strategies: risk intervention strategies such as credit-limit decrease, credit-limit increase, rejection, and approval;
marketing customer-acquisition strategies: marketing strategies such as coupon issuance and promotional activities;
the financial risk strategies are used to probe enterprise B's local risk-control environment (for example, changing advertisement styles and offer rules) and collect status; the marketing customer-acquisition strategies are sent to enterprise A, which is responsible for collecting status.
Referring to fig. 1-2, in a preferred embodiment, the teacher inference model of enterprise A and the teacher model of enterprise B are used as the Critic; the Critic scores the Actor's actions (obtaining the latest status, reward, and td_error), a preset value is selected according to the resulting score (defined according to actual conditions), and the Critic and the Actor are updated simultaneously according to the score against the preset value, with the actor loss being log_prob * td_error.
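The Critic/Actor update described here can be sketched as a single tabular Actor-Critic step: the Critic's TD error plays the role of td_error, and the actor objective is log_prob * td_error as stated. The two-action softmax policy and all hyperparameters are illustrative assumptions:

```python
import math

def _softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def actor_critic_step(logits, action, value, reward, next_value,
                      gamma=0.99, lr_actor=0.1, lr_critic=0.5):
    """One Actor-Critic update: the Critic forms a TD error from the latest
    (status, reward), then both Critic and Actor are updated from it."""
    td_error = reward + gamma * next_value - value
    # Critic update: move the state value toward the TD target.
    new_value = value + lr_critic * td_error
    # Actor update: ascend log_prob * td_error;
    # d(log pi(action))/d logit_i = 1{i == action} - pi_i for a softmax policy.
    probs = _softmax(logits)
    new_logits = [z + lr_actor * td_error * ((1.0 if i == action else 0.0) - p)
                  for i, (z, p) in enumerate(zip(logits, probs))]
    return new_logits, new_value
```

A positive TD error (the Critic scored the action above its baseline) raises the probability of the explored action; a negative one lowers it.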
According to the financial model knowledge distillation method and device based on reinforcement learning, the number of action explorations cannot be large given the life cycle of a financial customer, so after the student model is preset, the Actor-Critic framework iterates 1-2 rounds every 2 weeks; meanwhile, because the commonly used BERT structure is adopted to design the teacher-student models, the learning rate of the Actor-Critic framework must be carefully controlled.
Referring to fig. 1-2, in a preferred embodiment, the present invention further provides an apparatus for distilling knowledge of financial models based on reinforcement learning, comprising:
a configuration module: used for designing the models of enterprise A and enterprise B;
a training module: used for performing pre-training distillation and initialization on the student model of enterprise A, deploying the pre-trained, distilled and initialized student model on a server of enterprise B, and performing distillation training again;
a data enhancement module: used for performing inference prediction with the teacher inference model of enterprise A, and performing data enhancement on the student model of enterprise B with the inference results.
The invention has the beneficial effects that: the financial model knowledge distillation method based on reinforcement learning realizes a cross-institution joint modeling scheme; the weak interpretability of the deep learning models used in knowledge distillation protects data privacy, and a customer segment with a high response rate can be obtained from the traffic-referral institutions a credit company relies on without revealing the credit company's risk-control strategy, thereby reducing marketing and customer-acquisition costs.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (6)

1. A financial model knowledge distillation method based on reinforcement learning, characterized by comprising the following steps:
S1: designing models for an enterprise A and an enterprise B, and performing pre-training distillation and initialization on the student model of enterprise A;
S2: deploying the pre-trained, distilled and initialized student model on a server of enterprise B, and performing distillation training again;
S3: performing inference prediction with the teacher inference model of enterprise A, and performing data enhancement on the student model of enterprise B with the inference results;
the specific steps of step S2 include:
deploying the pre-trained, distilled and initialized student model on a server of enterprise B, distillation-training the student model of enterprise B with the knowledge distilled from the teacher models of enterprise A and enterprise B, and deploying the trained student model of enterprise B at enterprise A for prediction inference that balances enterprise B's approval pass rate and advertisement response rate;
the specific step of performing distillation training again in step S2 is:
based on the Actor-Critic method, using the student model of enterprise B as the Actor, which explores generated behavior sequences for different scenarios, the behavior sequences comprising financial risk strategies and marketing customer-acquisition strategies; using the teacher inference model of enterprise A and the teacher model of enterprise B as the Critic, scoring the Actor's actions with the Critic, selecting a preset value according to the resulting score, and updating the Critic and the Actor simultaneously according to the score against the preset value;
the financial risk strategies and the marketing customer-acquisition strategies specifically include:
financial risk strategies: risk intervention strategies such as credit-limit decrease, credit-limit increase, rejection, and approval;
marketing customer-acquisition strategies: marketing strategies such as coupon issuance and promotional activities;
the financial risk strategies are used to probe enterprise B's local risk-control environment and collect status; the marketing customer-acquisition strategies are sent to enterprise A, which is responsible for collecting status.
2. The method of claim 1, wherein the model design of enterprise A in step S1 comprises:
designing the teacher model as n layers of Transformers, each Transformer layer of the teacher model taking the hidden states of the previous layer as input and outputting the corresponding hidden states through multi-head attention.
3. The method of claim 1, wherein the model design of enterprise B in step S1 comprises:
designing a multi-head teacher model for enterprise B, the teacher model being used only for distilling the local student model and specifically including a teacher model for risk-assessment classification.
4. The method of claim 1, wherein the model design of enterprise A in step S1 further comprises:
cleaning the behavior data of enterprise A, the behavior data specifically including the number of user clicks and page-element dwell time, wherein statistical-level feature engineering is required when the volume of user behavior data is too large.
5. The financial model knowledge distillation method of claim 1, wherein step S3 comprises the following steps:
performing inference prediction with the teacher inference model of enterprise A, and performing data enhancement on the student model of enterprise B in hard-label mode.
6. A financial model knowledge distillation apparatus based on reinforcement learning, comprising:
a configuration module: the method is used for carrying out model design on the enterprise A and the enterprise B;
a training module: pre-training, distilling and initializing the student models of the enterprise A, building the pre-trained, distilled and initialized student models in a server of the enterprise B, and performing distillation training again;
the data enhancement module: reasoning and predicting through a teacher reasoning model of the enterprise A, and performing data enhancement on a student model of the enterprise B through a reasoning result;
the training module comprises the following specific steps: building a pre-trained, distilled and initialized student model in a server of an enterprise B, carrying out distillation training on the student model of the enterprise B through distilled knowledge of teacher models of the enterprise A and the enterprise B, deploying the trained student model of the enterprise B in the enterprise A, and carrying out prediction reasoning considering batch approval rate and advertisement response rate of the enterprise B;
the specific steps of the training module further comprise: based on an Actor-Critic method, a student model of an enterprise B is used as an Actor, and the exploration of the generated behavior sequence is realized aiming at different scenes, wherein the behavior sequence comprises a financial risk class strategy and a marketing update class strategy; taking a teacher inference model of an enterprise A and a teacher model of an enterprise B as Critic, making behavior scores through the Critic based on action of Actor, selecting preset values according to the obtained scores, and updating the Critic and the Actor simultaneously according to the scores of the preset values;
the financial risk type strategy and the marketing pull-new type strategy specifically comprise the following steps:
financial risk class policy: derating, promoting, declining, and passing risk intervention strategies;
marketing pull new class strategy: designing a coupon issuing and activity free marketing strategy;
the financial risk strategy is used for stimulating the local wind control environment of the enterprise B and collecting status; the marketing pull class strategy is used for sending to the A enterprise and is responsible for collecting status.
CN202211373039.9A 2022-11-04 2022-11-04 Financial model knowledge distillation method and device based on reinforcement learning Active CN115423540B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211373039.9A CN115423540B (en) 2022-11-04 2022-11-04 Financial model knowledge distillation method and device based on reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211373039.9A CN115423540B (en) 2022-11-04 2022-11-04 Financial model knowledge distillation method and device based on reinforcement learning

Publications (2)

Publication Number Publication Date
CN115423540A (en) 2022-12-02
CN115423540B (en) 2023-02-03

Family

ID=84208235

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211373039.9A Active CN115423540B (en) 2022-11-04 2022-11-04 Financial model knowledge distillation method and device based on reinforcement learning

Country Status (1)

Country Link
CN (1) CN115423540B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021114974A1 (en) * 2019-12-14 2021-06-17 支付宝(杭州)信息技术有限公司 User risk assessment method and apparatus, electronic device, and storage medium
CN113887230A (en) * 2021-09-30 2022-01-04 北京熵简科技有限公司 Financial scene-oriented end-to-end natural language processing training framework and method
CN113947214A (en) * 2021-11-23 2022-01-18 湖南三湘银行股份有限公司 Client knowledge distillation-based federal learning implementation method
CN114787833A (en) * 2019-09-23 2022-07-22 普雷萨根私人有限公司 Distributed Artificial Intelligence (AI)/machine learning training system
CN114863092A (en) * 2022-04-29 2022-08-05 广州广电运通金融电子股份有限公司 Knowledge distillation-based federal target detection method and system

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11487944B1 (en) * 2019-12-09 2022-11-01 Asapp, Inc. System, method, and computer program for obtaining a unified named entity recognition model with the collective predictive capabilities of teacher models with different tag sets using marginal distillation
CN111767711B (en) * 2020-09-02 2020-12-08 之江实验室 Compression method and platform of pre-training language model based on knowledge distillation
US20220335303A1 (en) * 2021-04-16 2022-10-20 Md Akmal Haidar Methods, devices and media for improving knowledge distillation using intermediate representations

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
An Actor-Critic-Based Transfer Learning Framework for Experience-Driven Networking; Xu Zhiyuan et al.; IEEE/ACM Transactions on Networking; Feb. 28, 2021; vol. 29, no. 1; pp. 360-371 *
Diversity-driven knowledge distillation for financial trading using; Avraam Tsantekidis et al.; Neural Networks; Mar. 17, 2021; pp. 193-202 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant