CN113487040A - Attention mechanism-based joint learning method and device, computer equipment and computer readable storage medium - Google Patents


Info

Publication number
CN113487040A
Authority
CN
China
Prior art keywords: participant, model, parameter matrix, key value, matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110778808.2A
Other languages
Chinese (zh)
Inventor
谢龙飞
马国良
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ennew Digital Technology Co Ltd
Original Assignee
Ennew Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ennew Digital Technology Co Ltd filed Critical Ennew Digital Technology Co Ltd
Priority to CN202110778808.2A
Publication of CN113487040A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00: Machine learning
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/21: Design, administration or maintenance of databases
    • G06F 16/215: Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/22: Matching criteria, e.g. proximity measures

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Medical Informatics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Quality & Reliability (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention provides a joint learning method and apparatus based on an attention mechanism, a computer device, and a computer-readable storage medium. The method comprises the following steps: receiving a plurality of participant models uploaded by participants; determining a query parameter matrix and key value matrices respectively according to the participant models; determining the degree of difference between each query parameter matrix and the plurality of key value matrices; and locally updating the participant models using the degrees of difference to obtain new participant models. This addresses the problem in joint learning that, because the participants' data are isolated from one another, the data are unevenly distributed and the specific differences between them are difficult to measure directly.

Description

Attention mechanism-based joint learning method and device, computer equipment and computer readable storage medium
Technical Field
The present disclosure relates to the field of artificial intelligence technologies, and in particular, to a joint learning method and apparatus based on an attention mechanism, a computer device, and a computer-readable storage medium.
Background
In joint learning scenarios, the data distributions of different data sources (that is, of different participants) are often inconsistent. The prior art aggregates models by weighted summation based on differences in the participants' data volumes, which ignores the differences between their data distributions. Moreover, in joint learning the participants' data are isolated from one another, so the specific differences between the data are difficult to measure directly.
Disclosure of Invention
In view of this, the embodiments of the present disclosure provide a joint learning method and apparatus based on an attention mechanism, a computer device, and a computer-readable storage medium, so as to solve the prior-art problem of inconsistent data distributions between participants.
In a first aspect of the embodiments of the present disclosure, a joint learning method based on an attention mechanism is provided, including:
receiving a plurality of participant models uploaded by participants;
respectively determining a query parameter matrix and a key value matrix according to the participant model;
determining the difference degree between each query parameter matrix and a plurality of key value matrices;
and locally updating the participant model by using the difference degree to obtain a new participant model.
In a second aspect of the disclosed embodiments, there is provided an attention-based joint learning apparatus, including:
the receiving module is used for receiving a plurality of participant models uploaded by participants;
the determining module is used for respectively determining a query parameter matrix and a key value matrix according to the participant model;
the calculation module is used for determining the difference degree between each query parameter matrix and the plurality of key value matrices;
and the updating module is used for locally updating the participant model by utilizing the difference degree so as to obtain a new participant model.
In a third aspect of the embodiments of the present disclosure, a computer device is provided, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor implements the steps of the above method when executing the computer program.
In a fourth aspect of the embodiments of the present disclosure, a computer-readable storage medium is provided, which stores a computer program, which when executed by a processor, implements the steps of the above-mentioned method.
Compared with the prior art, the embodiments of the present disclosure have the following beneficial effects: a plurality of participant models uploaded by participants are received; a query parameter matrix and key value matrices are respectively determined according to the participant models; the degree of difference between each query parameter matrix and the plurality of key value matrices is determined; and the participant models are locally updated using the degrees of difference to obtain new participant models. This addresses the problem in joint learning that, because the participants' data are isolated from one another, the data are unevenly distributed and the specific differences between them are difficult to measure directly.
Drawings
To illustrate the technical solutions in the embodiments of the present disclosure more clearly, the drawings needed in the description of the embodiments or of the prior art are briefly introduced below. It is obvious that the drawings in the following description show only some embodiments of the present disclosure, and that other drawings can be derived from them by those skilled in the art without inventive effort.
FIG. 1 is a scenario diagram of an application scenario of an embodiment of the present disclosure;
FIG. 2 is a flowchart of a joint learning method based on an attention mechanism according to an embodiment of the present disclosure;
FIG. 3 is a block diagram of a joint learning apparatus based on an attention mechanism provided by an embodiment of the present disclosure;
fig. 4 is a schematic diagram of a computer device provided by an embodiment of the present disclosure.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the disclosed embodiments. However, it will be apparent to one skilled in the art that the present disclosure may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present disclosure with unnecessary detail.
An attention mechanism-based joint learning method and apparatus according to an embodiment of the present disclosure will be described in detail below with reference to the accompanying drawings.
Fig. 1 is a schematic diagram of an application scenario of an embodiment of the present disclosure. The application scenario may include terminal devices 1, 2, and 3, a server 4, and a network 5.
The terminal devices 1, 2, and 3 may be hardware or software. When the terminal devices 1, 2 and 3 are hardware, they may be various electronic devices having a display screen and supporting communication with the server 4, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like; when the terminal devices 1, 2, and 3 are software, they may be installed in the electronic device as described above. The terminal devices 1, 2 and 3 may be implemented as a plurality of software or software modules, or may be implemented as a single software or software module, which is not limited by the embodiments of the present disclosure. Further, the terminal devices 1, 2, and 3 may have various applications installed thereon, such as a data processing application, an instant messaging tool, social platform software, a search-type application, a shopping-type application, and the like.
The server 4 may be a server providing various services, for example, a backend server receiving a request sent by a terminal device establishing a communication connection with the server, and the backend server may receive and analyze the request sent by the terminal device and generate a processing result. The server 4 may be one server, may also be a server cluster composed of a plurality of servers, or may also be a cloud computing service center, which is not limited in this disclosure.
The server 4 may be hardware or software. When the server 4 is hardware, it may be various electronic devices that provide various services to the terminal devices 1, 2, and 3. When the server 4 is software, it may be implemented as a plurality of software or software modules that provide various services for the terminal devices 1, 2, and 3, or may be implemented as a single software or software module that provides various services for the terminal devices 1, 2, and 3, which is not limited in this embodiment of the disclosure.
The network 5 may be a wired network using coaxial cable, twisted pair, or optical fiber, or may be a wireless network that interconnects various communication devices without wiring, for example Bluetooth, Near Field Communication (NFC), or Infrared, which is not limited in the embodiments of the present disclosure.
A user can establish a communication connection with the server 4 via the network 5 through the terminal devices 1, 2, and 3 to receive or transmit information. Specifically, after the user imports collected point-of-interest data into the server 4, the server 4 acquires first data of a point of interest to be processed, where the first data includes a first longitude and latitude and a first classification of the point of interest to be processed, and performs a conflict check on the point of interest according to the first longitude and latitude and the first classification; further, when a conflict is determined, the server 4 performs conflict processing on the point of interest, so as to avoid a large amount of duplicate and unusable data in the database.
It should be noted that the specific types, numbers and combinations of the terminal devices 1, 2 and 3, the server 4 and the network 5 may be adjusted according to the actual requirements of the application scenarios, and the embodiment of the present disclosure does not limit this.
Fig. 2 is a flowchart of a joint learning method based on an attention mechanism according to an embodiment of the present disclosure. The execution subject of the method in fig. 2 may be a terminal device or a server in fig. 1. In the present invention, a participant may be a user's terminal device or a server, and the server may comprise a central server for a plurality of participants. As shown in fig. 2, the attention mechanism-based joint learning method includes:
s201, receiving a plurality of participant models uploaded by participants;
specifically, the received participant model may be identified as a jointly learned participant model; initializing the participator model; the participator model is a model which is obtained and trained by the joint learning participator.
S202, respectively determining a query parameter matrix and a key value matrix according to a participant model;
specifically, the participant model may be obtained; screening out a target reference participant and a comparison participant in the participant model; extracting the model parameters of the target reference participants as a query parameter matrix; and extracting the model parameters of the comparison participants as a key value matrix.
S203, determining the difference degree between each query parameter matrix and a plurality of key value matrixes;
specifically, in the server, taking a certain participant as an example, the Query parameter matrix is marked as Query, the other participants are marked as Key value matrices as keys, and the difference between the Query and the Key of each participant is calculated.
S204, locally updating the participant model by using the difference degree to obtain a new participant model;
specifically, the difference degree between each group of query parameter matrix and a plurality of key value matrixes can be calculated; establishing a normalization weighting function according to the difference degree; obtaining an attention coefficient of the query parameter matrix by utilizing a normalization weighting function; and adjusting whether the participant model is updated locally or not according to the attention coefficient.
Adjusting whether the participant model is updated locally according to the attention coefficient may include: acquiring an attention coefficient of the query parameter matrix; based on the attention coefficient, carrying out weighted summation on the normalization weighting function value; and adjusting whether the participant model is updated locally or not according to the value of the weighted sum.
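Again purely as an illustrative assumption (the patent leaves the concrete normalization weighting function and the update decision open), a softmax over the negated difference degrees is one possible choice: participants whose parameters lie closer to the target's receive larger attention coefficients, and a weighted sum of the uploaded models yields a candidate new participant model. The sketch reuses the hypothetical helpers from the sketches under steps S201 to S203 and does not model the further decision of whether the local update is actually applied based on the weighted-sum value.

    import numpy as np

    def attention_coefficients(diffs):
        # Normalization weighting function: softmax over negated difference degrees,
        # so a smaller difference degree yields a larger attention coefficient.
        pids = list(diffs)
        scores = np.array([-diffs[pid] for pid in pids])
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()
        return dict(zip(pids, weights))

    def aggregate_for_target(participant_models, target_id):
        # Builds on build_query_and_keys and difference_degrees from the sketches above.
        query, keys = build_query_and_keys(participant_models, target_id)
        coeffs = attention_coefficients(difference_degrees(query, keys))
        # Weighted sum of the comparison participants' parameters, used as the candidate
        # locally updated (new) participant model for the target participant.
        return sum(coeffs[pid] * keys[pid] for pid in keys)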
According to the technical solution provided by the embodiments of the present disclosure, a plurality of participant models uploaded by participants are received; a query parameter matrix and key value matrices are respectively determined according to the participant models; the degree of difference between each query parameter matrix and the plurality of key value matrices is determined; and the participant models are locally updated using the degrees of difference to obtain new participant models. This addresses the problem in joint learning that, because the participants' data are isolated from one another, the data are unevenly distributed and the specific differences between them are difficult to measure directly.
All the above optional technical solutions may be combined arbitrarily to form optional embodiments of the present application, and are not described herein again.
The following are embodiments of the disclosed apparatus that may be used to perform embodiments of the disclosed methods. For details not disclosed in the embodiments of the apparatus of the present disclosure, refer to the embodiments of the method of the present disclosure.
Fig. 3 is a schematic diagram of a joint learning apparatus based on an attention mechanism according to an embodiment of the present disclosure. As shown in fig. 3, the attention-based joint learning apparatus includes:
a receiving module 301, configured to receive a plurality of participant models uploaded by participants;
a determining module 302, configured to determine a query parameter matrix and a key value matrix according to the participant model;
a calculating module 303, configured to determine the difference degree between each query parameter matrix and the plurality of key value matrices;
and the updating module 304 is configured to locally update the participant model by using the difference degree to obtain a new participant model.
Preferably, the determining module 302 may include an obtaining submodule, a screening submodule, a first extraction submodule, and a second extraction submodule. Specifically:
the obtaining submodule is used for obtaining the participant models;
the screening submodule is used for screening out a target reference participant and comparison participants from the participant models;
the first extraction submodule is used for extracting the model parameters of the target reference participant as a query parameter matrix;
and the second extraction submodule is used for extracting the model parameters of the comparison participants as a key value matrix.
Preferably, the updating module 304 may include a calculation submodule, an establishing submodule, a determining submodule, and an adjusting submodule. Specifically:
the calculation submodule is used for calculating the difference degree between each group of query parameter matrices and the plurality of key value matrices;
the establishing submodule is used for establishing a normalization weighting function according to the difference degree;
the determining submodule is used for obtaining an attention coefficient of the query parameter matrix by utilizing a normalization weighting function;
and the adjusting submodule is used for adjusting whether the participant model is locally updated or not according to the attention coefficient.
The apparatus provided by the embodiments of the present disclosure can solve the problem in joint learning that, because the data of the parties are isolated from one another, the data are unevenly distributed and the specific differences between them are difficult to measure directly.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation on the implementation process of the embodiments of the present disclosure.
Fig. 4 is a schematic diagram of a computer device 4 provided by the disclosed embodiment. As shown in fig. 4, the computer device 4 of this embodiment includes: a processor 401, a memory 402 and a computer program 403 stored in the memory 402 and executable on the processor 401. The steps in the various method embodiments described above are implemented when the processor 401 executes the computer program 403. Alternatively, the processor 401 implements the functions of the respective modules/units in the above-described respective apparatus embodiments when executing the computer program 403.
Illustratively, the computer program 403 may be partitioned into one or more modules/units, which are stored in the memory 402 and executed by the processor 401 to accomplish the present disclosure. One or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution of the computer program 403 in the computer device 4.
The computer device 4 may be a desktop computer, a notebook, a palmtop computer, a cloud server, or another computing device. The computer device 4 may include, but is not limited to, the processor 401 and the memory 402. Those skilled in the art will appreciate that fig. 4 is merely an example of the computer device 4 and is not intended to limit it; the computer device 4 may include more or fewer components than those shown, combine some of the components, or have different components. For example, the computer device may also include input/output devices, network access devices, buses, and the like.
The Processor 401 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic device, discrete hardware component, or the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 402 may be an internal storage unit of the computer device 4, for example, a hard disk or memory of the computer device 4. The memory 402 may also be an external storage device of the computer device 4, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash Card provided on the computer device 4. Further, the memory 402 may also include both an internal storage unit and an external storage device of the computer device 4. The memory 402 is used for storing the computer program and other programs and data required by the computer device. The memory 402 may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules, so as to perform all or part of the functions described above. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present disclosure.
In the embodiments provided in the present disclosure, it should be understood that the disclosed apparatus/computer device and method may be implemented in other ways. For example, the apparatus/computer device embodiments described above are merely illustrative; the division into modules or units is only a division of logical functions, and other divisions may be used in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated modules/units, if implemented in the form of software functional units and sold or used as independent products, may be stored in a computer-readable storage medium. Based on this understanding, the present disclosure may implement all or part of the flow of the methods in the above embodiments by instructing the relevant hardware through a computer program, which may be stored in a computer-readable storage medium; when the computer program is executed by a processor, the steps of the above method embodiments may be implemented. The computer program may comprise computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in the jurisdiction; for example, in some jurisdictions, computer-readable media do not include electrical carrier signals or telecommunications signals in accordance with legislation and patent practice.
The above examples are only intended to illustrate the technical solutions of the present disclosure, not to limit them; although the present disclosure has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present disclosure, and are intended to be included within the scope of the present disclosure.

Claims (10)

1. A joint learning method based on an attention mechanism is characterized by comprising the following steps:
receiving a plurality of participant models uploaded by participants;
respectively determining a query parameter matrix and a key value matrix according to the participant model;
determining the difference degree between each query parameter matrix and a plurality of key value matrices;
and locally updating the participant model by using the difference degree to obtain a new participant model.
2. The method of claim 1, wherein receiving a plurality of participant models uploaded by a participant comprises:
confirming that the received participant model is a joint learning participant model;
initializing the participant model;
the participant model is a model which is obtained and trained by the joint learning participant after the joint learning participant acquires the server model.
3. The method of claim 1, wherein determining a matrix of query parameters and a matrix of key values, respectively, according to the participant model comprises:
acquiring the participant model;
screening out a target reference participant and a comparison participant in the participant model;
extracting the model parameters of the target reference participant as a query parameter matrix;
and extracting the model parameters of the comparison participants as a key value matrix.
4. The method of claim 1, wherein locally updating the participant model with the degree of difference to obtain a new participant model comprises:
calculating the difference degree between each group of query parameter matrices and a plurality of key value matrices;
establishing a normalization weighting function according to the difference degree;
obtaining an attention coefficient of the query parameter matrix by using the normalization weighting function;
and adjusting whether the participant model is locally updated or not according to the attention coefficient.
5. The method of claim 4, wherein adjusting whether the participant model is updated locally according to the attention coefficient comprises:
acquiring an attention coefficient of the query parameter matrix;
performing a weighted summation of the normalized weighting function values based on the attention coefficient;
and adjusting whether the participant model is locally updated or not according to the value of the weighted sum.
6. An attention-based joint learning apparatus, comprising:
the receiving module is used for receiving a plurality of participant models uploaded by participants;
the determining module is used for respectively determining a query parameter matrix and a key value matrix according to the participant model;
the calculation module is used for determining the difference degree between each query parameter matrix and a plurality of key value matrices;
and the updating module is used for locally updating the participant model by utilizing the difference degree so as to obtain a new participant model.
7. The apparatus of claim 6, wherein the determining module comprises:
an obtaining submodule for obtaining the participant model;
the screening submodule is used for screening out a target reference participant and a comparison participant in the participant model;
the first extraction submodule is used for extracting the model parameters of the target reference participant into a query parameter matrix;
and the second extraction submodule is used for extracting the model parameters of the comparison participants into a key value matrix.
8. The apparatus of claim 6, wherein the update module comprises:
the calculation sub-module is used for calculating the difference degree between each group of query parameter matrices and a plurality of key value matrices;
the establishing submodule is used for establishing a normalization weighting function according to the difference degree;
the determining submodule is used for obtaining the attention coefficient of the query parameter matrix by utilizing the normalization weighting function;
and the adjusting submodule is used for adjusting whether the participant model is locally updated or not according to the attention coefficient.
9. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1 to 5 when executing the computer program.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 5.
CN202110778808.2A 2021-07-09 2021-07-09 Attention mechanism-based joint learning method and device, computer equipment and computer readable storage medium Pending CN113487040A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110778808.2A CN113487040A (en) 2021-07-09 2021-07-09 Attention mechanism-based joint learning method and device, computer equipment and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110778808.2A CN113487040A (en) 2021-07-09 2021-07-09 Attention mechanism-based joint learning method and device, computer equipment and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN113487040A 2021-10-08

Family

ID=77938286

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110778808.2A Pending CN113487040A (en) 2021-07-09 2021-07-09 Attention mechanism-based joint learning method and device, computer equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN113487040A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110059323A (en) * 2019-04-22 2019-07-26 苏州大学 Based on the multi-field neural machine translation method from attention mechanism
US20200137083A1 (en) * 2018-10-24 2020-04-30 Nec Laboratories America, Inc. Unknown malicious program behavior detection using a graph neural network
WO2020163422A1 (en) * 2019-02-08 2020-08-13 Lu Heng Enhancing hybrid self-attention structure with relative-position-aware bias for speech synthesis

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200137083A1 (en) * 2018-10-24 2020-04-30 Nec Laboratories America, Inc. Unknown malicious program behavior detection using a graph neural network
WO2020163422A1 (en) * 2019-02-08 2020-08-13 Lu Heng Enhancing hybrid self-attention structure with relative-position-aware bias for speech synthesis
CN110059323A (en) * 2019-04-22 2019-07-26 苏州大学 Based on the multi-field neural machine translation method from attention mechanism

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HUA HUANG ET AL.: "Behavior Mimics Distribution: Combining Individual and Group Behaviors for Federated Learning", arXiv *

Similar Documents

Publication Publication Date Title
CN113435534A (en) Data heterogeneous processing method and device based on similarity measurement, computer equipment and computer readable storage medium
CN112307331A (en) Block chain-based college graduate intelligent recruitment information pushing method and system and terminal equipment
CN116403250A (en) Face recognition method and device with shielding
CN116385328A (en) Image data enhancement method and device based on noise addition to image
CN115953803A (en) Training method and device for human body recognition model
CN114700957B (en) Robot control method and device with low computational power requirement of model
CN113487040A (en) Attention mechanism-based joint learning method and device, computer equipment and computer readable storage medium
CN115048430A (en) Data verification method, system, device and storage medium
CN113780148A (en) Traffic sign image recognition model training method and traffic sign image recognition method
CN116910566B (en) Target recognition model training method and device
CN114417717B (en) Simulation method and device of printed circuit board
CN115862117A (en) Face recognition method and device with occlusion
CN115563641A (en) Joint learning-based joint recommendation framework method and device, computer equipment and computer-readable storage medium
CN114862281B (en) Method and device for generating task state diagram corresponding to accessory system
CN114627066A (en) Picture quality evaluation method and device
CN115984783B (en) Crowd counting method and device
CN114140845A (en) Face recognition method and device, electronic equipment and computer readable storage medium
CN115937929A (en) Training method and device of face recognition model for difficult sample
CN114519884A (en) Face recognition method and device, electronic equipment and computer readable storage medium
CN115731623A (en) Human body and human face combined detection method and device
CN115830691A (en) Training method and device of face recognition model
CN114417039A (en) Same house source determining method, target house source determining method and device
CN115022592A (en) Method and device for playing monitoring videos with multiple interfaces
CN114418142A (en) Equipment inspection method and device
CN114139704A (en) Training method and device of neural network model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20211008