CN113435534A - Data heterogeneous processing method and device based on similarity measurement, computer equipment and computer readable storage medium


Info

Publication number
CN113435534A
Authority
CN
China
Prior art keywords
model
target
server
training
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110777789.1A
Other languages
Chinese (zh)
Inventor
谢龙飞
马国良
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ennew Digital Technology Co Ltd
Original Assignee
Ennew Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ennew Digital Technology Co Ltd
Priority to CN202110777789.1A
Publication of CN113435534A
Legal status: Pending (Current)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G06N 20/20 Ensemble learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a data heterogeneous processing method and device based on similarity measurement, computer equipment, and a computer readable storage medium. The method comprises the following steps: a participant acquires the server model; trains a target model according to the server model; performs dual-target optimization training on the target model when performing gradient updates; and converges the target model according to the result of the dual-target optimization training. The server receives a plurality of models participating in joint learning uploaded by participants that have joined the joint learning; initializes the plurality of models; and aggregates the models using a data-volume-based weighted average to obtain the server model. This solves the prior-art problem of model non-convergence caused by continual oscillation of model parameters due to data heterogeneity.

Description

Data heterogeneous processing method and device based on similarity measurement, computer equipment and computer readable storage medium
Technical Field
The present disclosure relates to the field of computer application technologies, and in particular, to a data heterogeneous processing method and apparatus based on similarity measurement, a computer device, and a computer readable storage medium.
Background
Joint learning can be used to support multi-user, multi-party collaboration, combining AI technology with that collaboration to mine data value and establish intelligent joint modeling. On this basis, an AI technology ecosystem based on joint learning can be established, fully leveraging the value of industrial data and promoting the deployment of application scenarios in vertical domains.
In joint learning scenarios, the data distributions of different data sources (i.e., of different participants) are often inconsistent, a situation known as data heterogeneity. Because joint learning currently constructs the starting point of each participant's next round of model updates using a weighted average algorithm based on each data source's data volume, it cannot effectively handle data heterogeneity in joint learning. How to efficiently handle the model non-convergence caused by continual oscillation of model parameters due to data heterogeneity has therefore become a problem that urgently needs to be solved.
Disclosure of Invention
In view of this, the embodiments of the present disclosure provide a data heterogeneous processing method and apparatus based on similarity measurement, a computer device, and a computer readable storage medium, so as to solve the prior-art problem of model non-convergence caused by continual oscillation of model parameters due to data heterogeneity.
In a first aspect of the embodiments of the present disclosure, a data heterogeneous processing method based on similarity measurement is provided, including:
a participant acquires a server model;
training a target model according to the server model;
performing dual-target optimization training on the target model when performing gradient updates;
and converging the target model according to the result of the dual-target optimization training.
In a second aspect of the embodiments of the present disclosure, a data heterogeneous processing method based on similarity measurement is provided, including:
the server receives a plurality of models participating in joint learning, uploaded by participants that have joined the joint learning;
initializes the plurality of models;
and aggregates the models using a data-volume-based weighted average to obtain the server model.
In a third aspect of the embodiments of the present disclosure, a data heterogeneous processing apparatus based on similarity metric is provided, including:
an acquisition module, used by the participant to acquire the server model;
a first training module, used to train a target model according to the server model;
a second training module, used to perform dual-target optimization training on the target model when performing gradient updates;
and an adjustment module, used to converge the target model according to the result of the dual-target optimization training.
In a fourth aspect of the embodiments of the present disclosure, a data heterogeneous processing apparatus based on similarity metric is provided, including:
a receiving module, used by the server side to receive a plurality of models participating in the joint learning uploaded by the participants;
an initialization module, used to initialize the plurality of models;
and an aggregation module, used to aggregate the models using a data-volume-based weighted average to obtain the server model.
In a fifth aspect of the embodiments of the present disclosure, there is provided a computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the above method when executing the computer program.
In a sixth aspect of the embodiments of the present disclosure, a computer-readable storage medium is provided, which stores a computer program, which when executed by a processor, implements the steps of the above-mentioned method.
Compared with the prior art, the embodiments of the present disclosure have the following beneficial effects: the participant acquires the server model; trains a target model according to the server model; performs dual-target optimization training on the target model when performing gradient updates; and converges the target model according to the result of the dual-target optimization training. The server receives a plurality of models participating in joint learning uploaded by participants that have joined the joint learning; initializes the plurality of models; and aggregates the models using a data-volume-based weighted average to obtain the server model. This solves the prior-art problem of model non-convergence caused by continual oscillation of model parameters due to data heterogeneity.
Drawings
To more clearly illustrate the technical solutions in the embodiments of the present disclosure, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description are only some embodiments of the present disclosure, and that those skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a scenario diagram of an application scenario of an embodiment of the present disclosure;
fig. 2 is a flowchart of a data heterogeneous processing method based on similarity measurement according to an embodiment of the present disclosure;
fig. 3 is a flowchart of another data heterogeneous processing method based on similarity measurement according to an embodiment of the present disclosure;
fig. 4 is a block diagram of a data heterogeneous processing apparatus based on a similarity metric according to an embodiment of the present disclosure;
fig. 5 is a block diagram of another data heterogeneous processing apparatus based on a similarity metric according to an embodiment of the present disclosure;
fig. 6 is a schematic diagram of a computer device provided by an embodiment of the present disclosure.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the disclosed embodiments. However, it will be apparent to one skilled in the art that the present disclosure may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present disclosure with unnecessary detail.
A data heterogeneous processing method and apparatus based on similarity measure according to an embodiment of the present disclosure will be described in detail below with reference to the accompanying drawings.
Fig. 1 is a scene schematic diagram of an application scenario of an embodiment of the present disclosure. The application scenario may include terminal devices 1, 2, and 3, server 4, and network 5.
The terminal devices 1, 2, and 3 may be hardware or software. When the terminal devices 1, 2 and 3 are hardware, they may be various electronic devices having a display screen and supporting communication with the server 4, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like; when the terminal devices 1, 2, and 3 are software, they may be installed in the electronic device as described above. The terminal devices 1, 2 and 3 may be implemented as a plurality of software or software modules, or may be implemented as a single software or software module, which is not limited by the embodiments of the present disclosure. Further, the terminal devices 1, 2, and 3 may have various applications installed thereon, such as a data processing application, an instant messaging tool, social platform software, a search-type application, a shopping-type application, and the like.
The server 4 may be a server providing various services, for example, a backend server receiving a request sent by a terminal device establishing a communication connection with the server, and the backend server may receive and analyze the request sent by the terminal device and generate a processing result. The server 4 may be one server, may also be a server cluster composed of a plurality of servers, or may also be a cloud computing service center, which is not limited in this disclosure.
The server 4 may be hardware or software. When the server 4 is hardware, it may be various electronic devices that provide various services to the terminal devices 1, 2, and 3. When the server 4 is software, it may be implemented as a plurality of software or software modules that provide various services for the terminal devices 1, 2, and 3, or may be implemented as a single software or software module that provides various services for the terminal devices 1, 2, and 3, which is not limited in this embodiment of the disclosure.
The network 5 may be a wired network using coaxial cable, twisted pair, or optical fiber, or a wireless network that interconnects communication devices without wiring, for example Bluetooth, Near Field Communication (NFC), or Infrared, which is not limited in the embodiments of the present disclosure.
A user can establish a communication connection with the server 4 via the network 5 through the terminal devices 1, 2, and 3 to receive or transmit information. Specifically, after the user imports the collected data of interest points into the server 4, the server 4 acquires first data of the interest points to be processed, where the first data includes a first longitude and latitude and a first classification of the interest points to be processed, and performs a conflict check on the interest points according to the first longitude and latitude and the first classification; further, when a conflict is determined, the server 4 performs conflict processing on the interest points to be processed, so as to avoid a large amount of duplicate and unusable data in the database.
It should be noted that the specific types, numbers and combinations of the terminal devices 1, 2 and 3, the server 4 and the network 5 may be adjusted according to the actual requirements of the application scenarios, and the embodiment of the present disclosure does not limit this.
Fig. 2 is a flowchart of a data heterogeneous processing method based on similarity measurement according to an embodiment of the present disclosure. The method of fig. 2 may be executed by a terminal device or a server of fig. 1. A participant in the present disclosure may be a user's terminal device or a server, and the server may comprise a plurality of terminal device nodes. As shown in fig. 2, the data heterogeneous processing method based on the similarity metric includes:
s201, a participant acquires a server model;
specifically, when the participating party receives the join joint learning notification, the participating party responds to the server; receiving a server model; the server model is a weighted average aggregation mode based on the data volume. Participants may be randomly selected by the server, which maintains a list of all local clients, selecting a batch of participants to join the joint learning at a time.
S202, training a target model according to the server model;
specifically, the participants can learn the server model to train to obtain a local model and a global model; selecting parameters of a local model and parameters of a global model, and establishing a target function; minimizing the objective function by adopting a back propagation algorithm; the target function can further screen out a double-target loss function as a target from the parameters of the local model and the parameters of the global model, and a back propagation algorithm is adopted to perform gradient (first order or second order) to perform recursive updating so as to minimize the target function.
S203, performing dual-target optimization training on the target model when performing gradient updates;
specifically, the participant performs error training on the local model to obtain an error value of the local model; determining a difference value between the loss parameter of the local model and the loss parameter of the global model; and performing double-target optimization training on the target model according to the error value and the difference value.
Further, the dual-target optimization training may consist of two parts: one is the training error value of the local model, and the other is the difference value between the loss parameter of the local model and the loss parameter of the global model; a hyper-parameter controls the proportion of the two terms in the dual-target optimization training.
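In symbols (notation ours; the disclosure does not fix a formula), with $\ell(w; D_k)$ the training error of participant $k$'s local model on its data $D_k$, $w_{\text{global}}$ the global model parameters, and $\mu$ the hyper-parameter controlling the proportion of the two terms, one such dual-target loss reads

$$\mathcal{L}_k(w) = \ell(w; D_k) + \mu \, \lVert w - w_{\text{global}} \rVert^2 .$$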
By minimizing the error value and the difference value, the local models of the different clients corresponding to the participants can improve local performance while minimizing their difference from the global model, thereby balancing the influence of each party's heterogeneous data on the weight drift of the full model.
And S204, converging the target model according to the result of the dual-target optimization training.
Specifically, the participant can establish a parameter matrix of the local model and a parameter matrix of the global model according to the result of the dual-target optimization training; calculate a similarity metric between the parameter matrices; and converge the target model according to the similarity metric value.
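A minimal sketch of one possible convergence check is given below; the use of cosine similarity between the flattened parameter matrices and the specific threshold are our assumptions, since the disclosure only requires a similarity metric between the parameter matrices:

```python
import numpy as np

def parameter_similarity(local_params, global_params):
    """Cosine similarity between the flattened parameter matrices."""
    a = np.concatenate([p.ravel() for p in local_params])
    b = np.concatenate([p.ravel() for p in global_params])
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def has_converged(local_params, global_params, threshold=0.999):
    # Declare target-model convergence once the local and global
    # parameter matrices are sufficiently similar.
    return parameter_similarity(local_params, global_params) >= threshold
```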
According to the technical scheme provided by this embodiment of the disclosure, a participant acquires the server model; trains a target model according to the server model; performs dual-target optimization training on the target model when performing gradient updates; and converges the target model according to the result of the dual-target optimization training. The local models of the different clients corresponding to the participants can thus minimize their difference from the global model while improving local performance, balancing the influence of each party's data heterogeneity on the weight drift of the full model.
Fig. 3 is a flowchart of another data heterogeneous processing method based on similarity measurement according to an embodiment of the present disclosure. The method of fig. 3 may be performed by a terminal device or the server of fig. 1. A participant in the present disclosure may be a user's terminal device or a server, and the server may comprise a plurality of terminal device nodes. As shown in fig. 3, the data heterogeneous processing method based on the similarity metric includes:
s301, a server receives a plurality of models participating in joint learning uploaded by participants joining in joint learning;
s302, initializing a plurality of models;
and S303, aggregating the models by adopting weighted average based on data quantity to obtain the server model.
Here, joint learning can be used to support multi-user, multi-party collaboration, combining AI technology with that collaboration to mine data value and establish intelligent joint modeling. Intelligent joint modeling includes:
1) the participating nodes control a weakly centralized joint training mode over their own data, ensuring data privacy and security during the co-creation of intelligence;
2) under different application scenarios, multiple model aggregation optimization strategies are established using screening and/or combined AI algorithms and privacy-preserving computation, so as to obtain high-level, high-quality models;
3) on the premise of ensuring data security and user privacy, efficiency methods for improving the joint learning engine are obtained based on the multiple model aggregation optimization strategies; such methods can improve the overall efficiency of the joint learning engine by addressing information interaction, intelligent perception, exception handling mechanisms, and the like under parallel computing architectures, large-scale cross-domain networks, and similar conditions;
4) the requirements of multi-party users in each scenario are obtained, the real contribution of each joint participant is reasonably evaluated through a mutual-trust mechanism, and incentives are distributed accordingly.
Based on the above, an AI technology ecosystem based on joint learning can be established, fully leveraging the value of industrial data and promoting the deployment of application scenarios in vertical domains.
According to the technical scheme provided by this embodiment of the disclosure, the participants acquire the server model; train a target model according to the server model; perform dual-target optimization training on the target model when performing gradient updates; and converge the target model according to the result of the dual-target optimization training. The server receives a plurality of models participating in joint learning uploaded by participants that have joined the joint learning; initializes the plurality of models; and aggregates the models using a data-volume-based weighted average to obtain the server model. This solves the prior-art problem of model non-convergence caused by continual oscillation of model parameters due to data heterogeneity.
All the above optional technical solutions may be combined arbitrarily to form optional embodiments of the present application, and are not described herein again.
The following are embodiments of the disclosed apparatus that may be used to perform embodiments of the disclosed methods. For details not disclosed in the embodiments of the apparatus of the present disclosure, refer to the embodiments of the method of the present disclosure.
Fig. 4 is a schematic diagram of a data heterogeneous processing apparatus based on a similarity metric according to an embodiment of the present disclosure. As shown in fig. 4, the data heterogeneous processing apparatus based on the similarity metric includes:
an acquisition module 401, configured at the participant to acquire a server model;
a first training module 402, configured at the participant to train a target model according to the server model;
a second training module 403, configured at the participant to perform dual-target optimization training on the target model when performing gradient updates;
and an adjustment module 404, configured at the participant to converge the target model according to the result of the dual-target optimization training.
Fig. 5 is a schematic diagram of a data heterogeneous processing apparatus based on a similarity metric according to an embodiment of the present disclosure. As shown in fig. 5, the data heterogeneous processing apparatus based on the similarity metric includes:
a receiving module 501, configured at the server side to receive a plurality of models participating in the joint learning uploaded by the participants that have joined the joint learning;
an initialization module 502, configured at the server side to initialize the plurality of models;
and an aggregation module 503, configured to aggregate the models using a data-volume-based weighted average to obtain the server model.
According to the technical solutions provided by the embodiments of the present disclosure with reference to fig. 4 and 5, by setting the acquisition module, the first training module, the second training module, and the adjustment module at the participant, and setting the receiving module, the initialization module, and the aggregation module at the server end, the prior-art problem of model non-convergence caused by continual oscillation of model parameters due to data heterogeneity can be solved.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation on the implementation process of the embodiments of the present disclosure.
Fig. 6 is a schematic diagram of a computer device 6 provided by an embodiment of the present disclosure. As shown in fig. 6, the computer device 6 of this embodiment includes: a processor 601, a memory 602, and a computer program 603 stored in the memory 602 and operable on the processor 601. The steps in the various method embodiments described above are implemented when the computer program 603 is executed by the processor 601. Alternatively, the processor 601 realizes the functions of each module/unit in the above-described apparatus embodiments when executing the computer program 603.
Illustratively, the computer program 603 may be partitioned into one or more modules/units, which are stored in the memory 602 and executed by the processor 601 to accomplish the present disclosure. One or more modules/units may be a series of computer program instruction segments capable of performing certain functions, which are used to describe the execution of computer program 603 in computer device 6.
The computer device 6 may be a desktop computer, a notebook, a palm computer, a cloud server, or other computer devices. Computer device 6 may include, but is not limited to, a processor 601 and a memory 602. Those skilled in the art will appreciate that fig. 6 is merely an example of a computer device 6 and is not intended to limit the computer device 6 and may include more or fewer components than shown, or some components may be combined, or different components, e.g., the computer device may also include input output devices, network access devices, buses, etc.
The Processor 601 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic device, discrete hardware component, or the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The storage 602 may be an internal storage unit of the computer device 6, for example, a hard disk or a memory of the computer device 6. The memory 602 may also be an external storage device of the computer device 6, such as a plug-in hard disk provided on the computer device 6, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like. Further, the memory 602 may also include both internal and external storage units of the computer device 6. The memory 602 is used for storing computer programs and other programs and data required by the computer device. The memory 602 may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules, so as to perform all or part of the functions described above. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present disclosure.
In the embodiments provided in the present disclosure, it should be understood that the disclosed apparatus/computer device and method may be implemented in other ways. For example, the above-described apparatus/computer device embodiments are merely illustrative: the division of modules or units is only a division of logical functions, and there may be additional divisions in actual implementation; multiple units or components may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be in an electrical, mechanical, or other form.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated modules/units, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer readable storage medium. Based on such understanding, the present disclosure may implement all or part of the flow of the methods in the above embodiments by instructing related hardware through a computer program, where the computer program may be stored in a computer readable storage medium and, when executed by a processor, implements the steps of the above methods and embodiments. The computer program may comprise computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form, etc. The computer readable medium may include: any entity or device capable of carrying computer program code, a recording medium, a USB disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content of the computer readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in the jurisdiction; for example, in some jurisdictions, computer readable media may not include electrical carrier signals or telecommunication signals, in accordance with legislation and patent practice.
The above examples are only intended to illustrate the technical solutions of the present disclosure, not to limit them; although the present disclosure has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present disclosure, and are intended to be included within the scope of the present disclosure.

Claims (10)

1. A data heterogeneous processing method based on similarity measurement is characterized by comprising the following steps:
a participant acquires a server model;
training a target model according to the server model;
when gradient updating is carried out, carrying out dual-target optimization training on the target model;
and carrying out target model convergence on the target model according to the result of the dual-target optimization training.
2. The method of claim 1, wherein obtaining the server model comprises:
responding to the server when the participant receives the notification of joining the joint learning;
receiving the server model;
wherein the server model is a weighted average aggregation mode based on data volume.
3. The method of claim 1, wherein training a target model according to the server model comprises:
learning the server model to obtain a local model and a global model through training;
selecting parameters of the local model and parameters of the global model, and establishing an objective function;
and minimizing the objective function by adopting a back propagation algorithm.
4. The method of claim 3, wherein performing dual target optimization training on the target model while performing gradient updates comprises:
carrying out error training on the local model to obtain an error value of the local model;
determining a difference value between the loss parameter of the local model and the loss parameter of the global model;
and performing dual-target optimization training on the target model according to the error value and the difference value.
5. The method of claim 1, wherein performing target model convergence on the target model according to the results of the dual target optimization training comprises:
according to the result of the double-target optimization training, establishing a parameter matrix of the local model and a parameter matrix of the global model;
calculating a metric of similarity between the parameter matrices;
and carrying out target model convergence on the target model according to the metric value of the similarity.
6. A data heterogeneous processing method based on similarity measurement is characterized by comprising the following steps:
the server receives a plurality of models participating in joint learning uploaded by participants participating in the joint learning;
initializing a plurality of said models;
and aggregating the models by adopting weighted average based on data quantity to obtain a server model.
7. A data heterogeneous processing device based on similarity measurement, comprising:
the acquisition module is used for acquiring the server model by the participant;
the first training module is used for training a target model according to the server model;
the second training module is used for performing dual-target optimization training on the target model when gradient updating is performed;
and the adjusting module is used for carrying out target model convergence on the target model according to the result of the dual-target optimization training.
8. A data heterogeneous processing device based on similarity measurement, comprising:
the receiving module is used by the server side to receive a plurality of models participating in the joint learning uploaded from the participants;
an initialization module for initializing a plurality of said models;
and the aggregation module is used for aggregating the model by adopting weighted average based on data quantity to obtain the server model.
9. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any one of claims 1 to 6 when executing the computer program.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 6.
CN202110777789.1A 2021-07-09 2021-07-09 Data heterogeneous processing method and device based on similarity measurement, computer equipment and computer readable storage medium Pending CN113435534A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110777789.1A CN113435534A (en) 2021-07-09 2021-07-09 Data heterogeneous processing method and device based on similarity measurement, computer equipment and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110777789.1A CN113435534A (en) 2021-07-09 2021-07-09 Data heterogeneous processing method and device based on similarity measurement, computer equipment and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN113435534A true CN113435534A (en) 2021-09-24

Family

ID=77759982

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110777789.1A Pending CN113435534A (en) 2021-07-09 2021-07-09 Data heterogeneous processing method and device based on similarity measurement, computer equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN113435534A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114844889A (en) * 2022-04-14 2022-08-02 北京百度网讯科技有限公司 Video processing model updating method and device, electronic equipment and storage medium
CN116225311A (en) * 2022-12-12 2023-06-06 荣耀终端有限公司 Configuration method, device and server for terminal equipment storage system parameters
WO2023124296A1 (en) * 2021-12-29 2023-07-06 新智我来网络科技有限公司 Knowledge distillation-based joint learning training method and apparatus, device and medium

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200027033A1 (en) * 2018-07-19 2020-01-23 Adobe Inc. Updating Machine Learning Models On Edge Servers
CN111754000A (en) * 2020-06-24 2020-10-09 清华大学 Quality-aware edge intelligent federal learning method and system
CN112181666A (en) * 2020-10-26 2021-01-05 华侨大学 Method, system, equipment and readable storage medium for equipment evaluation and federal learning importance aggregation based on edge intelligence
CN112181971A (en) * 2020-10-27 2021-01-05 华侨大学 Edge-based federated learning model cleaning and equipment clustering method, system, equipment and readable storage medium
CN112257873A (en) * 2020-11-11 2021-01-22 深圳前海微众银行股份有限公司 Training method, device, system, equipment and storage medium of machine learning model
CN112818394A (en) * 2021-01-29 2021-05-18 西安交通大学 Self-adaptive asynchronous federal learning method with local privacy protection
US11017322B1 (en) * 2021-01-28 2021-05-25 Alipay Labs (singapore) Pte. Ltd. Method and system for federated learning
WO2021118452A1 (en) * 2019-12-10 2021-06-17 Agency For Science, Technology And Research Method and server for federated machine learning
CN113033800A (en) * 2019-12-25 2021-06-25 香港理工大学深圳研究院 Distributed deep learning method and device, parameter server and main working node
CN113033712A (en) * 2021-05-21 2021-06-25 华中科技大学 Multi-user cooperative training people flow statistical method and system based on federal learning
CN113077056A (en) * 2021-03-29 2021-07-06 上海嗨普智能信息科技股份有限公司 Data processing system based on horizontal federal learning

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200027033A1 (en) * 2018-07-19 2020-01-23 Adobe Inc. Updating Machine Learning Models On Edge Servers
WO2021118452A1 (en) * 2019-12-10 2021-06-17 Agency For Science, Technology And Research Method and server for federated machine learning
CN113033800A (en) * 2019-12-25 2021-06-25 香港理工大学深圳研究院 Distributed deep learning method and device, parameter server and main working node
CN111754000A (en) * 2020-06-24 2020-10-09 清华大学 Quality-aware edge intelligent federal learning method and system
CN112181666A (en) * 2020-10-26 2021-01-05 华侨大学 Method, system, equipment and readable storage medium for equipment evaluation and federal learning importance aggregation based on edge intelligence
CN112181971A (en) * 2020-10-27 2021-01-05 华侨大学 Edge-based federated learning model cleaning and equipment clustering method, system, equipment and readable storage medium
CN112257873A (en) * 2020-11-11 2021-01-22 深圳前海微众银行股份有限公司 Training method, device, system, equipment and storage medium of machine learning model
US11017322B1 (en) * 2021-01-28 2021-05-25 Alipay Labs (singapore) Pte. Ltd. Method and system for federated learning
CN112818394A (en) * 2021-01-29 2021-05-18 西安交通大学 Self-adaptive asynchronous federal learning method with local privacy protection
CN113077056A (en) * 2021-03-29 2021-07-06 上海嗨普智能信息科技股份有限公司 Data processing system based on horizontal federal learning
CN113033712A (en) * 2021-05-21 2021-06-25 华中科技大学 Multi-user cooperative training people flow statistical method and system based on federal learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
卫新乐 (WEI Xinle) et al.: "基于纵向联邦学习的社交网络跨平台恶意用户检测方法" [A cross-platform malicious user detection method for social networks based on vertical federated learning], 《小型微型计算机系统》 [Journal of Chinese Computer Systems] *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023124296A1 (en) * 2021-12-29 2023-07-06 新智我来网络科技有限公司 Knowledge distillation-based joint learning training method and apparatus, device and medium
CN114844889A (en) * 2022-04-14 2022-08-02 北京百度网讯科技有限公司 Video processing model updating method and device, electronic equipment and storage medium
CN116225311A (en) * 2022-12-12 2023-06-06 荣耀终端有限公司 Configuration method, device and server for terminal equipment storage system parameters
CN116225311B (en) * 2022-12-12 2023-11-21 荣耀终端有限公司 Configuration method, device and server for terminal equipment storage system parameters

Similar Documents

Publication Publication Date Title
CN113435534A (en) Data heterogeneous processing method and device based on similarity measurement, computer equipment and computer readable storage medium
CN112329940A (en) Personalized model training method and system combining federal learning and user portrait
WO2023124296A1 (en) Knowledge distillation-based joint learning training method and apparatus, device and medium
CN113486584B (en) Method and device for predicting equipment failure, computer equipment and computer readable storage medium
CN113487084A (en) Method and device for predicting service life of equipment, computer equipment and computer-readable storage medium
CN112994981B (en) Method and device for adjusting time delay data, electronic equipment and storage medium
CN113988310A (en) Deep learning model selection method and device, computer equipment and medium
CN116827774A (en) Service analysis method, device, equipment and storage medium
CN114116705A (en) Method and device for determining contribution value of participants in joint learning
CN113966602A (en) Distributed storage of blocks in a blockchain
CN113887746A (en) Method and device for reducing communication pressure based on joint learning
CN113487041B (en) Transverse federal learning method, device and storage medium
CN116402366A (en) Data contribution evaluation method and device based on joint learning
CN116340959A (en) Breakpoint privacy protection-oriented method, device, equipment and medium
CN115564055A (en) Asynchronous joint learning training method and device, computer equipment and storage medium
CN104426874A (en) Authentication method and authentication device applied to ubiquitous terminal network
CN113486583A (en) Health assessment method and device of equipment, computer equipment and computer-readable storage medium
CN115563641A (en) Joint learning-based joint recommendation framework method and device, computer equipment and computer-readable storage medium
CN113487087A (en) Method and device for predicting service life of equipment, computer equipment and computer-readable storage medium
CN113487040A (en) Attention mechanism-based joint learning method and device, computer equipment and computer readable storage medium
CN113887745A (en) Data heterogeneous joint learning method and device
CN116319322B (en) Power equipment node communication connection method, device, equipment and computer medium
CN113887744A (en) Data feature extraction method and device based on joint learning
CN114298320A (en) Method and device for calculating contribution value of joint learning, electronic equipment and storage medium
CN114393583B (en) Method and device for controlling equipment through robot

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20210924)