CN112734033A - Model training method, device, equipment and storage medium - Google Patents

Model training method, device, equipment and storage medium

Info

Publication number
CN112734033A
Authority
CN
China
Prior art keywords
model
module
trained
client
parameters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011637010.8A
Other languages
Chinese (zh)
Inventor
朱星华
王健宗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202011637010.8A
Publication of CN112734033A
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The model training method, device, equipment and storage medium are applied to a central server in a model training system, where the model training system includes the central server and at least two clients. Structural information and parameter information of a sharing module in a model to be trained are sent to each client, parameters of the sharing module returned by each client are received, and the model to be trained is updated according to the parameters returned by each client. The model to be trained includes the sharing module and a local module; the sharing module is used for obtaining feature values of the point cloud data input into the model to be trained, and the local module is used for obtaining a model result of the model to be trained according to the feature values. The parameters of the sharing module are obtained by each client training the model to be trained on its locally stored point cloud data. The model to be trained obtained by the model training method provided in the embodiments of the application has high accuracy.

Description

Model training method, device, equipment and storage medium
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to a model training method, apparatus, device, and storage medium.
Background
With the continuous development of artificial intelligence, neural network models are widely used in various fields. In some fields, such as indoor map creation or automatic driving, point cloud data is often processed through a neural network model to obtain the corresponding indoor map creation result or automatic driving result. Point cloud data refers to a set of vectors in a three-dimensional coordinate system.
Before a neural network model is applied to process point cloud data, the model is usually trained on a point cloud data set, so as to obtain a neural network model with high accuracy. A point cloud data set typically comprises a plurality of labeled point cloud data. However, on the one hand, point cloud data is 3D data that is difficult to label, which makes point cloud data sets difficult to produce. On the other hand, holders of point cloud data sets typically do not share them publicly, out of considerations of data security, business interest, and the like.
For these two reasons, the amount of data available for training a neural network model on point cloud data is small, and the accuracy of the resulting neural network model is low.
Disclosure of Invention
The application provides a model training method, a model training device, model training equipment and a storage medium, which can improve the accuracy of the model to be trained.
In a first aspect, an embodiment of the present application provides a model training method, which is applied to a central server in a model training system, where the model training system includes the central server and at least two clients, and the method includes:
sending structural information and parameter information of a sharing module in the model to be trained to each client; the model to be trained comprises a sharing module and a local module; the sharing module is used for acquiring a characteristic value of point cloud data input into a model to be trained; the local module is used for obtaining a model result of the model to be trained according to the characteristic value;
receiving parameters of a sharing module returned by each client, wherein the parameters of the sharing module are obtained by training a model to be trained by each client according to locally stored point cloud data;
and updating the model to be trained according to the parameters of each sharing module.
In an embodiment, the updating the model to be trained according to the parameters of the shared modules includes:
carrying out weighted average on the parameters of each sharing module to obtain the updating parameters of the sharing modules;
and updating the model to be trained according to the update parameters of the sharing module.
In an embodiment, after updating the model to be trained according to the update parameters of the shared module, the method further includes:
and returning the update parameters of the sharing module to each client.
In an embodiment, the sending structure information and parameter information of a shared module in a model to be trained to each client includes:
and sending the structural information and the parameter information of the shared module in the model to be trained to each client in batches according to the network transmission speed and the batch corresponding to each client.
In one embodiment, the local modules are arranged on corresponding clients;
the parameters of the local module are obtained by training the model to be trained by each client according to the structural information of the sharing module, the parameter information of the sharing module, the structural information of the local module preset on the client, the parameter information of the local module and the point cloud data stored on the client.
In one embodiment, the method further comprises:
and sending an updating instruction to each client so that each client updates the local module according to the parameters of the local module.
In one embodiment, the local module is configured to obtain a classification result or a segmentation result of the point cloud data according to the feature value.
In a second aspect, a model training apparatus is applied to a central server in a model training system, where the model training system includes the central server and at least two clients, and the apparatus includes:
the sending module is used for sending the structural information and the parameter information of the sharing module in the model to be trained to each client; the model to be trained comprises a sharing module and a local module; the sharing module is used for acquiring a characteristic value of point cloud data input into a model to be trained; the local module is used for obtaining a model result of the model to be trained according to the characteristic value;
the receiving module is used for receiving the parameters of the sharing module returned by each client, and the parameters of the sharing module are obtained by training the model to be trained by each client according to the point cloud data stored locally;
and the updating module is used for updating the model to be trained according to the parameters of the sharing modules.
In a third aspect, an embodiment of the present application provides an electronic device, which includes a memory and a processor, where the memory stores a computer program, and the processor implements the steps of the method according to the first aspect when executing the computer program.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements the steps of the method according to the first aspect.
The model training method, device, equipment and storage medium are applied to a central server in a model training system, where the model training system includes the central server and at least two clients. Structural information and parameter information of a sharing module in a model to be trained are sent to each client, parameters of the sharing module returned by each client are received, and the model to be trained is updated according to the parameters returned by each client. The model to be trained includes the sharing module and a local module; the sharing module is used for obtaining feature values of the point cloud data input into the model to be trained, and the local module is used for obtaining a model result of the model to be trained according to the feature values. The parameters of the sharing module are obtained by each client training the model to be trained on its locally stored point cloud data. In the embodiments of the application, the parameters used to update the sharing module of the model to be trained are obtained by training on the point cloud data stored on at least two clients, so more point cloud data is used than in conventional model training, which improves the accuracy of the model to be trained. Furthermore, the central server only needs to receive the parameters of the sharing module returned by each client and update the sharing module according to those parameters; it does not need to update the model to be trained according to all the model parameters returned by each client, which reduces the amount of data the central server has to process and improves its efficiency.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description are only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of a model training system in one embodiment of the present application;
FIG. 2 is a schematic flow chart diagram illustrating a model training method according to an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a model to be trained in another embodiment of the present application;
FIG. 4 is a schematic flow chart of a model training method according to another embodiment of the present application;
FIG. 5 is a schematic structural diagram of a model training apparatus provided in an embodiment of the present application;
fig. 6 is an internal structural diagram of an electronic device in an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
It is to be understood that the terms "first," "second," "third," "fourth," and the like (if any) in the embodiments of the present application are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
The model training method provided by this embodiment can be applied to the application environment shown in fig. 1. The system includes a central server 100 and at least two clients 200, and the central server 100 is in communication connection with each client. The central server 100 may be, but is not limited to, an electronic device with a data processing function, such as a smart phone, a tablet computer, a notebook computer, a desktop computer, or a personal digital assistant; the specific form of the central server 100 is not limited in this embodiment. Each client may likewise be, but is not limited to, an electronic device with a data processing function, such as a smart phone, a tablet computer, a notebook computer, a desktop computer, or a personal digital assistant; the specific form of the client is not limited in this embodiment.
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments.
It should be noted that the execution subject of the method embodiments described below may also be a model training apparatus, and the apparatus may be implemented as part or all of the electronic device by software, hardware, or a combination of software and hardware. The following method embodiments are described by taking an execution subject as an electronic device as an example.
Fig. 2 is a schematic flowchart of a model training method according to an embodiment of the present application. The method is applied to a central server in a model training system, where the model training system includes the central server and at least two clients, and this embodiment relates to the specific process of how to improve the accuracy of the model to be trained. As shown in fig. 2, the method includes the following steps:
s101, sending structural information and parameter information of a sharing module in a model to be trained to each client; the model to be trained comprises a sharing module and a local module; the sharing module is used for acquiring a characteristic value of point cloud data input into a model to be trained; and the local module is used for obtaining a model result of the model to be trained according to the characteristic value.
The model to be trained may be a neural network model that takes point cloud data as input data; it may be a neural network model for indoor map creation or a neural network model for automatic driving, which is not limited in this embodiment of the application. As shown in fig. 3, the model to be trained includes a sharing module and a local module, where the sharing module is configured to obtain feature values of the point cloud data input into the model to be trained, and the local module is configured to obtain a model result of the model to be trained according to the feature values. The sharing module may be stored on the central server, the local module may be stored on the corresponding client, and the central server may send the structure information and the parameter information of the sharing module when sending the model information to each client. The structure information of the sharing module may be used to indicate the number of convolutional layers, the number of pooling layers, the number of fully-connected layers, the type and number of classifiers included in the sharing module, and the connection relationship among the convolutional layers, the pooling layers, the fully-connected layers, and the classifiers. The parameter information of the sharing module may be used to indicate initial parameters of the sharing module; for example, the parameter information may be initial parameters of the loss function of the sharing module, or initial network parameters of the sharing module, which is not limited in this embodiment of the application. It should be noted that the structure information and the parameter information of the local module are defined similarly to those of the sharing module, and are not described again here.
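As an illustration only, the structure information and parameter information described above could be packaged into a simple message such as the following Python sketch; the field names and values are assumptions made for this example, not a format recited by the embodiment.

```python
# Hypothetical message the central server might send to every client in S101.
# Every field name and value here is an illustrative assumption.
shared_module_info = {
    "structure": {
        "conv_layers": 3,                    # number of convolutional layers
        "pooling_layers": 1,                 # number of pooling layers
        "fc_layers": 2,                      # number of fully-connected layers
        "classifier": {"type": "softmax", "count": 1},
        "connections": "conv -> conv -> conv -> pool -> fc -> fc -> softmax",
    },
    "parameters": {
        "initializer": "xavier_uniform",     # initial network parameters
        "loss": {"type": "cross_entropy"},   # initial parameters of the loss function
    },
}
```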
The sharing module is used for extracting feature values from the point cloud data of the model to be trained; for different types of models, feature extraction is a common operation and can be managed uniformly on the central server. Point cloud data is unstructured data with rotation invariance and displacement invariance, so the sharing module may adopt a graph convolution network to extract the features of the point cloud data. In one possible case, in order to avoid the problem that the receptive field of a graph convolution network expands slowly, which makes feature extraction on large-scale point cloud data unsatisfactory, the sharing module may instead adopt an edge convolution layer (EdgeConv) as the convolution layer for feature extraction; the sparse-sampling characteristic of the edge convolution layer alleviates the problem of unsatisfactory feature extraction on large-scale point cloud data. The local module is used for obtaining the model result of the model to be trained according to the feature values; it can be set in a personalized manner according to the purpose of the model to be trained, and is arranged on each client.
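To make the sharing module concrete, the following PyTorch sketch shows an edge-convolution style feature extractor of the kind mentioned above; the neighbourhood size k, the layer widths and the class names are assumptions made for illustration rather than the structure recited by the embodiment.

```python
import torch
import torch.nn as nn

def knn(x, k):
    """Indices of the k nearest neighbours of every point; x has shape (B, N, C)."""
    dist = torch.cdist(x, x)                       # (B, N, N) pairwise distances
    return dist.topk(k, largest=False).indices     # (B, N, k); includes the point itself

class EdgeConvBlock(nn.Module):
    """Simplified edge convolution: an MLP is applied to (x_i, x_j - x_i) over the
    k nearest neighbours of each point and the results are max-pooled, so the
    extracted feature values are insensitive to point ordering."""
    def __init__(self, in_dim, out_dim, k=16):
        super().__init__()
        self.k = k
        self.mlp = nn.Sequential(nn.Linear(2 * in_dim, out_dim), nn.ReLU())

    def forward(self, x):                                    # x: (B, N, C)
        idx = knn(x, self.k)                                  # (B, N, k)
        neighbours = torch.gather(
            x.unsqueeze(1).expand(-1, x.size(1), -1, -1),     # (B, N, N, C)
            2, idx.unsqueeze(-1).expand(-1, -1, -1, x.size(-1)))
        center = x.unsqueeze(2).expand_as(neighbours)         # (B, N, k, C)
        edge = torch.cat([center, neighbours - center], -1)   # (B, N, k, 2C)
        return self.mlp(edge).max(dim=2).values               # (B, N, out_dim)

class SharingModule(nn.Module):
    """Hypothetical sharing module: stacks two edge-convolution blocks to turn raw
    xyz coordinates into per-point feature values for the local module."""
    def __init__(self, feat_dim=64, k=16):
        super().__init__()
        self.conv1 = EdgeConvBlock(3, feat_dim, k)
        self.conv2 = EdgeConvBlock(feat_dim, feat_dim, k)

    def forward(self, points):                                # points: (B, N, 3)
        return self.conv2(self.conv1(points))                 # (B, N, feat_dim)
```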
When the central server sends the model information to each client, it may send the model information to all the clients at once; it may also divide the clients into different batches and send the model information to the corresponding clients batch by batch; or it may send the model information to one client at a time. This is not limited in the embodiments of the present application.
And S102, receiving parameters of the sharing module returned by each client, wherein the parameters of the sharing module are obtained by training the model to be trained by each client according to the locally stored point cloud data.
The locally stored point cloud data may be sample data for model training obtained by labeling point clouds. The parameters of the sharing module may be the network parameters of the sharing module in the adjusted model to be trained, which are obtained by using the locally stored point cloud data as input data of the model to be trained and training the model to be trained.
After receiving the structure information and the parameter information of the sharing module sent by the central server, each client can construct a complete model to be trained according to the structure information of the sharing module, the parameter information of the sharing module, the structure information of the local module pre-stored on the client, and the parameter information of the local module; it then trains the model to be trained with the locally stored point cloud data as input data, obtains the parameters of the sharing module, and returns them to the central server.
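A minimal client-side sketch of the procedure just described might look as follows, assuming PyTorch; the optimizer, loss and loop settings are illustrative assumptions, and the sharing module is assumed to have already been instantiated from the received structure information.

```python
import copy
import torch
import torch.nn as nn

def client_train(shared_module, shared_params, local_module, point_cloud_loader,
                 epochs=5, lr=1e-3):
    """Hypothetical client-side routine. shared_module is assumed to have been
    instantiated from the structure information received from the central server;
    shared_params is the received parameter information; local_module is the module
    pre-stored on this client. The full model is trained on the locally stored
    point clouds and only the sharing module's parameters are returned."""
    shared_module.load_state_dict(shared_params)
    model = nn.Sequential(shared_module, local_module)        # complete model to be trained
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()                          # e.g. a classification task
    for _ in range(epochs):
        for points, labels in point_cloud_loader:              # locally stored, labeled point clouds
            optimizer.zero_grad()
            loss = criterion(model(points), labels)
            loss.backward()
            optimizer.step()
    # Only the sharing-module parameters leave the client; the local module stays local.
    return copy.deepcopy(shared_module.state_dict())
```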
The central server may uniformly receive the parameters of the sharing module returned by all the clients at a preset time, may receive the parameters of the sharing module returned by each client in real time at the moment that client completes the training of the model to be trained, or may receive the parameters of the sharing module returned by each client according to that client's preset return time, which is not limited in the embodiments of the present application. For example, suppose the central server is in communication connection with 5 clients, namely client 1, client 2, client 3, client 4 and client 5. The central server may uniformly receive the parameters of the sharing module returned by the 5 clients at the preset time 1:00; or it may receive the parameters returned by client 1 at 1:02 when client 1 finishes training the model to be trained, the parameters returned by client 2 at 2:05 when client 2 finishes training, the parameters returned by client 3 at 1:12 when client 3 finishes training, the parameters returned by client 4 at 4:33 when client 4 finishes training, and the parameters returned by client 5 at 2:05 when client 5 finishes training.
And S103, updating the model to be trained according to the parameters of each sharing module.
After receiving the parameters of the sharing module returned by each client, the central server may take the average of the parameters of the sharing modules to update the model to be trained, may perform a weighted average of the parameters of the sharing modules according to their weight values to update the model to be trained, or may select part of the parameters of the sharing modules to update the model to be trained, which is not limited in the embodiments of the present application.
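Putting S101-S103 together, one communication round on the central server could be sketched as below; the client.send / client.receive_shared_parameters interface is a hypothetical assumption, and a plain average is shown here, while the weighted average of the next embodiment is sketched further below.

```python
def training_round(clients, shared_module, structure_info):
    """One communication round, as a sketch: send the sharing-module information
    to every client (S101), collect the returned sharing-module parameters (S102),
    and update the model to be trained from them (S103, plain average here)."""
    param_info = shared_module.state_dict()
    returned_states = []
    for client in clients:
        client.send(structure_info, param_info)                      # S101: assumed interface
        returned_states.append(client.receive_shared_parameters())   # S102: assumed interface
    update = {key: sum(state[key].float() for state in returned_states) / len(returned_states)
              for key in returned_states[0]}                          # S103: average the parameters
    shared_module.load_state_dict(update)
```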
The above model training method is applied to a central server in a model training system, where the model training system includes the central server and at least two clients. The structural information and parameter information of the sharing module in the model to be trained are sent to each client, the parameters of the sharing module returned by each client are received, and the model to be trained is then updated according to the parameters returned by each client. The model to be trained includes the sharing module and a local module; the sharing module is used for obtaining feature values of the point cloud data input into the model to be trained, and the local module is used for obtaining a model result of the model to be trained according to the feature values. The parameters of the sharing module are obtained by each client training the model to be trained on its locally stored point cloud data. In the embodiment of the application, the parameters used to update the sharing module of the model to be trained are obtained by training on the point cloud data stored on at least two clients, so more point cloud data is used than in conventional model training, which improves the accuracy of the model to be trained. Furthermore, the central server only needs to receive the parameters of the sharing module returned by each client and update the sharing module according to those parameters; it does not need to update the model to be trained according to all the model parameters returned by each client, which reduces the amount of data the central server has to process and improves its efficiency.
Fig. 4 is a schematic flowchart of a model training method in another embodiment of the present application, which relates to the specific process of how to update the model to be trained according to the parameters of each sharing module. As shown in fig. 4, one possible implementation of S103 "updating the model to be trained according to the parameters of each sharing module" includes:
s201, carrying out weighted average on the parameters of the sharing modules to obtain the updating parameters of the sharing modules.
When the parameters of each sharing module are weighted and averaged, the parameters of each sharing module can be weighted and averaged according to the weight value corresponding to the parameters of each sharing module, so as to obtain the update parameters of the sharing modules; or selecting part of parameters of the sharing module from the parameters of each sharing module, and performing weighted average on the selected parameters of the sharing module according to the weighted value corresponding to the selected parameters of the sharing module to obtain the updated parameters of the sharing module; the embodiments of the present application do not limit this. For example, the central server may select, from the parameters of each sharing module, a parameter of the sharing module, which is used by the local module to obtain the classification result of the point cloud data according to the feature value, and perform weighted average according to the weighted value corresponding to the parameter of each sharing module, so as to obtain an update parameter of the sharing module.
In a possible case, the central server may further obtain a weight value of the parameter of each sharing module according to a weight corresponding to each client obtained in the multiple rounds of communications with each client. For example, the weight value of the parameter of each shared module may be determined by the following method.
In the t-th round of communication, each client updates the parameters of the sharing module locally, and the central server then takes a weighted average of the returned parameters:

w_s^(k,t) = ClientUpdate(k, w_s^(t-1))

w_s^(t) = Σ_k (n_k / n) · w_s^(k,t)

ClientUpdate(k, w_s):
for local epoch = 1 : E
    for each batch β of data on the client
        w_s ← w_s − η · ∇ℓ(w_s, w_l^k; β)
return w_s

wherein k represents the serial number of the client, t represents the number of communication rounds, w_s^(k,t) represents the parameter weight of the sharing module returned by the k-th client in the t-th round of communication, w_s^(t-1) represents the updated parameter weight of the sharing module after the weighted average in the (t-1)-th round of communication, w_s^(t) represents the updated parameter weight of the sharing module after the weighted average in the t-th round of communication, n_k represents the data volume of the k-th client, n represents the total data volume, ClientUpdate represents the updating process of the client, epoch represents the iteration round of the client, E represents the maximum iteration round of the client, w_s represents the parameter weight of the sharing module, batch β represents the β-th batch of data, ℓ represents the training loss, w_l^k represents the parameter weight of the local module of the k-th client, and η represents the learning rate of deep learning.
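On the central server, the weighted average above can be sketched as follows; weighting each client's returned parameters by its share n_k / n of the total data volume is the reading adopted in the formulas above, and the function name and the state-dict representation are assumptions made for illustration.

```python
import copy

def aggregate_shared_parameters(client_states, client_data_sizes):
    """Weighted average of the sharing-module state dicts returned by the clients,
    with each client weighted by n_k / n (its fraction of the total data volume)."""
    n = float(sum(client_data_sizes))
    weights = [n_k / n for n_k in client_data_sizes]
    update = copy.deepcopy(client_states[0])
    for key in update:
        update[key] = sum(w * state[key].float()
                          for w, state in zip(weights, client_states))
    return update

# The central server would then apply the result to its copy of the model,
# e.g. shared_module.load_state_dict(aggregate_shared_parameters(states, sizes)).
```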
And S202, updating the model to be trained according to the update parameters of the sharing module.
In a possible situation, after the central server updates the model to be trained according to the update parameters of the sharing module, the update parameters of the sharing module can be returned to each client, so that each client can store the updated model to be trained on the client, and each client can conveniently and directly use the updated model to be trained.
Optionally, the update parameters of the sharing module are returned to each client.
The central server can immediately return the update parameters of the sharing module to each client when the update parameters of the sharing module are obtained; or after the update parameters of the shared module are obtained, the update parameters of the shared module can be returned to each client when the preset time is reached; the update parameters of the sharing module can be returned to each client in batches according to preset batch information; the embodiment of the present application does not limit this.
In one possible case, when the model parameters returned by the clients to the central server are the parameters of the sharing module, after the central server performs the weighted average of these parameters and updates the sharing module according to the resulting update parameters, it returns the update parameters of the sharing module to each client. Each client then updates the sharing module of the model to be trained on the client according to the update parameters of the sharing module, and updates the local module of the model to be trained on the client according to the parameters of the local module obtained by its own training, thereby obtaining the updated model to be trained.
According to the above model training method, after the central server obtains the update parameters of the sharing module, it returns them to each client, so that each client can update the model to be trained stored on the client according to the update parameters of the sharing module and obtain the updated model to be trained. Then, when the updated model to be trained is used, each client can directly use the copy stored on the client, which avoids the situation that the updated model to be trained cannot be used because communication between the central server and the client is not smooth.
When the network transmission speed between the central server and each client is low, at least two clients can be divided into different batches, and the structural information and the parameter information of the shared module in the model to be trained are sent to each client in batches.
Optionally, according to the network transmission speed, the structural information and the parameter information of the shared module in the model to be trained are sent to the clients in batches according to the batches corresponding to the clients.
When the network transmission speed is lower than a preset speed threshold, the central server may send the structural information and the parameter information of the shared module in the model to be trained to the clients in batches according to the batch corresponding to each client. When the network transmission speed is higher than the preset speed threshold, the central server can directly send the structural information and the parameter information of the shared module in the model to be trained to all the clients. For example, when the network transmission speed is lower than the preset threshold, the 10 corresponding clients may be divided into 2 batches, where clients 1-5 are the first batch and clients 6-10 are the second batch; the central server first sends the structural information and the parameter information of the shared module in the model to be trained to clients 1-5, and then sends them to clients 6-10.
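A simple scheduling sketch of this batched sending is given below; the speed threshold, the batch size and the client.send interface are assumptions used only for illustration.

```python
def send_shared_module_info(clients, structure_info, param_info,
                            network_speed, speed_threshold=10.0, batch_size=5):
    """If the measured network transmission speed is below the preset threshold,
    push the shared-module information to the clients batch by batch; otherwise
    push it to all clients at once."""
    if network_speed >= speed_threshold:
        batches = [clients]                                   # a single batch with every client
    else:
        batches = [clients[i:i + batch_size]
                   for i in range(0, len(clients), batch_size)]
    for batch in batches:
        for client in batch:
            client.send(structure_info, param_info)           # assumed client interface
```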
According to the above model training method, the central server sends the structural information and the parameter information of the shared module in the model to be trained to the clients in batches according to the network transmission speed and the batch corresponding to each client. This avoids the problem that data congestion caused by a low network transmission speed prevents the structural information and the parameter information of the shared module from being sent to each client.
In one embodiment, the local modules are arranged on the corresponding clients, and the parameters of the shared modules and the parameters of the local modules can be obtained by training the model to be trained according to the structural information of the shared modules, the parameter information of the shared modules, the structural information of the local modules preset on the clients and the parameter information of the local modules.
Optionally, the parameters of the local module are obtained by each client training the model to be trained according to the structural information of the sharing module, the parameter information of the sharing module, the structural information of the local module preset on the client, the parameter information of the local module, and the point cloud data stored on the client. After the model to be trained is trained in this way and the parameters of the sharing module and the parameters of the local module are obtained, only the parameters of the sharing module need to be returned to the central server. The parameters of the local module are stored on the client, and the local module is updated according to the parameters of the local module. The client may update the local module immediately after obtaining the parameters of the local module, or may update the local module after receiving an update instruction sent by the central server, which is not limited in this embodiment of the application.
Optionally, an update instruction is sent to each client, so that each client updates its local module according to the parameters of the local module. It should be noted that the local modules on the clients may be the same or different, which is not limited in this embodiment of the application. Optionally, the local module is configured to obtain a classification result or a segmentation result of the point cloud data according to the feature values. The local modules on the clients may differ: for example, the local module of client 1 obtains the classification result of the point cloud data according to the feature values, while the local module of client 2 obtains the segmentation result of the point cloud data according to the feature values. Client 1 trains the model to be trained according to the structural information of the sharing module, the parameter information of the sharing module, the structural information of the local module preset on the client, and the parameter information of the local module, and obtains the parameters of the sharing module and the parameters of the local module, where the parameters of the local module are suitable for the classification task; the local module on client 1 is updated with the local-module parameters obtained by client 1's training, so the resulting model to be trained is better suited to the classification task. Client 2 trains the model to be trained in the same way and obtains the parameters of the sharing module and the parameters of the local module, where the parameters of the local module are suitable for the segmentation task; the local module on client 2 is updated with the local-module parameters obtained by client 2's training, so the resulting model to be trained is better suited to the segmentation task.
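As an illustration of how two clients can keep different local modules on top of the same per-point feature values, the following PyTorch sketch shows a classification head and a segmentation head; the layer sizes and class names are assumptions, not the modules recited by the embodiment.

```python
import torch.nn as nn

class ClassificationHead(nn.Module):
    """Local module for a classification task: pools the per-point feature values
    into one global descriptor and predicts a single label for the whole cloud."""
    def __init__(self, feat_dim, num_classes):
        super().__init__()
        self.fc = nn.Linear(feat_dim, num_classes)

    def forward(self, features):                      # features: (B, N, feat_dim)
        return self.fc(features.max(dim=1).values)    # (B, num_classes)

class SegmentationHead(nn.Module):
    """Local module for a segmentation task: predicts a label for every point
    directly from its feature value."""
    def __init__(self, feat_dim, num_classes):
        super().__init__()
        self.fc = nn.Linear(feat_dim, num_classes)

    def forward(self, features):                      # features: (B, N, feat_dim)
        return self.fc(features)                       # (B, N, num_classes)
```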
According to the above model training method, the model to be trained is trained according to the structural information of the sharing module, the parameter information of the sharing module, the structural information of the local module preset on the client, and the parameter information of the local module, and the parameters of the sharing module and the parameters of the local module are obtained. When model parameters are returned to the central server, only the parameters of the sharing module need to be returned, which reduces the amount of data communicated between the central server and the clients, and thus reduces the communication time between them and improves model training efficiency.
It should be understood that, although the steps in the flowcharts of the above embodiments are shown in the order indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated otherwise, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily performed at the same moment but may be performed at different moments; these sub-steps or stages are not necessarily performed sequentially, and may be performed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
Fig. 5 is a schematic structural diagram of a model training apparatus in an embodiment of the present application, which is applied to a central server in a model training system, where the model training system includes the central server and at least two clients, and as shown in fig. 5, the model training apparatus includes: a sending module 110, a receiving module 120, and an updating module 130, wherein:
a sending module 110, configured to send structure information and parameter information of a sharing module in a model to be trained to each client; the model to be trained comprises a sharing module and a local module; the sharing module is used for acquiring a characteristic value of point cloud data input into a model to be trained; the local module is used for obtaining a model result of the model to be trained according to the characteristic value;
the receiving module 120 is configured to receive a parameter of the sharing module returned by each client, where the parameter of the sharing module is obtained by training a model to be trained by each client according to locally stored point cloud data;
and the updating module 130 is configured to update the model to be trained according to the parameters of each sharing module.
In one embodiment, the update module 130 is specifically configured to perform weighted average on the parameters of each sharing module to obtain an update parameter of the sharing module; and updating the model to be trained according to the update parameters of the sharing module.
In one embodiment, the model training apparatus further includes a returning module 140, wherein:
the returning module 140 is configured to return the update parameters of the sharing module to each client.
In an embodiment, the sending module 110 is specifically configured to send, to each client in batches according to the network transmission speed and the batch corresponding to each client, the structural information and the parameter information of the shared module in the model to be trained.
In one embodiment, the local modules are arranged on corresponding clients; the parameters of the local module are obtained by training the model to be trained by each client according to the structural information of the sharing module, the parameter information of the sharing module, the structural information of the local module preset on the client, the parameter information of the local module and the point cloud data stored on the client.
In one embodiment, the sending module 110 is further configured to send an update instruction to each client, so that each client updates the local module according to the parameter of the local module.
In one embodiment, the local module is configured to obtain a classification result or a segmentation result of the point cloud data according to the feature value.
The model training device provided by the embodiment of the application can execute the method embodiment, the implementation principle and the technical effect are similar, and the details are not repeated herein.
For a specific definition of the model training device, reference may be made to the above definition of the model training method, which is not described herein again. The modules in the model training device can be wholly or partially realized by software, hardware and a combination thereof. The modules can be embedded in a hardware form or independent of a processor in the electronic device, or can be stored in a memory in the electronic device in a software form, so that the processor can call and execute operations corresponding to the modules.
In one embodiment, an electronic device is provided, the internal structure of which may be as shown in FIG. 6. The electronic device includes a processor, a memory, a network interface, and an input device connected by a system bus. Wherein the processor of the electronic device is configured to provide computing and control capabilities. The memory of the electronic equipment comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The network interface of the electronic device is used for connecting and communicating with an external terminal through a network. The computer program is executed by a processor to implement a model training method.
Those skilled in the art will appreciate that the configuration shown in fig. 6 is a block diagram of only a portion of the configuration associated with the present application, and does not constitute a limitation on the electronic device to which the present application is applied, and a particular electronic device may include more or less components than those shown in the drawings, or may combine certain components, or have a different arrangement of components.
It should be clear that, in the embodiments of the present application, the process of executing the computer program by the processor is consistent with the process of executing the steps in the above method, and specific reference may be made to the description above.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored, which, when being executed by a processor, is capable of implementing the model training method provided by the above-mentioned method embodiments of the present application.
It should be clear that, in the embodiments of the present application, the process of executing the computer program by the processor is consistent with the process of executing the steps in the above method, and specific reference may be made to the description above.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by hardware instructions of a computer program, which can be stored in a non-volatile computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory, among others. Non-volatile memory can include read-only memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDRSDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), Rambus Direct RAM (RDRAM), direct bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM).
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (10)

1. A model training method is applied to a central server in a model training system, wherein the model training system comprises the central server and at least two clients, and the method comprises the following steps:
sending structural information and parameter information of a sharing module in the model to be trained to each client; the model to be trained comprises a sharing module and a local module; the sharing module is used for acquiring a characteristic value of point cloud data input into the model to be trained; the local module is used for obtaining a model result of the model to be trained according to the characteristic values;
receiving parameters of a sharing module returned by each client, wherein the parameters of the sharing module are obtained by training the model to be trained by each client according to locally stored point cloud data;
and updating the model to be trained according to the parameters of each sharing module.
2. The method according to claim 1, wherein the updating the model to be trained according to the parameters of each shared module comprises:
carrying out weighted average on the parameters of each sharing module to obtain the updating parameters of the sharing modules;
and updating the model to be trained according to the update parameters of the sharing module.
3. The method of claim 2, wherein after updating the model to be trained according to the shared module update parameters, the method further comprises:
and returning the update parameters of the shared module to each client.
4. The method according to any one of claims 1 to 3, wherein the sending of the structural information and the parameter information of the shared module in the model to be trained to each client comprises:
and sending the structural information and the parameter information of the shared module in the model to be trained to each client in batches according to the network transmission speed and the batch corresponding to each client.
5. A method according to any of claims 1-3, wherein the local modules are provided on respective clients;
the parameters of the local module are obtained by training the model to be trained by each client according to the structural information of the sharing module, the parameter information of the sharing module, the structural information of the local module preset on the client and the parameter information of the local module, and the point cloud data stored on the client.
6. The method of claim 5, further comprising:
and sending an updating instruction to each client so that each client updates the local module according to the parameters of the local module.
7. The method of claim 5, wherein the local module is configured to obtain a classification result or a segmentation result of the point cloud data according to the feature value.
8. A model training apparatus applied to a central server in a model training system, the model training system comprising the central server and at least two clients, the apparatus comprising:
the sending module is used for sending the structural information and the parameter information of the sharing module in the model to be trained to each client; the model to be trained comprises a sharing module and a local module; the sharing module is used for acquiring a characteristic value of point cloud data input into the model to be trained; the local module is used for obtaining a model result of the model to be trained according to the characteristic values;
the receiving module is used for receiving the parameters of the sharing module returned by each client, and the parameters of the sharing module are obtained by training the model to be trained by each client according to the point cloud data stored locally;
and the updating module is used for updating the model to be trained according to the parameters of the sharing modules.
9. An electronic device, comprising a memory storing a computer program and a processor implementing the method of any of claims 1 to 7 when the processor executes the computer program.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1 to 7.
CN202011637010.8A 2020-12-31 2020-12-31 Model training method, device, equipment and storage medium Pending CN112734033A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011637010.8A CN112734033A (en) 2020-12-31 2020-12-31 Model training method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011637010.8A CN112734033A (en) 2020-12-31 2020-12-31 Model training method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112734033A true CN112734033A (en) 2021-04-30

Family

ID=75608763

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011637010.8A Pending CN112734033A (en) 2020-12-31 2020-12-31 Model training method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112734033A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114581706A (en) * 2022-03-02 2022-06-03 平安科技(深圳)有限公司 Configuration method and device of certificate recognition model, electronic equipment and storage medium
CN114581706B (en) * 2022-03-02 2024-03-08 平安科技(深圳)有限公司 Method and device for configuring certificate recognition model, electronic equipment and storage medium
CN114944988A (en) * 2022-05-12 2022-08-26 重庆金美通信有限责任公司 Communication network training method based on equipment cloud platform
CN117153312A (en) * 2023-10-30 2023-12-01 神州医疗科技股份有限公司 Multi-center clinical test method and system based on model average algorithm

Similar Documents

Publication Publication Date Title
CN109241903B (en) Sample data cleaning method, device, computer equipment and storage medium
CN112734033A (en) Model training method, device, equipment and storage medium
EP4145308A1 (en) Search recommendation model training method, and search result sorting method and device
CN111666763A (en) Network structure construction method and device for multitask scene
CN111615702B (en) Method, device and equipment for extracting structured data from image
CN108229986B (en) Feature construction method in information click prediction, information delivery method and device
CN111414353A (en) Intelligent missing data filling method and device and computer readable storage medium
WO2019232772A1 (en) Systems and methods for content identification
CN110796162A (en) Image recognition method, image recognition model training method, image recognition device, image recognition training device and storage medium
CN110659667A (en) Picture classification model training method and system and computer equipment
CN110321892B (en) Picture screening method and device and electronic equipment
CN112418292A (en) Image quality evaluation method and device, computer equipment and storage medium
CN111914159A (en) Information recommendation method and terminal
WO2019232723A1 (en) Systems and methods for cleaning data
CN113342799B (en) Data correction method and system
CN110532448B (en) Document classification method, device, equipment and storage medium based on neural network
CN112819157A (en) Neural network training method and device and intelligent driving control method and device
CN116993577A (en) Image processing method, device, terminal equipment and storage medium
CN114445692B (en) Image recognition model construction method and device, computer equipment and storage medium
CN117688984A (en) Neural network structure searching method, device and storage medium
CN116128044A (en) Model pruning method, image processing method and related devices
US20210224632A1 (en) Methods, devices, chips, electronic apparatuses, and storage media for processing data
CN115392361A (en) Intelligent sorting method and device, computer equipment and storage medium
CN110929118B (en) Network data processing method, device, apparatus and medium
CN114037772A (en) Training method of image generator, image generation method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination