WO2022217781A1 - Data processing method, apparatus, device, and medium - Google Patents

Data processing method, apparatus, device, and medium

Info

Publication number
WO2022217781A1
Authority
WO
WIPO (PCT)
Prior art keywords
model
local
target
multimedia
global
Prior art date
Application number
PCT/CN2021/108748
Other languages
English (en)
French (fr)
Inventor
吴佳祥
白帆
沈鹏程
李绍欣
李季檩
Original Assignee
Tencent Cloud Computing (Beijing) Co., Ltd. (腾讯云计算(北京)有限责任公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Cloud Computing (Beijing) Co., Ltd.
Publication of WO2022217781A1 publication Critical patent/WO2022217781A1/zh
Priority to US18/128,719 priority Critical patent/US20230237326A1/en

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G06N3/048 Activation functions
    • G06N3/08 Learning methods
    • G06N3/098 Distributed learning, e.g. federated learning
    • G06N20/00 Machine learning
    • G06N20/20 Ensemble learning

Definitions

  • the present application relates to the technical field of artificial intelligence, and in particular, to a data processing method, apparatus, device, and medium.
  • Federated learning has become a new training method that solves the problem of cross-departmental and even cross-platform data islands.
  • Model training can be performed to obtain model parameters without sharing one's own data, that is, joint training can be performed while ensuring data privacy. Because the federated learning process requires a large amount of data, and the data is distributed across different data holders, it is necessary to unite the various data holders for model building. When building a model jointly with each data holder, parameter fusion needs to be performed on the model parameters trained by each data holder.
  • In the related art, each data holder can use its own data to train a local model, and all data holders can periodically upload the local model parameters of their trained local models to the server; the server averages these local model parameters to obtain a global model and delivers the global model to each data holder to continue local training until the training convergence condition is reached.
  • Because the local model parameters of each data holder are simply averaged, the fusion effect between the local model parameters is poor, which degrades the generalization of the federated model.
  • The embodiments of the present application provide a data processing method, apparatus, device, and medium, which can improve the effectiveness of parameter fusion between federated training participants, thereby improving the generalization of the federated recognition model.
  • The embodiments of the present application provide a data processing method, including:
  • acquiring local model parameters corresponding to N local recognition models, where the N local recognition models are obtained by independent training at N clients, each client holds multimedia sample data for training its associated local recognition model, the multimedia sample data contains objects of a target object type, and N is a positive integer greater than 1;
  • acquiring M parameter fusion methods associated with a local model parameter set, and performing parameter fusion on the local model parameter set according to each parameter fusion method to obtain M candidate global models, where the local model parameter set is determined based on the local model parameters corresponding to the N local recognition models, and M is a positive integer;
  • acquiring evaluation indicators of the M candidate global models on a multimedia verification data set, determining a target global model among the M candidate global models according to the evaluation indicators, and transmitting the target global model to the N clients, so that the N clients respectively update the parameters of their associated local recognition models according to the target global model to obtain an object recognition model; the object recognition model is used to recognize objects of the target object type contained in multimedia data.
  • The embodiments of the present application also provide a data processing method, including:
  • uploading, in response to the training times of a target local recognition model meeting a synchronization period, the local model parameters corresponding to the target local recognition model to a service device, so that the service device obtains a target global model based on the local model parameters uploaded by N clients respectively; the local model parameters uploaded by the N clients include the local model parameters corresponding to the target local recognition model, the target global model is determined by evaluation indicators of M candidate global models on a multimedia verification data set, the M candidate global models are determined by a local model parameter set and M parameter fusion methods associated with the parameter set, the local model parameter set is determined based on the local model parameters uploaded by the N clients respectively, N is a positive integer greater than 1, and M is a positive integer;
  • receiving the target global model returned by the service device, updating the parameters of the target local recognition model according to the target global model, and determining the target local recognition model after the parameter update as an object recognition model; the object recognition model is used to recognize objects of the target object type contained in multimedia data.
  • The embodiments of the present application provide a data processing apparatus, including:
  • a parameter acquisition module, configured to acquire the local model parameters corresponding to N local recognition models, where the N local recognition models are obtained by independent training at N clients, each client holds multimedia sample data for training its associated local recognition model, the multimedia sample data contains objects of a target object type, and N is a positive integer greater than 1;
  • a parameter fusion module, configured to acquire M parameter fusion methods associated with a local model parameter set, and perform parameter fusion on the local model parameter set according to each parameter fusion method to obtain M candidate global models, where the local model parameter set is determined based on the local model parameters corresponding to the N local recognition models, and M is a positive integer;
  • a model determination module, configured to acquire evaluation indicators of the M candidate global models on a multimedia verification data set, determine a target global model among the M candidate global models according to the evaluation indicators, and transmit the target global model to the N clients, so that the N clients respectively update the parameters of their associated local recognition models according to the target global model to obtain an object recognition model; the object recognition model is used to recognize objects of the target object type contained in multimedia data.
  • The embodiments of the present application also provide a data processing apparatus, including:
  • a model parameter uploading module, configured to upload the local model parameters corresponding to a target local recognition model to a service device in response to the training times of the target local recognition model meeting a synchronization period, so that the service device obtains a target global model based on the local model parameters uploaded by N clients respectively; the local model parameters uploaded by the N clients include the local model parameters corresponding to the target local recognition model, the target global model is determined by evaluation indicators of M candidate global models on a multimedia verification data set, the M candidate global models are determined by a local model parameter set and M parameter fusion methods associated with the parameter set, the local model parameter set is determined based on the local model parameters uploaded by the N clients respectively, N is a positive integer greater than 1, and M is a positive integer;
  • a target global model receiving module, configured to receive the target global model returned by the service device, update the parameters of the target local recognition model according to the target global model, and determine the target local recognition model after the parameter update as an object recognition model; the object recognition model is used to recognize objects of the target object type contained in multimedia data.
  • An aspect of the embodiments of the present application provides a computer device, including a memory and a processor, where the memory is connected to the processor, the memory is configured to store a computer program, and the processor is configured to invoke the computer program, so that the computer device executes the method provided by the foregoing aspects in the embodiments of the present application.
  • An aspect of the embodiments of the present application provides a non-transitory computer-readable storage medium storing a computer program, where the computer program is adapted to be loaded and executed by a processor, so that a computer device having the processor executes the method provided by the foregoing aspects in the embodiments of the present application.
  • An aspect of the embodiments of the present application provides a computer program product or computer program comprising computer instructions stored in a computer-readable storage medium.
  • The processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, so that the computer device performs the method provided by the foregoing aspects.
  • In the embodiments of the present application, the local model parameters of the local recognition models uploaded by N clients can be acquired, M parameter fusion methods can be obtained for the local model parameter set determined from the N local model parameters, and parameter fusion can be performed on the local model parameter set according to each fusion method to obtain M candidate global models; the optimal target global model is then selected from the M candidate global models according to their evaluation indicators on the multimedia verification data set.
  • In other words, selecting the optimal target global model from the M candidate global models obtained by the M parameter fusion methods can improve the fusion effectiveness between the N local model parameters, and updating the parameters of the local recognition model at each client according to this target global model can improve the generalization of the resulting object recognition model.
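  • Putting the pieces together, one server-side synchronization round as described above can be sketched as follows (a simplified illustration only; function and variable names are assumptions, not from the patent, and parameters are modeled as flat vectors):

```python
def synchronization_round(uploaded_params, weight_combinations, evaluate):
    """One round: fuse N clients' parameters M ways, keep the best candidate.

    uploaded_params: list of N parameter vectors (one per client)
    weight_combinations: M weight vectors, each of length N
    evaluate: scores a fused parameter vector on the verification set (higher is better)
    """
    candidates = []
    for weights in weight_combinations:              # M parameter fusion methods
        fused = [sum(w * p[i] for w, p in zip(weights, uploaded_params))
                 for i in range(len(uploaded_params[0]))]
        candidates.append(fused)
    # target global model = candidate with the best evaluation indicator
    return max(candidates, key=evaluate)
```

  • Plain federated averaging is the special case where the only weight combination is {1/N, ..., 1/N}; the search over M combinations generalizes it.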
  • FIG. 1 is a schematic structural diagram of a network architecture provided by an embodiment of the present application.
  • FIGS. 2a and 2b are schematic diagrams of a federated training scenario of a recognition model provided by an embodiment of the present application.
  • FIG. 3 is a schematic time sequence diagram of a data processing method provided by an embodiment of the present application.
  • FIG. 4 is a schematic diagram of a determination target global model provided by an embodiment of the present application.
  • FIG. 5 is a flowchart of a federated model training method provided by an embodiment of the present application.
  • FIG. 6 is a schematic diagram of a weight combination in a multimedia verification data set provided by an embodiment of the present application.
  • FIG. 7 is a schematic diagram of a user identity authentication scenario provided by an embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of a data processing apparatus provided by an embodiment of the present application.
  • FIG. 9 is a schematic structural diagram of a data processing apparatus provided by an embodiment of the present application.
  • FIG. 10 is a schematic structural diagram of a computer device provided by an embodiment of the present application.
  • FIG. 11 is a schematic structural diagram of a computer device provided by an embodiment of the present application.
  • This application involves artificial intelligence (AI) technology, blockchain technology, and cloud technology.
  • FIG. 1 is a schematic structural diagram of a network architecture provided by an embodiment of the present application.
  • the network architecture includes a server 10d and a user terminal cluster, and the user terminal cluster includes one or more user terminals, and the number of user terminals is not limited here.
  • the user terminal cluster may include a user terminal 10a, a user terminal 10b, a user terminal 10c, and the like.
  • The server 10d may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN (Content Delivery Network), and big data and artificial intelligence platforms.
  • The user terminal 10a, user terminal 10b, user terminal 10c, etc. may include: smart phones, tablet computers, notebook computers, palmtop computers, mobile internet devices (MID), wearable devices (such as smart watches and smart bracelets), and smart terminals with multimedia data recognition functions such as smart TVs.
  • As shown in FIG. 1, each user terminal in the user terminal cluster can be integrated with one or more clients; different clients can hold different multimedia data, and the multimedia data held by different clients can be used to train the recognition model (in this application, the multimedia data held by the participating clients is assumed by default to be of the same type; for example, the multimedia data held by different clients may all be face image data).
  • Because training the recognition model requires a large amount of sample data, and the multimedia data held by different clients may involve private or confidential information that cannot be disclosed, the training of the recognition model can be completed by means of federated training.
  • Each client can use the multimedia data it holds as multimedia sample data for training the recognition model and perform training locally, and different clients can periodically synchronize model parameters (the synchronized model parameters can be referred to as local model parameters). That is, each client can periodically upload the model parameters obtained by training to the server 10d; the server 10d can fuse the local model parameters uploaded by the clients to obtain the target global model for each cycle and then deliver the target global model to each client, and each client can continue training according to the target global model until the convergence condition is reached or the number of training iterations reaches the preset maximum, thereby obtaining the trained object recognition model.
  • The object recognition model can be used to recognize objects of the target object type contained in multimedia data, and this process can improve the performance of the object recognition model.
  • the target object types may include, but are not limited to, object types such as faces, plants, commodities, pedestrians, various animals, and various scenes.
  • FIG. 2 a and FIG. 2 b are schematic diagrams of a federated training scenario of a recognition model provided by an embodiment of the present application.
  • The client 1 shown in FIG. 2a and FIG. 2b may be the client with federated training authority integrated in the user terminal 10a shown in FIG. 1, the client 2 may be the client with federated training authority integrated in the user terminal 10b, ..., and the client N may be the client with federated training authority integrated in the user terminal 10c; the parameter service device may be the server 10d shown in FIG. 1.
  • The number of clients participating in federated training of the recognition model is N, where N is a positive integer greater than 1; for example, N can be 2, 3, and so on.
  • Each client can hold face sample data for training the recognition model, and the face sample data held by each client is independent of the others. For example, to ensure data privacy, client 1 will not send the face sample data it holds to other devices (for example, client 2, client N, or the parameter service device); therefore, each client uses the face sample data it holds to train the recognition model locally (the recognition model trained locally by a client can be referred to as a local recognition model, and the model parameters obtained by that local training can be referred to as local model parameters).
  • Because the face sample data held by each client differs, each client needs to periodically upload its local model parameters to the parameter service device, so that the parameter service device can synchronize the local model parameters trained by the N clients, that is, perform parameter fusion on the local model parameters trained by the N clients to obtain a global model. For example, if every 100 training iterations (also called training times, or training steps) is set as a synchronization period, each client needs to upload local model parameters to the parameter service device every 100 training iterations.
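  • As a minimal illustration of the synchronization-period rule described above (all names are hypothetical, not from the patent), a client-side training loop might upload its parameters whenever the step count reaches a multiple of K and continue from the returned target global model:

```python
SYNC_PERIOD = 100  # K: upload local parameters every K training iterations

def train_with_sync(model_params, num_steps, train_step, upload, download):
    """Run local training; sync with the parameter service every SYNC_PERIOD steps."""
    for step in range(1, num_steps + 1):
        model_params = train_step(model_params)      # one local gradient update
        if step % SYNC_PERIOD == 0:                  # synchronization period reached
            upload(model_params)                     # send local parameters to the server
            model_params = download()                # replace with the target global model
    return model_params
```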
  • For example, when client 1 has locally trained the local recognition model for 100 iterations, client 1 can send model parameter 1 (that is, the local model parameters obtained by client 1 at the 100th training iteration) to the parameter service device; similarly, when client 2 reaches 100 local training iterations, client 2 can send model parameter 2 obtained at the 100th training iteration to the parameter service device; ...; client N can send model parameter N obtained at its 100th training iteration to the parameter service device.
  • After the parameter service device receives the local model parameters obtained at the 100th training iteration from the N clients (that is, model parameter 1, model parameter 2, ..., model parameter N), it can obtain different model parameter fusion schemes (for example, different weight combinations) through a search unit, and fuse the local model parameters sent by the N clients according to each fusion scheme to obtain candidate global models; the candidate global models can also be understood as models produced by applying different fusion schemes to the same local model parameters.
  • the candidate global model can be transmitted to an evaluation unit (Arbiter), and the evaluation unit can be a component integrated in the parameter service device, or an external component with a communication connection relationship with the parameter service device.
  • the evaluation index corresponding to the candidate global model can be obtained through the verification data set, and the evaluation index corresponding to the candidate global model can be returned to the parameter service device.
  • The verification data set may include face sample data carrying label information; after the face sample data in the verification data set is input into a candidate global model, the face recognition result for the face sample data can be output through the candidate global model. The output face recognition result can then be compared with the label information carried by the face sample data: if the face recognition result is the same as the label information, the candidate global model's prediction is correct; if not, the prediction is incorrect.
  • Based on the prediction results of the candidate global model on the verification data set, its evaluation index can be determined; the evaluation index may include, but is not limited to: accuracy (Accuracy, the proportion of correctly predicted samples among all samples), recall (Recall, the proportion of truly "true" samples that the model predicts as "true"), precision (Precision, the proportion of samples predicted as "true" that are indeed true), and the F1 value (an indicator that jointly considers precision and recall).
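  • The four evaluation indicators listed above can be computed directly from binary prediction counts; a minimal sketch (the function name is illustrative, not from the patent):

```python
def evaluation_indicators(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from a binary confusion matrix."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total                     # correct predictions / all samples
    precision = tp / (tp + fp) if tp + fp else 0.0   # among predicted "true", share truly true
    recall = tp / (tp + fn) if tp + fn else 0.0      # among truly true, share predicted "true"
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)            # harmonic mean of precision and recall
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```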
  • Based on the evaluation indexes corresponding to the candidate global models, the parameter service device can select the optimal candidate global model as the target global model for the current synchronization cycle and return the target global model to each client.
  • Each client can then update its local model parameters according to the target global model returned by the parameter service device and continue training.
  • When the number of training iterations of the local recognition model at each client reaches 200, the above operations are repeated to obtain the target global model for the next synchronization period, and training continues until the local recognition model reaches the convergence condition or the number of training iterations reaches the set maximum, at which point the local model parameters are saved.
  • The local recognition model containing the current local model parameters can then be determined as the trained local recognition model, and this trained local recognition model is determined as the object recognition model.
  • The face sample data held by client 1 constitutes a dataset 20a, the face sample data held by client 2 constitutes a dataset 20b, ..., and the face sample data held by client N constitutes a dataset 20c; client 1 can use the dataset 20a to train the local recognition model 20d locally, client 2 uses the dataset 20b to train the local recognition model 20e locally, ..., and client N uses the dataset 20c to train the local recognition model 20f locally.
  • When each client reaches the synchronization period, the local model parameters obtained at the 100th training iteration need to be sent to the parameter service device.
  • The parameter service device can thus obtain the local model parameter set 20g, which can include the local model parameters sent by the above N clients, such as model parameter 1 sent by client 1, model parameter 2 sent by client 2, ..., and model parameter N sent by client N.
  • The parameter service device may obtain, through the search unit, M weight combinations for the local model parameter set 20g (the value of M may be a positive integer, such as 1, 2, 3, ...). Here, the M weight combinations refer to M model parameter fusion modes selected by the search unit for the local model parameter set 20g; each weight combination contains the training influence weight of each local model parameter in the local model parameter set 20g, so each weight combination can be understood as an N-dimensional vector.
  • For example, the M weight combinations may include: {a1, a2, a3, ..., aN}, {b1, b2, b3, ..., bN}, ...; parameter fusion can then be performed on the local model parameter set 20g according to each weight combination to obtain M candidate global models.
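  • Under this reading, fusing with one weight combination reduces to a weighted sum of the N clients' parameter vectors; a minimal sketch (function names are illustrative assumptions, and parameters are modeled as flat vectors):

```python
def fuse_parameters(local_params, weights):
    """Fuse N clients' parameter vectors with one N-dimensional weight combination."""
    assert len(local_params) == len(weights)
    dim = len(local_params[0])
    fused = [0.0] * dim
    for params, w in zip(local_params, weights):
        for i in range(dim):
            fused[i] += w * params[i]        # weighted contribution of this client
    return fused

def candidate_global_models(local_params, weight_combinations):
    """One candidate global model per weight combination (M candidates in total)."""
    return [fuse_parameters(local_params, w) for w in weight_combinations]
```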
  • The evaluation indicators corresponding to the M candidate global models are obtained through the verification data set: the evaluation indicator of candidate global model 1 on the verification data set is evaluation indicator 1, the evaluation indicator of candidate global model 2 is evaluation indicator 2, ..., and the evaluation indicator of candidate global model M is evaluation indicator M.
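  • Selecting the target global model then amounts to taking the candidate with the best of the M evaluation indicators; a hedged sketch (assuming a higher indicator is better, e.g. accuracy; names are illustrative):

```python
def select_target_global_model(candidates, evaluate):
    """Pick the candidate global model with the best evaluation indicator."""
    scores = [evaluate(c) for c in candidates]   # indicator of each candidate on the verification set
    best = max(range(len(candidates)), key=lambda i: scores[i])
    return candidates[best], scores[best]
```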
  • After receiving the target global model, each client can update its local model parameters according to the target global model and continue training based on the updated local model parameters; when the number of training iterations reaches the next synchronization cycle (for example, at the 200th training iteration), the above operations can be repeated until the training process of the local recognition model is completed.
  • The N clients perform federated training on the same recognition model, that is, the initial recognition model used before federated training is the same, and information such as the maximum number of iterations, the synchronization period, and the training method is also the same.
  • the object recognition model obtained by each client may be the same, or there may be some differences, which are related to the differences between the face sample data held by each client.
  • When the parameter service device determines the target global model 50 in the 50th synchronization cycle, it sends the target global model 50 to each client, and each client can update the parameters of its associated local recognition model according to the target global model 50; at this moment, the local recognition model of each client is the same.
  • Each client can then continue to train its local recognition model based on the face sample data it holds; the local model parameters obtained after the 50th synchronization cycle are no longer parameter-fused, so there may be some differences in the object recognition models finally obtained by each client.
  • the embodiments of the present application can improve the effectiveness of parameter fusion between local model parameters during the federated training process, thereby improving the generalization effect of the object recognition model.
  • FIG. 3 is a schematic time sequence diagram of a data processing method provided by an embodiment of the present application.
  • The data processing method can be executed interactively by a client and a service device; the client can be a client integrated in any user terminal in the user terminal cluster shown in FIG. 1, and the service device can be an independent server (for example, the server 10d shown in FIG. 1), a server cluster composed of multiple servers, a user terminal, or the like.
  • the data processing method may include the following steps:
  • Step S101: the client uploads the local model parameters corresponding to the target local recognition model in response to the training times of the target local recognition model meeting the synchronization period.
  • In this embodiment of the present application, the multimedia data held by the N (N is a positive integer greater than 1) clients is of the same type, and because it involves data privacy and data security, the multimedia data held by the N clients cannot be aggregated. If the multimedia data held by the N clients is to be used to train the recognition model, the recognition model can be trained by means of federated training on the premise of ensuring the data security and privacy of each client.
  • the multimedia data held by the N clients may all be used as multimedia sample data.
  • The multimedia sample data may include face image data, user financial data, surveillance video data, user commodity data, and so on; each piece of multimedia sample data may include objects of the target object type, and the target object types may include object types such as human faces, pedestrians, and commodities.
  • The N clients can each use their own multimedia data for independent training locally; the recognition model independently trained by each client can be called a local recognition model, and each client can periodically upload the local model parameters of its locally trained local recognition model.
  • The synchronization period may be set according to actual requirements. For example, the synchronization period may be set to K training times (also referred to as the number of training steps), which means that every K training iterations, the client needs to upload the local model parameters corresponding to its local recognition model to the service device (such as the parameter service device in the embodiment corresponding to FIG. 2a) for synchronization; the value of K is a positive integer greater than 1, and K can be, for example, 100, 400, or 1600.
  • The training process of each of the N clients for its local recognition model is similar, but the multimedia sample data used is different; in the following, any client is selected from the N clients as the target client, and the training process of the local recognition model is described by taking the target client as an example.
  • The target client can obtain the multimedia sample data it holds and input the multimedia sample data into the target local recognition model (the target local recognition model here refers to the local recognition model that the target client trains independently and locally); the target local recognition model can output the object space features corresponding to the multimedia sample data.
• in the process of training the target local recognition model, the target client can read the multimedia sample data held by itself and form the read multimedia sample data into a batch; the multimedia sample data contained in the batch can be input to the target local recognition model.
• the target local recognition model may be a convolutional neural network, in which case the target local recognition model may include network layers such as a convolution layer (Convolution Layer), a nonlinear activation layer (ReLU (Rectified Linear Unit) Layer), and a pooling layer (Pooling Layer).
• convolution calculation is performed through the convolution layer, nonlinear activation function calculation is performed through the nonlinear activation layer, pooling calculation is performed through the pooling layer, and so on, and the object space features corresponding to the multimedia sample data are output; that is, the object space features in the multimedia sample data can be extracted through the target local recognition model.
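As a minimal illustration of the convolution → ReLU → pooling pipeline described above, the following NumPy sketch extracts a small "object space feature" vector from a single-channel image. The layer sizes, kernel, and function names are illustrative assumptions, not the patent's actual network:

```python
import numpy as np

def conv2d(x, kernel):
    """Valid 2-D cross-correlation of a single-channel image with one kernel."""
    kh, kw = kernel.shape
    h = x.shape[0] - kh + 1
    w = x.shape[1] - kw + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Nonlinear activation: element-wise max(x, 0)."""
    return np.maximum(x, 0.0)

def max_pool(x, size=2):
    """Non-overlapping max pooling with a size x size window."""
    h, w = x.shape[0] // size, x.shape[1] // size
    return x[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

def extract_features(image, kernel):
    """conv -> ReLU -> pool, flattened into an object space feature vector."""
    return max_pool(relu(conv2d(image, kernel))).ravel()
```

A real recognition model would stack many such layers with learned kernels; this sketch only shows the order of operations.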
• the multimedia sample data contained in the above batch can be expressed as X_k.
  • the target client can use the gradient descent (GD) method for iterative training.
  • Gradient descent is an iterative learning algorithm.
  • Multimedia sample data can be used to update the local model parameters of the target local recognition model.
• the batch size is a hyperparameter of gradient descent that controls the number of training samples processed before the internal parameters of the target local recognition model are updated.
• the target client can determine the training loss function corresponding to the target local recognition model according to the object space features and the label information corresponding to the multimedia sample data; it can then determine the training gradient of the target local recognition model according to the training loss function, update the parameters of the target local recognition model according to the training gradient and the training learning rate corresponding to the target local recognition model, and count the number of training times of the target local recognition model.
• the target client can calculate the training loss corresponding to the training loss function according to the object space features extracted by the target local recognition model and the label information carried by the multimedia sample data. After the training loss is calculated, the training gradient ∇_θ L(θ) can be calculated according to the chain rule, where L is the training loss function, ∇ represents the gradient computation, and θ represents the local model parameters trained by the target client.
• the training loss function can be a classification function (for example, a softmax function), a CosFace function (a loss function that maximizes between-class differences and minimizes within-class differences through normalization and maximization of cosine decision boundaries), or an ArcFace function (a loss function that optimizes the inter-class distance in the arc cosine space by adding a margin m to the angle, making the cos value smaller in the monotonic interval).
  • the training learning rate corresponding to the target local recognition model can be obtained, and the local model parameters of the target local recognition model are updated according to the training learning rate and the training gradient.
• the update method can be expressed as: θ_{a,r,k+1} = θ_{a,r,k} − η_r ∇L(θ_{a,r,k}; x_{a,r,k}), where θ_{a,r,k} represents the local model parameters of the target local recognition model (that is, the local recognition model independently trained by the a-th client among the N clients, a being a positive integer less than or equal to N) obtained by the k-th training in the r-th synchronization cycle, x_{a,r,k} represents the multimedia sample data used in the k-th training of the target local recognition model in the r-th synchronization cycle, η_r represents the training learning rate of the target local recognition model in the r-th synchronization cycle, ∇L(θ_{a,r,k}; x_{a,r,k}) represents the training gradient of the target local recognition model during the k-th training in the r-th synchronization cycle, and θ_{a,r,k+1} represents the updated local model parameters.
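The per-step update above can be sketched as follows; the toy quadratic loss (whose gradient is the parameter vector itself) is a hypothetical stand-in for the actual recognition loss:

```python
import numpy as np

def sgd_step(theta, grad, lr):
    """One local update: theta_{a,r,k+1} = theta_{a,r,k} - eta_r * grad."""
    return theta - lr * grad

# Hypothetical loss L(theta) = ||theta||^2 / 2, whose gradient is theta itself.
theta = np.array([1.0, -2.0])
for _ in range(100):
    theta = sgd_step(theta, theta, lr=0.1)  # gradient of the toy loss is theta
```

Each of the K steps in a synchronization cycle applies this rule with the gradient of the batch sampled at that step.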
• when the set maximum number of iterations is reached, the training can be terminated, which means that the training process of the target local recognition model is completed.
• after each parameter update, the training count of the target local recognition model can be increased by one; that is, the target client can count the training times of the target local recognition model in real time.
• when the training times of the target local recognition model satisfies the synchronization period, that is, when the training times of the target local recognition model is a multiple of the above synchronization period K, the current local model parameters of the target local recognition model can be sent to the service device.
• for example, if the synchronization period K is 100, then when the training times of the target local recognition model reaches 100, the local model parameters obtained by the 100th training are sent to the service device for synchronization; when the training times reaches 200, the local model parameters obtained by the 200th training are sent to the service device for synchronization; and so on, until the training times of the target local recognition model reaches the set maximum number of iterations, at which point the training of the target local recognition model is terminated.
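The synchronization check described above amounts to testing whether the step counter is a positive multiple of K; a small sketch (the function name is an assumption):

```python
def should_synchronize(train_steps, sync_period):
    """True when the training count reaches a positive multiple of K,
    i.e. when the local model parameters should be sent to the service device."""
    return train_steps > 0 and train_steps % sync_period == 0
```

With K = 100, this fires at steps 100, 200, 300, and so on, until the maximum number of iterations stops training.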
• each of the N clients can perform the above operations, and when the training times of its associated local recognition model meets the synchronization period, the local model parameters of the associated local recognition model can be sent to the service device.
• Step S102: the service device obtains the local model parameters corresponding to the N local recognition models respectively; the N local recognition models are obtained by N clients through independent training, each client holds multimedia sample data for training its associated local recognition model, the multimedia sample data includes objects of the target object type, and N is a positive integer greater than 1.
  • the service device can obtain the local model parameters corresponding to the N local recognition models respectively.
  • Each local recognition model can correspond to one client, and N local recognition models can be independently trained in different clients.
• the multimedia sample data held by each client for training its local recognition model is not disclosed publicly.
  • the synchronization period K (also referred to as the synchronization interval) can be set to a value of hundreds or thousands. It is the local model parameters of the local recognition model that are synchronized between each other, rather than the gradients at each training iteration, which can improve the efficiency of federated training.
• Step S103: the service device obtains M parameter fusion methods associated with the local model parameter set, and performs parameter fusion on the local model parameter set according to each parameter fusion method to obtain M candidate global models; the local model parameter set is determined based on the local model parameters corresponding to the N local recognition models, and M is a positive integer.
• after acquiring the local model parameters sent by the N clients respectively, the service device determines a local model parameter set (such as the local model parameter set 20g in the embodiment corresponding to FIG. 2b) based on the local model parameters uploaded by the N clients.
• the manner of determining the local model parameter set based on the local model parameters corresponding to the N local recognition models includes: taking the set containing the local model parameters corresponding to the N local recognition models as the local model parameter set; or selecting the local model parameters corresponding to L (L is a positive integer less than N) local recognition models from the local model parameters corresponding to the N local recognition models, and taking the set containing the local model parameters corresponding to the L local recognition models as the local model parameter set.
  • the embodiment of the present application does not limit the manner of selecting the local model parameters corresponding to the L local recognition models from the local model parameters corresponding to the N local recognition models respectively.
• for example, the local model parameters corresponding to the L local recognition models may be randomly selected from the local model parameters corresponding to the N local recognition models; or the local model parameters corresponding to the L local recognition models may be selected empirically from the local model parameters corresponding to the N local recognition models.
• the local model parameters respectively sent by the N clients may be expressed as θ_j, j ∈ {1, 2, ..., N}; the above-mentioned local model parameter set may include the local model parameters corresponding to the N clients respectively, or may include the local model parameters corresponding to L clients among the N clients, where L is a positive integer less than N.
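Both ways of forming the local model parameter set (all N parameters, or a random subset of L of them) can be sketched as follows; the helper name and the seeded RNG are assumptions for illustration:

```python
import random

def build_parameter_set(local_params, subset_size=None, seed=0):
    """Use all N local model parameters, or randomly pick L < N of them."""
    if subset_size is None:
        return list(local_params)
    return random.Random(seed).sample(list(local_params), subset_size)
```

Randomly re-selecting the subset at every synchronization adds randomness to the fusion process, as the partial local fusion scheme below describes.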
• after determining the local model parameter set, the service device can obtain M (M is a positive integer) parameter fusion methods associated with the local model parameter set, and then perform parameter fusion on the local model parameters according to each parameter fusion method to obtain M alternative global models; according to each parameter fusion method, one alternative global model can be obtained.
  • the parameter fusion method is a method that can be used under the parameter fusion scheme, and the parameter fusion scheme includes but is not limited to a global weighted average scheme, a voting scheme, an average scheme, and the like. That is to say, the parameter fusion method under the global weighted average scheme can be used to perform parameter fusion on the local model parameter set, or the parameter fusion method under the voting scheme or the parameter fusion method under the averaging scheme can be used to perform parameter fusion on the local model parameter set.
  • the M parameter fusion manners may include parameter fusion manners under one or more parameter fusion schemes.
  • the embodiments of the present application take the parameter fusion of the local model parameter set based on the global weighted average scheme as an example for description. That is to say, the M parameter fusion methods are parameter fusion methods under the global weighted average scheme.
  • a parameter fusion method is implemented based on a weight combination.
  • the service device may search for the optimal weight combination for the local model parameter set in the search space, and perform a weighted average of the optimal weight combination and the local model parameter set to obtain the optimal global model.
  • the service device may acquire M weight combinations associated with the local identification model parameter set, and perform parameter fusion on the local model parameters according to each weight combination to obtain M candidate global models.
• each weight combination includes a training influence weight corresponding to each local model parameter in the local model parameter set.
• for any weight combination i, the process of performing parameter fusion on the local model parameters according to the weight combination i is: performing a weighted average of the training influence weights included in the weight combination i and the local model parameters included in the local model parameter set to obtain the fusion model parameters, and determining the identification model carrying the fusion model parameters as the candidate global model i associated with the weight combination i.
• one weight combination may include the training influence weights corresponding to the N local model parameters; for any weight combination i among the M weight combinations, a weighted average of the training influence weights included in the weight combination i and the N local model parameters included in the local model parameter set can be performed to obtain the fusion model parameters, and the identification model carrying the fusion model parameters can be determined as the candidate global model i associated with the weight combination i, where i is a positive integer less than or equal to M.
• the service device may randomly generate M weight combinations associated with the local recognition model parameter set in each synchronization process, and the M weight combinations may be expressed as W_i, i ∈ {1, 2, ..., M}.
• any one of the M weight combinations (that is, the above weight combination i) may include the training influence weights corresponding to the N local model parameters, and the sum of all training influence weights contained in each weight combination is 1.
• the training influence weights in a weight combination can be expressed as w_a, a ∈ {1, 2, ..., N}, and all training influence weights in the weight combination satisfy the condition Σ_{a=1}^{N} w_a = 1; the N training influence weights included in the weight combination can then be weighted-averaged with the N local model parameters included in the local recognition model parameter set to obtain the fusion model parameters, that is, θ_fused = Σ_{a=1}^{N} w_a θ_a, and the recognition model carrying the fusion model parameters is determined as the candidate global model (i.e., the above alternative global model i). Based on the above operation process, M alternative global models can be obtained.
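The weighted average of training influence weights and local model parameters can be sketched as:

```python
import numpy as np

def fuse_parameters(weights, local_params):
    """Fusion model parameters as the weighted average of the local model
    parameters; the training influence weights must sum to 1."""
    assert np.isclose(sum(weights), 1.0)
    return sum(w * theta for w, theta in zip(weights, local_params))
```

Each candidate global model i is then the recognition model carrying the parameters fused under its weight combination.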
• for any weight combination i, the acquisition process may include: sampling N values within a value range, determining the sum of the absolute values corresponding to the N values as the norm value, and determining the ratios between the N values and the norm value as the weight combination i associated with the local model parameter set.
• for example, N values can be sampled from a uniform distribution on [0, 1] and formed into an N-dimensional vector; each of the N values is then divided by the L1 norm of the vector (that is, the sum of the absolute values corresponding to the N values) to ensure that the sum of the resulting N values is 1, thereby obtaining the weight combination i.
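The sampling-and-L1-normalization step can be sketched as follows (the RNG handling is an assumption):

```python
import numpy as np

def sample_weight_combination(n, rng=None):
    """Draw N values uniformly on [0, 1] and divide by their L1 norm so the
    resulting training influence weights sum to 1."""
    rng = rng or np.random.default_rng()
    values = rng.uniform(0.0, 1.0, size=n)
    return values / np.sum(np.abs(values))
```

Drawing M such combinations independently yields the M candidate fusions evaluated below.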
  • the model parameter fusion scheme can also adopt a partial local fusion scheme.
  • the local model parameters of L clients can be randomly selected for fusion in each synchronization, increasing the randomness in the model parameter fusion process. That is, local model parameters corresponding to L clients can be selected from N clients in each synchronization, and the local model parameter set at this time can include local model parameters corresponding to L clients respectively.
• in this case, M weight combinations associated with the local recognition model parameter set can be obtained, and one weight combination includes the training influence weights corresponding to the L local model parameters; that is, each weight combination may be an L-dimensional vector, and the sum of the L training influence weights included in each weight combination is 1.
  • the acquisition method of the M weight combinations and the parameter fusion process of the L local model parameters are the same as the operations in the case where the above-mentioned local model parameter set includes local model parameters corresponding to N clients respectively, and will not be repeated here.
• Step S104: the service device obtains the evaluation indexes of the M candidate global models in the multimedia verification data set respectively, and determines the target global model among the M candidate global models according to the evaluation indexes.
• the service device may obtain, in the evaluation unit, a multimedia verification data set including positive sample pairs and negative sample pairs, wherein positive sample pairs refer to multimedia sample data pairs that contain the same object (for example, sample pairs of the same person), and negative sample pairs refer to multimedia sample data pairs that contain different objects (for example, sample pairs of different persons).
• for the candidate global model i (any candidate global model among the M candidate global models), the positive sample pairs can be input to the candidate global model i, and the candidate global model i can output the first object prediction results of the positive sample pairs; the negative sample pairs can be input to the candidate global model i, and the candidate global model i can output the second object prediction results of the negative sample pairs; the evaluation index of the candidate global model i in the multimedia verification data set can then be determined according to the first object prediction results and the second object prediction results.
• the service device can sequentially input each sample pair (positive sample pair and negative sample pair) included in the multimedia verification data set into the candidate global model i, the candidate global model i can output the prediction result for each sample pair, and the evaluation index of the candidate global model i in the multimedia verification data set can be determined according to the prediction results.
  • the above method can be used to determine the evaluation index of each candidate global model in the multimedia verification data set, so that the target global model is determined from the M alternative global models according to the evaluation index.
  • the candidate global model corresponding to the largest evaluation index is determined as the target global model.
• the determination process of the evaluation index may include: the service device counts, according to the first object prediction results, the first correct prediction number of the candidate global model i on the positive sample pairs; counts, according to the second object prediction results, the second correct prediction number of the candidate global model i on the negative sample pairs; determines the sum of the first correct prediction number and the second correct prediction number as the total number of correctly predicted sample pairs of the candidate global model i in the multimedia verification data set; obtains the total number of sample pairs corresponding to the multimedia verification data set; and determines the evaluation index of the candidate global model i in the multimedia verification data set according to the ratio between the total number of correctly predicted sample pairs and the total number of sample pairs.
• specifically, the first correct prediction number on the positive sample pairs can be counted (the number of positive sample pairs correctly predicted as positive, also called true positives, TP), along with the first wrong prediction number on the positive sample pairs (the number of positive sample pairs wrongly predicted as negative, also called false negatives, FN), the second correct prediction number on the negative sample pairs (the number of negative sample pairs correctly predicted as negative, also called true negatives, TN), and the second wrong prediction number on the negative sample pairs (the number of negative sample pairs wrongly predicted as positive, also called false positives, FP).
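In terms of these counts, the evaluation index is (TP + TN) divided by the total number of sample pairs. A sketch with assumed boolean predictions, where True means the pair is predicted to contain the same object:

```python
def pair_accuracy(pos_predictions, neg_predictions):
    """(TP + TN) / total: true positives are positive pairs predicted 'same',
    true negatives are negative pairs predicted 'different'."""
    tp = sum(1 for p in pos_predictions if p)
    tn = sum(1 for p in neg_predictions if not p)
    return (tp + tn) / (len(pos_predictions) + len(neg_predictions))
```

The candidate global model with the largest such index over the verification set becomes the target global model.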
  • the above-mentioned P multimedia verification data sets may include multimedia verification data sets j, where P is a positive integer, and j is a positive integer less than or equal to P.
• the determination process of the evaluation index may include: the service device can determine the ratio between the total number of correctly predicted sample pairs (TP + TN) of the candidate global model i in the multimedia verification data set j and the total number of sample pairs (TP + TN + FP + FN) corresponding to the multimedia verification data set j as the prediction accuracy rate of the candidate global model i in the multimedia verification data set j; in this way, the prediction accuracy rates of the candidate global model i in the P multimedia verification data sets can be obtained, and the average accuracy rate mean corresponding to the P prediction accuracy rates and the standard deviation value std corresponding to the P prediction accuracy rates can be counted; the evaluation index of the candidate global model i over the multimedia verification data sets is then determined according to the average accuracy rate and the standard deviation.
• the initial evaluation index (such as the above prediction accuracy rate) in the multimedia verification data set j can be expressed as S_j, j ∈ {1, 2, ..., P}; the initial evaluation index S_j is then normalized to eliminate the effect of discrepancies between the multimedia verification data sets.
• for example, the initial evaluation index S_j can be locally normalized (Local Norm) according to formula (2), where S'_j represents the evaluation index after local normalization, the formula involves an activation function, and the hyperparameter in formula (2) can be set according to actual needs during the training process.
• alternatively, moving normalization processing can be performed on the initial evaluation index S_j according to formulas (3) to (5), where S'_j represents the evaluation index after moving normalization, the formulas involve a normalization parameter, μ represents the moving average, v represents the moving variance, μ_last represents the moving average corresponding to the latest training, v_last represents the moving variance corresponding to the latest training, and the hyperparameter used here can be the same as or different from the hyperparameter in the above formula (2).
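The patent's exact formulas (3) to (5) are not reproduced here; the sketch below is one plausible exponential-moving-average reading of them, with the smoothing factor alpha and the epsilon guard chosen as assumptions rather than the patent's values:

```python
import numpy as np

def moving_normalize(s, state, alpha=0.9, eps=1e-8):
    """Keep exponential moving estimates (mu, v) of the raw index's mean and
    variance, then standardise the raw index against them. alpha and eps are
    assumed hyperparameters, not the patent's."""
    mu_last, v_last = state
    mu = alpha * mu_last + (1 - alpha) * s
    v = alpha * v_last + (1 - alpha) * (s - mu) ** 2
    s_norm = (s - mu) / np.sqrt(v + eps)
    return s_norm, (mu, v)
```

Standardising the per-dataset indices this way keeps one verification data set with an unusually easy or hard score range from dominating the comparison of weight combinations.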
• the weight combination corresponding to the largest evaluation index among the M evaluation indexes can be obtained as the optimal weight combination, and the optimal weight combination can be applied to the local recognition model corresponding to each client according to formulas (6) and (7), where S'_{j,a} represents the evaluation index of the local recognition model corresponding to the a-th client in the multimedia verification data set j, a ∈ {1, 2, ..., N}, the training influence weight of the a-th local recognition model (the local recognition model corresponding to the a-th client) is determined based on the above optimal weight combination, w_last represents the weight of the a-th local recognition model corresponding to the latest training, and the hyperparameter involved can be set according to actual needs during the training process.
• optionally, the evaluation index can also be the recall rate (TPR) at a fixed false acceptance rate (FAR): the service device can obtain the false acceptance rate of the candidate global model i (included in the M candidate global models) in the multimedia verification data set, and determine a similarity threshold among the similarities corresponding to the negative sample pairs, where the similarity threshold is determined by the number of negative sample pairs and the false acceptance rate; the similarities corresponding to the positive sample pairs can then be obtained, the first sample pairs whose similarities are greater than the similarity threshold can be selected from the positive sample pairs, and the ratio between the number of first sample pairs and the number of positive sample pairs can be determined as the evaluation index corresponding to the candidate global model i.
• the process of determining the false acceptance rate may include: obtaining the wrong prediction number of the candidate global model i (included in the M candidate global models) on the negative sample pairs (that is, the above-mentioned second wrong prediction number, FP), and determining the ratio between the wrong prediction number and the number of negative sample pairs (the sum of the second wrong prediction number and the second correct prediction number, i.e., FP + TN) as the false acceptance rate of the candidate global model i in the multimedia verification data set; the similarity here can be, for example, a cosine similarity.
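A sketch of the TPR-at-fixed-FAR index: the threshold is taken as the k-th highest negative-pair similarity so that roughly `far` of the negative pairs would be falsely accepted; the strict `>` comparison and the handling of k are assumptions about the exact procedure:

```python
import numpy as np

def tpr_at_far(pos_sims, neg_sims, far):
    """Recall (TPR) on positive pairs at the similarity threshold for which
    about `far` of the negative pairs would be falsely accepted."""
    neg_sorted = np.sort(np.asarray(neg_sims))[::-1]   # descending similarities
    k = max(int(far * len(neg_sorted)), 1)             # allowed false accepts
    threshold = neg_sorted[k - 1]                      # k-th highest negative
    return float(np.mean(np.asarray(pos_sims) > threshold))
```

This index rewards models that keep positive-pair similarities above the score of even the most confusing negative pairs.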
• the service device can determine the candidate global model corresponding to the largest evaluation index as the target global model according to the evaluation indexes of the M candidate global models in the multimedia verification data set, and determine the weight combination corresponding to the target global model as the optimal weight combination among the M weight combinations. In different synchronization processes, the optimal weight combinations can differ; for example, the optimal weight combinations determined in the first synchronization process and the second synchronization process may be different.
• FIG. 4 is a schematic diagram of determining a target global model provided by an embodiment of the present application.
• the determination process of the optimal weight combination is described by taking the accuracy rate as the evaluation index as an example.
• the shades of color in the area 30a indicate the accuracy of the candidate global models corresponding to the 12800th training iteration on the multimedia verification data set; the histogram area 30b explains the correspondence between the colors in the area 30a and the accuracy values, and each position in the area 30a represents one weight combination. Similarly, the shades of color in the area 30c indicate the accuracy of the candidate global models corresponding to the 256000th training iteration on the multimedia verification data set; the histogram area 30d explains the correspondence between the colors in the area 30c and the accuracy values, and each position in the area 30c also represents one weight combination.
  • Regions 30a and 30c indicate that in different training stages, the weight combination of the best results on the multimedia validation dataset is in different positions and changes dynamically. As shown in Figure 4, the optimal weight combination at the 12800th training iteration is: optimal weight combination 1, and the optimal weight combination at the 256000th training iteration is: optimal weight combination 2.
• the service device may determine, among the candidate global models according to the evaluation index, the target global model corresponding to the r-th synchronization cycle, and obtain the historical global model corresponding to the (r-1)-th synchronization cycle, wherein the historical global model is generated based on the local model parameters uploaded by the N clients in the (r-1)-th synchronization cycle.
• the training learning rate of the N local recognition models in the r-th synchronization cycle can be obtained, the model parameter difference between the target global model and the historical global model can be obtained, and the ratio between the model parameter difference and the training learning rate can be determined as the federated momentum; the federated momentum is sent to the N clients together with the target global model to instruct the N clients to update the parameters of their associated local recognition models, and the federated momentum is used to indicate the training direction of each of the N local recognition models in its own client.
• the training learning rate in the r-th synchronization cycle can be expressed as η_r, and the federated momentum at this time can be expressed as m_r = Δθ_r / η_r, where m_r represents the federated momentum corresponding to the r-th synchronization cycle and Δθ_r represents the above model parameter difference between the target global model of the r-th synchronization cycle and the historical global model of the (r-1)-th synchronization cycle.
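The momentum computation above (model parameter difference divided by the training learning rate) can be sketched as:

```python
import numpy as np

def federated_momentum(target_global, historical_global, lr):
    """m_r = (target global params of cycle r
              - historical global params of cycle r-1) / eta_r."""
    return (target_global - historical_global) / lr
```

Dividing by the learning rate expresses the cycle-to-cycle movement of the global model in the same units as a per-step gradient.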
• the training learning rate η_r can be a fixed value, or can change adaptively; for example, the training learning rate can be set to 0.1 when a client trains on all the multimedia sample data it holds for the first time, and can be set to 0.02 in a subsequent full pass over all the held multimedia sample data, and so on.
• the federated momentum at the first synchronization cycle can be expressed in the same form, with the initial global model parameters taken as the historical global model.
• Step S105: the service device returns the target global model.
• the service device can return the above target global model to the N clients; after receiving the target global model returned by the service device, any client can update the parameters of its local recognition model according to the target global model and continue training based on the updated local model parameters.
• when the service device generates the federated momentum, the service device can return the target global model and the federated momentum to the N clients together; after receiving the target global model and the federated momentum returned by the service device, any client can update the parameters of its local recognition model based on the target global model and the federated momentum, and continue training based on the updated local model parameters.
• Step S106: the client receives the target global model, updates the parameters of the target local recognition model according to the target global model, and determines the target local recognition model after the parameter update as the object recognition model; the object recognition model is used to recognize objects of the target object type in multimedia data.
• after receiving the target global model returned by the service device, the target client can update the parameters of its own target local recognition model according to the target global model, and continue to perform local training on the target local recognition model until the target local recognition model satisfies a training termination condition (for example, a training convergence condition or a maximum number of iterations); at this point the training process of the target local recognition model is completed and the trained object recognition model is obtained, and the object recognition model is used to recognize objects of the target object type in multimedia data.
• After the target client receives the target global model and the federated momentum returned by the service device, the training gradient in the target client's local training can be combined with the federated momentum to update the parameters of the target local recognition model. In the corresponding update formula, θi denotes the local model parameter of the target local recognition model, θ′i denotes the local model parameter obtained by target client i after the update in the r-th synchronization period, g denotes the training gradient of the r-th synchronization period, and K denotes the number of training iterations in one synchronization period.
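As an illustrative sketch only (the patent's concrete update formula appears in a figure and is not reproduced above), a client-side update that combines the local training gradient g with the received federated momentum m before stepping the parameters θ might look like the following; the additive combination and the variable names are assumptions:

```python
def momentum_local_update(theta, grad, fed_momentum, lr):
    """Hypothetical client update: combine the local training gradient
    with the federated momentum received from the service device, then
    take one gradient step with learning rate lr."""
    return [t - lr * (g + m) for t, g, m in zip(theta, grad, fed_momentum)]


# Example: parameters [1.0, 2.0], gradient [0.1, 0.1], momentum [0.1, -0.1]
updated = momentum_local_update([1.0, 2.0], [0.1, 0.1], [0.1, -0.1], lr=0.5)
```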
  • FIG. 5 is a flowchart of a federated model training method provided by an embodiment of the present application. As shown in FIG. 5 , taking the multimedia sample data as a face image as an example, the implementation process of the federated model training method is described in detail. The federated model training method can be implemented through the following steps S11 to S22.
• Step S11: the client reads the local training data, that is, the face sample data (the above-mentioned multimedia sample data) held by the client. Step S12: obtain the initialized face recognition model (the above-mentioned local recognition model) and train it locally on the face sample data. Step S13: calculate the training loss and training gradient of the face recognition model, and count the number of training iterations of the face recognition model in real time.
• Step S14: the client determines whether the number of training iterations satisfies the synchronization period (the above-mentioned synchronization period K). If it does, step S15 is performed: the client uploads its model parameters (the current model parameters of the face recognition model, that is, the above-mentioned local model parameters) to the service device. If it does not, step S21 is performed: determine whether the face recognition model satisfies the training termination condition. If the face recognition model satisfies the training termination condition, its training is complete; otherwise, step S22 is performed to update the local model parameters of the face recognition model.
• Each client uploads its local model parameters to the service device when its training count in local training satisfies the synchronization period. After receiving the parameters uploaded by all clients, the service device continues with step S16 to generate models corresponding to different fusion schemes in the search space: for example, M weight combinations are acquired, and each weight combination is used to compute a weighted average of the received local model parameters, yielding M candidate global models. For the specific implementation of the M weight combinations, refer to step S103 above.
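The weighted-average fusion of step S16 can be sketched as follows; representing each model's parameters as a flat list of floats is a simplification for illustration:

```python
def fuse_local_params(local_params, weights):
    """Weighted average of the N clients' local model parameters under one
    weight combination -> parameters of one candidate global model."""
    return [sum(w * p[k] for w, p in zip(weights, local_params))
            for k in range(len(local_params[0]))]


def candidate_global_models(local_params, weight_combinations):
    """One candidate global model per weight combination (M in total)."""
    return [fuse_local_params(local_params, wc) for wc in weight_combinations]


# Example: two clients, equal weights -> element-wise mean of their parameters
fused = fuse_local_params([[1.0, 1.0], [3.0, 5.0]], [0.5, 0.5])
```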
• Steps S17 and S18: the evaluation unit reads the verification set data (the above-mentioned multimedia verification data set) and obtains the evaluation index of each candidate global model on the verification set (that is, calculates the verification set index). Steps S19 and S20 are then performed.
• Steps S19 and S20: the fusion scheme corresponding to the optimal evaluation index is selected for fusion to obtain the target global model (that is, the candidate global model corresponding to the optimal evaluation index), and the target global model is delivered to each client.
• If the client receives the target global model and the face recognition model does not satisfy the training termination condition, step S22 is performed to update the parameters of the face recognition model according to the target global model. Steps S12 to S22 can be repeated until the face recognition model satisfies the training termination condition, at which point the training of the face recognition model is complete.
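Putting steps S15 to S20 together, one synchronization round on the service device might be sketched as below; the `evaluate` callback standing in for the verification-set scoring of steps S17 to S19 is an assumption for illustration:

```python
def federated_round(uploaded_params, weight_combinations, evaluate):
    """One synchronization round (steps S15-S20): fuse the uploaded local
    parameters under every weight combination, score each candidate on the
    verification set via `evaluate`, and deliver the best one to every
    client as the target global model."""
    dim = len(uploaded_params[0])
    # S16: one candidate global model per weight combination
    candidates = [
        [sum(w * p[k] for w, p in zip(wc, uploaded_params)) for k in range(dim)]
        for wc in weight_combinations
    ]
    # S17-S19: evaluate each candidate and keep the best-scoring one
    scores = [evaluate(c) for c in candidates]
    best = candidates[scores.index(max(scores))]
    # S20: every client resumes local training from the target global model
    return [list(best) for _ in uploaded_params]


# Toy example: the evaluator prefers parameters close to 2.0
result = federated_round([[1.0], [3.0]],
                         [[1.0, 0.0], [0.5, 0.5]],
                         lambda c: -abs(c[0] - 2.0))
```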
• FIG. 6 is a schematic diagram of weight combinations on a multimedia verification data set provided by an embodiment of the present application. The multimedia sample data used in federated training are held by the above-mentioned client 1, client 2, and client 3, and the ordinate may represent the training influence weights corresponding to different epochs. As training progresses, the training influence weights corresponding to the local model parameters trained by the three clients become more concentrated; that is, the later the training stage, the closer the training influence weights corresponding to the clients are to one another.
• When the multimedia data includes a face image to be recognized and the target object type includes the face type, the target client can obtain the face image to be recognized and input it into the object recognition model; the object recognition model outputs the face spatial feature corresponding to the face image to be recognized, and the face classification result corresponding to the face image can then be determined according to the face spatial feature. The face classification result is used to characterize the identity verification result of the face-type object contained in the face image to be recognized.
• The object recognition model can be used in any face recognition scenario, such as user identity authentication, missing-person tracing, or business processing. For example, in an identity authentication scenario, the object recognition model can identify the face image provided by a user to confirm the authenticity of the user's identity; in a missing-person tracing scenario, photos taken of a missing person before their disappearance can be recognized and compared against existing household registration photos to obtain candidate matches for the missing person.
  • FIG. 7 is a schematic diagram of a user identity authentication scenario provided by an embodiment of the present application.
• Suppose user A wants to handle a service in client 1 installed on the user terminal 40a; user A first needs to complete identity verification in client 1.
  • a face verification frame 40b may be displayed in the client terminal 1.
• User A can align his face with the face verification frame 40b in the user terminal 40a and perform the corresponding actions as instructed (for example, shaking the head, nodding, or blinking), and the user terminal 40a can collect the face image in the face verification frame 40b in real time.
• Client 1 can obtain the certificate image 40e uploaded in advance by user A from the existing face image database, and compare the certificate image 40e with the face recognition result output by the object recognition model 40d. If the certificate image 40e and the face recognition result match, it can be determined that user A's identity verification has passed, and the verification-passed result is returned to client 1 on the user terminal 40a; if they do not match, it can be determined that user A's identity verification has failed, the verification-failed result is returned to client 1 on the user terminal 40a, and user A is reminded to perform identity verification again.
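As an illustrative sketch of the comparison between the certificate image's feature and the model's output feature (the cosine metric and the 0.6 threshold are assumptions, not values from this embodiment):

```python
import math

def verify_identity(live_feature, certificate_feature, threshold=0.6):
    """Compare the face feature from the live capture with the feature of
    the pre-uploaded certificate image by cosine similarity; verification
    passes when the similarity clears the (assumed) threshold."""
    dot = sum(a * b for a, b in zip(live_feature, certificate_feature))
    norm = math.hypot(*live_feature) * math.hypot(*certificate_feature)
    cos = dot / norm
    return cos >= threshold, cos


# Identical features -> similarity 1.0 -> verification passes
passed, similarity = verify_identity([1.0, 0.0], [1.0, 0.0])
```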
• In the embodiments of the present application, the local model parameters of the local recognition models uploaded by the N clients can be obtained; M parameter fusion methods (for example, M weight combinations) associated with the local model parameter set determined from the N local model parameters can be obtained, where M is a positive integer; and the local model parameter set can be fused through each parameter fusion method (for example, each weight combination) to obtain M candidate global models. The evaluation indexes of the M candidate global models on the multimedia verification data set are then obtained, and the optimal target global model is selected from among the M candidate global models. Selecting the optimal target global model from the M candidate global models obtained through the M parameter fusion methods can improve the fusion effectiveness of the N local model parameters.
• After receiving the target global model, each client can update the parameters of its own local recognition model accordingly, which can improve the generalization effect of the object recognition model.
• In addition, the embodiments of the present application can be applied to cross-department, cross-enterprise, and even cross-region business data, and can improve the recognition effect of the object recognition model while ensuring data privacy and security.
  • FIG. 8 is a schematic structural diagram of a data processing apparatus provided by an embodiment of the present application.
  • the above-mentioned data processing apparatus may be a service device (for example, a server 10d ) applied to the above-mentioned embodiment corresponding to FIG. 1 .
  • the data processing device 1 may include: a parameter acquisition module 11, a parameter fusion module 12, and a model determination module 13;
• the parameter acquisition module 11 is used to acquire the local model parameters corresponding to the N local recognition models respectively; the N local recognition models are obtained through independent training by N clients, each client holds the multimedia sample data used for training its associated local recognition model, the multimedia sample data contains objects of the target object type, and N is a positive integer greater than 1;
• the parameter fusion module 12 is used to obtain M parameter fusion methods associated with the local model parameter set, and respectively perform parameter fusion on the local model parameter set according to each parameter fusion method to obtain M candidate global models; the local model parameter set is determined based on the local model parameters corresponding to the N local recognition models, and M is a positive integer;
  • the model determination module 13 is used to obtain the evaluation indicators of the M candidate global models in the multimedia verification data set respectively, determine the target global model in the M candidate global models according to the evaluation indicators, and transmit the target global model to N clients , so that the N clients respectively update the parameters of the associated local recognition model according to the target global model to obtain the object recognition model; the object recognition model is used to recognize the object of the target object type contained in the multimedia data.
  • a parameter fusion method is implemented based on a weight combination.
• the parameter fusion module 12 is used to obtain M weight combinations associated with the local model parameter set, and to perform parameter fusion on the local model parameter set according to each weight combination to obtain M candidate global models; each weight combination includes the training influence weight corresponding to each local model parameter in the local model parameter set.
  • the parameter fusion module 12 may include: a weight combination acquisition unit 121, a weighted average unit 122;
  • a weight combination acquisition unit 121 configured to acquire M weight combinations associated with the local model parameter set; the M weight combinations include weight combination i, where i is a positive integer less than or equal to M;
• the weighted average unit 122 is used to compute the weighted average of the training influence weights included in weight combination i and the local model parameters included in the local model parameter set to obtain the fusion model parameters, and to determine the recognition model carrying the fusion model parameters as the candidate global model corresponding to weight combination i.
  • the model determination module 13 is specifically configured to: among the M candidate global models, determine the candidate global model corresponding to the largest evaluation index as the target global model.
• For the specific function implementation of the weight combination obtaining unit 121 and the weighted averaging unit 122, reference may be made to step S103 in the embodiment corresponding to FIG. 3 above, which will not be repeated here.
  • the local model parameter set includes local model parameters corresponding to N local identification models respectively
  • the weight combination obtaining unit 121 may include: a norm value determination subunit 1211, and a weight determination subunit 1212;
  • the norm value determination subunit 1211 is used to sample N values within the target value range, and determine the sum of the absolute values corresponding to the N values as the norm value;
  • the weight determination subunit 1212 is configured to determine the ratio between the N values and the norm value as the weight combination i associated with the local model parameter set.
• For the specific function implementation of the norm value determination subunit 1211 and the weight determination subunit 1212, reference may be made to step S103 in the embodiment corresponding to FIG. 3 above, which will not be repeated here.
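A minimal sketch of the norm value determination subunit 1211 and the weight determination subunit 1212, assuming the N values are sampled uniformly from the target value range:

```python
import random

def sample_weight_combination(n_clients, low=0.0, high=1.0, seed=None):
    """Sample N values within the target value range, take the sum of
    their absolute values as the norm value, and return the ratios of
    the values to the norm value as one weight combination."""
    rng = random.Random(seed)
    values = [rng.uniform(low, high) for _ in range(n_clients)]
    norm = sum(abs(v) for v in values)   # subunit 1211: L1 norm value
    return [v / norm for v in values]    # subunit 1212: value/norm ratios


# Example: one weight combination for N = 3 clients
weights = sample_weight_combination(3, seed=7)
```

Because each weight is divided by the sum of absolute values, the absolute weights in every combination sum to one, so the weighted average of the parameters stays in a sensible range.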
  • the model determination module 13 may include: a verification data set acquisition unit 131, a first prediction unit 132, a second prediction unit 133, and a first evaluation index determination unit 134;
• the verification data set acquisition unit 131 is used to acquire a multimedia verification data set containing positive sample pairs and negative sample pairs; a positive sample pair refers to a multimedia sample data pair containing the same object, and a negative sample pair refers to a multimedia sample data pair containing different objects;
• the first prediction unit 132 is used to input the positive sample pair into candidate global model i among the M candidate global models, and output the first object prediction result of the positive sample pair through candidate global model i; i is a positive integer less than or equal to M;
  • the second prediction unit 133 is configured to input the negative sample pair to the candidate global model i, and output the second object prediction result of the negative sample pair through the candidate global model i;
  • the first evaluation index determining unit 134 is configured to determine the evaluation index of the candidate global model i in the multimedia verification data set according to the first object prediction result and the second object prediction result.
  • the first evaluation index determination unit 134 may include: a prediction result statistics subunit 1341, a correct sample pair total amount statistics subunit 1342, and an evaluation index calculation subunit 1343;
• the prediction result statistics subunit 1341 is used to count, according to the first object prediction result, the first number of correct predictions of candidate global model i on the positive sample pairs, and to count, according to the second object prediction result, the second number of correct predictions of candidate global model i on the negative sample pairs;
• the correct sample pair total count subunit 1342 is used to determine the sum of the first number of correct predictions and the second number of correct predictions as the total number of correctly predicted sample pairs of candidate global model i in the multimedia verification data set;
• the evaluation index calculation subunit 1343 is used to obtain the total number of sample pairs in the multimedia verification data set, and to determine the evaluation index of candidate global model i in the multimedia verification data set according to the ratio between the total number of correctly predicted sample pairs and the total number of sample pairs.
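The accuracy-style evaluation index of subunits 1341 to 1343 can be sketched as follows, assuming each pair is scored by a similarity value compared against a decision threshold (the thresholded-similarity detail is an assumption):

```python
def pair_accuracy(pos_sims, neg_sims, threshold):
    """Fraction of correctly predicted pairs: a positive pair is correct
    when its similarity clears the threshold (same object predicted),
    a negative pair is correct when it stays below the threshold."""
    correct = sum(s >= threshold for s in pos_sims) \
            + sum(s < threshold for s in neg_sims)
    return correct / (len(pos_sims) + len(neg_sims))


# 2 of 3 positive pairs and 1 of 2 negative pairs correct -> 3/5
index = pair_accuracy([0.9, 0.8, 0.4], [0.3, 0.6], threshold=0.5)
```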
• When the number of multimedia verification data sets is P (P being a positive integer) and the P multimedia verification data sets include multimedia verification data set j (j being a positive integer less than or equal to P), the evaluation index calculation subunit 1343 is specifically configured to: determine the ratio between the total number of correctly predicted sample pairs of candidate global model i in multimedia verification data set j and the total number of sample pairs in multimedia verification data set j as the prediction accuracy of candidate global model i on multimedia verification data set j; obtain the prediction accuracies of candidate global model i on the P multimedia verification data sets, and calculate the average accuracy and the standard deviation of the P prediction accuracies; and determine the evaluation index of candidate global model i on the P multimedia verification data sets according to the average accuracy and the standard deviation.
• For the specific function implementation of the verification data set obtaining unit 131, the first prediction unit 132, the second prediction unit 133, and the first evaluation index determining unit 134, reference may be made to step S104 in the embodiment corresponding to FIG. 3 above, which will not be repeated here.
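For the P-data-set case, the mean and standard deviation computation might be sketched as below; how the two statistics combine into a single index is not specified above, so the mean-minus-standard-deviation rule used here is purely an assumed example:

```python
from statistics import mean, pstdev

def multi_set_index(per_set_accuracies):
    """Average accuracy minus the standard deviation over the P sets.
    The mean-minus-std combination is an assumed scheme that favours
    candidates that are both accurate and stable across data sets."""
    return mean(per_set_accuracies) - pstdev(per_set_accuracies)


# A model scoring 0.9 on every set beats one averaging 0.9 unevenly
stable = multi_set_index([0.9, 0.9, 0.9])    # 0.9 - 0.0
uneven = multi_set_index([0.8, 1.0])         # 0.9 - 0.1
```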
  • the model determination module 13 may include: a verification data set acquisition unit 131, a similarity threshold determination unit 135, and a second evaluation index determination unit 136;
  • the verification data set acquisition unit 131 is used to acquire a multimedia verification data set containing positive sample pairs and negative sample pairs; positive sample pairs refer to multimedia sample data pairs containing the same object, and negative sample pairs refer to multimedia sample data containing different objects right;
• the similarity threshold determination unit 135 is used to obtain the false acceptance rate of candidate global model i, included in the M candidate global models, in the multimedia verification data set, and to determine a similarity threshold among the similarities corresponding to the negative sample pairs; the similarity threshold is determined by the number of negative sample pairs and the false acceptance rate, and i is a positive integer less than or equal to M;
• the second evaluation index determining unit 136 is configured to obtain the similarities corresponding to the positive sample pairs, obtain from the positive sample pairs the first sample pairs whose similarity is greater than the similarity threshold, and determine the ratio between the number of first sample pairs and the number of positive sample pairs as the evaluation index of candidate global model i in the multimedia verification data set.
  • the similarity threshold determination unit 135 may include: a subunit 1351 for obtaining the number of false predictions, and a subunit 1352 for determining the false acceptance rate;
• the false prediction number obtaining subunit 1351 is used to obtain the number of false predictions of candidate global model i, included in the M candidate global models, on the negative sample pairs;
  • the false acceptance rate determination subunit 1352 is configured to determine the ratio between the number of false predictions and the number of negative sample pairs as the false acceptance rate of the candidate global model i in the multimedia verification data set.
• For the specific function implementation of the verification data set acquisition unit 131, the similarity threshold determination unit 135, and the second evaluation index determination unit 136, reference may be made to step S104 in the embodiment corresponding to FIG. 3 above, which will not be repeated here.
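The similarity-threshold scheme of units 135 and 136 resembles the common TAR@FAR evaluation: pick the threshold from the negative-pair similarities so that the false acceptance rate is bounded, then measure the pass rate of positive pairs. A sketch under that reading (the exact index arithmetic here is an assumption):

```python
def tar_at_far(pos_sims, neg_sims, far):
    """Choose the similarity threshold from the negative-pair similarities
    so that at most len(neg_sims) * far negative pairs are accepted, then
    report the fraction of positive pairs above that threshold."""
    neg_desc = sorted(neg_sims, reverse=True)
    k = int(len(neg_desc) * far)                     # tolerated false accepts
    threshold = neg_desc[min(k, len(neg_desc) - 1)]  # threshold from negatives
    passed = sum(s > threshold for s in pos_sims)    # first sample pairs
    return passed / len(pos_sims), threshold


# 100 negative pairs, FAR 1% -> threshold is the 2nd-highest negative score
negatives = [i / 100 for i in range(100)]
tar, thr = tar_at_far([0.99, 0.97, 0.985, 0.5], negatives, far=0.01)
```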
• When the first prediction unit 132, the second prediction unit 133, and the first evaluation index determination unit 134 perform their corresponding operations, the similarity threshold determination unit 135 and the second evaluation index determination unit 136 suspend their operations; when the similarity threshold determination unit 135 and the second evaluation index determination unit 136 perform their corresponding operations, the first prediction unit 132, the second prediction unit 133, and the first evaluation index determination unit 134 suspend their corresponding operations.
  • the target global model is generated based on N local model parameters corresponding to the rth synchronization period, where r is a positive integer;
  • the data processing device 1 may further include: a historical global model acquisition module 14, a model parameter difference acquisition module 15, and a federated momentum determination module 16;
• the historical global model acquisition module 14 is used to acquire the historical global model corresponding to the (r-1)-th synchronization period; the historical global model is generated based on the local model parameters respectively uploaded by the N clients in the (r-1)-th synchronization period;
  • the model parameter difference obtaining module 15 is used to obtain the training learning rate of the N local recognition models in the rth synchronization period; obtain the model parameter difference between the target global model and the historical global model;
• the federated momentum determination module 16 is used to determine the ratio between the model parameter difference and the training learning rate as the federated momentum, and to send the federated momentum to the N clients; together with the target global model, the federated momentum is used to instruct the N clients to update the parameters of their associated local recognition models, and the federated momentum indicates the training directions of the N local recognition models in their respective clients.
• For the specific functional implementation of the historical global model acquisition module 14, the model parameter difference acquisition module 15, and the federated momentum determination module 16, reference may be made to step S104 in the embodiment corresponding to FIG. 3 above, which will not be repeated here.
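The federated momentum described for module 16, that is, the model parameter difference between the target global model of period r and the historical global model of period r-1 divided by the training learning rate, can be sketched directly (flat parameter lists are an illustrative simplification):

```python
def federated_momentum(target_global, historical_global, lr):
    """Per-parameter difference between two consecutive synchronization
    periods' global models, divided by the training learning rate; the
    sign of each entry indicates the training direction."""
    return [(t - h) / lr for t, h in zip(target_global, historical_global)]


# Example: lr = 0.1, so a +0.2 parameter shift yields momentum +2.0
m = federated_momentum([1.0, 0.5], [0.8, 0.9], lr=0.1)
```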
  • FIG. 9 is a schematic structural diagram of a data processing apparatus provided by an embodiment of the present application.
  • the data processing apparatus may be a client applied to any user terminal in the user terminal cluster shown in FIG. 1 , and the client may be a computer program (including program code) in a computer device.
  • the data processing device 2 may include: a model parameter uploading module 21, and a target global model receiving module 22;
• the model parameter uploading module 21 is used to upload the local model parameters corresponding to the target local recognition model to the service device in response to the training count of the target local recognition model satisfying the synchronization period, so that the service device obtains the target global model based on the local model parameters respectively uploaded by the N clients; the local model parameters uploaded by the N clients include the local model parameters corresponding to the target local recognition model; the target global model is determined from the evaluation indexes of the M candidate global models in the multimedia verification data set; the M candidate global models are determined by the local model parameter set and the M parameter fusion methods associated with it; the local model parameter set is determined based on the local model parameters respectively uploaded by the N clients; N is a positive integer greater than 1, and M is a positive integer;
• the target global model receiving module 22 is used to receive the target global model returned by the service device, update the parameters of the target local recognition model according to the target global model, and determine the parameter-updated target local recognition model as the object recognition model; the object recognition model is used to identify objects of the target object type contained in the multimedia data.
• For the specific function implementation of the model parameter uploading module 21 and the target global model receiving module 22, reference may be made to step S101 and steps S105-S106 in the embodiment corresponding to FIG. 3 above, which will not be repeated here.
  • the data processing apparatus 2 may further include: a feature extraction module 23, a loss function determination module 24, and a training times statistics module 25;
  • the feature extraction module 23 is used to obtain multimedia sample data, input the multimedia sample data to the target local recognition model, and output the object space feature corresponding to the multimedia sample data through the target local recognition model;
  • the loss function determination module 24 is used for determining the training loss function corresponding to the target local recognition model according to the label information corresponding to the object space feature and the multimedia sample data;
• the training times statistics module 25 is used to determine the training gradient of the target local recognition model according to the training loss function, update the parameters of the target local recognition model according to the training gradient and the training learning rate corresponding to the target local recognition model, and count the number of training iterations of the target local recognition model.
• For the specific function implementation of the feature extraction module 23, the loss function determination module 24, and the training times statistics module 25, reference may be made to step S101 in the embodiment corresponding to FIG. 3 above, which will not be repeated here.
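A toy sketch of modules 23 to 25 working together: forward pass, loss, one gradient step, and a running count of training iterations. The linear model and squared loss are illustrative stand-ins for the actual recognition network:

```python
class LocalTrainer:
    """Minimal stand-in for modules 23-25 on one client."""

    def __init__(self, theta, lr):
        self.theta = list(theta)
        self.lr = lr
        self.train_count = 0            # module 25: iteration counter

    def step(self, x, label):
        # module 23: forward pass producing an output "feature"
        pred = sum(t * xi for t, xi in zip(self.theta, x))
        # module 24: squared loss against the label information
        loss = (pred - label) ** 2
        # module 25: gradient step and training-count statistics
        grad = [2.0 * (pred - label) * xi for xi in x]
        self.theta = [t - self.lr * g for t, g in zip(self.theta, grad)]
        self.train_count += 1
        return loss


trainer = LocalTrainer([0.0], lr=0.1)
first_loss = trainer.step([1.0], 1.0)
```

When `train_count` reaches the synchronization period K, the client would upload `trainer.theta` as its local model parameters.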
  • the multimedia data includes a face image to be recognized, and the target object type includes a face type
  • the data processing device 2 may further include: a face feature extraction module 26 and a face classification module 27;
  • the face feature extraction module 26 is used for acquiring the face image to be recognized, inputting the face image to be recognized into the object recognition model, and outputting the spatial feature of the face corresponding to the face image to be recognized through the object recognition model;
• the face classification module 27 is used to determine the face classification result corresponding to the face image to be recognized according to the face spatial feature; the face classification result is used to characterize the identity verification result of the face-type object contained in the face image to be recognized.
• For the specific function implementation of the face feature extraction module 26 and the face classification module 27, reference may be made to step S106 in the embodiment corresponding to FIG. 3 above, which will not be repeated here.
  • FIG. 10 is a schematic structural diagram of a computer device provided by an embodiment of the present application.
  • the computer device 1000 may include: a processor 1001 , a network interface 1004 and a memory 1005 , in addition, the above-mentioned computer device 1000 may further include: a user interface 1003 , and at least one communication bus 1002 .
  • the communication bus 1002 is used to realize the connection and communication between these components.
  • the user interface 1003 may include a display screen (Display) and a keyboard (Keyboard), and the optional user interface 1003 may also include a standard wired interface and a wireless interface.
  • the network interface 1004 may include a standard wired interface or a wireless interface (eg, a WI-FI interface).
  • the memory 1005 may be high-speed RAM memory or non-volatile memory, such as at least one disk memory.
• the memory 1005 may also be at least one storage device located remotely from the aforementioned processor 1001.
  • the memory 1005 as a computer-readable storage medium may include an operating system, a network communication module, a user interface module, and a device control application program.
  • the network interface 1004 can provide a network communication function;
  • the user interface 1003 is mainly used to provide an input interface for the user; and
• the processor 1001 can be used to call the device control application stored in the memory 1005 to:
• acquire the local model parameters corresponding to the N local recognition models respectively; the N local recognition models are obtained through independent training by N clients, each client holds the multimedia sample data used for training its associated local recognition model, the multimedia sample data contains objects of the target object type, and N is a positive integer greater than 1;
• acquire M parameter fusion methods associated with the local model parameter set, and perform parameter fusion on the local model parameter set according to each parameter fusion method to obtain M candidate global models; the local model parameter set is determined based on the local model parameters corresponding to the N local recognition models, and M is a positive integer;
• obtain the evaluation indexes of the M candidate global models in the multimedia verification data set, determine the target global model among the M candidate global models according to the evaluation indexes, and transmit the target global model to the N clients, so that the N clients respectively update the parameters of their associated local recognition models according to the target global model to obtain the object recognition model; the object recognition model is used to recognize objects of the target object type contained in the multimedia data.
  • the computer device 1000 described in the embodiment of the present application can execute the description of the data processing method in the embodiment corresponding to FIG. 3 above, and can also execute the description of the data processing apparatus 1 in the embodiment corresponding to FIG. 8 above. It is not repeated here. In addition, the description of the beneficial effects of using the same method will not be repeated.
  • FIG. 11 is a schematic structural diagram of a computer device provided by an embodiment of the present application.
  • the computer device 2000 may include: a processor 2001 , a network interface 2004 and a memory 2005 , in addition, the above-mentioned computer device 2000 may further include: a user interface 2003 , and at least one communication bus 2002 .
  • the communication bus 2002 is used to realize the connection and communication between these components.
  • the user interface 2003 may include a display screen (Display) and a keyboard (Keyboard); optionally, the user interface 2003 may also include a standard wired interface and a wireless interface.
  • the network interface 2004 may include a standard wired interface or a wireless interface (eg, a WI-FI interface).
  • the memory 2005 may be high-speed RAM memory or non-volatile memory, such as at least one disk memory.
  • the memory 2005 may also be at least one storage device located remotely from the aforementioned processor 2001 .
  • the memory 2005 as a computer-readable storage medium may include an operating system, a network communication module, a user interface module, and a device control application program.
  • the network interface 2004 can provide network communication functions;
  • the user interface 2003 is mainly used to provide an input interface for the user; and
  • the processor 2001 can be used to invoke the device control application stored in the memory 2005 to implement the following:
  • in response to the training count of the target local recognition model satisfying the synchronization period, uploading the local model parameters corresponding to the target local recognition model to the service device, so that the service device obtains the target global model based on the local model parameters respectively uploaded by N clients; the local model parameters respectively uploaded by the N clients include the local model parameters corresponding to the target local recognition model, the target global model is determined by the evaluation indicators of M candidate global models in the multimedia verification dataset, the M candidate global models are determined by M parameter fusion methods associated with a local model parameter set and by the local model parameter set, the local model parameter set is determined based on the local model parameters respectively uploaded by the N clients, N is a positive integer greater than 1, and M is a positive integer;
  • receiving the target global model returned by the service device, updating the parameters of the target local recognition model according to the target global model, and determining the parameter-updated target local recognition model as the object recognition model; the object recognition model is used to recognize objects of the target object type contained in multimedia data.
  • the computer device 2000 described in the embodiment of the present application can execute the description of the data processing method in the embodiment corresponding to FIG. 3 above, and can also execute the description of the data processing apparatus 2 in the embodiment corresponding to FIG. 9 above, which is not repeated here. In addition, the description of the beneficial effects of using the same method is not repeated.
  • the embodiment of the present application further provides a non-transitory computer-readable storage medium, which stores the computer program executed by the above-mentioned data processing apparatus 1 and the computer program executed by the data processing apparatus 2; the computer programs include program instructions, and when a processor executes the program instructions, it can execute the description of the data processing method in the embodiment corresponding to FIG. 3 above, which is not repeated here.
  • the description of the beneficial effects of using the same method is likewise not repeated.
  • program instructions may be deployed for execution on one computing device, or on multiple computing devices located at one site, or alternatively, distributed across multiple sites and interconnected by a communications network.
  • multiple computing devices distributed in multiple locations and interconnected by a communication network can form a blockchain system.
  • the embodiments of the present application further provide a computer program product or computer program.
  • the computer program product or computer program may include computer instructions, and the computer instructions may be stored in a computer-readable storage medium.
  • the processor of the computer device reads the computer instructions from the computer-readable storage medium and executes them, so that the computer device executes the description of the data processing method in the embodiment corresponding to FIG. 3 above, which is not repeated here. In addition, the description of the beneficial effects of using the same method is not repeated.
  • for technical details not disclosed in the computer program product or computer program embodiments involved in the present application, please refer to the description of the method embodiments of the present application.
  • the storage medium may be a magnetic disk, an optical disk, a read-only memory (Read-Only Memory, ROM), or a random access memory (Random Access Memory, RAM) or the like.


Abstract

Provided are a data processing method, apparatus, device, and medium. The method includes: obtaining local model parameters respectively corresponding to N local recognition models, N being the number of clients (S102); obtaining M parameter fusion methods associated with a local model parameter set, and performing parameter fusion on the local model parameter set according to each parameter fusion method to obtain M candidate global models, M being a positive integer (S103); obtaining the evaluation indicators of the M candidate global models in a multimedia verification dataset, determining a target global model among the M candidate global models according to the evaluation indicators (S104), and transmitting the target global model to the N clients (S105), so that the N clients respectively update the parameters of their associated local recognition models according to the target global model to obtain an object recognition model (S106). The fusion effectiveness between federated training models can be improved, thereby improving the generalization effect of the federated recognition model.

Description

Data processing method, apparatus, device, and medium
This application claims priority to Chinese Patent Application No. 202110407285.0, filed on April 15, 2021 and entitled "Data processing method, apparatus, device, and medium", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of artificial intelligence technology, and in particular to a data processing method, apparatus, device, and medium.
Background
Federated learning has become a new training paradigm for resolving data silos across departments and even across platforms: model training can be performed to obtain model parameters without any party handing over its own data, i.e., joint training while preserving data privacy. Since the federated learning process requires a large amount of data to support it, and the data is distributed among different data holders, the data holders need to be united to build the model. When jointly building a model with the data holders, the model parameters trained by each data holder must be fused.
In the related art, each data holder can train a local model on its own data, and all data holders can periodically upload the local model parameters of their trained local models to a server; the server averages the local model parameters to obtain an overall model and sends the overall model back to each data holder for continued local training, until the training convergence condition is met. However, simply averaging the local model parameters of the data holders yields poor fusion effectiveness between the local model parameters, which degrades the generalization effect of the federated model.
Summary
Embodiments of the present application provide a data processing method, apparatus, device, and medium, which can improve the fusion effectiveness between federated training models and thereby improve the generalization effect of the federated recognition model.
In one aspect, an embodiment of the present application provides a data processing method, including:
obtaining local model parameters respectively corresponding to N local recognition models; the N local recognition models are obtained by independent training by N clients, each client includes multimedia sample data for training its associated local recognition model, the multimedia sample data contains objects of a target object type, and N is a positive integer greater than 1;
obtaining M parameter fusion methods associated with a local model parameter set, and performing parameter fusion on the local model parameter set according to each parameter fusion method to obtain M candidate global models; the local model parameter set is determined based on the local model parameters respectively corresponding to the N local recognition models, and M is a positive integer;
obtaining the evaluation indicators of the M candidate global models in a multimedia verification dataset, determining a target global model among the M candidate global models according to the evaluation indicators, and transmitting the target global model to the N clients, so that the N clients respectively update the parameters of their associated local recognition models according to the target global model to obtain an object recognition model; the object recognition model is used to recognize objects of the target object type contained in multimedia data.
In one aspect, an embodiment of the present application provides a data processing method, including:
in response to the training count of a target local recognition model satisfying a synchronization period, uploading the local model parameters corresponding to the target local recognition model to a service device, so that the service device obtains a target global model based on the local model parameters respectively uploaded by N clients; the local model parameters respectively uploaded by the N clients include the local model parameters corresponding to the target local recognition model, the target global model is determined by the evaluation indicators of M candidate global models in a multimedia verification dataset, the M candidate global models are determined by M parameter fusion methods associated with a local model parameter set and by the local model parameter set, the local model parameter set is determined based on the local model parameters respectively uploaded by the N clients, N is a positive integer greater than 1, and M is a positive integer;
receiving the target global model returned by the service device, updating the parameters of the target local recognition model according to the target global model, and determining the parameter-updated target local recognition model as an object recognition model; the object recognition model is used to recognize objects of the target object type contained in multimedia data.
In one aspect, an embodiment of the present application provides a data processing apparatus, including:
a parameter obtaining module, configured to obtain local model parameters respectively corresponding to N local recognition models; the N local recognition models are obtained by independent training by N clients, each client includes multimedia sample data for training its associated local recognition model, the multimedia sample data contains objects of a target object type, and N is a positive integer greater than 1;
a parameter fusion module, configured to obtain M parameter fusion methods associated with a local model parameter set, and perform parameter fusion on the local model parameter set according to each parameter fusion method to obtain M candidate global models; the local model parameter set is determined based on the local model parameters respectively corresponding to the N local recognition models, and M is a positive integer;
a model determination module, configured to obtain the evaluation indicators of the M candidate global models in a multimedia verification dataset, determine a target global model among the M candidate global models according to the evaluation indicators, and transmit the target global model to the N clients, so that the N clients respectively update the parameters of their associated local recognition models according to the target global model to obtain an object recognition model; the object recognition model is used to recognize objects of the target object type contained in multimedia data.
In one aspect, an embodiment of the present application provides a data processing apparatus, including:
a model parameter uploading module, configured to upload, in response to the training count of a target local recognition model satisfying a synchronization period, the local model parameters corresponding to the target local recognition model to a service device, so that the service device obtains a target global model based on the local model parameters respectively uploaded by N clients; the local model parameters respectively uploaded by the N clients include the local model parameters corresponding to the target local recognition model, the target global model is determined by the evaluation indicators of M candidate global models in a multimedia verification dataset, the M candidate global models are determined by M parameter fusion methods associated with a local model parameter set and by the local model parameter set, the local model parameter set is determined based on the local model parameters respectively uploaded by the N clients, N is a positive integer greater than 1, and M is a positive integer;
a target global model receiving module, configured to receive the target global model returned by the service device, update the parameters of the target local recognition model according to the target global model, and determine the parameter-updated target local recognition model as an object recognition model; the object recognition model is used to recognize objects of the target object type contained in multimedia data.
In one aspect, an embodiment of the present application provides a computer device, including a memory and a processor connected to the memory; the memory is used to store a computer program, and the processor is used to invoke the computer program so that the computer device executes the method provided in the above aspect of the embodiments of the present application.
In one aspect, an embodiment of the present application provides a non-transitory computer-readable storage medium storing a computer program adapted to be loaded and executed by a processor, so that a computer device having the processor executes the method provided in the above aspect of the embodiments of the present application.
According to one aspect of the present application, a computer program product or computer program is provided, which includes computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, so that the computer device executes the method provided in the above aspect.
In the embodiments of the present application, the local model parameters of the local recognition models respectively uploaded by the N clients can be obtained, along with M parameter fusion methods for the local model parameter set determined based on the N sets of local model parameters. Parameter fusion is performed on the local model parameter set with each fusion method to obtain M candidate global models, and the optimal target global model is then selected among the M candidate global models according to their evaluation indicators in the multimedia verification dataset. Selecting the optimal target global model from the M candidate global models obtained by the M parameter fusion methods can improve the fusion effectiveness between the N sets of local model parameters, and the N clients continue to update their respective local recognition models based on the target global model, which can improve the generalization effect of the object recognition model.
Brief Description of the Drawings
FIG. 1 is a schematic structural diagram of a network architecture provided by an embodiment of the present application;
FIG. 2a and FIG. 2b are schematic diagrams of a federated training scenario of a recognition model provided by an embodiment of the present application;
FIG. 3 is a schematic sequence diagram of a data processing method provided by an embodiment of the present application;
FIG. 4 is a schematic diagram of determining a target global model provided by an embodiment of the present application;
FIG. 5 is a flowchart of a federated model training method provided by an embodiment of the present application;
FIG. 6 is a schematic diagram of weight combinations in a multimedia verification dataset provided by an embodiment of the present application;
FIG. 7 is a schematic diagram of a user identity authentication scenario provided by an embodiment of the present application;
FIG. 8 is a schematic structural diagram of a data processing apparatus provided by an embodiment of the present application;
FIG. 9 is a schematic structural diagram of a data processing apparatus provided by an embodiment of the present application;
FIG. 10 is a schematic structural diagram of a computer device provided by an embodiment of the present application;
FIG. 11 is a schematic structural diagram of a computer device provided by an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the accompanying drawings in the embodiments of the present application. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the protection scope of the present application.
The present application relates to artificial intelligence (AI) technology, blockchain technology, and cloud technology.
Please refer to FIG. 1, which is a schematic structural diagram of a network architecture provided by an embodiment of the present application. As shown in FIG. 1, the network architecture includes a server 10d and a user terminal cluster, which includes one or more user terminals; the number of user terminals is not limited here. As shown in FIG. 1, the user terminal cluster may include a user terminal 10a, a user terminal 10b, a user terminal 10c, and so on. The server 10d may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN (Content Delivery Network), and big data and artificial intelligence platforms.
The user terminals 10a, 10b, 10c, etc. may each include a smart terminal with a multimedia data recognition function, such as a smartphone, tablet computer, notebook computer, palmtop computer, mobile internet device (MID), wearable device (e.g., smart watch, smart bracelet), or smart TV. As shown in FIG. 1, the user terminals 10a, 10b, 10c, etc. can each establish a network connection with the server 10d, so that each user terminal can exchange data with the server 10d through the network connection.
As shown in FIG. 1, each user terminal in the user terminal cluster can be integrated with one or more clients; e.g., the same user terminal can be integrated with different clients, different clients can hold different multimedia data, and the multimedia data held by the different clients can all be used to train a recognition model (here it is assumed by default that the multimedia data held by the clients involved in the present application are of the same type, e.g., the multimedia data held by different clients are all face image data). Since training a recognition model requires a large amount of sample data, and the multimedia data held by different clients may involve private or confidential information, i.e., the multimedia data held by each client cannot be disclosed, federated training can be used to complete the training of the recognition model.
In other words, each client can use the multimedia data it holds as multimedia sample data for training the recognition model and train on that data alone; different clients can periodically synchronize model parameters (the synchronized model parameters can be called local model parameters). That is, each client can periodically upload its trained model parameters to the server 10d, which can collect the local model parameters respectively uploaded by the clients and fuse them to obtain the target global model of each period, and then send the target global model down to the clients; each client can continue training its own local model parameters according to the target global model, until the convergence condition is met or the number of training iterations reaches a preset maximum, obtaining the trained object recognition model. The object recognition model can be used to recognize objects of the target object type contained in multimedia data, and the generalization recognition effect of the object recognition model can be improved. The target object type may include but is not limited to object types such as faces, plants, goods, pedestrians, various animals, and various scenes.
Please refer to FIG. 2a and FIG. 2b together, which are schematic diagrams of a federated training scenario of a recognition model provided by an embodiment of the present application. Client 1 shown in FIG. 2a and FIG. 2b may be the client with federated recognition-model training permission integrated in the user terminal 10a shown in FIG. 1 above; client 2 may be the corresponding client integrated in the user terminal 10b shown in FIG. 1 above; client N may be the corresponding client integrated in the user terminal 10c shown in FIG. 1 above; and the parameter service device may be the server 10d shown in FIG. 1 above. As shown in FIG. 2a, the number of clients participating in federated training of the recognition model is N, where N may be a positive integer greater than 1, e.g., N may be 2, 3, ….
The following takes face sample data as the multimedia sample data for illustration. Each client can hold face sample data for training the recognition model, and the face sample data held by each client is mutually independent; e.g., to ensure data privacy, client 1 does not give the face sample data it holds to other devices (e.g., client 2, client N, the parameter service device, etc.). Therefore each client can use its own face sample data to perform local training of the recognition model (the recognition model trained locally by a client can be called a local recognition model, and the model parameters obtained by a client's local training can be called local model parameters).
Since the face sample data used by the clients differ, each client needs to periodically upload its local model parameters to the parameter service device, so that the parameter service device synchronizes the local model parameters trained by the N clients, i.e., fuses the local model parameters trained by the N clients to obtain a global model. For example, if every 100 training iterations (also called training counts, or training steps) constitute one synchronization period, each client must upload its local model parameters to the parameter service device once every 100 training iterations.
As shown in FIG. 2a, when client 1's local training iteration count for its local recognition model reaches 100, client 1 can send model parameters 1 obtained at the 100th training iteration (i.e., client 1's local model parameters at the 100th training iteration) to the parameter service device; similarly, when client 2's local training iteration count reaches 100, client 2 can send model parameters 2 obtained at the 100th training iteration to the parameter service device; and client N can send its model parameters N obtained at the 100th training iteration to the parameter service device. After receiving the local model parameters obtained at the 100th training iteration sent by the N clients (including model parameters 1, model parameters 2, …, model parameters N), the parameter service device can obtain different model parameter fusion schemes (e.g., different weight combinations) through its search unit, fuse the local model parameters sent by the N clients according to these fusion schemes, and obtain candidate global models; the candidate global models here can be understood as models adopting different parameter fusion schemes.
Further, the candidate global models can be transmitted to an evaluation unit (Arbiter), which can be a component integrated inside the parameter service device or an external component with a communication connection to the parameter service device. In the evaluation unit, the evaluation indicators corresponding to the candidate global models can be obtained on a verification dataset and returned to the parameter service device. The verification dataset can include face sample data carrying label information; after the face sample data in the verification dataset is input into a candidate global model, the candidate global model outputs face recognition results for the face sample data. The output face recognition result can then be compared with the label information carried by the face sample data: if they are the same, the candidate global model's prediction is correct; if they differ, the prediction is wrong.
The evaluation indicator of a candidate global model on the verification dataset can be determined from its output face recognition results; the evaluation indicator can include but is not limited to: accuracy (the proportion of sample data correctly predicted by the model among all sample data), recall (the proportion of true sample data predicted as "true" by the model among all true sample data), precision (the proportion of genuinely true sample data among the sample data the model predicts as "true"), and the F1 score (an indicator designed to jointly weigh precision and recall).
The parameter service device can select, according to the evaluation indicators corresponding to the candidate global models, the optimal candidate global model as the target global model of the current synchronization period, and return the target global model to the clients; each client can update its local model parameters according to the returned target global model and continue training. When each client's local training iteration count reaches 200, the above operations need to be repeated to obtain the target global model of the next synchronization period for continued training, until the training of the local recognition model meets the convergence condition or the training iteration count reaches the set maximum; the local model parameters at that time are saved, the local recognition model containing the current local model parameters can be determined as a trained local recognition model, and the present application can determine the trained local recognition model as the object recognition model.
As shown in FIG. 2b, the face sample data held by client 1 constitutes dataset 20a, the face sample data held by client 2 constitutes dataset 20b, …, and the face sample data held by client N constitutes dataset 20c; client 1 can use dataset 20a locally to train local recognition model 20d, client 2 uses dataset 20b locally to train local recognition model 20e, …, and client N uses dataset 20c locally to train local recognition model 20f. When each client's training iteration count for its associated local recognition model reaches 100, each must send the local model parameters obtained at the 100th training iteration to the parameter service device. The parameter service device can thus obtain local model parameter set 20g, which can include the local model parameters respectively sent by the above N clients, e.g., model parameters 1 sent by client 1, model parameters 2 sent by client 2, …, model parameters N sent by client N.
Further, the parameter service device can obtain, through the search unit, M weight combinations for local model parameter set 20g (M may be a positive integer, e.g., 1, 2, 3, …); the M weight combinations here can refer to M model parameter fusion methods selected by the search unit for local model parameter set 20g. Each weight combination can include the training influence weights respectively corresponding to the local model parameters contained in set 20g; i.e., each weight combination can be understood as an N-dimensional vector, e.g., the M weight combinations can include {a1, a2, a3, …, aN}, {b1, b2, b3, …, bN}. Parameter fusion can then be performed on set 20g according to each weight combination to obtain M candidate global models.
In the evaluation unit, the evaluation indicators respectively corresponding to the M candidate global models are obtained on the verification dataset; e.g., candidate global model 1's indicator on the verification dataset is evaluation indicator 1, candidate global model 2's is evaluation indicator 2, …, candidate global model M's is evaluation indicator M. The candidate global model with the largest evaluation indicator is selected among the M candidate global models as the target global model of the current synchronization period, and the target global model is sent down to the N clients; after receiving it, any client can update its local model parameters according to the target global model and continue training based on the updated parameters; when the training iteration count reaches the next synchronization period (e.g., the 200th training iteration), the above operations can be repeated until the training process of the local recognition model is completed.
It can be understood that the N clients federatedly train the same recognition model, i.e., the initialized recognition model used before federated training is the same, and the maximum iteration count, synchronization period, training method, and other settings in the federated training process are all the same. After federated training is completed, the object recognition models obtained by the clients may be the same or somewhat different, which relates to the differences between the face sample data held by each client. For example, after the parameter service device determines target global model 50 in the 50th synchronization period and sends it down to the clients, each client can update its associated local recognition model's parameters according to target global model 50, at which point the local recognition models of the clients are identical. Each client can continue training its local recognition model on the face sample data it holds; if convergence is reached, or the training iteration count reaches the set maximum, before the 51st synchronization period, training is completed and the object recognition model is obtained. That is, the local model parameters obtained after the 50th synchronization period are not fused, so the object recognition models finally obtained by the clients may differ somewhat. Embodiments of the present application can improve the parameter fusion effectiveness between local model parameters during federated training, thereby improving the generalization effect of the object recognition model.
Please refer to FIG. 3, which is a schematic sequence diagram of a data processing method provided by an embodiment of the present application. Understandably, the data processing method can be executed interactively by a client and a service device; the client can be a client integrated in any user terminal of the user terminal cluster shown in FIG. 1 above, and the service device can be an independent server (e.g., the server 10d shown in FIG. 1 above), a server cluster composed of multiple servers, a user terminal, etc. As shown in FIG. 3, the data processing method can include the following steps.
Step S101: in response to the training count of the target local recognition model satisfying the synchronization period, the client uploads the local model parameters corresponding to the target local recognition model.
When the multimedia data held by N clients (N may be a positive integer greater than 1) is of the same type and involves data privacy and data security, the multimedia data held by the N clients cannot be aggregated. If the recognition model needs to be trained using the multimedia data held by the N clients, it can be trained by federated training on the premise of ensuring the data security and privacy of each client; in federated training, the multimedia data held by the N clients can all serve as multimedia sample data. Illustratively, the multimedia sample data can include face image data, user financial data, surveillance video data, user goods data, etc.; each piece of multimedia sample data can contain an object of the target object type, which can include object types such as faces, pedestrians, and goods.
Each of the N clients can train independently locally using the multimedia data it holds; the recognition model independently trained by each client can be called a local recognition model, and each client can periodically upload the parameters of its independently trained local recognition model for synchronization. In embodiments of the present application, the synchronization period can be set according to actual requirements; e.g., the synchronization period can be set to K training counts (also called training steps), meaning that every K training steps of the client's local recognition model, the local model parameters corresponding to the local recognition model must be uploaded to the service device (such as the parameter service device in the embodiment corresponding to FIG. 2a above) for synchronization, where K is a positive integer greater than 1, e.g., K may be 100, 400, 1600, etc. The training process of each of the N clients for its local recognition model is similar, differing only in the multimedia sample data used; below, any one of the N clients is selected as the target client, and the training process of the local recognition model is described using the target client as an example.
The target client can obtain the multimedia sample data it holds and input it into the target local recognition model (here, the target local recognition model refers to the local recognition model independently trained locally by the target client), which can output the object spatial features corresponding to the multimedia sample data. Illustratively, during training of the target local recognition model, the target client can read the multimedia sample data it holds and assemble the read multimedia sample data into a batch; the multimedia sample data contained in the batch can be input into the target local recognition model. The target local recognition model can be a convolutional neural network, in which case it can include network layers such as a convolution layer, a non-linear activation layer (ReLU (Rectified Linear Unit) layer), and a pooling layer.
After the multimedia sample data is input into the target local recognition model, operations such as convolution computation (performed through the convolution layer), non-linear activation function computation (performed through the non-linear activation layer), and pooling computation (performed through the pooling layer) can be executed on the multimedia sample data in the model, outputting the object spatial features corresponding to the multimedia sample data; i.e., the object spatial features in the multimedia sample data can be extracted through the target local recognition model. During the k-th training step, the multimedia sample data contained in the above batch can be denoted X_k; the target client can perform iterative training by gradient descent (GD), an iterative learning algorithm that can use the multimedia sample data to update the local model parameters of the target local recognition model; the batch size is a hyperparameter of gradient descent, controlling the number of training samples before the internal parameters of the target local recognition model are updated.
Further, the target client can determine the training loss function corresponding to the target local recognition model according to the object spatial features and the label information corresponding to the multimedia sample data, then determine the training gradient of the target local recognition model according to the training loss function, update the parameters of the target local recognition model according to the training gradient and the training learning rate corresponding to the target local recognition model, and count the training steps of the target local recognition model.
In other words, the target client can compute the training loss of the training loss function from the object spatial features extracted by the target local recognition model and the label information carried by the multimedia sample data; after the training loss is computed, the training gradient ∇_θ L can be computed according to the chain rule, where L is the training loss function, ∇ denotes gradient computation, and θ denotes the local model parameters trained by the target client. The training loss function can be a classification function (e.g., the softmax function), or CosFace (a loss function that, through normalization and maximization of the cosine decision boundary, maximizes inter-class differences and minimizes intra-class differences) or ArcFace (a loss function that optimizes inter-class distance in inverse-cosine space; by adding a margin m to the angle, the cosine value becomes smaller over the monotonic interval).
After the training gradient of the target local recognition model is determined, the training learning rate corresponding to the target local recognition model can be obtained, and the local model parameters of the target local recognition model are updated according to the training learning rate and the training gradient; the update can be expressed as:
θ_{a,r,k+1} = θ_{a,r,k} − η_r · ∇L(θ_{a,r,k}; x_{a,r,k})
where θ_{a,r,k} denotes the local model parameters obtained at the k-th training step of the r-th synchronization period by the target local recognition model (i.e., the local recognition model independently trained by the a-th of the N clients, a being a positive integer less than or equal to N), x_{a,r,k} denotes the multimedia sample data used by the target local recognition model at the k-th training step of the r-th synchronization period, η_r denotes the training learning rate of the target local recognition model in the r-th synchronization period, ∇L(θ_{a,r,k}; x_{a,r,k}) denotes the training gradient at that step, and θ_{a,r,k+1} denotes the local model parameters obtained at the (k+1)-th training step of the r-th synchronization period. That is, the local model parameters θ_{a,r,k+1} of the (k+1)-th training step are obtained by updating the local model parameters θ_{a,r,k} of the k-th training step with the product of the training gradient and the training learning rate η_r.
Training can be terminated when the training iteration count reaches the set maximum, indicating that the training process of the target local recognition model is complete. Each time the local model parameters are updated according to the above rule, the training count of the target local recognition model increases by one; i.e., the target client can track the training count of the target local recognition model in real time.
When the training count of the target local recognition model satisfies the synchronization period, i.e., the training count is a multiple of the above synchronization period K, the current local model parameters of the target local recognition model can be sent to the service device. For example, assuming the synchronization period K is 100, when the training count of the target local recognition model is 100, the local model parameters obtained at the 100th training step need to be sent to the service device for synchronization; when the training count is 200, the local model parameters obtained at the 200th training step can be sent to the service device for synchronization; and so on, until the training count of the target local recognition model reaches the set maximum number of iterations and the training of the target local recognition model is terminated.
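A minimal sketch of the client-side loop described in step S101 above — per-step gradient descent on the local parameters, with an upload whenever the training count is a multiple of the synchronization period K. All names here are illustrative, not part of the original disclosure.

```python
def local_train(theta, data_batches, grad_fn, lr, K, upload_fn, max_steps):
    """One client's local training: SGD steps, uploading the current local
    model parameters every K steps (the synchronization period)."""
    for step in range(1, max_steps + 1):
        x = data_batches[(step - 1) % len(data_batches)]
        # theta_{k+1} = theta_k - eta * grad L(theta_k; x_k)
        theta = theta - lr * grad_fn(theta, x)
        if step % K == 0:           # training count satisfies the sync period
            upload_fn(theta)        # send local parameters to the service device
    return theta
```

In a real federated run, `upload_fn` would also block until the server returns the target global model; that exchange is sketched separately in the server-side steps.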
It can be understood that all of the above N clients can perform the above operations; when the training count of its associated local recognition model satisfies the synchronization period, each can send the local model parameters of the associated local recognition model to the service device.
Step S102: the service device obtains the local model parameters respectively corresponding to N local recognition models; the N local recognition models are obtained by independent training by N clients, each client includes multimedia sample data for training its associated local recognition model, the multimedia sample data contains objects of the target object type, and N is a positive integer greater than 1.
After the N clients respectively send the local model parameters of their associated local recognition models to the service device, the service device can obtain the local model parameters respectively corresponding to the N local recognition models. Each local recognition model can correspond to one client, the N local recognition models can be trained independently in different clients, and the multimedia sample data held by each client for training its local recognition model is not disclosed. In embodiments of the present application, considering the actual physical latency between clients and the overall training efficiency, the synchronization period K (also called the synchronization interval) can be set to a value in the hundreds or thousands; what is synchronized between a client and the service device is the local model parameters of the local recognition model rather than the gradient at each training iteration, which can improve the efficiency of federated training.
Step S103: the service device obtains M parameter fusion methods associated with the local model parameter set, and performs parameter fusion on the local model parameter set according to each parameter fusion method to obtain M candidate global models; the local model parameter set is determined based on the local model parameters respectively corresponding to the N local recognition models, and M is a positive integer.
After obtaining the local model parameters respectively sent by the N clients, the service device determines the local model parameter set based on the local model parameters uploaded by the N clients (such as local model parameter set 20g in the embodiment corresponding to FIG. 2b above).
In some embodiments, the local model parameter set can be determined based on the local model parameters respectively corresponding to the N local recognition models as follows: the set including the local model parameters respectively corresponding to the N local recognition models is taken as the local model parameter set; or the local model parameters respectively corresponding to L (L a positive integer less than N) local recognition models are selected from those of the N local recognition models, and the set including the local model parameters of the L local recognition models is taken as the local model parameter set. Embodiments of the present application do not limit the manner of selecting the L sets of local model parameters from those of the N local recognition models; illustratively, the L sets can be selected randomly, or selected empirically.
Illustratively, the local model parameters respectively sent by the N clients can be denoted θ_j, j ∈ {1, 2, …, N}; the above local model parameter set can include the local model parameters respectively corresponding to the N clients, or those corresponding to L of the N clients, where L is a positive integer less than N.
The service device can fuse the local model parameters contained in the local model parameter set with different parameter fusion methods; after determining the local model parameter set, it obtains M (M a positive integer) parameter fusion methods associated with the set, then performs parameter fusion on the local model parameters according to each fusion method to obtain M candidate global models; one candidate global model can be obtained from each parameter fusion method.
Illustratively, a parameter fusion method is a method usable under a parameter fusion scheme; parameter fusion schemes include but are not limited to a global weighted average scheme, a voting scheme, and an averaging scheme. That is, the local model parameter set can be fused with a parameter fusion method under the global weighted average scheme, or with one under the voting scheme or the averaging scheme. Illustratively, the M parameter fusion methods can include parameter fusion methods under one or more parameter fusion schemes.
Embodiments of the present application take parameter fusion of the local model parameter set based on the global weighted average scheme as an example; i.e., the M parameter fusion methods are fusion methods under the global weighted average scheme. In this case, a parameter fusion method is implemented based on a weight combination. The service device can search the search space for the optimal weight combination for the local model parameter set, and take the weighted average of the optimal weight combination and the local model parameter set to obtain the optimal global model. The service device can obtain M weight combinations associated with the local model parameter set, and perform parameter fusion on the local model parameters according to each weight combination to obtain M candidate global models; each weight combination includes the training influence weights respectively corresponding to the local model parameters.
Illustratively, for any weight combination i among the M weight combinations, parameter fusion according to weight combination i proceeds as follows: the training influence weights included in weight combination i are weighted-averaged with the local model parameters included in the local model parameter set to obtain fused model parameters, and the recognition model carrying the fused model parameters is determined as candidate global model i associated with weight combination i.
If the local model parameter set includes the local model parameters respectively corresponding to the N clients, a weight combination can include the training influence weights respectively corresponding to the N sets of local model parameters; for any weight combination i among the M weight combinations, the training influence weights included in weight combination i can be weighted-averaged with the N sets of local model parameters in the local model parameter set to obtain fused model parameters, and the recognition model carrying the fused model parameters is determined as candidate global model i associated with weight combination i, where i is a positive integer less than or equal to M.
In other words, in each synchronization round the service device can randomly generate the M weight combinations associated with the local model parameter set, which can be denoted {W^(1), W^(2), …, W^(M)}. Any weight combination W^(i) among the M weight combinations (i.e., the above weight combination i) can include the training influence weights respectively corresponding to the N sets of local model parameters, and the training influence weights contained in each weight combination sum to 1. The training influence weights in W^(i) can be denoted w_a, a ∈ {1, 2, …, N}, and all the training influence weights in W^(i) satisfy the condition Σ_{a=1}^{N} w_a = 1. The N training influence weights contained in W^(i) can then be weighted-averaged with the N sets of local model parameters in the local model parameter set to obtain the fused model parameters; the recognition model carrying the fused model parameters is determined as candidate global model i, i.e.,
θ^(i) = Σ_{a=1}^{N} w_a · θ_a.
Based on the above operation, M candidate global models can be obtained, which can be denoted {θ^(1), θ^(2), …, θ^(M)}.
Illustratively, in the case where the local model parameter set includes the local model parameters respectively corresponding to the N local recognition models, any weight combination i among the above M weight combinations can be obtained as follows: the service device samples N values within a target value range, determines the sum of the absolute values of the N values as the norm value, and determines the ratios of the N values to the norm value as the weight combination i associated with the local model parameter set. Illustratively, N values can be sampled from the uniform distribution on [0, 1] to form an N-dimensional vector, and the N values are then divided by the L1 norm of this vector (i.e., the sum of the absolute values of the N values) to ensure that the resulting N values sum to 1, giving weight combination i; this process can be called a normalization operation. Repeating the above operation M times gives the M weight combinations.
Optionally, the model parameter fusion scheme may also be a partial local fusion scheme: in a federated training scenario with N clients, each synchronization can randomly select the local model parameters of L clients for fusion, adding randomness to the parameter fusion process; i.e., each synchronization can select the local model parameters respectively corresponding to L of the N clients, in which case the local model parameter set can include the local model parameters respectively corresponding to the L clients. If the local model parameter set includes L sets of local model parameters, M weight combinations associated with the set can be obtained, and each weight combination can include the training influence weights respectively corresponding to the L sets of local model parameters; i.e., each weight combination is an L-dimensional vector whose L training influence weights sum to 1. The M weight combinations are obtained, and the L sets of local model parameters are fused, in the same way as in the case where the local model parameter set includes the local model parameters respectively corresponding to the N clients, which is not repeated here.
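The sampling-and-normalization procedure above (uniform sampling on [0, 1], then division by the L1 norm so each combination sums to 1) can be sketched as follows; `fuse` implements the weighted average of step S103. Function names are illustrative.

```python
import numpy as np

def sample_weight_combinations(M, N, rng=None):
    """Sample M weight combinations for N local models: draw N values
    uniformly from [0, 1] and divide by the L1 norm of the vector."""
    rng = rng or np.random.default_rng(0)
    combos = rng.uniform(0.0, 1.0, size=(M, N))
    return combos / np.abs(combos).sum(axis=1, keepdims=True)

def fuse(combos, local_params):
    """Weighted average of the stacked local parameters (shape [N, ...]):
    returns one fused parameter vector per weight combination."""
    return np.tensordot(combos, local_params, axes=(1, 0))
```

Each row of the result of `sample_weight_combinations` is one candidate weight combination W^(i); `fuse` then yields the M candidate global models' parameters.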
Step S104: the service device obtains the evaluation indicators of the M candidate global models in the multimedia verification dataset, and determines the target global model among the M candidate global models according to the evaluation indicators.
The service device can obtain, in the evaluation unit, a multimedia verification dataset containing positive sample pairs and negative sample pairs, where a positive sample pair is a pair of multimedia sample data containing the same object (e.g., a same-person pair) and a negative sample pair is a pair of multimedia sample data containing different objects (e.g., a different-person pair).
Candidate global model i (any one of the M candidate global models) is obtained among the M candidate global models; the positive sample pairs are input into candidate global model i, which can output the first object prediction results of the positive sample pairs; the negative sample pairs are input into candidate global model i, which can output the second object prediction results of the negative sample pairs; the evaluation indicator of candidate global model i in the multimedia verification dataset can then be determined according to the first object prediction results and the second object prediction results. In other words, the service device can input each sample pair (positive and negative) in the multimedia verification dataset into candidate global model i in turn; candidate global model i can output the prediction result corresponding to each sample pair, and the evaluation indicator of candidate global model i in the multimedia verification dataset can be determined from the prediction results.
For each of the above M candidate global models, the above approach can be used to determine its evaluation indicator in the multimedia verification dataset, and the target global model is then determined among the M candidate global models according to the evaluation indicators. Illustratively, among the M candidate global models, the candidate global model corresponding to the largest evaluation indicator is determined as the target global model.
Illustratively, when the evaluation indicator is accuracy, the process of determining the evaluation indicator can include: the service device counts, according to the first object prediction results, the first correct prediction count of candidate global model i among the positive sample pairs; counts, according to the second object prediction results, the second correct prediction count of candidate global model i among the negative sample pairs; determines the sum of the first correct prediction count and the second correct prediction count as the total number of correctly predicted sample pairs of candidate global model i in the multimedia verification dataset; obtains the total number of sample pairs in the multimedia verification dataset; and determines the evaluation indicator of candidate global model i in the multimedia verification dataset according to the ratio of the total number of correctly predicted sample pairs to the total number of sample pairs.
In other words, from the prediction results of candidate global model i on the multimedia verification dataset, the following can be counted: the first correct prediction count among the positive sample pairs (correct predictions on pairs that are truly positive, also called true positives, TP), the first wrong prediction count among the positive sample pairs (wrong predictions on truly positive pairs, also called false negatives, FN), the second correct prediction count among the negative sample pairs (correct predictions on truly negative pairs, also called true negatives, TN), and the second wrong prediction count among the negative sample pairs (wrong predictions on truly negative pairs, also called false positives, FP). The evaluation indicator (accuracy) of candidate global model i in the multimedia verification dataset can be expressed as acc = (TP+TN)/(TP+FN+TN+FP), where TP+FN+TN+FP is the total number of sample pairs in the multimedia verification dataset and TP+TN is the number of sample pairs correctly predicted by candidate global model i.
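The accuracy computation above, acc = (TP+TN)/(TP+FN+TN+FP), can be sketched as follows. `pos_pred` and `neg_pred` are illustrative names for the model's "same-object" decisions on the positive and negative sample pairs.

```python
def pairwise_accuracy(pos_pred, neg_pred):
    """acc = (TP + TN) / (TP + FN + TN + FP) over verification sample pairs."""
    tp = sum(1 for p in pos_pred if p)        # positive pairs predicted "same"
    fn = len(pos_pred) - tp                   # positive pairs predicted "different"
    tn = sum(1 for p in neg_pred if not p)    # negative pairs predicted "different"
    fp = len(neg_pred) - tn                   # negative pairs predicted "same"
    return (tp + tn) / (tp + fn + tn + fp)
```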
Optionally, when there are P multimedia verification datasets, the P multimedia verification datasets can include multimedia verification dataset j, P being a positive integer and j a positive integer less than or equal to P. In this case, the evaluation indicator can be determined as follows: the service device can determine the ratio of the total number of correctly predicted sample pairs (TP+TN) of candidate global model i in multimedia verification dataset j to the total number of sample pairs (TP+FN+TN+FP) in dataset j as the prediction accuracy of candidate global model i in dataset j, i.e., acc_j = (TP+TN)/(TP+FN+TN+FP). The prediction accuracies of candidate global model i in the P multimedia verification datasets can then be obtained, the average accuracy mean of the P prediction accuracies is computed, along with the standard deviation std of the P prediction accuracies; the evaluation indicator of candidate global model i in the multimedia verification datasets is then determined from the average accuracy and the standard deviation, i.e., the per-dataset accuracies are normalized to a unified prediction accuracy. Reconstructed from these variable definitions, formula (1) can be written as:
âcc_j = (acc_j − mean) / std     (1)
where âcc_j denotes the unified prediction accuracy obtained by normalizing the prediction accuracy of candidate global model i in dataset j (i.e., the above evaluation indicator); acc_j denotes the prediction accuracy of candidate global model i in multimedia verification dataset j; mean denotes the average of the prediction accuracies of candidate global model i over the P multimedia verification datasets; and std denotes the standard deviation of those prediction accuracies.
It should be noted that different normalization methods can be used to process the above M weight combinations and the above evaluation indicators; the normalization methods can include but are not limited to L-norm and M-norm normalization. Of course, embodiments of the present application may also skip the normalization operation.
Optionally, the evaluation indicator (e.g., the above prediction accuracy) of candidate global model θ^(i) in multimedia verification dataset j can be denoted S_j, j ∈ {1, 2, …, P}, and the initial evaluation indicator S_j can be normalized to eliminate the influence of differences between the multimedia verification datasets. For example, the initial evaluation indicator S_j can be locally normalized (Local Norm) as in formula (2), where S'_j denotes the locally normalized evaluation indicator, σ denotes an activation function, and ε is a hyperparameter of the training process that can be set according to actual requirements. [Formula (2) appears in the source only as an equation image.]
Optionally, the initial evaluation indicator S_j can be processed with a moving normalization (Moving Norm) as in formulas (3) to (5), where S'_j denotes the moving-normalized evaluation indicator, γ denotes a normalization parameter, μ denotes the moving average, v denotes the moving variance, ε is a hyperparameter of the training process, μ_last is the moving average from the latest training round, and v_last is the moving variance from the latest training round; the hyperparameter ε in formula (5) may or may not be the same as the hyperparameter ε in formula (2). [Formulas (3) to (5) appear in the source only as equation images; a standard moving normalization of the form μ = γ·μ_last + (1−γ)·S_j, v = γ·v_last + (1−γ)·(S_j − μ)², S'_j = (S_j − μ)/(√v + ε) is consistent with these variable definitions.]
Further, after the evaluation indicators respectively corresponding to the M candidate global models are obtained through formula (2), or through formulas (3) to (5), the weight combination corresponding to the largest of the M evaluation indicators can be taken as the optimal weight combination, and the optimal weight combination can be applied to the local recognition model corresponding to each client, as in formulas (6) and (7), where S'_{j,a} denotes the evaluation indicator of the local recognition model corresponding to the a-th client in multimedia verification dataset j; applying the optimal weight combination to the a-th client's local recognition model, a ∈ {1, 2, …, N}, yields the training influence weight of the a-th local recognition model (the local recognition model corresponding to the a-th client) determined from the optimal weight combination; w_last is the weight of the a-th client's local recognition model in the latest training round; and a further hyperparameter of the training process, which can be set according to actual requirements, also appears in formula (7). [Formulas (6) and (7) appear in the source only as equation images.]
Optionally, when the evaluation indicator is the recall (TPR) at a fixed false acceptance rate (FAR), the service device can obtain the false acceptance rate of candidate global model i among the M candidate global models in the multimedia verification dataset, and determine a similarity threshold among the similarities corresponding to the negative sample pairs, where the similarity threshold is determined by the number of negative sample pairs and the false acceptance rate; the similarities corresponding to the positive sample pairs can then be obtained, the first sample pairs whose similarity is greater than the similarity threshold are obtained among the positive sample pairs, and the ratio of the number of first sample pairs to the number of positive sample pairs is determined as the evaluation indicator corresponding to candidate global model i.
Illustratively, the false acceptance rate can be determined as follows: obtain the wrong prediction count of candidate global model i among the M candidate global models in the negative sample pairs (i.e., the above second wrong prediction count, FP); determine the ratio of the wrong prediction count to the number of negative sample pairs (the sum of the second wrong prediction count and the second correct prediction count, i.e., FP+TN) as the false acceptance rate of candidate global model i in the multimedia verification dataset, which can be expressed as FAR = FP/(FP+TN).
For example, if the multimedia verification dataset includes N1 positive sample pairs and N2 negative sample pairs, the recall TPR at FAR = 1e-3 is computed as follows: obtain the similarities (e.g., cosine similarities) of the N2 negative sample pairs and the similarities of the N1 positive sample pairs, sort the similarities of the N2 negative sample pairs in descending order, determine the topx = int(N2·FAR)-th similarity as the similarity threshold, determine among the N1 positive sample pairs those whose similarity is greater than the similarity threshold as the first sample pairs, and determine the ratio of the number of first sample pairs to the number N1 of positive sample pairs as the recall TPR (i.e., the above evaluation indicator), where int() is the rounding-down function and FAR = 1e-3 is a value preset according to actual requirements.
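The TPR-at-fixed-FAR computation above can be sketched as follows. Treating the "topx-th" similarity as 1-indexed (hence index `topx - 1`), and guarding the case topx = 0, are assumptions of this sketch.

```python
import numpy as np

def tpr_at_far(pos_sim, neg_sim, far=1e-3):
    """Recall at a fixed false-acceptance rate: sort the N2 negative-pair
    similarities in descending order, take the int(N2 * far)-th value as the
    similarity threshold, then count positive pairs above the threshold."""
    neg_sorted = np.sort(np.asarray(neg_sim))[::-1]
    topx = int(len(neg_sim) * far)
    threshold = neg_sorted[max(topx - 1, 0)]   # 1-indexed "topx-th" similarity
    return float(np.sum(np.asarray(pos_sim) > threshold)) / len(pos_sim)
```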
可以理解的是,服务设备可以根据M个备选全局模型分别在多媒体验证数据集中的评估指标,将最大的评估指标所对应的备选全局模型确定为目标全局模型,该目标全局模型所对应的权重组合确定为M个权重组合中的最优权重组合。在不同的同步过程中,其最优权重组合是不一样的,如第一次同步过程中和第二次同步过程中所确定的最优权重组合是不一样的。
请参见图4,图4是本申请实施例提供的一种确定目标全局模型的示意图。如图4所示,当客户端的数量N=3(联邦训练过程中需要使用客户端1所持有的多媒体样本数据、客户端2所持有的多媒体样本数据以及客户端3所持有的多媒体样本数据)时,以评价指标为准确率为例对权重组合的确定过程进行描述。当服务设备接收到的局部模型参数为第12800次训练迭代次数的局部模型参数时,区域30a中的颜色深浅用于表示第12800次训练迭代次数所对应的备选全局模型在多媒体验证数据集上的准确率值,柱状图区域30b可以用于解释区域30a中的颜色与准确率值之间的关系,区域30a中的每个位置可以代表一个权重组合。
当服务设备接收到的局部模型参数为第256000次训练迭代次数的局部模型参数时,区域30c中的颜色深浅用于表示第256000次训练迭代次数所对应的备选全局模型在多媒体验证数据集上的准确率值,柱状图区域30d可以用于解释区域30c中的颜色与准确率值之间的关系,区域30c中的每个位置同样可以代表一个权重组合。
区域30a和区域30c表示在不同训练阶段,多媒体验证数据集上最好结果的权重组合是在不同位置的,且是进行动态变化的。如图4所示,在第12800次训练迭代次数时的最优权重组合为:最优权重组合1,在第256000次训练迭代次数时的最优权重组合为:最优权重组合2。
Optionally, if the above target global model is generated based on the N sets of local model parameters corresponding to the r-th synchronization period, r being a positive integer, the service device can determine the target global model of the r-th synchronization period among the candidate global models according to the evaluation indicators, and obtain the historical global model corresponding to the (r−1)-th synchronization period, where the historical global model is generated based on the local model parameters respectively uploaded by the N clients in the (r−1)-th synchronization period. The training learning rate of the N local recognition models in the r-th synchronization period can then be obtained, along with the model parameter difference between the target global model and the historical global model; the ratio of the model parameter difference to the training learning rate is determined as the federated momentum, and the federated momentum is sent to the N clients. The federated momentum, together with the target global model, is used to instruct the N clients to update the parameters of their associated local recognition models, and the federated momentum is used to indicate the training direction of each of the N local recognition models in its client.
For example, if the target global model corresponding to the r-th synchronization period is denoted θ̄_r, the historical global model corresponding to the (r−1)-th synchronization period is denoted θ̄_{r−1}, and the training learning rate in the r-th synchronization period is denoted η_r, then the federated momentum can be expressed as
m_r = (θ̄_r − θ̄_{r−1}) / η_r
where m_r denotes the federated momentum corresponding to the r-th synchronization period and (θ̄_r − θ̄_{r−1}) denotes the above model parameter difference. The training learning rate η_r can be a fixed value or can change adaptively; e.g., the training learning rate can be set to 0.1 when any client makes its first complete training pass over all the multimedia sample data it holds, and to 0.02 at the 10th complete pass, and so on. It should be noted that the federated momentum of the 1st synchronization period is given in the source only as an equation image; a zero initialization is consistent with the above definitions.
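The federated momentum above, m_r = (θ̄_r − θ̄_{r−1}) / η_r, can be sketched as follows; initializing the momentum to zero at the first synchronization period is an assumption of this sketch.

```python
import numpy as np

def federated_momentum(global_r, global_prev, lr_r):
    """m_r = (theta_bar_r - theta_bar_{r-1}) / eta_r: ratio of the parameter
    difference between consecutive synchronized global models to the current
    training learning rate. Zero at the first period (assumption)."""
    if global_prev is None:
        return np.zeros_like(global_r)
    return (global_r - global_prev) / lr_r
```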
Step S105: the service device returns the target global model.
The service device can return the above target global model to the N clients; after receiving the target global model returned by the service device, any client can update the parameters of its local recognition model according to the target global model and continue training based on the updated local model parameters.
Optionally, when the service device generates the federated momentum m_r, the service device can return the target global model together with the federated momentum m_r to the N clients; after receiving the target global model and the federated momentum m_r returned by the service device, any client can update the parameters of its local recognition model according to the target global model and the federated momentum m_r, and continue training based on the updated local model parameters.
Step S106: the client receives the target global model, updates the parameters of the target local recognition model according to the target global model, and determines the parameter-updated target local recognition model as the object recognition model; the object recognition model is used to recognize objects of the target object type contained in multimedia data.
After receiving the target global model returned by the service device, the target client can update the parameters of its target local recognition model according to the target global model and continue local training of the target local recognition model, until the training count of the target local recognition model reaches a training termination condition (including a training convergence condition, a maximum number of iterations, etc.); the training process of the target local recognition model is then complete, yielding the trained object recognition model, which is used to recognize objects of the target object type contained in multimedia data.
Optionally, after receiving the target global model and the federated momentum m_r returned by the service device, the target client can combine the training gradient of the target client's local training with the federated momentum m_r to update the parameters of the target local recognition model, where θ_i denotes the local model parameters corresponding to the target local recognition model, θ'_i denotes the local model parameters obtained by target client i after the update in the r-th synchronization period, g denotes the training gradient corresponding to the r-th synchronization period, and K denotes the number of training steps corresponding to one synchronization period. [The combined update rule appears in the source only as an equation image.]
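A hedged sketch of the client-side combined update: the exact combination rule in the original is an equation image, so simply adding the federated momentum to the local training gradient inside an SGD step is an assumption of this sketch, not the disclosed formula.

```python
def client_update(theta, grad, momentum, lr):
    """One local step that folds the server's federated momentum into the
    gradient: theta' = theta - lr * (g + m). The additive combination is
    an assumption; the original rule is not reproduced in this text."""
    return theta - lr * (grad + momentum)
```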
Please refer to FIG. 5, which is a flowchart of a federated model training method provided by an embodiment of the present application. As shown in FIG. 5, taking face images as the multimedia sample data, the implementation of the federated model training method is described concretely; the federated model training method can be implemented by the following steps S11 to S22.
Step S11: the client reads the local training data, i.e., obtains the face sample data held by the client (i.e., the above multimedia sample data), and can then continue to step S12 to obtain the initialized face recognition model (the above local recognition model) and perform local training of the face recognition model with the face sample data, i.e., execute step S13 to compute the training loss and training gradient of the face recognition model and count the training steps of the face recognition model in real time. The client can continue to step S14 to judge whether the training count satisfies the synchronization period (the above synchronization period K); if the training count of the face recognition model satisfies the synchronization period, it continues to step S15 to upload the client model parameters (the current model parameters of the face recognition model, i.e., the above local model parameters) to the service device; if not, it continues to step S21 to judge whether the face recognition model satisfies the training termination condition. If the face recognition model satisfies the training termination condition, training of the face recognition model is complete; if not, the client continues to step S22 to update the local model parameters of the face recognition model.
When the training counts of the face recognition models being locally trained by all the clients (i.e., the above N clients) satisfy the synchronization period, they can all upload their local model parameters to the service device; the service device can receive the local model parameters uploaded by all the clients and continue to step S16 to generate, in the search space, the models corresponding to different fusion schemes: e.g., obtain M weight combinations and take the weighted average of each weight combination with the received local model parameters to obtain M candidate global models (for the specific implementation of the M weight combinations, see step S103 above). It can then continue to steps S17 and S18: the evaluation unit reads the verification set data (i.e., the above multimedia verification dataset) and obtains the evaluation indicator respectively corresponding to each candidate global model on the verification set (i.e., computes the verification-set indicators), then continues to steps S19 and S20.
The fusion scheme corresponding to the optimal evaluation indicator is selected for fusion to obtain the target global model (i.e., the candidate global model corresponding to the optimal evaluation indicator), and the target global model is sent down to the clients. After receiving the target global model, if the face recognition model does not satisfy the training termination condition, the client continues to step S22 to update the parameters of the face recognition model according to the target global model. It can be understood that steps S12 to S22 can be repeated until the face recognition model satisfies the training termination condition, completing the training of the face recognition model.
Please refer to FIG. 6, which is a schematic diagram of weight combinations in a multimedia verification dataset provided by an embodiment of the present application. As shown in FIG. 6, when the number of clients is N = 3 (client 1, client 2, and client 3), the multimedia sample data used for federated training is the face data respectively held by client 1, client 2, and client 3. During local training by the three clients based on the face data they hold, assume epochs = 26 in the federated training, where an epoch represents one complete training pass of a client over the face data it holds. In the coordinate plot shown in FIG. 6, the horizontal axis is epochs, and the vertical axis can be the training influence weight corresponding to each epoch. Obviously, as the epochs increase, the training influence weights corresponding to the local model parameters trained by the three clients become more concentrated; that is, the later the training stage, the closer the training influence weights corresponding to the clients become.
Optionally, for the object recognition model locally trained by each client, when the multimedia data includes a face image to be recognized and the target object type includes the face type, the target client can obtain the face image to be recognized, input the face image to be recognized into the object recognition model, and output the face spatial features corresponding to the face image through the object recognition model; the face classification result corresponding to the face image to be recognized can then be determined according to the face spatial features; the face classification result is used to characterize the identity verification result of the face-type object contained in the face image to be recognized.
In other words, the object recognition model can be used in any face recognition scenario, such as user identity authentication, missing-person tracing, and business handling. Illustratively, in the user identity authentication and business handling scenarios, the object recognition model can be used to recognize the user face image provided by the user in the identity authentication scenario, to confirm the authenticity of the identity in the user face image; in the missing-person tracing scenario, photos taken of a missing person before the disappearance can be recognized and compared with existing household-registration photos to obtain suspected matches for the missing person.
Please refer to FIG. 7, which is a schematic diagram of a user identity authentication scenario provided by an embodiment of the present application. As shown in FIG. 7, when user A wants to handle business in client 1 installed on user terminal 40a, user A needs to perform identity verification in client 1. When user A initiates an identity verification request in client 1 installed on user terminal 40a, a face verification frame 40b can be displayed in client 1. User A can align their face with the face verification frame 40b in user terminal 40a and follow the prompts to perform corresponding actions (e.g., shaking the head, nodding, blinking); user terminal 40a can collect the to-be-recognized face image 40c in face verification frame 40b in real time and input the collected to-be-recognized face image 40c into the object recognition model 40d. Feature extraction is performed on the to-be-recognized face image 40c in object recognition model 40d to obtain the face recognition result corresponding to the to-be-recognized face image 40c. Meanwhile, client 1 can obtain the identity document image 40e previously uploaded by user A from an existing face image database and compare document image 40e with the face recognition result output by object recognition model 40d; if document image 40e and the face recognition result are the same, it can be determined that user A's identity verification has passed, and a verification-passed result is returned to client 1 of user terminal 40a; if they differ, it can be determined that user A's identity verification has not passed, a verification-failed result is returned to client 1 of user terminal 40a, and user A is reminded to perform identity verification again.
In the embodiments of the present application, the local model parameters of the local recognition models respectively uploaded by the N clients can be obtained, along with M parameter fusion methods (e.g., M weight combinations) (M a positive integer) for the local model parameter set determined based on the N sets of local model parameters; parameter fusion is performed on the local model parameter set with each fusion method (e.g., each weight combination) to obtain M candidate global models, and the optimal target global model is then selected among the M candidate global models according to their evaluation indicators in the multimedia verification dataset. That is, selecting the optimal target global model from the M candidate global models obtained by the M parameter fusion methods can improve the fusion effectiveness between the N sets of local model parameters, and the N clients continue to update their respective local recognition models based on the target global model, which can improve the generalization effect of the object recognition model. Embodiments of the present application can be applied to cross-department, cross-enterprise, and even cross-region business data, and can improve the recognition effect of the object recognition model while ensuring data privacy and security.
Please refer to FIG. 8, which is a schematic structural diagram of a data processing apparatus provided by an embodiment of the present application. As shown in FIG. 8, the data processing apparatus can be applied to the service device (e.g., the server 10d) in the embodiment corresponding to FIG. 1 above. The data processing apparatus 1 may include: a parameter obtaining module 11, a parameter fusion module 12, and a model determination module 13.
The parameter obtaining module 11 is configured to obtain local model parameters respectively corresponding to N local recognition models; the N local recognition models are obtained by independent training by N clients, each client includes multimedia sample data for training its associated local recognition model, the multimedia sample data contains objects of the target object type, and N is a positive integer greater than 1.
The parameter fusion module 12 is configured to obtain M parameter fusion methods associated with a local model parameter set, and perform parameter fusion on the local model parameter set according to each parameter fusion method to obtain M candidate global models; the local model parameter set is determined based on the local model parameters respectively corresponding to the N local recognition models, and M is a positive integer.
The model determination module 13 is configured to obtain the evaluation indicators of the M candidate global models in the multimedia verification dataset, determine the target global model among the M candidate global models according to the evaluation indicators, and transmit the target global model to the N clients, so that the N clients respectively update the parameters of their associated local recognition models according to the target global model to obtain the object recognition model; the object recognition model is used to recognize objects of the target object type contained in multimedia data.
For the specific functional implementation of the parameter obtaining module 11, the parameter fusion module 12, and the model determination module 13, please refer to steps S102 to S105 in the embodiment corresponding to FIG. 3 above, which are not repeated here.
在一些可行的实施方式中,一种参数融合方式基于一个权重组合实现,参数融合模块12,用于获取与局部模型参数集合相关联的M个权重组合,根据每个权重组合分别对局部模型参数集合进行参数融合,得到M个备选全局模型;每个权重组合包括局部模型参数集合中的各个局部模型参数分别对应的训练影响权重。
在一些可行的实施方式中,参数融合模块12可以包括:权重组合获取单元121,加权平均单元122;
权重组合获取单元121,用于获取与局部模型参数集合相关联的M个权重组合;M个权重组合包括权重组合i,i为小于或等于M的正整数;
加权平均单元122,用于将权重组合i所包含的训练影响权重与局部模型参数集合所包含的各个局部模型参数进行加权平均,得到融合模型参数,将携带融合模型参数的识别模型确定为权重组合i所关联的备选全局模型i。
在一些可行的实施方式中,模型确定模块13具体用于:在M个备选全局模型中,将最大的评估指标所对应的备选全局模型确定为目标全局模型。
权重组合获取单元121,加权平均单元122的具体功能实现方式可以参见上述图3所对应实施例中的步骤S103,这里不再进行赘述。
In some feasible implementations, the local model parameter set includes the local model parameters respectively corresponding to the N local recognition models, and the weight combination obtaining unit 121 may include: a norm value determining subunit 1211 and a weight determining subunit 1212.
The norm value determining subunit 1211 is configured to sample N numerical values within a target value range, and determine the sum of the absolute values of the N numerical values as a norm value.
The weight determining subunit 1212 is configured to determine the ratios of the N numerical values to the norm value as the weight combination i associated with the local model parameter set.
For the specific function implementations of the norm value determining subunit 1211 and the weight determining subunit 1212, refer to step S103 in the embodiment corresponding to FIG. 3 above; details are not repeated here.
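The sampling-and-normalization step of subunits 1211 and 1212 can be sketched as follows (a minimal illustration only: the target value range [0, 1], the uniform sampling distribution, and the function name are assumptions not fixed by the text above):

```python
import random

def sample_weight_combination(n, low=0.0, high=1.0):
    """Sample n values within a target value range and divide each by the
    norm value (the sum of absolute values of the sampled numbers), so the
    resulting training influence weights sum to 1."""
    values = [random.uniform(low, high) for _ in range(n)]
    norm = sum(abs(v) for v in values)
    return [v / norm for v in values]

random.seed(0)  # reproducible sketch
combo = sample_weight_combination(3)  # one weight combination for N = 3 clients
```

Drawing M independent samples in this way produces the M weight combinations associated with the local model parameter set.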
In some feasible implementations, the model determining module 13 may include: a verification dataset obtaining unit 131, a first prediction unit 132, a second prediction unit 133, and a first evaluation metric determining unit 134.
The verification dataset obtaining unit 131 is configured to obtain a multimedia verification dataset containing positive sample pairs and negative sample pairs. A positive sample pair refers to a pair of multimedia sample data containing the same object, and a negative sample pair refers to a pair of multimedia sample data containing different objects.
The first prediction unit 132 is configured to input the positive sample pairs into a candidate global model i among the M candidate global models, and output, through the candidate global model i, first object prediction results for the positive sample pairs, where i is a positive integer less than or equal to M.
The second prediction unit 133 is configured to input the negative sample pairs into the candidate global model i, and output, through the candidate global model i, second object prediction results for the negative sample pairs.
The first evaluation metric determining unit 134 is configured to determine, according to the first object prediction results and the second object prediction results, the evaluation metric of the candidate global model i on the multimedia verification dataset.
In some feasible implementations, the first evaluation metric determining unit 134 may include: a prediction result statistics subunit 1341, a correct-sample-pair total statistics subunit 1342, and an evaluation metric calculation subunit 1343.
The prediction result statistics subunit 1341 is configured to count, according to the first object prediction results, a first number of correct predictions of the candidate global model i on the positive sample pairs, and count, according to the second object prediction results, a second number of correct predictions of the candidate global model i on the negative sample pairs.
The correct-sample-pair total statistics subunit 1342 is configured to determine the sum of the first number of correct predictions and the second number of correct predictions as the total number of correctly predicted sample pairs of the candidate global model i on the multimedia verification dataset.
The evaluation metric calculation subunit 1343 is configured to obtain the total number of sample pairs in the multimedia verification dataset, and determine, according to the ratio of the total number of correctly predicted sample pairs to the total number of sample pairs, the evaluation metric of the candidate global model i on the multimedia verification dataset.
In some feasible implementations, there are P multimedia verification datasets, the P multimedia verification datasets include a multimedia verification dataset j, P is a positive integer, and j is a positive integer less than or equal to P.
The evaluation metric calculation subunit 1343 is specifically configured to: determine the ratio of the total number of correctly predicted sample pairs of the candidate global model i on the multimedia verification dataset j to the total number of sample pairs in the multimedia verification dataset j as the prediction accuracy of the candidate global model i on the multimedia verification dataset j; obtain the prediction accuracies of the candidate global model i on the P multimedia verification datasets, and compute the average accuracy of the P prediction accuracies and the standard deviation of the P prediction accuracies; and determine, according to the average accuracy and the standard deviation, the evaluation metric of the candidate global model i on the P multimedia verification datasets.
For the specific function implementations of the verification dataset obtaining unit 131, the first prediction unit 132, the second prediction unit 133, and the first evaluation metric determining unit 134, refer to step S104 in the embodiment corresponding to FIG. 3 above; details are not repeated here.
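The pair-accuracy metric of subunits 1341 through 1343 and its aggregation over P verification datasets can be sketched as below. The counts, P = 3, and the "mean minus standard deviation" combination are hypothetical: the text above only states that the metric is determined from the average accuracy and the standard deviation, not how they are combined.

```python
import statistics

def pair_accuracy(pos_correct, neg_correct, total_pairs):
    """Metric on one verification set: correctly predicted positive pairs
    plus correctly predicted negative pairs, over the total sample pairs."""
    return (pos_correct + neg_correct) / total_pairs

# Hypothetical counts for one candidate model on P = 3 verification sets,
# each containing 100 sample pairs.
per_set = [
    pair_accuracy(45, 48, 100),
    pair_accuracy(47, 46, 100),
    pair_accuracy(44, 47, 100),
]
mean_acc = statistics.mean(per_set)   # average accuracy over the P sets
std_acc = statistics.pstdev(per_set)  # spread across the P sets
# One possible combined metric: reward accuracy, penalize instability.
metric = mean_acc - std_acc
```

A candidate model that is both accurate on average and stable across the P datasets then scores highest when the target global model is selected.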
In some feasible implementations, the model determining module 13 may include: the verification dataset obtaining unit 131, a similarity threshold determining unit 135, and a second evaluation metric determining unit 136.
The verification dataset obtaining unit 131 is configured to obtain a multimedia verification dataset containing positive sample pairs and negative sample pairs. A positive sample pair refers to a pair of multimedia sample data containing the same object, and a negative sample pair refers to a pair of multimedia sample data containing different objects.
The similarity threshold determining unit 135 is configured to obtain the false accept rate of a candidate global model i among the M candidate global models on the multimedia verification dataset, and determine a similarity threshold among the similarities corresponding to the negative sample pairs. The similarity threshold is determined by the number of negative sample pairs and the false accept rate, and i is a positive integer less than or equal to M.
The second evaluation metric determining unit 136 is configured to obtain the similarities corresponding to the positive sample pairs, obtain, among the positive sample pairs, first sample pairs whose similarity is greater than the similarity threshold, and determine the ratio of the number of first sample pairs to the number of positive sample pairs as the evaluation metric of the candidate global model i on the multimedia verification dataset.
In some feasible implementations, the similarity threshold determining unit 135 may include: a false prediction number obtaining subunit 1351 and a false accept rate determining subunit 1352.
The false prediction number obtaining subunit 1351 is configured to obtain the number of false predictions of the candidate global model i among the M candidate global models on the negative sample pairs.
The false accept rate determining subunit 1352 is configured to determine the ratio of the number of false predictions to the number of negative sample pairs as the false accept rate of the candidate global model i on the multimedia verification dataset.
For the specific function implementations of the verification dataset obtaining unit 131, the similarity threshold determining unit 135, and the second evaluation metric determining unit 136, refer to step S104 in the embodiment corresponding to FIG. 3 above; details are not repeated here. By way of example, when the first prediction unit 132, the second prediction unit 133, and the first evaluation metric determining unit 134 are executing their respective steps, the similarity threshold determining unit 135 and the second evaluation metric determining unit 136 suspend their operations; when the similarity threshold determining unit 135 and the second evaluation metric determining unit 136 are executing their respective operations, the first prediction unit 132, the second prediction unit 133, and the first evaluation metric determining unit 134 suspend their respective operations.
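This threshold-based metric resembles a true-accept-rate-at-fixed-false-accept-rate computation, and can be sketched as follows. The similarity values, target false accept rate, and the exact indexing convention for picking the threshold among the sorted negative-pair similarities are all assumptions for illustration:

```python
def far_threshold(neg_sims, far):
    """Pick the similarity threshold implied by the negative pairs: the
    number of negative pairs allowed above it equals len(neg_sims) * far."""
    allowed_false_accepts = int(len(neg_sims) * far)
    ranked = sorted(neg_sims, reverse=True)
    return ranked[allowed_false_accepts]  # highest similarity still rejected

def tar_at_far(pos_sims, neg_sims, far):
    """Metric: fraction of positive pairs whose similarity exceeds the
    threshold determined by the negative pairs and the false accept rate."""
    thr = far_threshold(neg_sims, far)
    accepted = sum(1 for s in pos_sims if s > thr)
    return accepted / len(pos_sims)

# Hypothetical similarities for 10 negative and 10 positive sample pairs.
neg_sims = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95]
pos_sims = [0.5, 0.6, 0.7, 0.8, 0.85, 0.9, 0.92, 0.95, 0.97, 0.99]
metric = tar_at_far(pos_sims, neg_sims, far=0.1)  # evaluate at 10% FAR
```

At a 10% false accept rate, exactly one of the ten negative pairs lands above the threshold, and the metric is the share of positive pairs accepted at that operating point.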
In some feasible implementations, the target global model is generated based on the N local model parameters corresponding to the r-th synchronization period, where r is a positive integer.
The data processing apparatus 1 may further include: a historical global model obtaining module 14, a model parameter difference obtaining module 15, and a federated momentum determining module 16.
The historical global model obtaining module 14 is configured to obtain the historical global model corresponding to the (r-1)-th synchronization period. The historical global model is generated based on the local model parameters uploaded by the N clients in the (r-1)-th synchronization period.
The model parameter difference obtaining module 15 is configured to obtain the training learning rate of the N local recognition models in the r-th synchronization period, and obtain the model parameter difference between the target global model and the historical global model.
The federated momentum determining module 16 is configured to determine the ratio of the model parameter difference to the training learning rate as a federated momentum, and send the federated momentum to the N clients. The federated momentum, together with the target global model, is used to instruct the N clients to update the parameters of their associated local recognition models, and the federated momentum is used to indicate the training direction of each of the N local recognition models in its client.
For the specific function implementations of the historical global model obtaining module 14, the model parameter difference obtaining module 15, and the federated momentum determining module 16, refer to step S104 in the embodiment corresponding to FIG. 3 above; details are not repeated here.
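The federated-momentum computation above reduces to a simple elementwise operation; a toy sketch (scalar lists stand in for real parameter tensors, and the values are hypothetical) is:

```python
def federated_momentum(target_params, history_params, learning_rate):
    """Ratio of the model parameter difference (round-r target global model
    minus round-(r-1) historical global model) to the training learning rate;
    this behaves like a gradient-scale momentum term sent to the clients."""
    return [(t - h) / learning_rate
            for t, h in zip(target_params, history_params)]

target_global = [0.90, 1.20, -0.40]   # fused model of synchronization period r
history_global = [1.00, 1.00, -0.50]  # fused model of synchronization period r-1
momentum = federated_momentum(target_global, history_global, learning_rate=0.1)
```

Dividing by the learning rate puts the parameter difference on the same scale as a gradient, which is why the clients can use it to steer the training direction of their local recognition models.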
Referring to FIG. 9, FIG. 9 is a schematic structural diagram of a data processing apparatus provided by an embodiment of this application. As shown in FIG. 9, the data processing apparatus may be applied to a client on any user terminal in the user terminal cluster shown in FIG. 1 above, and the client may be a computer program (including program code) in a computer device. The data processing apparatus 2 may include: a model parameter uploading module 21 and a target global model receiving module 22.
The model parameter uploading module 21 is configured to, in response to the number of training iterations of a target local recognition model reaching the synchronization period, upload local model parameters corresponding to the target local recognition model to the service device, so that the service device obtains a target global model based on the local model parameters respectively uploaded by N clients. The local model parameters respectively uploaded by the N clients include the local model parameters corresponding to the target local recognition model; the target global model is determined by the evaluation metrics of M candidate global models on a multimedia verification dataset; the M candidate global models are determined by the M parameter fusion manners associated with a local model parameter set and the local model parameter set; the local model parameter set is determined based on the local model parameters respectively uploaded by the N clients; N is a positive integer greater than 1; and M is a positive integer.
The target global model receiving module 22 is configured to receive the target global model returned by the service device, update the parameters of the target local recognition model according to the target global model, and determine the parameter-updated target local recognition model as an object recognition model. The object recognition model is used to recognize objects of the target object type contained in multimedia data.
For the specific function implementations of the model parameter uploading module 21 and the target global model receiving module 22, refer to step S101 and steps S105 to S106 in the embodiment corresponding to FIG. 3 above; details are not repeated here.
In some feasible implementations, the data processing apparatus 2 may further include: a feature extraction module 23, a loss function determining module 24, and a training iteration counting module 25.
The feature extraction module 23 is configured to obtain multimedia sample data, input the multimedia sample data into the target local recognition model, and output, through the target local recognition model, object spatial features corresponding to the multimedia sample data.
The loss function determining module 24 is configured to determine the training loss function corresponding to the target local recognition model according to the object spatial features and the label information corresponding to the multimedia sample data.
The training iteration counting module 25 is configured to determine the training gradient of the target local recognition model according to the training loss function, update the parameters of the target local recognition model according to the training gradient and the training learning rate corresponding to the target local recognition model, and count the number of training iterations of the target local recognition model.
For the specific function implementations of the feature extraction module 23, the loss function determining module 24, and the training iteration counting module 25, refer to step S101 in the embodiment corresponding to FIG. 3 above; details are not repeated here.
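The client-side loop of modules 23 through 25 (forward pass, loss, gradient, parameter update, iteration count) can be sketched with a deliberately tiny stand-in: a single-weight "model" trained with a squared-error loss rather than the patent's feature extractor and recognition loss, which are not specified here.

```python
def train_step(weight, sample, label, learning_rate):
    """One local training iteration: forward pass, training loss,
    training gradient, SGD parameter update."""
    prediction = weight * sample              # stand-in 'feature extraction'
    loss = (prediction - label) ** 2          # training loss vs. label info
    grad = 2 * (prediction - label) * sample  # training gradient
    return weight - learning_rate * grad, loss

weight, steps = 0.0, 0
for _ in range(50):  # the client counts its training iterations
    weight, loss = train_step(weight, sample=1.0, label=2.0, learning_rate=0.1)
    steps += 1
# When `steps` reaches the synchronization period, the client uploads the
# current parameters (`weight` here) to the service device.
```

The iteration counter is the piece the synchronization mechanism depends on: the upload is triggered by the count, not by the loss value.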
In some feasible implementations, the multimedia data includes a to-be-recognized face image, and the target object type includes a face type.
The data processing apparatus 2 may further include: a face feature extraction module 26 and a face classification module 27.
The face feature extraction module 26 is configured to obtain the to-be-recognized face image, input the to-be-recognized face image into the object recognition model, and output, through the object recognition model, face spatial features corresponding to the to-be-recognized face image.
The face classification module 27 is configured to determine, according to the face spatial features, a face classification result corresponding to the to-be-recognized face image. The face classification result is used to represent the identity verification result of the face-type object contained in the to-be-recognized face image.
For the specific function implementations of the face feature extraction module 26 and the face classification module 27, refer to step S106 in the embodiment corresponding to FIG. 3 above; details are not repeated here.
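A common way to turn face spatial features into a verification result, consistent with the similarity-based evaluation described earlier, is cosine similarity against a stored reference feature. The sketch below is illustrative only: the 2-dimensional feature vectors and the 0.9 decision threshold are hypothetical, not values given by the embodiment.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical face spatial features: one from the live capture, one
# extracted earlier from the stored certificate image.
live_feature = [0.6, 0.8]
certificate_feature = [0.8, 0.6]
THRESHOLD = 0.9  # assumed decision threshold
passed = cosine_similarity(live_feature, certificate_feature) > THRESHOLD
```

Verification passes when the similarity clears the threshold, mirroring the "certificate image matches the face recognition result" decision in the FIG. 7 scenario.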
Referring to FIG. 10, FIG. 10 is a schematic structural diagram of a computer device provided by an embodiment of this application. As shown in FIG. 10, the computer device 1000 may include: a processor 1001, a network interface 1004, and a memory 1005, and may further include: a user interface 1003 and at least one communication bus 1002. The communication bus 1002 is used to implement connection and communication between these components. The user interface 1003 may include a display and a keyboard, and optionally may further include standard wired and wireless interfaces. Optionally, the network interface 1004 may include standard wired and wireless interfaces (such as a Wi-Fi interface). The memory 1005 may be a high-speed RAM, or a non-volatile memory, for example, at least one disk memory. Optionally, the memory 1005 may also be at least one storage apparatus located away from the processor 1001. As shown in FIG. 10, the memory 1005, as a computer-readable storage medium, may include an operating system, a network communication module, a user interface module, and a device control application.
In the computer device 1000 shown in FIG. 10, the network interface 1004 may provide network communication functions; the user interface 1003 is mainly used to provide an input interface for the user; and the processor 1001 may be used to invoke the device control application stored in the memory 1005 to implement:
obtaining local model parameters respectively corresponding to N local recognition models, where the N local recognition models are obtained through independent training by N clients respectively, each client includes multimedia sample data used for training its associated local recognition model, the multimedia sample data contains objects of a target object type, and N is a positive integer greater than 1;
obtaining M parameter fusion manners associated with a local model parameter set, and performing parameter fusion on the local model parameter set according to each parameter fusion manner to obtain M candidate global models, where the local model parameter set is determined based on the local model parameters respectively corresponding to the N local recognition models, and M is a positive integer; and
obtaining evaluation metrics of the M candidate global models on a multimedia verification dataset, determining a target global model among the M candidate global models according to the evaluation metrics, and transmitting the target global model to the N clients, so that the N clients each update the parameters of their associated local recognition models according to the target global model to obtain object recognition models, where the object recognition model is used to recognize objects of the target object type contained in multimedia data.
It is to be understood that the computer device 1000 described in the embodiments of this application can perform the description of the data processing method in the embodiment corresponding to FIG. 3 above, and can also perform the description of the data processing apparatus 1 in the embodiment corresponding to FIG. 8 above; details are not repeated here. In addition, the description of the beneficial effects of using the same method is not repeated either.
Referring to FIG. 11, FIG. 11 is a schematic structural diagram of a computer device provided by an embodiment of this application. As shown in FIG. 11, the computer device 2000 may include: a processor 2001, a network interface 2004, and a memory 2005, and may further include: a user interface 2003 and at least one communication bus 2002. The communication bus 2002 is used to implement connection and communication between these components. The user interface 2003 may include a display and a keyboard, and optionally may further include standard wired and wireless interfaces. Optionally, the network interface 2004 may include standard wired and wireless interfaces (such as a Wi-Fi interface). The memory 2005 may be a high-speed RAM, or a non-volatile memory, for example, at least one disk memory. Optionally, the memory 2005 may also be at least one storage apparatus located away from the processor 2001. As shown in FIG. 11, the memory 2005, as a computer-readable storage medium, may include an operating system, a network communication module, a user interface module, and a device control application.
In the computer device 2000 shown in FIG. 11, the network interface 2004 may provide network communication functions; the user interface 2003 is mainly used to provide an input interface for the user; and the processor 2001 may be used to invoke the device control application stored in the memory 2005 to implement:
in response to the number of training iterations of a target local recognition model reaching the synchronization period, uploading local model parameters corresponding to the target local recognition model to a service device, so that the service device obtains a target global model based on local model parameters respectively uploaded by N clients, where the local model parameters respectively uploaded by the N clients include the local model parameters corresponding to the target local recognition model, the target global model is determined by the evaluation metrics of M candidate global models on a multimedia verification dataset, the M candidate global models are determined by the M parameter fusion manners associated with a local model parameter set and the local model parameter set, the local model parameter set is determined based on the local model parameters respectively uploaded by the N clients, N is a positive integer greater than 1, and M is a positive integer; and
receiving the target global model returned by the service device, updating the parameters of the target local recognition model according to the target global model, and determining the parameter-updated target local recognition model as an object recognition model, where the object recognition model is used to recognize objects of the target object type contained in multimedia data.
It is to be understood that the computer device 2000 described in the embodiments of this application can perform the description of the data processing method in the embodiment corresponding to FIG. 3 above, and can also perform the description of the data processing apparatus 2 in the embodiment corresponding to FIG. 9 above; details are not repeated here. In addition, the description of the beneficial effects of using the same method is not repeated either.
In addition, it is to be noted that an embodiment of this application further provides a non-transitory computer-readable storage medium, which stores the computer programs executed by the data processing apparatus 1 and the data processing apparatus 2 mentioned above. The computer programs include program instructions; when a processor executes the program instructions, it can perform the description of the data processing method in the embodiment corresponding to FIG. 3 above, and details are therefore not repeated here. In addition, the description of the beneficial effects of using the same method is not repeated either. For technical details not disclosed in the non-transitory computer-readable storage medium embodiments of this application, refer to the descriptions of the method embodiments of this application. As an example, the program instructions may be deployed and executed on one computing device, on multiple computing devices located at one site, or on multiple computing devices distributed across multiple sites and interconnected by a communication network; the multiple computing devices distributed across multiple sites and interconnected by a communication network may form a blockchain system.
In addition, it is to be noted that an embodiment of this application further provides a computer program product or computer program, which may include computer instructions that may be stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, so that the computer device performs the description of the data processing method in the embodiment corresponding to FIG. 3 above; details are therefore not repeated here. In addition, the description of the beneficial effects of using the same method is not repeated either. For technical details not disclosed in the computer program product or computer program embodiments of this application, refer to the descriptions of the method embodiments of this application.
It is to be noted that, for ease of description, the foregoing method embodiments are each expressed as a series of action combinations. However, those skilled in the art should know that this application is not limited by the described action order, because some steps may be performed in other orders or simultaneously according to this application. In addition, those skilled in the art should also know that the embodiments described in this specification are all optional embodiments, and the actions and modules involved are not necessarily required by this application. The steps in the methods of the embodiments of this application may be reordered, combined, and deleted according to actual needs. The modules in the apparatuses of the embodiments of this application may be combined, divided, and deleted according to actual needs.
Those of ordinary skill in the art may understand that all or part of the processes in the methods of the foregoing embodiments may be implemented by a computer program instructing relevant hardware. The computer program may be stored in a computer-readable storage medium, and when executed, may include the processes of the embodiments of the foregoing methods. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
What is disclosed above is merely optional embodiments of this application, and is certainly not intended to limit the scope of the claims of this application. Therefore, equivalent changes made in accordance with the claims of this application still fall within the scope covered by this application.

Claims (19)

  1. A data processing method, wherein the method is performed by a service device and comprises:
    obtaining local model parameters respectively corresponding to N local recognition models, wherein the N local recognition models are obtained through independent training by N clients respectively, each client comprises multimedia sample data used for training its associated local recognition model, the multimedia sample data contains objects of a target object type, and N is a positive integer greater than 1;
    obtaining M parameter fusion manners associated with a local model parameter set, and performing parameter fusion on the local model parameter set according to each parameter fusion manner to obtain M candidate global models, wherein the local model parameter set is determined based on the local model parameters respectively corresponding to the N local recognition models, and M is a positive integer; and
    obtaining evaluation metrics of the M candidate global models on a multimedia verification dataset, determining a target global model among the M candidate global models according to the evaluation metrics, and transmitting the target global model to the N clients, so that the N clients each update parameters of their associated local recognition models according to the target global model to obtain object recognition models, wherein the object recognition model is used to recognize objects of the target object type contained in multimedia data.
  2. The method according to claim 1, wherein each parameter fusion manner is implemented based on one weight combination, and the obtaining M parameter fusion manners associated with a local model parameter set, and performing parameter fusion on the local model parameter set according to each parameter fusion manner to obtain M candidate global models comprises:
    obtaining M weight combinations associated with the local model parameter set, and performing parameter fusion on the local model parameter set according to each weight combination to obtain the M candidate global models, wherein each weight combination comprises training influence weights respectively corresponding to the local model parameters in the local model parameter set.
  3. The method according to claim 2, wherein the M weight combinations comprise a weight combination i, i being a positive integer less than or equal to M, and the performing parameter fusion on the local model parameter set according to each weight combination to obtain the M candidate global models comprises:
    performing a weighted average of the training influence weights contained in the weight combination i with the local model parameters contained in the local model parameter set to obtain fused model parameters, and determining the recognition model carrying the fused model parameters as a candidate global model i associated with the weight combination i.
  4. The method according to claim 2, wherein the determining a target global model among the M candidate global models according to the evaluation metrics comprises:
    determining, among the M candidate global models, the candidate global model corresponding to the largest evaluation metric as the target global model.
  5. The method according to claim 3, wherein the local model parameter set comprises the local model parameters respectively corresponding to the N local recognition models, and the obtaining M weight combinations associated with the local model parameter set comprises:
    sampling N numerical values within a target value range, and determining the sum of absolute values of the N numerical values as a norm value; and
    determining ratios of the N numerical values to the norm value as the weight combination i associated with the local model parameter set.
  6. The method according to claim 1, wherein the obtaining evaluation metrics of the M candidate global models on a multimedia verification dataset comprises:
    obtaining a multimedia verification dataset containing positive sample pairs and negative sample pairs, wherein a positive sample pair refers to a pair of multimedia sample data containing the same object, and a negative sample pair refers to a pair of multimedia sample data containing different objects;
    inputting the positive sample pairs into a candidate global model i among the M candidate global models, and outputting, through the candidate global model i, first object prediction results for the positive sample pairs, i being a positive integer less than or equal to M;
    inputting the negative sample pairs into the candidate global model i, and outputting, through the candidate global model i, second object prediction results for the negative sample pairs; and
    determining, according to the first object prediction results and the second object prediction results, the evaluation metric of the candidate global model i on the multimedia verification dataset.
  7. The method according to claim 6, wherein the determining, according to the first object prediction results and the second object prediction results, the evaluation metric of the candidate global model i on the multimedia verification dataset comprises:
    counting, according to the first object prediction results, a first number of correct predictions of the candidate global model i on the positive sample pairs, and counting, according to the second object prediction results, a second number of correct predictions of the candidate global model i on the negative sample pairs;
    determining the sum of the first number of correct predictions and the second number of correct predictions as a total number of correctly predicted sample pairs of the candidate global model i on the multimedia verification dataset; and
    obtaining a total number of sample pairs in the multimedia verification dataset, and determining, according to the ratio of the total number of correctly predicted sample pairs to the total number of sample pairs, the evaluation metric of the candidate global model i on the multimedia verification dataset.
  8. The method according to claim 7, wherein there are P multimedia verification datasets, the P multimedia verification datasets comprise a multimedia verification dataset j, P is a positive integer, and j is a positive integer less than or equal to P; and
    the determining, according to the ratio of the total number of correctly predicted sample pairs to the total number of sample pairs, the evaluation metric of the candidate global model i on the multimedia verification dataset comprises:
    determining the ratio of the total number of correctly predicted sample pairs of the candidate global model i on the multimedia verification dataset j to the total number of sample pairs in the multimedia verification dataset j as a prediction accuracy of the candidate global model i on the multimedia verification dataset j;
    obtaining the prediction accuracies of the candidate global model i on the P multimedia verification datasets, and computing an average accuracy of the P prediction accuracies and a standard deviation of the P prediction accuracies; and
    determining, according to the average accuracy and the standard deviation, the evaluation metric of the candidate global model i on the P multimedia verification datasets.
  9. The method according to claim 1, wherein the obtaining evaluation metrics of the M candidate global models on a multimedia verification dataset comprises:
    obtaining a multimedia verification dataset containing positive sample pairs and negative sample pairs, wherein a positive sample pair refers to a pair of multimedia sample data containing the same object, and a negative sample pair refers to a pair of multimedia sample data containing different objects;
    obtaining a false accept rate of a candidate global model i among the M candidate global models on the multimedia verification dataset, and determining a similarity threshold among similarities corresponding to the negative sample pairs, wherein the similarity threshold is determined by the number of negative sample pairs and the false accept rate, and i is a positive integer less than or equal to M; and
    obtaining similarities corresponding to the positive sample pairs, obtaining, among the positive sample pairs, first sample pairs whose similarity is greater than the similarity threshold, and determining the ratio of the number of first sample pairs to the number of positive sample pairs as the evaluation metric of the candidate global model i on the multimedia verification dataset.
  10. The method according to claim 9, wherein the obtaining a false accept rate of a candidate global model i among the M candidate global models on the multimedia verification dataset comprises:
    obtaining a number of false predictions of the candidate global model i among the M candidate global models on the negative sample pairs; and
    determining the ratio of the number of false predictions to the number of negative sample pairs as the false accept rate of the candidate global model i on the multimedia verification dataset.
  11. The method according to claim 1, wherein the target global model is generated based on the N local model parameters corresponding to an r-th synchronization period, r being a positive integer; and
    the method further comprises:
    obtaining a historical global model corresponding to an (r-1)-th synchronization period, wherein the historical global model is generated based on the local model parameters uploaded by the N clients in the (r-1)-th synchronization period;
    obtaining a training learning rate of the N local recognition models in the r-th synchronization period;
    obtaining a model parameter difference between the target global model and the historical global model; and
    determining the ratio of the model parameter difference to the training learning rate as a federated momentum, and sending the federated momentum to the N clients, wherein the federated momentum, together with the target global model, is used to instruct the N clients to update parameters of their associated local recognition models, and the federated momentum is used to indicate a training direction of each of the N local recognition models in its client.
  12. A data processing method, wherein the method is performed by a client and comprises:
    in response to a number of training iterations of a target local recognition model reaching a synchronization period, uploading local model parameters corresponding to the target local recognition model to a service device, so that the service device obtains a target global model based on local model parameters respectively uploaded by N clients, wherein the local model parameters respectively uploaded by the N clients comprise the local model parameters corresponding to the target local recognition model, the target global model is determined by evaluation metrics of M candidate global models on a multimedia verification dataset, the M candidate global models are determined by M parameter fusion manners associated with a local model parameter set and the local model parameter set, the local model parameter set is determined based on the local model parameters respectively uploaded by the N clients, N is a positive integer greater than 1, and M is a positive integer; and
    receiving the target global model returned by the service device, updating parameters of the target local recognition model according to the target global model, and determining the parameter-updated target local recognition model as an object recognition model, wherein the object recognition model is used to recognize objects of a target object type contained in multimedia data.
  13. The method according to claim 12, wherein the method further comprises:
    obtaining multimedia sample data, inputting the multimedia sample data into the target local recognition model, and outputting, through the target local recognition model, object spatial features corresponding to the multimedia sample data;
    determining, according to the object spatial features and label information corresponding to the multimedia sample data, a training loss function corresponding to the target local recognition model; and
    determining a training gradient of the target local recognition model according to the training loss function, updating parameters of the target local recognition model according to the training gradient and a training learning rate corresponding to the target local recognition model, and counting the number of training iterations of the target local recognition model.
  14. The method according to claim 12, wherein the multimedia data comprises a to-be-recognized face image, and the target object type comprises a face type; and
    the method further comprises:
    obtaining the to-be-recognized face image, inputting the to-be-recognized face image into the object recognition model, and outputting, through the object recognition model, face spatial features corresponding to the to-be-recognized face image; and
    determining, according to the face spatial features, a face classification result corresponding to the to-be-recognized face image, wherein the face classification result is used to represent an identity verification result of the face-type object contained in the to-be-recognized face image.
  15. A data processing apparatus, wherein the apparatus comprises:
    a parameter obtaining module, configured to obtain local model parameters respectively corresponding to N local recognition models, wherein the N local recognition models are obtained through independent training by N clients respectively, each client comprises multimedia sample data used for training its associated local recognition model, the multimedia sample data contains objects of a target object type, and N is a positive integer greater than 1;
    a parameter fusion module, configured to obtain M parameter fusion manners associated with a local model parameter set, and perform parameter fusion on the local model parameter set according to each parameter fusion manner to obtain M candidate global models, wherein the local model parameter set is determined based on the local model parameters respectively corresponding to the N local recognition models, and M is a positive integer; and
    a model determining module, configured to obtain evaluation metrics of the M candidate global models on a multimedia verification dataset, determine a target global model among the M candidate global models according to the evaluation metrics, and transmit the target global model to the N clients, so that the N clients each update parameters of their associated local recognition models according to the target global model to obtain object recognition models, wherein the object recognition model is used to recognize objects of the target object type contained in multimedia data.
  16. A data processing apparatus, wherein the apparatus comprises:
    a model parameter uploading module, configured to, in response to a number of training iterations of a target local recognition model reaching a synchronization period, upload local model parameters corresponding to the target local recognition model to a service device, so that the service device obtains a target global model based on local model parameters respectively uploaded by N clients, wherein the local model parameters respectively uploaded by the N clients comprise the local model parameters corresponding to the target local recognition model, the target global model is determined by evaluation metrics of M candidate global models on a multimedia verification dataset, the M candidate global models are determined by M parameter fusion manners associated with a local model parameter set and the local model parameter set, the local model parameter set is determined based on the local model parameters respectively uploaded by the N clients, N is a positive integer greater than 1, and M is a positive integer; and
    a target global model receiving module, configured to receive the target global model returned by the service device, update parameters of the target local recognition model according to the target global model, and determine the parameter-updated target local recognition model as an object recognition model, wherein the object recognition model is used to recognize objects of the target object type contained in multimedia data.
  17. A computer device, wherein the computer device comprises a memory and a processor;
    the memory is connected to the processor; the memory is configured to store a computer program; and the processor is configured to invoke the computer program, so that the computer device performs the method according to any one of claims 1 to 14.
  18. A non-transitory computer-readable storage medium, wherein the non-transitory computer-readable storage medium stores a computer program, and the computer program is adapted to be loaded and executed by a processor, so that a computer device having the processor performs the method according to any one of claims 1 to 14.
  19. A computer program product, wherein the computer program product comprises computer instructions stored in a computer-readable storage medium; a processor of a computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device performs the method according to any one of claims 1 to 14.
PCT/CN2021/108748 2021-04-15 2021-07-27 Data processing method, apparatus, device, and medium WO2022217781A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/128,719 US20230237326A1 (en) 2021-04-15 2023-03-30 Data processing method and apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110407285.0 2021-04-15
CN202110407285.0A CN114676853A (zh) Data processing method, apparatus, device, and medium

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/128,719 Continuation US20230237326A1 (en) 2021-04-15 2023-03-30 Data processing method and apparatus

Publications (1)

Publication Number Publication Date
WO2022217781A1 true WO2022217781A1 (zh) 2022-10-20

Family

ID=82070532

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/108748 WO2022217781A1 (zh) 2021-04-15 2021-07-27 数据处理方法、装置、设备以及介质

Country Status (3)

Country Link
US (1) US20230237326A1 (zh)
CN (1) CN114676853A (zh)
WO (1) WO2022217781A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117474127A (zh) * 2023-12-27 2024-01-30 苏州元脑智能科技有限公司 Distributed machine learning model training system, method and apparatus, and electronic device

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115761850B (zh) * 2022-11-16 2024-03-22 智慧眼科技股份有限公司 Face recognition model training method, face recognition method and apparatus, and storage medium
CN115828022B (zh) * 2023-02-21 2023-04-25 中国电子科技集团公司第十五研究所 Data identification method, federated training model, apparatus, and device
CN116522228B (zh) * 2023-04-28 2024-02-06 哈尔滨工程大学 Radio-frequency fingerprint identification method based on feature-imitation federated learning
CN116862269B (zh) * 2023-09-04 2023-11-03 中国标准化研究院 Method for evaluating the precision of rapid detection methods using big data
CN117114821A (zh) * 2023-10-23 2023-11-24 湖南快乐阳光互动娱乐传媒有限公司 Item recommendation method and apparatus, storage medium, and electronic device
CN117370472B (zh) * 2023-12-07 2024-02-27 苏州元脑智能科技有限公司 Data processing method, apparatus, device, and storage medium

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102238555A (zh) * 2011-07-18 2011-11-09 南京邮电大学 Multi-user dynamic spectrum access method based on cooperative learning in cognitive radio
CN108490388A (zh) * 2018-03-13 2018-09-04 同济大学 Multi-source joint indoor positioning method based on UWB and VLC technologies
CN108763362A (zh) * 2018-05-17 2018-11-06 浙江工业大学 Local-model weighted-fusion Top-N movie recommendation method based on random anchor-point pair selection
US20190227980A1 (en) * 2018-01-22 2019-07-25 Google Llc Training User-Level Differentially Private Machine-Learned Models
US20200027033A1 (en) * 2018-07-19 2020-01-23 Adobe Inc. Updating Machine Learning Models On Edge Servers
CN110874484A (zh) * 2019-10-16 2020-03-10 众安信息技术服务有限公司 Data processing method and system based on neural network and federated learning
CN110874648A (zh) * 2020-01-16 2020-03-10 支付宝(杭州)信息技术有限公司 Federated model training method and system, and electronic device
CN110874637A (zh) * 2020-01-16 2020-03-10 支付宝(杭州)信息技术有限公司 Multi-objective fusion learning method, apparatus, and system based on private data protection
CN112365007A (zh) * 2020-11-11 2021-02-12 深圳前海微众银行股份有限公司 Model parameter determination method, apparatus, device, and storage medium
CN112651511A (zh) * 2020-12-04 2021-04-13 华为技术有限公司 Model training method, data processing method, and apparatus


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117474127A (zh) * 2023-12-27 2024-01-30 苏州元脑智能科技有限公司 Distributed machine learning model training system, method and apparatus, and electronic device
CN117474127B (zh) * 2023-12-27 2024-03-26 苏州元脑智能科技有限公司 Distributed machine learning model training system, method and apparatus, and electronic device

Also Published As

Publication number Publication date
US20230237326A1 (en) 2023-07-27
CN114676853A (zh) 2022-06-28

Similar Documents

Publication Publication Date Title
WO2022217781A1 (zh) 2022-10-20 Data processing method, apparatus, device, and medium
Nguyen et al. Federated learning for internet of things: A comprehensive survey
US10083186B2 (en) System and method for large scale crowdsourcing of map data cleanup and correction
US10360482B1 (en) Crowd-sourced artificial intelligence image processing services
CN110830448B (zh) Traffic anomaly detection method and apparatus for a target event, electronic device, and medium
CN112712182A (zh) Federated learning-based model training method and apparatus, and storage medium
US20220044120A1 (en) Synthesizing a singular ensemble machine learning model from an ensemble of models
KR20160083900A (ko) Systems and methods for facial representation
CN113065843B (zh) Model processing method and apparatus, electronic device, and storage medium
CN114332984B (zh) Training data processing method and apparatus, and storage medium
CN112799708B (zh) Method and system for jointly updating a service model
CN110298240B (zh) Automobile user identification method, apparatus, system, and storage medium
CN113011883A (zh) Data processing method, apparatus, device, and storage medium
CN114205690A (zh) Traffic prediction and model training method and apparatus, electronic device, and storage medium
US20200193168A1 (en) Shop platform using blockchain
CN112116103A (zh) Federated learning-based personal qualification evaluation method, apparatus, system, and storage medium
CN110874638B (zh) Behavior analysis-oriented meta-knowledge federation method and apparatus, electronic device, and system
CN107256231B (zh) Team member identification device, method, and system
CN114004639B (zh) Method and apparatus for recommending preferential information, computer device, and storage medium
US20230281462A1 (en) Data processing method and apparatus, device, and medium
CN116681045A (zh) Report generation method and apparatus, computer device, and storage medium
WO2022089220A1 (zh) Image data processing method and apparatus, device, storage medium, and product
CN112258009B (zh) Smart government-affairs request processing method
WO2019143360A1 (en) Data security using graph communities
CN113642519A (zh) Face recognition system and face recognition method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21936653

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21936653

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 25.03.2024)