CN113240035A - Data processing method, device and equipment

Data processing method, device and equipment

Info

Publication number
CN113240035A
Authority
CN
China
Prior art keywords: target, original, network, trained, subnetwork
Prior art date
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Application number
CN202110587053.8A
Other languages
Chinese (zh)
Inventor
宋旭鸣 (Song Xuming)
Current Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co Ltd
Priority to CN202110587053.8A
Publication of CN113240035A

Classifications

    • G06F18/214 Pattern recognition — generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F18/24 Pattern recognition — classification techniques
    • G06F21/6245 Security arrangements — protecting personal data, e.g. for financial or medical purposes
    • G06N3/045 Neural networks — combinations of networks
    • G06N3/048 Neural networks — activation functions
    • G06N3/084 Learning methods — backpropagation, e.g. using gradient descent


Abstract

The application provides a data processing method, apparatus and device, wherein the method comprises the following steps: inputting training data from an original data set into a first original sub-network of an original trained model to obtain a first original feature; inputting training data from a target data set into a first target sub-network of a target model to be trained to obtain a first target feature, wherein the target model to be trained further comprises a second target sub-network; inputting the first original feature and the first target feature into the second target sub-network of the target model to be trained to obtain a second target feature; training the first target sub-network based on the first target feature, training the second target sub-network based on the second target feature, and obtaining a target trained model based on the trained first target sub-network and the trained second target sub-network. With this technical solution, the training images in the original data set do not need to be re-labeled, which reduces the re-labeling workload and saves labeling resources.

Description

Data processing method, device and equipment
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to a data processing method, apparatus, and device.
Background
Machine learning is one way to realize artificial intelligence. It is a multidisciplinary field that draws on probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory, and other subjects. Machine learning studies how computers can simulate or implement human learning behaviors to acquire new knowledge or skills and reorganize existing knowledge structures to improve their performance. It focuses on algorithm design, so that a computer can automatically learn rules from data and use those rules to make predictions on unseen data. Machine learning has found a wide variety of applications, such as deep learning, data mining, computer vision, natural language processing, biometric recognition, search engines, medical diagnosis, speech recognition, and handwriting recognition.
In order to implement artificial intelligence processing by machine learning, a data set can be constructed, a machine learning model can be trained on the data set, and artificial intelligence processing can then be carried out by the trained model. The data set includes a plurality of training data, each training data including a training image and a label class representing the class of that training image. For example, if the machine learning model is used to detect a target class that is either class A or class B, the label class indicates whether the class of the training image is class A or class B.
When the machine learning model is applied to an actual business, the target classes detected by the model may change as the business changes. For example, when the business starts, the model detects class A or class B; after the business has been deployed for a period of time, the business requirements change, and the model must detect class A, class B, or class C.
Obviously, when the target classes change, the machine learning model can no longer obtain good performance and needs to be retrained. To retrain the machine learning model, all training images in the data set need to be re-labeled, that is, the label class of each training image must be re-annotated as class A, class B, or class C.
Because the data set contains a large number of training images, re-labeling all of them involves a heavy workload and consumes substantial labeling resources.
Disclosure of Invention
The application provides a data processing method, which comprises the following steps:
inputting training data from an original data set into a first original sub-network of an original trained model to obtain a first original feature; wherein the original trained model further comprises a second original sub-network, the first original sub-network comprises the first M network layers of the original trained model, and the second original sub-network comprises the remaining network layers of the original trained model other than the first M network layers;
inputting training data from a target data set into a first target sub-network of a target model to be trained to obtain a first target feature; wherein the target model to be trained further comprises a second target sub-network, the first target sub-network comprises the first M network layers of the target model to be trained, and the second target sub-network comprises the remaining network layers of the target model to be trained other than the first M network layers;
inputting the first original feature and the first target feature into the second target sub-network of the target model to be trained, the second target sub-network outputting a second target feature;
training the first target sub-network based on the first target feature, training the second target sub-network based on the second target feature, and obtaining a target trained model based on the trained first target sub-network and the trained second target sub-network; wherein the target trained model is used to implement data processing.
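As an illustrative sketch (not the patent's implementation), the feature flow of these steps can be mimicked with toy linear layers in plain Python; the layer sizes, the use of concatenation to combine the first original feature with the first target feature, and all names are assumptions:

```python
import random

random.seed(0)

def linear(xs, w):
    """Apply a toy linear layer: each row of xs times weight matrix w."""
    return [[sum(x_i * w_ij for x_i, w_ij in zip(x, col)) for col in zip(*w)]
            for x in xs]

def rand_matrix(rows, cols):
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

# Hypothetical single linear layers standing in for the sub-networks.
w_first_original = rand_matrix(6, 4)  # "first M layers" of the original trained model (frozen)
w_first_target = rand_matrix(6, 4)    # "first M layers" of the target model (trainable)
w_second_target = rand_matrix(8, 3)   # "remaining layers" of the target model (trainable)

x_original = rand_matrix(2, 6)  # batch from the original data set
x_target = rand_matrix(2, 6)    # batch from the target data set

first_original_feature = linear(x_original, w_first_original)  # step 1
first_target_feature = linear(x_target, w_first_target)        # step 2
# Combine both features before the second target sub-network
# (concatenation here is an assumption, not stated by the patent).
combined = [a + b for a, b in zip(first_original_feature, first_target_feature)]
second_target_feature = linear(combined, w_second_target)      # step 3

print(len(second_target_feature), len(second_target_feature[0]))  # 2 3
```

The point of the sketch is the wiring: the second target sub-network consumes features from both first sub-networks, so the original data itself never has to enter target training.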
In one possible implementation, the number of original trained models is N, the N original trained models correspond one-to-one to N original data sets, and the first original features input to the second target subnetwork are the N first original features corresponding to the N original trained models;
wherein, for each original trained model, the training process of that original trained model comprises:
inputting training data from the original data set corresponding to the original trained model into the original model to be trained to obtain a second original feature; and training the original model to be trained based on the second original feature to obtain the original trained model; wherein the original trained model is used to implement data processing.
In one possible implementation, the training data in the original data set includes an original training image and the original label category corresponding to that original training image; the training data in the target data set includes a target training image and the target label category corresponding to that target training image;
the target category set corresponding to the target data set is the union of the N original category sets corresponding to the N original data sets, where the N original data sets correspond one-to-one to the N original category sets;
the target category set includes all target label categories in the target data set;
each original category set includes all original label categories in its original data set.
In one possible embodiment, the training a first target subnetwork based on the first target feature, training a second target subnetwork based on the second target feature, and obtaining a target trained model based on the trained first target subnetwork and the trained second target subnetwork includes:
determining whether the first target subnetwork has converged; if not, adjusting the parameters of the first target sub-network based on the first target feature to obtain an adjusted sub-network, determining the adjusted sub-network as the first target sub-network, and returning to perform the operation of inputting the training data in the target data set to the first target sub-network; if so, determining the first target sub-network as a first converged target sub-network;
determining whether the second target subnetwork has converged; if not, adjusting the parameters of the second target sub-network based on the second target feature to obtain an adjusted sub-network, determining the adjusted sub-network as the second target sub-network, and returning to execute the operation of inputting the first original feature and the first target feature into the second target sub-network; if so, determining the second target subnetwork as a second converged target subnetwork;
a target trained model is obtained based on the first converged target sub-network and the second converged target sub-network.
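A minimal sketch of this converge-or-adjust loop, using a toy one-parameter "sub-network" and a hypothetical convergence condition based on a loss threshold (none of the names or values come from the patent):

```python
def train_subnetwork(param, feature, lr=0.1, tol=1e-4, max_steps=1000):
    """Toy stand-in for 'adjust parameters until the sub-network converges'.
    The 'loss' is the squared gap between param and a target value derived
    from the feature (a hypothetical stand-in for a real training loss)."""
    target = sum(feature) / len(feature)
    for step in range(max_steps):
        loss = (param - target) ** 2
        if loss < tol:         # convergence condition (assumed)
            return param, step  # determine the sub-network as converged
        grad = 2 * (param - target)
        param -= lr * grad      # adjust the parameters, then return to
                                # perform the input operation again
    return param, max_steps

param, steps = train_subnetwork(0.0, [1.0, 2.0, 3.0])
print(round(param, 1))  # 2.0
```

In the patent's scheme this loop runs once per sub-network: the first target sub-network is adjusted using the first target feature, the second using the second target feature, and the two converged sub-networks are then assembled into the target trained model.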
In one possible embodiment, determining whether the first target subnetwork has converged and determining whether the second target subnetwork has converged includes: if the first target subnetwork meets the convergence condition and the second target subnetwork meets the convergence condition, determining that the first target subnetwork has converged and that the second target subnetwork has converged; or, if the first target subnetwork meets the convergence condition and the second target subnetwork does not meet the convergence condition, determining that the first target subnetwork has converged and that the second target subnetwork has not converged; or, if the first target subnetwork does not meet the convergence condition and the second target subnetwork meets the convergence condition, determining that the first target subnetwork has not converged and that the second target subnetwork has converged.
In a possible implementation, after the target trained model is obtained based on the trained first target subnetwork and the trained second target subnetwork, the method further includes:
if the target trained model is used to implement target detection: after data to be processed is obtained, the data to be processed is input into the target trained model, the target trained model performs detection processing on the data to obtain a detection result, and target detection is carried out based on the detection result;
if the target trained model is used to implement target classification: after data to be processed is obtained, the data to be processed is input into the target trained model, the target trained model classifies the data to obtain a classification result, and target classification is carried out based on the classification result.
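This detection-versus-classification branch can be sketched as a small dispatcher; the callable stub model, the task names, and the result dictionaries are all illustrative assumptions, not the patent's interface:

```python
def run_inference(target_trained_model, task, data_to_be_processed):
    """Feed data to the target trained model and branch on its task.
    'task' and the callable model are hypothetical stand-ins."""
    result = target_trained_model(data_to_be_processed)
    if task == "detection":
        return {"detection_result": result}        # then perform target detection
    if task == "classification":
        return {"classification_result": result}   # then perform target classification
    raise ValueError(f"unsupported task: {task}")

# A stub "target trained model" that labels positive numbers as class A.
stub_model = lambda x: "A" if x > 0 else "B"
print(run_inference(stub_model, "classification", 3.5))  # {'classification_result': 'A'}
```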
In one possible implementation, the network structures of the first M network layers in the original trained model are the same as the network structures of the first M network layers in the target trained model;
if N original trained models exist, and N is a positive integer greater than 1, the network structures of the first M network layers in the N original trained models are the same.
The application proposes a data processing apparatus, said apparatus comprising:
the acquisition module is used for inputting training data from the original data set into a first original sub-network of an original trained model to obtain a first original feature; wherein the original trained model further comprises a second original sub-network, the first original sub-network comprises the first M network layers of the original trained model, and the second original sub-network comprises the remaining network layers of the original trained model other than the first M network layers;
inputting training data from the target data set into a first target sub-network of the target model to be trained to obtain a first target feature; wherein the target model to be trained further comprises a second target sub-network, the first target sub-network comprises the first M network layers of the target model to be trained, and the second target sub-network comprises the remaining network layers of the target model to be trained other than the first M network layers;
inputting the first original feature and the first target feature into the second target sub-network of the target model to be trained, the second target sub-network outputting a second target feature;
a training module, configured to train the first target subnetwork based on the first target feature, train the second target subnetwork based on the second target feature, and obtain a target trained model based on the trained first target subnetwork and the trained second target subnetwork;
wherein the target trained model is used to implement data processing.
In a possible implementation, the training module is specifically configured to:
determining whether the first target subnetwork has converged; if not, adjusting the parameters of the first target sub-network based on the first target feature to obtain an adjusted sub-network, determining the adjusted sub-network as the first target sub-network, and returning to perform the operation of inputting the training data in the target data set to the first target sub-network; if so, determining the first target sub-network as a first converged target sub-network;
determining whether the second target subnetwork has converged; if not, adjusting the parameters of the second target sub-network based on the second target feature to obtain an adjusted sub-network, determining the adjusted sub-network as the second target sub-network, and returning to execute the operation of inputting the first original feature and the first target feature into the second target sub-network; if so, determining the second target subnetwork as a second converged target subnetwork;
a target trained model is obtained based on the first converged target sub-network and the second converged target sub-network.
The application proposes a data processing device comprising: a processor and a machine-readable storage medium storing machine-executable instructions executable by the processor;
the processor is configured to execute machine executable instructions to perform the steps of:
inputting training data from an original data set into a first original sub-network of an original trained model to obtain a first original feature; wherein the original trained model further comprises a second original sub-network, the first original sub-network comprises the first M network layers of the original trained model, and the second original sub-network comprises the remaining network layers of the original trained model other than the first M network layers;
inputting training data from a target data set into a first target sub-network of a target model to be trained to obtain a first target feature; wherein the target model to be trained further comprises a second target sub-network, the first target sub-network comprises the first M network layers of the target model to be trained, and the second target sub-network comprises the remaining network layers of the target model to be trained other than the first M network layers;
inputting the first original feature and the first target feature into the second target sub-network of the target model to be trained, the second target sub-network outputting a second target feature;
training the first target sub-network based on the first target feature, training the second target sub-network based on the second target feature, and obtaining a target trained model based on the trained first target sub-network and the trained second target sub-network; wherein the target trained model is used to implement data processing.
According to the above technical solution, an original trained model can be obtained by training on the original data set, artificial intelligence processing is implemented through the original trained model, and the target classes are detected. When the original trained model is applied to an actual business, if a business change causes the target classes detected by the original trained model to change, a target trained model is obtained by training on a target data set, artificial intelligence processing is implemented through the target trained model, and the changed target classes are detected. Because the target data set is different from the original data set, the training images in the original data set do not need to be re-labeled, that is, the label class of each training image does not need to be re-annotated; this reduces the re-labeling workload, saves labeling resources, and avoids the resource cost of re-annotation, and joint training is carried out with the target data set without supplementing the label classes of the original data set. The target data set may contain less training data than the original data set, that is, only a small amount of training data is needed to train the target trained model. The original features corresponding to the training data in the original data set participate in the training of the target trained model, which improves its performance, so the accuracy of the target trained model's intelligent analysis results is high. Because it is the original features, rather than the training data in the original data set, that participate in the training of the target trained model, sensitive data is kept out of the training process, the confidential information of the training data is hidden, and security is improved.
Drawings
FIG. 1 is a schematic flow chart diagram of a data processing method according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a training process in one embodiment of the present application;
FIG. 3 is a schematic diagram of a training process of an original trained model in one embodiment of the present application;
FIG. 4 is a schematic diagram illustrating a training process of a target trained model according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application.
Detailed Description
The terminology used in the embodiments of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this application and the claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein is meant to encompass any and all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used in the embodiments of the present application to describe various information, the information should not be limited to these terms. These terms are only used to distinguish one type of information from another. For example, without departing from the scope of the present application, first information may also be referred to as second information, and similarly, second information may also be referred to as first information. In addition, depending on the context, the word "if" as used herein may be interpreted as "when", "while", or "in response to determining".
Before the technical solutions of the present application are introduced, concepts related to the embodiments of the present application are introduced.
Machine learning: machine learning is a way to implement artificial intelligence, and is used to study how a computer simulates or implements human learning behaviors to acquire new knowledge or skills, and reorganize an existing knowledge structure to continuously improve its performance. Deep learning, which is a subclass of machine learning, is a process of modeling a specific problem in the real world using a mathematical model to solve similar problems in the field.
The models in the embodiments of the present application all refer to machine learning models, for example, the models may be deep learning models in the machine learning models, and certainly, the deep learning models are only an example, and are not limited thereto.
Taking the case where the model of the embodiment of the present application is a deep learning model, one example of a deep learning model is a neural network model. In the following embodiments, the neural network model is taken as an example to describe its structure and function; other subclasses of deep learning models are similar in structure and function to the neural network model.
For example, the model in the embodiment of the present application may be a detection model, that is, a model for implementing target detection, or a classification model, that is, a model for implementing target classification; the type of the model is not limited. The detection model may be a deep learning model for implementing target detection, such as a model for detecting the position of a target of a specified category in an image, for example yolov2, FRCNN, and the like. The classification model may be a deep learning model for implementing target classification, such as one for detecting the type of a specified target in an image.
A neural network model: neural network models include, but are not limited to, convolutional neural network models, recurrent neural network models, fully-connected network models, and the like. The structural units of a neural network model may include, but are not limited to, convolutional layers (Conv), pooling layers (Pool), excitation layers, fully-connected layers (FC), and the like. In practical applications, one or more convolutional layers, pooling layers, excitation layers, and fully-connected layers can be combined to construct a neural network model according to different requirements.
In the convolutional layer, the input data features are enhanced by performing a convolution operation on them with a convolution kernel. The convolution kernel may be an m × n matrix; convolving the input data features of the convolutional layer with the kernel yields the output data features of the convolutional layer. The convolution operation is in effect a filtering process.
In the pooling layer, operations such as taking the maximum, minimum, or average value are applied to the input data features (such as the output of a convolutional layer), so that the input data features are sub-sampled by exploiting the principle of local correlation, which reduces the processing load while keeping the features invariant. The pooling operation is in effect a down-sampling process.
In the excitation layer, the input data features can be mapped using an activation function (e.g., a nonlinear function), thereby introducing a nonlinear factor such that the neural network enhances expressive power through a combination of nonlinearities. The activation function may include, but is not limited to, a ReLU (Rectified Linear Unit) function that is used to set features less than 0 to 0, while features greater than 0 remain unchanged.
In the fully-connected layer, the fully-connected layer is configured to perform fully-connected processing on all data features input to the fully-connected layer, so as to obtain a feature vector, and the feature vector may include a plurality of data features.
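The four structural units above can be sketched in plain Python on a tiny input (toy sizes; the 2 × 2 kernel, pooling window, and weights are arbitrary illustrations, not values from the patent):

```python
def conv2d(image, kernel):
    """Valid 2-D convolution (really cross-correlation, as in most frameworks)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)] for i in range(out_h)]

def max_pool(feat, size=2):
    """Down-sample by taking the maximum over non-overlapping size×size windows."""
    return [[max(feat[i + a][j + b] for a in range(size) for b in range(size))
             for j in range(0, len(feat[0]) - size + 1, size)]
            for i in range(0, len(feat) - size + 1, size)]

def relu(feat):
    """Set features less than 0 to 0; features greater than 0 stay unchanged."""
    return [[max(0, v) for v in row] for row in feat]

def fully_connected(feat, weights):
    """Flatten the feature map and map it to a feature vector."""
    flat = [v for row in feat for v in row]
    return [sum(x * w for x, w in zip(flat, col)) for col in zip(*weights)]

image = [[1, 2, 0, 1],
         [0, 1, 3, 2],
         [2, 0, 1, 0],
         [1, 1, 0, 2]]
kernel = [[1, 0],
          [0, 1]]

feat = relu(max_pool(conv2d(image, kernel)))
print(feat)  # [[5]]
vec = fully_connected(feat, [[0.5, -1.0]])
print(vec)   # [2.5, -5.0]
```

The pipeline order (convolution, then pooling, then excitation, then full connection) mirrors the combination of structural units described above.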
In order to implement artificial intelligence processing, an original data set may be constructed, an original trained model may be trained on the original data set, and artificial intelligence processing may be implemented using the original trained model. The original data set includes a large amount of training data, each training data including a training image and a label class representing the class of that training image. For example, if the original trained model is used to detect a target class that is either class A or class B, the label class indicates whether the training image is class A or class B.
When the original trained model is applied to an actual business, the classes of targets detected by the original trained model may change as the business changes. For example, after the business requirements change, the target class to be detected is class A, class B, or class C. Obviously, the original trained model cannot detect class C, that is, it cannot obtain an accurate detection result, and the model needs to be retrained. To retrain the model, all training images in the original data set would need to be re-labeled, that is, the label class of each training image would need to be re-annotated as class A, class B, or class C. Clearly, re-labeling the large number of training images in the original data set involves a heavy workload.
In view of the above, an embodiment of the present application provides a data processing method in which an original trained model is obtained by training on an original data set and a target trained model is obtained by training on a target data set. Because the target data set is different from the original data set, the training images in the original data set do not need to be re-labeled, that is, the label class of each training image does not need to be re-annotated; this reduces the re-labeling workload, and joint training is carried out with the target data set without supplementing the label classes of the original data set.
The technical solutions of the embodiments of the present application are described below with reference to specific embodiments.
Referring to fig. 1, a flow chart of a data processing method is schematically shown, where the method may include:
step 101, inputting training data in an original data set to a first original subnetwork of an original trained model to obtain a first original feature. For example, the original trained model may further include a second original subnetwork including the first M network layers in the original trained model, the second original subnetwork including the remaining network layers in the original trained model except the first M network layers.
For example, the number of original trained models may be N, where N is a positive integer, and the N original trained models correspond one-to-one to N original data sets. Before step 101, each original model to be trained may be trained on the training data of its original data set to obtain the corresponding original trained model. For example, original model to be trained 1 is trained on the training data in original data set 1 to obtain original trained model 1; original model to be trained 2 is trained on the training data in original data set 2 to obtain original trained model 2; and so on.
For example, for each original trained model, the training process of the original trained model may include: inputting training data in an original data set corresponding to the original trained model to the original model to be trained to obtain a second original characteristic; and training the original model to be trained based on the second original characteristic to obtain an original trained model, wherein the original trained model is used for realizing data processing.
In the process of training the original model to be trained based on the second original feature to obtain the original trained model, it may be determined whether the original model to be trained has converged. If not, the parameters of the original model to be trained are adjusted based on the second original feature to obtain an adjusted model, the adjusted model is determined as the original model to be trained, and the operation of inputting the training data in the original data set to the original model to be trained is performed again. If yes, the original model to be trained is determined as the original trained model.
Step 102, inputting training data in the target data set to a first target subnetwork of the target model to be trained to obtain a first target feature. For example, the target model to be trained may further include a second target subnetwork, the first target subnetwork including the first M network layers in the target model to be trained, and the second target subnetwork including the remaining network layers in the target model to be trained except the first M network layers.
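As an illustrative sketch of the sub-network split described in steps 101 and 102, splitting a model's ordered layer list into the first M layers and the remainder might look like the following (plain Python standing in for a real deep-learning framework; the helper name is hypothetical):

```python
def split_model(layers, m):
    """Split an ordered list of network layers into a first sub-network
    (the first m layers) and a second sub-network (the remaining layers)."""
    if not 0 < m < len(layers):
        raise ValueError("m must be between 1 and len(layers) - 1")
    return layers[:m], layers[m:]

# Hypothetical 5-layer model split at M = 2.
layers = ["layer1", "layer2", "layer3", "layer4", "layer5"]
first_sub, second_sub = split_model(layers, 2)
```

The same split is applied to both the original trained model and the target model to be trained, so that the two first sub-networks have the same depth M.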
For example, the training data in the original data set may include an original training image and the original label class corresponding to the original training image, and the training data in the target data set may include a target training image and the target label class corresponding to the target training image. The target category set corresponding to the target data set may be the union of the N original category sets corresponding to the N original data sets, where the N original data sets correspond one to one to the N original category sets. The target class set may include all target label classes in the target data set, and each original class set may include all original label classes in its original data set.
For example, assuming that N original data sets are an original data set 1 and an original data set 2, the original data set 1 corresponds to an original category set 1, the original category set 1 may include all original label categories of the original data set 1, such as category a and category B, the original data set 2 corresponds to an original category set 2, and the original category set 2 may include all original label categories of the original data set 2, such as category a and category C.
Based on this, the target dataset corresponds to a target category set, which is a union of the original category set 1 and the original category set 2, i.e. the target category set includes category a, category B and category C, that is, all target label categories of the target dataset may be category a, category B and category C.
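The union described above can be sketched directly with Python sets (the class names follow the example; the variable names are illustrative):

```python
# Label classes from the example above: original class set 1 and original class set 2.
original_class_set_1 = {"A", "B"}
original_class_set_2 = {"A", "C"}

# The target class set is the union of all original class sets.
target_class_set = original_class_set_1 | original_class_set_2
```

With more original data sets, the union simply extends over all of their class sets.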
Step 103, inputting the first original feature and the first target feature into a second target subnetwork of the target model to be trained, the second target subnetwork outputting the second target feature. That is, the first original feature output by the first original subnetwork of the original trained model and the first target feature output by the first target subnetwork of the target model to be trained are input together into the second target subnetwork of the target model to be trained, and the second target subnetwork processes the first original feature and the first target feature to obtain the second target feature.
For example, since the number of the original trained models may be N, the first original features input to the second target subnetwork are N first original features corresponding to the N original trained models.
Step 104, training the first target subnetwork based on the first target feature, training the second target subnetwork based on the second target feature, and obtaining a target trained model based on the trained first target subnetwork and the trained second target subnetwork, where the target trained model is used to implement data processing.
For example, it may be determined whether the first target subnetwork has converged. If not, the parameters of the first target subnetwork are adjusted based on the first target feature to obtain an adjusted subnetwork, the adjusted subnetwork is determined as the first target subnetwork, and the operation of inputting the training data in the target data set to the first target subnetwork is performed again. If so, the first target subnetwork is determined as the first converged target subnetwork.

Likewise, it may be determined whether the second target subnetwork has converged. If not, the parameters of the second target subnetwork are adjusted based on the second target feature to obtain an adjusted subnetwork, the adjusted subnetwork is determined as the second target subnetwork, and the operation of inputting the first original feature and the first target feature into the second target subnetwork is performed again. If so, the second target subnetwork is determined as the second converged target subnetwork.

On this basis, the target trained model may be obtained based on the first converged target subnetwork and the second converged target subnetwork, the first converged target subnetwork comprising the first M network layers of the target trained model and the second converged target subnetwork comprising the remaining network layers of the target trained model except the first M network layers; that is, the target trained model comprises the first converged target subnetwork and the second converged target subnetwork.
In the above embodiment, if the first target subnetwork satisfies the convergence condition and the second target subnetwork satisfies the convergence condition, it may be determined that both target subnetworks have converged. Alternatively, if the first target subnetwork satisfies the convergence condition but the second target subnetwork does not, both target subnetworks may still be treated as converged; or, if the first target subnetwork does not satisfy the convergence condition but the second target subnetwork does, both target subnetworks may likewise be treated as converged.
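The alternatives above amount to two policies: require both sub-networks to satisfy the convergence condition, or treat both as converged once either one does. A small sketch (the function name and boolean-flag interface are hypothetical):

```python
def subnetworks_converged(first_ok, second_ok, mode="both"):
    """Decide whether both target sub-networks are treated as converged.

    mode="both":   require both sub-networks to satisfy the condition.
    mode="either": treat both as converged once either one satisfies it
                   (the relaxed alternatives described above).
    """
    if mode == "both":
        return first_ok and second_ok
    if mode == "either":
        return first_ok or second_ok
    raise ValueError("unknown mode")
```

Which policy to use is a design choice; the stricter "both" mode trades extra training iterations for more certainty that each sub-network has stabilized.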
In one possible implementation, after obtaining the target trained model based on the trained first target subnetwork and the trained second target subnetwork, an artificial intelligence process may also be implemented based on the target trained model. For example, if the target-trained model is used to implement target detection, after the data to be processed is obtained, the data to be processed may be input to the target-trained model, the target-trained model performs detection processing on the data to be processed to obtain a detection result, and target detection is performed based on the detection result. Or, if the target-trained model is used to implement target classification, after the data to be processed is obtained, the data to be processed may be input to the target-trained model, the target-trained model performs classification processing on the data to be processed to obtain a classification result, and the target classification is performed based on the classification result.
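For the classification case, a minimal sketch of turning a trained model's output scores into a class decision (the score values and helper name are illustrative assumptions; a real model would produce the scores from the data to be processed):

```python
def classify(scores, class_names):
    """Return the class whose score is highest (argmax over the scores)."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return class_names[best]

# Hypothetical scores for the three classes from the elevator example below.
result = classify([0.1, 0.7, 0.2], ["battery car", "bicycle", "head-shoulder"])
```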
In one possible implementation, the network structures of the first M network layers in the original trained model are the same as the network structures of the first M network layers in the target trained model. If N original trained models exist, where N may be a positive integer greater than 1, the network structures of the first M network layers in the N original trained models are the same, that is, the network structures of the first M network layers in the N original trained models are the same as the network structures of the first M network layers in the target trained model, where M may be a positive integer.
According to the technical solution above, the original trained model can be obtained by training based on the original data set, artificial intelligence processing is implemented through the original trained model, and the target classes are detected. When the original trained model is applied to an actual business and a business change alters the target classes to be detected, a target trained model is obtained by training based on a target data set, artificial intelligence processing is implemented through the target trained model, and the changed target classes are detected.

Because the target data set is different from the original data set, there is no need to recalibrate all training images in the original data set, that is, no need to re-annotate the label category of each training image; this reduces the recalibration workload, saves calibration resources, and reduces the resource consumption of recalibration, with the target data set fused in for joint training without performing supplementary label-category calibration on the original data set. The training data in the target data set may be less than that in the original data set, i.e., only a small amount of training data is needed to train the target trained model. The original features corresponding to the training data in the original data set participate in the training process of the target trained model, which improves the performance of the target trained model and yields highly accurate intelligent analysis results. Because the original features, rather than the training data in the original data set, participate in the training process of the target trained model, sensitive data are kept out of the training process, the confidential information in the training data is hidden, and security is improved.
The data processing method according to the embodiment of the present application is described below with reference to specific application scenarios.
In order to implement artificial intelligence processing, an original data set can be constructed, an original trained model is trained based on the original data set, and artificial intelligence processing is implemented with the original trained model. When the original trained model is applied to an actual business, the target classes detected by the original trained model may change due to a business change. For example, when the business of detecting battery cars entering elevators is just launched, the original trained model needs to detect two target categories, battery car and bicycle; as accuracy requirements rise, a head-shoulder target needs to be added, i.e., three target categories need to be detected: battery car, bicycle, and head-shoulder. For another example, original trained model 1 needs to detect three target classes such as human, dog, and cat; original trained model 2 needs to detect two target classes such as dog and pig; and a new model needs to detect four target classes such as human, cat, dog, and pig.
When the above situation occurs, a supplementary calibration operation may be performed on the training data in the original data set, that is, the label category of each training image in the original data set is re-calibrated, and the workload of re-calibration is large.
The above problem can be abstracted as follows: there is at least one independently calibrated original data set; the target classes of the original data sets may or may not overlap; and each original data set has an independent original trained model that can detect, with high performance, the targets calibrated in its original data set. On this basis, a new model, namely the target trained model, needs to be trained; the target trained model can detect the union of the target classes of all original data sets, and the training process of the target trained model does not perform supplementary calibration on the original data sets.
Referring to fig. 2, which is a schematic diagram of a training process according to an embodiment of the present application, N original data sets may be constructed, denoted as original data set A1 through original data set An, where each original data set includes training images and the label categories corresponding to those training images. For convenience of distinction, the training images in the original data sets are recorded as original training images, and the label classes in the original data sets are recorded as original label classes. The original data set A1 corresponds to the original class set A11, which includes all original label classes of the original data set A1; the original data set A2 corresponds to the original class set A21, which includes all original label classes of the original data set A2; and so on, the original data set An corresponds to the original class set An1, which includes all original label classes of the original data set An.
The original trained model A12 may be obtained by training based on training data in the original data set A1, the original trained model A22 based on training data in the original data set A2, and so on, up to the original trained model An2 based on training data in the original data set An.
On this basis, if a new model is to be obtained through training, a target data set B can be constructed, where the target data set B includes training images and the label categories corresponding to those training images. For convenience of distinction, the training images in the target data set B may be denoted as target training images, and the label classes in the target data set B as target label classes. Target data set B corresponds to target class set B1, and target class set B1 may include all target label classes of target data set B. For the target class set B1, the target label classes in B1 may be the union of the original label classes in all the original class sets, i.e., B1 = A11 ∪ A21 ∪ A31 ∪ … ∪ An1.
The target trained model B2 may be obtained by training based on the training data in the target data set B. During this training, the data features output by the original trained model A12, the original trained model A22, …, and the original trained model An2 are referenced; that is, the target trained model B2 is obtained by training based on the training data in the target data set B together with the data features output by each original trained model.
Firstly, training an original model to be trained based on an original data set to obtain an original trained model.
For example, the original model that has not been trained may be referred to as the original model to be trained, and the original model that has been trained may be referred to as the original trained model. Based on this, an original model to be trained can be constructed, the structure and the function of the original model to be trained are not limited, and a data set used for training the original model to be trained is constructed and is called as an original data set. On this basis, the original model to be trained may be trained based on the original data set, resulting in an original trained model.
Referring to fig. 3, the raw data set may include a plurality of training data, which may include, for each training data, a raw training image and a raw label class representing a class of the raw training image. Of course, the training data may also include other types of calibration information, such as the position of the target frame, and the like, and the training data is not limited. The target frame position may be coordinate information of a rectangular frame in the original training image (e.g., coordinates of the upper left corner of the rectangular frame, width and height of the rectangular frame, etc.).
A plurality of training data in the original data set may be input to the original model to be trained, so that the parameters of the original model to be trained (such as convolutional layer parameters, pooling layer parameters, activation layer parameters, fully connected layer parameters, and the like) are trained with the training data; the training process is not limited. After the training of the original model to be trained is completed, the trained original model is the original trained model.
In one possible embodiment, the original trained model may be trained using the following steps:
Step S11, inputting the training data in the original data set to the original model to be trained to obtain a second original feature. For example, a plurality of training data (such as original training images and original label categories) in the original data set are input to the original model to be trained, which processes the training data (the processing procedure is not limited), and the data features produced by the original model to be trained are the second original features.
For example, the original model to be trained includes a plurality of network layers, such as network layer 1, network layer 2, …, network layer M, network layer M+1, …, network layer K, where K is greater than M; that is, there are K network layers in total, and network layer K is the last network layer of the original model to be trained. Based on this, the training data can be input to network layer 1 of the original model to be trained, which processes it to obtain data feature 1; data feature 1 is input to network layer 2, which processes it to obtain data feature 2; data feature 2 is input to network layer 3, and so on; network layer M processes data feature M-1 to obtain data feature M; data feature M is input to network layer M+1, and so on, until network layer K processes data feature K-1 to obtain data feature K, and data feature K is the second original feature.
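The layer-by-layer forward pass described above can be sketched as follows (simple functions stand in for real network layers; the names and values are illustrative):

```python
def forward(layers, x):
    """Feed x through the network layers in order; each layer's output
    is the next layer's input, and the last layer's output is the result."""
    for layer in layers:
        x = layer(x)
    return x

# Hypothetical K = 3 "layers", each a simple function standing in for a
# convolution/pooling/activation layer.
layers = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
second_original_feature = forward(layers, 5)  # ((5 + 1) * 2) - 3 = 9
```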
Step S12, determining whether the original model to be trained has converged.
If not, step S13 is executed, and if yes, step S14 is executed.
For example, after the second original feature is obtained, the loss value may be determined based on the second original feature, for example, a target loss function is configured in advance, an input of the target loss function is a data feature output by the network layer K, and an output of the target loss function is a loss value. Based on this, after the second original feature is obtained, the second original feature may be substituted into the target loss function to obtain a loss value corresponding to the second original feature.
After obtaining the loss value corresponding to the second original feature, it may be determined whether the original model to be trained has converged based on the loss value. For example, if the loss value is smaller than a preset threshold, it is determined that the original model to be trained has converged, and if the loss value is not smaller than the preset threshold, it is determined that the original model to be trained has not converged.
Of course, other ways may also be used to determine whether the original model to be trained has converged, and the determination method is not limited. For example, if the number of iterations of the original model to be trained reaches a preset count threshold (one pass over all training data in the original data set counting as one iteration), it may be determined that the original model to be trained has converged; otherwise, it may be determined that it has not. For another example, if the iteration duration of the original model to be trained reaches a preset duration threshold, it may be determined that the original model to be trained has converged; otherwise, it may be determined that it has not.
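The three convergence criteria above (loss threshold, iteration count, iteration duration) can be combined in a sketch like this (the threshold values are hypothetical examples, not values from the document):

```python
def has_converged(loss, iterations, elapsed_s,
                  loss_threshold=0.01, max_iterations=1000,
                  max_elapsed_s=3600.0):
    """Any one of the three criteria may declare convergence: loss below
    a preset threshold, iteration count reached, or time budget spent."""
    return (loss < loss_threshold
            or iterations >= max_iterations
            or elapsed_s >= max_elapsed_s)
```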
Step S13, adjusting the parameters of the original model to be trained based on the second original feature to obtain an adjusted model, determining the adjusted model as the original model to be trained, and returning to execute the operation of inputting the training data in the original data set to the original model to be trained, i.e., returning to step S11.
For example, the second original feature may be substituted into a target loss function to obtain a loss value corresponding to the second original feature, and a parameter (i.e., a network parameter) of the original model to be trained is adjusted based on the loss value to obtain an adjusted model. For example, based on the loss value corresponding to the second original feature, the network parameters (i.e., network weights) of the original model to be trained may be updated through a back propagation algorithm, so as to obtain an updated original model to be trained, i.e., the adjusted model, and the updating process is not limited. An example of a back propagation algorithm may be a gradient descent method, i.e. the network parameters of the original model to be trained are updated by the gradient descent method.
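A minimal sketch of the gradient-descent update mentioned above (plain Python with an illustrative learning rate; a real framework would compute the gradients via back propagation automatically):

```python
def gradient_descent_step(weights, gradients, learning_rate=0.1):
    """One parameter update: move each weight against its gradient."""
    return [w - learning_rate * g for w, g in zip(weights, gradients)]

# Toy loss f(w) = w^2 with gradient 2w for a single weight; repeated
# updates drive the weight (and hence the loss) toward zero.
w = [1.0]
for _ in range(50):
    w = gradient_descent_step(w, [2.0 * w[0]])
```

After the loop the weight has shrunk by a factor of 0.8 per step, illustrating how repeated updates reduce the loss until the convergence check of step S12 is satisfied.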
After obtaining the adjusted model, the adjusted model may be determined as the original model to be trained, and the step S11 is executed to train the original model to be trained again based on the training data in the original data set.
Step S14, determining the converged original model to be trained as the original trained model.
After the original trained model (i.e., the trained model) is obtained, data processing, i.e., artificial intelligence processing, can be implemented through the original trained model. For example, if the original trained model is used to implement target detection, after the data to be processed is obtained, the data to be processed may be input to the original trained model, the original trained model performs detection processing on the data to be processed to obtain a detection result, and target detection is performed based on the detection result. Or, if the original trained model is used to implement target classification, after the data to be processed is obtained, the data to be processed may be input to the original trained model, the data to be processed is classified by the original trained model to obtain a classification result, and the target classification is performed based on the classification result.
In a possible implementation manner, the number of the original models to be trained is N, the number of the original data sets is N, the number of the original trained models is N, and N is a positive integer.
For example, the original data set A1 corresponds to the original category set A11, and the original model to be trained A13 may be trained based on the training data in the original data set A1 to obtain the original trained model A12. The original data set A2 corresponds to the original category set A21, and the original model to be trained A23 may be trained based on the training data in the original data set A2 to obtain an original trained model A22; and so on, the original data set An corresponds to the original category set An1, and the original model to be trained An3 may be trained based on the training data in the original data set An to obtain an original trained model An2.
Secondly, training the target model to be trained based on the target data set to obtain a target trained model.
For example, the target model that is not trained may be recorded as the target model to be trained, and the target model that is trained may be recorded as the target trained model. Based on the above, a target model to be trained can be constructed, the structure and function of the target model to be trained are not limited, and a data set for training the target model to be trained is constructed and is called a target data set. On this basis, the target model to be trained can be trained based on the target data set, so as to obtain a target trained model.
Referring to fig. 4, the target data set B includes a plurality of training data; each training data includes a target training image and a target label class, the target label class representing the class of the target training image. Of course, the training data may also include other types of calibration information, such as target frame positions, without limitation. The target data set B corresponds to the target class set B1, which includes all target label classes of the target data set B, and the target label classes in B1 may be the union of the original label classes in all original class sets, i.e., B1 = A11 ∪ A21 ∪ A31 ∪ … ∪ An1, so that the original label classes in all original class sets are fully covered.
Referring to fig. 4, a plurality of training data in the original data set A1 may be input to the original trained model A12 to obtain a first original feature A14, which is input to the target model to be trained; a plurality of training data in the original data set A2 may be input to the original trained model A22 to obtain a first original feature A24, which is input to the target model to be trained; and so on, a plurality of training data in the original data set An may be input to the original trained model An2 to obtain a first original feature An4, which is input to the target model to be trained. A plurality of training data in the target data set B may also be input to the target model to be trained. In summary, the parameters of the target model to be trained (such as convolutional layer parameters, pooling layer parameters, activation layer parameters, fully connected layer parameters, and the like) may be trained based on the plurality of training data in the target data set B and the first original features; after the training of the target model to be trained is completed, the trained target model is the target trained model.
In one possible embodiment, the target trained model can be obtained by training with the following steps:
step S21, inputting the training data in the original data set to a first original sub-network of the original trained model to obtain a first original feature. For example, the first original sub-network processes the training data to obtain a data feature, and the data feature processed by the first original sub-network is recorded as a first original feature.
For example, assuming that the original trained model includes K network layers (the original model to be trained also includes K network layers, identical in structure), the K network layers of the original trained model may be divided into a first original sub-network and a second original sub-network, where the first original sub-network includes the first M network layers (M is smaller than K), and the second original sub-network includes the remaining network layers except the first M network layers. For example, if the original trained model includes network layer 1, network layer 2, …, network layer M, network layer M+1, …, network layer K, and network layer K is the last network layer of the original trained model, then the first original sub-network includes network layer 1, network layer 2, …, network layer M (network layer M being the last network layer of the first original sub-network), and the second original sub-network includes network layer M+1, …, network layer K (network layer M+1 being the first network layer of the second original sub-network and network layer K its last).
Based on this, the training data in the original data set can be input to network layer 1 of the original trained model, which processes it to obtain data feature 1; data feature 1 is input to network layer 2, which processes it to obtain data feature 2; data feature 2 is input to network layer 3, and so on; network layer M processes data feature M-1 to obtain data feature M, which is the first original feature; data feature M does not need to be input to network layer M+1.
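The first-sub-network forward pass described above, which stops at network layer M and never executes the remaining layers, might be sketched as follows (simple functions stand in for real layers; the names and values are illustrative):

```python
def first_subnetwork_features(layers, m, x):
    """Run only the first m network layers; the output of layer m is the
    first original feature, and layers m+1 onward are not executed."""
    for layer in layers[:m]:
        x = layer(x)
    return x

# Hypothetical K = 4 layers with M = 2: only the first two layers run,
# so the last two (which would add 100 each) never affect the result.
layers = [lambda x: x + 1, lambda x: x * 2,
          lambda x: x + 100, lambda x: x + 100]
first_original_feature = first_subnetwork_features(layers, 2, 3)  # (3 + 1) * 2 = 8
```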
In the above embodiment, the data features output by the first original sub-network (i.e., the data features output by network layer M) are used as the first original features, and the data features output by the second original sub-network (i.e., the data features output by network layer K) are not used as the first original features. That is, in the training process of the target trained model, the second original sub-network does not participate; only the first original sub-network is required to participate.
For example, assuming that there are N original trained models, the training data in the original data set A1 may be input to the first original subnetwork of the original trained model A12 to obtain a first original feature A14, the training data in the original data set A2 may be input to the first original subnetwork of the original trained model A22 to obtain a first original feature A24, …, and the training data in the original data set An may be input to the first original subnetwork of the original trained model An2 to obtain a first original feature An4.
Step S22, inputting the training data in the target data set to a first target sub-network of the target model to be trained to obtain a first target feature. For example, the first target subnetwork processes the training data to obtain a data feature, and the data feature processed by the first target subnetwork is marked as a first target feature.
For example, assuming that the target model to be trained includes L network layers (the target trained model also includes L network layers, identical in structure), the L network layers of the target model to be trained may be divided into a first target sub-network and a second target sub-network, where the first target sub-network includes the first M network layers (M is smaller than L), and the second target sub-network includes the remaining network layers except the first M network layers. For example, if the target model to be trained includes network layer 1, network layer 2, …, network layer M, network layer M+1, …, network layer L, and network layer L is the last network layer of the target model to be trained, then the first target sub-network includes network layer 1, network layer 2, …, network layer M (network layer M being the last network layer of the first target sub-network), and the second target sub-network includes network layer M+1, …, network layer L (network layer M+1 being the first network layer of the second target sub-network and network layer L its last).
Illustratively, the value of L and the value of K may be the same or different.
Based on this, the training data in the target data set may be input to network layer 1 of the target model to be trained, which processes it to obtain data feature 1; data feature 1 is input to network layer 2, which processes it to obtain data feature 2; data feature 2 is input to network layer 3, and so on; network layer M processes data feature M-1 to obtain data feature M (the output of the last network layer of the first target subnetwork), where data feature M is the first target feature.
Step S23, inputting the first original feature and the first target feature into a second target sub-network of the target model to be trained, and outputting the second target feature by the second target sub-network. For example, the first original feature and the first target feature are input to the second target subnetwork together, the second target subnetwork processes the first original feature and the first target feature to obtain a data feature, and the processed data feature is the second target feature.
For example, assuming that the target model to be trained includes L network layers, the second target subnetwork of the target model to be trained includes network layer M+1 through network layer L. Based on this, the first original feature (e.g., the first original feature a14, the first original feature a24, …, the first original feature An4) and the first target feature may be input to network layer M+1 of the target model to be trained; network layer M+1 processes the first original feature and the first target feature to obtain data feature M+1, data feature M+1 is input to network layer M+2, and so on, until network layer L processes data feature L-1 to obtain data feature L (i.e., the output of the last network layer of the second target subnetwork), where data feature L is the second target feature.
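The joint input of the N first original features and the first target feature into network layer M+1 can be sketched as below. The patent does not fix the fusion operator, so concatenation is used here purely as one possible assumption, and the toy layer is illustrative.

```python
# Illustrative sketch: N first original features and the first target feature
# are fused (here by concatenation, one possible choice the patent leaves open)
# before entering network layer M+1 of the second target subnetwork.
def toy_layer(feature):
    # stand-in for one network layer of the second target subnetwork
    return [v * 0.5 for v in feature]

first_original_features = [[1.0, 2.0], [3.0, 4.0]]  # N = 2 original trained models
first_target_feature = [5.0, 6.0]

fused = [v for f in first_original_features for v in f] + first_target_feature

second_target_subnetwork = [toy_layer, toy_layer]   # network layers M+1 .. L
feature = fused
for net_layer in second_target_subnetwork:
    feature = net_layer(feature)
second_target_feature = feature                     # output of network layer L
```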
In summary, it can be seen that, since the number of the original trained models is N, the first original features input to the second target subnetwork are N first original features corresponding to the N original trained models.
In one possible implementation, the network structures of the first M network layers in the N original trained models are all the same. The network structures of the first M network layers in the original trained model are the same as the network structures of the first M network layers in the target model to be trained (i.e., the network structures of the first M network layers in the target trained model). For example, the network structures of the first M network layers of the original trained model a12 are the same as those of the first M network layers of the original trained model a22, the network structures of the first M network layers of the original trained model a12 are the same as those of the first M network layers of the original trained model a32, …, and the network structures of the first M network layers of the original trained model a12 are the same as those of the first M network layers of the original trained model An2. The network structures of the first M network layers of the original trained model a12 are the same as those of the target model to be trained/the target trained model.
The network structures of the first M network layers of the original trained model a12 being the same as those of the first M network layers of the original trained model a22 may refer to: the network structure of the first network layer (i.e., network layer 1) of the original trained model a12 is the same as the network structure of the first network layer (i.e., network layer 1) of the original trained model a22, the network structure of the second network layer (i.e., network layer 2) of the original trained model a12 is the same as the network structure of the second network layer (i.e., network layer 2) of the original trained model a22, …, and the network structure of the Mth network layer (i.e., network layer M) of the original trained model a12 is the same as the network structure of the Mth network layer (i.e., network layer M) of the original trained model a22.
The network structures of the first M network layers of the original trained model a12 being the same as those of the first M network layers of the target model to be trained/the target trained model may refer to: the network structure of the first network layer of the original trained model a12 is the same as the network structure of the first network layer of the target model to be trained/the target trained model, …, and the network structure of the Mth network layer of the original trained model a12 is the same as the network structure of the Mth network layer of the target model to be trained/the target trained model.
The network structure of the first network layer (denoted as network layer a121) of the original trained model a12 being the same as the network structure of the first network layer (denoted as network layer a221) of the original trained model a22 may refer to: the network layer a121 and the network layer a221 are composed of the same structural units, and the connection relationships of the structural units are the same, but the parameters of the structural units may be the same or different. For example, the network layer a121 and the network layer a221 are each composed of 2 convolutional layers and 1 pooling layer, and the connection relationship between the 2 convolutional layers and 1 pooling layer in the network layer a121 is the same as the connection relationship between the 2 convolutional layers and 1 pooling layer in the network layer a221.
Step S24, determining whether the first target sub-network of the target model to be trained has converged.
If not, step S25 may be executed, and if yes, step S26 may be executed.
Step S25, adjusting the parameters of the first target sub-network based on the first target feature to obtain an adjusted sub-network, determining the adjusted sub-network as the first target sub-network, and returning to perform the operation of inputting the training data in the target data set to the first target sub-network, i.e., returning to perform step S22.
Step S26, the first target sub-network is determined to be the first converged target sub-network.
Step S27, determining whether the second target sub-network of the target model to be trained has converged.
If not, step S28 may be executed, and if yes, step S29 may be executed.
Step S28, adjusting the parameters of the second target sub-network based on the second target feature to obtain an adjusted sub-network, determining the adjusted sub-network as the second target sub-network, and returning to perform the operation of inputting the first original feature and the first target feature into the second target sub-network, i.e., returning to perform step S23.
Step S29, the second target sub-network is determined as the second converged target sub-network.
Step S30, after obtaining the first converged target sub-network and the second converged target sub-network, obtaining a target trained model based on the first converged target sub-network and the second converged target sub-network, where the target trained model includes the first converged target sub-network (the trained first target sub-network) and the second converged target sub-network (the trained second target sub-network), the first converged target sub-network includes the first M network layers, and the second converged target sub-network includes the remaining network layers except the first M network layers.
Steps S24 to S30 may be implemented in any of the following manners:
Manner 1, if the first target subnetwork meets the convergence condition and the second target subnetwork meets the convergence condition, it is determined that the first target subnetwork has converged and it is determined that the second target subnetwork has converged. Determining a first target subnetwork as a first converged target subnetwork, determining a second target subnetwork as a second converged target subnetwork, and obtaining a target trained model based on the first converged target subnetwork and the second converged target subnetwork.
And if the first target sub-network does not meet the convergence condition and/or the second target sub-network does not meet the convergence condition, determining that the first target sub-network is not converged and determining that the second target sub-network is not converged. Based on this, parameters of the first target sub-network are adjusted based on the first target feature to obtain an adjusted sub-network, the adjusted sub-network is determined as the first target sub-network, parameters of the second target sub-network are adjusted based on the second target feature to obtain an adjusted sub-network, and the adjusted sub-network is determined as the second target sub-network. And returning to the step S22 and the step S23 based on the adjusted first target sub-network and the second target sub-network.
Manner 2, if the first target subnetwork meets the convergence condition and the second target subnetwork does not meet the convergence condition, it may be determined that the first target subnetwork has converged and that the second target subnetwork has converged, that is, as long as the first target subnetwork meets the convergence condition, it may be determined that the first target subnetwork and the second target subnetwork have converged at the same time. Based on this, the first target subnetwork may be determined as the first converged target subnetwork and the second target subnetwork may be determined as the second converged target subnetwork, and the target trained model may be obtained based on the first converged target subnetwork and the second converged target subnetwork.
If the first target subnetwork does not meet the convergence condition, it is determined that the first target subnetwork has not converged and that the second target subnetwork has not converged, regardless of whether the second target subnetwork meets the convergence condition. Based on this, parameters of the first target sub-network are adjusted based on the first target feature to obtain an adjusted sub-network, the adjusted sub-network is determined as the first target sub-network, parameters of the second target sub-network are adjusted based on the second target feature to obtain an adjusted sub-network, and the adjusted sub-network is determined as the second target sub-network. And returning to the step S22 and the step S23 based on the adjusted first target sub-network and the second target sub-network.
Manner 3, if the first target subnetwork does not satisfy the convergence condition and the second target subnetwork satisfies the convergence condition, it may be determined that the first target subnetwork has converged and the second target subnetwork has converged, that is, as long as the second target subnetwork satisfies the convergence condition, it may be determined that the first target subnetwork and the second target subnetwork have converged at the same time. Based on this, the first target subnetwork may be determined as the first converged target subnetwork and the second target subnetwork may be determined as the second converged target subnetwork, and the target trained model may be obtained based on the first converged target subnetwork and the second converged target subnetwork.
And if the second target sub-network does not meet the convergence condition, determining that the first target sub-network is not converged and determining that the second target sub-network is not converged regardless of whether the first target sub-network meets the convergence condition. Based on this, parameters of the first target sub-network are adjusted based on the first target feature to obtain an adjusted sub-network, the adjusted sub-network is determined as the first target sub-network, parameters of the second target sub-network are adjusted based on the second target feature to obtain an adjusted sub-network, and the adjusted sub-network is determined as the second target sub-network. And returning to the step S22 and the step S23 based on the adjusted first target sub-network and the second target sub-network.
Manner 4, if the first target subnetwork meets the convergence condition but the second target subnetwork does not, the first target subnetwork is determined to have converged but the second target subnetwork is determined to have not converged. Based on this, the first target sub-network may be determined as the first converged target sub-network, and in the subsequent process, the parameters of the first converged target sub-network are not adjusted any more, and this first converged target sub-network is retained.
Since the second target subnetwork has not converged, the parameters of the second target subnetwork are adjusted based on the second target feature to obtain an adjusted subnetwork, the adjusted subnetwork is determined as the second target subnetwork, and the flow returns to step S23 based on the adjusted second target subnetwork. When adjusting the parameters of the second target sub-network, no adjustment of the parameters of the first target sub-network is required.
In the subsequent process, the parameters of the second target subnetwork only need to be adjusted until the second target subnetwork meets the convergence condition, and the second target subnetwork is determined to be converged, so that the second target subnetwork can be determined to be the second converged target subnetwork. On this basis, the target trained model may be obtained based on the first converged target sub-network and the second converged target sub-network.
Manner 5, if the second target subnetwork satisfies the convergence condition but the first target subnetwork does not satisfy the convergence condition, it is determined that the second target subnetwork has converged but the first target subnetwork has not converged. Based on this, the second target sub-network may be determined as the second converged target sub-network, and in a subsequent process, the parameters of the second converged target sub-network are not adjusted any more, and this second converged target sub-network is retained.
Since the first target subnetwork has not converged, the parameters of the first target subnetwork are adjusted based on the first target feature to obtain an adjusted subnetwork, the adjusted subnetwork is determined as the first target subnetwork, and the flow returns to step S22 based on the adjusted first target subnetwork. When adjusting the parameters of the first target sub-network, no adjustment of the parameters of the second target sub-network is required.
In the subsequent process, the parameters of the first target subnetwork only need to be adjusted until the first target subnetwork meets the convergence condition, and the first target subnetwork is determined to be converged, so that the first target subnetwork can be determined to be the first converged target subnetwork. On this basis, the target trained model may be obtained based on the first converged target sub-network and the second converged target sub-network.
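Manners 4 and 5 above amount to training each subnetwork until its own convergence condition is met, independently of the other. A hedged toy sketch (the scalar "parameter", the loss (p - 3)², and the learning rate are all illustrative assumptions, standing in for real network weights and back-propagation):

```python
# Hedged sketch of Manners 4/5: each subnetwork's parameters are adjusted until
# that subnetwork's own convergence condition is met; once a subnetwork has
# converged, its parameters are no longer adjusted and it is retained.
def train_until_converged(param, loss_fn, grad_fn, lr=0.5, threshold=1e-4, max_iters=1000):
    for _ in range(max_iters):
        if loss_fn(param) < threshold:  # convergence condition: loss below preset threshold
            break                       # converged: keep this subnetwork, stop adjusting it
        param -= lr * grad_fn(param)    # adjust parameters, then re-check convergence
    return param

loss = lambda p: (p - 3.0) ** 2         # toy loss with minimum at p = 3
grad = lambda p: 2.0 * (p - 3.0)        # its gradient

first_param = train_until_converged(0.0, loss, grad)    # first target subnetwork
second_param = train_until_converged(10.0, loss, grad)  # second target subnetwork
```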
In the above embodiment, determining whether the first target subnetwork meets the convergence condition may include: after the first target feature is obtained, a loss value is determined based on the first target feature. For example, a loss function a is configured in advance, the input of the loss function a is the first target feature, and the output of the loss function a is the loss value. Based on this, after the first target feature is obtained, the first target feature is substituted into the loss function a to obtain the loss value corresponding to the first target feature, and it is determined based on the loss value whether the first target subnetwork satisfies the convergence condition. For example, if the loss value is less than a preset threshold, it is determined that the first target subnetwork meets the convergence condition; if the loss value is not less than the preset threshold, it is determined that the first target subnetwork does not meet the convergence condition. Of course, other ways of determining whether the first target subnetwork meets the convergence condition may be used, and this determination is not limited herein. For example, whether the first target subnetwork meets the convergence condition may be determined based on the number of iterations or the duration of the iterations.
In the above embodiment, determining whether the second target subnetwork meets the convergence condition may include: after the second target feature is obtained, a loss value is determined based on the second target feature. For example, a loss function b is configured in advance, the input of the loss function b is the second target feature, and the output of the loss function b is the loss value. Based on this, after the second target feature is obtained, the second target feature is substituted into the loss function b to obtain the loss value corresponding to the second target feature, and it is determined based on the loss value whether the second target subnetwork satisfies the convergence condition. For example, if the loss value is less than a preset threshold, it is determined that the second target subnetwork meets the convergence condition; if the loss value is not less than the preset threshold, it is determined that the second target subnetwork does not meet the convergence condition. Of course, other ways of determining whether the second target subnetwork meets the convergence condition may be used, and this determination is not limited herein. For example, whether the second target subnetwork meets the convergence condition may be determined based on the number of iterations or the duration of the iterations.
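The loss-based convergence check described above can be sketched in a few lines. The exact form of loss functions a/b and the preset threshold are assumptions; the text only requires comparing the loss value against a preset threshold.

```python
# Minimal sketch of the loss-based convergence check. Mean squared error and
# the threshold value are illustrative assumptions, not fixed by the patent.
def loss_function(target_feature, expected):
    # illustrative mean squared error over the feature vector
    return sum((f - e) ** 2 for f, e in zip(target_feature, expected)) / len(expected)

def meets_convergence_condition(loss_value, preset_threshold=0.01):
    # convergence condition: loss value less than the preset threshold
    return loss_value < preset_threshold

loss_value = loss_function([1.0, 2.1], [1.0, 2.0])  # about 0.005
converged = meets_convergence_condition(loss_value)
```

An iteration-count or iteration-duration criterion, also mentioned above, would simply replace the loss comparison with a counter or timer check.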
In the above embodiments, adjusting the parameters of the first target sub-network based on the first target feature may include, but is not limited to: substituting the first target feature into the loss function a to obtain a loss value corresponding to the first target feature, and adjusting the parameters (namely the network parameters) of the first target sub-network based on the loss value. For example, the network parameters (i.e., network weights) of the first target sub-network may be updated by a back-propagation algorithm based on the loss value corresponding to the first target feature, and the updating process is not limited. An example of a back-propagation algorithm may be the gradient descent method, i.e. the network parameters of the first target subnetwork are updated by the gradient descent method.
In the above embodiments, adjusting the parameters of the second target sub-network based on the second target feature may include, but is not limited to: substituting the second target feature into the loss function b to obtain a loss value corresponding to the second target feature, and adjusting the parameters (namely the network parameters) of the second target sub-network based on the loss value. For example, based on the loss value corresponding to the second target feature, the network parameters (i.e., network weights) of the second target sub-network may be updated through a back-propagation algorithm, and the updating process is not limited. An example of a back-propagation algorithm may be the gradient descent method, i.e. the network parameters of the second target subnetwork are updated by the gradient descent method.
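A single gradient-descent update of the kind referred to above can be sketched as follows. The gradients are supplied directly here as an assumption; in practice they would be computed by back-propagating the loss through the subnetwork.

```python
# Sketch of "adjust the parameters based on the loss value" as one gradient
# descent step over the network weights. Weights, gradients, and learning rate
# are illustrative values.
def gradient_descent_update(network_weights, gradients, learning_rate=0.1):
    # w <- w - lr * dLoss/dw for every network weight
    return [w - learning_rate * g for w, g in zip(network_weights, gradients)]

updated_weights = gradient_descent_update([0.5, -0.2], [1.0, -2.0])
```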
In one possible embodiment, after the target trained model is obtained, data processing, that is, artificial intelligence processing, may be implemented by the target trained model. For example, if the target-trained model is used to implement target detection, after the data to be processed is obtained, the data to be processed may be input to the target-trained model, the target-trained model performs detection processing on the data to be processed to obtain a detection result, and target detection is performed based on the detection result. Or, if the target-trained model is used to implement target classification, after the data to be processed is obtained, the data to be processed may be input to the target-trained model, the data to be processed is classified by the target-trained model to obtain a classification result, and the target classification is performed based on the classification result.
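The inference flow above, where the target trained model is the composition of the two converged subnetworks, might look like the following. Both toy functions are illustrative stand-ins for the real trained subnetworks.

```python
# Illustrative inference sketch: the target trained model is the composition of
# the first converged target subnetwork (network layers 1..M) and the second
# converged target subnetwork (network layers M+1..L).
def first_converged_target_subnetwork(x):
    return [v + 1.0 for v in x]        # stands in for network layers 1..M

def second_converged_target_subnetwork(x):
    return [v * 2.0 for v in x]        # stands in for network layers M+1..L

def target_trained_model(data_to_be_processed):
    feature = first_converged_target_subnetwork(data_to_be_processed)
    return second_converged_target_subnetwork(feature)

# the result would then drive target detection or target classification
processing_result = target_trained_model([0.0, 1.5])
```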
In the above embodiment, the target model to be trained can be trained based on the training data in the target data set and the training data in the original data set, thereby realizing incremental joint training over the target data set and the original data set. This cleverly avoids supplementary calibration of the large amount of training data in the original data set: without supplementary calibration of the original data set, the target data set and the original data set can jointly participate in training a new model, the fully calibrated target data set is fused in for joint training, and the knowledge of both the target data set and the original data set can be learned. Each training iteration uses training data from both the target data set and the original data set, which serves to balance the data. Because the target model to be trained is trained with reference to a large amount of training data in the original data set, the target data set does not need to contain much training data and may consist of a small amount of training data.
Because it is the original features (namely the features output by the Mth network layer of the original trained model, which may be a feature map) that participate in the training process of the target trained model, rather than the training data in the original data set itself, sensitive data is prevented from participating in the training process, the confidential information of the training data is hidden, and security is improved. For example, considering that the training data in the original data set contains sensitive data and is inconvenient to disclose, in this embodiment the training data is first input to the original trained model to obtain the original features output by the Mth network layer of the original trained model. Obviously, these original features form a low-level feature map (i.e., only low-level features of the training data, not the training data itself), and the target model to be trained is trained by introducing these original features, i.e., training is performed with reference to a large amount of training data in the original data set, so that the trained model performs better. Since the original features are a low-level feature map and do not include the sensitive data of the training data, leakage of the sensitive data can be avoided.
Based on the same application concept as the method, an embodiment of the present application provides a data processing apparatus, as shown in fig. 5, which is a schematic structural diagram of the data processing apparatus, and the apparatus may include:
an obtaining module 51, configured to input training data in an original data set to a first original sub-network of an original trained model to obtain a first original feature; wherein the original trained model further comprises a second original subnetwork, the first original subnetwork comprises the first M network layers in the original trained model, and the second original subnetwork comprises the remaining network layers in the original trained model except the first M network layers;
inputting training data in the target data set into a first target sub-network of the target model to be trained to obtain a first target feature; the target model to be trained further comprises a second target subnetwork, the first target subnetwork comprises the first M network layers in the target model to be trained, and the second target subnetwork comprises the remaining network layers in the target model to be trained except the first M network layers;
inputting the first original feature and the first target feature into a second target sub-network of the target model to be trained, and outputting a second target feature by the second target sub-network;
a training module 52, configured to train the first target subnetwork based on the first target feature, train the second target subnetwork based on the second target feature, and obtain a target trained model based on the trained first target subnetwork and the trained second target subnetwork;
wherein the target trained model is used to implement data processing.
In a possible implementation, the training module 52 is specifically configured to:
determining whether the first target subnetwork has converged; if not, adjusting the parameters of the first target sub-network based on the first target feature to obtain an adjusted sub-network, determining the adjusted sub-network as the first target sub-network, and returning to perform the operation of inputting the training data in the target data set to the first target sub-network; if so, determining the first target sub-network as a first converged target sub-network;
determining whether the second target subnetwork has converged; if not, adjusting the parameters of the second target sub-network based on the second target feature to obtain an adjusted sub-network, determining the adjusted sub-network as the second target sub-network, and returning to execute the operation of inputting the first original feature and the first target feature into the second target sub-network; if so, determining the second target subnetwork as a second converged target subnetwork;
a target trained model is obtained based on the first converged target sub-network and the second converged target sub-network.
In one possible embodiment, the training module 52 determines whether the first target subnetwork has converged, and when determining whether the second target subnetwork has converged, is specifically configured to:
if the first target subnetwork meets the convergence condition and the second target subnetwork meets the convergence condition, determining that the first target subnetwork has converged and determining that the second target subnetwork has converged; alternatively,
if the first target subnetwork meets the convergence condition and the second target subnetwork does not meet the convergence condition, determining that the first target subnetwork has converged and determining that the second target subnetwork has converged; alternatively,
if the first target subnetwork does not satisfy the convergence condition and the second target subnetwork satisfies the convergence condition, determining that the first target subnetwork has converged and determining that the second target subnetwork has converged.
Based on the same concept as the method, an embodiment of the present application provides a data processing apparatus, which may include: a processor and a machine-readable storage medium having stored thereon machine-executable instructions executable by the processor; the processor is configured to execute the machine executable instructions to perform the steps of:
inputting training data in an original data set to a first original sub-network of an original trained model to obtain a first original feature; wherein the original trained model further comprises a second original subnetwork, the first original subnetwork comprises the first M network layers in the original trained model, and the second original subnetwork comprises the remaining network layers in the original trained model except the first M network layers;
inputting training data in a target data set to a first target sub-network of a target model to be trained to obtain a first target feature; the target model to be trained further comprises a second target subnetwork, the first target subnetwork comprises the first M network layers in the target model to be trained, and the second target subnetwork comprises the remaining network layers in the target model to be trained except the first M network layers;
inputting the first original feature and the first target feature into a second target sub-network of the target model to be trained, and outputting a second target feature by the second target sub-network;
training a first target subnetwork based on the first target feature, training a second target subnetwork based on the second target feature, and acquiring a target trained model based on the trained first target subnetwork and the trained second target subnetwork; wherein the target trained model is used to implement data processing.
Based on the same application concept as the method, embodiments of the present application further provide a machine-readable storage medium, where several computer instructions are stored, and when the computer instructions are executed by a processor, the data processing method disclosed in the above example of the present application can be implemented.
The machine-readable storage medium may be any electronic, magnetic, optical, or other physical storage device that can contain or store information such as executable instructions, data, and the like. For example, the machine-readable storage medium may be: a RAM (Random Access Memory), a volatile memory, a non-volatile memory, a flash memory, a storage drive (e.g., a hard drive), a solid state drive, any type of storage disk (e.g., an optical disk, a DVD, etc.), or a similar storage medium, or a combination thereof.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. A typical implementation device is a computer, which may take the form of a personal computer, laptop computer, cellular telephone, camera phone, smart phone, personal digital assistant, media player, navigation device, email messaging device, game console, tablet computer, wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being divided into various units by function, and are described separately. Of course, the functionality of the units may be implemented in one or more software and/or hardware when implementing the present application.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Furthermore, these computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (10)

1. A method of data processing, the method comprising:
inputting training data in an original data set to a first original subnetwork of an original trained model to obtain a first original feature; wherein the original trained model further comprises a second original subnetwork, the first original subnetwork comprises the first M network layers in the original trained model, and the second original subnetwork comprises the remaining network layers in the original trained model other than the first M network layers;
inputting training data in a target data set to a first target subnetwork of a target model to be trained to obtain a first target feature; wherein the target model to be trained further comprises a second target subnetwork, the first target subnetwork comprises the first M network layers in the target model to be trained, and the second target subnetwork comprises the remaining network layers in the target model to be trained other than the first M network layers;
inputting the first original feature and the first target feature into a second target sub-network of the target model to be trained, and outputting a second target feature by the second target sub-network;
training the first target subnetwork based on the first target feature, training the second target subnetwork based on the second target feature, and obtaining a target trained model based on the trained first target subnetwork and the trained second target subnetwork; wherein the target trained model is used to implement data processing.
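The split into a first subnetwork (the first M layers) and a second subnetwork (the remaining layers) described in claim 1 can be sketched as follows. This is a minimal, hypothetical illustration: the toy "layers" are simple numeric callables, and combining the original and target features by summation stands in for whatever fusion the actual second target subnetwork performs, which the claim does not specify.

```python
# Hypothetical sketch of the two-subnetwork split in claim 1. The layers,
# split point M, and the way the two features are combined are illustrative
# assumptions, not the patent's actual implementation.

def split_model(layers, m):
    """Split a model (a list of layer callables) into the first M layers
    (first subnetwork) and the remaining layers (second subnetwork)."""
    return layers[:m], layers[m:]

def run_subnetwork(subnetwork, x):
    """Feed the input through each layer of the subnetwork in order."""
    for layer in subnetwork:
        x = layer(x)
    return x

# Toy "layers": numeric transforms standing in for network layers. The
# first M layers of both models share the same structure (cf. claim 7).
original_model = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
target_model = [lambda x: x + 1, lambda x: x * 2, lambda x: x * 10]

M = 2  # the first M layers form the first subnetwork
first_original, second_original = split_model(original_model, M)
first_target, second_target = split_model(target_model, M)

# First features come from the first subnetwork of each model.
first_original_feature = run_subnetwork(first_original, 5)
first_target_feature = run_subnetwork(first_target, 5)

# The second target subnetwork consumes both features; "inputting both" is
# modeled here as processing each and combining the results.
second_target_feature = (run_subnetwork(second_target, first_original_feature)
                         + run_subnetwork(second_target, first_target_feature))
```

With input 5, both first subnetworks compute (5 + 1) * 2 = 12, and the second target subnetwork maps each 12 to 120 and combines them into 240.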
2. The method of claim 1, wherein the number of original trained models is N, the N original trained models are in one-to-one correspondence with the N original data sets, and the first original features input to the second target subnetwork are N first original features corresponding to the N original trained models;
wherein, for each original trained model, the training process of the original trained model comprises:
inputting training data in an original data set corresponding to the original trained model to the original model to be trained to obtain a second original feature; training the original model to be trained based on the second original feature to obtain the original trained model; wherein the original trained model is used to implement data processing.
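The one-to-one correspondence in claim 2 between N original trained models and N original data sets, with one first original feature contributed per model, can be sketched as below. The "training" function and the feature computation are toy stand-ins invented for illustration.

```python
# Hypothetical sketch of claim 2: N original models, each trained on its
# own original data set, each contributing one first original feature to
# the second target subnetwork. All names here are illustrative.

def train_original_model(dataset):
    """Stand-in for training an original model on its own data set: the
    'model' simply records a summary of the training data it saw."""
    return {"seen": sorted(dataset)}

def first_original_feature(model, x):
    """Stand-in for the first original subnetwork's output for input x."""
    return x + len(model["seen"])

original_datasets = [[3, 1], [2], [5, 4]]  # N = 3 original data sets
original_models = [train_original_model(d) for d in original_datasets]

# N first original features, one per original trained model, all of which
# would be input to the second target subnetwork.
features = [first_original_feature(m, 10) for m in original_models]
```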
3. The method of claim 1, wherein the training data in the original data set comprises an original training image and an original label category corresponding to the original training image; and the training data in the target data set comprises a target training image and a target label category corresponding to the target training image;
the target class set corresponding to the target data set is a union set of N original class sets corresponding to N original data sets, and the N original data sets are in one-to-one correspondence with the N original class sets;
the target category set comprises all target label categories in a target data set;
the original set of categories includes all original label categories in the original dataset.
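The category-set relationship in claim 3 — the target category set is the union of the N original category sets — can be sketched directly with Python sets. The concrete label names are illustrative assumptions.

```python
# Sketch of claim 3: the target category set is the union of the N original
# category sets. Labels are invented for illustration.

original_category_sets = [
    {"cat", "dog"},    # original label categories of original data set 1
    {"dog", "bird"},   # original label categories of original data set 2
    {"car"},           # original label categories of original data set 3
]

# Union of all N original category sets yields the target category set,
# i.e. every target label category in the target data set.
target_category_set = set().union(*original_category_sets)
```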
4. The method of claim 1,
the training a first target subnetwork based on the first target feature, training a second target subnetwork based on the second target feature, and obtaining a target trained model based on the trained first target subnetwork and the trained second target subnetwork, comprising:
determining whether the first target subnetwork has converged; if not, adjusting the parameters of the first target sub-network based on the first target feature to obtain an adjusted sub-network, determining the adjusted sub-network as the first target sub-network, and returning to perform the operation of inputting the training data in the target data set to the first target sub-network; if so, determining the first target sub-network as a first converged target sub-network;
determining whether the second target subnetwork has converged; if not, adjusting the parameters of the second target sub-network based on the second target feature to obtain an adjusted sub-network, determining the adjusted sub-network as the second target sub-network, and returning to execute the operation of inputting the first original feature and the first target feature into the second target sub-network; if so, determining the second target subnetwork as a second converged target subnetwork;
a target trained model is obtained based on the first converged target sub-network and the second converged target sub-network.
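The convergence loop of claim 4 — check convergence; if not converged, adjust the parameters and treat the adjusted subnetwork as the subnetwork for the next pass; if converged, keep the result — can be sketched generically. The parameter update and convergence condition below are toy stand-ins, not actual training logic.

```python
# Hypothetical sketch of the claim 4 loop. The update rule and convergence
# condition are illustrative stand-ins for gradient updates and a real
# convergence criterion.

def train_until_converged(param, update, converged, max_iters=100):
    """Repeat: if the subnetwork has not converged, adjust its parameters
    and loop again with the adjusted subnetwork; once converged, return."""
    for _ in range(max_iters):
        if converged(param):
            return param
        param = update(param)  # the adjusted subnetwork becomes the subnetwork
    return param

# Toy example: halve the parameter until it drops below a threshold.
final = train_until_converged(16.0,
                              update=lambda p: p / 2,
                              converged=lambda p: p < 1.0)
```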
5. The method of claim 4, wherein determining whether the first target subnetwork has converged and determining whether the second target subnetwork has converged comprises:
if the first target subnetwork satisfies the convergence condition and the second target subnetwork satisfies the convergence condition, determining that the first target subnetwork has converged and determining that the second target subnetwork has converged; or,
if the first target subnetwork satisfies the convergence condition and the second target subnetwork does not satisfy the convergence condition, determining that the first target subnetwork has converged and determining that the second target subnetwork has not converged; or,
if the first target subnetwork does not satisfy the convergence condition and the second target subnetwork satisfies the convergence condition, determining that the first target subnetwork has not converged and determining that the second target subnetwork has converged.
6. The method of claim 1, wherein after obtaining the target trained model based on the trained first target subnetwork and the trained second target subnetwork, the method further comprises:
if the target trained model is used for realizing target detection, after data to be processed is obtained, the data to be processed is input into the target trained model, the target trained model carries out detection processing on the data to be processed to obtain a detection result, and target detection is carried out based on the detection result;
if the target trained model is used for realizing target classification, after data to be processed is obtained, the data to be processed is input into the target trained model, the data to be processed is classified by the target trained model to obtain a classification result, and target classification is carried out based on the classification result.
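The inference-time dispatch of claim 6 — the same target trained model processes the data to be processed, and the result serves detection or classification depending on the model's purpose — can be sketched as below. The "model" is a hypothetical callable invented for illustration.

```python
# Sketch of claim 6: run the target trained model on data to be processed
# and use the result for target detection or target classification. The
# toy model and result format are illustrative assumptions.

def process(model, task, data):
    """Run the trained model on the data to be processed and label the
    result according to the task the model implements."""
    result = model(data)
    if task == "detection":
        return {"detection_result": result}
    if task == "classification":
        return {"classification_result": result}
    raise ValueError("unknown task: " + task)

toy_model = lambda data: len(data)  # stand-in for the target trained model
out = process(toy_model, "classification", [0.1, 0.9, 0.0])
```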
7. The method according to any one of claims 1 to 6,
the network structures of the first M network layers in the original trained model are the same as the network structures of the first M network layers in the target trained model;
if N original trained models exist, and N is a positive integer greater than 1, the network structures of the first M network layers in the N original trained models are the same.
8. A data processing apparatus, characterized in that the apparatus comprises:
the acquisition module is used for inputting training data in the original data set to a first original subnetwork of an original trained model to obtain a first original feature; wherein the original trained model further comprises a second original subnetwork, the first original subnetwork comprises the first M network layers in the original trained model, and the second original subnetwork comprises the remaining network layers in the original trained model other than the first M network layers;
inputting training data in the target data set into a first target subnetwork of the target model to be trained to obtain a first target feature; wherein the target model to be trained further comprises a second target subnetwork, the first target subnetwork comprises the first M network layers in the target model to be trained, and the second target subnetwork comprises the remaining network layers in the target model to be trained other than the first M network layers;
inputting the first original feature and the first target feature into a second target sub-network of the target model to be trained, and outputting a second target feature by the second target sub-network;
a training module, configured to train the first target subnetwork based on the first target feature, train the second target subnetwork based on the second target feature, and obtain a target trained model based on the trained first target subnetwork and the trained second target subnetwork;
wherein the target trained model is used to implement data processing.
9. The apparatus of claim 8, wherein the training module is specifically configured to:
determining whether the first target subnetwork has converged; if not, adjusting the parameters of the first target sub-network based on the first target feature to obtain an adjusted sub-network, determining the adjusted sub-network as the first target sub-network, and returning to perform the operation of inputting the training data in the target data set to the first target sub-network; if so, determining the first target sub-network as a first converged target sub-network;
determining whether the second target subnetwork has converged; if not, adjusting the parameters of the second target sub-network based on the second target feature to obtain an adjusted sub-network, determining the adjusted sub-network as the second target sub-network, and returning to execute the operation of inputting the first original feature and the first target feature into the second target sub-network; if so, determining the second target subnetwork as a second converged target subnetwork;
a target trained model is obtained based on the first converged target sub-network and the second converged target sub-network.
10. A data processing apparatus, characterized by comprising: a processor and a machine-readable storage medium storing machine-executable instructions executable by the processor;
the processor is configured to execute machine executable instructions to perform the steps of:
inputting training data in an original data set to a first original subnetwork of an original trained model to obtain a first original feature; wherein the original trained model further comprises a second original subnetwork, the first original subnetwork comprises the first M network layers in the original trained model, and the second original subnetwork comprises the remaining network layers in the original trained model other than the first M network layers;
inputting training data in a target data set to a first target subnetwork of a target model to be trained to obtain a first target feature; wherein the target model to be trained further comprises a second target subnetwork, the first target subnetwork comprises the first M network layers in the target model to be trained, and the second target subnetwork comprises the remaining network layers in the target model to be trained other than the first M network layers;
inputting the first original feature and the first target feature into a second target sub-network of the target model to be trained, and outputting a second target feature by the second target sub-network;
training the first target subnetwork based on the first target feature, training the second target subnetwork based on the second target feature, and obtaining a target trained model based on the trained first target subnetwork and the trained second target subnetwork; wherein the target trained model is used to implement data processing.
CN202110587053.8A 2021-05-27 2021-05-27 Data processing method, device and equipment Pending CN113240035A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110587053.8A CN113240035A (en) 2021-05-27 2021-05-27 Data processing method, device and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110587053.8A CN113240035A (en) 2021-05-27 2021-05-27 Data processing method, device and equipment

Publications (1)

Publication Number Publication Date
CN113240035A (en) 2021-08-10

Family

ID=77139250

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110587053.8A Pending CN113240035A (en) 2021-05-27 2021-05-27 Data processing method, device and equipment

Country Status (1)

Country Link
CN (1) CN113240035A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111488917A (en) * 2020-03-19 2020-08-04 天津大学 Garbage image fine-grained classification method based on incremental learning
CN111524138A (en) * 2020-07-06 2020-08-11 湖南国科智瞳科技有限公司 Microscopic image cell identification method and device based on multitask learning
CN111553890A (en) * 2020-04-22 2020-08-18 上海全景云医学影像诊断有限公司 X-ray positive chest radiography multi-task detection method based on incremental learning
CN112633495A (en) * 2020-12-18 2021-04-09 浙江大学 Multi-granularity fast and slow learning method for small sample type incremental learning
CN112766501A (en) * 2021-02-26 2021-05-07 上海商汤智能科技有限公司 Incremental training method and related product
Non-Patent Citations (1)

Title
蔡主希 (Cai Zhuxi): "智能风控与反欺诈体系算法与实践" [Intelligent Risk Control and Anti-Fraud: Systems, Algorithms and Practice], 31 March 2021, 机械工业出版社 (China Machine Press), page 90 *

Similar Documents

Publication Publication Date Title
US20210390653A1 (en) Learning robotic tasks using one or more neural networks
EP3289529B1 (en) Reducing image resolution in deep convolutional networks
US11651214B2 (en) Multimodal data learning method and device
CN109522945B (en) Group emotion recognition method and device, intelligent device and storage medium
KR101955919B1 (en) Method and program for providing tht region-of-interest in image by deep-learing algorithm
WO2018005565A1 (en) Automated selection of subjectively best images from burst captured image sequences
CN111738403B (en) Neural network optimization method and related equipment
CN111783996B (en) Data processing method, device and equipment
CN111783997B (en) Data processing method, device and equipment
CN111797870A (en) Optimization method and device of algorithm model, storage medium and electronic equipment
CN114091554A (en) Training set processing method and device
CN111382410B (en) Face brushing verification method and system
CN111104911A (en) Pedestrian re-identification method and device based on big data training
CN112529078A (en) Service processing method, device and equipment
CN111611917A (en) Model training method, feature point detection device, feature point detection equipment and storage medium
CN112686300B (en) Data processing method, device and equipment
CN113240035A (en) Data processing method, device and equipment
CN113361381B (en) Human body key point detection model training method, detection method and device
CN112884158A (en) Training method, device and equipment for machine learning program
CN112149836B (en) Machine learning program updating method, device and equipment
CN111539390A (en) Small target image identification method, equipment and system based on Yolov3
CN112906728A (en) Feature comparison method, device and equipment
CN115187845A (en) Image processing method, device and equipment
CN113642353B (en) Training method of face detection model, storage medium and terminal equipment
KR102533512B1 (en) Personal information object detection method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination