CN113762520A - Data processing method, device and equipment - Google Patents

Data processing method, device and equipment

Info

Publication number
CN113762520A
CN113762520A
Authority
CN
China
Prior art keywords
data
baseline model
model
initial baseline
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010501078.7A
Other languages
Chinese (zh)
Inventor
宋旭鸣
许朝斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co Ltd filed Critical Hangzhou Hikvision Digital Technology Co Ltd
Priority to CN202010501078.7A priority Critical patent/CN113762520A/en
Publication of CN113762520A publication Critical patent/CN113762520A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00: Machine learning
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The application provides a data processing method, a device and equipment, wherein the method comprises the following steps: acquiring an initial baseline model, a generative model and data calibration information, wherein the initial baseline model is obtained by training according to the data calibration information and sample data corresponding to the data calibration information, and the generative model is obtained by training according to the initial baseline model, the sample data and the data calibration information; processing the data calibration information through the generative model to obtain a first data characteristic; processing scene data of the terminal device through the initial baseline model to obtain a second data characteristic; training the initial baseline model through the first data characteristic and the second data characteristic to obtain a trained target baseline model; and deploying the initial baseline model or the target baseline model at the terminal device so as to process application data through the deployed model. Through the technical scheme, the performance of the target baseline model is improved, and the accuracy of the intelligent analysis result of the target baseline model is high.

Description

Data processing method, device and equipment
Technical Field
The present application relates to the field of intelligent monitoring, and in particular, to a data processing method, apparatus, and device.
Background
Machine learning is one way to realize artificial intelligence. It is a multi-disciplinary field that draws on probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory and other subjects. Machine learning studies how computers can simulate or implement human learning behaviors to acquire new knowledge or skills and reorganize existing knowledge structures so as to continuously improve their performance. Machine learning focuses on algorithm design, enabling a computer to automatically learn rules from data and use these rules to predict unknown data.
Machine learning has a wide variety of applications, such as deep learning, data mining, computer vision, natural language processing, biometric identification, search engines, medical diagnosis, credit card fraud detection, stock market analysis, DNA sequencing, speech and handwriting recognition, strategic games, and robotics.
In order to implement artificial intelligence processing by machine learning, the server needs to acquire a large amount of sample data, train a machine learning model based on the sample data, and deploy the machine learning model to the terminal device (such as a camera) so that the terminal device implements artificial intelligence processing based on the machine learning model.
Due to data privacy, a terminal device may not be able to provide data of its own environment to the server. As a result, the server cannot train the machine learning model on data of the environment where the terminal device is located, the machine learning model deployed to the terminal device does not match that environment, and the performance of the machine learning model is low.
Disclosure of Invention
The application provides a data processing method, which is applied to a terminal device and comprises the following steps:
acquiring an initial baseline model, a generative model and data calibration information; the initial baseline model is obtained by training according to the data calibration information and sample data corresponding to the data calibration information, and the generative model is obtained by training according to the initial baseline model, the sample data and the data calibration information;
processing the data calibration information through the generative model to obtain a first data characteristic;
processing scene data of the terminal device through the initial baseline model to obtain a second data characteristic;
training the initial baseline model through the first data characteristic and the second data characteristic to obtain a trained target baseline model; and deploying the initial baseline model or the target baseline model at the terminal device so as to process application data through the initial baseline model or the target baseline model.
The initial baseline model comprises a first sub-network model, which is a network model that does not require incremental training. Processing the scene data of the terminal device through the initial baseline model to obtain the second data characteristic comprises: inputting the scene data of the terminal device into the first sub-network model, so that the first sub-network model processes the scene data to obtain the second data characteristic.
The initial baseline model further comprises a second sub-network model, which is a network model that requires incremental training. Training the initial baseline model through the first data characteristic and the second data characteristic to obtain a trained target baseline model comprises:
inputting the first data characteristic and the second data characteristic into the second sub-network model to train the second sub-network model, so as to obtain a trained second sub-network model;
generating the target baseline model based on the first sub-network model and the trained second sub-network model.
The deploying of the initial baseline model or the target baseline model at the terminal device to process the application data through the initial baseline model or the target baseline model comprises:
comparing the performance of the target baseline model to the performance of the initial baseline model;
if the performance of the target baseline model is superior to that of the initial baseline model, deploying the target baseline model at the terminal equipment so as to process the application data through the target baseline model;
and if the performance of the initial baseline model is better than that of the target baseline model, deploying the initial baseline model at the terminal equipment so as to process the application data through the initial baseline model.
The comparing the performance of the target baseline model to the performance of the initial baseline model comprises: acquiring a test data set, wherein the test data set comprises a plurality of test data; determining a first performance index corresponding to the initial baseline model based on the plurality of test data, and determining a second performance index corresponding to the target baseline model based on the plurality of test data; comparing the performance of the target baseline model to the performance of the initial baseline model based on the first performance metric and the second performance metric.
The determining a first performance indicator corresponding to the initial baseline model based on the plurality of test data and a second performance indicator corresponding to the target baseline model based on the plurality of test data comprises: processing the plurality of test data through the initial baseline model to obtain initial prediction categories of the plurality of test data; determining the first performance indicator based on an initial predicted category of the plurality of test data and an actual category of the plurality of test data; processing the plurality of test data through the target baseline model to obtain target prediction categories of the plurality of test data; determining the second performance indicator based on a target prediction category of the plurality of test data and an actual category of the plurality of test data.
The total number of test data in the test data set is less than the total number of scene data in a training data set, and the scene data in the training data set is the scene data input to the initial baseline model;
and/or the test data set comprises a first test subset and a second test subset, wherein the total number of test data in the first test subset is less than the total number of test data in the second test subset;
the test data in the first test subset is derived from the scene data in the training data set, and the test data in the second test subset is derived from the scene data acquired by the terminal device on site.
Training an initial baseline model through the first data characteristic and the second data characteristic to obtain a trained target baseline model, including: inputting the first data characteristic and the second data characteristic into an initial baseline model, and processing the first data characteristic and the second data characteristic through the initial baseline model to obtain a third data characteristic; determining whether the initial baseline model has converged based on the third data characteristic; if so, determining the initial baseline model as a trained target baseline model;
and if not, adjusting the initial baseline model, and returning to execute the operation of inputting the first data characteristic and the second data characteristic to the initial baseline model based on the adjusted initial baseline model.
The application provides a data processing device, which is applied to a terminal device and comprises: an acquisition module, configured to acquire an initial baseline model, a generative model and data calibration information; the initial baseline model is obtained by training according to the data calibration information and sample data corresponding to the data calibration information, and the generative model is obtained by training according to the initial baseline model, the sample data and the data calibration information;
a processing module, configured to process the data calibration information through the generative model to obtain a first data characteristic, and to process scene data of the terminal device through the initial baseline model to obtain a second data characteristic;
a training module, configured to train the initial baseline model through the first data characteristic and the second data characteristic to obtain a trained target baseline model;
a deployment module, configured to deploy the initial baseline model or the target baseline model at the terminal device, so as to process application data through the initial baseline model or the target baseline model.
The application provides a terminal device, including: a processor and a machine-readable storage medium storing machine-executable instructions executable by the processor;
the processor is configured to execute machine executable instructions to perform the steps of:
acquiring an initial baseline model, a generative model and data calibration information; the initial baseline model is obtained by training according to the data calibration information and sample data corresponding to the data calibration information, and the generative model is obtained by training according to the initial baseline model, the sample data and the data calibration information;
processing the data calibration information through the generative model to obtain a first data characteristic;
processing scene data of the terminal device through the initial baseline model to obtain a second data characteristic;
training the initial baseline model through the first data characteristic and the second data characteristic to obtain a trained target baseline model; and deploying the initial baseline model or the target baseline model at the terminal device so as to process application data through the initial baseline model or the target baseline model.
According to the technical scheme, in the embodiments of the application, the terminal device trains the initial baseline model with scene data (that is, data of the environment where the terminal device is located) to obtain the target baseline model. The target baseline model can therefore match the environment where the terminal device is located, the performance of the target baseline model is improved, and the accuracy of its intelligent analysis results is high. After the target baseline model is obtained, the performance of the initial baseline model is compared with the performance of the target baseline model to decide which of the two is deployed on the terminal device. This ensures that the baseline model with the better performance is deployed on the terminal device and prevents the baseline model with the poorer performance from being deployed. The baseline model deployed on the terminal device is thus iteratively updated, its performance is continuously improved, and its intelligent analysis results become more accurate.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments described in the present application, and other drawings can be obtained by those skilled in the art according to these drawings.
FIG. 1 is a schematic diagram of an online learning system according to an embodiment of the present application;
FIG. 2 is a schematic diagram of an initial baseline model training process in one embodiment of the present application;
FIG. 3 is a schematic diagram of a training process for generating a model in one embodiment of the present application;
FIG. 4 is a flow diagram of a data processing method in one embodiment of the present application;
FIG. 5 is a schematic diagram illustrating an incremental training process for an initial baseline model in one embodiment of the present application;
FIG. 6 is a schematic deployment diagram of a baseline model in one embodiment of the present application;
FIG. 7 is a block diagram of a data processing apparatus according to an embodiment of the present application;
fig. 8 is a block diagram of a terminal device according to an embodiment of the present application.
Detailed Description
The terminology used in the embodiments of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this application and the claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein is meant to encompass any and all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used in the embodiments of the present application to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present application. Depending on the context, the word "if" as used herein may be interpreted as "upon" or "when" or "in response to determining".
Before the technical solutions of the present application are introduced, concepts related to the embodiments of the present application are introduced.
Machine learning: machine learning is a way to implement artificial intelligence, and studies how a computer simulates or implements human learning behaviors to acquire new knowledge or skills and reorganize its existing knowledge structure so as to continuously improve its performance. Deep learning, a subclass of machine learning, models a specific real-world problem with a mathematical model in order to solve similar problems in that field. A neural network is one implementation of deep learning; for convenience of description, the structure and function of a neural network are described below as an example, and other subclasses of machine learning are similar.
A neural network: the neural network may include, but is not limited to, a convolutional neural network (CNN), a recurrent neural network (RNN), a fully-connected network, and the like, and the structural units of the neural network may include, but are not limited to, a convolutional layer (Conv), a pooling layer (Pool), an excitation layer, a fully-connected layer (FC), and the like.
In practical application, one or more convolution layers, one or more pooling layers, one or more excitation layers, and one or more fully-connected layers may be combined to construct a neural network according to different requirements.
In the convolutional layer, the input data features are enhanced by performing a convolution operation on the input data features using a convolution kernel, the convolution kernel may be a matrix of m × n, the input data features of the convolutional layer are convolved with the convolution kernel, the output data features of the convolutional layer may be obtained, and the convolution operation is actually a filtering process.
In the pooling layer, the input data features (such as the output of the convolutional layer) are subjected to operations of taking the maximum value, taking the minimum value, taking the average value and the like, so that the input data features are sub-sampled by utilizing the principle of local correlation, the processing amount is reduced, the feature invariance is kept, and the operation of the pooling layer is actually a down-sampling process.
In the excitation layer, the input data features can be mapped using an activation function (e.g., a nonlinear function), thereby introducing a nonlinear factor such that the neural network enhances expressive power through a combination of nonlinearities. The activation function may include, but is not limited to, a ReLU (Rectified Linear Unit) function that is used to set features less than 0 to 0, while features greater than 0 remain unchanged.
The fully-connected layer is configured to perform full-connection processing on all data features input to it, so as to obtain a feature vector, which may include a plurality of data features.
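For illustration only (this sketch is not part of the patent text), the structural units described above can be combined into a small convolutional neural network as follows; the framework (PyTorch), the layer sizes and the assumed 3 × 64 × 64 input are arbitrary choices used purely as an example.

```python
import torch
import torch.nn as nn

# Illustrative combination of the structural units described above: convolutional layers,
# excitation layers (ReLU), pooling layers and a fully-connected layer.
# All layer sizes and the input resolution are assumptions, not values from the patent.
class SmallNet(nn.Module):
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),  # convolution: filtering with an m x n kernel
            nn.ReLU(),                                    # excitation: features below 0 are set to 0
            nn.MaxPool2d(2),                              # pooling: down-sampling by taking local maxima
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 16 * 16, num_classes)  # fully-connected: outputs a feature vector

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)                  # assumes a 3 x 64 x 64 input image
        return self.classifier(x.flatten(1))

# Example usage: logits = SmallNet()(torch.randn(1, 3, 64, 64))
```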
Baseline model of a neural network (e.g. a convolutional neural network): in the training process of a neural network, sample data may be used to train the parameters of each layer of the neural network, such as convolutional layer parameters (e.g. convolution kernel parameters), pooling layer parameters, excitation layer parameters, fully-connected layer parameters, and the like, which is not limited. By training the parameters of each layer, the neural network can fit the mapping relationship between input and output.
After the training of the neural network is completed, the trained neural network is a baseline model of the neural network, which is referred to herein as the baseline model for short. The baseline model can be deployed to each terminal device, so that each terminal device can realize artificial intelligence processing based on the baseline model, such as human face detection, human body detection, vehicle detection and the like.
For example, for face detection, an image including a face may be input to a baseline model, the baseline model performs artificial intelligence processing on the image, and the artificial intelligence processing result is a face detection result.
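Continuing the illustrative sketch above (again, not part of the patent text), applying such a trained baseline model to an image could look roughly like this; SmallNet is the example network sketched earlier, and treating class 1 as "face" is an assumption made only for the example.

```python
import torch

# Illustrative inference with a trained baseline model for face detection.
# SmallNet is the example network above; the random tensor stands in for a real image.
model = SmallNet(num_classes=2)
model.eval()
with torch.no_grad():
    image = torch.randn(1, 3, 64, 64)            # placeholder for an image containing a face
    scores = model(image)                        # artificial intelligence processing by the baseline model
    is_face = scores.argmax(dim=1).item() == 1   # face detection result (class 1 = "face" is an assumption)
```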
Generative model and discriminative model of a neural network (such as a convolutional neural network): a GAN (Generative Adversarial Network) is a type of neural network that can implement artificial intelligence processing through unsupervised learning, and includes a generative model (Generative Model) and a discriminative model (Discriminative Model). The generative model, which may also be referred to as a generator, can randomly generate observation data based on some implicit information (referred to herein as data calibration information). The discriminative model, which may also be referred to as a discriminator, is capable of evaluating, based on its input variables, the observation data generated by the generative model.
The GAN training process is actually a training process for the generative model and the discriminative model, and sample data may be used to train the network parameters in the generative model and the discriminative model, such as convolutional layer parameters (e.g. convolution kernel parameters), pooling layer parameters, excitation layer parameters, fully-connected layer parameters, and the like, which is not limited.
Sample data and scene data: in an intelligent monitoring scene, a large number of terminal devices (such as analog cameras, IPCs (IP cameras), etc.) may be deployed. These terminal devices monitor their own environments, that is, they collect video data of the environments where they are located, and this video data is referred to as scene data.
For some terminal devices, for example, a terminal device monitoring a bedroom environment, a terminal device monitoring a factory environment, and the like, due to data privacy, the terminal device does not provide scene data to the server after collecting the scene data. For some terminal devices, for example, terminal devices monitoring highway environments, the terminal devices may provide the scene data to the server after collecting the scene data.
In summary, the server may obtain the scene data from some terminal devices, and the scene data obtained by the server is referred to as sample data, and of course, the server may also obtain the sample data by other methods, which is not limited herein. After the server obtains the sample data, the server can train the neural network by using the sample data.
The sample data may be image data, or may be other types of data, for example, and is not limited thereto. The scene data may be image data or other types of data, which is not limited in this respect.
Training data, test data and application data: in an intelligent monitoring scene, terminal equipment can acquire a large amount of scene data of the environment where the terminal equipment is located, and the scene data are divided into training data, testing data and application data. The training data is scene data used for training the baseline model, and the specific training process is described in the following embodiments. The test data is scenario data for testing the performance of the baseline model, and the specific test process is described in the following embodiments. The application data is the scene data that needs to be input to the baseline model, so that the baseline model performs artificial intelligence processing on the application data, and the specific processing process refers to the following embodiments.
Since some terminal devices (hereinafter referred to as terminal device A) do not provide their scene data to the server, the server cannot train the baseline model based on the scene data of terminal device A. After the baseline model is deployed to terminal device A, the baseline model therefore cannot match the environment where terminal device A is located, and its performance is low.
In view of the above, in the embodiments of the present application, after the server deploys the baseline model to terminal device A, terminal device A may train the baseline model with scene data of the environment where it is located, so as to obtain a new baseline model. Since the baseline model is trained with scene data of the environment where terminal device A is located, the new baseline model can match that environment, and the performance of the new baseline model is better.
The technical solutions of the embodiments of the present application are described below with reference to specific embodiments.
Referring to fig. 1, which is a schematic structural diagram of an online learning system, the online learning system may include, but is not limited to, an offline training module and an online training module, the offline training module may be deployed in a server (which may also be referred to as a model providing device), and the online training module may be deployed in each terminal device.
The offline training module may include a baseline model training sub-module and a generative model training sub-module. The baseline model training sub-module is used for training the baseline model; for convenience of distinction, the baseline model obtained by the baseline model training sub-module is referred to as the initial baseline model. The generative model training sub-module is used for training a generative model, and the generative model obtained by the generative model training sub-module may also be called an encryption compression model.
The online training module may include a data construction sub-module, an incremental training sub-module, and a model evaluation sub-module. The data construction sub-module is configured to construct a test data set and a training data set, the test data set including a plurality of test data and the training data set including a plurality of training data. The incremental training sub-module is used for performing incremental training on the initial baseline model to obtain a target baseline model, where incremental training means modifying the remainder of the initial baseline model while retaining a portion of its contents. The model evaluation sub-module is used for comparing the performance of the initial baseline model with the performance of the target baseline model. If the performance of the target baseline model is superior to that of the initial baseline model, the target baseline model is deployed in the terminal device, and artificial intelligence processing is carried out through the target baseline model. If the performance of the initial baseline model is superior to that of the target baseline model, the initial baseline model is deployed on the terminal device, and artificial intelligence processing is carried out through the initial baseline model.
Referring to fig. 2, a schematic diagram of the training process of the initial baseline model is shown. The initial baseline model is obtained by training according to data calibration information and sample data corresponding to the data calibration information. For example, the baseline model training sub-module may obtain a large amount of sample data, and each piece of sample data has data calibration information, such as an actual category and/or a target box, which is not limited.
For example, for an application scenario of face detection, the sample data may be a sample image, the target box may be coordinate information of a certain rectangular box in the sample image (e.g., coordinates of an upper left corner of the rectangular box, a width and a height of the rectangular box, etc.), and the actual category may indicate that the rectangular box area is a face or not a face.
The baseline model training sub-module can input a large amount of sample data and the data calibration information corresponding to the sample data into a neural network (such as a convolutional neural network), so that the sample data and the data calibration information are used to train the parameters of each layer of the neural network; the training process is not limited. After the neural network training is completed, the trained neural network serves as the initial baseline model.
Referring to fig. 3, a schematic diagram of the training process of the generative model is shown. The generative model is obtained by training according to the initial baseline model, sample data, and the data calibration information corresponding to the sample data. For example, the generative model training sub-module may obtain the initial baseline model (i.e., trained by the baseline model training sub-module), a large amount of sample data, and the data calibration information of each piece of sample data, such as an actual category and/or a target box.
The generative model training submodule may input the sample data to the initial baseline model, and the initial baseline model may process the sample data (e.g., network forward processing) to obtain data features corresponding to the sample data.
For example, the initial baseline model includes N layers (e.g., convolutional layers, pooling layers, excitation layers, fully-connected layers, etc.). Sample data is input to the first layer of the initial baseline model, which processes it to obtain the output data of the first layer; the output data of the first layer is input to the second layer of the initial baseline model, which processes it to obtain the output data of the second layer; the output data of the second layer is input to the third layer of the initial baseline model, and so on, until the output data of the (N-1)th layer is input to the Nth layer of the initial baseline model, which processes it to obtain the output data of the Nth layer.
For example, the generative model training sub-module may take the output data of the Mth layer of the initial baseline model as the data feature corresponding to the sample data, and output this data feature (i.e., the output data of the Mth layer) to the discriminative model of the GAN. M may be a positive integer greater than or equal to 1 and less than N.
For example, M may be configured empirically, without limitation, where the Mth layer is the last layer that does not require incremental training and the (M+1)th layer is the first layer that requires incremental training. For example, the first layer to the Mth layer of the initial baseline model are layers that do not require incremental training, the (M+1)th layer is a layer that requires incremental training, and each of the (M+2)th layer to the Nth layer may be a layer that requires incremental training or a layer that does not require incremental training. For example, the (M+2)th layer to the (M+5)th layer do not need incremental training, and the (M+5)th layer to the Nth layer need incremental training, which is not limited herein. For another example, the (M+2)th layer to the Nth layer are all layers that require incremental training, which is not limited either.
In summary, a first sub-network model of the initial baseline model may be set in advance, and the first sub-network model is a network model that does not require incremental training. For example, the first sub-network model may include the first through Mth layers of the initial baseline model, and the (M+2)th through (M+5)th layers of the initial baseline model. A second sub-network model of the initial baseline model may also be set in advance, and the second sub-network model is a network model that requires incremental training. For example, the second sub-network model may include the (M+1)th layer of the initial baseline model and the (M+5)th through Nth layers of the initial baseline model. Of course, the above is merely an example and is not limiting.
That the first sub-network model does not need incremental training means: after the initial baseline model is deployed to the terminal device, the online training module does not need to adjust the neural network parameters of the first sub-network model.
That the second sub-network model needs incremental training means: after the initial baseline model is deployed to the terminal device, the online training module needs to adjust the neural network parameters of the second sub-network model.
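The split into a frozen first sub-network model and a trainable second sub-network model can be illustrated as follows (a minimal sketch, not part of the patent; a PyTorch-style layer list and M = 4 are assumptions, and the optional non-trained layers after the Mth layer are omitted for brevity).

```python
import torch.nn as nn

# Illustrative split of an N-layer initial baseline model into a first sub-network model
# (layers 1..M, no incremental training) and a second sub-network model (the remaining
# layers, incremental training). M = 4 is an arbitrary assumption.
def split_baseline(layers: nn.ModuleList, m: int = 4):
    first_subnet = nn.Sequential(*layers[:m])   # its output (the Mth layer's output) is the data feature
    second_subnet = nn.Sequential(*layers[m:])
    for p in first_subnet.parameters():
        p.requires_grad = False                 # the first sub-network's parameters are not adjusted on-device
    return first_subnet, second_subnet
```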
In summary, the generative model training sub-module inputs the sample data to the first sub-network model, processes the sample data by the first sub-network model to obtain the data characteristics corresponding to the sample data (i.e., the output data of the first sub-network model), and outputs the data characteristics to the GAN discriminant model.
As shown in fig. 3, the generative model training sub-module may input the data calibration information corresponding to the sample data to the generative model of the GAN (e.g., a neural network generator), so that the generative model processes the data calibration information, without limiting the processing process, obtains a data feature corresponding to the data calibration information, and outputs the data feature to a discriminant model of the GAN (e.g., a neural network discriminator).
With continued reference to fig. 3, the GAN discriminative model has two input data, one input data is the data feature corresponding to the sample data from the initial baseline model, and the other input data is the data feature corresponding to the data calibration information from the generated model, and the discriminative model can analyze the probability that the data feature corresponding to the data calibration information is the true data based on the similarity of the two data features.
And if the GAN is determined to be converged according to the probability, finishing the training process of the discriminant model and the generative model, and outputting the discriminant model and the generative model. In this embodiment, the generative model is deployed to the terminal device.
If the GAN is determined not to be converged according to the probability, adjusting each neural network parameter of the discriminant model, adjusting each neural network parameter of the generative model, and obtaining the adjusted discriminant model and the adjusted generative model without limiting the parameter adjustment process. And processing the data calibration information based on the adjusted generation model to obtain data characteristics corresponding to the data calibration information, and outputting the data characteristics to the adjusted discrimination model. And re-analyzing the probability that the data features corresponding to the data calibration information are real data based on the adjusted discrimination model, and determining whether the GAN is converged according to the probability, and so on.
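The adversarial training of fig. 3 can be sketched as follows (for illustration only; the generator and discriminator architectures, a discriminator output that is a probability of shape (batch, 1), and the optimizer settings are assumptions that the patent does not specify).

```python
import torch
import torch.nn as nn

# Illustrative GAN-style training of the generative model: the discriminator learns to
# distinguish features produced by the initial baseline model's first sub-network (real)
# from features generated from the data calibration information, and the generator learns
# to make the two indistinguishable. All names are placeholders.
def train_generative_model(generator, discriminator, first_subnet,
                           sample_data, calibration_info, steps=1000, lr=1e-4):
    bce = nn.BCELoss()
    opt_g = torch.optim.Adam(generator.parameters(), lr=lr)
    opt_d = torch.optim.Adam(discriminator.parameters(), lr=lr)
    for _ in range(steps):
        with torch.no_grad():
            real_feat = first_subnet(sample_data)    # data feature of the sample data (Mth-layer output)
        fake_feat = generator(calibration_info)      # data feature generated from the calibration information

        # Discriminator step: real features should score 1, generated features should score 0.
        d_loss = (bce(discriminator(real_feat), torch.ones(real_feat.size(0), 1)) +
                  bce(discriminator(fake_feat.detach()), torch.zeros(fake_feat.size(0), 1)))
        opt_d.zero_grad()
        d_loss.backward()
        opt_d.step()

        # Generator step: generated features should be judged as real.
        g_loss = bce(discriminator(fake_feat), torch.ones(fake_feat.size(0), 1))
        opt_g.zero_grad()
        g_loss.backward()
        opt_g.step()
    return generator   # deployed to the terminal device once training converges
```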
In summary, the offline training module may obtain the trained initial baseline model and generative model, deploy the initial baseline model and the generative model to the terminal device, and send the data calibration information of the sample data to the terminal device. It should be noted that the offline training module sends the data calibration information of the sample data to the terminal device, rather than sending the sample data itself.
Illustratively, in order to improve the performance of the target baseline model, the online training module not only trains the initial baseline model by using the scene data of the terminal device, but also acquires the data calibration information of the sample data from the server, and trains the initial baseline model by using the data calibration information of the sample data, that is, trains the initial baseline model by using the data calibration information of the sample data and the scene data to obtain the target baseline model, so that the trained target baseline model has better performance.
For example, considering that the sample data itself contains privacy information, which is sensitive to the user and is inconvenient to disclose, in this embodiment, the server sends the data calibration information of the sample data to the terminal device, and the data calibration information is discrete digital information, which is desensitization information and does not contain privacy information.
In addition, the storage space occupied by the sample data is large, and the storage space occupied by the data calibration information is small, so that the data calibration information of the sample data is sent to the terminal equipment, and the storage resource can be saved.
The offline training module can deploy the initial baseline model and the generative model to the terminal device and send the data calibration information of the sample data to the terminal device, so that the online training module deployed on the terminal device can obtain the initial baseline model, the generative model and the data calibration information of the sample data.
For example, the data construction sub-module may construct a test data set and a training data set according to scenario data of the terminal device, where the test data set may include a plurality of test data, and for convenience of distinction, the scenario data in the test data set is referred to as test data. The training data set may include a plurality of training data, and for the sake of distinction, scene data in the training data set is referred to as training data.
For example, in the process of acquiring scene data by the terminal device, the data construction sub-module may acquire scene data that is not reported to the server by the terminal device, and add the scene data to the training data set. And when the amount of the scene data in the training data set reaches a preset threshold value, performing incremental training on the initial baseline model by the incremental training sub-module according to the scene data in the training data set to obtain a target baseline model. The preset threshold may be configured empirically, and is not limited thereto, for example, the preset threshold may be 5000.
For each scene data in the training data set, the data construction sub-module may further obtain data calibration information of the scene data, such as an actual category and/or a target frame, and the like, and the obtaining manner of the data calibration information is not limited, so that each scene data in the training data set corresponds to the data calibration information.
In the process of acquiring the scene data by the terminal device, the data construction sub-module may further add the scene data to the test data set, for example, add part of the scene data in the training data set to the test data set, and add the scene data that is not reported to the server by the terminal device (the scene data is not added to the training data set) and/or the scene data that is reported to the server by the terminal device to the test data set.
For each scene data in the test data set, the data construction sub-module may further obtain data calibration information of the scene data, such as an actual category and/or a target frame, and the like, and the obtaining manner of the data calibration information is not limited, so that each scene data in the test data set corresponds to the data calibration information.
In one possible implementation, the total number of scene data in the test data set (i.e., the total number of test data) may be less than the total number of scene data in the training data set (i.e., the total number of training data).
For example, a proportional relationship between the total number of test data and the total number of training data may be predetermined, for example, the proportional relationship may be 1: n, and when n is 20 and the total number of training data in the training data set is 5000, the total number of test data in the test data set is 250, although the above values are only examples.
In one possible implementation, the set of test data may include a first test subset and a second test subset, and a total amount of test data in the first test subset may be less than a total amount of test data in the second test subset. For example, a proportional relationship between the total number of test data in the first test subset and the total number of test data in the second test subset is predetermined, for example, the proportional relationship is 1: m, when m is 3 and the total number of test data in the test data set is 400, the total number of test data in the first test subset is 100, and the total number of test data in the second test subset is 300, which is only an example.
The first test subset may also be referred to as a training performance test subset, and the test data in the first test subset is derived from the scenario data in the training data set, i.e., the data construction sub-module may add part of the scenario data in the training data set to the first test subset. The second test subset may also be referred to as a generalization performance test subset, and the test data in the second test subset is derived from scene data acquired by the terminal device on site, that is, the data construction sub-module adds, to the second test subset, scene data that is not reported to the server by the terminal device (the scene data is not added to the training data set) and/or scene data that is reported to the server by the terminal device.
The test data in the second test subset should come from as many different on-site scene point locations as possible; the principle is to cover point locations, seasons, and time periods of the day as widely as possible.
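A minimal sketch of the data construction sub-module's bookkeeping under the example ratios above (threshold 5000, test:training = 1:n with n = 20, first:second test subset = 1:m with m = 3); it is illustrative only, and the random sampling strategy and the use of later field data for the second subset are assumptions.

```python
import random

# Illustrative construction of the training data set, the training-performance test subset
# and the generalization-performance test subset. scene_data is a placeholder list of
# (scene_datum, calibration_info) pairs collected on-device and not reported to the server.
def build_datasets(scene_data, threshold=5000, n=20, m=3):
    training_set = scene_data[:threshold]
    num_test = threshold // n                    # e.g. 250 test data in total
    num_first = num_test // (m + 1)              # 1:m split between the two test subsets
    first_subset = random.sample(training_set, num_first)          # drawn from the training data set
    # Field data not used for training; ideally it covers different scene point locations,
    # seasons and time periods of the day as widely as possible.
    second_subset = scene_data[threshold:threshold + (num_test - num_first)]
    return training_set, first_subset, second_subset
```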
For example, the incremental training sub-module may input the scene data in the training data set to the initial baseline model, so as to perform incremental training on the initial baseline model by using the scene data to obtain the target baseline model, which is described in the following embodiments. The incremental training sub-module performs iterative training on the initial baseline model by using the scene data of the terminal equipment, so that the performance of the target baseline model can be improved.
For example, the model evaluation module may compare the performance of the initial baseline model with the performance of the target baseline model, and decide to deploy the target baseline model or the initial baseline model at the terminal device based on the performance of the initial baseline model and the performance of the target baseline model, as described in the following embodiments.
Based on the application scenario, an embodiment of the present application provides a data processing method, which is shown in fig. 4 and is a schematic flow chart of the data processing method, where the method is applied to a terminal device, and the method includes:
Step 401, obtaining an initial baseline model, a generative model and data calibration information. Illustratively, the initial baseline model is obtained by training according to the data calibration information and sample data corresponding to the data calibration information, and the generative model is obtained by training according to the initial baseline model, the sample data and the data calibration information.
For example, the server may train the initial baseline model according to the data calibration information and the sample data corresponding to the data calibration information, and deploy the initial baseline model to the terminal device. The server may train the generative model according to the initial baseline model, the sample data and the data calibration information, and deploy the generative model to the terminal device. The server may also send the data calibration information to the terminal device. In summary, the terminal device can obtain the initial baseline model, the generative model and the data calibration information.
Step 402, processing the data calibration information through the generative model to obtain a first data feature.
Step 403, processing scene data (such as scene data in the training data set) of the terminal device through the initial baseline model, and obtaining a second data feature corresponding to the scene data.
Step 404, training the initial baseline model through the first data characteristic and the second data characteristic to obtain a trained target baseline model (the trained initial baseline model is referred to as a target baseline model).
For steps 402 to 404, referring to fig. 5, the data calibration information may be input to the generative model of the GAN, so that the generative model processes the data calibration information (the processing process is not limited) to obtain a data feature corresponding to the data calibration information, which is recorded as the first data feature.
For example, by sending the data calibration information of the sample data to the terminal device instead of the sample data containing privacy information, user privacy can be protected. Even though only the data calibration information of the sample data is sent to the terminal device, the terminal device can still obtain the first data feature based on the data calibration information and train the initial baseline model based on the first data feature to obtain the target baseline model.
The first data feature is obtained after the generative model processes the data calibration information. The first data feature is a low-level feature map (including only low-level features of the sample data, not the sample data itself). By introducing the first data feature into the training of the initial baseline model, the initial baseline model is in effect trained with reference to the data features of a large amount of sample data on the server, so the trained target baseline model has better performance. Moreover, because the first data feature is a low-level feature map that does not include the privacy information of the sample data, the privacy information of the sample data cannot be recovered from the first data feature, and leakage of privacy information is avoided.
Continuing to refer to fig. 5, the scene data in the training data set may be input to the initial baseline model, so that the initial baseline model processes the scene data, without limiting the processing process, to obtain a data feature corresponding to the scene data, and mark the data feature as a second data feature.
With continued reference to fig. 5, after the first data feature and the second data feature are obtained, the first data feature and the second data feature are input to the initial baseline model to process the first data feature and the second data feature through the initial baseline model to obtain a third data feature. For example, assuming that the initial baseline model is used to implement target detection (e.g., human face detection, human body detection, vehicle detection, etc.), input data (i.e., the first data feature and the second data feature) is provided to the initial baseline model, and the initial baseline model processes the input data, for example, processes the input data using all neural network parameters, to obtain output data, which is a feature vector, from which a target detection result can be determined.
In this embodiment, after obtaining the feature vector, instead of determining the target detection result by the feature vector, the feature vector may be used as a third data feature, and it is determined whether the initial baseline model has converged based on the third data feature. For example, a loss value of a loss function may be determined based on the third data characteristic, and a determination may be made as to whether the initial baseline model has converged based on the loss value of the loss function.
If the initial baseline model is converged, the initial baseline model can be determined as a trained target baseline model, and thus, the training process of the initial baseline model is completed, and the target baseline model is obtained.
If the initial baseline model is not converged, the initial baseline model can be adjusted, and the adjustment process is not limited, for example, each neural network parameter of the initial baseline model can be adjusted to obtain the adjusted initial baseline model. Then, the first data characteristic (which is kept unchanged) and the second data characteristic (which is kept unchanged) are input into the adjusted initial baseline model, so that the first data characteristic and the second data characteristic are processed through the adjusted initial baseline model to obtain a new third data characteristic, and whether the adjusted initial baseline model converges or not is determined based on the new third data characteristic, and so on.
For example, a loss function may be pre-constructed, the input of the loss function is related to the third data characteristic, the output of the loss function is a loss value, and the loss function is not limited and may be configured empirically. After substituting the third data characteristic into the loss function, a loss value of the loss function can be obtained.
Whether the initial baseline model has converged may be determined from a single loss value, e.g., loss value 1 of the loss function determined based on the third data feature. If loss value 1 is not greater than a threshold, it is determined that the initial baseline model has converged; if loss value 1 is greater than the threshold, it is determined that the initial baseline model has not converged. Alternatively,
whether the initial baseline model has converged can be determined according to a plurality of loss values of a plurality of iterations, for example, in each iteration, the initial baseline model of the last iteration is adjusted to obtain an adjusted initial baseline model, and one loss value can be obtained in each iteration. And then determining a change amplitude curve of the plurality of loss values, and determining that the initial baseline model of the last iteration process has converged if the change amplitude of the loss values is determined to be stable (the loss values of the continuous iteration processes are not changed or the change amplitude is small) according to the change amplitude curve and the loss value of the last iteration process is not greater than a threshold value. Otherwise, determining that the initial baseline model of the last iteration process is not converged, continuing to perform the next iteration process to obtain the loss value of the next iteration process, and re-determining the change amplitude curve of the loss values.
In practical applications, whether the initial baseline model has converged may also be determined in other manners, which is not limited in this respect. For example, if the number of iterations reaches a preset number threshold, it is determined that the initial baseline model has converged; for another example, if the iteration duration reaches a preset duration threshold, it is determined that the initial baseline model has converged.
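A possible convergence check combining the criteria above (the latest loss value is not greater than a threshold, and the change amplitude of recent loss values has stabilized) is sketched below; the window size and tolerances are arbitrary assumptions.

```python
# Illustrative convergence check over the loss values of successive iterations.
def has_converged(loss_history, threshold=0.05, window=5, max_delta=1e-3):
    if len(loss_history) < window:
        return False
    recent = loss_history[-window:]
    stable = max(recent) - min(recent) <= max_delta   # change amplitude has stabilized
    return stable and recent[-1] <= threshold         # and the last loss is below the threshold
```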
Referring to the above embodiments, the initial baseline model may include a first sub-network model that is a network model not requiring incremental training and a second sub-network model that is a network model requiring incremental training. For example, the initial baseline model may include N layers: the first to Mth layers of the initial baseline model may serve as the first sub-network model, and the (M+1)th to Nth layers of the initial baseline model may serve as the second sub-network model.
On this basis, for steps 402-404, the data calibration information may be input to the generative model so that the generative model processes the data calibration information to obtain a first data characteristic corresponding to the data calibration information. The scene data in the training data set may be input to the first sub-network model of the initial baseline model, so that the first sub-network model processes the scene data to obtain a second data feature corresponding to the scene data, where it is noted that the second data feature is an output result of the first sub-network model.
Since the first sub-network model does not need to be incrementally trained (i.e., each neural network parameter of the first sub-network model does not need to be adjusted), and the second sub-network model needs to be incrementally trained (i.e., each neural network parameter of the second sub-network model needs to be adjusted), after the first data feature and the second data feature are obtained, the first data feature and the second data feature are input to the second sub-network model of the initial baseline model to train the second sub-network model, so that the trained second sub-network model is obtained.
And generating a target baseline model based on the first sub-network model and the trained second sub-network model, namely combining the first sub-network model and the trained second sub-network model to obtain the target baseline model.
For example, after the first data feature and the second data feature are input to the second sub-network model, the first data feature and the second data feature are processed by the second sub-network model to obtain a third data feature. Determining whether the second sub-network model has converged based on the third data characteristic, such as determining a loss value of the loss function based on the third data characteristic, and determining whether the second sub-network model has converged based on the loss value of the loss function.
If the second sub-network model has converged, it is taken as the trained second sub-network model, and the first sub-network model and the trained second sub-network model are combined to obtain the target baseline model; the training process of the initial baseline model is thus completed.
If the second sub-network model has not converged, the second sub-network model is adjusted (the adjustment process is not limited here) to obtain an adjusted second sub-network model. The first data feature and the second data feature are then input to the adjusted second sub-network model, which processes them to obtain a new third data feature, and whether the adjusted second sub-network model has converged is determined based on the new third data feature, and so on.
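A rough PyTorch-style sketch of the sub-network split and of the incremental training of the second sub-network described above is given below. It assumes the baseline model is an `nn.Sequential` of N layers, that the first and second data features can be concatenated along the batch dimension, and that category labels are available for both; every name (`split_baseline`, `incremental_train`, `generation_model`, and so on) is an illustrative assumption, not the patent's implementation.

```python
import torch
import torch.nn as nn

def split_baseline(baseline: nn.Sequential, m: int):
    """Split an N-layer baseline model: layers 1..M form the first sub-network
    (kept frozen), layers M+1..N form the second sub-network (incrementally trained)."""
    layers = list(baseline.children())
    first_sub = nn.Sequential(*layers[:m])
    second_sub = nn.Sequential(*layers[m:])
    for p in first_sub.parameters():
        p.requires_grad = False  # the first sub-network's parameters are not adjusted
    return first_sub, second_sub

def incremental_train(first_sub, second_sub, generation_model,
                      calib_info, calib_labels, scene_data, scene_labels,
                      epochs=10, lr=1e-3):
    """Adjust only the second sub-network using the first and second data features."""
    loss_fn = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(second_sub.parameters(), lr=lr)

    with torch.no_grad():
        first_feat = generation_model(calib_info)   # first data feature
        second_feat = first_sub(scene_data)         # second data feature
    features = torch.cat([first_feat, second_feat], dim=0)
    targets = torch.cat([calib_labels, scene_labels], dim=0)

    for _ in range(epochs):
        optimizer.zero_grad()
        third_feat = second_sub(features)           # third data feature
        loss = loss_fn(third_feat, targets)         # loss value used for the convergence check
        loss.backward()
        optimizer.step()

    # Combine the frozen first sub-network with the trained second sub-network
    # to obtain the target baseline model.
    return nn.Sequential(first_sub, second_sub)
```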
Illustratively, steps 402-404 constitute the incremental training process of the initial baseline model, which may be implemented by an incremental training sub-module. During incremental training, the initial baseline model can be incrementally trained with the scene data in the training data set to obtain the target baseline model. Because the scene data in the training data set is data that the terminal device has not reported to the server, incremental training with this scene data allows the target baseline model to learn new knowledge from the scene data while still acquiring the basic data set information, so the performance of the target baseline model can be improved.
Since the data calibration information of a large amount of sample data is sent to the terminal device, the incremental training sub-module can input the data calibration information to the generation model so that the generation model produces the first data features corresponding to the data calibration information, and the initial baseline model is then incrementally trained with these first data features. In this way the target baseline model acquires the basic data set information and better model performance is obtained.
Because the data calibration information, rather than the sample data itself, is sent to the terminal device, sensitive sample data is prevented from being sent to the terminal device and the storage resources of the terminal device are saved.
Although the generation model converts the data calibration information into the first data feature, the first data feature is an abstract two-dimensional feature rather than sample data and does not include sensitive information of the sample data, so no sensitive information of the sample data is leaked. When the first data feature is introduced into the incremental training of the initial baseline model, the target baseline model can acquire the basic data set information and the training is more stable.
In step 405, an initial baseline model or a target baseline model is deployed at the terminal device to process the application data through the initial baseline model or the target baseline model (i.e., artificial intelligence processing).
For example, the terminal device may compare the performance of the target baseline model with the performance of the initial baseline model. If the performance of the target baseline model is better, the target baseline model can be deployed at the terminal device so as to process the application data through the target baseline model. After deploying the target baseline model, the terminal device may also take the target baseline model as the new initial baseline model and return to the incremental training process of steps 402-404.
If the performance of the initial baseline model is better than that of the target baseline model, the initial baseline model can be deployed at the terminal device so as to process the application data through the initial baseline model. After deploying the initial baseline model, the terminal device may likewise return to the incremental training process.
For example, for application data to be processed, if the target baseline model is deployed at the terminal device, the application data may be input to the target baseline model so as to be processed by the target baseline model, and a processing result (e.g., an artificial intelligence processing result) is obtained. For instance, assuming the target baseline model is used for target detection (such as face detection, human body detection or vehicle detection), the application data is provided to the target baseline model, which processes it with its neural network parameters to obtain output data; the output data is a feature vector, from which the target detection result can be determined.
For example, if the initial baseline model is deployed at the terminal device, the application data may be input to the initial baseline model, so as to process the application data through the initial baseline model, and obtain a processing result.
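For concreteness, processing application data with whichever baseline model is currently deployed might look like the following sketch; the decoding of the output feature vector into categories and confidence scores is one hypothetical interpretation, and the names (`run_inference`, `score_threshold`) are illustrative only.

```python
import torch

def run_inference(deployed_model, application_data, score_threshold=0.5):
    """Process application data with whichever baseline model is deployed
    (the target baseline model if it won the comparison, otherwise the initial one)."""
    deployed_model.eval()
    with torch.no_grad():
        output = deployed_model(application_data)   # output feature vector
    # Hypothetical decoding step: interpret the feature vector as per-class
    # scores and keep predictions above a confidence threshold.
    scores = torch.softmax(output, dim=-1)
    confidence, category = scores.max(dim=-1)
    return [(int(c), float(s))
            for c, s in zip(category.flatten(), confidence.flatten())
            if s >= score_threshold]
```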
In one possible embodiment, a test data set may be obtained, where the test data set includes a plurality of test data. A first performance index corresponding to the initial baseline model and a second performance index corresponding to the target baseline model are determined based on the plurality of test data, and the performance of the target baseline model is compared with the performance of the initial baseline model based on the first performance index and the second performance index.
For example, the plurality of test data may be processed through the initial baseline model to obtain initial prediction categories of the plurality of test data, and the first performance index is determined based on the initial prediction categories and the actual categories of the plurality of test data. Similarly, the plurality of test data are processed through the target baseline model to obtain target prediction categories, and the second performance index is determined based on the target prediction categories and the actual categories of the plurality of test data.
Illustratively, for step 405, the comparison between the performance of the initial baseline model and the performance of the target baseline model may be performed by a model evaluation module. In this comparison, the model evaluation module compares the performance of the two models and deploys the baseline model with the better performance on the terminal device, thereby ensuring that the optimal baseline model is output and that the performance of the deployed baseline model does not degrade after each round of incremental training. Referring to fig. 6, which is a schematic diagram of the deployment of the baseline model.
The inputs of the model evaluation module are the test data set, the initial baseline model and the target baseline model. The construction of the test data set is described in the above embodiments; the test data set includes a plurality of test data, each test data corresponds to data calibration information, and the data calibration information includes the actual category of the test data.
For example, for each test data in the test data set, the test data is input to the initial baseline model to be processed by the initial baseline model to obtain a processing result, and the processing result is a prediction category of the test data, which may be referred to as an initial prediction category.
If the initial prediction category of the test data is consistent with the actual category of the test data, the recognition result of the initial baseline model for that test data is correct; if they are inconsistent, the recognition result of the initial baseline model for that test data is incorrect.
After the above processing is performed on each test data, the number of correct recognition results (hereinafter a1) and the number of incorrect recognition results (hereinafter a2) are obtained, and the first performance index of the initial baseline model is determined from a1 and a2.
For example, the first performance index may be a1/(a1+a2); obviously, the larger this value, the better the performance of the initial baseline model. Alternatively, the first performance index may be a2/(a1+a2), in which case the larger the value, the worse the performance of the initial baseline model. Of course, these are only examples and the first performance index is not limited to them; for instance, an FPPI (False Positives Per Image) metric may also be used as the first performance index of the initial baseline model.
For example, for each test data in the test data set, the test data is input to the target baseline model to be processed by the target baseline model, so as to obtain a processing result, where the processing result is a prediction category of the test data, and the prediction category may be referred to as a target prediction category.
If the target prediction category of the test data is consistent with the actual category of the test data, the recognition result of the target baseline model for that test data is correct; if they are inconsistent, the recognition result of the target baseline model for that test data is incorrect.
After the above processing is performed on each test data, the number of correct recognition results (hereinafter b1) and the number of incorrect recognition results (hereinafter b2) are obtained, and the second performance index of the target baseline model is determined from b1 and b2.
For example, the second performance index may be b1/(b1+b2); obviously, the larger this value, the better the performance of the target baseline model. Alternatively, the second performance index may be b2/(b1+b2), in which case the larger the value, the worse the performance of the target baseline model. Of course, these are only two examples and the second performance index is not limited to them; the FPPI metric may also be used as the second performance index of the target baseline model.
After the first performance index and the second performance index are obtained, the performance of the target baseline model is compared with the performance of the initial baseline model based on the two indices. For example, if the first performance index is a1/(a1+a2) and the second performance index is b1/(b1+b2), then when the first performance index is greater than the second, the performance of the initial baseline model is better than that of the target baseline model, and when the first performance index is less than the second, the performance of the target baseline model is better than that of the initial baseline model. If the first performance index is a2/(a1+a2) and the second performance index is b2/(b1+b2), then when the first performance index is greater than the second, the performance of the target baseline model is better than that of the initial baseline model, and when the first performance index is less than the second, the performance of the initial baseline model is better than that of the target baseline model.
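A minimal sketch of the model evaluation step, assuming the a1/(a1+a2) and b1/(b1+b2) style indices described above and a test set of (data, actual category) pairs; the function names (`accuracy_index`, `select_model_to_deploy`) and the assumption that `model(data)` directly returns a predicted category are illustrative only.

```python
def accuracy_index(model, test_set):
    """Return correct/(correct+wrong) over the test data set, i.e. an
    a1/(a1+a2)-style index. Each test item carries its actual category
    in its data calibration information."""
    correct = wrong = 0
    for data, actual_category in test_set:
        predicted_category = model(data)
        if predicted_category == actual_category:
            correct += 1
        else:
            wrong += 1
    return correct / (correct + wrong)

def select_model_to_deploy(initial_baseline, target_baseline, test_set):
    """Model evaluation step: keep whichever baseline model scores better."""
    first_index = accuracy_index(initial_baseline, test_set)   # initial baseline model
    second_index = accuracy_index(target_baseline, test_set)   # target baseline model
    return target_baseline if second_index > first_index else initial_baseline
```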
According to the above technical solution, in the embodiments of the present application, the terminal device trains the initial baseline model with scene data (i.e., data of the environment in which the terminal device is located) to obtain the target baseline model, so that the target baseline model matches the environment of the terminal device, the performance of the target baseline model is improved, and the accuracy of its intelligent analysis results is high. After the target baseline model is obtained, whether to deploy the initial baseline model or the target baseline model on the terminal device is determined by comparing their performance, which ensures that the better-performing baseline model is deployed and prevents a worse-performing baseline model from being deployed. The baseline model deployed on the terminal device is thus iteratively updated, its performance keeps improving, and its intelligent analysis results become more accurate.
Based on the same application concept as the method, an embodiment of the present application further provides a data processing apparatus applied to a terminal device. Fig. 7 is a structural diagram of the apparatus, which includes:
an obtaining module 71, configured to obtain an initial baseline model, a generation model and data calibration information, where the initial baseline model is obtained by training according to the data calibration information and sample data corresponding to the data calibration information, and the generation model is obtained by training according to the initial baseline model, the sample data and the data calibration information;

a processing module 72, configured to process the data calibration information through the generation model to obtain a first data feature, and to process scene data of the terminal device through the initial baseline model to obtain a second data feature;

a training module 73, configured to train the initial baseline model through the first data feature and the second data feature, so as to obtain a trained target baseline model;

a deployment module 74, configured to deploy the initial baseline model or the target baseline model at the terminal device, so as to process application data through the initial baseline model or the target baseline model.
The initial baseline model includes a first sub-network model, which is a network model that does not require incremental training. When processing the scene data of the terminal device through the initial baseline model to obtain the second data feature, the processing module 72 is specifically configured to:
input the scene data of the terminal device to the first sub-network model, so that the first sub-network model processes the scene data to obtain the second data feature.
The initial baseline model further includes a second sub-network model, which is a network model that requires incremental training. When training the initial baseline model through the first data feature and the second data feature, the training module 73 is specifically configured to: input the first data feature and the second data feature to the second sub-network model to train the second sub-network model, so as to obtain a trained second sub-network model; and generate a target baseline model based on the first sub-network model and the trained second sub-network model.
When deploying the initial baseline model or the target baseline model at the terminal device so as to process the application data through the initial baseline model or the target baseline model, the deployment module 74 is specifically configured to:
compare the performance of the target baseline model with the performance of the initial baseline model;

if the performance of the target baseline model is better than that of the initial baseline model, deploy the target baseline model at the terminal device so as to process the application data through the target baseline model; and

if the performance of the initial baseline model is better than that of the target baseline model, deploy the initial baseline model at the terminal device so as to process the application data through the initial baseline model.
When comparing the performance of the target baseline model with the performance of the initial baseline model, the deployment module 74 is specifically configured to: acquire a test data set, where the test data set includes a plurality of test data; determine a first performance index corresponding to the initial baseline model and a second performance index corresponding to the target baseline model based on the plurality of test data; and compare the performance of the target baseline model with the performance of the initial baseline model based on the first performance index and the second performance index.
When determining the first performance index corresponding to the initial baseline model and the second performance index corresponding to the target baseline model based on the plurality of test data, the deployment module 74 is specifically configured to: process the plurality of test data through the initial baseline model to obtain initial prediction categories of the plurality of test data, and determine the first performance index based on the initial prediction categories and the actual categories of the plurality of test data; and process the plurality of test data through the target baseline model to obtain target prediction categories of the plurality of test data, and determine the second performance index based on the target prediction categories and the actual categories of the plurality of test data.
Illustratively, the total number of test data in the test data set is less than the total number of scene data in the training data set, where the scene data in the training data set is the scene data input to the initial baseline model; and/or the test data set includes a first test subset and a second test subset, where the total number of test data in the first test subset is less than the total number of test data in the second test subset;
the test data in the first test subset is derived from the scene data in the training data set, and the test data in the second test subset is derived from the scene data acquired by the terminal device on site.
When training the initial baseline model through the first data feature and the second data feature to obtain the trained target baseline model, the training module 73 is specifically configured to:
input the first data feature and the second data feature to the initial baseline model, and process the first data feature and the second data feature through the initial baseline model to obtain a third data feature;

determine whether the initial baseline model has converged based on the third data feature;

if so, determine the initial baseline model as the trained target baseline model;

if not, adjust the initial baseline model, and return to the operation of inputting the first data feature and the second data feature to the initial baseline model based on the adjusted initial baseline model.
Based on the same application concept as the method, an embodiment of the present application further provides a terminal device. From a hardware perspective, a schematic diagram of the hardware architecture of the terminal device may be as shown in fig. 8. The terminal device may include: a processor 81 and a machine-readable storage medium 82, where the machine-readable storage medium 82 stores machine-executable instructions that can be executed by the processor 81; the processor 81 is configured to execute the machine-executable instructions to implement the methods disclosed in the above examples of the present application.
For example, the processor 81 is configured to execute machine-executable instructions to perform the following steps:
acquiring an initial baseline model, a generation model and data calibration information; where the initial baseline model is obtained by training according to the data calibration information and sample data corresponding to the data calibration information, and the generation model is obtained by training according to the initial baseline model, the sample data and the data calibration information;
processing the data calibration information through a generation model to obtain a first data characteristic;
processing scene data of the terminal equipment through the initial baseline model to obtain second data characteristics;
training an initial baseline model through the first data characteristic and the second data characteristic to obtain a trained target baseline model; deploying an initial baseline model or a target baseline model at the terminal device to process the application data through the initial baseline model or the target baseline model.
Based on the same application concept as the method, embodiments of the present application further provide a machine-readable storage medium, where several computer instructions are stored on the machine-readable storage medium, and when the computer instructions are executed by a processor, the method disclosed in the above example of the present application can be implemented.
For example, the computer instructions, when executed by a processor, enable the following steps:
acquiring an initial baseline model, a generation model and data calibration information; where the initial baseline model is obtained by training according to the data calibration information and sample data corresponding to the data calibration information, and the generation model is obtained by training according to the initial baseline model, the sample data and the data calibration information;
processing the data calibration information through a generation model to obtain a first data characteristic;
processing scene data of the terminal equipment through the initial baseline model to obtain second data characteristics;
training an initial baseline model through the first data characteristic and the second data characteristic to obtain a trained target baseline model; deploying an initial baseline model or a target baseline model at the terminal device to process the application data through the initial baseline model or the target baseline model.
The machine-readable storage medium may be any electronic, magnetic, optical, or other physical storage device that can contain or store information such as executable instructions, data, and the like. For example, the machine-readable storage medium may be a RAM (Random Access Memory), a volatile memory, a non-volatile memory, a flash memory, a storage drive (e.g., a hard disk drive), a solid state drive, any type of storage disk (e.g., an optical disk, a DVD, etc.), a similar storage medium, or a combination thereof.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. A typical implementation device is a computer, which may take the form of a personal computer, laptop computer, cellular telephone, camera phone, smart phone, personal digital assistant, media player, navigation device, email messaging device, game console, tablet computer, wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being divided into various units by function, and are described separately. Of course, the functionality of the units may be implemented in one or more software and/or hardware when implementing the present application.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Furthermore, these computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (10)

1. A data processing method is applied to a terminal device, and the method comprises the following steps:
acquiring an initial baseline model, a generation model and data calibration information; wherein the initial baseline model is obtained by training according to the data calibration information and sample data corresponding to the data calibration information, and the generation model is obtained by training according to the initial baseline model, the sample data and the data calibration information;
processing the data calibration information through a generation model to obtain a first data characteristic;
processing scene data of the terminal equipment through the initial baseline model to obtain second data characteristics;
training an initial baseline model through the first data characteristic and the second data characteristic to obtain a trained target baseline model; deploying an initial baseline model or a target baseline model at the terminal device to process the application data through the initial baseline model or the target baseline model.
2. The method of claim 1, wherein the initial baseline model comprises a first sub-network model, wherein the first sub-network model is a network model that does not require incremental training, and wherein processing the scene data of the terminal device through the initial baseline model to obtain the second data feature comprises:
and inputting the scene data of the terminal equipment to the first sub-network model so that the first sub-network model processes the scene data to obtain a second data characteristic.
3. The method of claim 2,
the initial baseline model further includes a second sub-network model, the second sub-network model is a network model that needs incremental training, and the training of the initial baseline model through the first data feature and the second data feature to obtain a trained target baseline model includes:
inputting the first data characteristic and the second data characteristic into the second sub-network model to train the second sub-network model to obtain a trained second sub-network model;
generating a target baseline model based on the first sub-network model and the trained second sub-network model.
4. The method according to any one of claims 1 to 3,
the deploying of the initial baseline model or the target baseline model at the terminal device to process the application data through the initial baseline model or the target baseline model comprises:
comparing the performance of the target baseline model to the performance of the initial baseline model;
if the performance of the target baseline model is superior to that of the initial baseline model, deploying the target baseline model at the terminal equipment so as to process the application data through the target baseline model;
and if the performance of the initial baseline model is better than that of the target baseline model, deploying the initial baseline model at the terminal equipment so as to process the application data through the initial baseline model.
5. The method of claim 4, wherein comparing the performance of the target baseline model to the performance of the initial baseline model comprises:
acquiring a test data set, wherein the test data set comprises a plurality of test data;
determining a first performance index corresponding to the initial baseline model based on the plurality of test data, and determining a second performance index corresponding to the target baseline model based on the plurality of test data;
comparing the performance of the target baseline model to the performance of the initial baseline model based on the first performance metric and the second performance metric.
6. The method of claim 5,
the determining a first performance indicator corresponding to the initial baseline model based on the plurality of test data and a second performance indicator corresponding to the target baseline model based on the plurality of test data comprises:
processing the plurality of test data through the initial baseline model to obtain initial prediction categories of the plurality of test data; determining the first performance indicator based on an initial predicted category of the plurality of test data and an actual category of the plurality of test data;
processing the plurality of test data through the target baseline model to obtain target prediction categories of the plurality of test data; determining the second performance indicator based on a target prediction category of the plurality of test data and an actual category of the plurality of test data.
7. The method according to claim 5 or 6,
the total number of test data in the test data set is less than the total number of scene data in a training data set, and the scene data in the training data set is the scene data input to the initial baseline model;
and/or the test data set comprises a first test subset and a second test subset, wherein the total number of test data in the first test subset is less than the total number of test data in the second test subset;
the test data in the first test subset is derived from the scene data in the training data set, and the test data in the second test subset is derived from the scene data acquired by the terminal device on site.
8. The method of claim 1, wherein training an initial baseline model with the first data feature and the second data feature to obtain a trained target baseline model comprises:
inputting the first data characteristic and the second data characteristic into an initial baseline model, and processing the first data characteristic and the second data characteristic through the initial baseline model to obtain a third data characteristic;
determining whether the initial baseline model has converged based on the third data characteristic;
if so, determining the initial baseline model as a trained target baseline model;
and if not, adjusting the initial baseline model, and returning to execute the operation of inputting the first data characteristic and the second data characteristic to the initial baseline model based on the adjusted initial baseline model.
9. A data processing apparatus, applied to a terminal device, the apparatus comprising:
the acquisition module is used for acquiring an initial baseline model, a generation model and data calibration information; the initial baseline model is obtained by training according to the data calibration information and sample data corresponding to the data calibration information, and the generation model is obtained by training according to the initial baseline model, the sample data and the data calibration information;
the processing module is used for processing the data calibration information through the generated model to obtain a first data characteristic; processing scene data of the terminal equipment through the initial baseline model to obtain second data characteristics;
the training module is used for training the initial baseline model through the first data characteristic and the second data characteristic to obtain a trained target baseline model;
a deployment module, configured to deploy the initial baseline model or the target baseline model at the terminal device, so as to process application data through the initial baseline model or the target baseline model.
10. A terminal device, comprising: a processor and a machine-readable storage medium storing machine-executable instructions executable by the processor;
the processor is configured to execute machine executable instructions to perform the steps of:
acquiring an initial baseline model, a generation model and data calibration information; wherein the initial baseline model is obtained by training according to the data calibration information and sample data corresponding to the data calibration information, and the generation model is obtained by training according to the initial baseline model, the sample data and the data calibration information;
processing the data calibration information through a generation model to obtain a first data characteristic;
processing scene data of the terminal equipment through the initial baseline model to obtain second data characteristics;
training an initial baseline model through the first data characteristic and the second data characteristic to obtain a trained target baseline model; deploying an initial baseline model or a target baseline model at the terminal device to process the application data through the initial baseline model or the target baseline model.
CN202010501078.7A 2020-06-04 2020-06-04 Data processing method, device and equipment Pending CN113762520A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010501078.7A CN113762520A (en) 2020-06-04 2020-06-04 Data processing method, device and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010501078.7A CN113762520A (en) 2020-06-04 2020-06-04 Data processing method, device and equipment

Publications (1)

Publication Number Publication Date
CN113762520A true CN113762520A (en) 2021-12-07

Family

ID=78783697

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010501078.7A Pending CN113762520A (en) 2020-06-04 2020-06-04 Data processing method, device and equipment

Country Status (1)

Country Link
CN (1) CN113762520A (en)


Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107273978A (en) * 2017-05-25 2017-10-20 清华大学 A kind of production of three models game resists the method for building up and device of network model
US20190279111A1 (en) * 2018-03-09 2019-09-12 Zestfinance, Inc. Systems and methods for providing machine learning model evaluation by using decomposition
WO2019211856A1 (en) * 2018-05-02 2019-11-07 Saferide Technologies Ltd. Detecting abnormal events in vehicle operation based on machine learning analysis of messages transmitted over communication channels
WO2019233341A1 (en) * 2018-06-08 2019-12-12 Oppo广东移动通信有限公司 Image processing method and apparatus, computer readable storage medium, and computer device
CN110738304A (en) * 2018-07-18 2020-01-31 科沃斯机器人股份有限公司 Machine model updating method, device and storage medium
CN111091175A (en) * 2018-10-23 2020-05-01 北京嘀嘀无限科技发展有限公司 Neural network model training method, neural network model classification method, neural network model training device and electronic equipment
CN111160380A (en) * 2018-11-07 2020-05-15 华为技术有限公司 Method for generating video analysis model and video analysis system
CN109858445A (en) * 2019-01-31 2019-06-07 北京字节跳动网络技术有限公司 Method and apparatus for generating model
CN110263938A (en) * 2019-06-19 2019-09-20 北京百度网讯科技有限公司 Method and apparatus for generating information
CN110516418A (en) * 2019-08-21 2019-11-29 阿里巴巴集团控股有限公司 A kind of operation user identification method, device and equipment
CN110909803A (en) * 2019-11-26 2020-03-24 腾讯科技(深圳)有限公司 Image recognition model training method and device and computer readable storage medium
CN111126503A (en) * 2019-12-27 2020-05-08 北京同邦卓益科技有限公司 Training sample generation method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WANG Guolong et al.: "Part-of-speech tagging and feature recombination of ancient texts in traditional Chinese medicine diagnosis", Computer Engineering and Design, vol. 36, no. 3, 30 March 2015 (2015-03-30), pages 835-841 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116542344A (en) * 2023-07-05 2023-08-04 浙江大华技术股份有限公司 Model automatic deployment method, platform and system

Similar Documents

Publication Publication Date Title
CN109241903B (en) Sample data cleaning method, device, computer equipment and storage medium
CN110909651B (en) Method, device and equipment for identifying video main body characters and readable storage medium
CN113239874B (en) Behavior gesture detection method, device, equipment and medium based on video image
CN111291841A (en) Image recognition model training method and device, computer equipment and storage medium
CN112288770A (en) Video real-time multi-target detection and tracking method and device based on deep learning
CN111783997B (en) Data processing method, device and equipment
CN111783996B (en) Data processing method, device and equipment
KR102412832B1 (en) Method for training and testing obfuscation network for processing data to be obfuscated for privacy, and training device and testing device using them
CN110197107B (en) Micro-expression recognition method, micro-expression recognition device, computer equipment and storage medium
CN111985385A (en) Behavior detection method, device and equipment
CN111783630B (en) Data processing method, device and equipment
CN112182384B (en) Content recommendation method and device based on countermeasure learning and computer equipment
CN112001488A (en) Training generative antagonistic networks
Mahjabin et al. Age estimation from facial image using convolutional neural network (cnn)
CN113762520A (en) Data processing method, device and equipment
CN113723407A (en) Image classification and identification method and device, computer equipment and storage medium
CN111639523B (en) Target detection method, device, computer equipment and storage medium
CN112686114A (en) Behavior detection method, device and equipment
CN110889316B (en) Target object identification method and device and storage medium
CN112686300B (en) Data processing method, device and equipment
CN112529078A (en) Service processing method, device and equipment
CN113516182B (en) Visual question-answering model training and visual question-answering method and device
CN114692785A (en) Behavior classification method, device, equipment and storage medium
CN114462073A (en) De-identification effect evaluation method and device, storage medium and product
Reichstaller et al. Compressing uniform test suites using variational autoencoders

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination