CN111444255B - Training method and device for data model - Google Patents


Info

Publication number
CN111444255B
CN111444255B (application CN201811641326.7A)
Authority
CN
China
Prior art keywords
model
user side
model parameters
picture
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811641326.7A
Other languages
Chinese (zh)
Other versions
CN111444255A (en)
Inventor
戚世葛
孙承华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Haikang Storage Technology Co ltd
Original Assignee
Hangzhou Haikang Storage Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Haikang Storage Technology Co ltd filed Critical Hangzhou Haikang Storage Technology Co ltd
Priority to CN201811641326.7A priority Critical patent/CN111444255B/en
Publication of CN111444255A publication Critical patent/CN111444255A/en
Application granted granted Critical
Publication of CN111444255B publication Critical patent/CN111444255B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 - Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/26 - Visual data mining; Browsing structured data
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks

Abstract

The application discloses a training method for a data model. At the network side, difference information reported by user-side network devices is counted, the difference information being the difference between the output result and the expected result of a first data model deployed on each user-side network device. Taking the counted difference information as the training basis, the model parameters of the first data model are adjusted, and the adjusted model parameters are distributed to the user-side network devices, so that the first data model deployed on each device is updated according to the distributed parameters. This resolves the privacy problem of the training data required to train the data model, effectively protects private data, and enriches the sources of training data. By collecting difference information, the data model can be retrained periodically and its model parameters updated, making the training more effective and accurate.

Description

Training method and device for data model
Technical Field
The present application relates to the field of computer data mining, and in particular, to a training method and apparatus for a data model.
Background
Data mining, which discovers data relationships, latent information, and value through methods drawn from artificial intelligence, machine learning, statistics, and databases, is one of the major computer applications; it is essentially a computational process of discovering patterns in relatively large data sets. The data used to train a data-mining model is referred to as training data. Selecting training data generally has the following requirements: the data sample should be as large as possible, the data should be diverse, and the quality of the data samples should be high.
Current data-model training falls roughly into two categories according to the source of its training data.
First category: the data model is trained using data collected by development teams as training data. The effect of this approach depends on the sources of the sample data collected by the development team; the data may differ from the user's actual application scenario, so the trained data model has a low recognition rate, and its recognition performance adapts poorly to different scenes. Moreover, once training is completed, the training result is typically not updated unless a new batch of data is collected.
For example, a convolutional neural network (CNN) model for image recognition is currently trained with a CNN recognition training algorithm whose data sources are pictures collected by the development teams; training is performed on the collected pictures, and the model is deployed once training is complete. After deployment, if no large amount of new data is collected, the training result is generally not updated, and the user keeps using the originally deployed model to recognize images.
Second category: the data model is trained using data stored in a public cloud as training data. Because a large amount of user data is stored in the public cloud, the data is diverse and continuously updated and extended, so training can be performed continuously and the training result keeps improving. However, the privacy of user data cannot be fully guaranteed in this approach; users may decline to upload part of their data, and some special types of data may never be covered.
Disclosure of Invention
The application provides a training method of a data model, which is used for improving the accuracy of a training result of the data model.
The application provides a training method for a data model, which comprises, at the network side,
counting difference information reported by each user-side network device, wherein the difference information is the difference between the output result and the expected result of a first data model deployed on the user-side network device;
adjusting the model parameters of the first data model, taking the counted difference information as the training basis;
and distributing the adjusted model parameters to each user-side network device, so that the first data model deployed on each user-side network device is updated according to the distributed model parameters.
Wherein the difference information is obtained by capturing, on the user-side device, an error-correction operation performed on the output result.
Capturing, by the user-side device, the error-correction operation on the output result comprises capturing, by an application program of the user-side device, the error-correction operation on the output result, and generating the reported difference information based on the error-correction operation.
Preferably, the counting of difference information reported by each user-side network device comprises,
periodically counting, according to the difference information, the sample characteristic values output by the first data models carrying p-th generation offspring model parameters in the user-side network devices;
the adjusting of the model parameters of the first data model with the counted difference information as the training basis comprises,
taking the counted sample characteristic values as the training basis, and counting the error between each sample characteristic value and a preset first threshold;
periodically selecting, on the minimum-error principle, the model parameters of m pairs of first data models,
crossing (hybridizing) the model parameters of the m pairs of first data models according to a genetic algorithm to obtain the (p+1)-th generation offspring model parameters,
taking the (p+1)-th generation offspring model parameters as the adjusted model parameters;
repeating the above steps iteratively until the iteration end condition is met;
wherein p and m are natural numbers.
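The selection-and-crossover step above can be sketched as follows. This is a minimal illustration under stated assumptions: the claim only specifies error-minimizing selection of m pairs and hybridization per a genetic algorithm, so the pairing rule and the uniform-crossover scheme used here are choices of this sketch, not the patent's prescribed method.

```python
import random

def select_and_crossover(population, errors, m, rng=random):
    """One generation step: pick the 2*m parameter vectors with the
    smallest counted error, pair them up, and produce offspring by
    uniform crossover (one assumed hybridization scheme)."""
    # Rank candidate coefficient vectors by their statistical error.
    ranked = sorted(zip(errors, population), key=lambda t: t[0])
    parents = [params for _, params in ranked[:2 * m]]
    offspring = []
    for a, b in zip(parents[0::2], parents[1::2]):
        # Uniform crossover: each coefficient comes from either parent.
        child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
        offspring.append(child)
    return offspring  # candidate (p+1)-th generation model parameters
```

In a full loop, the offspring would be redistributed to the user-side devices, new sample characteristic values counted, and the step repeated until the iteration end condition is met.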
Preferably, when the model parameters of the first data model are the initial model parameters, taking the counted sample characteristic values as the training basis and counting the error between each sample characteristic value and the preset first threshold further comprises,
calculating the variance of all errors from the counted errors, and randomly perturbing the initial model parameters with the variance as the change interval to obtain N first-generation offspring model parameters;
the (p+1)-th generation offspring model parameters then number N,
wherein N is a natural number.
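The initial-population step can be sketched as below. The patent says only that the variance of the errors is "taken as a change interval" for randomly perturbing the initial parameters; reading that as a uniform perturbation within plus or minus the variance is an assumption of this sketch.

```python
import random
import statistics

def initial_population(c0, errors, n, rng=random):
    """Generate N first-generation offspring from the initial
    coefficient vector c0, perturbing each coefficient uniformly
    within +/- the variance of the counted errors (one reading of
    'taking the variance as a change interval')."""
    var = statistics.pvariance(errors)
    return [
        [c + rng.uniform(-var, var) for c in c0]
        for _ in range(n)
    ]
```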
Preferably, the N (p+1)-th generation offspring model parameters are distributed to the user-side network devices according to a certain rule or policy.
Preferably, the data model is a convolutional neural network (CNN) model for recognizing text information in a picture;
the capturing, by the user-side device application program, of the error-correction operation on the output result and the generating of the reported difference information based on the error-correction operation comprise,
editing, through the application program, the characters in the text generated from the picture-recognition result, and recording the coordinate area of each character of the text within the picture,
cropping the coordinate area in which a misrecognized character is located,
and taking the erroneous text, the corrected text, and the cropped region as the reported difference information; or, recognizing the cropped picture with the CNN model for recognizing text information in pictures, and taking the picture feature vector output by the model as the reported difference information.
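Assembling this reported difference information can be sketched as follows. The field names and the plain 2D-list image representation are illustrative only; the patent does not specify a report format.

```python
def crop_region(image, box):
    """Cut out the coordinate area of a misrecognized character.
    `image` is a 2D list of pixel values; `box` is (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]

def build_difference_report(wrong_text, corrected_text, image, box):
    # The reported triple mirrors the claim: erroneous text,
    # corrected text, and the cropped picture region.
    return {
        "wrong_text": wrong_text,
        "corrected_text": corrected_text,
        "crop": crop_region(image, box),
    }
```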
Preferably, the data model is a CNN model for recognizing text information in speech;
the capturing, by the user-side device application program, of the error-correction operation on the output result and the generating of the reported difference information based on the error-correction operation comprise,
editing, through the application program, the characters in the text generated from speech recognition, and recording the correspondence between characters and speech time periods,
looking up, according to the recorded correspondence, the speech time period in which the misrecognized characters are located;
and taking the misrecognized characters, the corrected characters, and the speech spectrum feature vector of the looked-up speech time period as the reported difference information.
Preferably, the data model is a CNN model for face recognition;
the output result of the first data model deployed on the user-side network device is obtained by applying the deployed face-recognition CNN model through an artificial-intelligence algorithm to perform face recognition on pictures and grouping the pictures by recognized face, pictures with the same recognition result forming one group;
the capturing, by the user-side device application program, of the error-correction operation on the output result and the generating of the reported difference information based on the error-correction operation comprise,
if the operation is a merge operation, taking the feature vector Oi of each picture i in the merged group as the reported difference information, and marking the difference information as over-sensitive;
if the operation is a delete operation, taking the feature vector Oi of each picture i in the group the deleted picture belonged to, together with the feature vector of the deleted picture, as the reported difference information, and marking the difference information as under-sensitive;
wherein i is a natural number.
Preferably, the data model is a CNN model for picture classification and recognition;
the output result of the first data model deployed on the user-side network device is obtained by applying the deployed picture-classification CNN model through an artificial-intelligence algorithm to recognize pictures and classify them by recognized type, pictures with the same recognition result forming one class;
the capturing, by the user-side device application program, of the error-correction operation on the output result and the generating of the reported difference information based on the error-correction operation comprise,
if the operation corrects a misclassification by moving a picture, taking the feature vectors of all pictures in the first class the moved picture belonged to before the move, the feature vectors of all pictures in the second class it belongs to after the move, and the feature vector of the moved picture as the reported difference information, and marking the difference information as over-sensitive;
if the operation deletes a picture that does not belong to its class, taking the feature vectors of all pictures in the first class the deleted picture belonged to and the feature vector of the deleted picture as the reported difference information, and marking the difference information as under-sensitive.
Preferably, the data model is a CNN model for recognizing objects in video;
the output result of the first data model deployed on the user-side network device is obtained by applying the deployed in-video object-recognition CNN model through an artificial-intelligence algorithm to recognize the I-frames of the video as pictures and classify them by recognized type, I-frames with the same recognition result forming one class.
Preferably, taking the counted sample characteristic values as the training basis and counting the error between each sample characteristic value and the preset first threshold comprises,
analyzing the difference Di between the feature vector Oi of each picture i and the first threshold Oi' taken as the ideal value,
wherein the first threshold is calculated from the feature-vector statistics of correctly classified pictures; when the error information carries the over-sensitive mark, the first threshold is lowered; when the error information carries the under-sensitive mark, the first threshold is raised.
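The threshold rule can be sketched as follows. The patent says only that the first threshold is "calculated from feature-vector statistics" of correctly classified pictures and shifted by the error mark; using the mean as that statistic, a fixed shift `delta`, and these mark strings are assumptions of this sketch.

```python
def first_threshold(correct_vectors, mark, delta=0.1):
    """Compute the ideal-value threshold from the mean of the
    correctly classified pictures' feature values, then shift it
    according to the error mark (mean and delta are assumptions)."""
    flat = [v for vec in correct_vectors for v in vec]
    base = sum(flat) / len(flat)
    if mark == "over-sensitive":   # merge operation: model split too eagerly
        return base - delta        # lower the ideal value
    if mark == "under-sensitive":  # delete operation: model grouped too eagerly
        return base + delta        # raise the ideal value
    return base
```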
In one aspect, the application provides a training device for a data model, comprising,
a statistics module, for counting the difference information reported by the user-side network devices, wherein the difference information is the difference between the output result and the expected result of the first data model deployed on the user-side network devices,
a training module, which adjusts the model parameters of the first data model taking the counted difference information as the training basis,
and a distribution module, which distributes the adjusted model parameters to the user-side network devices so that the first data model deployed on each user-side network device is updated according to the distributed model parameters.
Preferably, the statistics module further periodically counts the sample characteristic values output by the first data models carrying p-th generation offspring model parameters in the user-side network devices;
the training module further comprises,
a genetic-algorithm module, which takes the counted sample characteristic values as the training basis and counts the error between each sample characteristic value and a preset first threshold; periodically selects, on the minimum-error principle, the model parameters of m pairs of first data models; crosses the model parameters of the m pairs of first data models according to a genetic algorithm to obtain the (p+1)-th generation offspring model parameters; takes the (p+1)-th generation offspring model parameters as the adjusted model parameters; and repeats the iteration until the iteration end condition is met; wherein p and m are natural numbers.
Preferably, when the model parameters of the first data model are the initial model parameters, taking the counted sample characteristic values as the training basis and counting the error between each sample characteristic value and the preset first threshold further comprises,
calculating the variance of all errors from the counted errors, and randomly perturbing the initial model parameters with the variance as the change interval to obtain N first-generation offspring model parameters;
the (p+1)-th generation offspring model parameters then number N,
wherein N is a natural number.
Preferably, the distribution module further distributes the N (p+1)-th generation offspring model parameters to the user-side network devices according to a certain rule or policy.
The data model is a CNN model, and the model parameters are coefficient vectors.
In another aspect, the application provides a network-side device comprising a memory and a processor, wherein,
the memory is used for storing a computer program;
the processor is used for executing the program stored in the memory to realize the data model training method.
In a further aspect, the present application provides a storage medium storing a computer program for implementing the data model training method.
According to the embodiments of the application, the difference information with differential characteristics reported by each user-side network device is collected as training sample data, which avoids acquiring the user's private data, resolves the privacy problem of the training data required to train the data model, effectively protects the privacy of user data, and indirectly enriches the sources of training data. By collecting difference information, the data model can be retrained periodically and its model parameters updated, making the training more effective and accurate.
Drawings
FIG. 1 is a schematic diagram of a CNN model;
FIG. 2 is the networking structure of the present embodiment;
FIG. 3 is a schematic flow chart of training the CNN model based on the networking structure of FIG. 2 in embodiment 1;
FIG. 4 is an illustration of deleting an erroneous photo from, and merging, face albums of pictures stored on a user-side network device through the APP;
FIG. 5 is an illustration of operating, through the APP, on images recognized by the trained CNN model deployed on a user-side network device;
FIG. 6 is a flowchart of the network-side server processing of embodiment 2;
FIG. 7 is a schematic diagram of the network server in embodiment 4;
FIG. 8 is a schematic diagram of the network server in embodiment 5;
FIG. 9 is a schematic diagram of a training device according to an embodiment of the present application.
Detailed Description
To make the objects, technical means, and advantages of the present application more apparent, the application is described in further detail below with reference to the accompanying drawings.
Training a data model demands a large volume of diverse, high-quality sample data, a demand that conflicts with the privacy of that data. To reconcile the universality and the privacy required of training data, the application collects the difference information with differential characteristics reported by the user-side network devices and uses it as the training basis to adjust the model parameters of the data model, thereby training the data model and periodically updating the model parameters of the trained model. Further, the crossover factors are optimized based on analysis of the difference information, and the data model is updated using a genetic algorithm, so that the training result of the data model tends toward accuracy.
Example 1
The CNN model is one of the deep-learning algorithm models; its complex network structure allows it, after being trained on data for the objects to be recognized, to be applied to the recognition of various objects. The training of a CNN model for face recognition is described below as an embodiment.
Referring to FIG. 1, FIG. 1 is a schematic diagram of a CNN model. For a trained face-recognition CNN model, the model parameters are the coefficient vector between nodes, C = [C1, C2, ..., CN]. The result output after any picture i passes through the CNN model is its feature vector Oi = [O1, O2, O3, ..., ON].
To improve the efficiency of training the CNN model, avoid overloading any one server, and balance the load, in one embodiment function-specific servers are deployed on the network side, as shown in FIG. 2, the networking structure of the present embodiment. The application server provides application services to the user-side network devices, including but not limited to downloads of application programs, data models, plug-ins, and the like, and can be accessed by the user-side network devices. The information server counts and analyzes the information reported from the user side, including but not limited to picture characteristic information, error types, error information, user click rates, and the like. The algorithm server completes the training of the data model and the iteration of the genetic algorithm according to the statistics reported by the information server, and transmits the updated data-model parameters to the application server. The devices are user-side network devices that can access the network, equivalent to distributed network nodes, for example NAS (network attached storage) devices.
Referring to fig. 3, fig. 3 is a schematic flow chart of training the CNN model based on the networking structure of fig. 2 in embodiment 1.
Step 301: deploy the initial model to the user-side network devices, i.e. install on the user-side network devices the 0-th generation CNN model with model parameters C0 = [C01, C02, ..., C0N], where 0 denotes the 0-th generation model and C0 is the coefficient vector between nodes of the trained initial model. Concretely, the model may be downloaded from a server, for example the application server, or configured directly when the user-side network device leaves the factory.
When pictures are stored on the user-side network device, for example when the user stores pictures on a home NAS device, the user-side network device performs face recognition on the stored pictures by applying the deployed face CNN model through an artificial-intelligence algorithm, then stores the pictures in groups according to the face-recognition result, pictures with the same recognition result falling into the same group; for example, pictures recognized as face 1 are stored in face album A, pictures recognized as face 2 are stored in face album B, and so on. These grouped face albums can be presented by a smart-terminal application (APP).
Step 302: obtain the difference between the face result recognized by the CNN model and the expected result, i.e. the difference between the current CNN model's recognition result and the expected recognition result, by capturing through the APP the user's error-correction operations on the pictures.
Concretely, the user's operations on the pictures are captured through changes to the albums and deletions of pictures within an album. For a face album, the usual error conditions are over-splitting and mis-grouping. Over-splitting means the same person is recognized as two or more persons, creating multiple albums; mis-grouping means photos of different persons are recognized as one person. The APP therefore provides two operations: merging albums, and deleting erroneous photos from an album. Similarly, further error-correction operations can be offered to the user as the APP develops.
Referring to FIG. 4, FIG. 4 illustrates deleting an erroneous photo and merging albums through the APP for photos stored on the user-side network device. When pictures are stored on the user-side network device, the device performs face recognition on them by applying the currently deployed CNN model through the AI algorithm, and stores them in face albums grouped by recognition result. The user performs error-correction operations on the face albums by running the APP on the smart terminal: for example, discovering that the photos stored in face album C show the same person as those in face album B, the user merges albums C and B into album B, an operation captured by detecting the change to the face albums; discovering that photo 3 shows a different person, the user deletes it, an operation captured through the deleted picture in face album A.
Step 303: the smart-terminal APP generates error information based on the error-correction operation and reports it to the user-side network device, and the user-side network device reports the feature vector Oi of each erroneous picture i to the information server. Specifically,
if the operation is a merge, the feature vector Oi of each picture i in the merged face album is reported, for example the feature vector Oi of each picture i in the merged face album B of FIG. 4, and the error information is marked as "over-sensitive";
if the operation is a delete, the feature vector Oi of each picture i in the album the deleted picture belonged to, together with the feature vector of the deleted picture, is reported, for example the feature vector Oi of each picture i in face album A and the feature vector of deleted picture 3 in FIG. 4, and the error information is marked as "under-sensitive".
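The reporting logic of step 303 can be sketched as follows. The report structure and the mark strings are illustrative; the patent does not define a wire format, only which feature vectors and which mark accompany each operation type.

```python
def report_difference(op, albums, album_ids, deleted=None):
    """Build the difference report a user-side device sends after an
    album error-correction (structure is illustrative).

    op == 'merge':  report the feature vector of every picture in the
                    merged albums, marked 'over-sensitive'.
    op == 'delete': report the feature vectors of the album the deleted
                    picture belonged to plus the deleted picture's own
                    vector, marked 'under-sensitive'.
    """
    if op == "merge":
        vectors = [v for a in album_ids for v in albums[a]]
        return {"mark": "over-sensitive", "vectors": vectors}
    if op == "delete":
        (album_id,) = album_ids
        return {"mark": "under-sensitive",
                "vectors": list(albums[album_id]),
                "deleted": deleted}
    raise ValueError("unknown error-correction operation")
```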
Step 304: the information server stores all the reported information, consolidates it, and forwards it to the algorithm server.
Step 305: for each piece of reported information, the algorithm server analyzes the difference Di between the feature vector Oi of each erroneous picture i and the first threshold Oi' taken as the ideal value. The first threshold is calculated from the feature-vector statistics of correctly classified pictures: when the error information carries the over-sensitive mark, the ideal value for recognizing the picture is lowered, i.e. the first threshold is lowered; when it carries the under-sensitive mark, the ideal value is raised, i.e. the first threshold is raised.
For example, take the information reported from a NAS device:
When the server receives information reported by the NAS consisting of: error-correction operation type delete and/or the under-sensitive mark, the feature vector Oi of the deleted picture i, and the feature vector of each picture in the album X the deleted picture i belonged to, the remaining pictures in album X are treated as correctly classified. A statistic of the album's feature vectors is computed from the feature vector of each picture, the statistic is lowered according to the operation type and/or the under-sensitive mark to obtain the first threshold Oi', the ideal value for the deleted picture i, and the difference Di between the feature vector of the deleted picture i and the first threshold is calculated.
Based on the difference Di, the CNN coefficient vector C = [C1, C2, ..., Cn] can be adjusted in reverse (e.g., using a gradient-descent method) to generate new model parameters.
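The reverse adjustment just described can be sketched as a single descent step. This is only a stand-in under stated assumptions: a finite-difference gradient replaces true backpropagation, and the loss function, learning rate, and epsilon are not specified by the patent.

```python
def gradient_step(c, loss, lr=0.01, eps=1e-6):
    """One reverse-adjustment step on the coefficient vector C using a
    finite-difference gradient of the loss (a stand-in for true
    backpropagation, which the patent does not spell out)."""
    base = loss(c)
    new_c = []
    for j in range(len(c)):
        probe = list(c)
        probe[j] += eps                    # perturb one coefficient
        grad_j = (loss(probe) - base) / eps
        new_c.append(c[j] - lr * grad_j)   # move against the gradient
    return new_c
```

In the embodiment, the loss would be built from the counted differences Di between feature vectors and the first threshold.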
When the server receives information reported by the NAS consisting of: error-correction operation type merge and/or the over-sensitive mark, and all picture feature vectors of the first and second albums, all pictures in the merged album are treated as correctly classified. A statistic of the merged album's picture feature vectors is computed from all of them, the statistic is raised according to the operation type and/or the over-sensitive mark to obtain the first threshold Oi', the ideal value for every picture in the merged album, and the difference Di between the feature vector of each picture i in the merged album and the first threshold Oi' is calculated and aggregated.
The CNN coefficient vector C = [C1, C2, ..., Cn] can be adjusted in reverse according to the aggregated differences Di (e.g., using a gradient-descent method) to generate new model parameters.
As another example, take the information reported from a NAS device:
when the server receives the information reported by the NAS: the error correction operation type is deleting, and the feature vector O of the base picture (standard picture) x Deleted picture i feature vector O i A judgment threshold, the distance D between the deleted picture i and the feature vector of the base picture can be judged according to the error correction operation type xi Too small, i.e. ideal valueShould be greater than the decision threshold, whereby the goal +.>To derive the current D xi With the ideal value D' xi Error between:
wherein g is an error calculation function;
D xi =f(O x ,O i ) F is a distance calculation function (e.g.: european distance (L)
According to the error, the CNN network vector C= [ C ] can be reversely adjusted 1 ,C 2 …,C n ](e.g., using a gradient descent method) new model parameters are generated.
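The functions f and g above can be sketched concretely. The Euclidean distance for f follows the text; the specific hinge-style choice of g (the shortfall of the actual distance below the threshold-derived ideal value) is an assumption, since the patent leaves g unspecified.

```python
import math

def euclidean(ox, oi):
    """f: distance between two feature vectors (Euclidean / L2)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(ox, oi)))

def deletion_error(ox, oi, threshold):
    """g: for a deleted picture, the distance Dxi to the base picture
    was too small, so the ideal value is taken as the decision
    threshold and the error is how far short the actual distance
    falls (one plausible choice of g)."""
    d = euclidean(ox, oi)
    return max(0.0, threshold - d)
```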
When the server receives information reported by the NAS consisting of: error-correction operation type merge, the base picture feature vector Ox of the first album X or the base picture feature vector OY of the second album Y, any picture feature vector of the first album X or the second album Y, and a decision threshold, it can be judged from the operation type that the distance between the picture and its base picture's feature vector was too large, i.e. the ideal value D'xi or D'yi should be less than the decision threshold. The ideal value D'xi or D'yi can thus be determined from the decision threshold, and the first error between the current Dxi and the ideal value D'xi, or the second error between the current Dyi and the ideal value D'yi, can be derived.
When the server receives information reported by the NAS consisting of: error-correction operation type merge, the first album's base picture feature vector Ox, the second album's base picture feature vector OY, any picture feature vector of the first album and any picture feature vector of the second album, and a decision threshold, it can be judged from the operation type that the distances between the pictures of the two merged albums and their base pictures' feature vectors were too large, i.e. the ideal values D'xi and D'yi should be less than the decision threshold. The ideal values can thus be determined from the decision threshold, and the first error between the current Dxi and the ideal value D'xi and the second error between the current Dyi and the ideal value D'yi can be derived;
averaging the obtained first error and second error to obtain average error, and reversely adjusting CNN network vector C= [ C ] based on the average error 1 ,C 2 …,C n ](e.g., using a gradient descent method) new model parameters are generated.
When the server receives information reported by the NAS of the error correction operation type "merge", together with the feature vector O_x of the first album's base picture or the feature vector O_y of the second album's base picture, the feature vectors of all pictures of the first album or of all pictures of the second album, and the decision threshold, it can judge from the operation type that the distance between each picture in the album and its base picture's feature vector is too large; that is, the ideal value D'_xi or D'_yi should be less than the decision threshold. The ideal value can therefore be determined based on the decision threshold, and the first errors between each first-album picture's D_xi and the ideal value D'_xi, or the second errors between each second-album picture's D_yi and the ideal value D'_yi, can be derived;
the first errors or second errors are then aggregated to obtain a statistical error, and based on this statistical error the CNN network coefficient vector C = [C_1, C_2, …, C_n] can be adjusted in reverse (e.g., using gradient descent) to generate new model parameters.
When the server receives information reported by the NAS of the error correction operation type "merge", together with the feature vector O_x of the first album's base picture, the feature vector O_y of the second album's base picture, the feature vectors of all pictures of the first album and of all pictures of the second album, and the decision threshold, it can judge from the operation type that the distance between each picture in the merged albums and its base picture's feature vector is too large; that is, the ideal values D'_xi and D'_yi should each be less than the decision threshold. The ideal values can therefore be determined based on the decision threshold, and the first errors between each first-album picture's D_xi and the ideal value D'_xi, and the second errors between each second-album picture's D_yi and the ideal value D'_yi, can be derived;
the first errors and second errors are then aggregated to obtain a statistical error, and based on this statistical error the CNN network coefficient vector C = [C_1, C_2, …, C_n] can be adjusted in reverse (e.g., using gradient descent) to generate new model parameters.
In this way, the algorithm server periodically aggregates all the differences Di reported by all NAS devices, calculates the variance of all the error information, randomly adjusts the coefficient vector C0 of the initial model (the generation-0 coefficient vector) with the variance as the change interval, and obtains N adjusted coefficient vectors, denoted as the first-generation offspring coefficient vectors C11, C12, …, C1N.
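A minimal sketch of this variance-driven perturbation. The patent specifies only that the variance of the reported errors is the change interval; the exact perturbation law (Gaussian noise scaled by the standard deviation) and the function names here are assumptions.

```python
import numpy as np

def first_generation(c0, errors, n, rng=None):
    """Randomly adjust the generation-0 coefficient vector C0, using the
    variance of the reported errors as the change interval, to produce
    N first-generation offspring coefficient vectors C11..C1N."""
    rng = np.random.default_rng(rng)
    # standard deviation of the reported errors; small floor avoids a
    # degenerate zero-variance perturbation (assumed fallback)
    sigma = float(np.var(errors)) ** 0.5 or 1e-3
    c0 = np.asarray(c0, dtype=float)
    return [c0 + rng.normal(0.0, sigma, size=c0.shape) for _ in range(n)]

offspring = first_generation(c0=[0.1, 0.2, 0.3],
                             errors=[0.05, 0.40, 0.10, 0.25],
                             n=8, rng=42)
```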
Step 306, the algorithm server sends the first generation offspring coefficient vector to the application server, and sets the initial value of the iteration number, for example, records the iteration number p as 1;
after receiving, the application server distributes the first generation offspring coefficient vector to each user side network equipment according to a certain rule or strategy.
Step 307, the user side network device updates the current CNN model according to the first generation offspring coefficient vector, i.e. deploys the first generation CNN model on the user side network device;
the user-side network device performs face recognition on the stored pictures through the currently deployed face CNN model of the artificial-intelligence algorithm application, and then stores the pictures in groups according to the face recognition results, placing pictures with the same recognition result in the same group. In this step, the stored pictures may be newly stored pictures, may further include pictures already identified in step 301, or may exclude pictures already identified in step 301.
Step 308, the intelligent terminal obtains the difference between the face result recognized by the current p-th generation CNN model and the expected result by capturing the user's error correction operations on pictures in the APP; the intelligent terminal APP generates error information based on the error correction operation and reports it to the user-side network device, and the user-side network device reports the feature vector Oi of the erroneous picture i to the information server, in the same manner as step 303;
step 309, the information server stores all the reported information, integrates it, and reports to the algorithm server.
Step 310, the algorithm server periodically selects m pairs of user side network devices with the smallest difference according to all the differences, and takes the coefficient vector of the current CNN model deployed in the m pairs of user side network devices as a genetic preference factor, namely, m pairs of coefficient vectors are selected from p-th generation offspring coefficient vectors according to the principle that the error of the recognition result is the smallest;
in step 311, the algorithm server crosses the m pairs of coefficient vectors according to the genetic algorithm, that is, crosses the two coefficient vectors within each pair, to obtain N (p+1)-th generation offspring coefficient vectors, and then sends the offspring coefficient vectors to the application server.
The application server distributes the (p+1)-th generation offspring coefficient vectors to each user-side network device according to a certain rule or strategy; after updating the current CNN model according to the current offspring coefficient vectors, the user-side network device returns to step 308, and steps 308-311 are repeated until the preset number of iterations is reached, or the statistical difference reaches the expected value and becomes stable; the current CNN model can then be considered ideal, and training of the CNN model is complete.
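The selection and crossover of steps 310-311 can be sketched as follows. This is a sketch only: the pairing rule, single-point crossover, and padding of the offspring list out to N are assumptions not fixed by the text.

```python
import random

def select_pairs(population, errors, m):
    """Pick the m pairs (2m coefficient vectors) with smallest error."""
    ranked = sorted(zip(errors, population), key=lambda t: t[0])
    best = [vec for _, vec in ranked[:2 * m]]
    return [(best[2 * i], best[2 * i + 1]) for i in range(m)]

def crossover(a, b, rng):
    """Single-point crossover of two coefficient vectors (assumed scheme)."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def next_generation(population, errors, m, n, rng=None):
    """One genetic iteration: select m best pairs, cross each pair,
    and pad the offspring list up to n vectors (assumed padding rule)."""
    rng = rng or random.Random(0)
    children = []
    for a, b in select_pairs(population, errors, m):
        c1, c2 = crossover(a, b, rng)
        children += [c1, c2]
    while len(children) < n:
        children.append(children[len(children) % (2 * m)])
    return children[:n]

pop = [[float(i), float(i) + 1, float(i) + 2] for i in range(6)]
errs = [0.9, 0.1, 0.4, 0.2, 0.8, 0.3]
gen_next = next_generation(pop, errs, m=2, n=6)
```

Each returned vector would be distributed to a user-side device as its (p+1)-th generation CNN coefficients.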
In this embodiment, training and adjustment of the CNN model does not require acquiring the user's picture data; instead, the CNN model is trained through the error information fed back by the APP during use and the feature vectors of the pictures associated with that error information, so private data is effectively protected. Compared with the existing CNN training practice of having a few people label pictures manually or mechanically in advance, the training basis of this embodiment is highly objective, avoids the strong subjectivity and single standard of existing training data, and reflects the diversity of people and of standards. Changing the CNN models distributed on each user side through the genetic algorithm amounts to improving genetic-algorithm computation with a distributed network, which obtains real-scene data more effectively and is superior to centralized training that obtains only partial data information. The user-side network devices are private cloud devices distributed in users' homes or office scenarios, so the data are more diverse and more objectively reflect real user-scenario data samples, and the model trained by this scheme is closer to users' real usage.
Example 2:
the following description will be made with reference to training of a CNN model for image classification as an embodiment.
Deploying the initial model to the user-side network device means installing a generation-0 CNN model with model parameters C0 = [C01, C02, …, C0N] on the user-side network device, where 0 denotes the generation-0 model and C0 denotes the inter-node coefficient vector of the trained initial model. In a specific manner, it may be downloaded from a server, for example an application server, or configured directly when the user-side network device leaves the factory.
When pictures are stored on the user-side network device, the device identifies them through the CNN model deployed by the artificial-intelligence algorithm application, and then stores them by class according to the recognition results, placing pictures with the same recognition result in the same class, such as people, landscapes, animals, and so on.
Obtaining a difference between a result identified by the CNN model and an expected result by capturing error correction operation of a user on the picture by using the APP, namely obtaining the difference between the current CNN model identification result and the expected identification result;
referring to fig. 5, fig. 5 shows an illustration of the operation of an APP on an image identified by a trained CNN model deployed by a user-side network device.
For image classification there are two error cases: classification errors, and pictures not included in any classification. For a classification error (e.g., photo 4 in the illustration should belong to class 3), the photo can be moved to the correct class 3 by an operation; for a picture not included in any classification, deletion may be selected.
The intelligent terminal APP generates error information based on error correction operation, reports to the user side network equipment, and the user side network equipment reports the feature vector Oi of the error picture i to the network side server, specifically,
in the case of a classification error: the feature vectors of all pictures in the first category to which the picture originally belonged, the feature vectors of all pictures in the second category to which it belongs after being moved, and the feature vector of the moved picture itself are reported, for use in computing the ideal values.
In the case of a picture not included in any classification: the feature vectors of all pictures in the first category to which it originally belonged and the feature vector of the deleted picture are reported, for use in computing the ideal values.
Referring to fig. 6, fig. 6 is a flowchart of a network side server process according to embodiment 2.
In step 601, the network side server collects and stores the difference information between the output result and the expected result of the current data model of the user side network device, that is, stores all the reported information.
Step 602, for each report information, the network side server analyzes the difference Di between the feature vector Oi of each error picture i and the first threshold Oi' as an ideal value. The first threshold is calculated based on feature vector statistical information of the classified correct pictures.
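One possible form of the first threshold Oi', computed from the feature vectors of correctly classified pictures, and the difference Di, sketched in Python. The exact statistic (here the mean plus k standard deviations) and the function names are assumptions.

```python
import numpy as np

def first_threshold(correct_vectors, k=1.0):
    """Ideal value Oi' as a statistic over the correctly classified
    pictures' feature vectors: per-component mean plus k standard
    deviations (assumed form of the statistic)."""
    v = np.asarray(correct_vectors, dtype=float)
    return v.mean(axis=0) + k * v.std(axis=0)

def difference(o_i, o_ideal):
    """Di: distance between an error picture's feature vector Oi
    and the ideal value Oi'."""
    return float(np.linalg.norm(np.asarray(o_i) - o_ideal))

ideal = first_threshold([[0.1, 0.9], [0.3, 0.7], [0.2, 0.8]])
d_i = difference([0.9, 0.1], ideal)
```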
In step 603, the network side server periodically aggregates all the differences Di as error information, calculates the variance of all the error information, randomly adjusts the coefficient vector C0 of the initial model (the generation-0 coefficient vector) with the variance as the change interval, and obtains N adjusted coefficient vectors, denoted as the first-generation offspring coefficient vectors C11, C12, …, C1N.
Step 604, the network side server sets an initial value of the iteration number, for example, records the iteration number p as 1; and distributing the first generation offspring coefficient vector to each user side network equipment according to a certain rule or strategy.
Step 605, the user side network device updates the current CNN model according to the first generation offspring coefficient vector, i.e. deploys the first generation CNN model on the user side network device;
step 606, the intelligent terminal obtains the difference between the recognition result of the current p-th generation CNN model and the expected result by capturing the user's error correction operations on pictures in the APP; the intelligent terminal APP generates error information based on the error correction operation and reports it to the user-side network device, and the user-side network device reports the feature vector Oi of the erroneous picture i to the information server, in the same manner as step 601;
Step 607, the network side server periodically selects m pairs of user side network devices with the smallest difference according to all the differences, and takes the coefficient vector of the current CNN model deployed in the m pairs of user side network devices as a genetic preference factor, namely, according to the principle that the error of the recognition result is the smallest, the m pairs of coefficient vectors are selected from the p-th generation offspring coefficient vectors;
step 608, the network side server hybridizes the m pairs of coefficient vectors according to the genetic algorithm to obtain N p+1st generation offspring coefficient vectors;
step 609, the p+1 generation offspring coefficient vector is distributed to each user side network device according to a certain rule or policy.
Step 610, determine whether the preset number of iterations has been reached or whether the error value has reached the expected value; if so, end; otherwise return to step 606 and repeat steps 606 to 610 until the preset number of iterations is reached, or the statistical difference reaches the expected value and becomes stable, at which point the current CNN model can be considered ideal and training of the CNN model is complete.
Example 3:
the following description will be made with reference to training of a CNN model for object recognition in video.
Video object identification extracts a frame of data, typically an I-frame (key frame), from a video stream and treats the I-frame as an image to be identified through the trained CNN model, in a specific process similar to picture recognition. If the I-frame is identified as a person or another class, a picture (e.g., JPG) is generated from the I-frame and stored in the corresponding video gallery, or the video segment containing the I-frame is stored for a period of time, for example frames n through n+m.
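A stub Python sketch of this flow. The key-frame test (a fixed GOP interval) and the classifier are placeholders: a real system would parse the stream's actual I-frames and run the deployed CNN model, and would store a generated picture or clip rather than a frame index.

```python
from dataclasses import dataclass, field

@dataclass
class VideoLibrary:
    """Minimal sketch of the I-frame recognition and storage flow."""
    gop_size: int = 12                      # assumed key-frame interval
    albums: dict = field(default_factory=dict)

    def is_key_frame(self, index):
        # stand-in for real I-frame detection in the stream
        return index % self.gop_size == 0

    def process(self, frames, classify):
        for i, frame in enumerate(frames):
            if not self.is_key_frame(i):
                continue
            label = classify(frame)         # e.g. "person", "landscape"
            if label is not None:
                # store under its class; a real system might instead
                # save frames n..n+m as a video clip
                self.albums.setdefault(label, []).append(i)

lib = VideoLibrary(gop_size=3)
lib.process(range(10), classify=lambda f: "person" if f % 2 == 0 else None)
```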
When the APP performs a correction operation on the recognition result, error processing is similar to embodiment 1 or 2: if it is person recognition, the error information of embodiment 1 may be reported; if it is classification recognition, the error information of embodiment 2 may be reported.
The training and iterative process for the model parameters of the CNN model is the same as in example 1 or example 2.
Example 4:
hereinafter, training of a CNN model for character recognition will be described as an embodiment.
Text is presented in picture form, so the characters in the picture are recognized through the trained CNN model; character recognition here means recognizing the textual content of the picture.
The initial model is deployed to the user side network device, and in a specific mode, the initial model can be downloaded through a server, for example, through an application server, or can be directly configured when the user side network device leaves the factory.
And converting the characters into pictures and storing the pictures in user side network equipment, and identifying the stored pictures by the user side network equipment through a CNN model applied and deployed by an artificial intelligence algorithm.
Referring to fig. 7, fig. 7 is a schematic diagram of a network side server in embodiment 4.
Step 701, obtain the difference between the result identified by the CNN model and the expected result by capturing the user's error correction operations on the picture in the APP, namely the difference between the current CNN model's recognition result and the expected recognition result. Specifically,
the user corrects, by editing through the APP, the characters in the text generated from the picture-recognition content, and the coordinate area of each character of the text within the picture is recorded. That is, when the user finds a text-recognition error, the user manually edits the text, deleting the error and writing the correct characters; at this point the APP reports the error, cropping out and reporting the coordinate area containing the mis-recognized characters, and simultaneously reporting the mis-recognized characters and the corrected characters to the user-side network device.
The user-side network device takes the reported cropped image, the mis-recognized characters, and the corrected characters as the training basis and reports them to the network-side server;
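A sketch of the report assembled on the user side, assuming the cropped region is given as pixel coordinates; all field and parameter names here are hypothetical. Only the cropped region, not the full document image, leaves the device.

```python
def make_ocr_report(image, region, wrong_text, corrected_text):
    """Assemble the error report for a mis-recognised character region.

    `image` is a 2-D list of pixel rows; `region` is
    (top, left, bottom, right) in pixel coordinates.
    """
    top, left, bottom, right = region
    crop = [row[left:right] for row in image[top:bottom]]
    return {"crop": crop, "wrong": wrong_text, "corrected": corrected_text}

img = [[0, 1, 2, 3],
       [4, 5, 6, 7],
       [8, 9, 10, 11]]
report = make_ocr_report(img, (0, 1, 2, 3),
                         wrong_text="木", corrected_text="本")
```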
step 702, the network side server periodically performs statistics of reporting errors of the network devices at each user side, adjusts model parameters of the CNN model based on the reported training basis, and distributes the adjusted model parameters to the network devices at each user side according to a certain rule or policy.
In this embodiment, since reporting uses cropped images rather than the whole text as training data, individual characters cannot reveal private content, which benefits privacy protection; meanwhile, using error-sample data as the training basis for the CNN model makes training better fit the actual situation and improves training efficiency and accuracy.
In another mode, the feature vector of the cropped picture and/or of the original picture, the mis-recognized text, and the corrected text are reported, where the cropped picture and/or the original picture is recognized by the deployed CNN model and its feature vector Oi (the recognition vector) is obtained from the CNN model's output. The training and iteration process for the model parameters of the CNN model is the same as in embodiment 1 or embodiment 2. This mode reports only the picture's feature vector, with no need to report picture data, which helps protect data privacy; combining genetic-algorithm iteration to train the model helps improve training efficiency and model-training accuracy.
Example 5:
the following description will be made with reference to training of a CNN model for speech recognition as an embodiment.
Speech recognition inputs voice into a trained CNN model and recognizes the input voice content through that model; that is, the voice is transcribed into text by the trained CNN model.
The initial model is deployed to the user side network device, and in a specific mode, the initial model can be downloaded through a server, for example, through an application server, or can be directly configured when the user side network device leaves the factory.
The user side network equipment identifies the stored voice through a CNN model which is applied and deployed by an artificial intelligence algorithm, and stores the identified voice content in text.
Referring to fig. 8, fig. 8 is a schematic diagram of a network side server in embodiment 5.
Step 801, obtain the difference between the result identified by the CNN model and the expected result by capturing the user's error correction operations on the text in the APP, namely the difference between the current CNN model's recognition result and the expected recognition result. Specifically,
the user corrects, by editing through the APP, the text generated from speech recognition, and the correspondence between the text and the voice time span is recorded; for example, for "hello" the recorded time span is 0:0.89, i.e., from 0 seconds to 0.89 seconds. When a recognition error is found, the mis-recognized characters, the corrected characters, and the spectral feature vector of that voice segment are reported to the user-side network device.
The user side network equipment takes the identified error characters, the corrected characters and the spectrum feature vector of the voice section as training basis and reports the training basis to the network side server;
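A sketch of the correction record reported upstream for one mis-recognised utterance; the field names and record layout are assumptions. Raw audio is never included, only the spectral feature vector of the corrected time span.

```python
def speech_correction(wrong_text, corrected_text, start_s, end_s, spectrum):
    """Error report for one mis-recognised utterance: the wrong and
    corrected text, the time span it aligns to (e.g. "hello" at
    0 to 0.89 s), and the spectral feature vector of that span."""
    assert end_s > start_s, "time span must be non-empty"
    return {
        "wrong": wrong_text,
        "corrected": corrected_text,
        "span": (start_s, end_s),
        "spectrum": list(spectrum),
    }

rec = speech_correction("hallo", "hello", 0.0, 0.89, [0.2, 0.5, 0.1])
```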
step 802, the network side server periodically performs statistics of reporting errors of the network devices at each user side, adjusts model parameters of the CNN model based on the reported training basis, and distributes the adjusted model parameters to the network devices at each user side according to a certain rule or strategy.
In this embodiment, the reported error sample data and the frequency spectrum feature vector are used as training basis, so that voice data are not reported, which is beneficial to protecting data privacy and improving training efficiency and accuracy of the CNN model.
In the above embodiments of the present application, the user-side network device and the intelligent terminal may be integrated; for example, the functions of the user-side network device may be integrated into the intelligent terminal, or the intelligent terminal's APP may be installed on a user-side network device that supports it. It should be understood that the user-side network device, the intelligent terminal, or an application serving the user in the cloud system is regarded as the user-side device providing services for the user.
Referring to fig. 9, fig. 9 is a training device according to an embodiment of the present application, the device comprising,
a statistics module for counting the difference information reported by the network devices at the user side, wherein the difference information is the difference between the output result and the expected result of the first data model deployed by the network devices at the user side,
the training module uses the statistical difference information as a training basis to adjust the model parameters of the first data model,
and the distribution module distributes the adjusted model parameters of the second data model to each user side network device, so that the first data model deployed by the user side network device is updated according to the distributed model parameters.
The statistics module is also configured to periodically count the sample characteristic values output by the first data model with p-th generation offspring model parameters in each user-side network device;
the training module may further comprise a processor configured to,
the genetic algorithm module takes the counted sample characteristic values as training basis, and counts the error between each sample characteristic value and a preset first threshold value; according to the principle of minimum error, model parameters of m pairs of first data models are selected periodically; hybridizing the model parameters of the m pairs of first data models according to a genetic algorithm to obtain p+1st generation offspring model parameters; taking the p+1st generation offspring model parameters as the model parameters of the second data model; repeating the iteration until the iteration ending condition is met; wherein, p and m are natural numbers.
When the model parameters of the first data model are initial model parameters, the statistical sample feature values are used as training basis, the statistical error between each sample feature value and a preset first threshold value also comprises,
calculating variances of all errors according to the statistical errors, and randomly calculating the initial model parameters by taking the variances as a change interval as a reference to obtain N first generation offspring model parameters;
the p+1st generation offspring model parameters are N,
wherein N is a natural number.
The distribution module further comprises that the N p+1st generation offspring model parameters are distributed to each user side network device according to a certain rule or strategy.
The data model is a CNN model, and the model parameters are coefficient vectors.
According to an embodiment of the present invention, a network-side device, for example, a server, includes a memory and a processor, where,
the memory is used for storing a computer program;
the processor is used for executing the program stored in the memory to realize the data model training method.
The Memory may include random access Memory (Random Access Memory, RAM) or may include Non-Volatile Memory (NVM), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the aforementioned processor.
The processor may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU), a network processor (Network Processor, NP), etc.; but also digital signal processors (Digital Signal Processing, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), field programmable gate arrays (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components.
The embodiment of the invention also provides a computer readable storage medium, wherein the storage medium stores a computer program, and the computer program realizes the following steps when being executed by a processor:
a statistical module for calculating the difference information reported by the network equipment at each user side, wherein the difference information is the difference between the output result and the expected result of the first data model deployed by the network equipment at the user side,
the statistical difference information is used as a training basis to adjust the model parameters of the first data model,
and distributing the adjusted model parameters of the second data model to each user side network device, so that the first data model deployed by the user side network device is updated according to the distributed model parameters.
According to the storage medium provided by the embodiment of the invention, the data model is trained and adjusted not by acquiring the user's picture data but through the error information fed back by the APP and the sample feature vectors associated with that error information, so private data is effectively protected. Training data are improved by capturing the user's specific operations; compared with the existing practice of having a few people label data manually or by machine in advance, the training basis of this embodiment is highly objective, avoids the strong subjectivity and single standard of existing training data, and reflects the diversity of people and of standards. Changing the data models distributed on each user side through the genetic algorithm amounts to improving genetic-algorithm computation with a distributed network, which obtains real-scene data more effectively, and the model trained through this scheme is closer to users' real usage.
For the apparatus/network side device/storage medium embodiment, since it is substantially similar to the method embodiment, the description is relatively simple, and the relevant points are referred to in the description of the method embodiment.
It should be noted that, the embodiment of the data model training method provided by the present invention is not limited to the above embodiment, and the data model may not be limited to the CNN model, and other data models that need to be trained may be adopted.
In this document, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The foregoing is merely a description of preferred embodiments of the invention and is not intended to limit it; any modification, equivalent replacement, improvement, or the like made within the spirit and principles of the invention shall fall within its scope of protection.

Claims (24)

1. A training method of a data model is characterized in that the method comprises, at a network side,
Counting difference information reported by each user side network device, wherein the difference information is the difference between an output result and an expected result of a first data model deployed by the user side network device;
taking the statistical difference information as a training basis, and adjusting model parameters of the first data model;
distributing the adjusted model parameters to each user side network device, so that a first data model deployed by the user side network device is updated according to the distributed model parameters;
wherein,
the first data model includes: a model for recognizing text information in a picture, or a model for recognizing text information in voice, or a model for face recognition, or a model for classifying and recognizing pictures, or a model for recognizing objects in video;
the step of adjusting the model parameters of the first data model by taking the statistical difference information as a training basis comprises the following steps:
taking the counted sample characteristic values as the training basis, counting the errors between the sample characteristic values and a preset first threshold value, the sample characteristic values being output by a first data model with p-th generation offspring model parameters in each user side network device,
selecting model parameters of m pairs of first data models according to the principle of minimum error,
Respectively hybridizing m pairs of model parameters of the first data model according to a genetic algorithm to obtain p+1st generation offspring model parameters,
taking the p+1st generation offspring model parameters as the adjusted model parameters, and executing the step of distributing the adjusted model parameters to each user side network device;
p and m are natural numbers.
2. The method of claim 1, wherein the difference information is obtained by a user side device capturing error correction operations for the output result.
3. The method of claim 2, wherein the capturing, by the user side device, the error correction operation for the output result comprises: and capturing error correction operation of the user side equipment application program on the output result, and generating reported difference information based on the error correction operation.
4. The method of claim 3, wherein the statistics of the difference information reported from each user side network device comprises,
and according to the difference information, counting sample characteristic values output by a first data model with p generation offspring model parameters in each user side network device.
5. The method of claim 4, wherein, when the model parameters of the first data model are the initial model parameters, taking the counted sample feature values as the training basis and counting the error between each sample feature value and the preset first threshold further comprises:
calculating the variance of all the errors from the counted errors, and randomly perturbing the initial model parameters within a change interval defined by the variance to obtain N first-generation offspring model parameters;
the number of (p+1)-th generation offspring model parameters is N;
wherein N is a natural number.
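The variance-driven initial population in this claim can be sketched as follows. This is a hedged illustration: the use of a symmetric uniform perturbation over [-variance, +variance] and all names are assumptions, since the claim only states that the variance serves as the change interval.

```python
import random
import statistics

def spawn_initial_population(initial_params, errors, n):
    """Generate N first-generation offspring from the initial model parameters.

    The variance of the counted errors defines the perturbation interval
    around each initial coefficient, as described in claim 5.
    """
    var = statistics.pvariance(errors)
    population = []
    for _ in range(n):
        # Randomly perturb each coefficient within [-var, +var].
        child = [w + random.uniform(-var, var) for w in initial_params]
        population.append(child)
    return population
```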
6. The method of claim 1, wherein the counting of the difference information reported from each user side network device is performed on a periodic basis.
7. The method of claim 4, wherein the selection of the m pairs of model parameters of the first data model is performed on a periodic basis.
8. The method of claim 4, wherein the sample feature values output by the first data model having p-th generation offspring model parameters in each user side network device are counted on a periodic basis.
9. The method of claim 4, wherein taking the (p+1)-th generation offspring model parameters as the adjusted model parameters and executing the step of distributing the adjusted model parameters to each user side network device further comprise:
judging whether the current number of iterations of the genetic algorithm reaches a preset iteration threshold, or whether the statistical error value has reached an expected value and stabilized; if so, ending the iteration of the genetic algorithm and ending the update of the model parameters; otherwise, returning to the step of taking the counted sample feature values as the training basis and counting the error between each sample feature value and the preset first threshold.
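The iteration-end test in this claim can be sketched as follows. The stability check via a tolerance `eps` on consecutive generation errors is an assumption for illustration; the claim only requires that the error reach an expected value and stabilize.

```python
def should_stop(iteration, max_iterations, error_history, eps=1e-4):
    """Return True when the genetic algorithm iteration should end:
    either the preset iteration threshold is reached, or the statistical
    error has reached an expected value and stabilized (claim 9)."""
    if iteration >= max_iterations:
        return True
    # Treat the error as stable when the last two generations differ
    # by less than eps.
    if len(error_history) >= 2 and abs(error_history[-1] - error_history[-2]) < eps:
        return True
    return False
```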
10. The method of claim 5, wherein the N (p+1)-th generation offspring model parameters are distributed to each user side network device.
11. The method of claim 3, wherein the model for recognizing text information in a picture is a convolutional neural network (CNN) model; the capturing of the error correction operation performed by the user side device application program on the output result, and the difference information generated based on the error correction operation, comprise:
the misrecognized text, the corrected text, and
the picture feature vector output by applying the CNN model for recognizing text information in a picture to the picture cropped from the original picture based on the coordinate area of the misrecognized characters, and/or to the original picture;
wherein,
the coordinate area of the misrecognized characters is determined by the application program correcting, in an editing mode, the characters in the text generated from the picture recognition result, together with the recorded coordinate area of each character in the picture.
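The cropping and coordinate-area lookup in this claim can be sketched as follows. This is a minimal stand-in under stated assumptions: the nested-list picture representation, box format, and function names are hypothetical, and a real system would use an image library for cropping.

```python
def crop_region(picture, box):
    """Crop the coordinate area of a misrecognized character from a picture.

    picture: 2-D list of pixel rows; box: (x0, y0, x1, y1) with exclusive ends.
    """
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in picture[y0:y1]]

def find_corrected_boxes(char_boxes, corrected_indices):
    """Look up the coordinate area of each corrected character from the
    recorded per-character coordinate areas, as the claim describes."""
    return [char_boxes[idx] for idx in corrected_indices]
```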
12. The method of claim 3, wherein the model for recognizing text information in speech is a CNN model; the capturing of the error correction operation performed by the user side device application program on the output result, and the difference information generated based on the error correction operation, comprise: the misrecognized text, the corrected text, and the speech spectrum feature vector of the speech time period in which the misrecognized text is located; wherein,
the speech time period in which the misrecognized text is located is obtained by the application program correcting, in an editing mode, the characters in the text generated from the speech recognition, and searching the recorded correspondence between characters and speech time periods.
13. The method of claim 3, wherein the model for face recognition is a CNN model; the output result of the first data model deployed on the user side network device is obtained by applying the deployed face recognition CNN model through an artificial intelligence algorithm to perform face recognition on pictures and to group the pictures according to the recognized faces, pictures with the same recognition result belonging to the same group;
the capturing of the error correction operation performed by the user side device application program on the output result, and the difference information generated based on the error correction operation, comprise:
for a merge operation: the feature vector Oi of each picture i in the merged groups, and the allergy flag of the difference information;
for a delete operation: the feature vector Oi of each picture i in the group from which the picture was deleted, the feature vector of the deleted picture, and the low-sensitivity flag of the difference information;
wherein,
i is a natural number.
14. The method of claim 3, wherein the model for picture classification recognition is a CNN model;
the output result of the first data model deployed on the user side network device is obtained by applying the deployed picture classification CNN model through an artificial intelligence algorithm to recognize pictures and to classify them according to the recognized type, pictures with the same recognition result belonging to the same classification;
the capturing of the error correction operation performed by the user side device application program on the output result, and the difference information generated based on the error correction operation, comprise:
for a move operation that corrects a classification error: the feature vectors of all pictures in the first classification containing the moved picture before the move, the feature vectors of all pictures in the second classification containing the moved picture after the move, the feature vector of the moved picture, and the allergy flag of the difference information;
for a delete operation: the feature vectors of all pictures remaining in the first classification after the deleted picture is removed, the feature vector of the deleted picture, and the low-sensitivity flag of the difference information.
15. The method of claim 3, wherein the model for object recognition in video is a CNN model;
the output result of the first data model deployed on the user side network device is obtained by applying the deployed CNN model for object recognition in video through an artificial intelligence algorithm to recognize each I-frame in the video as a picture and to classify the I-frames according to the recognized type, I-frames with the same recognition result belonging to the same classification.
16. The method of any one of claims 13 to 15, wherein taking the counted sample feature values as the training basis and counting the error between each sample feature value and the preset first threshold comprises:
analyzing the difference Di between the feature vector Oi of each picture i and the first threshold Oi' serving as the ideal value;
wherein,
the first threshold is calculated from the feature vector statistics of correctly classified pictures; when the error information reported based on the error correction operation carries the allergy flag, the first threshold is decreased; when the error information reported based on the error correction operation carries the low-sensitivity flag, the first threshold is increased;
the allergy flag marks error information reported during a merge operation, and the low-sensitivity flag marks error information reported during a delete operation.
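The threshold feedback loop in this claim can be sketched as follows. The fixed `step` size and the string flag values are assumptions for illustration; the claim specifies only the direction of each adjustment.

```python
def update_threshold(threshold, flag, step=0.05):
    """Adjust the first threshold based on the reported error flag (claim 16).

    An allergy flag (error reported on a merge operation) means grouping was
    too aggressive, so the threshold is decreased; a low-sensitivity flag
    (error reported on a delete operation) means grouping was too loose,
    so the threshold is increased.
    """
    if flag == "allergy":
        return threshold - step
    if flag == "low_sensitivity":
        return threshold + step
    return threshold

def feature_errors(features, threshold):
    """Compute Di = Oi - Oi' for each picture feature value Oi, where
    Oi' is the first threshold serving as the ideal value."""
    return [o - threshold for o in features]
```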
17. A training apparatus for a data model, characterized in that the apparatus comprises:
a statistics module, configured to count the difference information reported by each user side network device, wherein the difference information is the difference between the output result of the first data model deployed on the user side network device and the expected result;
a training module, configured to adjust the model parameters of the first data model by taking the statistical difference information as the training basis;
a distribution module, configured to distribute the adjusted model parameters to each user side network device, so that the first data model deployed on each user side network device is updated according to the distributed model parameters;
the training module further comprises:
a genetic algorithm module, configured to take the counted sample feature values as the training basis and count the error between each sample feature value and a preset first threshold, the sample feature values being output by the first data model having p-th generation offspring model parameters in each user side network device; periodically select m pairs of model parameters of the first data model according to the minimum-error principle; cross over each of the m pairs of model parameters of the first data model according to a genetic algorithm to obtain (p+1)-th generation offspring model parameters; take the (p+1)-th generation offspring model parameters as the adjusted model parameters; and repeat the iteration until an iteration end condition is met; wherein p and m are natural numbers;
the first data model includes: a model for recognizing text information in a picture, or a model for recognizing text information in speech, or a model for face recognition, or a model for classifying and recognizing pictures, or a model for recognizing objects in video.
18. The apparatus of claim 17, wherein the statistics module further periodically counts the sample feature values output by the first data model having p-th generation offspring model parameters in each user side network device.
19. The apparatus of claim 18, wherein the iteration end condition comprises: the current number of iterations of the genetic algorithm reaching a preset iteration threshold, or the statistical error value having reached an expected value and stabilized.
20. The apparatus of claim 18, wherein, when the model parameters of the first data model are the initial model parameters, taking the counted sample feature values as the training basis and counting the error between each sample feature value and the preset first threshold further comprises:
calculating the variance of all the errors from the counted errors, and randomly perturbing the initial model parameters within a change interval defined by the variance to obtain N first-generation offspring model parameters;
the number of (p+1)-th generation offspring model parameters is N;
wherein N is a natural number.
21. The apparatus of claim 20, wherein the distribution module further distributes the N (p+1)-th generation offspring model parameters to each user side network device.
22. The apparatus of claim 18, wherein the first data model is a CNN model and the model parameters are coefficient vectors.
23. A network side device comprising a memory and a processor, wherein,
the memory is used for storing a computer program;
the processor is configured to execute the program stored in the memory to implement the data model training method of any one of claims 1 to 16.
24. A storage medium storing a computer program which, when executed, implements the data model training method of any one of claims 1 to 16.
CN201811641326.7A 2018-12-29 2018-12-29 Training method and device for data model Active CN111444255B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811641326.7A CN111444255B (en) 2018-12-29 2018-12-29 Training method and device for data model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811641326.7A CN111444255B (en) 2018-12-29 2018-12-29 Training method and device for data model

Publications (2)

Publication Number Publication Date
CN111444255A CN111444255A (en) 2020-07-24
CN111444255B true CN111444255B (en) 2023-09-22

Family

ID=71626549

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811641326.7A Active CN111444255B (en) 2018-12-29 2018-12-29 Training method and device for data model

Country Status (1)

Country Link
CN (1) CN111444255B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112185575B (en) * 2020-10-14 2024-01-16 北京嘉和美康信息技术有限公司 Method and device for determining medical data to be compared
CN114021739B (en) * 2022-01-06 2022-04-15 北京达佳互联信息技术有限公司 Business processing method, business processing model training device and electronic equipment
CN114611172B (en) * 2022-02-17 2023-05-23 广东时谛智能科技有限公司 Method and device for adjusting shoe body model based on collected data

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5461699A (en) * 1993-10-25 1995-10-24 International Business Machines Corporation Forecasting using a neural network and a statistical forecast
US6119112A (en) * 1997-11-19 2000-09-12 International Business Machines Corporation Optimum cessation of training in neural networks
CN102169642A (en) * 2011-04-06 2011-08-31 李一波 Interactive virtual teacher system having intelligent error correction function
CN104809426A (en) * 2014-01-27 2015-07-29 日本电气株式会社 Convolutional neural network training method and target identification method and device
CN106778684A (en) * 2017-01-12 2017-05-31 易视腾科技股份有限公司 deep neural network training method and face identification method
CN106803137A (en) * 2017-01-25 2017-06-06 东南大学 Urban track traffic AFC system enters the station volume of the flow of passengers method for detecting abnormality in real time
CN107273872A (en) * 2017-07-13 2017-10-20 北京大学深圳研究生院 The depth discrimination net model methodology recognized again for pedestrian in image or video
CN108205802A (en) * 2016-12-23 2018-06-26 北京市商汤科技开发有限公司 Deep neural network model training, image processing method and device and equipment
CN108399431A (en) * 2018-02-28 2018-08-14 国信优易数据有限公司 Disaggregated model training method and sorting technique
CN108416059A (en) * 2018-03-22 2018-08-17 北京市商汤科技开发有限公司 Training method and device, equipment, medium, the program of image description model
CN108416440A (en) * 2018-03-20 2018-08-17 上海未来伙伴机器人有限公司 A kind of training method of neural network, object identification method and device
CN108427939A (en) * 2018-03-30 2018-08-21 百度在线网络技术(北京)有限公司 model generating method and device
CN108898218A (en) * 2018-05-24 2018-11-27 阿里巴巴集团控股有限公司 A kind of training method of neural network model, device and computer equipment
US10163022B1 (en) * 2017-06-22 2018-12-25 StradVision, Inc. Method for learning text recognition, method for recognizing text using the same, and apparatus for learning text recognition, apparatus for recognizing text using the same

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7509259B2 (en) * 2004-12-21 2009-03-24 Motorola, Inc. Method of refining statistical pattern recognition models and statistical pattern recognizers
US20160180214A1 (en) * 2014-12-19 2016-06-23 Google Inc. Sharp discrepancy learning
US20180144244A1 (en) * 2016-11-23 2018-05-24 Vital Images, Inc. Distributed clinical workflow training of deep learning neural networks
CN107357775A (en) * 2017-06-05 2017-11-17 百度在线网络技术(北京)有限公司 The text error correction method and device of Recognition with Recurrent Neural Network based on artificial intelligence


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Zhou Youwen et al., Engineering Design of Computer Management Information Systems, Hunan University Press, 1989, pp. 167-173. *
Insulator fault recognition algorithm based on sparse-difference deep belief networks; Gao Qiang; Yang Wu; Li Qian; Electrical Measurement & Instrumentation; Vol. 53, No. 1; pp. 19-25. *
Chen Xuan et al., Handwritten digit recognition based on a fused convolutional neural network model, Computer Engineering, 2017, Vol. 44, No. 11, pp. 187-192. *

Also Published As

Publication number Publication date
CN111444255A (en) 2020-07-24

Similar Documents

Publication Publication Date Title
CN108140032B (en) Apparatus and method for automatic video summarization
CN111444255B (en) Training method and device for data model
KR102114564B1 (en) Learning system, learning device, learning method, learning program, teacher data creation device, teacher data creation method, teacher data creation program, terminal device and threshold change device
CN111444848A (en) Specific scene model upgrading method and system based on federal learning
US20170255620A1 (en) System and method for determining parameters based on multimedia content
US20170344900A1 (en) Method and apparatus for automated organization of visual-content media files according to preferences of a user
CN105684046B (en) Generate image composition
CN110728294A (en) Cross-domain image classification model construction method and device based on transfer learning
JP4643735B1 (en) Electronic device and video processing method
CN109063984B (en) Method, apparatus, computer device and storage medium for risky travelers
WO2019172451A1 (en) Learning data creation device, learning model creation system, learning data creation method, and program
CN112000024B (en) Method, device and equipment for controlling household appliance
US20100332437A1 (en) System For Generating A Media Playlist
US10210602B2 (en) System and method for normalized focal length profiling
CN111241928A (en) Face recognition base optimization method, system, equipment and readable storage medium
CN109359689B (en) Data identification method and device
CN111382305B (en) Video deduplication method, video deduplication device, computer equipment and storage medium
WO2022246989A1 (en) Data identification method and apparatus, and device and readable storage medium
CN111368867A (en) Archive classification method and system and computer readable storage medium
JP7268739B2 (en) LEARNING DATA GENERATION DEVICE, LEARNING DEVICE, IDENTIFICATION DEVICE, GENERATION METHOD AND PROGRAM
CN107169093B (en) target image acquisition method and device
US20140372372A1 (en) Systems and methods for collecting information from digital media files
CN106101839A (en) A kind of method identifying that television user gathers
CN111382297A (en) Method and device for reporting user data of user side
CN113920353B (en) Unsupervised face image secondary clustering method, unsupervised face image secondary clustering device and unsupervised face image secondary clustering medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant