WO2022022233A1 - Method, apparatus, computing device, and storage medium for updating an AI model - Google Patents

Method, apparatus, computing device, and storage medium for updating an AI model

Info

Publication number
WO2022022233A1
WO2022022233A1 PCT/CN2021/104537 CN2021104537W WO2022022233A1 WO 2022022233 A1 WO2022022233 A1 WO 2022022233A1 CN 2021104537 W CN2021104537 W CN 2021104537W WO 2022022233 A1 WO2022022233 A1 WO 2022022233A1
Authority
WO
WIPO (PCT)
Prior art keywords
model
data set
data
inference
existing
Prior art date
Application number
PCT/CN2021/104537
Other languages
English (en)
French (fr)
Inventor
邬书哲
金鑫
李心成
涂丹丹
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 filed Critical 华为技术有限公司
Priority to EP21851426.3A priority Critical patent/EP4177792A4/en
Priority to JP2023505761A priority patent/JP2023535227A/ja
Publication of WO2022022233A1 publication Critical patent/WO2022022233A1/zh
Priority to US18/158,019 priority patent/US20230153622A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/60Software deployment
    • G06F8/65Updates
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/042Knowledge-based neural networks; Logical representations of neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/088Non-supervised learning, e.g. competitive learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/09Supervised learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • G06N3/0455Auto-encoder networks; Encoder-decoder networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0475Generative networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/094Adversarial learning

Definitions

  • The present application relates to the field of artificial intelligence (AI) technology, and in particular to a method, apparatus, computing device, and storage medium for updating an AI model.
  • AI models mainly use algorithms such as machine learning to learn their parameters from a large amount of data. Because they are learned from a large amount of data, the resulting AI models have a certain generalization capability. However, when the data distribution in the scene where a model is used differs greatly from the distribution of its training data, the performance of the AI model is affected and its accuracy decreases.
  • The actual application environment of AI models changes dynamically. In actual application scenarios the data distribution may therefore change constantly, which may prevent the AI model from maintaining stable accuracy. To keep the accuracy of the AI model in step with changes in the scene, the AI model needs to be adaptively updated.
  • In the related art, an AI platform is developed, and users can update the model by themselves through the AI platform.
  • The AI platform provides the functions required from training AI models to deploying AI models. These functions usually include data annotation, data management, model training, and model inference.
  • The user can train the AI model through the AI platform.
  • If the user subsequently determines that the current AI model is not suitable for the current scene, the user can provide a data set of the current scene to update the AI model so that the updated AI model adapts to the current scene, and then apply the updated AI model to the current scene.
  • In the related art, the AI model is updated only after its accuracy has dropped and the user has perceived that drop in the current scene, so the AI model is not updated in time.
  • the present application provides a method, apparatus, computing device and storage medium for updating an AI model, so as to update the AI model in time.
  • In a first aspect, the present application provides a method for updating an AI model.
  • The method includes: acquiring an inference data set, where the inference data in the inference data set is input into an existing AI model to perform inference; determining that there is a difference between the data distribution of the inference data set and the data distribution of a training data set, where the training data set is the data set used to train the existing AI model; and updating the existing AI model using the inference data set to obtain an updated AI model.
  • The method for updating the AI model can be executed by the AI platform. Because the method updates the existing AI model when there is a difference between the data distribution of the inference data set and that of the training data set, rather than waiting until the user perceives that the accuracy of the AI model has decreased, the AI model can be updated in a more timely manner.
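  • As an illustration of the determining step, the following is a minimal sketch under stated assumptions, not the method claimed here: it assumes a single numeric feature per sample and measures the difference with a histogram-based Jensen-Shannon divergence. The text above does not name a metric, and `histogram`, `js_divergence`, `distribution_differs`, the bin count, and the threshold are all hypothetical.

```python
import math
from collections import Counter

def histogram(values, lo, hi, bins):
    """Return a normalized histogram (list of probabilities) over [lo, hi]."""
    width = (hi - lo) / bins
    counts = Counter()
    for v in values:
        # Clamp so that v == hi falls into the last bin.
        counts[min(int((v - lo) / width), bins - 1)] += 1
    return [counts[i] / len(values) for i in range(bins)]

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits; 0 for identical, 1 for disjoint."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def distribution_differs(train_values, infer_values, threshold=0.1, bins=10):
    """Return (differs, divergence) for training versus inference data."""
    lo = min(min(train_values), min(infer_values))
    hi = max(max(train_values), max(infer_values))
    if hi == lo:  # every value is identical, so there is nothing to compare
        return False, 0.0
    p = histogram(train_values, lo, hi, bins)
    q = histogram(infer_values, lo, hi, bins)
    d = js_divergence(p, q)
    return d > threshold, d
```

  • A platform could call `distribution_differs` on each new batch of inference data and start the update only when the first returned value is true.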
  • The existing AI model is deployed on the inference platform, and the method further includes: comparing the inference accuracy of the updated AI model with that of the existing AI model, and determining that the inference accuracy of the updated AI model is better than that of the existing AI model; and deploying the updated AI model to the inference platform to perform inference in place of the existing AI model.
  • The existing AI model can be deployed on the inference platform, and the inference platform can be a part of the AI platform or can be independent of the AI platform.
  • The AI platform can obtain the inference accuracy of the updated AI model and of the existing AI model, and then compare the two.
  • If the inference accuracy of the updated AI model is better than that of the existing AI model, the updated AI model is deployed to the inference platform and is used there to perform inference instead of the existing AI model.
  • The existing AI model is replaced only when the inference accuracy of the updated AI model is better than that of the existing AI model. Inference is therefore performed with the more accurate model, so the inference accuracy is higher.
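  • The comparison described above can be sketched as follows. This is a hedged illustration: models are assumed to be plain callables, accuracy is assumed to be measured on a labeled validation set, and `accuracy` and `choose_model` are hypothetical names.

```python
def accuracy(model, labeled_data):
    """Fraction of (input, label) pairs that the model predicts correctly."""
    correct = sum(1 for x, y in labeled_data if model(x) == y)
    return correct / len(labeled_data)

def choose_model(existing_model, updated_model, validation_data):
    """Return the model to deploy plus both accuracies.

    The updated model replaces the existing one only if it is strictly better.
    """
    acc_old = accuracy(existing_model, validation_data)
    acc_new = accuracy(updated_model, validation_data)
    if acc_new > acc_old:
        return updated_model, acc_old, acc_new
    return existing_model, acc_old, acc_new
```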
  • Before deploying the updated AI model to the inference platform, the method further includes: displaying the inference accuracy of the existing AI model and the inference accuracy of the updated AI model through a display interface; and receiving the user's update instruction for the existing AI model.
  • Before deploying the updated AI model to the inference platform, the AI platform can display the inference accuracy of the existing AI model and that of the updated AI model through the display interface, and the user can choose whether to update the existing AI model.
  • When the AI platform receives the user's update instruction for the existing AI model, the updated AI model can be deployed to the inference platform. In this way, the user can decide whether to deploy the updated AI model, which improves the user experience.
  • Using the inference data set to update the existing AI model includes: if the difference satisfies the offline update condition, using the inference data set to update the existing AI model offline; if the difference does not satisfy the offline update condition, using the inference data set to update the existing AI model online.
  • The AI platform can determine whether the difference between the data distribution of the inference data set and the data distribution of the training data set satisfies the offline update condition. If the difference satisfies the offline update condition, the AI platform can use the inference data set to update the existing AI model offline. If it does not, the AI platform can use the inference data set to update the existing AI model online. Because the update method is selected based on the difference, update time can be saved.
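  • A minimal sketch of this choice, assuming the difference is a single number and the offline update condition is a simple threshold. Both are assumptions, since the condition is not specified at this point, and `select_update_mode` and `offline_threshold` are hypothetical names.

```python
def select_update_mode(difference, offline_threshold=0.5):
    """Pick the update mode from the measured distribution difference.

    A large difference triggers a full offline update; a small one is
    handled by a lighter online update.
    """
    return "offline" if difference >= offline_threshold else "online"
```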
  • Using the inference data set to update the existing AI model online includes: using the difference between the data distribution of the inference data set and the data distribution of the training data set to determine the parameter change of the target part of the existing AI model; and determining the parameters of the target part in the updated AI model based on the current parameters of the target part in the existing AI model and the parameter change.
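  • The online update can be sketched as below. This is an illustration under strong assumptions: model parameters are held in a dictionary, the target part is identified by a list of keys, and the parameter change is taken to be the shift in feature mean between the two data sets scaled by `step`. The text above does not say how the change is computed, so that rule and the name `online_update` are hypothetical.

```python
from statistics import mean

def online_update(params, target_keys, train_features, infer_features, step=1.0):
    """Return new parameters in which only the target part is changed.

    The parameter change is derived from the distribution difference,
    approximated here by the shift in feature mean.
    """
    change = step * (mean(infer_features) - mean(train_features))
    updated = dict(params)  # parameters outside the target part are copied as-is
    for key in target_keys:
        updated[key] = params[key] + change
    return updated
```

  • Only the listed keys change, which is why an online update of this kind can be much cheaper than retraining the whole model.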
  • Using the inference data set to update the existing AI model includes: constructing a target data set according to the inference data set; and using the target data set to update the existing AI model.
  • Constructing a target data set according to the inference data set includes: obtaining, in the inference data set, target data that satisfies the sample conditions, and displaying the target data through a display interface; obtaining the user's annotation result for the target data; and constructing the target data set according to the target data and the annotation result of the target data.
  • The AI platform can select the target data that meets the sample conditions in the inference data set and display it to the user, so that the user can annotate the target data.
  • The AI platform can construct a target data set based on the user's annotation results and the target data. In this way, when the existing AI model is updated, the constructed target data set includes user-annotated target data that meets the sample conditions, so the updated AI model can be more suitable for inference on the inference data set.
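  • The construction of the target data set can be sketched as follows. The sample condition used here (prediction entropy above a threshold) is an assumption chosen for illustration, and `predict_proba`, `annotate`, `select_target_data`, and `build_target_dataset` are hypothetical names; `annotate` stands in for the user's annotation through the display interface.

```python
import math

def entropy(probs):
    """Shannon entropy (in bits) of a list of class probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def select_target_data(inference_data, predict_proba, min_entropy=0.9):
    """Keep the items whose prediction entropy meets the assumed sample condition."""
    return [x for x in inference_data if entropy(predict_proba(x)) >= min_entropy]

def build_target_dataset(inference_data, predict_proba, annotate, min_entropy=0.9):
    """Pair each selected item with the label returned by the annotation callback."""
    targets = select_target_data(inference_data, predict_proba, min_entropy)
    return [(x, annotate(x)) for x in targets]
```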
  • Obtaining, in the inference data set, target data that satisfies the sample conditions includes: according to the difference between the data distribution of the inference data set and the data distribution of the training data set, obtaining, in the inference data set, the target data that satisfies the sample conditions, where the target data is suitable for updating the existing AI model. In this way, because the selection is based on the difference between the two data distributions, the target data can be more suitable for updating the existing AI model.
  • The target data set further includes labeled data that suits the data distribution of the inference data set and is sampled and/or generated from the current labeled data, where the current labeled data includes data in the training data set.
  • Labeled data that suits the data distribution of the inference data set can also be obtained from the existing labeled data, so the target data set contains more labeled data and the inference accuracy of the updated AI model can be improved.
  • In this way, the target data set contains more labeled data, so the inference accuracy of the updated AI model can be higher.
  • Using the target data set to update the existing AI model includes: obtaining a strategy for updating the existing AI model according to the data characteristics of the data in the target data set; and updating the existing AI model according to the strategy.
  • Because the data characteristics of the data in the target data set are used to select the strategy for updating the existing AI model, this not only improves the efficiency of updating the existing AI model but also makes the inference accuracy of the updated AI model higher.
  • The method further includes: obtaining the update cycle of the AI model input by the user. Determining that there is a difference between the data distribution of the inference data set and the data distribution of the training data set includes: determining, according to the update cycle of the AI model, that the data distribution of the inference data set differs from the data distribution of the training data set.
  • The user can decide the update cycle of the AI model, and when the update cycle is reached, the process of updating the AI model is executed.
  • In a second aspect, the present application provides an apparatus for updating an artificial intelligence (AI) model. The apparatus includes: an acquisition module, configured to acquire an inference data set, where the inference data in the inference data set is input into an existing AI model to perform inference; a determining module, configured to determine that there is a difference between the data distribution of the inference data set and the data distribution of the training data set, where the training data set is the data set used to train the existing AI model; and an update module, configured to update the existing AI model by using the inference data set to obtain an updated AI model.
  • In this way, the existing AI model is updated when the two data distributions differ, rather than only after the user perceives that the accuracy of the AI model has decreased, so the AI model can be updated in time.
  • The existing AI model is deployed on an inference platform, and the determining module is further configured to compare the inference accuracy of the updated AI model with that of the existing AI model and determine that the inference accuracy of the updated AI model is better than that of the existing AI model; the update module is further configured to deploy the updated AI model to the inference platform, so that the updated AI model performs inference in place of the existing AI model. In this way, the existing AI model is replaced only when the inference accuracy of the updated AI model is better, so the inference accuracy is higher.
  • The apparatus further includes: a display module, configured to display the inference accuracy of the existing AI model and the inference accuracy of the updated AI model through a display interface before the updated AI model is deployed to the inference platform; and a receiving module, configured to receive the user's update instruction for the existing AI model. In this way, the user can decide whether to deploy the updated AI model, which improves the user experience.
  • The update module is configured to: if the difference satisfies the offline update condition, use the inference data set to update the existing AI model offline; if the difference does not satisfy the offline update condition, use the inference data set to update the existing AI model online. In this way, because different update methods can be selected based on the difference, update time can be saved.
  • The update module is configured to: use the difference between the data distribution of the inference data set and the data distribution of the training data set to determine the parameter change of the target part of the existing AI model; and determine the parameters of the target part in the updated AI model based on the current parameters of the target part in the existing AI model and the parameter change.
  • The update module is configured to: construct a target data set according to the inference data set; and update the existing AI model by using the target data set.
  • The update module is configured to: obtain, in the inference data set, target data that satisfies the sample conditions, and display the target data through a display interface; obtain the user's annotation result for the target data; and construct a target data set according to the target data and the annotation result of the target data.
  • The constructed target data set includes user-annotated target data that meets the sample conditions, so the updated AI model can be more suitable for inference on the inference data set.
  • The update module is configured to: obtain, in the inference data set, target data that satisfies the sample conditions according to the difference between the data distribution of the inference data set and the data distribution of the training data set, where the target data is suitable for updating the existing AI model. In this way, the target data that satisfies the sample conditions can be filtered out more accurately.
  • The target data set further includes labeled data that suits the data distribution of the inference data set and is sampled and/or generated from the current labeled data, where the current labeled data includes data in the training data set.
  • The target data set includes unlabeled data and labeled data suitable for the data distribution of the inference data set. The update module is configured to: use the unlabeled data in the target data set to optimize the feature extraction part of the existing AI model in an unsupervised manner; and update the existing AI model according to the optimized feature extraction part and the labeled data in the target data set. In this way, the feature extraction part of the AI model can be optimized first, and then the existing AI model can be updated.
  • The target data set includes unlabeled data and labeled data suitable for the data distribution of the inference data set. The update module is configured to: use the existing AI model to label the unlabeled data in the target data set, obtaining labeling results for the unlabeled data; and update the existing AI model according to the labeling results of the unlabeled data and the labeled data in the target data set.
  • In this way, the target data set contains more labeled data, so the inference accuracy of the updated AI model can be higher.
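  • Labeling the unlabeled data with the existing AI model can be sketched as below. The model is assumed to return a dictionary of class probabilities, and the confidence filter is an added assumption (the text above does not mention one); `pseudo_label`, `build_training_set`, and `min_confidence` are hypothetical names.

```python
def pseudo_label(model_proba, unlabeled, min_confidence=0.8):
    """Label unlabeled items with the existing model's most likely class.

    Items whose top probability is below min_confidence are skipped.
    """
    labeled = []
    for x in unlabeled:
        probs = model_proba(x)            # mapping: class name -> probability
        label = max(probs, key=probs.get)
        if probs[label] >= min_confidence:
            labeled.append((x, label))
    return labeled

def build_training_set(labeled, unlabeled, model_proba, min_confidence=0.8):
    """Combine the already labeled data with the model-labeled data."""
    return list(labeled) + pseudo_label(model_proba, unlabeled, min_confidence)
```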
  • The update module is configured to: obtain a strategy for updating the existing AI model according to the data characteristics of the data in the target data set; and update the existing AI model according to the strategy.
  • Because the data characteristics of the data in the target data set are used to select the strategy for updating the existing AI model, this not only improves the efficiency of updating the existing AI model but also makes the inference accuracy of the updated AI model higher.
  • The acquisition module is further configured to obtain the update cycle of the AI model input by the user.
  • The determining module is configured to determine, according to the update cycle of the AI model, that there is a difference between the data distribution of the inference data set and the data distribution of the training data set.
  • In a third aspect, a computing device for updating an AI model is provided. The computing device includes a processor and a memory, where the memory stores computer instructions and the processor executes the computer instructions to implement the method of the first aspect and its possible implementations.
  • In a fourth aspect, a computer-readable storage medium is provided. The computer-readable storage medium stores computer instructions, and when the computer instructions are executed by a computing device, the computing device is caused to perform the method of the first aspect and its possible implementations.
  • In a fifth aspect, a computer program product comprising instructions is provided. When run on a computing device, the instructions cause the computing device to perform the method of the first aspect and its possible implementations, or to implement the functions of the apparatus of the second aspect and its possible implementations.
  • FIG. 1 is a schematic structural diagram of an AI platform 100 provided by an embodiment of the present application.
  • FIG. 2 is a schematic diagram of an application scenario of an AI platform 100 provided by an embodiment of the present application.
  • FIG. 3 is a schematic diagram of deployment of an AI platform 100 according to an embodiment of the present application.
  • FIG. 4 is a schematic structural diagram of a computing device 400 for deploying the AI platform 100 according to an embodiment of the present application.
  • FIG. 5 is a schematic diagram of an operation mode provided by an embodiment of the present application.
  • FIG. 6 is a logic diagram of AI model updating provided by an embodiment of the present application.
  • FIG. 7 is a schematic flowchart of a method for updating an AI model provided by an embodiment of the present application.
  • FIG. 8 is a schematic diagram of updating an AI model without user participation provided by an embodiment of the present application.
  • FIG. 9 is a schematic diagram of updating an AI model with user participation provided by an embodiment of the present application.
  • FIG. 10 is a schematic flowchart of a method for updating an AI model provided by an embodiment of the present application.
  • FIG. 11 is a schematic diagram of a scenario for updating an AI model provided by an embodiment of the present application.
  • FIG. 12 is another schematic diagram of determining a difference in data distribution provided by an embodiment of the present application.
  • FIG. 13 is a schematic diagram of local parameters of an updated AI model provided by an embodiment of the present application.
  • FIG. 14 is a schematic diagram of determining target data according to an embodiment of the present application.
  • FIG. 15 is a schematic structural diagram of an apparatus for updating an AI model provided by an embodiment of the present application.
  • FIG. 16 is a schematic structural diagram of an apparatus for updating an AI model provided by an embodiment of the present application.
  • FIG. 17 is a schematic structural diagram of a computing device according to an embodiment of the present application.
  • Machine learning is a core method to realize AI.
  • Machine learning penetrates into various industries such as medicine, transportation, education, and finance. Not only professional technicians, but even non-AI technology professionals in various industries are looking forward to using AI and machine learning to complete specific tasks.
  • An AI model is a type of mathematical algorithm model that uses machine learning ideas to solve practical problems.
  • The AI model includes a large number of parameters and calculation formulas (or calculation rules).
  • The parameters in the AI model are values that can be obtained by training the AI model on the training data set.
  • For example, the parameters of the AI model are the weights of the calculation formulas or calculation factors in the AI model.
  • The AI model also contains some hyperparameters. Hyperparameters can be used to guide the construction or the training of the AI model. There are many types of hyperparameters, for example, the number of iterations of AI model training, the learning rate, the batch size, the number of layers of the AI model, and the number of neurons in each layer. Hyperparameters can be parameters obtained by training the AI model on the training data set, or can be preset parameters. Preset parameters are not updated when the AI model is trained on the training data set.
  • The neural network model is a mathematical algorithm model that imitates the structure and function of a biological neural network (the animal central nervous system).
  • A neural network model can include a variety of neural network layers with different functions, and each layer includes parameters and calculation formulas. Different layers in the neural network model have different names according to their calculation formulas or functions. For example, layers that perform convolution calculations are called convolutional layers, and convolutional layers are often used for feature extraction on input signals such as images.
  • A neural network model can also be composed of a combination of multiple existing neural network models. Neural network models with different structures can be used in different scenarios (such as classification and recognition) or provide different effects when used in the same scenario.
  • Differences in the structure of neural network models specifically include one or more of the following:
  • the number of network layers in the neural network model is different;
  • the order of the network layers is different;
  • the weights, parameters, or calculation formulas in each network layer are different.
  • Training an AI model refers to using an existing sample (ie, a training data set) to make the AI model fit the rules in the existing sample by a certain method, and determine the parameters in the AI model.
  • For example, training an AI model for image classification or detection and recognition requires preparing a training image set. Depending on whether the training images in the training image set are labeled (that is, whether each image is annotated with a specific type or name), the training of the AI model can be divided into supervised training and unsupervised training. In supervised training of an AI model, the training images in the training image set have labels.
  • During training, the training images in the training image set are used as the input of the AI model, and the annotations corresponding to the training images are used as the reference for the output values of the AI model. A loss function is used to calculate the loss value between the output value of the AI model and the annotation corresponding to the training image, and the parameters in the AI model are adjusted according to the loss value.
  • The AI model is iteratively trained with each training image in the training image set, and the parameters of the AI model are continuously adjusted until the AI model can, with high accuracy, output the same value as the annotation corresponding to the input training image.
  • In unsupervised training of an AI model, the training images in the training image set are not labeled. The training images are input to the AI model in turn, and the AI model gradually recognizes the correlations and latent patterns among the training images in the training image set.
  • For example, after training, an AI model used for clustering can learn the characteristics of each training image and the correlations and differences between training images, and automatically classify the training images into multiple types.
  • Different task types can use different AI models.
  • Some AI models can only be trained with supervised learning, some can only be trained with unsupervised learning, and some can be trained with either supervised or unsupervised learning.
  • A trained AI model can be used to complete a specific task.
  • AI models in machine learning often need to be trained with supervised learning.
  • Training an AI model with supervised learning enables it to learn, in a more targeted way, the association between the training images and their corresponding annotations in the labeled training image set, so that the trained AI model has higher accuracy when used to predict other input inference images.
  • The following is an example of training a neural network model for a data classification task. First, data is collected according to the task and a training data set is built. The training data set contains 3 types of data: apples, pears, and bananas.
  • The collected training data is stored in 3 folders according to type, and the name of each folder is the label of all the data in that folder.
  • After the training data set is built, a neural network model that can perform data classification, such as a convolutional neural network (CNN), is selected. A loss value is calculated from the output of the CNN using the loss function, and the parameters of each layer in the CNN are updated according to the loss value and the CNN structure.
  • The aforementioned training process continues until the loss value output by the loss function converges or all the data in the training data set has been used for training, and then the training ends.
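  • The shape of this training loop can be sketched in a few lines. To stay self-contained, the sketch swaps the CNN for a one-variable linear model trained by gradient descent on a mean squared error loss; the stopping rule (loss converged or iteration budget used up) mirrors the description above. The function name `train`, the learning rate, and the tolerance are hypothetical.

```python
def train(data, lr=0.1, max_epochs=1000, tol=1e-9):
    """Fit y = w * x + b to (x, y) pairs by gradient descent on the MSE loss."""
    w, b = 0.0, 0.0
    prev_loss = float("inf")
    n = len(data)
    for _ in range(max_epochs):
        # Forward pass: loss between the model output and the labels.
        loss = sum((w * x + b - y) ** 2 for x, y in data) / n
        if abs(prev_loss - loss) < tol:  # the loss value has converged
            break
        prev_loss = loss
        # Backward pass: gradients of the loss with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        # Parameter update according to the loss gradients.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, loss
```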
  • The loss function is a function used to measure the degree to which the AI model is trained (that is, it is used to calculate the difference between the result predicted by the AI model and the real target).
  • In the process of training the AI model, the loss function is used to judge the difference between the value predicted by the current AI model and the real target value, and the parameters of the AI model are updated accordingly until the AI model can predict the real target value or a value very close to it, at which point the AI model is considered trained.
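  • As a concrete instance of such a loss function, the sketch below uses cross-entropy for a single classification sample. Cross-entropy is chosen here only as a common example; the text above does not name a specific loss function, and `cross_entropy` is a hypothetical helper.

```python
import math

def cross_entropy(predicted_probs, true_index):
    """Cross-entropy loss for one sample.

    The loss is small when the model assigns a high probability to the
    true class and grows as that probability falls.
    """
    return -math.log(predicted_probs[true_index])
```

  • A prediction of 0.75 for the true class yields a smaller loss than a prediction of 0.25, so reducing this loss moves the predicted value toward the real target.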
  • the trained AI model can be used to infer the data to obtain inference results.
  • For example, when the task is image classification, the specific inference process is: the image is input into the AI model, the convolution kernels of each layer in the AI model perform feature extraction on the image, and the category to which the image belongs is output based on the extracted features.
  • In the target detection (also known as object detection) scenario, the image is input into the AI model, the convolution kernels of each layer in the AI model perform feature extraction on the image, and the location and category of the bounding box of each target included in the image are output based on the extracted features.
  • In a scenario that covers both image classification and target detection, the image is input into the AI model, the convolution kernels of each layer in the AI model perform feature extraction on the image, and the category to which the image belongs, as well as the location and category of the bounding box of each target included in the image, are output based on the extracted features.
  • It should be noted that some AI models have strong inference ability, while others have weak inference ability.
  • Strong inference ability means that when the AI model is used to perform inference on images, the accuracy of the inference results is greater than or equal to a certain value.
  • Weak inference ability means that when the AI model is used to perform inference on images, the accuracy of the inference results is lower than a certain value.
  • Data labeling is the process of adding all labels in the corresponding scene to each unlabeled data.
  • For example, the unlabeled data is an unlabeled image.
  • In the image classification scenario, data labeling adds the category to which the unlabeled image belongs.
  • In the target detection scenario, data labeling adds location information and a category to each target in the unlabeled image.
  • data labeling can also be a process of adding partial labels in the corresponding scene to one or more unlabeled data. For example, in the scene of object detection, only the category of the object in the unlabeled image is added, and the location information of the object in the unlabeled image is not added.
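  • The distinction between full and partial labels in the target detection scenario can be illustrated with a small data structure. This is only a sketch; the class name `ObjectLabel`, its fields, and the pixel-coordinate box format are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ObjectLabel:
    """Label of one target in an image (hypothetical structure)."""
    category: str
    # (x_min, y_min, x_max, y_max) in pixels; None when only the category was added.
    box: Optional[Tuple[int, int, int, int]] = None

    def is_complete(self) -> bool:
        """A full label carries both the category and the location information."""
        return self.box is not None


full = ObjectLabel("cat", (10, 20, 110, 220))   # category and location
partial = ObjectLabel("cat")                    # category only (partial label)
print(full.is_complete(), partial.is_complete())
```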
  • the AI platform is a platform that provides a convenient AI development environment and convenient development tools for AI developers and users. There are various AI models or AI sub-models built into the AI platform to solve different problems.
  • the AI platform can establish suitable AI models according to the needs input by users. That is, users only need to determine their own needs in the AI platform, and follow the prompts to prepare a training data set and upload it to the AI platform, and the AI platform can train an AI model for the user that can be used to achieve the user's needs. For example, if a user needs an image classification model, the AI platform can choose a classification model from the stored AI models, and then use the training data set to update the classification model to obtain the AI model required by the user.
  • users can follow the prompts to prepare their own algorithms and training data sets and upload them to the AI platform.
  • the AI platform can train an AI model that can be used to achieve user needs. Users can use the trained AI model to complete their specific tasks.
  • The AI model mentioned above is a general term; AI models include deep learning models, machine learning models, and the like.
  • FIG. 1 is a schematic structural diagram of an AI platform 100 in an embodiment of the present application. It should be understood that FIG. 1 is only an exemplary schematic structural diagram of the AI platform 100, and the present application does not limit the division of modules in the AI platform 100.
  • the AI platform 100 includes a user input/output (I/O) module 101 , a model training module 102 , and an inference module 103 .
  • the AI platform may further include an AI model storage module 104 and a data storage module 105 .
  • User I/O module 101: used to receive the inference data set input by the user, or used by the user to establish a connection between the AI platform and the device that generates the inference data, so that the inference data set is obtained from that device.
  • For example, the device that generates inference data is a camera or the like.
  • the user I/O module 101 is also used for receiving a task goal input or selected by a user, receiving a training data set of the user, and the like.
  • The user I/O module 101 is also used to receive the user's annotation result for the target data in the inference data set (the target data is a sample adapted to the data distribution of the inference data set), to obtain one or more pieces of annotated data from the user, and so on. The user I/O module 101 is also used to provide AI models and the like to other users.
  • The user I/O module 101 can be implemented using a graphical user interface (GUI) or a command line interface (CLI).
  • the AI platform 100 displayed on the GUI can provide users with various AI services (such as image classification services, target detection services, etc.). The user can select a task target on the GUI.
  • For example, the user selects the image classification service and can then upload multiple unlabeled images in the GUI of the AI platform.
  • After the GUI receives the task target and the multiple unlabeled images, it communicates with the model training module 102.
  • the model training module 102 selects or searches for the user an initial AI model that can be used to accomplish the user's task goal according to the task goal determined by the user.
  • The user I/O module 101 can also be used to receive the effect that the user expects of the AI model for completing the task objective. For example, the user inputs or selects that the finally obtained AI model for face recognition should be more than 99% accurate.
  • the user I/O module 101 may also be used to receive an AI model input by the user, and the like.
  • a user can enter an initial AI model in the GUI based on his or her mission goals.
  • the user I/O module 101 can also be used to provide various pre-built initial AI models for the user to choose from. For example, users can select an initial AI model on the GUI based on their mission goals.
  • the user I/O module 101 may also be used to receive surface features and deep features of unlabeled images in the inference data set input by the user.
  • The surface features include one or more of image resolution, image aspect ratio, the mean and variance of the image's red-green-blue (RGB) values, image brightness, image saturation, or image sharpness. Deep features refer to abstract features of images extracted using the convolution kernels in a feature extraction model (such as a CNN).
  • In a scenario with bounding boxes, such as target detection, the surface features include the surface features of the bounding box and the surface features of the image.
  • The surface features of the bounding box may include one or more of: the aspect ratio of each bounding box in a single-frame image, the ratio of the area of each bounding box in a single-frame image to the image area, the degree of marginalization of each bounding box in a single-frame image, the stacked map of each bounding box in a single-frame image, the brightness of each bounding box in a single-frame image, or the blurriness of each bounding box in a single-frame image.
  • The surface features of the image may include one or more of: the resolution of the image, the aspect ratio of the image, the mean and variance of the RGB values of the image, the brightness of the image, the saturation of the image, the sharpness of the image, the number of boxes in a single-frame image, or the variance of the areas of the boxes in a single-frame image.
  • As above, deep features refer to abstract features of images extracted using the convolution kernels in a feature extraction model (such as a CNN).
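  • A few of the surface features listed above can be computed directly from bounding-box coordinates. The sketch below covers only the aspect ratio, the area ratio, the number of boxes, and the variance of the box areas; the function name, the `(x_min, y_min, x_max, y_max)` box format, and the use of population variance are assumptions made for illustration.

```python
from statistics import pvariance


def box_surface_features(boxes, image_width, image_height):
    """Compute simple surface features for the boxes of one single-frame image."""
    image_area = image_width * image_height
    per_box = []
    areas = []
    for x_min, y_min, x_max, y_max in boxes:
        width, height = x_max - x_min, y_max - y_min
        areas.append(width * height)
        per_box.append({
            "aspect_ratio": width / height,                # box width over box height
            "area_ratio": (width * height) / image_area,   # box area over image area
        })
    return {
        "per_box": per_box,
        "box_count": len(boxes),
        "area_variance": pvariance(areas) if areas else 0.0,
    }


# Usage: two boxes in a 100 x 100 image.
stats = box_surface_features([(0, 0, 20, 10), (10, 10, 30, 50)], 100, 100)
print(stats["box_count"], stats["area_variance"])
```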
  • the user I/O module 101 can also be used to provide a GUI for the user to mark the training data in the training data set, and for the user to mark the target data in the inference data set.
  • the user I/O module 101 may also be configured to receive various configuration information of the user on the initial AI model, the training data in the training data set, and the like.
  • The user I/O module 101 can also be used to provide a GUI through which the model training module 102 presents the inference accuracy of the AI model before the update, the inference accuracy of the updated AI model, and the like, and through which the user inputs an instruction to update the AI model.
  • the user I/O module 101 can also be used to provide a GUI for the user to input the update cycle of the AI model.
  • Model training module 102: used to train the AI model. Here, "training" covers both training the initial AI model and optimizing and updating a trained AI model; the initial AI model includes an AI model that has not been trained.
  • a trained AI model refers to an AI model obtained by training an initial AI model, or an AI model obtained by updating an existing trained AI model.
  • the model training module 102 can communicate with the user I/O module 101 , the inference module 103 and the AI model storage module 104 . Specifically, the model training module 102 may acquire the data marked by the user from the user I/O module 101 . The model training module 102 can obtain the existing AI model from the AI model storage module 104 as an initial AI model and the like. The model training module 102 can acquire the inference result of the inference data set and the inference data set from the inference module 103, and train the AI model based on the inference result and the inference data set.
  • the model training module 102 is further configured to perform a preprocessing operation on the training data in the training data set received by the user I/O module 101 .
  • preprocessing the training images in the training image set uploaded by the user can make the training images in the training image set consistent in size, and can also remove inappropriate training images in the training image set.
  • the preprocessed training data set can be suitable for training the initial AI model, and can also make the training effect better.
  • the preprocessed training image set may also be stored to the data storage module 105 .
  • the model training module 102 may also be used to determine the AI model selected by the user on the GUI as the initial AI model. Or determine the AI model uploaded by the user through the GUI as the initial AI model.
  • the model training module 102 can also be used to evaluate the trained AI model to obtain an evaluation result.
  • the evaluation of the AI model can also be a separate module.
  • Inference module 103: uses the AI model to perform inference on the inference data set, and outputs the inference result and the target data of the inference data set.
  • the inference module 103 can communicate with the user I/O module 101 and the AI model storage module 104 .
  • the inference module 103 obtains the inference data set from the user I/O module 101, performs inference processing on the inference data set, and obtains the inference result of the inference data set.
  • The inference module 103 feeds back the labeling result of the inference data set and the target data to the user I/O module 101.
  • the user I/O module 101 obtains the target data marked by the user and the marking confirmation of the inference result by the user, and feeds the target data marked by the user and the inference data confirmed by the user marking to the model training module 102 .
  • the model training module 102 continues to train the optimized AI model based on the target data provided by the user I/O module 101 and the inference data marked and confirmed by the user to obtain a more optimized AI model.
  • the model training module 102 transmits the more optimized AI model to the AI model storage module 104 for storage, and transmits the more optimized AI model to the inference module 103 for inference processing.
  • the output may further include the difference between the data distribution of the inference data set and the data distribution of the training data set of the AI model.
  • The inference module 103 provides the difference to the model training module 102, and the model training module 102 can determine how to update the AI model based on the difference.
  • the difference between the data distribution of the inference data set and the data distribution of the AI model training data set may not be determined by the inference module 103, but by an independent module on the AI platform.
  • the difference between the data distribution of the inference data set and the data distribution of the AI model training data set may also be determined by the model training module 102 .
  • the inference module 103 is further configured to perform a preprocessing operation on the inference data in the inference data set received by the user I/O module 101 .
  • preprocessing the inference images in the inference image set uploaded by the user can make the inference images in the inference image set consistent in size, and can also remove inappropriate inference images in the inference image set.
  • the preprocessed inference data set can be suitable for inference on the initial AI model, and can also make the inference effect better.
  • the preprocessed inference image set may also be stored to the data storage module 105 .
  • the above data preprocessing operation can also be a separate module, which is connected to the inference module 103 and the model training module 102 respectively, provides the preprocessed inference data set for the inference module 103, and provides the model training module 102 with preprocessed training image set.
  • the inference module 103 may not provide the inference result and target data of the inference data set to the user I/O module 101 .
  • the initial AI model may further include an AI model after training the AI model in the AI model storage module 104 using the data in the training data set.
  • AI model storage module 104: used to store the initial AI model, the updated AI model, the AI sub-model structure, the preset model, and the like.
  • the preset model is an AI model that has been trained on the AI platform and can be used directly, or an AI model that has been trained on the AI platform but needs to be further trained and updated.
  • the AI model storage module 104 can communicate with both the user I/O module 101 and the model training module 102 .
  • the AI model storage module 104 receives and stores the trained initial AI model and the updated AI model transmitted by the model training module 102 .
  • the AI model storage module 104 provides the model training module 102 with AI sub-models or initial AI models.
  • the AI model storage module 104 stores the initial AI model uploaded by the user received by the user I/O module 101 . It should be understood that, in another embodiment, the AI model storage module 104 can also be used as a part of the model training module 102 .
  • Data storage module 105 (for example, a data storage resource corresponding to an Object Storage Service (OBS) provided by a cloud service provider): used to store the training data sets and inference data sets uploaded by users, to store preprocessed data, and to store sampled or generated data adapted to the data distribution of the inference data set.
  • The above description takes the user I/O module 101 obtaining the inference data set as an example. The data storage module 105 can also connect directly to the data source to obtain the inference data set.
  • For example, a camera is connected to the data storage module 105, and the video images captured by the camera constitute an inference data set.
  • the data storage module 105 may also store a knowledge base, and the knowledge base includes knowledge that is helpful for faster updating of the AI model.
  • the AI platform in this application may be a system that can interact with users.
  • This system may be a software system, a hardware system, or a system combining software and hardware, which is not limited in this application.
  • It should be noted that the model training module 102 is used not only for the initial training of the AI model but also for updating the AI model.
  • In other embodiments, the model training module 102 may be split into a module for initial training and a module for updating AI models.
  • The AI platform provided by the embodiments of the present application can determine whether there is a difference between the data distribution of the inference data set and the data distribution of the training data set and, if there is, update the AI model, so that the AI model is updated in a timely manner.
  • The AI platform may also omit the inference module 103, in which case the AI platform only provides the processing for updating the AI model.
  • the user provides the inference data set, the AI model, and the training data set for training the AI model (or the data distribution of the training data set for training the AI model) to the AI platform, and the AI platform updates the AI model.
  • the AI platform provides users with updated AI models.
  • Alternatively, the AI platform is connected to a third-party platform (that is, an inference platform that performs inference on inference data). The AI platform obtains the inference data set, the AI model, and the training data set used to train the AI model (or the data distribution of that training data set) from the third-party platform, and then updates the AI model.
  • the AI platform provides updated AI models to third-party platforms.
  • FIG. 2 is a schematic diagram of an application scenario of an AI platform 100 provided by an embodiment of the present application.
  • the AI platform 100 may be fully deployed in a cloud environment.
  • Cloud environment is an entity that utilizes basic resources to provide cloud services to users under the cloud computing model.
  • the cloud environment includes cloud data centers and cloud service platforms.
  • Cloud data centers include a large number of basic resources (including computing resources, storage resources, and network resources) owned by cloud service providers.
  • The computing resources included in a cloud data center can be a large number of computing devices (e.g., servers).
  • The AI platform 100 can be deployed independently on a server or virtual machine in the cloud data center, or it can be deployed in a distributed manner on multiple servers in the cloud data center, on multiple virtual machines in the cloud data center, or across servers and virtual machines in the cloud data center.
  • The AI platform 100 is abstracted by the cloud service provider into an AI cloud service on the cloud service platform and provided to users. After a user purchases the cloud service on the cloud service platform (for example, by pre-paying and then settling according to the final resource usage), the cloud environment uses the AI platform 100 deployed in the cloud data center to provide the AI platform cloud service to the user.
  • When using the AI platform cloud service, the user can determine the tasks to be completed by the AI model through an application program interface (API) or the GUI and upload the training image set and the inference data set to the cloud environment. The AI platform 100 in the cloud environment receives the user's task information, training data set, and inference data set, performs data preprocessing and AI model training, and uses the trained AI model to perform inference on the inference data set.
  • the AI platform returns the inference results of the inference data set, the target data determined in the inference data set, the inference accuracy of the AI model before the update and the inference accuracy of the updated AI model to the user through the API or GUI.
  • the user further chooses whether to deploy the updated AI model.
  • the trained AI model can be downloaded by users or used online to complete specific tasks.
  • When the AI platform 100 in the cloud environment is abstracted into AI cloud services and provided to users, it can be divided into two parts: a basic AI cloud service and a cloud service for updating AI models based on data distribution. The basic AI cloud service may be a service for training AI models. Users can initially purchase only the basic AI cloud service on the cloud service platform and purchase the cloud service for updating AI models when they need it. After the purchase, the cloud service provider provides an API for the cloud service for updating AI models and charges extra for that service according to the number of API calls. Of course, it is also possible to purchase only the cloud service for updating AI models.
  • the deployment of the AI platform 100 provided by the present application is relatively flexible. As shown in FIG. 3 , in another embodiment, the AI platform 100 provided by the present application may also be deployed in different environments in a distributed manner.
  • the AI platform 100 provided by this application can be logically divided into multiple parts, each part having different functions.
  • the AI platform 100 includes a user I/O module 101 , a model training module 102 , an AI model storage module 104 and a data storage module 105 .
  • Each part of the AI platform 100 may be deployed in any two or three environments of the terminal computing device, the edge environment and the cloud environment, respectively.
  • Terminal computing equipment includes: terminal server, smart phone, notebook computer, tablet computer, personal desktop computer, smart camera, etc.
  • the edge environment is an environment including a set of edge computing devices close to the terminal computing device, and the edge computing devices include: edge servers, edge small stations with computing capabilities, and the like.
  • Various parts of the AI platform 100 deployed in different environments or devices cooperate to implement functions such as determining and training the constructed AI model for users.
  • For example, the user I/O module 101 and the data storage module 105 in the AI platform 100 are deployed in the terminal computing device, and the model training module 102, the inference module 103, and the AI model storage module 104 in the AI platform 100 are deployed in an edge computing device in the edge environment.
  • the user sends the training data set and the inference data set to the user I/O module 101 in the terminal computing device, and the terminal computing device stores the training data set and the inference data set in the data storage module 105 .
  • The model training module 102 in the edge computing device updates the AI model based on the inference data set. It should be understood that this application does not limit which parts of the AI platform 100 are deployed in which environment. In actual application, the deployment can be adapted to the computing power of the terminal computing device, the resource occupation of the edge environment and the cloud environment, or specific application requirements. The above description uses the example in which the user inputs the training data set. Alternatively, the user may directly input the distribution of the training data set instead of the training data set itself, or the model training module 102 may analyze the existing AI model to determine the distribution of the training data set.
  • FIG. 4 is a schematic diagram of a hardware structure of a computing device 400 in which the AI platform 100 is deployed.
  • the computing device 400 shown in FIG. 4 includes a memory 401 , a processor 402 , a communication interface 403 and a bus 404 .
  • the memory 401, the processor 402, and the communication interface 403 realize the communication connection among each other through the bus 404.
  • the memory 401 may be a read only memory (Read Only Memory, ROM), a random access memory (Random Access Memory, RAM), a hard disk, a flash memory or any combination thereof.
  • The memory 401 can store a program. When the program stored in the memory 401 is executed by the processor 402, the processor 402 and the communication interface 403 are used to execute the methods by which the AI platform 100 trains the AI model for the user, determines that there is a difference between the data distribution of the inference data set and the data distribution of the training data set, and updates the AI model based on the inference data set.
  • the memory can also store datasets.
  • For example, a part of the storage resources in the memory 401 is allocated to the data storage module 105 for storing the data required by the AI platform 100, and a part of the storage resources in the memory 401 is allocated to the AI model storage module 104 for storing the AI model library.
  • the processor 402 may adopt a central processing unit (CPU), an application specific integrated circuit (ASIC), a graphics processing unit (GPU) or any combination thereof.
  • Processor 402 may include one or more chips.
  • Processor 402 may include an AI accelerator, such as a neural processing unit (NPU).
  • Communication interface 403 uses a transceiver module, such as a transceiver, to enable communication between computing device 400 and other devices or a communication network. For example, data may be acquired through the communication interface 403 .
  • Bus 404 may include pathways for communicating information between the various components of computing device 400 (e.g., memory 401, processor 402, communication interface 403).
  • online single-node self-update means that each computing node where the AI model is deployed is independent of each other, and only uses the data accessed by itself to update the AI model online; online multi-node collaborative update refers to increasing data communication between different computing nodes.
  • An appropriate operation mode can be selected according to the actual situation (the detailed process will be described later). For example, when the data distribution of the inference data set changes little relative to the data distribution of the training data set, online single-node self-update is used; when the data distribution of the inference data set changes greatly relative to the data distribution of the training data set, offline multi-node collaborative update is used; and so on.
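  • The mode selection in the example above can be sketched as a simple threshold rule. The numeric shift score and the threshold value are assumptions for illustration; the text only says that a small change leads to online single-node self-update and a large change to offline multi-node collaborative update.

```python
def choose_update_mode(distribution_shift, small_shift_threshold=0.1):
    """Pick an update mode from a shift score (larger means a bigger change).

    Both the score and the threshold are hypothetical; how the change in
    data distribution is measured is not specified at this point in the text.
    """
    if distribution_shift <= small_shift_threshold:
        return "online single-node self-update"
    return "offline multi-node collaborative update"


print(choose_update_mode(0.02))
print(choose_update_mode(0.60))
```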
  • An AI model update logic diagram, shown in Figure 6, is also provided:
  • Figure 6 includes data sources, offline updates, online updates, knowledge bases, etc.
  • the data sources include data of scenarios where the AI model update method is applied.
  • the knowledge base includes prior knowledge and/or domain knowledge for model training. Prior knowledge and/or domain knowledge can provide a basis for model update strategy selection, and can also provide AI model update process guidance for model update strategies.
  • For example, the user needs an AI model for detecting cats.
  • The AI platform learns from domain knowledge that cats and tigers both belong to the feline family and are similar, and the AI platform already has an AI model for detecting tigers. The AI platform can therefore take the AI model for detecting tigers and update it using cat images to obtain the AI model for detecting cats.
  • As another example, the AI platform can select an AI model architecture or a preset model that meets the user's requirements based on those requirements and the knowledge base.
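  • The cat/tiger example can be sketched as a lookup in a small knowledge base. Everything named here is hypothetical: the `FAMILY_OF` mapping stands in for the domain knowledge, and the model identifiers stand in for entries in the AI model storage module.

```python
# Hypothetical domain knowledge: which family each category belongs to.
FAMILY_OF = {"cat": "felidae", "tiger": "felidae", "dog": "canidae"}


def find_related_model(target, existing_models, family_of=FAMILY_OF):
    """Return an existing model whose category shares a family with the target."""
    target_family = family_of.get(target)
    if target_family is None:
        return None  # the knowledge base knows nothing about this category
    for category, model in existing_models.items():
        if category != target and family_of.get(category) == target_family:
            return model
    return None


# Usage: no cat detector exists yet, but a tiger detector does.
models = {"tiger": "tiger_detector_v1", "dog": "dog_detector_v1"}
selected = find_related_model("cat", models)
print(selected)  # the selected model would then be updated with cat images
```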
  • Online update refers to online adaptive updating of AI models.
  • Offline update refers to updating the AI model offline.
  • The data used can include real data (such as data in the inference data set) and sampled or generated labeled data adapted to the inference data set (referred to as generated data in Figure 6). The data used can also include data characteristics, which refer to statistics, distributions, and other information about various types of data.
  • the "offline update" in the box in the lower left corner of Figure 6 refers to the offline update process, including obtaining the data for updating the AI model, and updating the existing AI model based on the data. When updating, the parameters of the existing AI model can be adjusted, or a new AI model can be retrained as the updated AI model.
  • The offline update process also includes acquiring the initial model from the AI model storage module 104.
  • the data used may include data obtained from real data, sampled or generated annotation data adapted to the inference data set, and the like.
  • Step 701 the AI platform obtains an inference data set.
  • The inference data in the inference data set is input into the existing AI model to perform inference; the existing AI model can also be called the AI model before the update.
  • a user can input an inference data set to the AI platform, or the AI platform can obtain an inference data set from a connected inference data source, and the like.
  • the AI platform is connected to a camera, and the AI platform can continuously obtain data from the camera as the data in the inference data set, and the camera is the inference data source.
  • the data in this inference dataset is unlabeled data.
  • Step 702 the AI platform determines that there is a difference between the data distribution of the inference data set and the data distribution of the training data set.
  • the training data set is the data set used to train the existing AI model.
  • the AI platform may determine the data distribution of the inference data set and acquire the data distribution of the training data set each time the inference data set is obtained.
  • the AI platform can determine the data distribution of the inference data set and obtain the data distribution of the training data set each time the update cycle of the AI model is reached, and the update cycle of the AI model can be set by the user.
  • The AI platform then determines whether there is a difference between the data distribution of the inference data set and the data distribution of the training data set. If there is a difference, step 703 is executed; otherwise, no subsequent processing is performed. The reason is as follows: a difference between the two data distributions indicates, with high probability, that the existing AI model is not suitable for the inference data set, so the existing AI model is updated to adapt it to the inference data set. When there is no difference between the two data distributions, the existing AI model is still suitable for the inference data set, and the existing AI model need not be updated, which saves processing resources.
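  • One possible way to test for a difference between the two data distributions is to compare histograms of a surface feature (for example, image brightness). The sketch below uses the Jensen-Shannon divergence with a fixed threshold; this measure, the bin count, the threshold, and the assumption that the feature values lie in [0, 1] are illustrative choices rather than the method specified in this application.

```python
import math


def histogram(values, bins, low, high):
    """Normalized histogram of values assumed to lie in [low, high]."""
    counts = [0] * bins
    for v in values:
        index = int((v - low) / (high - low) * bins)
        counts[max(0, min(index, bins - 1))] += 1  # clamp edge values into range
    return [c / len(values) for c in counts]


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0 for identical, 1 for disjoint."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def distributions_differ(train_values, infer_values, threshold=0.1):
    """Return True when the feature histograms differ by more than the threshold."""
    p = histogram(train_values, bins=4, low=0.0, high=1.0)
    q = histogram(infer_values, bins=4, low=0.0, high=1.0)
    return js_divergence(p, q) > threshold


# Usage: hypothetical brightness values for daytime training images
# and night-time inference images.
day = [0.6, 0.7, 0.8, 0.9]
night = [0.05, 0.1, 0.15, 0.2]
print(distributions_differ(day, night), distributions_differ(day, day))
```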
  • The data distribution of the training data set may be obtained in any of the following ways: the AI platform obtains the training data set and determines its data distribution from it; the AI platform obtains the data distribution of the training data set from the user; or the AI platform derives the data distribution of the training data set from the inference results of the existing AI model on the inference data set.
  • Step 703 the AI platform uses the inference data set to update the existing AI model to obtain the updated AI model.
  • When the AI platform determines that there is a difference between the data distribution of the inference data set and the data distribution of the training data set, it can update the existing AI model based on the inference data set to obtain the updated AI model.
  • The existing AI model may be deployed on an inference platform (the inference platform can be a part of the AI platform, for example, including the inference module mentioned above, or it can be a platform independent of the AI platform). After step 703, the AI platform can compare the inference accuracy of the updated AI model with that of the existing AI model and, when it determines that the inference accuracy of the updated AI model is better than that of the existing AI model, deploy the updated AI model to the inference platform so that it performs inference in place of the existing AI model.
  • the AI platform can use a test data set adapted to the data distribution of the training data set and a test data set adapted to the data distribution of the inference data set to evaluate the updated AI model, obtaining the first evaluation result and the second evaluation result respectively. It then uses the test data set adapted to the data distribution of the inference data set and the test data set adapted to the data distribution of the training data set to evaluate the existing AI model, obtaining the third evaluation result and the fourth evaluation result respectively. If the first evaluation result is not significantly lower than the fourth evaluation result, and the second evaluation result is better than the third evaluation result, it is determined that the inference accuracy of the updated AI model is better than that of the existing AI model.
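  • the comparison of the four evaluation results can be sketched as follows. This is a minimal illustration, not the platform's implementation: the function name `should_deploy` and the `tolerance` parameter (a numeric stand-in for "not significantly lower") are assumptions, and each evaluation result is assumed to be a single score where higher is better.

```python
def should_deploy(first, second, third, fourth, tolerance=0.02):
    """Decide whether the updated model may replace the existing one.

    first  - updated model, test set adapted to the training data distribution
    second - updated model, test set adapted to the inference data distribution
    third  - existing model, test set adapted to the inference data distribution
    fourth - existing model, test set adapted to the training data distribution
    tolerance - hypothetical bound for "not significantly lower"
    """
    # The updated model must not lose much on data like the old training data.
    keeps_old_ability = first >= fourth - tolerance
    # The updated model must be strictly better on data like the new inference data.
    gains_on_new_data = second > third
    return keeps_old_ability and gains_on_new_data
```

Checking both test sets guards against an update that fits the new distribution but forgets the old one.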
  • the AI platform can provide the updated AI model to the inference platform (specifically, it can provide the entire content of the updated AI model, or it can provide the difference between the updated AI model and the existing AI model), and the inference platform can use the updated AI model in place of the existing AI model to perform inference processing.
  • when evaluating the existing AI model and the updated AI model, one or more of the precision-recall (PR) curve, the average precision (AP) index, the false alarm rate, and the missed report rate can be used.
  • different evaluation indicators can be used.
  • the PR curve is used here as an example. Of course, it can also be various charts such as box plots, confusion matrices, etc., not limited to the PR curve.
  • inference speed can also be used, that is, the AI platform uses the comprehensive results of inference accuracy and inference speed as the basis for evaluating AI models.
  • the user can decide whether to deploy the updated AI model.
  • when the AI platform determines that the inference accuracy of the updated AI model is better than the inference accuracy of the existing AI model, it can display the inference accuracy of the existing AI model and the inference accuracy of the updated AI model through the display interface. The user can determine whether to deploy the updated AI model based on the inference accuracy of the two.
  • the user can trigger the update of the AI model, and the AI platform will receive the user's update instruction for the existing AI model.
  • the AI platform can provide the updated AI model to the inference platform.
  • the AI platform can also display user-specified information or other information that helps to show the model characteristics of the updated AI model, such as one or more of the gradient change trend, the decreasing trend of the training loss function, the accuracy trend on the validation set, intermediate output results, or intermediate feature visualization.
  • in step 703, different update methods may be adopted depending on the difference:
  • if the difference meets the offline update condition, the AI platform uses the inference data set to update the existing AI model offline; if the difference does not meet the offline update condition, the AI platform uses the inference data set to update the existing AI model online.
  • the offline update conditions are preset. For example, for an image in this embodiment of the present application, the offline update conditions are that the reconstruction error of the image and the reconstruction error of the training image set are greater than a first value, that the prediction error between the predicted image corresponding to the feature of the image and the original image is greater than a second value, and so on. As another example, when the embodiment of the present application performs anomaly detection on switches and the data in the inference data set is the packet loss rate, the offline update condition is that the difference between the maximum packet loss rate in the inference data set and the maximum packet loss rate in the training data set is greater than a third value, and so on.
  • the AI platform can determine whether the difference between the data distribution of the inference data set and the data distribution of the training data set meets the offline update condition. If the difference meets the offline update condition, the data distribution of the inference data set differs greatly from that of the training data set.
  • in that case the existing AI model is no longer suitable for inference on the inference data set, and an online update would be slow and is therefore not appropriate, so the AI platform uses the inference data set to update the existing AI model offline.
  • if the difference does not meet the offline update condition, the difference between the data distribution of the inference data set and that of the training data set is not very large.
  • in that case, although the existing AI model is no longer suitable for inference on the inference data set, it can be updated online.
  • the AI platform then uses the inference data set to update the existing AI model online.
  • the update method can be flexibly selected based on the difference between the data distribution of the inference data set and the data distribution of the training data set.
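  • the choice between the two update methods can be sketched with the switch example above. This is a minimal sketch under stated assumptions: the helper names, the use of an absolute difference, and the threshold value passed as `third_value` are all hypothetical.

```python
def packet_loss_difference(inference_rates, training_rates):
    """Difference between the maximum packet loss rates of the two data sets.

    Taking the absolute value is an assumption made for this sketch.
    """
    return abs(max(inference_rates) - max(training_rates))


def choose_update_mode(difference, third_value):
    """Return "offline" when the offline update condition is met, else "online"."""
    return "offline" if difference > third_value else "online"


# Hypothetical packet loss rates and threshold.
mode = choose_update_mode(
    packet_loss_difference([0.01, 0.30], [0.02, 0.05]), third_value=0.1
)
```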
  • the parameters in the updated AI model may be determined based on the difference between the data distribution of the inference data set and the data distribution of the training data set, specifically:
  • the AI platform uses the difference between the data distribution of the inference data set and the data distribution of the training data set to determine the parameter change of the target part of the existing AI model; based on the current parameters and the parameter change of the target part in the existing AI model, it determines the parameters of the target part in the updated AI model.
  • the target part is a sub-model of the existing AI model, or all the sub-models of the existing AI model.
  • the AI platform can obtain the relationship between the change of the data distribution and the parameter change, and the relationship can be obtained by pre-modeling of the AI platform, or obtained from other platforms.
  • the AI platform can obtain the relationship between the change of data distribution and the amount of parameter change, use the difference between the data distribution of the inference data set and the data distribution of the training data set, and the relationship between the change of data distribution and the amount of parameter change to determine the parameter change of the target part quantity.
  • the AI platform uses the current parameters and parameter changes of the target part in the existing AI model to determine the parameters of the target part in the updated AI model, and replaces the parameters of the target part in the existing AI model with them, which yields the updated AI model.
  • in this way, some or all parameters of the AI model can be updated online.
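  • the online parameter update described above can be sketched as follows. The text does not specify the form of the relationship between the change of data distribution and the parameter change, so the linear relationship used here (`linear_relation` and its `sensitivities`) is purely an assumption for illustration, as are the function names and the scalar `distribution_difference`.

```python
def linear_relation(sensitivities):
    """Build a hypothetical relationship: parameter change = sensitivity * difference."""
    def relation(distribution_difference):
        return [s * distribution_difference for s in sensitivities]
    return relation


def update_target_parameters(current_params, distribution_difference, relation):
    """Add the parameter change of the target part to its current parameters."""
    changes = relation(distribution_difference)
    if len(changes) != len(current_params):
        raise ValueError("relation must return one change per parameter")
    # New parameter = current parameter + parameter change.
    return [p + d for p, d in zip(current_params, changes)]


new_params = update_target_parameters([1.0, 2.0], 0.5, linear_relation([2.0, -4.0]))
```

Only the parameters of the target part are passed in, so the rest of the model stays untouched.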
  • the existing AI model when updating the existing AI model, can be updated based on a data set (which can be referred to as the target data set later), and the processing is as follows:
  • the AI platform constructs a target data set based on the inference data set; uses the target data set to update the existing AI model.
  • the AI platform can use the inference data set to construct a target data set, for example by sampling some data in the inference data set as data in the target data set. Then the AI platform uses the target data set to update the existing AI model and obtain the updated AI model.
  • when the target data set is used to update the existing AI model, the user may or may not participate in the construction of the target data set.
  • the user can confirm the inference result of the inference data, or annotate the target data provided by the AI platform.
  • the target data set includes the labeled data, and the AI platform can use supervised learning technology to analyze the existing data.
  • the AI platform can also use unsupervised learning technology to update the existing AI model when necessary, and the AI platform can also fine-tune and adapt the existing AI model through transfer learning and domain adaptation technology.
  • the AI platform can use unsupervised learning technology to update the existing AI model, or use the semi-supervised learning technology to update the existing AI model.
  • the specific processing is as follows: 1.
  • the target data set can be constructed with the user's participation: the AI platform obtains the target data that meets the sample conditions in the inference data set and displays the target data through the display interface; it obtains the user's annotation results on the target data; and it constructs the target data set from the target data and the annotation results of the target data.
  • the sample condition is used to indicate the typical data in the inference dataset.
  • the AI platform can determine the target data that meets the sample conditions in the inference data set. Then the AI platform displays the target data through the display interface.
  • the target data can be unlabeled data in the inference data set, or data in the inference data set that carries labels produced by inference.
  • the user can label the target data on the display interface, and the AI platform will obtain the user's labeling result of the target data.
  • when the target data is data in the inference data set that carries annotations produced by inference, the user can confirm the existing annotations of the target data on the display interface, and the AI platform can likewise obtain the user's annotation results for the target data.
  • the AI platform can use the target data and the labeling results of the target data as part of the labeling data of the target dataset.
  • the AI platform can display the inference results of the inference data set through the display interface.
  • the user can label and confirm the inference results, and the AI platform can obtain the inference results confirmed by the labeling as the label data in the target data set.
  • the annotation of the new category can be added to the annotation confirmation result, and the annotation of the new category is also used as the annotation data in the target dataset.
  • the target data set may also include data that the AI platform samples and/or generates from the current labeled data and that suits the data distribution of the inference data set.
  • the current labeled data may include data in the training dataset, and may also include existing labeled data in other scenarios.
  • sampling here refers to finding, in the current labeled data, data that suits the data distribution of the inference data set.
  • generating here refers to generating data that suits the data distribution of the inference data set, based on the current labeled data and a data generation algorithm.
  • when the user participates, the user can annotate data in real time, and the AI platform can use the data annotated by the user for training in real time.
  • the process of using the target dataset to update the existing AI model can be as follows:
  • the AI platform obtains a strategy for updating the existing AI model according to the data characteristics of the data in the target data set; and updates the existing AI model according to the strategy.
  • the AI platform can use the data characteristics of the data in the target data set, and use the prior knowledge and/or domain knowledge in the knowledge base to select a strategy for updating the existing AI model. Then the AI platform uses this strategy to update the existing AI model to obtain the updated AI model.
  • transfer learning and/or few-shot learning techniques are used when the labeled data is relatively small.
  • AI platforms can update existing AI models using transfer learning and/or few-shot learning techniques.
  • AI platforms can use strong supervision techniques to update AI models.
  • strong supervision technology here is only a general term, and it is only for illustration. In the specific implementation, it may be a specific strong supervision training method. For different tasks and scenarios, the corresponding training methods are different.
  • the labeled data of the target data set is data that is sampled and/or generated from the current labeled data and that suits the data distribution of the inference data set.
  • the target data set includes unlabeled data and labeled data suitable for the data distribution of the inference data set.
  • the AI platform can use the unlabeled data in the target data set to optimize the feature extraction part of the existing AI model in an unsupervised manner, and then update the existing AI model according to the optimized feature extraction part and the labeled data in the target data set.
  • specifically, the AI platform uses the unlabeled data in the target data set and an unsupervised method to optimize the feature extraction part of the existing AI model, and obtains the optimized feature extraction part.
  • the unsupervised method here may be a self-supervised method or the like.
  • the AI platform can use the optimized feature extraction part and the labeled data adapted to the data distribution of the inference dataset to further update the existing AI model to obtain the updated AI model. In this way, the existing AI model can also be updated without the user participating.
  • the AI platform obtains the inference result and target data when inferring the inference data set, and provides only the inference result, not the target data, to the user (as shown in (a) in FIG. 8).
  • the AI platform updates the existing AI model, and after the existing AI model is updated, the updated AI model can be provided to the user (as shown in (b) in FIG. 8).
  • subsequently, the AI platform can use the updated AI model for inference, obtain the inference result and target data, and provide the inference result to the user (as shown in (c) in FIG. 8).
  • the target data may not be output.
  • the labeled data of the target data set is data that is sampled and/or generated from the current labeled data and that suits the data distribution of the inference data set.
  • the target data set includes unlabeled data and labeled data suitable for the data distribution of the inference data set.
  • the AI platform can use the existing AI model to label the unlabeled data in the target data set and obtain the labeling result of the unlabeled data.
  • the existing AI model is then updated according to the labeling result of the unlabeled data and the labeled data in the target data set.
  • specifically, the AI platform uses the existing AI model to label the unlabeled data in the target data set, and obtains the labeling result of the unlabeled data together with the corresponding confidence.
  • the labeling result here may contain inaccurate annotations, so it can be called a "pseudo-annotation".
  • the AI platform uses the annotation results with high confidence (such as annotation results with a confidence higher than a preset threshold) and the labeled data in the target data set to update the existing AI model and obtain the updated AI model.
  • for example, the AI platform first uses the labeling results with relatively high confidence to train the existing AI model and obtain the trained AI model. Then the AI platform can use the labeled data in the target data set (this part of the labeled data is sampled and/or generated labeled data suitable for the data distribution of the inference data set) to further update the trained AI model and obtain the updated AI model. In this way, the existing AI model can also be updated without the user participating.
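  • the selection of high-confidence pseudo-annotations can be sketched as follows. This is a minimal sketch: `predict` stands in for the existing AI model and is assumed to return a `(label, confidence)` pair, and the function name and default threshold are hypothetical.

```python
def select_pseudo_labels(unlabeled_data, predict, threshold=0.9):
    """Keep only pseudo-annotations whose confidence exceeds the preset threshold.

    predict(item) is assumed to return a (label, confidence) pair produced
    by the existing AI model.
    """
    selected = []
    for item in unlabeled_data:
        label, confidence = predict(item)
        if confidence > threshold:
            selected.append((item, label))
    return selected


# Toy stand-in for the existing model: the item itself is used as the confidence.
toy_predict = lambda item: ("irregular", item)
pseudo_labeled = select_pseudo_labels([0.95, 0.50, 0.91], toy_predict)
```

The selected pairs would be used for the first training stage, after which the labeled data in the target data set refines the result.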
  • the AI platform in the process of updating the AI model, with the participation of users, can use strong supervised learning technology to update the existing AI model based on the labeled data in the target dataset. Specifically, when the task has not changed, the strong supervised learning technology is directly used to update the existing AI model and obtain the updated AI model. In the case of changes in tasks (such as a new category that needs to be classified, etc.), if the number of labeled data for the new category in the target data set is relatively small, the AI platform can choose a small sample learning technology to update the existing AI model.
  • the AI platform obtains the inference result and target data when inferring the inference data set, and provides the inference result and target data to the user (Figure 9).
  • the user can provide the AI platform with the annotation confirmation of the inference result and/or the annotation result of the multi-target data (ie, the annotation data) (shown in (b) in FIG. 9 ).
  • the AI platform can update the existing AI model based on the annotation data provided by the user, obtain the updated AI model, and provide the updated AI model to the user (as shown in (c) in Figure 9).
  • subsequently, the AI platform can use the updated AI model for inference, obtain inference results and target data, and provide them to users (as shown in (d) in Figure 9).
  • the AI platform can also sample and/or generate, from the current labeled data, data suitable for the data distribution of the inference data set, in order to expand the labeled data in the target data set.
  • the representation learning method can also be used to tune the feature extraction part of the existing AI model.
  • the process for the AI platform to determine the difference between the data distribution of the inference data set and the data distribution of the training data set is as follows:
  • the AI platform obtains the probability model of the modeled data distribution.
  • the probability model can be obtained by modeling the AI platform itself, or the probability model can be obtained from other platforms.
  • the AI platform can use this probabilistic model to extract features from the training dataset and fit a mixture of Gaussian distributions.
  • the AI platform can fit a Gaussian distribution on the features of the training data set to determine the likelihood of the inference data set.
  • the likelihood represents the difference between the data distribution of the training data set and the data distribution of the inference data set: the smaller the likelihood, the larger the difference.
  • the approach used can be a Gaussian mixture distribution or another parametric distribution fitting method; of course, other distributions, non-parametric distribution fitting algorithms, complex probabilistic graphical models, and the like can also be used.
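  • the likelihood-based comparison can be sketched as follows. To stay short, the sketch fits a single Gaussian with a diagonal covariance rather than a Gaussian mixture, which is a simplification and not the fitting method of the embodiment; the function names and the variance floor `min_var` are assumptions.

```python
import math
from statistics import fmean, pvariance


def fit_diagonal_gaussian(features, min_var=1e-6):
    """Fit one mean and one variance per feature dimension of the training features."""
    dims = list(zip(*features))
    means = [fmean(d) for d in dims]
    # Floor the variance so that a constant dimension does not divide by zero.
    variances = [max(pvariance(d), min_var) for d in dims]
    return means, variances


def mean_log_likelihood(features, means, variances):
    """Average log-likelihood of feature vectors under the fitted Gaussian."""
    total = 0.0
    for row in features:
        for x, m, v in zip(row, means, variances):
            total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return total / len(features)


means, variances = fit_diagonal_gaussian([[0.0], [1.0], [2.0]])
near = mean_log_likelihood([[1.0]], means, variances)
far = mean_log_likelihood([[10.0]], means, variances)
```

A lower value for the inference features indicates a larger difference from the training data distribution.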
  • Step 1001: the AI platform obtains an inference data set.
  • Step 1002: the AI platform determines that there is a difference between the data distribution of the inference data set and the data distribution of the training data set.
  • Step 1003: the AI platform uses the inference data set to update the current existing AI model to obtain the updated AI model.
  • Step 1004: the updated AI model is deployed, and the updated AI model becomes the current existing AI model.
  • Step 1005: the AI platform returns to step 1002.
  • Figure 10 only describes one pass of the loop; as long as the update of the AI model is not stopped, the process keeps looping. Specifically, once the AI platform is connected to the inference data source, it continuously obtains data for the inference data set from the inference data source, determines the data distribution of the inference data set, and then judges whether there is a difference between the data distribution of the inference data set and the data distribution of the training data set. When there is a difference, it updates the existing AI model, deploys the updated AI model, and then returns to step 1002.
  • Steps 1001 and 1002 are asynchronous with steps 1003 and 1004, because the data in the inference data set is always updated, and it will always be judged whether there is a difference between the data distribution of the inference data set and the data distribution of the training data set. Steps 1003 and 1004 are executed only when there is a difference between the data distribution of the inference data set and the data distribution of the training data set.
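  • the loop of steps 1001 to 1005 can be sketched as follows. For brevity the sketch runs the steps one after another, whereas the embodiment runs steps 1001 and 1002 asynchronously with steps 1003 and 1004; all function and parameter names are hypothetical placeholders for the platform's own components.

```python
def run_update_cycles(batches, has_difference, update_model, deploy, model):
    """Run the check/update/deploy loop over successive batches of inference data."""
    for inference_data in batches:                       # step 1001
        if has_difference(inference_data):               # step 1002
            model = update_model(model, inference_data)  # step 1003
            deploy(model)                                # step 1004
        # step 1005: continue with the next batch, i.e. return to step 1002
    return model


# Toy example: the "model" is a version counter, a batch differs if it holds a value above 3.
deployed = []
final_model = run_update_cycles(
    batches=[[1], [5], [1]],
    has_difference=lambda batch: max(batch) > 3,
    update_model=lambda model, batch: model + 1,
    deploy=deployed.append,
    model=0,
)
```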
  • the user can input a stop command to the AI platform to make the AI platform stop updating the existing AI model. Alternatively, if the AI platform determines that the difference between the data distribution of the inference data set and the data distribution of the training data set is relatively small, and that the accuracy gain of the updated AI model over the existing AI model (that is, the AI model before the update) is relatively small, the AI platform can actively stop updating the AI model.
  • after stopping updating the AI model, if the AI platform again determines that the difference between the data distribution of the inference data set and the data distribution of the training data set is relatively large, it can provide the user with an update prompt message (for example, send the update prompt message to the user's terminal, or display the update prompt message through the display interface). When the AI platform receives the confirmation update instruction input by the user, it can restart the process of updating the AI model.
  • the AI platform can restart the process of updating the AI model.
  • alternatively, when the AI platform again determines that the data distribution of the inference data set is quite different from the data distribution of the training data set, it can actively start the process of updating the AI model.
  • the method for updating the AI model may be applied to an application scenario of recognizing actions in images. For example, it is applied to the identification of irregular sorting actions in logistics scenarios.
  • the AI platform can perform inference based on the existing AI model and update the existing AI model.
  • the inference data set is the user's surveillance video data
  • the existing AI model is a model that can perform irregular sorting action recognition.
  • the training dataset is also video data.
  • the AI platform includes: a video inference module (that is, the inference module 103 described above), a storage service module (that is, the data storage module 105 described above), a model training module (that is, the model training module 102 described above), a user I/O module (that is, the user I/O module 101 described above), and the like.
  • the video inference module is used to infer the inference data set obtained from the camera
  • the storage service module is used to store the inference data set, etc.
  • the model training module is used to update the existing AI model
  • the user I/O module is used to interact with users.
  • the process of updating the AI model may include the process of determining the difference between the data distribution of the inference data set and the data distribution of the training data set, the process of updating the existing AI model online, the process of inferring the inference data set, and the process of providing target data for users.
  • the AI platform can extract video frames in the inference data set, extract deep features of video frames through a stored deep neural network, and/or extract non-deep features through other algorithms.
  • non-deep features include frame difference, optical flow, etc.
  • Frame difference refers to the result of the subtraction of two adjacent frames of images
  • optical flow refers to the positional change relationship of pixels between two adjacent frames of images, which is the displacement field of pixel movement.
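  • the frame difference can be sketched as follows. The sketch assumes each frame is a grayscale image stored as a list of rows of pixel values; the function name is hypothetical, and optical flow is not covered because it requires a dedicated estimation algorithm.

```python
def frame_difference(previous_frame, next_frame):
    """Subtract two adjacent frames pixel by pixel (next minus previous)."""
    return [
        [b - a for a, b in zip(row_prev, row_next)]
        for row_prev, row_next in zip(previous_frame, next_frame)
    ]


diff = frame_difference([[1, 2], [3, 4]], [[2, 2], [1, 7]])
```

Pixels that do not change between the two frames yield zero, so the non-zero entries mark where motion occurred.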
  • the AI platform can obtain short-term videos of appropriate duration as input by sliding windows or segmenting video segments.
  • the methods used for extracting video frames in the inference data set include but are not limited to extracting all video frames, uniform sampling, non-uniform sampling, multi-scale sampling, and the like.
  • the non-uniform sampling may be the selection of video frames based on key frames, inter-frame similarity, etc., and specifically may be to select key frames, video frames whose inter-frame similarity is less than a certain value, and the like.
  • the multi-scale sampling may be to obtain multiple short videos at different sampling intervals, extract deep features from each, and merge the extracted features to obtain the deep features described above.
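  • uniform sampling and similarity-based non-uniform sampling can be sketched as follows. The similarity measure (one over one plus the mean absolute pixel difference) and the choice to compare each frame with the most recently kept frame are assumptions made for this sketch; frames are assumed to be flat lists of pixel values.

```python
def uniform_sample(frames, step):
    """Keep every step-th frame."""
    return frames[::step]


def similarity(frame_a, frame_b):
    """Hypothetical similarity in (0, 1]: 1.0 means identical frames."""
    diffs = [abs(a - b) for a, b in zip(frame_a, frame_b)]
    return 1.0 / (1.0 + sum(diffs) / len(diffs))


def sample_by_similarity(frames, max_similarity):
    """Keep a frame only if it is sufficiently dissimilar to the last kept frame."""
    if not frames:
        return []
    kept = [frames[0]]
    for frame in frames[1:]:
        if similarity(kept[-1], frame) < max_similarity:
            kept.append(frame)
    return kept


frames = [[0, 0], [0, 0], [10, 10], [10, 10], [0, 0]]
uniform = uniform_sample(frames, 2)
non_uniform = sample_by_similarity(frames, 0.5)
```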
  • Deep neural networks include, but are not limited to, 2D/3D convolutional neural networks, recurrent neural networks, long short-term memory networks, two-stream convolutional neural networks, etc., and combinations and variants thereof.
  • a video prediction model is stored in the AI platform, and the video prediction model can be established by the AI platform itself or obtained from other platforms.
  • the AI platform can use the above-mentioned extracted features (deep features and/or non-deep features) to predict future video frames. Then the AI platform calculates the prediction error between the predicted video frame and the actual video frame in the inference data set, and uses the prediction error of one prediction, or the average of the prediction errors of multiple predictions, to represent the difference between the data distribution of the training data set and the data distribution of the inference data set. Specifically, when predicting future video frames, the AI platform can predict one frame or multiple frames.
  • the above video prediction models include, but are not limited to, 2D/3D convolutional neural networks, recurrent neural networks, generative adversarial networks, variational autoencoders, etc., and combinations and variants thereof.
  • Video interpolation refers to the prediction of intervening video frames from non-adjacent video frames.
  • Video reconstruction refers to reconstructing the current video frame based on the features of the current video frame, comparing the reconstructed video frame with the current video frame to obtain the reconstruction error, and using the reconstruction error of one video frame, or the average of the reconstruction errors of multiple video frames, to represent the difference between the data distribution of the training data set and the data distribution of the inference data set.
  • Calculating the similarity between frames refers to calculating the similarity of two adjacent video frames, and using the similarity of one pair of adjacent video frames, or the average of the similarities of multiple pairs of adjacent video frames, to represent the difference between the data distribution of the training data set and the data distribution of the inference data set.
  • each video frame can be used for modeling in the spatial dimension, and the local area of each video frame can also be used for modeling, or the two can be combined.
  • in the temporal dimension, a similar global approach (considering a whole video), local approach (considering part of a whole video), or a combination of both can be adopted.
  • the AI platform can use any metric that meets the requirements of the task, including but not limited to the L1 distance (the difference between the two video frames), the L2 distance (the square of the difference between the two video frames), the Wasserstein distance (also known as the earth mover's distance), learnable metrics, etc., and combinations and variants thereof.
  • the difference between the data distribution of the inference data set and the data distribution of the training data set is directly represented by the prediction error.
  • the result of performing linear or nonlinear transformation on the prediction error can also be used.
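  • the prediction error averaged over several predictions can be sketched as follows. Frames are assumed to be flat lists of pixel values, the function names are hypothetical, and the L2 variant follows the description above (sum of squared differences, without a square root).

```python
def l1_distance(frame_a, frame_b):
    """Sum of absolute pixel differences between two frames."""
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b))


def l2_distance(frame_a, frame_b):
    """Sum of squared pixel differences between two frames."""
    return sum((a - b) ** 2 for a, b in zip(frame_a, frame_b))


def mean_prediction_error(predicted_frames, actual_frames, metric=l1_distance):
    """Average the per-frame prediction error over all predicted frames."""
    errors = [metric(p, a) for p, a in zip(predicted_frames, actual_frames)]
    return sum(errors) / len(errors)


predicted = [[1, 2], [3, 3]]
actual = [[1, 4], [0, 3]]
error_l1 = mean_prediction_error(predicted, actual)
error_l2 = mean_prediction_error(predicted, actual, metric=l2_distance)
```

Either value, or a linear or nonlinear transformation of it, could then serve as the measure of distribution difference.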
  • the AI platform can update the existing AI model online.
  • the AI platform can input the difference between the data distribution of the inference data set and the data distribution of the training data set to the parameter generator (the parameter generator models the correspondence between the parameter change and the data distribution difference); the output of the parameter generator is the parameter change of the target part mentioned above.
  • the AI platform adds the current parameter value of the target part in the existing AI model to the parameter change, and obtains the parameter value of the target part in the updated AI model.
  • the difference between the data distribution of the inference data set and the data distribution of the training data set is used to determine the parameter change of the target part.
  • part or all of the data in the inference data set can also be used together with the difference to determine the parameter change of the target part, which is not limited in the embodiments of the present application.
  • the existing AI model (or the updated AI model) infers the video frames in the inference data set, outputs the inference result of the action recognition, and the AI platform displays the inference result through the display interface.
  • the AI platform can obtain target data that meets the sample conditions in the inference data set according to the difference between the data distribution of the inference data set and the data distribution of the training data set.
  • specifically, the AI platform can use the difference between the data distribution of the inference data set and the data distribution of the training data set to screen out the target data that satisfies the sample condition.
  • target data that satisfies the sample condition is suitable for updating the existing AI model.
  • the AI platform can use uncertainty to determine the target data, and the process is as follows:
  • the AI platform can use the existing AI model to determine the uncertainty of each piece of inference data in the inference data set, and the uncertainty can be represented by any one of action category probability, information entropy, mutual information, and variance.
  • the AI platform uses the difference between the data distribution of the inference data set and the data distribution of the training data set, and the corresponding uncertainty of each inference data, to obtain the target data that meets the sample conditions in the inference data set.
  • the uncertainty is represented by the action category probability, and the difference between the data distribution of the inference data set and the data distribution of the training data set is represented by the prediction error; the L1 distance is used to measure the prediction error, and the entropy of the action category probability is used to measure the uncertainty.
  • in Equation (1), x represents the target data (also called a sample); in the first term, I(x) represents the actual video frame and Î(x) represents the predicted video frame.
  • p_i(x) represents the probability of the i-th action category corresponding to x.
  • the meaning of formula (1) is: x* is the value of x at which the expression in formula (1) takes its maximum value. In this way, the typical data in the inference data set can be obtained; such data is more suitable for updating the existing AI model, so that the inference accuracy of the updated AI model is better.
  • the AI platform provides the target data to the user, the user labels x*, and the labeling result y* is obtained.
  • the AI platform updates the existing AI model based on {x*, y*}.
  • the AI platform can use supervised learning techniques to update the existing AI model. It can either optimize the existing AI model based on the target data set, or directly retrain an AI model based on the target data set and use it as the updated AI model.
  • the optimization algorithms used to update parameters include, but are not limited to, stochastic gradient descent, conjugate gradient descent, and the like.
  • on the one hand, the adaptive ability of the AI model itself is used to adjust local parameters; on the other hand, new annotation data can be obtained by interacting with the user, and the AI model can be updated as a whole offline, so as to continuously adapt to the new data distribution and ensure the accuracy of AI model inference.
  • the AI platform can control the AI model to continuously and automatically update without requiring users to have algorithm-related expertise.
  • the AI model may be updated with or without user participation.
  • when the AI model is updated, the update is not limited by the amount of labeled data or by whether the task changes, and it can be applied to any form of inference data set, such as batch data (100 consecutive pictures), streaming data (video data produced step by step), etc.
  • FIG. 15 is a structural diagram of an apparatus for updating an AI model provided by an embodiment of the present application.
  • the apparatus can be implemented by software, hardware or a combination of the two to become a part or all of the apparatus.
  • the apparatus for updating the AI model may be part or all of the aforementioned AI platform 100.
  • the apparatus can implement the process described in FIG. 7 in the embodiment of the present application, and the apparatus includes: an acquisition module 1510, a determination module 1520, and an update module 1530, wherein:
  • the acquisition module 1510 is used to acquire an inference data set, wherein the inference data in the inference data set is to be input into an existing AI model to perform inference; it can specifically be used to implement the acquisition function of step 701 and execute the implicit steps included in step 701;
  • the determination module 1520 is configured to determine that there is a difference between the data distribution of the inference data set and the data distribution of the training data set, wherein the training data set is the data set used for training the existing AI model; it can specifically be used to implement the determination function of step 702 and execute the implicit steps included in step 702;
  • the updating module 1530 is used to update the existing AI model by using the inference data set to obtain the updated AI model; it can specifically be used to implement the update function of step 703 and execute the implicit steps included in step 703.
  • the existing AI model is deployed on an inference platform, and the determining module 1520 is further configured to compare the inference accuracy of the updated AI model with the inference accuracy of the existing AI model, and determine that the inference accuracy of the updated AI model is better than the inference accuracy of the existing AI model;
  • the updating module 1530 is further configured to deploy the updated AI model to the inference platform, so that the updated AI model replaces the existing AI model to perform inference.
  • the apparatus further includes: a display module 1540, configured to display, through the display interface, the inference accuracy of the existing AI model and the inference accuracy of the updated AI model before the updated AI model is deployed to the inference platform to perform inference in place of the existing AI model;
  • a display module 1540 configured to deploy the updated AI model to the inference platform, and execute it instead of the existing AI model Before inference, display the inference accuracy of the existing AI model and the inference accuracy of the updated AI model through the display interface;
  • the receiving module 1550 is configured to receive an update instruction of the existing AI model from the user.
  • the update module 1530 is used for:
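  • if the difference meets the offline update condition, the existing AI model is updated offline by using the inference data set;
  • if the difference does not meet the offline update condition, the existing AI model is updated online by using the inference data set.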
  • the update module 1530 is used for:
  • the parameters of the target part in the updated AI model are determined.
  • the update module 1530 is used for:
  • the existing AI model is updated using the target data set.
  • the update module 1530 is used for:
  • in the inference data set, obtain target data that satisfies the sample conditions, and display the target data through a display interface;
  • a target data set is constructed according to the target data and the labeling result of the target data.
  • the update module 1530 is used for:
  • according to the difference between the data distribution of the inference data set and the data distribution of the training data set, obtain, in the inference data set, target data that satisfies the sample conditions, wherein the target data is suitable for updating the existing AI model.
  • the target data set further includes annotation data that is sampled and/or generated from the current annotation data and that suits the data distribution of the inference data set, where the current annotation data includes the data in the training data set.
  • the target data set includes unlabeled data and labeled data suitable for the data distribution of the inference data set;
  • the update module 1530 is used for:
  • the feature extraction part in the existing AI model is optimized in an unsupervised manner
  • the existing AI model is updated according to the optimized feature extraction part and the labeled data in the target data set.
  • the target data set includes unlabeled data and labeled data suitable for the data distribution of the inference data set;
  • the update module 1530 is used for:
  • the existing AI model is updated according to the labeling result of the unlabeled data and the labeling data in the target data set.
  • the update module 1530 is used for:
  • the existing AI model is updated.
  • the obtaining module 1510 is further configured to obtain the update cycle of the AI model input by the user;
  • the determining module 1520 is used for:
  • according to the update cycle of the AI model, it is determined that there is a difference between the data distribution of the inference data set and the data distribution of the training data set.
  • the division of modules in the embodiments of the present application is schematic, and is only a logical function division. In actual implementation, there may also be other division methods.
  • the functional modules in the various embodiments of the present application may be integrated into one processor, may each exist physically alone, or two or more modules may be integrated into one module.
  • the above-mentioned integrated modules can be implemented in the form of hardware, and can also be implemented in the form of software function modules.
  • the present application also provides a computing device 400 as shown in FIG. 4 .
  • the processor 402 in the computing device 400 reads the program and the image set stored in the memory 401 to execute the method performed by the aforementioned AI platform.
  • since the modules in the AI platform 100 provided by the present application can be distributed on multiple computers in the same environment or in different environments, the present application also provides a computing device as shown in FIG. 17; the computing device includes a plurality of computers 1700, each computer 1700 including a memory 1701, a processor 1702, a communication interface 1703 and a bus 1704.
  • the memory 1701 , the processor 1702 , and the communication interface 1703 are connected to each other through the bus 1704 for communication.
  • the memory 1701 may be read-only memory, static storage device, dynamic storage device, or random access memory.
  • the memory 1701 can store programs, and when the programs stored in the memory 1701 are executed by the processor 1702, the processor 1702 and the communication interface 1703 are used to execute part of the method for the AI platform to update the AI model.
  • the memory can also store image sets. For example, a part of the storage resources in the memory 1701 is divided into an inference data set storage module for storing the inference data set, and a part of the storage resources in the memory 1701 is divided into an AI model storage module for storing the AI model library.
  • the processor 1702 can be a general-purpose central processing unit, a microprocessor, an application-specific integrated circuit, a graphics processor, or one or more integrated circuits.
  • the communication interface 1703 uses a transceiver module such as, but not limited to, a transceiver to enable communication between the computer 1700 and other devices or a communication network. For example, inference datasets can be obtained through communication interface 1703 .
  • Bus 1704 may include a pathway for communicating information between various components of computer 1700 (e.g., memory 1701, processor 1702, communication interface 1703).
  • a communication path is established between each of the above computers 1700 through a communication network.
  • Each computer 1700 runs any one or more of the user I/O module 101 , the model training module 102 , the inference module 103 , the AI model storage module 104 or the data storage module 105 .
  • Any computer 1700 may be a computer (such as a server) in a cloud data center, or a computer in an edge data center, or a terminal computing device.
  • the above-mentioned embodiments may be implemented in whole or in part by software, hardware or a combination thereof.
  • when implemented in software, they can be implemented in whole or in part in the form of a computer program product.
  • the computer program product that provides the AI platform includes one or more computer instructions of the AI platform.
  • when these computer program instructions are loaded and executed on a computer, the flows or functions described in FIG. 5, FIG. 11, FIG. 14 or FIG. 15 according to the embodiments of the present application are generated in whole or in part.
  • the computer may be a general purpose computer, special purpose computer, computer network, or other programmable device.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, optical fiber, twisted pair) or wirelessly (e.g., infrared, radio, microwave). The computer-readable storage medium stores the computer program instructions that provide the AI platform.
  • the computer-readable storage medium can be any medium that can be accessed by a computer, or a data storage device, such as a server or data center, that integrates one or more media.
  • the medium can be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., optical disc), or a semiconductor medium (e.g., SSD).
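The target-data selection described earlier around Equation (1) can be illustrated with a small sketch. The equation itself is not reproduced in this text, so the code assumes the form suggested by the surrounding definitions: x* maximizes the L1 prediction error between the actual frame I(x) and the predicted frame, plus the entropy of the action category probabilities p_i(x). The unweighted sum, the flat-list frame representation, and the dictionary keys are all assumptions made for illustration.

```python
import math

def l1_prediction_error(actual_frame, predicted_frame):
    """L1 distance between an actual frame and its predicted frame (flat lists of pixel values)."""
    return sum(abs(a - p) for a, p in zip(actual_frame, predicted_frame))

def entropy(probs):
    """Shannon entropy of the action category probabilities p_i(x)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def select_target(samples):
    """Return the sample x* with the largest (prediction error + entropy) score.

    Each sample is a dict with the hypothetical keys 'actual', 'predicted' and 'probs'.
    """
    def score(x):
        return l1_prediction_error(x["actual"], x["predicted"]) + entropy(x["probs"])
    return max(samples, key=score)

samples = [
    # Well-predicted frame, confident classification: low score.
    {"id": "a", "actual": [0.2, 0.4], "predicted": [0.2, 0.4], "probs": [0.98, 0.01, 0.01]},
    # Poorly predicted frame, uncertain classification: high score.
    {"id": "b", "actual": [0.2, 0.4], "predicted": [0.6, 0.1], "probs": [0.40, 0.30, 0.30]},
]
x_star = select_target(samples)  # the sample that would be shown to the user for labeling
```

In this toy input, sample "b" is both far from its predicted frame and classified with high uncertainty, so it is the one selected for labeling.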

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Computer Security & Cryptography (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

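The present application provides a method, apparatus, computing device, and storage medium for updating an AI model, and belongs to the field of artificial intelligence technologies. The method includes: obtaining an inference data set, where the inference data in the inference data set is to be input into an existing AI model to perform inference; determining that there is a difference between the data distribution of the inference data set and the data distribution of a training data set, where the training data set is the data set used to train the existing AI model; and updating the existing AI model by using the inference data set to obtain an updated AI model. With the present application, when a difference between the data distribution of the inference data set and that of the training data set is detected, the existing AI model can be updated promptly, rather than only after the user discovers that the inference accuracy of the AI model has dropped, so the AI model can be updated in a timely manner.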

Description

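Method, apparatus, computing device, and storage medium for updating an AI model
This application claims priority to Chinese Patent Application No. 202010732241.0, filed with the China National Intellectual Property Administration on July 27, 2020 and entitled "Method, apparatus, computing device, and storage medium for updating an AI model", which is incorporated herein by reference in its entirety.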
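Technical Field
The present application relates to the field of artificial intelligence (AI) technologies, and in particular, to a method, apparatus, computing device, and storage medium for updating an AI model.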
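Background
With the development of artificial intelligence technologies, AI models are used more and more widely. At present, the parameters of an AI model are mainly learned from a large amount of data by using algorithms such as machine learning. Because an AI model is obtained by learning from a large amount of data, the resulting model has a certain generalization capability; however, when the data distribution in the scenario where the model is used differs considerably from the distribution of its training data, the performance of the AI model is affected and its accuracy decreases, and the larger the distribution difference, the more significant the decrease. The actual application environment of an AI model changes dynamically, so in real application scenarios the data distribution may keep changing, which may prevent the AI model from maintaining stable accuracy as the scenario changes. To enable the accuracy of the AI model to adapt to scenario changes, the AI model needs to be updated adaptively.
In the related art, an AI platform is developed, and the user updates the model through the AI platform on their own. Specifically, the AI platform provides the functions required from training an AI model to deploying it, which usually include data labeling, data management, model training, and model inference. The user can train an AI model through the AI platform. Later, when the user determines that the current AI model is not suited to the current scenario, the user can provide a data set of the current scenario and update the AI model so that the updated AI model suits the current scenario, and then apply the updated AI model to the current scenario.
With the AI model update method of the related art, the AI model is updated only after its accuracy has decreased and the user has noticed that decrease in the current scenario, so the AI model is not updated in a timely manner.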
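Summary
The present application provides a method, apparatus, computing device, and storage medium for updating an AI model, so as to update the AI model in a timely manner.
In a first aspect, the present application provides a method for updating an AI model. The method includes: obtaining an inference data set, where the inference data in the inference data set is to be input into an existing AI model to perform inference; determining that there is a difference between the data distribution of the inference data set and the data distribution of a training data set, where the training data set is the data set used to train the existing AI model; and updating the existing AI model by using the inference data set to obtain an updated AI model.
In the solution shown in the present application, the method for updating an AI model may be performed by an AI platform. Because the method updates the existing AI model when the data distribution of the inference data set differs from that of the training data set, instead of waiting until the user notices that the accuracy of the AI model has decreased, the AI model can be updated in a more timely manner.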
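In a possible implementation, the existing AI model is deployed on an inference platform, and the method further includes: comparing the inference accuracy of the updated AI model with that of the existing AI model, and determining that the inference accuracy of the updated AI model is better than that of the existing AI model; and deploying the updated AI model to the inference platform to perform inference in place of the existing AI model.
In the solution shown in the present application, the existing AI model may be deployed on an inference platform, which may be part of the AI platform or independent of it. The AI platform may obtain the inference accuracy of the updated AI model and of the existing AI model and compare the two. When the inference accuracy of the updated AI model is better than that of the existing AI model, the updated AI model is deployed to the inference platform, which then uses the updated AI model instead of the existing AI model to perform inference. In this way, the existing AI model is replaced only when the updated AI model has better inference accuracy, so inference performed with the updated AI model is more accurate.
In a possible implementation, before the updated AI model is deployed to the inference platform, the method further includes: displaying, through a display interface, the inference accuracy of the existing AI model and the inference accuracy of the updated AI model; and receiving an update instruction of the user for the existing AI model.
In the solution shown in the present application, before deploying the updated AI model to the inference platform, the AI platform may display the inference accuracy of the existing AI model and of the updated AI model through a display interface, and the user can choose whether to update the existing AI model. When the AI platform receives the user's update instruction for the existing AI model, it may deploy the updated AI model to the inference platform. In this way, the user decides whether to deploy the updated AI model, which improves the user experience.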
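In a possible implementation, updating the existing AI model by using the inference data set includes: if the difference meets an offline update condition, updating the existing AI model offline by using the inference data set; and if the difference does not meet the offline update condition, updating the existing AI model online by using the inference data set.
In the solution shown in the present application, the AI platform may determine whether the difference between the data distribution of the inference data set and that of the training data set meets the offline update condition. If it does, the AI platform may update the existing AI model offline by using the inference data set; if it does not, the AI platform may update the existing AI model online by using the inference data set. Because a different update mode can be chosen depending on the difference, the update time can be reduced.
In a possible implementation, updating the existing AI model online by using the inference data set includes: determining a parameter change amount of a target part of the existing AI model by using the difference between the data distribution of the inference data set and that of the training data set; and determining the parameters of the target part in the updated AI model based on the current parameters of the target part in the existing AI model and the parameter change amount. In this way, an online update can be accomplished by updating the parameters of only some parts of the AI model.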
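As a rough illustration of the online update just described, the sketch below shifts only the parameters of a target part by a change amount and leaves all other parameters untouched. The parameter names, the choice of a normalization-style target part, and the rule that derives the change amount from a shift in a summary statistic are assumptions for illustration only; the text does not prescribe how the change amount is computed.

```python
def online_update(params, target_keys, deltas):
    """Return new parameters in which only the target part is shifted by its change amount."""
    updated = dict(params)  # parameters outside the target part are carried over unchanged
    for key in target_keys:
        updated[key] = params[key] + deltas[key]
    return updated

# Hypothetical existing model: a normalization "target part" plus one classifier weight.
existing = {"norm_mean": 0.50, "norm_scale": 1.00, "classifier_w": 0.80}

# Assumption for illustration: the change amount equals the shift of a summary
# statistic between the training distribution and the inference distribution.
train_mean, inference_mean = 0.50, 0.62
deltas = {"norm_mean": inference_mean - train_mean, "norm_scale": 0.0}

updated = online_update(existing, ["norm_mean", "norm_scale"], deltas)
```

Because the function copies the parameter dictionary, the existing model stays available for comparison with the updated one before any deployment decision.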
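In a possible implementation, updating the existing AI model by using the inference data set includes: constructing a target data set from the inference data set; and updating the existing AI model by using the target data set.
In a possible implementation, constructing a target data set from the inference data set includes: obtaining, from the inference data set, target data that satisfies a sample condition, and displaying the target data through a display interface; obtaining the user's labeling result for the target data; and constructing the target data set from the target data and its labeling result.
In the solution shown in the present application, the AI platform may select, from the inference data set, target data that satisfies the sample condition and present it to the user so that the user labels it. The AI platform may then construct the target data set based on the target data and the user's labeling result. Because the target data set used to update the existing AI model includes user-labeled target data that satisfies the sample condition, the updated AI model is better suited to inference on the inference data set.
In a possible implementation, obtaining, from the inference data set, target data that satisfies the sample condition includes: obtaining the target data from the inference data set according to the difference between the data distribution of the inference data set and that of the training data set, where the target data is suitable for updating the existing AI model. Selecting the target data on the basis of this difference makes it better suited to updating the existing AI model.
In a possible implementation, the target data set further includes labeled data that is sampled and/or generated from the current labeled data and that suits the data distribution of the inference data set, where the current labeled data includes the data in the training data set. In this way, labeled data suited to the data distribution of the inference data set can also be obtained from existing labeled data, so the target data set contains more labeled data, which in turn improves the inference accuracy of the updated AI model.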
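In a possible implementation, the target data set includes unlabeled data and labeled data that suit the data distribution of the inference data set, and updating the existing AI model by using the target data set includes: optimizing the feature extraction part of the existing AI model in an unsupervised manner by using the unlabeled data in the target data set; and updating the existing AI model based on the optimized feature extraction part and the labeled data in the target data set. In this way, the feature extraction part of the AI model is optimized first, and the existing AI model is then updated.
In a possible implementation, the target data set includes unlabeled data and labeled data that suit the data distribution of the inference data set, and updating the existing AI model by using the target data set includes: labeling the unlabeled data in the target data set by using the existing AI model to obtain a labeling result for the unlabeled data; and updating the existing AI model based on that labeling result and the labeled data in the target data set. Because the unlabeled data in the target data set can be labeled, the target data set contains more labeled data, which improves the inference accuracy of the updated AI model.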
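The second variant above, in which the existing AI model labels the unlabeled data before the update, might look roughly like the following sketch. The model interface (a callable returning a class-to-probability mapping) and the toy classifier are hypothetical stand-ins; a real platform would plug in its own model and training routine.

```python
def pseudo_label(model, unlabeled):
    """Label each unlabeled item with the class that the existing model scores highest."""
    labeled = []
    for item in unlabeled:
        probs = model(item)  # hypothetical interface: returns {class_name: probability}
        label = max(probs, key=probs.get)
        labeled.append((item, label))
    return labeled

def build_update_set(model, unlabeled, labeled):
    """Combine the already labeled data with model-labeled data into one update set."""
    return list(labeled) + pseudo_label(model, unlabeled)

def existing_model(x):
    """Toy stand-in for the existing AI model: classifies a number by its sign."""
    p_pos = 0.9 if x >= 0 else 0.1
    return {"positive": p_pos, "negative": 1.0 - p_pos}

update_set = build_update_set(existing_model, [3, -2], [(5, "positive")])
```

The resulting `update_set` would then be passed to whatever supervised update procedure the platform uses.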
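In a possible implementation, updating the existing AI model by using the target data set includes: obtaining a policy for updating the existing AI model according to the data characteristics of the data in the target data set; and updating the existing AI model according to the policy. Because the update policy is selected by using the data characteristics of the data in the target data set, this not only improves the efficiency of updating the existing AI model but also improves the inference accuracy of the updated AI model.
In a possible implementation, the method further includes: obtaining an update cycle of the AI model input by the user; and determining that there is a difference between the data distribution of the inference data set and that of the training data set includes: determining, according to the update cycle of the AI model, that there is a difference between the data distribution of the inference data set and that of the training data set. In this way, the user can decide the update cycle of the AI model, and the AI model update procedure is performed each time the update cycle is reached.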
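In a second aspect, the present application provides an apparatus for updating an artificial intelligence (AI) model. The apparatus includes: an obtaining module, configured to obtain an inference data set, where the inference data in the inference data set is to be input into an existing AI model to perform inference; a determining module, configured to determine that there is a difference between the data distribution of the inference data set and the data distribution of a training data set, where the training data set is the data set used to train the existing AI model; and an updating module, configured to update the existing AI model by using the inference data set to obtain an updated AI model. Because the existing AI model is updated when the data distribution of the inference data set differs from that of the training data set, instead of waiting until the user notices that the accuracy of the AI model has decreased, the AI model can be updated in a timely manner.
In a possible implementation, the existing AI model is deployed on an inference platform; the determining module is further configured to compare the inference accuracy of the updated AI model with that of the existing AI model, and determine that the inference accuracy of the updated AI model is better than that of the existing AI model; and the updating module is further configured to deploy the updated AI model to the inference platform, so that the updated AI model performs inference in place of the existing AI model. In this way, the existing AI model is replaced only when the updated AI model has better inference accuracy, so the inference accuracy is higher.
In a possible implementation, the apparatus further includes: a display module, configured to display, through a display interface, the inference accuracy of the existing AI model and the inference accuracy of the updated AI model before the updated AI model is deployed to the inference platform; and a receiving module, configured to receive an update instruction of the user for the existing AI model. In this way, the user decides whether to deploy the updated AI model, which improves the user experience.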
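In a possible implementation, the updating module is configured to: if the difference meets an offline update condition, update the existing AI model offline by using the inference data set; and if the difference does not meet the offline update condition, update the existing AI model online by using the inference data set. Because a different update mode can be chosen depending on the difference, the update time can be reduced.
In a possible implementation, the updating module is configured to: determine a parameter change amount of a target part of the existing AI model by using the difference between the data distribution of the inference data set and that of the training data set; and determine the parameters of the target part in the updated AI model based on the current parameters of the target part in the existing AI model and the parameter change amount. In this way, an online update can be accomplished by updating the parameters of only some parts of the AI model.
In a possible implementation, the updating module is configured to: construct a target data set from the inference data set; and update the existing AI model by using the target data set.
In a possible implementation, the updating module is configured to: obtain, from the inference data set, target data that satisfies a sample condition, and display the target data through a display interface; obtain the user's labeling result for the target data; and construct the target data set from the target data and its labeling result. Because the target data set used to update the existing AI model includes user-labeled target data that satisfies the sample condition, the updated AI model is better suited to inference on the inference data set.
In a possible implementation, the updating module is configured to: obtain, from the inference data set, target data that satisfies the sample condition according to the difference between the data distribution of the inference data set and that of the training data set, where the target data is suitable for updating the existing AI model. In this way, the target data that satisfies the sample condition can be selected more accurately.
In a possible implementation, the target data set further includes labeled data that is sampled and/or generated from the current labeled data and that suits the data distribution of the inference data set, where the current labeled data includes the data in the training data set.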
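In a possible implementation, the target data set includes unlabeled data and labeled data that suit the data distribution of the inference data set, and the updating module is configured to: optimize the feature extraction part of the existing AI model in an unsupervised manner by using the unlabeled data in the target data set; and update the existing AI model based on the optimized feature extraction part and the labeled data in the target data set. In this way, the feature extraction part of the AI model is optimized first, and the existing AI model is then updated.
In a possible implementation, the target data set includes unlabeled data and labeled data that suit the data distribution of the inference data set, and the updating module is configured to: label the unlabeled data in the target data set by using the existing AI model to obtain a labeling result for the unlabeled data; and update the existing AI model based on that labeling result and the labeled data in the target data set. Because the unlabeled data in the target data set can be labeled, the target data set contains more labeled data, which improves the inference accuracy of the updated AI model.
In a possible implementation, the updating module is configured to: obtain a policy for updating the existing AI model according to the data characteristics of the data in the target data set; and update the existing AI model according to the policy. Because the update policy is selected by using the data characteristics of the data in the target data set, this not only improves the efficiency of updating the existing AI model but also improves the inference accuracy of the updated AI model.
In a possible implementation, the obtaining module is further configured to obtain an update cycle of the AI model input by the user;
the determining module is configured to determine, according to the update cycle of the AI model, that there is a difference between the data distribution of the inference data set and that of the training data set.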
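In a third aspect, a computing device for updating an AI model is provided. The computing device includes a processor and a memory, where the memory stores computer instructions, and the processor executes the computer instructions to implement the method of the first aspect and its possible implementations.
In a fourth aspect, a computer-readable storage medium is provided. The computer-readable storage medium stores computer instructions, and when the computer instructions in the computer-readable storage medium are executed by a computing device, the computing device is caused to perform the method of the first aspect and its possible implementations, or to implement the functions of the apparatus of the second aspect and its possible implementations.
In a fifth aspect, a computer program product containing instructions is provided. When the computer program product runs on a computing device, the computing device is caused to perform the method of the first aspect and its possible implementations, or to implement the functions of the apparatus of the second aspect and its possible implementations.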
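Brief Description of Drawings
FIG. 1 is a schematic structural diagram of an AI platform 100 according to an embodiment of the present application;
FIG. 2 is a schematic diagram of an application scenario of the AI platform 100 according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a deployment of the AI platform 100 according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a computing device 400 on which the AI platform 100 is deployed according to an embodiment of the present application;
FIG. 5 is a schematic diagram of operating modes according to an embodiment of the present application;
FIG. 6 is a logic diagram of AI model updating according to an embodiment of the present application;
FIG. 7 is a schematic flowchart of a method for updating an AI model according to an embodiment of the present application;
FIG. 8 is a schematic diagram of AI model updating without user participation according to an embodiment of the present application;
FIG. 9 is a schematic diagram of AI model updating with user participation according to an embodiment of the present application;
FIG. 10 is a schematic flowchart of a method for updating an AI model according to an embodiment of the present application;
FIG. 11 is a schematic diagram of a scenario of AI model updating according to an embodiment of the present application;
FIG. 12 is a schematic diagram of another way of determining a data distribution difference according to an embodiment of the present application;
FIG. 13 is a schematic diagram of updating local parameters of an AI model according to an embodiment of the present application;
FIG. 14 is a schematic diagram of determining target data according to an embodiment of the present application;
FIG. 15 is a schematic structural diagram of an apparatus for updating an AI model according to an embodiment of the present application;
FIG. 16 is a schematic structural diagram of an apparatus for updating an AI model according to an embodiment of the present application;
FIG. 17 is a schematic structural diagram of a computing device according to an embodiment of the present application.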
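Detailed Description
To make the objectives, technical solutions, and advantages of the present application clearer, the implementations of the present application are described in further detail below with reference to the accompanying drawings.
Artificial intelligence is currently attracting sustained interest, and machine learning is a core means of realizing AI. Machine learning has penetrated industries such as medicine, transportation, education, and finance. Not only professional technical personnel but also non-AI professionals in these industries hope to use AI and machine learning to complete specific tasks.
To facilitate understanding of the technical solutions and embodiments provided by the present application, concepts such as the AI model, training of the AI model, and the AI platform are described in detail below: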
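An AI model is a type of mathematical algorithm model that uses machine learning ideas to solve practical problems. An AI model includes a large number of parameters and calculation formulas (or calculation rules). The parameters of an AI model are values that can be obtained by training the AI model on a training data set; for example, they are the weights of the calculation formulas or calculation factors in the AI model. An AI model also includes hyper-parameters, which can guide the construction or the training of the AI model. There are many kinds of hyper-parameters, for example, the number of training iterations, the learning rate, the batch size, the number of layers of the AI model, and the number of neurons in each layer. A hyper-parameter may be a parameter obtained by training the AI model on the training data set, or it may be a preset parameter; a preset parameter is one that is not updated when the AI model is trained on the training data set.
There are many kinds of AI models. A widely used kind is the neural network model, a mathematical algorithm model that imitates the structure and function of biological neural networks (the central nervous system of animals). A neural network model may include multiple neural network layers with different functions, and each layer includes parameters and calculation formulas. Layers are named according to their calculation formulas or functions; for example, a layer that performs convolution is called a convolutional layer, and convolutional layers are often used to extract features from input signals such as images. A neural network model may also be composed of multiple existing neural network models. Neural network models with different structures can be used in different scenarios (such as classification and recognition) or provide different effects in the same scenario. Structural differences between neural network models include one or more of the following: a different number of network layers, a different order of the network layers, and different weights, parameters, or calculation formulas in each network layer. The industry already has many neural network models with high accuracy for application scenarios such as recognition or classification. Some of them, after being trained on a specific training data set, can complete a task alone or in combination with other neural network models (or other functional modules). An AI model usually needs to be trained before it is used to complete a task.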
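Training an AI model means using existing samples (that is, a training data set) and a certain method to make the AI model fit the patterns in those samples, thereby determining the parameters of the AI model. For example, training an AI model for image classification or detection and recognition requires a training image set. Depending on whether the training images in the set are labeled (that is, whether each image has a specific type or name), training can be divided into supervised training and unsupervised training. In supervised training, the training images in the training image set carry labels. During training, each training image is used as the input of the AI model and its label is used as the reference for the output value of the AI model; a loss function computes the loss value between the output value of the AI model and the label of the training image, and the parameters of the AI model are adjusted according to the loss value. The AI model is trained iteratively with each training image in the set, and its parameters are adjusted continuously, until the AI model can, with high accuracy, output for an input training image the same value as the label of that image. In unsupervised training, the training images in the set are not labeled. They are input into the AI model one after another, and the AI model gradually identifies the associations and latent rules among them, until it can be used to judge or recognize the type or features of an input image. Clustering is one example: after receiving a large number of training images, an AI model used for clustering can learn the features of each training image and the associations and differences among the images, and automatically divide the training images into multiple types. Different task types may use different AI models: some AI models can be trained only by supervised learning, some only by unsupervised learning, and others by either. A trained AI model can be used to complete a specific task. Generally, AI models in machine learning need to be trained by supervised learning; supervised training lets the AI model learn, in a more targeted way, the association between the training images in a labeled training image set and their labels, so that the trained AI model achieves high accuracy when it is used to predict on other input inference images.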
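The following is an example of training a neural network model for a data classification task by supervised learning. To train such a model, data is first collected according to the task and a training data set is constructed. The training data set contains three classes of data: apples, pears, and bananas. The collected training data is stored by class in three folders, and the name of each folder is the label of all data in that folder. After the training data set is constructed, a neural network model capable of data classification (such as a convolutional neural network (CNN)) is selected, and the training data in the training data set is input into the CNN. The convolution kernels in each layer of the CNN perform feature extraction and feature classification on the data, and the CNN finally outputs the confidence that the data belongs to each class. A loss value is computed with a loss function from the confidence and the label of the data, and the parameters of each layer of the CNN are updated according to the loss value and the CNN structure. This training process continues until the loss value output by the loss function converges or all data in the training data set has been used for training, at which point training ends.
A loss function is a function used to measure how well an AI model has been trained (that is, to compute the difference between the result predicted by the AI model and the true target). During training, the output of the AI model should be as close as possible to the value that is actually to be predicted, so the predicted value of the current AI model for the input data is compared with the desired target value (that is, the label of the input data), and the parameters of the AI model are updated according to the difference between the two (before the first update there is usually an initialization process, in which initial values are preconfigured for the parameters of the AI model). In each training round, the loss function is used to evaluate the difference between the value predicted by the current AI model and the true target value, and the parameters of the AI model are updated, until the AI model can predict the desired target value or a value very close to it; the AI model is then considered trained.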
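To make the role of the loss function concrete, here is a minimal sketch using the three classes from the example above. It uses softmax confidences with a cross-entropy loss, which is one common choice rather than something the text specifies, and for brevity it treats the three output scores themselves as the trainable parameters instead of the layers of a CNN.

```python
import math

CLASSES = ["apple", "pear", "banana"]

def softmax(logits):
    """Convert raw scores into per-class confidences that sum to 1."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(confidences, label):
    """Loss is small when the confidence assigned to the labeled class is high."""
    return -math.log(confidences[CLASSES.index(label)])

def sgd_step(logits, label, lr=0.5):
    """One gradient step; for softmax + cross-entropy, d(loss)/d(logit_i) = p_i - 1[i == label]."""
    probs = softmax(logits)
    target = CLASSES.index(label)
    return [z - lr * (p - (1.0 if i == target else 0.0))
            for i, (z, p) in enumerate(zip(logits, probs))]

logits = [0.0, 0.0, 0.0]  # initialization: every class equally likely
loss_before = cross_entropy(softmax(logits), "pear")
for _ in range(20):
    logits = sgd_step(logits, "pear")
loss_after = cross_entropy(softmax(logits), "pear")
```

Each step moves the confidences toward the label, so the loss falls from its initial value of log 3; in a real CNN the same loss value would be propagated back to update the parameters of every layer.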
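After an AI model has been trained, it can be used to perform inference on data to obtain an inference result. For example, in an image classification scenario, the inference process is as follows: an image is input into the AI model, the convolution kernels in each layer of the AI model extract features from the image, and the class of the image is output based on the extracted features. In an object detection scenario, an image is input into the AI model, the convolution kernels in each layer extract features from the image, and the position and class of the bounding box of each object in the image are output based on the extracted features. In a scenario covering both image classification and object detection, an image is input into the AI model, the convolution kernels in each layer extract features from the image, and the class of the image as well as the position and class of the bounding box of each object in the image are output based on the extracted features. Note that some AI models have strong inference capability and others have weak inference capability. Strong inference capability means that, when the AI model performs inference on images, the accuracy of the inference results is greater than or equal to a certain value; weak inference capability means that the accuracy of the inference results is below that value.
Data labeling is the process of adding, to each piece of unlabeled data, all the labels required in the corresponding scenario. For example, if the unlabeled data is an unlabeled image, then in an image classification scenario the class of the image is added, and in an object detection scenario the position information and class of each object in the image are added. Data labeling may also be the process of adding only some of the labels required in the corresponding scenario to one or more pieces of unlabeled data; for example, in an object detection scenario, only the class of each object in the unlabeled image is added, without the position of the object in the image.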
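An AI platform is a platform that provides AI developers and users with a convenient AI development environment and convenient development tools. The AI platform has various built-in AI models or AI sub-models for solving different problems, and it can build a suitable AI model according to the requirements input by the user. That is, the user only needs to specify the requirements on the AI platform, prepare a training data set as prompted, and upload it to the AI platform; the AI platform can then train an AI model that meets the user's needs. For example, if the user needs an image classification model, the AI platform can select a classification model from the stored AI models and then update that classification model by using the training data set to obtain the AI model the user needs. Alternatively, the user prepares their own algorithm and training data set as prompted and uploads them to the AI platform, and the AI platform trains, based on the user's own algorithm and training data set, an AI model that meets the user's needs. The user can use the trained AI model to complete their specific task.
It should be noted that "AI model" as used above is a general term that covers deep learning models, machine learning models, and the like.
FIG. 1 is a schematic structural diagram of the AI platform 100 in an embodiment of the present application. It should be understood that FIG. 1 shows only one example structure of the AI platform 100, and the present application does not limit how the modules of the AI platform 100 are divided. As shown in FIG. 1, the AI platform 100 includes a user input/output (I/O) module 101, a model training module 102, and an inference module 103. Optionally, the AI platform may further include an AI model storage module 104 and a data storage module 105.
The functions of the modules in the AI platform 100 are briefly described below: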
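User I/O module 101: configured to receive the inference data set input by the user, or to let the user establish a connection between the AI platform and a device that generates inference data, so that the inference data set is obtained from that device; for example, the device that generates inference data is a camera. The user I/O module 101 is also configured to receive the task objective input or selected by the user, the user's training data set, and so on. The user I/O module 101 is further configured to receive the user's labeling results for target data in the inference data set (target data being samples that suit the data distribution of the inference data set), to obtain one or more pieces of labeled data from the user, and so on; it can also provide AI models to other users. As an example, the user I/O module 101 may be implemented as a graphical user interface (GUI) or a command-line interface (CLI). For example, the GUI shows that the AI platform 100 can provide the user with a variety of AI services (such as an image classification service and an object detection service). The user can select a task objective on the GUI, for example the image classification service, and can then upload multiple unlabeled images through the GUI of the AI platform. After receiving the task objective and the unlabeled images, the GUI communicates with the model training module 102. According to the task objective determined by the user, the model training module 102 selects or searches for an initial AI model that can be used to accomplish the user's task objective.
Optionally, the user I/O module 101 may also be configured to receive the user's expectation for the effect of the AI model that accomplishes the task objective, for example, an input or selection specifying that the accuracy of the finally obtained AI model for face recognition should be higher than 99%.
Optionally, the user I/O module 101 may also be configured to receive an AI model input by the user; for example, the user may input an initial AI model on the GUI based on their own task objective. The user I/O module 101 may also provide various built-in initial AI models for the user to choose from; for example, the user may select an initial AI model on the GUI according to their own task objective.
Optionally, when the AI model is applied to an image classification scenario or an object detection scenario, the user I/O module 101 may also be configured to receive the surface features and deep features, input by the user, of unlabeled images in the inference data set. In the image classification scenario, the surface features include one or more of the resolution of the image, the aspect ratio of the image, the mean and variance of the red, green, and blue (RGB) values of the image, the brightness of the image, the saturation of the image, or the sharpness of the image; the deep features are abstract features of the image extracted by the convolution kernels in a feature extraction model (such as a CNN). In the object detection scenario, the surface features include surface features of the bounding boxes and surface features of the image. The surface features of the bounding boxes may include one or more of the aspect ratio of each bounding box in a single frame, the ratio of the area of each bounding box to the area of the image in a single frame, the degree of marginalization of each bounding box in a single frame, the stacking map of the bounding boxes in a single frame, the brightness of each bounding box in a single frame, or the blurriness of each bounding box in a single frame. The surface features of the image may include one or more of the resolution of the image, the aspect ratio of the image, the mean and variance of the RGB values of the image, the brightness of the image, the saturation of the image, the sharpness of the image, the number of boxes in a single frame, or the variance of the areas of the boxes in a single frame. The deep features are abstract features of the image extracted by the convolution kernels in a feature extraction model (such as a CNN).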
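A few of the image-level surface features listed above can be computed directly from pixel values, as in the sketch below. The image representation (rows of `(r, g, b)` tuples) and the brightness definition (mean of the three channel means) are assumptions for illustration; the text does not define how brightness, saturation, or sharpness are measured.

```python
def surface_features(image):
    """Compute simple surface features of an image given as rows of (r, g, b) tuples."""
    height, width = len(image), len(image[0])
    pixels = [px for row in image for px in row]
    n = len(pixels)
    # Per-channel mean and (population) variance of the RGB values.
    means = [sum(px[c] for px in pixels) / n for c in range(3)]
    variances = [sum((px[c] - means[c]) ** 2 for px in pixels) / n for c in range(3)]
    return {
        "resolution": (width, height),
        "aspect_ratio": width / height,
        "rgb_mean": means,
        "rgb_variance": variances,
        "brightness": sum(means) / 3,  # assumption: brightness as the mean of the channel means
    }

image = [
    [(0, 0, 0), (255, 255, 255)],
    [(255, 0, 0), (0, 0, 255)],
]
features = surface_features(image)
```

Collecting such features over the training set and the inference set gives simple per-feature distributions that can be compared with each other.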
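Optionally, the user I/O module 101 may also provide a GUI through which the user labels the training data in the training data set and the target data in the inference data set.
Optionally, the user I/O module 101 may also be configured to receive the user's various configuration information for the initial AI model and for the training data in the training data set.
Optionally, the user I/O module 101 may also provide a GUI through which the model training module 102 presents the inference accuracy of the AI model before the update and the inference accuracy of the AI model after the update, and through which the user inputs an instruction to update the AI model.
Optionally, the user I/O module 101 may also provide a GUI through which the user inputs the update cycle of the AI model.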
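Model training module 102: configured to train AI models. Here, "training" covers both training an initial AI model and optimizing and updating a trained AI model. Initial AI models include AI models that have not been trained. A trained AI model is an AI model obtained by training an initial AI model, or an AI model obtained by updating an existing trained AI model.
The model training module 102 can communicate with the user I/O module 101, the inference module 103, and the AI model storage module 104. Specifically, the model training module 102 can obtain user-labeled data from the user I/O module 101. The model training module 102 can obtain an existing AI model from the AI model storage module 104 as the initial AI model. The model training module 102 can obtain the inference data set and its inference results from the inference module 103, and train the AI model based on the inference results and the inference data set.
Optionally, the model training module 102 is also configured to preprocess the training data in the training data set received by the user I/O module 101. For example, preprocessing the training images in a training image set uploaded by the user can make the training images consistent in size and can remove inappropriate training images from the set. The preprocessed training data set is suitable for training the initial AI model and can improve the training result. The preprocessed training image set may also be stored in the data storage module 105.
Optionally, the model training module 102 may also determine the AI model selected by the user on the GUI as the initial AI model, or determine the AI model uploaded by the user through the GUI as the initial AI model.
Optionally, the model training module 102 may also evaluate the trained AI model to obtain an evaluation result. The evaluation of the AI model may alternatively be performed by a separate module.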
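The inference module 103 uses the AI model to perform inference on the inference data set and outputs the inference results of the inference data set and the target data. The inference module 103 can communicate with the user I/O module 101 and the AI model storage module 104. The inference module 103 obtains the inference data set from the user I/O module 101 and performs inference on it to obtain the inference results of the inference data set. The inference module 103 feeds the inference results of the inference data set and the target data back to the user I/O module 101. The user I/O module 101 obtains the target data labeled by the user and the user's confirmation of the inference results, and feeds the user-labeled target data and the user-confirmed inference data back to the model training module 102. Based on the target data and the user-confirmed inference data provided by the user I/O module 101, the model training module 102 continues to train the AI model to obtain a further optimized AI model. The model training module 102 transmits the further optimized AI model to the AI model storage module 104 for storage, and to the inference module 103 for inference.
Optionally, when the inference module 103 performs inference on the inference data set, its output may also include the difference between the data distribution of the inference data set and the data distribution of the training data set of the AI model. In this case, the inference module 103 provides the difference to the model training module 102, and the model training module 102 can determine how to update the AI model based on the difference. The difference between the data distribution of the inference data set and that of the training data set may also be determined by an independent module on the AI platform rather than by the inference module 103, or it may be determined by the model training module 102.
Optionally, the inference module 103 is also configured to preprocess the inference data in the inference data set received by the user I/O module 101. For example, preprocessing the inference images in an inference image set uploaded by the user can make the inference images consistent in size and can remove inappropriate inference images from the set. The preprocessed inference data set is suitable for inference with the initial AI model and can improve the inference result. The preprocessed inference image set may also be stored in the data storage module 105.
The data preprocessing described above may also be performed by a separate module that is connected to the inference module 103 and the model training module 102, provides the preprocessed inference data set to the inference module 103, and provides the preprocessed training image set to the model training module 102.
Optionally, if no user participates in the AI model update process, the inference module 103 need not provide the inference results of the inference data set and the target data to the user I/O module 101.
Optionally, the initial AI model may also include an AI model obtained by training an AI model in the AI model storage module 104 with the data in the training data set.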
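AI model storage module 104: configured to store the initial AI model, the updated AI model, AI sub-model structures, preset models, and so on. A preset model is an AI model on the AI platform that has been trained and can be used directly, or an AI model on the AI platform that has been trained but needs further training and updating. The AI model storage module 104 can communicate with the user I/O module 101 and the model training module 102. The AI model storage module 104 receives and stores the trained initial AI model and the updated AI model transmitted by the model training module 102. The AI model storage module 104 provides AI sub-models or initial AI models to the model training module 102. The AI model storage module 104 stores the initial AI model uploaded by the user and received by the user I/O module 101. It should be understood that, in another embodiment, the AI model storage module 104 may also be part of the model training module 102.
Data storage module 105 (for example, the data storage resource corresponding to an Object Storage Service (OBS) provided by a cloud service provider): configured to store the training data set and inference data set uploaded by the user, the data processed by the data preprocessing module, and the sampled or generated data that suits the data distribution of the inference data set.
Optionally, although the inference data set is described above as being obtained by the user I/O module, the data storage module 105 may also be connected directly to a data source to obtain the inference data set. For example, the data storage module 105 is connected to a camera, and the video images captured by the camera form the inference data set.
Optionally, the data storage module 105 may also store a knowledge base, which contains knowledge that helps update the AI model faster.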
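It should be noted that the AI platform in the present application may be a system that can interact with the user. This system may be a software system, a hardware system, or a combination of software and hardware, which is not limited in the present application.
It should also be noted that the model training module 102 described above is used both for the initial training of the AI model and for updating the AI model; in the embodiments of the present application, a module for initial training and a module for AI model updating may also be deployed separately.
Owing to the functions of the modules described above, the AI platform provided by the embodiments of the present application can determine that the data distribution of the inference data set differs from that of the training data set and, when such a difference exists, update the AI model, so the AI model can be updated in a timely manner.
It should be noted that the AI platform described above may also omit the inference module 103, in which case the AI platform only provides the processing for updating the AI model. Specifically, the user provides the AI platform with the inference data set, the AI model, and the training data set used to train the AI model (or the data distribution of that training data set); the AI platform updates the AI model and provides the updated AI model to the user. Alternatively, the AI platform is connected to a third-party platform (that is, the platform that performs inference on the inference data); the AI platform obtains the inference data set, the AI model, and the training data set used to train the AI model (or the data distribution of that training data set) from the third-party platform, updates the AI model, and provides the updated AI model to the third-party platform.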
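FIG. 2 is a schematic diagram of an application scenario of the AI platform 100 according to an embodiment of the present application. As shown in FIG. 2, in one embodiment, the AI platform 100 may be deployed entirely in a cloud environment. A cloud environment is an entity that uses basic resources to provide cloud services to users in the cloud computing mode. The cloud environment includes a cloud data center and a cloud service platform. The cloud data center includes a large number of basic resources (including computing resources, storage resources, and network resources) owned by the cloud service provider, and the computing resources in the cloud data center may be a large number of computing devices (such as servers). The AI platform 100 may be deployed independently on a server or virtual machine in the cloud data center, or it may be deployed in a distributed manner on multiple servers in the cloud data center, on multiple virtual machines in the cloud data center, or on servers and virtual machines in the cloud data center. As shown in FIG. 2, the cloud service provider abstracts the AI platform 100 into an AI cloud service on the cloud service platform and provides it to users. After a user purchases the cloud service on the cloud service platform (the user may pre-pay and then settle according to the final resource usage), the cloud environment uses the AI platform 100 deployed in the cloud data center to provide the AI platform cloud service to the user. When using the AI platform cloud service, the user can, through an application program interface (API) or the GUI, specify the task to be completed by the AI model and upload the training image set and inference data set to the cloud environment. The AI platform 100 in the cloud environment receives the user's task information, training data set, and inference data set, and performs operations such as data preprocessing, AI model training, and inference on the inference data set with the trained AI model. Through the API or the GUI, the AI platform returns to the user the inference results for the inference data set, the target data determined in the inference data set, the inference accuracy of the AI model before the update, the inference accuracy of the AI model after the update, and so on. The user then chooses whether to deploy the updated AI model. The trained AI model can be downloaded by the user or used online to complete a specific task.
In another embodiment of the present application, when the AI platform 100 in the cloud environment is abstracted into an AI cloud service for users, it can be divided into two parts: a basic AI cloud service and a cloud service that updates AI models based on data distribution. The basic AI cloud service may be a service for training AI models. On the cloud service platform, the user may first purchase only the basic AI cloud service and purchase the cloud service for updating AI models when it is needed. After the purchase, the cloud service provider provides an API of the cloud service for updating AI models, and the cloud service for updating AI models is billed additionally according to the number of times the API is called. The user may also purchase only the cloud service for updating AI models.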
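The deployment of the AI platform 100 provided by the present application is flexible. As shown in FIG. 3, in another embodiment, the AI platform 100 may also be deployed in a distributed manner in different environments. The AI platform 100 may be logically divided into multiple parts, each with a different function. For example, in one embodiment the AI platform 100 includes the user I/O module 101, the model training module 102, the AI model storage module 104, and the data storage module 105. The parts of the AI platform 100 may be deployed in any two or all three of the following environments: terminal computing devices, an edge environment, and a cloud environment. Terminal computing devices include terminal servers, smartphones, laptops, tablets, personal desktop computers, smart cameras, and the like. An edge environment is an environment that includes a set of edge computing devices close to the terminal computing devices; edge computing devices include edge servers, edge stations with computing capability, and the like. The parts of the AI platform 100 deployed in different environments or devices cooperate to provide the user with functions such as determining and training the AI model to be built. For example, in one scenario, the user I/O module 101 and the data storage module 105 of the AI platform 100 are deployed on a terminal computing device, and the model training module 102, the inference module 103, and the AI model storage module 104 of the AI platform 100 are deployed on an edge computing device in the edge environment. The user sends the training data set and the inference data set to the user I/O module 101 on the terminal computing device, and the terminal computing device stores them in the data storage module 105. The model training module 102 on the edge computing device updates the AI model based on the inference data set. It should be understood that the present application does not restrict which parts of the AI platform 100 are deployed in which environment; in practice, the deployment can be adapted to the computing capability of the terminal computing device, the resource occupancy of the edge environment and the cloud environment, or specific application requirements. The above description takes the case where the user inputs the training data set as an example. The user may also not input the training data set: the user may directly input the distribution of the training data set, or the model training module 102 may analyze the existing AI model to determine the distribution of the training data set.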
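The AI platform 100 may also be deployed on a single computing device in any environment (for example, on a single edge server in the edge environment). FIG. 4 is a schematic diagram of the hardware structure of a computing device 400 on which the AI platform 100 is deployed. The computing device 400 shown in FIG. 4 includes a memory 401, a processor 402, a communication interface 403, and a bus 404. The memory 401, the processor 402, and the communication interface 403 are communicatively connected to each other through the bus 404.
The memory 401 may be a read-only memory (ROM), a random access memory (RAM), a hard disk, a flash memory, or any combination thereof. The memory 401 can store a program. When the program stored in the memory 401 is executed by the processor 402, the processor 402 and the communication interface 403 are used to perform the method in which the AI platform 100 trains an AI model for the user, determines that the data distribution of the inference data set differs from that of the training data set, and updates the AI model based on the inference data set. The memory can also store data sets. For example, a part of the storage resources in the memory 401 is allocated as the data storage module 105 for storing the data required by the AI platform 100, and a part of the storage resources in the memory 401 is allocated as the AI model storage module 104 for storing the AI model library.
The processor 402 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), a graphics processing unit (GPU), or any combination thereof. The processor 402 may include one or more chips. The processor 402 may include an AI accelerator, such as a neural processing unit (NPU).
The communication interface 403 uses a transceiver module, such as a transceiver, to implement communication between the computing device 400 and other devices or communication networks. For example, data can be obtained through the communication interface 403.
The bus 404 may include a path for transferring information between the components of the computing device 400 (for example, the memory 401, the processor 402, and the communication interface 403).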
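In the embodiments of the present application, the existing AI model can be updated in several operating modes, such as online single-node self-update, online multi-node collaborative update, and offline multi-node collaborative update. In online single-node self-update, the computing nodes on which the AI model is deployed are independent of each other, and each node updates the AI model online by using only the data it receives itself. In online multi-node collaborative update, data communication is added between different computing nodes so that they can exchange the data they receive; when a computing node updates the model online, it can use not only the data it receives itself but also the data shared by other computing nodes. In offline multi-node collaborative update, the AI model can also be updated offline: the data received by the computing nodes can be aggregated and used together for updating the AI model offline, and after the AI model has been updated offline, the updated AI model is pushed to each computing node. These three operating modes correspond to (a), (b), and (c) in FIG. 5, respectively.
Specifically, when the existing AI model is updated, a suitable operating mode can be selected according to the actual situation (the detailed process is described later). For example, when the data distribution of the inference data set has changed little relative to that of the training data set, online single-node self-update is used; when the change is relatively large, offline multi-node collaborative update is used.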
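The embodiments of the present application also provide the AI model update logic diagram shown in FIG. 6:
FIG. 6 includes a data source, offline update, online update, a knowledge base, and so on. The data source includes the data of the scenario to which the AI model update method is applied. The knowledge base includes prior knowledge and/or domain knowledge for model training, which can provide a basis for selecting a model update policy and can also provide policies that guide the model update during the AI model update process. For example, suppose the user needs an AI model for detecting cats. The AI platform learns from the domain knowledge that cats and tigers both belong to the cat family and are similar, and the AI platform already has an AI model for detecting tigers. The AI platform can obtain the tiger detection model and then update it with images of cats to obtain an AI model for detecting cats. As another example, if the user has requirements on one or more of the inference speed, memory usage, or hardware of the AI model, the AI platform can select an AI model architecture or preset model that meets the user's requirements based on those requirements and the knowledge base.
Online update refers to updating the AI model adaptively online. Offline update refers to updating the AI model offline. In an offline update, the data used may include real data (such as the data in the inference data set) and sampled or generated labeled data that suits the inference data set (referred to as generated data in FIG. 6); the data used may also include data characteristics, which are information such as the statistics and distributions of various types of data. The "offline update" in the box at the lower left of FIG. 6 refers to the offline update process, which includes obtaining the data for updating the AI model and updating the existing AI model based on that data. Note that an offline update may adjust the parameters of the existing AI model or retrain a new AI model as the updated AI model; when a new AI model is retrained, the offline update process also includes obtaining an initial model from the AI model storage module 104. In an online update, the data used (which may be called online data) may include data obtained from real data, sampled or generated labeled data that suits the inference data set, and so on.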
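The specific procedure of the method for updating an AI model is described below with reference to FIG. 7, taking execution of the method by the AI platform as an example:
Step 701: the AI platform obtains an inference data set.
The inference data in the inference data set is to be input into an existing AI model to perform inference. The existing AI model may also be called the AI model before the update.
In this embodiment, the user may input the inference data set into the AI platform, or the AI platform may obtain the inference data set from a connected inference data source. For example, the AI platform is connected to a camera and can continuously obtain data from the camera as the data in the inference data set; the camera is the inference data source. The data in the inference data set is unlabeled data.
Step 702: the AI platform determines that there is a difference between the data distribution of the inference data set and the data distribution of the training data set.
The training data set is the data set used to train the existing AI model.
In this embodiment, the AI platform may determine the data distribution of the inference data set and obtain the data distribution of the training data set each time it obtains an inference data set. Alternatively, the AI platform may do so each time the update cycle of the AI model is reached; the update cycle of the AI model may be set by the user.
The AI platform determines whether the data distribution of the inference data set differs from that of the training data set. If it does, step 703 is performed; otherwise, no further processing is performed. The reason is as follows: a difference between the two data distributions most likely indicates that the existing AI model may no longer be suitable for the inference data set, so the existing AI model is updated to suit the inference data set. When the two data distributions do not differ, the existing AI model is still suitable for the inference data set and need not be updated, which saves processing resources.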
此处,获取训练数据集的数据分布的过程可以是:AI平台可以获取到训练数据集,基于训练数据集,确定训练数据集的数据分布,也可以是AI平台从用户获取到的训练数据集的数据分布,也可以是AI平台基于已有AI模型对推理数据集的推理结果,分析获得训练数据集的数据分布。
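Step 703: The AI platform updates the existing AI model by using the inference data set, to obtain an updated AI model.
In this embodiment, when the AI platform determines that the data distribution of the inference data set differs from that of the training data set, it may update the existing AI model based on the inference data set to obtain the updated AI model.
In this way, because changes in the data distribution can be detected, the existing AI model can be updated promptly in response to those changes.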
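In a possible implementation, the existing AI model is deployed on an inference platform (the inference platform may be part of the AI platform, for example it may include the inference module mentioned above, or it may be a platform independent of the AI platform). After step 703, the AI platform may compare the inference accuracy of the updated AI model with that of the existing AI model, determine that the inference accuracy of the updated AI model is better than that of the existing AI model, and deploy the updated AI model to the inference platform so that the updated AI model replaces the existing AI model in performing inference.
In this embodiment, the AI platform may evaluate the updated AI model on a test data set that suits the data distribution of the training data set and on a test data set that suits the data distribution of the inference data set, obtaining a first evaluation result and a second evaluation result respectively. It may evaluate the existing AI model on the test data set that suits the data distribution of the inference data set and on the test data set that suits the data distribution of the training data set, obtaining a third evaluation result and a fourth evaluation result respectively. If the first evaluation result is not significantly lower than the fourth evaluation result and the second evaluation result is better than the third evaluation result, the inference accuracy of the updated AI model is determined to be better than that of the existing AI model. The AI platform may provide the updated AI model to the inference platform (either the entire updated AI model or only the content that differs from the existing AI model), and the inference platform may use the updated AI model in place of the existing AI model to perform inference.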
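The deployment check described above can be sketched as follows. This is a minimal illustration, not the platform's implementation: the function name `should_deploy`, the assumption that each evaluation result is a single accuracy-like score where higher is better, and the `tolerance` margin used to interpret "not significantly lower" are all hypothetical.

```python
def should_deploy(first, second, third, fourth, tolerance=0.02):
    """Decide whether the updated model may replace the existing model.

    first  : updated model on the test set matching the training distribution
    second : updated model on the test set matching the inference distribution
    third  : existing model on the test set matching the inference distribution
    fourth : existing model on the test set matching the training distribution
    tolerance : hypothetical margin for "not significantly lower"
    """
    # The updated model must roughly keep its ability on the old distribution.
    keeps_old_ability = first >= fourth - tolerance
    # The updated model must improve on the new (inference) distribution.
    improves_on_new_data = second > third
    return keeps_old_ability and improves_on_new_data
```

With these assumptions, `should_deploy(0.90, 0.85, 0.70, 0.91)` accepts the updated model, whereas a drop from 0.91 to 0.80 on the training-distribution test set rejects it.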
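Note that the existing AI model and the updated AI model may be evaluated by using one or more of a precision-recall (PR) curve, the average precision (AP) metric, the false positive rate, and the miss rate. Different evaluation metrics may be used for different types of AI models. For example, in an object detection scenario, in addition to the PR curve and the AP metric, the distribution of the intersection over union (IoU) of bounding boxes and the mean AP over different IoU thresholds may be used. The PR curve is used here only as an example; other charts such as box plots and confusion matrices may also be used.
The foregoing evaluates the AI model by inference accuracy. Inference speed may also be used, that is, the AI platform may use a combined result of inference accuracy and inference speed as the basis for evaluating the AI model.
In a possible implementation, the user may decide whether to deploy the updated AI model. When the AI platform determines that the inference accuracy of the updated AI model is better than that of the existing AI model, it may display the inference accuracy of both models on a display interface. The user may decide, based on the two accuracies, whether to deploy the updated AI model. If the user decides to deploy it, the user triggers the update, and the AI platform receives the user's update instruction for the existing AI model. The AI platform may then provide the updated AI model to the inference platform.
When displaying the inference accuracy, the AI platform may also display other information that helps present the characteristics of the updated AI model, or information specified by the user, such as one or more of the gradient trend, the downward trend of the training loss function, the trend of the validation set accuracy, intermediate output results, or visualizations of intermediate features.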
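In a possible implementation, in step 703, different update modes may be used depending on the difference:
If the difference meets an offline update condition, the AI platform updates the existing AI model offline by using the inference data set. If the difference does not meet the offline update condition, the AI platform updates the existing AI model online by using the inference data set.
The offline update condition is preset. For example, for images, the offline update condition may be that the reconstruction error of the images, compared with that of the training image set, exceeds a first value, or that the prediction error between the image predicted from an image's features and the original image exceeds a second value. As another example, for anomaly detection on a switch, where the data in the inference data set is the packet loss rate, the offline update condition may be that the difference between the maximum packet loss rate in the inference data set and the maximum packet loss rate in the training data set exceeds a third value.
In this embodiment, the AI platform may determine whether the difference between the data distribution of the inference data set and that of the training data set meets the offline update condition. If it does, the difference is relatively large and the existing AI model is no longer suitable for inference on the inference data set. An online update would be slow in this case and is therefore unsuitable, so the AI platform updates the existing AI model offline by using the inference data set.
If the difference does not meet the offline update condition, the difference is not very large. Although the existing AI model is no longer suitable for inference on the inference data set, an online update is sufficient, so the AI platform updates the existing AI model online by using the inference data set.
In this way, the update mode can be selected flexibly based on the difference between the data distribution of the inference data set and that of the training data set.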
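As a rough sketch of this selection logic, the function below maps a scalar difference to an update mode. The name `choose_update_mode` and the two thresholds are hypothetical: `drift_threshold` stands in for the check in step 702 that a difference exists at all, and `offline_threshold` stands in for the preset offline update condition.

```python
def choose_update_mode(difference, drift_threshold, offline_threshold):
    """Return 'none', 'online', or 'offline' for a scalar distribution difference.

    Both thresholds are illustrative assumptions; the text only states that
    the offline update condition is preset.
    """
    if difference <= drift_threshold:
        # No meaningful difference: keep the existing model, save resources.
        return "none"
    if difference > offline_threshold:
        # Large difference: online adaptation would be slow, update offline.
        return "offline"
    # Moderate difference: an online update is sufficient.
    return "online"
```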
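In a possible implementation, during an online update, the parameters of the updated AI model may be determined based on the difference between the data distribution of the inference data set and that of the training data set, as follows:
The AI platform uses the difference between the data distribution of the inference data set and that of the training data set to determine a parameter change amount for a target part of the existing AI model, and determines the parameters of the target part in the updated AI model based on the current parameters of the target part in the existing AI model and the parameter change amount.
The target part is one sub-model of the existing AI model, or all sub-models of the existing AI model.
In this embodiment, the AI platform may obtain a relationship between changes in data distribution and parameter change amounts. The relationship may be modeled in advance by the AI platform or obtained from another platform. The AI platform uses the difference between the data distribution of the inference data set and that of the training data set, together with this relationship, to determine the parameter change amount of the target part. The AI platform then uses the current parameters of the target part in the existing AI model and the parameter change amount to determine the parameters of the target part in the updated AI model, and replaces the parameters of the target part in the existing AI model with them to obtain the updated AI model. In this way, some or all of the parameters of the AI model can be updated online.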
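The online update can be sketched as follows. The linear mapping in `generate_parameter_change` is purely an assumed stand-in for the modeled relationship between distribution change and parameter change, and representing the model as a dictionary of parameter lists keyed by sub-model name is likewise a simplification for illustration.

```python
def generate_parameter_change(difference, sensitivities):
    """Hypothetical relation: each parameter changes in proportion to the
    scalar distribution difference. A real platform would use a learned model."""
    return [s * difference for s in sensitivities]


def apply_online_update(model_params, target_part, delta):
    """Add the parameter change to the current parameters of the target part.

    model_params maps a sub-model name to its list of parameters; only the
    target part is replaced, all other sub-models are left untouched.
    """
    updated = dict(model_params)  # shallow copy, original model is not mutated
    updated[target_part] = [p + d for p, d in zip(model_params[target_part], delta)]
    return updated
```

For example, with `{"head": [1.0, 2.0], "backbone": [0.5]}`, a difference of 0.5 and sensitivities `[0.2, -0.4]`, only the `head` parameters change, to approximately `[1.1, 1.8]`.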
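In a possible implementation, the existing AI model may be updated based on a data set (referred to below as the target data set), as follows:
The AI platform constructs the target data set from the inference data set, and updates the existing AI model by using the target data set.
In this embodiment, the AI platform may construct a target data set from the inference data set, for example by sampling some data from the inference data set as the data in the target data set. The AI platform then uses the target data set to update the existing AI model and obtain the updated AI model.
In a possible implementation, when the existing AI model is updated by using the target data set, the user may or may not participate in constructing the target data set. When the user participates, the user may confirm the inference results of the inference data or label the target data provided by the AI platform. In this case the target data set includes labeled data, and the AI platform may update the existing AI model by using supervised learning techniques. The AI platform may also use unsupervised learning techniques to update the existing AI model when necessary, and may fine-tune and adapt the existing AI model through transfer learning and domain adaptation techniques.
When the user does not participate in constructing the target data set, the AI platform may update the existing AI model by using unsupervised learning techniques or semi-supervised learning techniques.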
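The specific processing is as follows. 1. The target data set may be constructed with the user's participation: the AI platform obtains, from the inference data set, target data that meets a sample condition and displays the target data on a display interface; obtains the user's labeling result for the target data; and constructs the target data set from the target data and its labeling result.
The sample condition indicates typical data in the inference data set.
In this embodiment, the AI platform may determine, in the inference data set, the target data that meets the sample condition, and then display the target data on the display interface. The target data may be unlabeled data in the inference data set, or data in the inference data set that carries labels after inference. When the target data is unlabeled, the user may label it on the display interface, and the AI platform obtains the user's labeling result. When the target data carries labels after inference, the user may confirm the existing labels on the display interface, and the AI platform likewise obtains the user's labeling result. The AI platform may use the target data and its labeling result as part of the labeled data in the target data set.
2. With the user's participation, the AI platform may display the inference results of the inference data set on the display interface. The user confirms the labels in the inference results, and the AI platform obtains the confirmed inference results as labeled data in the target data set. In addition, if the user discovers a new category while confirming the inference results, the user may add a label for the new category to the confirmation result, and this label is also used as labeled data in the target data set.
3. Without the user's participation, the target data set may also include data that the AI platform samples from and/or generates based on the current labeled data to suit the data distribution of the inference data set. The current labeled data may include data in the training data set and existing labeled data from other scenarios. Here, "sampling" means finding, in the current labeled data, data that suits the data distribution of the inference data set, and "generating" means producing such data based on the current labeled data and a data generation algorithm. In this way, even when little existing data suits the data distribution of the inference data set, a larger amount of such data can be provided, which facilitates updating the AI model.
Note that when the user participates, the user may label data in real time, and the AI platform may train with the user's labeled data in real time.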
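In a possible implementation, updating the existing AI model by using the target data set may be performed as follows:
The AI platform obtains a policy for updating the existing AI model according to the data characteristics of the data in the target data set, and updates the existing AI model according to the policy.
In this embodiment, the AI platform may use the data characteristics of the data in the target data set, together with the prior knowledge and/or domain knowledge in the knowledge base, to select a policy for updating the existing AI model. The AI platform then uses the policy to update the existing AI model and obtain the updated AI model.
For example, if the target data set contains relatively little labeled data, the prior knowledge in the knowledge base indicates that transfer learning and/or few-shot learning techniques should be used in that case. The AI platform may use transfer learning and/or few-shot learning techniques to update the existing AI model.
If the target data set contains a relatively large amount of labeled data, the prior knowledge in the knowledge base indicates that strongly supervised techniques should be used in that case. The AI platform may use strongly supervised techniques to update the AI model. Note that "strongly supervised techniques" is only a general term used for illustration. In a specific implementation it may be a particular strongly supervised training method, and the training method differs for different tasks and scenarios.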
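A simple way to picture this policy selection is a lookup driven by the amount of labeled data. The sketch below is only illustrative: the function name `choose_update_strategy` and the cutoff `few_label_threshold` are assumptions, since the text does not say how "relatively little" and "relatively large" are quantified.

```python
def choose_update_strategy(num_labeled, few_label_threshold=100):
    """Pick an update strategy from the number of labeled samples.

    few_label_threshold is a hypothetical cutoff standing in for the prior
    knowledge held in the knowledge base.
    """
    if num_labeled < few_label_threshold:
        # Little labeled data: transfer learning and/or few-shot learning.
        return "transfer_or_few_shot_learning"
    # Plenty of labeled data: a strongly supervised training method.
    return "strongly_supervised_learning"
```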
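In a possible implementation, the user does not participate in the AI model update process, and the labeled data in the target data set is data sampled from and/or generated based on the current labeled data to suit the data distribution of the inference data set. The target data set includes unlabeled data and labeled data that suit the data distribution of the inference data set. The AI platform may use the unlabeled data in the target data set to optimize the feature extraction part of the existing AI model in an unsupervised manner, and then update the existing AI model according to the optimized feature extraction part and the labeled data in the target data set.
In this embodiment, when the target data set includes unlabeled data and labeled data that suits the data distribution of the inference data set, the AI platform may use the unlabeled data to optimize the feature extraction part of the existing AI model in an unsupervised manner, obtaining an optimized feature extraction part. The unsupervised manner here may be, for example, a self-supervised manner. The AI platform may then use the optimized feature extraction part and the labeled data that suits the data distribution of the inference data set to further update the existing AI model and obtain the updated AI model. In this way, the existing AI model can be updated even without the user's participation. For example, as shown in FIG. 8, without the user's participation, when the AI platform performs inference on the inference data set it obtains the inference result and the target data, and provides only the inference result to the user, not the target data (as shown in (a) in FIG. 8). The AI platform updates the existing AI model and may then provide the updated AI model to the user (as shown in (b) in FIG. 8). Subsequently, the AI platform may use the updated AI model for inference, obtain the inference result and the target data, and provide the inference result to the user (as shown in (c) in FIG. 8). In FIG. 8, the target data need not be output at all.
Note that the existing AI model is updated here based on the optimized feature extraction part and the labeled data in the target data set. It may also be updated based on the optimized feature extraction part together with both the unlabeled data and the labeled data in the target data set.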
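In a possible implementation, the user does not participate in the AI model update process, and the labeled data in the target data set is data sampled from and/or generated based on the current labeled data to suit the data distribution of the inference data set. The target data set includes unlabeled data and labeled data that suit the data distribution of the inference data set. The AI platform may use the existing AI model to label the unlabeled data in the target data set and obtain a labeling result for the unlabeled data, and then update the existing AI model according to the labeling result of the unlabeled data and the labeled data in the target data set.
In this embodiment, when the target data set contains unlabeled data, the AI platform uses the existing AI model to label the unlabeled data in the target data set, obtaining a labeling result for the unlabeled data (the labeling result may be inaccurate and may be called a "pseudo label") and a corresponding confidence. The AI platform then uses the labeling results with relatively high confidence (for example, those whose confidence is above a preset threshold) and the labeled data in the target data set to update the existing AI model and obtain the updated AI model.
Alternatively, the AI platform uses the labeling results with relatively high confidence to train the existing AI model and obtain a trained AI model. The AI platform may then use the labeled data in the target data set (labeled data that has been sampled and/or generated to suit the data distribution of the inference data set) to further update the trained AI model and obtain the updated AI model. In this way, the existing AI model can be updated even without the user's participation.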
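The confidence filtering step of this pseudo-labeling approach can be sketched as below. The tuple layout `(sample, label, confidence)`, the function names, and the default threshold of 0.9 are assumptions made for illustration; the text only requires that the confidence exceed a preset threshold.

```python
def select_pseudo_labels(predictions, confidence_threshold=0.9):
    """Keep only pseudo labels whose confidence exceeds the threshold.

    predictions is a list of (sample, label, confidence) tuples produced by
    running the existing model on unlabeled data.
    """
    return [(sample, label)
            for sample, label, confidence in predictions
            if confidence > confidence_threshold]


def build_update_set(predictions, labeled_data, confidence_threshold=0.9):
    """Combine confident pseudo labels with the already labeled data."""
    return select_pseudo_labels(predictions, confidence_threshold) + list(labeled_data)
```

The resulting list of `(sample, label)` pairs would then be passed to whatever training routine updates the model; that routine is outside the scope of this sketch.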
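In a possible implementation, the user participates in the AI model update process, and the AI platform may update the existing AI model by using strongly supervised learning techniques based on the labeled data in the target data set. Specifically, when the task has not changed, strongly supervised learning techniques are used directly to update the existing AI model and obtain the updated AI model. When the task has changed (for example, a new category has appeared and needs to be classified), if the target data set contains relatively little labeled data for the new category, the AI platform may choose few-shot learning techniques to update the existing AI model.
For example, as shown in FIG. 9, with the user's participation, when the AI platform performs inference on the inference data set it obtains the inference result and the target data and provides both to the user (as shown in (a) in FIG. 9). The user may provide the AI platform with a confirmation of the labels in the inference result and/or a labeling result for the target data, that is, labeled data (as shown in (b) in FIG. 9). The AI platform may update the existing AI model based on the labeled data provided by the user, obtain the updated AI model, and provide it to the user (as shown in (c) in FIG. 9). Subsequently, the AI platform may use the updated AI model for inference, obtain the inference result and the target data, and provide them to the user (as shown in (d) in FIG. 9).
In addition, when the task has not changed, if the target data set contains relatively little labeled data, the AI platform may also sample from and/or generate based on the current labeled data to obtain data that suits the data distribution of the inference data set, thereby expanding the labeled data in the target data set.
In addition, when few-shot learning techniques are chosen to update the existing AI model, representation learning may also be used to tune the feature extraction part of the existing AI model.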
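In a possible implementation, the AI platform determines the difference between the data distribution of the inference data set and that of the training data set as follows:
The AI platform obtains a probability model that models data distribution; the AI platform may build this probability model itself or obtain it from another platform. Using the probability model, the AI platform extracts features of the training data set and fits a Gaussian mixture distribution to those features. It then determines the likelihood of the inference data set under the fitted distribution. This likelihood represents the difference between the data distribution of the training data set and that of the inference data set: the larger the likelihood, the smaller the difference, and vice versa.
Note that the probabilistic modeling may use a Gaussian mixture distribution or another parametric distribution fitting method, and may also use other distributions, non-parametric distribution fitting algorithms, complex probabilistic graphical models, and so on.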
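The likelihood-based measure can be illustrated with a deliberately simplified sketch. Instead of a Gaussian mixture, the code below fits a single Gaussian with a diagonal covariance to the training features, which is an assumption made to keep the example dependency-free; the function names and the variance floor of `1e-6` are also hypothetical.

```python
import math
from statistics import fmean, pvariance


def fit_diagonal_gaussian(features):
    """Fit an independent Gaussian to each feature dimension.

    features is a list of equal-length feature vectors from the training set.
    """
    columns = list(zip(*features))
    means = [fmean(col) for col in columns]
    # Floor the variance so that constant features do not cause division by zero.
    variances = [max(pvariance(col), 1e-6) for col in columns]
    return means, variances


def mean_log_likelihood(features, means, variances):
    """Average log-likelihood of feature vectors under the fitted Gaussian.

    A lower value suggests a larger difference from the training distribution.
    """
    total = 0.0
    for vec in features:
        for x, mu, var in zip(vec, means, variances):
            total += -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)
    return total / len(features)
```

Inference features close to the training features yield a higher average log-likelihood than features that have shifted away, which matches the interpretation given above.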
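Note that the foregoing describes only one update of the existing AI model. In use, the existing AI model may be updated continuously in a loop. FIG. 10 is a schematic diagram of the AI model update loop: Step 1001: the AI platform obtains the inference data set. Step 1002: the AI platform determines that the data distribution of the inference data set differs from that of the training data set. Step 1003: the AI platform updates the current existing AI model by using the inference data set, to obtain the updated AI model. Step 1004: the updated AI model is deployed, and the updated AI model becomes the current existing AI model. Step 1005: the AI platform returns to step 1002.
FIG. 10 describes only one iteration of the loop. As long as updating of the AI model is not stopped, the loop runs continuously. Specifically, once the AI platform is connected to the inference data source, it keeps obtaining data for the inference data set from that source, determines the data distribution of the inference data set, and then determines whether it differs from the data distribution of the training data set. If there is a difference, the existing AI model is updated, the updated AI model is deployed, and the procedure returns to step 1002. Steps 1001 and 1002 are asynchronous with respect to steps 1003 and 1004, because the data in the inference data set is updated continuously and the check for a distribution difference runs continuously, whereas steps 1003 and 1004 are performed only when the data distribution of the inference data set differs from that of the training data set.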
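The loop can be sketched in a simplified, synchronous form as follows. The function `run_update_loop` and its callable parameters are hypothetical placeholders: `measure_difference` stands for step 1002, `update_model` for step 1003, and replacing `current_model` for the deployment in step 1004. The asynchronous relationship between the steps is not modeled here.

```python
def run_update_loop(batches, measure_difference, update_model, model, threshold):
    """Iterate over incoming inference data and update the model on drift.

    Returns the model that is deployed at the end and the list of differences
    that triggered an update.
    """
    current_model = model
    triggering_differences = []
    for batch in batches:                      # step 1001: obtain inference data
        difference = measure_difference(batch)  # step 1002: check for drift
        if difference > threshold:
            current_model = update_model(current_model, batch)  # steps 1003-1004
            triggering_differences.append(difference)
    return current_model, triggering_differences
```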
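The user may input a stop instruction to the AI platform to make the AI platform stop updating the existing AI model. Alternatively, if the AI platform determines that the difference between the data distribution of the inference data set and that of the training data set is relatively small, and that the accuracy of the updated AI model changes little relative to the existing AI model (that is, the AI model before the update), the AI platform may stop updating the AI model on its own initiative.
After updating of the AI model has stopped, if the AI platform again determines that the difference between the data distribution of the inference data set and that of the training data set is relatively large, it may provide an update prompt message to the user (for example, by sending the message to the user's terminal as a short message or by displaying it on the display interface). When the AI platform receives an update confirmation instruction from the user, it may restart the AI model update procedure. Alternatively, when the AI platform again determines that the difference is relatively large, it may start the AI model update procedure on its own initiative.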
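In the embodiments of this application, the AI model update method may be applied to scenarios in which actions in images are recognized, for example recognizing non-standard sorting actions in a logistics scenario. In this application scenario, the AI platform may perform inference based on the existing AI model and update the existing AI model.
In this application scenario, the inference data set is the user's surveillance video data, and the existing AI model is a model that can recognize non-standard sorting actions. The training data set is also video data. FIG. 11 is a schematic diagram of this scenario. The AI platform includes a video inference module (the inference module 103 described above), a storage service module (the data storage module 105 described above), a model training module (the model training module 102 described above), a user I/O module (the user I/O module 101 described above), and so on. The video inference module performs inference on the inference data set obtained from the camera, the storage service module stores the inference data set and other data, the model training module updates the existing AI model, and the user I/O module interacts with the user.
The AI model update procedure may include determining the difference between the data distribution of the inference data set and that of the training data set, updating the existing AI model online, performing inference on the inference data set, providing target data to the user, updating the existing AI model offline, and so on.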
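1. Determining the difference between the data distribution of the inference data set and that of the training data set:
For a given input inference data set, the AI platform may extract video frames from the inference data set, extract deep features of the video frames through a stored deep neural network, and/or extract non-deep features through other algorithms. Examples of non-deep features are frame difference and optical flow. Frame difference is the result of subtracting two adjacent frames. Optical flow is the positional change of pixels between two adjacent frames, that is, the displacement field of pixel motion.
Note that if the inference data set is a video stream or a continuous video of some duration, the AI platform may use a sliding window or video clip segmentation to obtain short videos of suitable duration as input. The ways of extracting video frames from the inference data set include but are not limited to extracting all video frames, uniform sampling, non-uniform sampling, and multi-scale sampling. Non-uniform sampling may select video frames based on key frames, inter-frame similarity, and so on; specifically, it may select key frames or video frames whose inter-frame similarity is below a certain value. Multi-scale sampling may use different sampling intervals to obtain multiple short videos, extract deep features from each, and integrate the extracted deep features to obtain the deep features mentioned above. The deep neural network includes but is not limited to two-dimensional and three-dimensional convolutional neural networks, recurrent neural networks, long short-term memory networks, two-stream convolutional neural networks, and their combinations and variants.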
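The sliding window and uniform sampling steps can be illustrated on a plain list of frames. The function names, the `window`, `stride`, and `step` parameters, and the choice to drop a trailing incomplete window are assumptions of this sketch rather than details given in the text.

```python
def sliding_windows(frames, window, stride):
    """Split a frame sequence into fixed-length clips using a sliding window.

    Trailing frames that do not fill a complete window are dropped.
    """
    return [frames[start:start + window]
            for start in range(0, len(frames) - window + 1, stride)]


def uniform_sample(frames, step):
    """Keep every `step`-th frame, starting with the first frame."""
    return frames[::step]
```

Multi-scale sampling could then be approximated by calling `uniform_sample` with several different `step` values and extracting features from each resulting clip.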
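The AI platform stores a video prediction model, which may be built by the AI platform itself or obtained from another platform. As shown in FIG. 12, the AI platform may use the extracted features (deep features and/or non-deep features) to predict future video frames. The AI platform then calculates the prediction error between the predicted video frames and the actual video frames in the inference data set, and uses the prediction error of a single prediction, or the average prediction error of multiple predictions, to represent the difference between the data distribution of the training data set and that of the inference data set. When predicting future video frames, the AI platform may predict one frame or multiple frames. The video prediction model includes but is not limited to two-dimensional and three-dimensional convolutional neural networks, recurrent neural networks, generative adversarial networks, variational autoencoders, and their combinations and variants.
Other approaches may also be used to model the data distribution, such as video frame interpolation, video reconstruction, and computing inter-frame similarity. Video frame interpolation predicts the intermediate video frames from non-adjacent video frames. Video reconstruction reconstructs the current video frame from its features and compares the reconstructed frame with the current frame to obtain a reconstruction error; the reconstruction error of one video frame, or the average reconstruction error of multiple video frames, represents the difference between the data distribution of the training data set and that of the inference data set. Computing inter-frame similarity means computing the similarity of two adjacent video frames; the similarity of one pair of adjacent video frames, or the average similarity of multiple pairs, represents the difference between the data distribution of the training data set and that of the inference data set.
In addition, when the data distribution is modeled, in the spatial dimension the whole image of each video frame may be used, a local region of each video frame may be used, or the two may be combined. A similar choice applies in the temporal dimension: whole (the entire video), local (part of the video), or a combination of the two.
When calculating the prediction error, the AI platform may use any metric that meets the task requirements, including but not limited to the L1 distance (the absolute difference between the predicted video frame and the actual video frame), the L2 distance (the squared difference between the predicted video frame and the actual video frame), the Wasserstein distance (also called the earth mover's distance), learnable metrics, and their combinations and variants.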
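The L1 and L2 measures and the averaging over multiple predictions can be sketched as follows. Frames are represented as flat lists of pixel values, which is a simplification, and the L2 function follows the description in the text by summing squared differences without taking a square root; the function names are hypothetical.

```python
def l1_distance(predicted, actual):
    """Sum of absolute pixel differences between two frames."""
    return sum(abs(p - a) for p, a in zip(predicted, actual))


def l2_distance(predicted, actual):
    """Sum of squared pixel differences between two frames (no square root)."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual))


def mean_prediction_error(frame_pairs, metric=l1_distance):
    """Average the error over several (predicted, actual) frame pairs."""
    errors = [metric(predicted, actual) for predicted, actual in frame_pairs]
    return sum(errors) / len(errors)
```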
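In the foregoing description, the difference between the data distribution of the inference data set and that of the training data set is represented directly by the prediction error. It may also be represented by the result of a linear or non-linear transformation of the prediction error.
The procedure for determining the change in data distribution is illustrated in FIG. 12.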
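2. Updating the existing AI model online:
In this embodiment, when the difference between the data distribution of the inference data set and that of the training data set is less than a difference threshold, the AI platform may update the existing AI model online. Specifically, as shown in FIG. 13, the AI platform may input the difference into a parameter generator (the parameter generator models the correspondence between parameter change amounts and data distribution differences). The output of the parameter generator is the parameter change amount of the target part mentioned above. The AI platform adds the parameter change amount to the current parameter values of the target part in the existing AI model to obtain the parameter values of the target part in the updated AI model.
Here the parameter change amount of the target part is determined from the difference between the data distribution of the inference data set and that of the training data set. It may also be determined from the difference together with some or all of the data in the inference data set; this is not limited in the embodiments of this application.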
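3. Performing inference on the inference data set:
In this embodiment, the existing AI model (or the updated AI model) performs inference on the video frames in the inference data set and outputs the action recognition inference result, which the AI platform displays on the display interface.
4. Providing target data to the user:
The AI platform may obtain, from the inference data set, target data that meets the sample condition according to the difference between the data distribution of the inference data set and that of the training data set.
In this embodiment, the AI platform may use the difference between the data distribution of the inference data set and that of the training data set to select, from the inference data, target data that meets the sample condition. Target data selected under the sample condition is suitable for updating the existing AI model.
Optionally, the AI platform may use uncertainty to determine the target data, as follows:
The AI platform may use the existing AI model to determine the uncertainty of inference on each piece of inference data in the inference data set. The uncertainty may be represented by any one of action class probability, information entropy, mutual information, or variance.
The AI platform then uses the difference between the data distribution of the inference data set and that of the training data set, together with the uncertainty corresponding to each piece of inference data, to obtain from the inference data set the target data that meets the sample condition.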
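Specifically, when the uncertainty is represented by the action class probability, the difference between the data distribution of the inference data set and that of the training data set is represented by the prediction error, the L1 distance is used to measure the prediction error, and the entropy of the action class probability is used to measure the uncertainty, the sample condition that the target data meets is:
$$x^{*} = \arg\max_{x}\left(\lambda_{1}\,\lVert I(x) - \hat{I}(x)\rVert_{1} \;-\; \lambda_{2}\sum_{i} p_{i}(x)\log p_{i}(x)\right) \tag{1}$$
In formula (1), $x$ denotes the target data (also called a sample). In the first term, $I(x)$ denotes the actual video frame and $\hat{I}(x)$ denotes the predicted video frame. In the second term, $p_{i}(x)$ denotes the probability of the $i$-th action class for $x$. $\lambda_{1}$ and $\lambda_{2}$ are hyperparameters that weight the two terms and balance their effects, and $x^{*}$ denotes the selected target data to be labeled. Formula (1) means that the $x$ that maximizes $\lambda_{1}\lVert I(x) - \hat{I}(x)\rVert_{1} - \lambda_{2}\sum_{i} p_{i}(x)\log p_{i}(x)$ is $x^{*}$. In this way, typical data in the inference data set can be obtained, which is better suited to updating the existing AI model and gives the updated AI model better inference accuracy.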
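The selection rule can be sketched as a scoring function that adds a weighted prediction error and a weighted entropy of the action class probabilities, then picks the sample with the highest score. This assumes the weighted-sum form described in the surrounding text; the tuple layout `(identifier, prediction_error, class_probabilities)` and the function names are hypothetical, and the prediction error is taken as an already computed L1 distance.

```python
import math


def entropy(class_probs):
    """Shannon entropy of a class probability vector (natural logarithm)."""
    return -sum(p * math.log(p) for p in class_probs if p > 0)


def sample_score(prediction_error, class_probs, lambda1=1.0, lambda2=1.0):
    """Weighted sum of prediction error and prediction uncertainty."""
    return lambda1 * prediction_error + lambda2 * entropy(class_probs)


def select_target(candidates, lambda1=1.0, lambda2=1.0):
    """Return the identifier of the candidate with the highest score.

    candidates is a list of (identifier, prediction_error, class_probs).
    """
    best = max(candidates,
               key=lambda c: sample_score(c[1], c[2], lambda1, lambda2))
    return best[0]
```

Raising `lambda1` favors samples that the video prediction model handles poorly, while raising `lambda2` favors samples on which the action classifier is unsure.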
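As shown in FIG. 14, the AI platform provides the target data to the user, and the user labels x* to obtain the labeling result y*. The AI platform updates the existing AI model based on {x*, y*}.
5. Updating the existing AI model offline:
The AI platform may update the existing AI model by using supervised techniques. It may tune the existing AI model based on the target data set, or directly retrain an AI model based on the target data set to serve as the updated AI model. When the existing AI model is updated, the optimization algorithms used to update the parameters include but are not limited to stochastic gradient descent and conjugate gradient descent.
The processes of evaluating and deploying the updated AI model are the same as described above and are not repeated here.
In this way, the technical solution of the embodiments of this application can detect changes in data distribution. On one hand it uses the adaptive capability of the AI model itself to adjust local parameters; on the other hand it interacts with the user to obtain new labeled data and updates the AI model as a whole offline. The model thus keeps adapting to new data distributions, which maintains the inference accuracy of the AI model. In addition, the AI platform can control the AI model to update continuously and automatically, without requiring the user to have algorithm expertise.
In addition, in the embodiments of this application, the AI model can be updated with or without the user's participation. Updating the AI model is not limited by the amount of labeled data or by whether the task changes, and it is applicable to inference data sets of any form, such as batch data (for example, 100 consecutive pictures) and streaming data (for example, video data that is produced gradually).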
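FIG. 15 is a structural diagram of an AI model update apparatus according to an embodiment of this application. The apparatus may be implemented as part or all of an apparatus through software, hardware, or a combination of the two. In some embodiments, the AI model update apparatus may be part or all of the AI platform 100 described above.
The apparatus provided in the embodiments of this application can implement the procedure shown in FIG. 7. The apparatus includes an obtaining module 1510, a determining module 1520, and an update module 1530, where:
the obtaining module 1510 is configured to obtain an inference data set, where the inference data in the inference data set is input to an existing AI model for inference; it may specifically be used to implement the obtaining function of step 701 and perform the implicit steps included in step 701;
the determining module 1520 is configured to determine that the data distribution of the inference data set differs from the data distribution of a training data set, where the training data set is the data set used to train the existing AI model; it may specifically be used to implement the determining function of step 702 and perform the implicit steps included in step 702; and
the update module 1530 is configured to update the existing AI model by using the inference data set, to obtain an updated AI model; it may specifically be used to implement the update function of step 703 and perform the implicit steps included in step 703.
In a possible implementation, the existing AI model is deployed on an inference platform, and the determining module 1520 is further configured to, after the updated AI model is obtained, compare the inference accuracy of the updated AI model with that of the existing AI model and determine that the inference accuracy of the updated AI model is better than that of the existing AI model;
the update module 1530 is further configured to deploy the updated AI model to the inference platform so that the updated AI model replaces the existing AI model in performing inference.
In a possible implementation, as shown in FIG. 16, the apparatus further includes: a display module 1540, configured to display, on a display interface, the inference accuracy of the existing AI model and the inference accuracy of the updated AI model before the updated AI model is deployed to the inference platform to replace the existing AI model in performing inference; and
a receiving module 1550, configured to receive the user's update instruction for the existing AI model.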
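In a possible implementation, the update module 1530 is configured to:
if the difference meets an offline update condition, update the existing AI model offline by using the inference data set;
if the difference does not meet the offline update condition, update the existing AI model online by using the inference data set.
In a possible implementation, the update module 1530 is configured to:
determine a parameter change amount of a target part of the existing AI model by using the difference between the data distribution of the inference data set and the data distribution of the training data set;
determine the parameters of the target part in the updated AI model based on the current parameters of the target part in the existing AI model and the parameter change amount.
In a possible implementation, the update module 1530 is configured to:
construct a target data set from the inference data set;
update the existing AI model by using the target data set.
In a possible implementation, the update module 1530 is configured to:
obtain, from the inference data set, target data that meets a sample condition, and display the target data on a display interface;
obtain the user's labeling result for the target data;
construct the target data set from the target data and the labeling result of the target data.
In a possible implementation, the update module 1530 is configured to:
obtain, from the inference data set, target data that meets the sample condition according to the difference between the data distribution of the inference data set and the data distribution of the training data set, where the target data is suitable for updating the existing AI model.
In a possible implementation, the target data set further includes labeled data that is sampled from and/or generated based on the current labeled data to suit the data distribution of the inference data set, and the current labeled data includes data in the training data set.
In a possible implementation, the target data set includes unlabeled data and labeled data that suit the data distribution of the inference data set;
the update module 1530 is configured to:
optimize the feature extraction part of the existing AI model in an unsupervised manner by using the unlabeled data in the target data set;
update the existing AI model according to the optimized feature extraction part and the labeled data in the target data set.
In a possible implementation, the target data set includes unlabeled data and labeled data that suit the data distribution of the inference data set;
the update module 1530 is configured to:
label the unlabeled data in the target data set by using the existing AI model, to obtain a labeling result of the unlabeled data;
update the existing AI model according to the labeling result of the unlabeled data and the labeled data in the target data set.
In a possible implementation, the update module 1530 is configured to:
obtain a policy for updating the existing AI model according to the data characteristics of the data in the target data set;
update the existing AI model according to the policy.
In a possible implementation, the obtaining module 1510 is further configured to obtain an update period of the AI model input by the user;
the determining module 1520 is configured to:
determine, according to the update period of the AI model, that the data distribution of the inference data set differs from the data distribution of the training data set.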
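The division into modules in the embodiments of this application is illustrative and is merely a division by logical function; other divisions are possible in an actual implementation. In addition, the functional modules in the embodiments of this application may be integrated into one processor, may exist separately as physical units, or two or more modules may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module.
This application further provides a computing device 400 as shown in FIG. 4. The processor 402 in the computing device 400 reads the program and the image set stored in the memory 401 to perform the method performed by the AI platform described above.
Because the modules of the AI platform 100 provided in this application may be deployed in a distributed manner on multiple computers in the same environment or in different environments, this application further provides a computing device as shown in FIG. 17. The computing device includes multiple computers 1700, and each computer 1700 includes a memory 1701, a processor 1702, a communication interface 1703, and a bus 1704. The memory 1701, the processor 1702, and the communication interface 1703 are communicatively connected to one another through the bus 1704.
The memory 1701 may be a read-only memory, a static storage device, a dynamic storage device, or a random access memory. The memory 1701 may store a program. When the program stored in the memory 1701 is executed by the processor 1702, the processor 1702 and the communication interface 1703 are configured to perform part of the method by which the AI platform updates the AI model. The memory may also store image sets. For example, part of the storage resources in the memory 1701 is allocated as an inference data set storage module for storing the inference data set, and part of the storage resources in the memory 1701 is allocated as an AI model storage module for storing the AI model library.
The processor 1702 may be a general-purpose central processing unit, a microprocessor, an application-specific integrated circuit, a graphics processing unit, or one or more integrated circuits.
The communication interface 1703 uses a transceiver module, for example but not limited to a transceiver, to implement communication between the computer 1700 and other devices or communication networks. For example, the inference data set may be obtained through the communication interface 1703.
The bus 1704 may include a path for transferring information between the components of the computer 1700 (for example, the memory 1701, the processor 1702, and the communication interface 1703).
A communication path is established between the computers 1700 through a communication network. Each computer 1700 runs any one or more of the user I/O module 101, the model training module 102, the inference module 103, the AI model storage module 104, or the data storage module 105. Any computer 1700 may be a computer in a cloud data center (for example, a server), a computer in an edge data center, or a terminal computing device.
The descriptions of the procedures corresponding to the foregoing figures each have their own focus. For a part that is not described in detail in one procedure, refer to the related descriptions of the other procedures.
The foregoing embodiments may be implemented wholly or partly through software, hardware, or a combination thereof. When software is used, the embodiments may be implemented wholly or partly in the form of a computer program product. The computer program product that provides the AI platform includes one or more computer instructions for the AI platform. When these computer program instructions are loaded and executed on a computer, the procedures or functions described in FIG. 5, FIG. 11, FIG. 14, or FIG. 15 of the embodiments of this application are produced wholly or partly.
The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (for example, coaxial cable, optical fiber, or twisted pair) or a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium stores the computer program instructions that provide the AI platform. The computer-readable storage medium may be any medium that a computer can access, or a data storage device such as a server or data center that integrates one or more media. The medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, an optical disc), or a semiconductor medium (for example, an SSD).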

Claims (24)

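  1. A method for updating an artificial intelligence (AI) model, wherein the method comprises:
    obtaining an inference data set, wherein inference data in the inference data set is input to an existing AI model for inference;
    determining that a data distribution of the inference data set differs from a data distribution of a training data set, wherein the training data set is a data set used to train the existing AI model; and
    updating the existing AI model by using the inference data set, to obtain an updated AI model.
  2. The method according to claim 1, wherein the existing AI model is deployed on an inference platform, and the method further comprises:
    comparing inference accuracy of the updated AI model with inference accuracy of the existing AI model, and determining that the inference accuracy of the updated AI model is better than the inference accuracy of the existing AI model; and
    deploying the updated AI model to the inference platform, so that the updated AI model replaces the existing AI model in performing inference.
  3. The method according to claim 2, wherein before the deploying the updated AI model to the inference platform, the method further comprises:
    displaying, on a display interface, the inference accuracy of the existing AI model and the inference accuracy of the updated AI model; and
    receiving an update instruction of a user for the existing AI model.
  4. The method according to any one of claims 1 to 3, wherein the updating the existing AI model by using the inference data set comprises:
    if the difference meets an offline update condition, updating the existing AI model offline by using the inference data set; or
    if the difference does not meet the offline update condition, updating the existing AI model online by using the inference data set.
  5. The method according to claim 4, wherein the updating the existing AI model online by using the inference data set comprises:
    determining a parameter change amount of a target part of the existing AI model by using the difference between the data distribution of the inference data set and the data distribution of the training data set; and
    determining parameters of the target part in the updated AI model based on current parameters of the target part in the existing AI model and the parameter change amount.
  6. The method according to any one of claims 1 to 4, wherein the updating the existing AI model by using the inference data set comprises:
    constructing a target data set from the inference data set; and
    updating the existing AI model by using the target data set.
  7. The method according to claim 6, wherein the constructing a target data set from the inference data set comprises:
    obtaining, from the inference data set, target data that meets a sample condition, and displaying the target data on a display interface;
    obtaining a labeling result of a user for the target data; and
    constructing the target data set from the target data and the labeling result of the target data.
  8. The method according to claim 6, wherein the target data set comprises unlabeled data and labeled data that suit the data distribution of the inference data set; and
    the updating the existing AI model by using the target data set comprises:
    optimizing a feature extraction part of the existing AI model in an unsupervised manner by using the unlabeled data in the target data set; and
    updating the existing AI model according to the optimized feature extraction part and the labeled data in the target data set.
  9. The method according to claim 6, wherein the target data set comprises unlabeled data and labeled data that suit the data distribution of the inference data set; and
    the updating the existing AI model by using the target data set comprises:
    labeling the unlabeled data in the target data set by using the existing AI model, to obtain a labeling result of the unlabeled data; and
    updating the existing AI model according to the labeling result of the unlabeled data and the labeled data in the target data set.
  10. The method according to claim 6 or 7, wherein the updating the existing AI model by using the target data set comprises:
    obtaining a policy for updating the existing AI model according to data characteristics of data in the target data set; and
    updating the existing AI model according to the policy.
  11. The method according to any one of claims 1 to 10, wherein the method further comprises:
    obtaining an update period of the AI model input by a user; and
    the determining that a data distribution of the inference data set differs from a data distribution of a training data set comprises:
    determining, according to the update period of the AI model, that the data distribution of the inference data set differs from the data distribution of the training data set.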
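  12. An apparatus for updating an artificial intelligence (AI) model, wherein the apparatus comprises:
    an obtaining module, configured to obtain an inference data set, wherein inference data in the inference data set is input to an existing AI model for inference;
    a determining module, configured to determine that a data distribution of the inference data set differs from a data distribution of a training data set, wherein the training data set is a data set used to train the existing AI model; and
    an update module, configured to update the existing AI model by using the inference data set, to obtain an updated AI model.
  13. The apparatus according to claim 12, wherein the existing AI model is deployed on an inference platform, and the determining module is further configured to compare inference accuracy of the updated AI model with inference accuracy of the existing AI model, and determine that the inference accuracy of the updated AI model is better than the inference accuracy of the existing AI model; and
    the update module is further configured to deploy the updated AI model to the inference platform, so that the updated AI model replaces the existing AI model in performing inference.
  14. The apparatus according to claim 13, wherein the apparatus further comprises: a display module, configured to display, on a display interface, the inference accuracy of the existing AI model and the inference accuracy of the updated AI model before the updated AI model is deployed to the inference platform; and
    a receiving module, configured to receive an update instruction of a user for the existing AI model.
  15. The apparatus according to any one of claims 12 to 14, wherein the update module is configured to:
    if the difference meets an offline update condition, update the existing AI model offline by using the inference data set; or
    if the difference does not meet the offline update condition, update the existing AI model online by using the inference data set.
  16. The apparatus according to claim 15, wherein the update module is configured to:
    determine a parameter change amount of a target part of the existing AI model by using the difference between the data distribution of the inference data set and the data distribution of the training data set; and
    determine parameters of the target part in the updated AI model based on current parameters of the target part in the existing AI model and the parameter change amount.
  17. The apparatus according to any one of claims 12 to 16, wherein the update module is configured to:
    construct a target data set from the inference data set; and
    update the existing AI model by using the target data set.
  18. The apparatus according to claim 17, wherein the update module is configured to:
    obtain, from the inference data set, target data that meets a sample condition, and display the target data on a display interface;
    obtain a labeling result of a user for the target data; and
    construct the target data set from the target data and the labeling result of the target data.
  19. The apparatus according to claim 17, wherein the target data set comprises unlabeled data and labeled data that suit the data distribution of the inference data set; and
    the update module is configured to:
    optimize a feature extraction part of the existing AI model in an unsupervised manner by using the unlabeled data in the target data set; and
    update the existing AI model according to the optimized feature extraction part and the labeled data in the target data set.
  20. The apparatus according to claim 17, wherein the target data set comprises unlabeled data and labeled data that suit the data distribution of the inference data set; and
    the update module is configured to:
    label the unlabeled data in the target data set by using the existing AI model, to obtain a labeling result of the unlabeled data; and
    update the existing AI model according to the labeling result of the unlabeled data and the labeled data in the target data set.
  21. The apparatus according to claim 17 or 18, wherein the update module is configured to:
    obtain a policy for updating the existing AI model according to data characteristics of data in the target data set; and
    update the existing AI model according to the policy.
  22. The apparatus according to any one of claims 12 to 21, wherein the obtaining module is further configured to obtain an update period of the AI model input by a user; and
    the determining module is configured to determine, according to the update period of the AI model, that the data distribution of the inference data set differs from the data distribution of the training data set.
  23. A computing device for updating an artificial intelligence (AI) model, wherein the computing device comprises a processor and a memory, wherein:
    the memory stores computer instructions; and
    the processor executes the computer instructions, so that the computing device performs the method according to any one of claims 1 to 11.
  24. A computer-readable storage medium, wherein the computer-readable storage medium stores computer instructions, and when the computer instructions in the computer-readable storage medium are executed by a computing device, the computing device is enabled to perform the method according to any one of claims 1 to 11, or the computing device is enabled to implement the functions of the apparatus according to any one of claims 12 to 22.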
PCT/CN2021/104537 2020-07-27 2021-07-05 Method, apparatus, and computing device for updating AI model, and storage medium WO2022022233A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP21851426.3A EP4177792A4 (en) 2020-07-27 2021-07-05 AI MODEL UPDATE METHOD AND APPARATUS, COMPUTER DEVICE AND STORAGE MEDIUM
JP2023505761A JP2023535227A (ja) 2020-07-27 2021-07-05 Aiモデルを更新する方法、装置、および計算デバイス、ならびに記憶媒体
US18/158,019 US20230153622A1 (en) 2020-07-27 2023-01-23 Method, Apparatus, and Computing Device for Updating AI Model, and Storage Medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010732241.0 2020-07-27
CN202010732241.0A CN114004328A (zh) 2020-07-27 2020-07-27 Method, apparatus, and computing device for updating AI model, and storage medium

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/158,019 Continuation US20230153622A1 (en) 2020-07-27 2023-01-23 Method, Apparatus, and Computing Device for Updating AI Model, and Storage Medium

Publications (1)

Publication Number Publication Date
WO2022022233A1 true WO2022022233A1 (zh) 2022-02-03

Family

ID=79920228

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/104537 WO2022022233A1 (zh) 2020-07-27 2021-07-05 Method, apparatus, and computing device for updating AI model, and storage medium

Country Status (5)

Country Link
US (1) US20230153622A1 (zh)
EP (1) EP4177792A4 (zh)
JP (1) JP2023535227A (zh)
CN (1) CN114004328A (zh)
WO (1) WO2022022233A1 (zh)

Also Published As

Publication number Publication date
CN114004328A (zh) 2022-02-01
US20230153622A1 (en) 2023-05-18
EP4177792A4 (en) 2024-01-10
JP2023535227A (ja) 2023-08-16
EP4177792A1 (en) 2023-05-10

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21851426

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2023505761

Country of ref document: JP

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 2021851426

Country of ref document: EP

Effective date: 20230201

NENP Non-entry into the national phase

Ref country code: DE