CN113139563B - Optimization method and device for image classification model - Google Patents

Optimization method and device for image classification model

Info

Publication number
CN113139563B
CN113139563B · Application CN202010062888.7A
Authority
CN
China
Prior art keywords
image
images
historical
image set
classification model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010062888.7A
Other languages
Chinese (zh)
Other versions
CN113139563A (en)
Inventor
邢玲
杨天宝
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Intellifusion Technologies Co Ltd
Original Assignee
Shenzhen Intellifusion Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Intellifusion Technologies Co Ltd filed Critical Shenzhen Intellifusion Technologies Co Ltd
Priority to CN202010062888.7A priority Critical patent/CN113139563B/en
Publication of CN113139563A publication Critical patent/CN113139563A/en
Application granted granted Critical
Publication of CN113139563B publication Critical patent/CN113139563B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/52Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application is applicable to the technical field of image processing and provides an optimization method and device for an image classification model. The method comprises the following steps: acquiring a newly added image set, wherein each newly added image in the newly added image set is a monitoring image of a target monitoring area and is marked with an image category; and performing N rounds of training on the image classification model by using the newly added image set and a historical image set of the image classification model. The ith training round of the N rounds comprises: selecting M historical images from first remaining images in the historical image set, wherein M and the number P of newly added images in the newly added image set satisfy a preset ratio, and the first remaining images are the historical images that have not been selected before the current round of training; and iteratively training the image classification model using the M historical images and the newly added image set; wherein P, N, M and i are integers greater than 1.

Description

Optimization method and device for image classification model
Technical Field
The application belongs to the technical field of image processing, and particularly relates to an optimization method and device of an image classification model.
Background
For an image classification model that has completed training, its performance may differ when it is applied to different scenes; in some scenes, the accuracy of the image classification model is low. Therefore, to improve the accuracy of the image classification model in different scenes, the model is often optimized with newly added images from the new scene. One common optimization approach is to optimize the image classification model based on an incremental learning method.
However, the existing incremental learning method typically selects, in proportion to the newly added images, only one fixed subset of historical images from the historical images of the image classification model, and then trains the model on the selected historical images together with the newly added images. As a result, the historical images that are never selected may be forgotten, causing model bias (the performance of the image classification model on the historical image set degrades).
Disclosure of Invention
The embodiment of the application provides an optimization method and device for an image classification model, which can solve the problem of model deviation caused in the process of optimizing the image classification model by using an incremental learning method in the prior art.
In a first aspect, the present application provides a method for optimizing an image classification model, including:
acquiring a newly added image set, wherein each newly added image in the newly added image set is a monitoring image of a target monitoring area and is marked with an image category;
performing N rounds of training on the image classification model by using the newly added image set and a historical image set of the image classification model; the ith training round of the N rounds comprises the following steps:
selecting M historical images from the first remaining images in the historical image set, wherein M and the number P of newly added images in the newly added image set satisfy a preset ratio, and the first remaining images are the historical images that have not been selected before the current round of training;
Performing iterative training on the image classification model by using the M historical images and the newly added image set; wherein P, N, M and i are integers greater than 1.
With this optimization method of the image classification model, a fresh selection of historical images is drawn from the historical image set in every round of training, and the images selected in each round are different. Compared with the existing incremental learning method, the image classification model can therefore learn from more historical images, which reduces the possibility of model bias during optimization.
In a second aspect, the present application provides an optimization apparatus for an image classification model, including:
an acquisition unit, configured to acquire a newly added image set, wherein each newly added image in the newly added image set is a monitoring image of a target monitoring area and is marked with an image category;
a training unit, configured to perform N rounds of training on the image classification model by using the newly added image set and a historical image set of the image classification model; the ith training round of the N rounds comprises the following steps:
selecting M historical images from the first remaining images in the historical image set, wherein M and the number of newly added images in the newly added image set satisfy a preset ratio, and the first remaining images are the historical images that have not been selected before the current round of training;
performing iterative training on the image classification model by using the M historical images and the newly added image set; wherein N, M and i are integers greater than 1.
In a third aspect, an embodiment of the present invention provides a terminal device, including a processor, a memory, and a computer program stored in the memory and executable on the processor, where the processor executes the computer program to implement a method for optimizing an image classification model according to the first aspect or any of the alternative modes of the first aspect.
In a fourth aspect, an embodiment of the present invention provides a computer readable storage medium, where a computer program is stored, where the computer program, when executed by a processor, implements a method for optimizing an image classification model according to the first aspect or any of the alternatives of the first aspect.
In a fifth aspect, an embodiment of the present invention provides a computer program product, which when run on a terminal device, causes the terminal device to perform the method for optimizing an image classification model according to the first aspect or any of the alternatives of the first aspect.
It will be appreciated that the advantages of the second to fifth aspects may be found in the relevant description of the first aspect, and are not described here again.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments or the description of the prior art will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of one embodiment of a method for optimizing an image classification model according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a data acquisition system according to the present application;
FIG. 3 is a flow chart of one embodiment of a method for optimizing an image classification model according to another embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an optimizing device for an image classification model according to the present application;
fig. 5 is a schematic structural diagram of a terminal device provided by the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth such as the particular system architecture, techniques, etc., in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It should be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. The term "and/or" refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations. The terms "first," "second," "third," and the like are used merely to distinguish between descriptions and are not to be construed as indicating or implying relative importance.
Reference in the specification to "one embodiment" or "some embodiments" or the like means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," and the like in the specification are not necessarily all referring to the same embodiment, but mean "one or more but not all embodiments" unless expressly specified otherwise. The terms "comprising," "including," "having," and variations thereof mean "including but not limited to," unless expressly specified otherwise.
The embodiment of the application provides an optimization method for a target network (here, the image classification model), which, based on the new incremental learning method provided by the application, can optimize the target network while ensuring that the image classification model does not suffer from model bias.
An exemplary description is given below of an optimization method of an image classification model according to the present application, with reference to specific embodiments.
Referring to fig. 1, a flowchart of an embodiment of a method for optimizing an image classification model is provided in an embodiment of the present application. The process of optimizing the image classification model based on the new incremental learning method provided by the application is mainly described. The execution subject of the optimization method of the image classification model in this embodiment is a terminal device. Referring to fig. 1, the optimization method of the image classification model provided by the application includes:
S101, acquiring a newly added image set, wherein each newly added image in the newly added image set is a monitoring image of a target monitoring area and is marked with an image category.
The newly added images in the newly added image set may be monitoring images collected by the terminal device in the target monitoring area.
For example, the target monitoring area may be a mall, a park, an office building, etc.; the newly added images may be monitoring images collected within the monitoring area, and each monitoring image is labeled (i.e., the category of the monitoring image is marked).
S102, performing N rounds of training (i.e., epoch = N) on the image classification model to be optimized by using the newly added image set and the historical image set of the image classification model. The ith training round of the N rounds comprises the following steps: selecting M historical images from the first remaining images in the historical image set, wherein M and the number P of newly added images in the newly added image set satisfy a preset ratio, and the first remaining images are the historical images that have not been selected before the current round of training; and iteratively training the image classification model using the M historical images and the newly added image set; wherein P, N, M and i are integers greater than 1.
The image classification model classifies and identifies the images, and the historical images in the historical image set are historical training data of the image classification model.
Taking a park as an example of the target monitoring area, the image classification model is deployed in a monitoring terminal in the park and is used to classify and recognize monitoring images of the park and to identify abnormal images, so that abnormal events occurring in the park can be monitored. To improve the accuracy of the image classification model when classifying and recognizing monitoring images of the park, labeled monitoring images from the park are used as the newly added image set, and the image classification model is trained based on steps S101-S102 so that it adjusts its network parameters according to the data characteristics of the newly added image set, thereby improving its classification and recognition accuracy on the monitoring images of the park.
The preset ratio may be the ratio of the number P of newly added images to the number M of historical images. For example, the preset ratio may be 0.5:0.5, 0.9:0.1, or 0.6:0.4; the application is not limited in this regard.
The terminal device may determine the number M of historical images based on the number P of newly added images and the preset ratio.
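To make the ratio concrete, the following minimal Python sketch (with hypothetical names, not from the patent text) derives M from P and a preset ratio of newly added to historical images; with the 0.9:0.1 example above and P = 900, it gives M = 100.

# Minimal sketch: deriving the number M of historical images from the number P
# of newly added images and a preset ratio (new:historical). Function name and
# default values are illustrative assumptions.
def history_count(P, new_share=0.9, hist_share=0.1):
    return round(P * hist_share / new_share)

print(history_count(900))  # -> 100 historical images paired with 900 new images per round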
It can be understood that, by iteratively training the image classification model with the newly added images and the historical images of the target monitoring area, the optimized image classification model is obtained once training converges, which improves the accuracy of the image classification model in the target monitoring area.
It should be noted that S102 embodies the new incremental learning method provided by the present application: one selection of historical images is drawn from the historical image set in each training round, and the images selected in each round are different. Thus, when N is greater than 1, compared with the existing incremental learning method, the incremental learning method provided by the application allows the image classification model to learn from more historical images, thereby reducing the possibility of model bias during optimization.
In one possible scenario, assuming that the historical image set includes X historical images, then when N is greater than or equal to X/M, the image classification model can learn all the historical images, thereby avoiding the problem of model bias during optimization.
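As an illustration only (not the patent's reference implementation), the round-wise selection of S102 can be sketched in Python as follows; history_images and new_images stand for lists of labeled samples, M is assumed to come from the preset ratio as illustrated earlier, and train_one_round is a hypothetical helper sketched after the K-batch discussion below.

import random

def optimize_model(model, new_images, history_images, M, epochs=None):
    # Train for N rounds; each round pairs all P newly added images with M
    # historical images that were not selected in any earlier round.
    history = list(history_images)
    random.shuffle(history)                     # optional pre-shuffle, described in the next paragraph
    N = epochs if epochs is not None else -(-len(history) // M)   # default N = ceil(X / M)
    cursor = 0                                  # start of the "first remaining images"
    for _ in range(N):
        selected = history[cursor:cursor + M]   # may hold fewer than M images in the last round
        cursor += M
        if cursor >= len(history):              # pool exhausted: start over
            cursor = 0
        train_one_round(model, selected, new_images)
    return model

With the default N = ceil(X/M), every historical image is selected once across the N rounds, matching the condition N >= X/M described above.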
In one example, before the terminal device performs the N rounds of training in S102, the arrangement order of the images in the newly added image set and the historical image set may be randomly shuffled. This reduces the probability of selecting newly added images and historical images with similar data characteristics within one training round, and further reduces the probability of model bias during optimization.
For example, when the terminal device selects M historical images from the first remaining images in the historical image set, the M historical images may be selected in the image arrangement order of the first remaining images, i.e., M historical images are taken in sequence starting from the first historical image among the first remaining images, which reduces the complexity of image selection and improves efficiency.
Alternatively, the M historical images may be randomly selected from the first remaining images.
For example, in each training round, the collection manner of the M historical images and the newly added image set may be as shown in fig. 2, where in each training round, the collected M historical images are different.
Optionally, in each training round, when the terminal device iteratively trains the image classification model using the M historical images and the newly added image set, the M historical images and the P newly added images are divided into K batches, and K iterations of training are performed on the image classification model using the M historical images and the newly added image set.
Illustratively, the jth iteration of the K iterations may include: selecting M/K historical images from second remaining images among the M historical images, wherein the second remaining images are the historical images that have not been selected before the current iteration, and K and j are integers greater than 1; selecting P/K newly added images from the newly added image set in sequence; and training the image classification model using the M/K historical images and the P/K newly added images.
Wherein the ratio between P/K and M/K also satisfies the preset ratio.
In the embodiment of the application, the terminal device selects a different set of M/K historical images in each iteration of each round, so that within each round all of the selected M historical images participate in training the image classification model. In contrast with the prior art, in which the historical images participating in each iteration are randomly drawn from the historical images, this avoids the problem that some historical images are selected multiple times while others are missed, further reducing the possibility of model bias during optimization.
For example, when the terminal device selects M/K historical images from the second remaining images among the M historical images, the M/K historical images may be selected according to the arrangement order of the M historical images, i.e., M/K historical images are taken in sequence starting from the first historical image among the second remaining images.
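The optional K-batch split inside one round can be sketched in the same spirit; train_batch, standing for one or more gradient updates on a mixed batch, is a hypothetical placeholder, and the function below is the train_one_round helper referred to in the earlier sketch.

def train_one_round(model, selected_history, new_images, K=10):
    # One training round: K iterations, the j-th using M/K historical images
    # (the "second remaining images", taken in arrangement order) and P/K
    # newly added images taken in sequence.
    m_step = len(selected_history) // K
    p_step = len(new_images) // K
    for j in range(K):
        hist_batch = selected_history[j * m_step:(j + 1) * m_step]
        new_batch = new_images[j * p_step:(j + 1) * p_step]
        train_batch(model, hist_batch + new_batch)   # hypothetical gradient step(s)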
Referring to fig. 3, a flowchart of one embodiment of another method for optimizing an image classification model is provided for an embodiment of the present application. The process of optimizing the image classification model by combining active learning and incremental learning is mainly described, wherein the incremental learning is a novel incremental learning method provided by the application.
As shown in fig. 3, the method includes:
s301, acquiring an original image set of the target monitoring area.
The original image set consists of unlabeled monitoring images collected in the target monitoring area.
S302, selecting the images with the highest uncertainty from the original image set based on an active learning method, to form the newly added image set.
Because the number of monitoring images in the original image set is very large, using all of them to optimize and train the image classification model would require a large amount of manpower and material resources for manual labeling to obtain the category of each monitoring image.
Therefore, in the embodiment of the application, a subset of images is selected from the original image set based on the active learning method, and these images are manually labeled to obtain monitoring images marked with image categories, which form the newly added image set.
The active learning method is used for picking out a monitoring image with high uncertainty based on the uncertainty or diversity of data.
For example, the terminal device may measure the uncertainty of a monitoring image by its entropy value: the greater the entropy, the greater the uncertainty, and the more the monitoring image contributes to optimizing the image classification model. By way of example, one possible way of calculating the entropy value is

H(x_i) = − Σ_{j=1..c} P(C_j | x_i) · log P(C_j | x_i)

where H(x_i) represents the entropy value of the i-th monitoring image x_i in the original image set; j represents the category index, j = 1, 2, …, c; C_j denotes the j-th category; and P(C_j | x_i) denotes the probability that the monitoring image x_i belongs to category C_j.
It can be understood that the active learning method greatly reduces the number of monitoring images that need to be labeled and the amount of computation in the subsequent optimization of the image classification model, while selecting the monitoring images that contribute most to optimizing the model, thereby producing a newly added image set with higher optimization efficiency for the image classification model.
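A minimal sketch of the entropy-based selection in S302 is given below; it assumes a hypothetical model.predict_proba(x) that returns the probability of each of the c categories for an unlabeled monitoring image.

import numpy as np

def select_uncertain_images(model, raw_images, budget):
    # Return the `budget` unlabeled images with the highest prediction entropy H(x_i).
    probs = np.array([model.predict_proba(x) for x in raw_images])   # shape (n, c)
    entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)         # small epsilon avoids log(0)
    top = np.argsort(-entropy)[:budget]                              # most uncertain first
    return [raw_images[i] for i in top]                              # these are then labeled manually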
S303, performing N rounds of training on the image classification model by using the newly added image set and the historical image set of the image classification model. The ith training round of the N rounds comprises the following steps: selecting M historical images from the first remaining images in the historical image set, wherein M and the number P of newly added images in the newly added image set satisfy a preset ratio, and the first remaining images are the historical images that have not been selected before the current round of training; and iteratively training the image classification model using the M historical images and the newly added image set.
After a newly added image set with a smaller data volume and high optimization efficiency is obtained by the active learning method, the incremental learning method improves the iteration efficiency of training the target network model, and thus the overall optimization efficiency of the image classification model. In this example, the incremental learning method provided by the application ensures that the image classification model learns from more historical images during optimization, reducing the risk of model bias in the optimization process.
Corresponding to the method for optimizing the image classification model described in the above embodiments, fig. 4 shows a block diagram of a device for optimizing the image classification model according to an embodiment of the present invention, and for convenience of explanation, only a portion related to the embodiment of the present invention is shown.
Referring to fig. 4, the optimizing apparatus of an image classification model includes:
an acquisition unit 401, configured to acquire a newly added image set, wherein each newly added image in the newly added image set is a monitoring image of a target monitoring area and is marked with an image category.
A training unit 402, configured to perform N rounds of training on the image classification model by using the newly added image set and a historical image set of the image classification model; the ith training round of the N rounds comprises the following steps:
selecting M historical images from the first remaining images in the historical image set, wherein M and the number P of newly added images in the newly added image set satisfy a preset ratio, and the first remaining images are the historical images that have not been selected before the current round of training;
Performing iterative training on the image classification model by using the M historical images and the newly added image set; wherein P, N, M and i are integers greater than 1.
Optionally, before the training unit 402 performs N rounds of training on the image classification model by using the newly added image set and the historical image set of the image classification model, the method further includes:
And randomly adjusting the image arrangement sequence of the newly added image set and the historical image set.
Optionally, the training unit 402 selects M historical images from the first remaining images in the historical image set by: selecting the M historical images according to the image arrangement order of the first remaining images in the historical image set.
Optionally, the training unit 402 performs iterative training on the image classification model using the M historical images and the newly added image set by: performing K iterations of training on the image classification model using the M historical images and the newly added image set; wherein the jth iteration of the K iterations includes: selecting M/K historical images from second remaining images among the M historical images, wherein the second remaining images are the historical images that have not been selected before the current iteration, and K and j are integers greater than 1; selecting P/K newly added images from the newly added image set in sequence; and training the image classification model using the M/K historical images and the P/K newly added images.
Optionally, the training unit 402 selects M/K historical images from the second remaining images among the M historical images by: selecting the M/K historical images from the second remaining images according to the arrangement order of the M historical images.
Optionally, the acquisition unit 401 acquires the newly added image set of the target monitoring area by: acquiring an original image set of the target monitoring area; and selecting the data with the highest uncertainty from the original image set based on an active learning method to form the newly added image set.
In the embodiment of the invention, the optimization apparatus for the image classification model may be a terminal device, a chip in the terminal device, or a functional module integrated in the terminal device. The chip or functional module may be located in the terminal device and control the terminal device to implement the optimization method for the image classification model provided by the embodiment of the invention.
Referring to fig. 5, a terminal device provided in an embodiment of the present invention includes: at least one processor 50 (only one shown in fig. 5), a memory 51 and a computer program 52 stored in the memory 51 and executable on the at least one processor 50, the processor 50 implementing the steps in the above-described embodiments of the method for optimizing an image classification model when executing the computer program 52.
The processor 50 may be a central processing unit (Central Processing Unit, CPU); the processor 50 may also be another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 51 may, in some embodiments, be an internal storage unit of the terminal device, such as a hard disk or a memory of the terminal device. The memory 51 may, in other embodiments, also be an external storage device of the terminal device, such as a plug-in hard disk provided on the terminal device, a smart media card (Smart Media Card, SMC), a Secure Digital (SD) card, a flash card (Flash Card), etc. Further, the memory 51 may also include both an internal storage unit of the terminal device and an external storage device. The memory 51 is used to store an operating system, application programs, a boot loader (BootLoader), data, and other programs, such as the program code of the computer program 52. The memory 51 may also be used to temporarily store data that has been output or is to be output.
It will be appreciated by those skilled in the art that fig. 5 is merely an example of a terminal device and is not limiting of the terminal device, and may include more or fewer components than shown, or may combine certain components, or different components, such as may also include input and output devices, network access devices, scanners, etc.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions. The functional units and modules in the embodiment may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit, where the integrated units may be implemented in a form of hardware or a form of a software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the embodiments of the present invention. The specific working process of the units and modules in the above system may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.
In the foregoing embodiments, the description of each embodiment has its own emphasis; for parts that are not described or detailed in a particular embodiment, reference may be made to the related descriptions of other embodiments.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other manners. For example, the apparatus/terminal device embodiments described above are merely illustrative, e.g., the division of the modules or units is merely a logical function division, and there may be additional divisions in actual implementation, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection via interfaces, devices or units, which may be in electrical, mechanical or other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated modules/units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the present invention may implement all or part of the flow of the method of the above embodiment, or may be implemented by a computer program to instruct related hardware, where the computer program may be stored in a computer readable storage medium, and when the computer program is executed by a processor, the computer program may implement the steps of each of the method embodiments described above.
Accordingly, an embodiment of the present invention provides a computer readable storage medium storing a computer program, which when executed by a processor, implements a method for optimizing an image classification model as provided by the embodiment of the present invention.
The embodiment of the invention also provides a computer program product, which enables the terminal equipment to execute the optimization method of the image classification model provided by the embodiment of the invention when the computer program product runs on the terminal equipment.
The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content contained in the computer readable medium may be appropriately adjusted according to the requirements of legislation and patent practice in the relevant jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, the computer readable medium does not include electrical carrier signals and telecommunication signals.
The above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and are intended to be included in the scope of the present invention.

Claims (9)

1. A method for optimizing an image classification model, comprising:
acquiring a newly added image set, wherein each newly added image in the newly added image set is a monitoring image of a target monitoring area and is marked with an image category;
performing N rounds of training on the image classification model to be optimized by using the newly added image set and a historical image set of the image classification model; wherein the ith training round of the N rounds comprises the following steps:
selecting M historical images from first remaining images in the historical image set, wherein the number M of the historical images and the number P of the newly added images in the newly added image set satisfy a preset ratio, and the first remaining images are historical images which have not been selected before the current round of training; wherein P, N, M and i are integers greater than 1;
performing K times of iterative training on the image classification model by utilizing the M historical images and the newly added image set; wherein the jth training process of the K times comprises:
Selecting M/K historical images from second remaining images in the M historical images, wherein the second remaining images are historical images which have not been selected before the current iteration, and K and j are integers greater than 1;
Selecting P/K newly added images from the newly added image set in sequence;
and training the image classification model by using the M/K historical images and the P/K newly added images.
2. The method of claim 1, wherein prior to performing N rounds of training on the image classification model using the newly added image set and the historical image set of the image classification model, the method further comprises:
And randomly adjusting the image arrangement sequence of the newly added image set and the historical image set.
3. The method of claim 2, wherein selecting M historical images from the first remaining images in the set of historical images comprises:
And selecting the M historical images according to the image arrangement order of the first remaining images in the historical image set.
4. The method of claim 1, wherein selecting M/K historical images from a second remaining image of the M historical images comprises:
and selecting the M/K historical images from the second remaining images according to the arrangement order of the M historical images.
5. The method of claim 1, wherein the acquiring the new image set comprises:
Acquiring an original image set of the target monitoring area;
and selecting the images with the highest uncertainty from the original image set based on an active learning method to form the newly added image set.
6. An optimization apparatus for an image classification model, comprising:
an acquisition unit, configured to acquire a newly added image set, wherein each newly added image in the newly added image set is a monitoring image of a target monitoring area and is marked with an image category;
The training unit is used for performing N-round training on the image classification model by utilizing the newly added image set and the historical image set of the image classification model; the ith training process in the N rounds comprises the following steps:
selecting M historical images from first remaining images in the historical image set, wherein the number M of the historical images and the number of the newly added images in the newly added image set satisfy a preset ratio, and the first remaining images are historical images which have not been selected before the current round of training; wherein N, M and i are integers greater than 1;
performing K times of iterative training on the image classification model by utilizing the M historical images and the newly added image set; wherein the jth training process of the K times comprises:
Selecting M/K historical images from second remaining images in the M historical images, wherein the second remaining images are historical images which have not been selected before the current iteration, and K and j are integers greater than 1;
Selecting P/K newly added images from the newly added image set in sequence;
and training the image classification model by using the M/K historical images and the P/K newly added images.
7. The apparatus according to claim 6, wherein the acquisition unit acquires a newly added image set, comprising:
Acquiring an original image set of the target monitoring area;
and selecting the data with the highest uncertainty from the original image set based on an active learning method to form the newly added image set.
8. A terminal device, comprising a processor, a memory, and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the method according to any one of claims 1 to 5 when executing the computer program.
9. A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the method according to any one of claims 1 to 5.
CN202010062888.7A 2020-01-19 2020-01-19 Optimization method and device for image classification model Active CN113139563B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010062888.7A CN113139563B (en) 2020-01-19 2020-01-19 Optimization method and device for image classification model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010062888.7A CN113139563B (en) 2020-01-19 2020-01-19 Optimization method and device for image classification model

Publications (2)

Publication Number Publication Date
CN113139563A CN113139563A (en) 2021-07-20
CN113139563B true CN113139563B (en) 2024-05-03

Family

ID=76809900

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010062888.7A Active CN113139563B (en) 2020-01-19 2020-01-19 Optimization method and device for image classification model

Country Status (1)

Country Link
CN (1) CN113139563B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117576519B (en) * 2024-01-15 2024-04-09 浙江航天润博测控技术有限公司 Image recognition model training optimization method and device, electronic equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018068742A1 (en) * 2016-10-14 2018-04-19 腾讯科技(深圳)有限公司 Data processing method and device
CN109101946A (en) * 2018-08-27 2018-12-28 Oppo广东移动通信有限公司 A kind of extracting method of characteristics of image, terminal device and storage medium

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018068742A1 (en) * 2016-10-14 2018-04-19 腾讯科技(深圳)有限公司 Data processing method and device
CN109101946A (en) * 2018-08-27 2018-12-28 Oppo广东移动通信有限公司 A kind of extracting method of characteristics of image, terminal device and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on Incremental Learning Based on the LDA Image Scene Classification Method; Tang Yingjun; Journal of Chinese Computer Systems (05); full text *

Also Published As

Publication number Publication date
CN113139563A (en) 2021-07-20


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant