CN114120364A - Image processing method, image classification method, device, medium, and electronic device - Google Patents

Image processing method, image classification method, device, medium, and electronic device

Info

Publication number
CN114120364A
CN114120364A (application CN202111395579.2A)
Authority
CN
China
Prior art keywords
image
classification
processed
noise
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111395579.2A
Other languages
Chinese (zh)
Inventor
陈维识
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing ByteDance Network Technology Co Ltd
Original Assignee
Beijing ByteDance Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing ByteDance Network Technology Co Ltd filed Critical Beijing ByteDance Network Technology Co Ltd
Priority to CN202111395579.2A priority Critical patent/CN114120364A/en
Publication of CN114120364A publication Critical patent/CN114120364A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/24: Classification techniques
    • G06F 18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2415: Classification techniques relating to the classification model, based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 63/00: Network architectures or network communication protocols for network security
    • H04L 63/14: Network security for detecting or protecting against malicious traffic
    • H04L 63/1408: Detecting or protecting against malicious traffic by monitoring network traffic
    • H04L 63/1416: Event detection, e.g. attack signature detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computing Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure relates to an image processing method, an image classification method, an apparatus, a medium, and an electronic device. The image processing method includes: receiving an image to be processed; determining a noise image corresponding to the image to be processed according to the image to be processed and an image processing model, where the image processing model includes an image feature extraction sub-model for obtaining a feature vector of the image to be processed and a noise information sub-model for determining the noise image of the image to be processed from that feature vector, the image processing model further includes a classification sub-model during training, and the loss of the image processing model is determined based on the noise vector added by the noise information sub-model and the classification probability distribution output by the classification sub-model; and generating a target image corresponding to the image to be processed according to the image to be processed and the noise image, where the classification of the image to be processed and the classification of the target image are the same.

Description

Image processing method, image classification method, device, medium, and electronic device
Technical Field
The present disclosure relates to the field of image processing, and in particular, to an image processing method, an image classification method, an apparatus, a medium, and an electronic device.
Background
Network data security and risk control are critical issues for current content platforms, which must be able to defend against malicious attacks while quickly identifying potential risks. With the development of machine learning and deep learning technologies and the growth of computing power, existing attack patterns have become increasingly difficult to identify.
In the prior art, when a content platform automatically reviews images, an image classification model is usually trained based on machine learning. Images on the platform are then classified by this model to identify illegal or sensitive content, and the offending content is subjected to near-real-time operations such as suppression, deletion, or traffic restriction, so that users are prevented from encountering it.
At present, various hacking and attack techniques can target this content review mechanism: they attack the image classification model while leaving the overall appearance of the image essentially unchanged, so that a human reviewer cannot distinguish the image before and after modification, yet the classification model fails to identify the modified image. Such attacks bypass the content platform's review mechanism and pose a serious threat to its security.
Disclosure of Invention
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
In a first aspect, the present disclosure provides a method of image processing, the method comprising:
receiving an image to be processed;
determining a noise image corresponding to the image to be processed according to the image to be processed and an image processing model, wherein the image processing model comprises an image feature extraction submodel and a noise information submodel, the image feature extraction submodel is used for obtaining a feature vector of the image to be processed, the noise information submodel is used for determining the noise image of the image to be processed according to the feature vector of the image to be processed, the image processing model further comprises a classification submodel in a training process, and the loss of the image processing model is determined based on the noise vector added by the noise information submodel and the classification probability distribution output by the classification submodel;
and generating a target image corresponding to the image to be processed according to the image to be processed and the noise image, wherein the classification of the image to be processed and the classification of the target image are the same.
In a second aspect, the present disclosure provides an image classification method, the method comprising:
receiving an image to be classified;
and determining a target classification corresponding to the image to be classified according to the image to be classified and the image classification model, wherein training data of the image classification model comprises a target training image and an annotation classification corresponding to the target training image, and the target training image is an image obtained based on the image processing method of the first aspect.
In a third aspect, the present disclosure provides an image processing apparatus, the apparatus comprising:
the first receiving module is used for receiving an image to be processed;
the image processing system comprises a first determining module, a second determining module and a third determining module, wherein the first determining module is used for determining a noise image corresponding to the image to be processed according to the image to be processed and an image processing model, the image processing model comprises an image feature extraction submodel and a noise information submodel, the image feature extraction submodel is used for obtaining a feature vector of the image to be processed, the noise information submodel is used for determining the noise image of the image to be processed according to the feature vector of the image to be processed, the image processing model further comprises a classification submodel in a training process, and the loss of the image processing model is determined based on the noise vector added by the noise information submodel and the classification probability distribution output by the classification submodel;
and the generation module is used for generating a target image corresponding to the image to be processed according to the image to be processed and the noise image, wherein the classification of the image to be processed and the classification of the target image are the same.
In a fourth aspect, the present disclosure provides an image classification apparatus, the apparatus comprising:
the second receiving module is used for receiving the images to be classified;
a second determining module, configured to determine, according to the image to be classified and the image classification model, a target classification corresponding to the image to be classified, where training data of the image classification model includes a target training image and an annotation classification corresponding to the target training image, and the target training image is an image obtained based on the image processing method of the first aspect.
In a fifth aspect, the present disclosure provides a computer readable medium having stored thereon a computer program which, when executed by a processing apparatus, performs the steps of the method of the first or second aspect.
In a sixth aspect, the present disclosure provides an electronic device comprising:
a storage device having a computer program stored thereon;
processing means for executing the computer program in the storage means to implement the steps of the method of the first or second aspect.
In the above technical solution, an image to be processed is received, a noise image corresponding to it is determined according to the image to be processed and an image processing model, and a target image is generated from the image to be processed and the noise image, where the classification of the image to be processed and the classification of the target image are the same. With this scheme, partial noise can be added to the image to be processed to obtain the target image; that is, the target image is generated by simulating an attack. Training an image classification model on such target images improves, to a certain extent, its accuracy and robustness and strengthens its resistance to attacks, preventing maliciously attacked images from bypassing the content platform's review mechanism, providing reliable support for real-time and accurate review of images and video content on the platform, and improving the user experience.
Additional features and advantages of the disclosure will be set forth in the detailed description which follows.
Drawings
The above and other features, advantages and aspects of various embodiments of the present disclosure will become more apparent by referring to the following detailed description when taken in conjunction with the accompanying drawings. Throughout the drawings, the same or similar reference numbers refer to the same or similar elements. It should be understood that the drawings are schematic and that elements and features are not necessarily drawn to scale. In the drawings:
FIG. 1 is a flow diagram of an image processing method provided in accordance with one embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a sliced sub-image provided by the present disclosure;
FIG. 3 is a schematic diagram of a noise vector of an output provided by the present disclosure;
FIG. 4 is a schematic illustration of a stitched noisy image provided by the present disclosure;
FIG. 5 is a block diagram of an image processing apparatus provided in accordance with one embodiment of the present disclosure;
FIG. 6 shows a schematic structural diagram of an electronic device suitable for use in implementing embodiments of the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein, but rather are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the disclosure are for illustration purposes only and are not intended to limit the scope of the disclosure.
It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order, and/or performed in parallel. Moreover, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "include" and variations thereof as used herein are open-ended, i.e., "including but not limited to". The term "based on" is "based, at least in part, on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions for other terms will be given in the following description.
It should be noted that the terms "first", "second", and the like in the present disclosure are only used for distinguishing different devices, modules or units, and are not used for limiting the order or interdependence relationship of the functions performed by the devices, modules or units.
It is noted that references to "a", "an", and "the" modifications in this disclosure are intended to be illustrative rather than limiting, and that those skilled in the art will recognize that "one or more" may be used unless the context clearly dictates otherwise.
The names of messages or information exchanged between devices in the embodiments of the present disclosure are for illustrative purposes only, and are not intended to limit the scope of the messages or information.
Fig. 1 is a flowchart of an image processing method according to an embodiment of the present disclosure, and as shown in fig. 1, the method may include:
in step 11, an image to be processed is received. The image to be processed may be an image used for training an image classification model in the related art, and the part of the image to be processed may be manually or automatically labeled in advance, so as to obtain a labeled classification corresponding to the image to be processed.
In step 12, a noise image corresponding to the image to be processed is determined according to the image to be processed and an image processing model, wherein the image processing model includes an image feature extraction submodel and a noise information submodel, the image feature extraction submodel is used for obtaining a feature vector of the image to be processed, the noise information submodel is used for determining the noise image of the image to be processed according to the feature vector of the image to be processed, the image processing model further includes a classification submodel in a training process, and loss of the image processing model is determined based on a noise vector added by the noise information submodel and a classification probability distribution output by the classification submodel.
In this step, the image to be processed may be processed in a manner that simulates an attack on it, so as to obtain the corresponding noise image. For example, in this embodiment, the image to be processed may be processed by simulating an attack that adds noise; during training of the image processing model, multi-objective learning is performed by combining the noise vector and the classification probability distribution, so as to improve the accuracy of the image processing model and its adaptability to the application scenario.
In step 13, a target image corresponding to the image to be processed is generated according to the image to be processed and the noise image, wherein the classification of the image to be processed and the classification of the target image are the same, and the target image can be used for training an image classification model.
In this embodiment, by processing the image to be processed, noise may be added to it to obtain the target image, whose classification is the same as that of the image to be processed. Training the image classification model with target images containing noise can thus effectively improve the model's resistance to attacks.
In the above technical solution, an image to be processed is received, a noise image corresponding to it is determined according to the image to be processed and an image processing model, and a target image is generated from the image to be processed and the noise image, where the classification of the image to be processed and the classification of the target image are the same. With this scheme, partial noise can be added to the image to be processed to obtain the target image; that is, the target image is generated by simulating an attack. Training an image classification model on such target images improves, to a certain extent, its accuracy and robustness and strengthens its resistance to attacks, preventing maliciously attacked images from bypassing the content platform's review mechanism, providing reliable support for real-time and accurate review of images and video content on the platform, and improving the user experience.
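Steps 11 to 13 can be sketched as follows. This is a minimal illustration, not the disclosure's actual implementation: `generate_target_image`, the nested-list image representation, and the toy constant-noise model are all hypothetical names and stand-ins for the trained noise-predicting model.

```python
# Sketch of the receive / predict-noise / superimpose pipeline (steps 11-13).
# `image_processing_model` is assumed to map an image to a same-sized noise image.
def generate_target_image(image, image_processing_model,
                          clip=lambda v: max(0, min(255, v))):
    """Add model-predicted noise to the image; the classification is unchanged."""
    noise = image_processing_model(image)  # step 12: noise image, same shape
    # Step 13: superimpose pixel-wise and clamp back into the valid pixel range.
    return [[clip(p + q) for p, q in zip(row_i, row_n)]
            for row_i, row_n in zip(image, noise)]

# Toy stand-in model that "predicts" constant noise of 1 for every pixel.
toy_model = lambda img: [[1 for _ in row] for row in img]
target = generate_target_image([[10, 254], [0, 128]], toy_model)
```

The target image produced this way would then be paired with the original image's annotation classification as a training sample for the image classification model.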
In one possible embodiment, the image processing model is determined by:
training samples are obtained, wherein each training sample comprises a sample image and an annotation class corresponding to the sample image.
With user authorization, some images may be obtained from the corresponding content platform as sample images and labeled to obtain the annotation classification corresponding to each sample image. For example, labeling may be performed by a method commonly used in the art, such as manual labeling or automatic labeling by a model, which is not limited by the present disclosure.
And determining a feature vector sequence corresponding to the sample image through an image feature sub-model, wherein the feature vector sequence corresponding to the sample image comprises feature vectors of a plurality of sub-images of the sample image.
And inputting the characteristic vector sequence into the noise information submodel to obtain a noise vector corresponding to each sub-image.
For example, the noise information sub-model may be a deconvolution model, so that the features of each sub-image in the input feature vector sequence can be converted into noise vectors matching the size of that sub-image. The deconvolution model may use dilated (atrous) convolution, or ordinary convolution followed by upsampling; the specific implementation of such convolution processing is well known in the art and is not described here again.
And classifying based on the fusion vector obtained by fusing the feature vector sequence and the noise vector to obtain the classification probability distribution of the fusion vector.
The fusion vector may be obtained by superimposing, for each sub-image, its vector in the feature vector sequence with the feature values at the corresponding positions of its noise vector. The fusion vector thus contains both the features of the sample image and the noise features, realizing the simulated attack on the sample image. Softmax processing may then be performed to obtain the classification probability distribution over the preset output classifications corresponding to the fusion vector.
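A minimal sketch of the fusion and classification step. The disclosure does not fix the structure of the classification head, so this sketch assumes mean pooling over the fused patch vectors followed by a linear layer and softmax; all function and parameter names are illustrative.

```python
import math

def fuse_and_classify(feature_vecs, noise_vecs, class_weights):
    """Superimpose each feature vector with its noise vector, then classify.

    feature_vecs / noise_vecs: one vector per sub-image (same shapes).
    class_weights: one weight row per output classification (assumed linear head).
    Returns the softmax classification probability distribution.
    """
    # Element-wise superposition of feature and noise values (the fusion vector).
    fused = [[f + n for f, n in zip(fv, nv)]
             for fv, nv in zip(feature_vecs, noise_vecs)]
    # Mean-pool the fused patch vectors into one image-level vector (assumption).
    pooled = [sum(col) / len(fused) for col in zip(*fused)]
    # Linear scores per classification, then a numerically stable softmax.
    logits = [sum(w * x for w, x in zip(row, pooled)) for row in class_weights]
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]
```

With zero noise and an identity-like head, the distribution simply reflects the pooled features; the noise vectors shift the logits, which is what makes the simulated attack visible to the classification sub-model during training.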
And training the image processing model according to the noise vector, the classification probability distribution and a preset target classification.
Therefore, with the above technical solution, the noise vector corresponding to a sample image can be obtained by processing that image, so that the image processing model can generate images containing noise features by simulating an image attack. Improving the accuracy of the image processing model improves the attack effect of the target images it generates, which in turn improves, to a certain extent, the accuracy and effectiveness of the image classification model trained on those target images, and improves the security and robustness of content platform review.
In one possible embodiment, an exemplary implementation of determining a noise image corresponding to the image to be processed according to the image to be processed and the image processing model is as follows, and the step may include:
determining a feature vector sequence corresponding to the image to be processed through an image feature sub-model, wherein the feature vector sequence corresponding to the image to be processed comprises feature vectors of a plurality of sub-images of the image to be processed;
in one possible embodiment, the sequence of feature vectors may be determined in particular by:
the method comprises the steps of segmenting an image to be input, inputting each segmented sub-image into an image feature extraction sub-model, and obtaining vectors of the sub-images, wherein the size of each sub-image is the same, the image to be input is the image to be processed or the sample image, namely the image to be input can be the sample image in the training process of the image processing model, and the image to be input can be the image to be processed in the application process of the image processing model.
As an example, the image to be input may first be resized to a preset resolution, so that images of different resolutions are processed uniformly. For the image to be processed, after the corresponding noise image is obtained, the noise image may be resized back to the original resolution of the image to be processed.
For example, the image processing model may be implemented based on a ViT (Vision Transformer) model. To enable the Transformer model to receive an image, the image to be input may be segmented by a preset grid to obtain a plurality of sub-images. As shown in fig. 2, the image to be input may be segmented into 16 sub-images, i.e., sub-images Seg1-Seg16, by a preset 4 x 4 grid. Thereafter, the image feature extraction sub-model may be implemented based on a convolutional neural network, so that each sub-image is converted into a vector. For example, the image feature extraction sub-model in this step may be a two-layer convolutional neural network that encodes the sub-images to obtain a vector corresponding to each sub-image.
And then, splicing the vectors of the sub-images according to the position of each sub-image to obtain a characteristic vector sequence corresponding to the image to be input.
For example, the position stitching order of the sub-images may be preset. As shown in fig. 2, the vectors may be stitched in ascending order of the sub-image numbers to obtain the feature vector sequence, i.e., a sequence of length 16.
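The grid segmentation described above (fig. 2's 4 x 4 grid yielding Seg1-Seg16 in row-major order) can be sketched as follows. The nested-list image representation and the function name are assumptions for illustration; an actual implementation would operate on tensors.

```python
def split_into_grid(image, grid=4):
    """Segment an H x W image (nested lists) into grid*grid equally sized
    sub-images, ordered row-major (Seg1..Seg16 for grid=4, as in FIG. 2)."""
    h, w = len(image), len(image[0])
    ph, pw = h // grid, w // grid  # sub-image height and width
    subs = []
    for gy in range(grid):
        for gx in range(grid):
            subs.append([row[gx * pw:(gx + 1) * pw]
                         for row in image[gy * ph:(gy + 1) * ph]])
    return subs
```

Each returned sub-image would then be encoded by the feature extraction sub-model, and the resulting vectors concatenated in this same row-major order to form the length-16 feature vector sequence.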
The feature vector sequence corresponding to the image to be processed is then input into the noise information sub-model to obtain a noise vector corresponding to each sub-image of the image to be processed, where the noise information sub-model is a deconvolution model.
The specific implementation manner of the above steps is the same as the processing procedure of the sample image, and is not described herein again.
And then, splicing the noise vectors corresponding to each sub-image of the image to be processed according to the positions of the sub-images to obtain a noise image corresponding to the image to be processed.
As shown in fig. 3, vectors 1-16 are the feature vectors of the extracted feature vector sequence, and noise vectors 1-16 are generated in one-to-one correspondence with them. The noise vectors can therefore be stitched according to the correspondence between the sequence order and the positions of the sub-images to obtain the noise image corresponding to the image to be processed; an example of the stitched noise image is shown in fig. 4.
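The stitching step can be sketched as follows, assuming each per-sub-image noise vector has already been reshaped to the sub-image's height x width; the function name and nested-list layout are illustrative assumptions.

```python
def stitch_noise_image(noise_vectors, grid=4):
    """Reassemble per-sub-image noise blocks (each ph x pw nested lists, given
    in row-major sequence order as in FIG. 3) into the full noise image."""
    ph = len(noise_vectors[0])  # sub-image height
    rows = []
    for gy in range(grid):
        # The grid sub-images of this band, in left-to-right order.
        block_row = noise_vectors[gy * grid:(gy + 1) * grid]
        for r in range(ph):
            stitched = []
            for block in block_row:
                stitched.extend(block[r])  # concatenate horizontally
            rows.append(stitched)
    return rows
```

By construction, stitching is the exact inverse of the grid segmentation: applying it to the sub-images of an image (in the same row-major order) recovers the original layout.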
Therefore, with the above technical solution, the noise features of each sub-image in the image to be processed can be predicted, so that the noise vector corresponding to each sub-image is determined. This ensures that the noise matches the image to be processed, avoids the instability that randomly added noise introduces into prediction results in the prior art, improves the attack efficiency of the noise image used for the simulated attack, and provides reliable data support for effective subsequent attack defense.
In a possible embodiment, the exemplary implementation of training the image processing model according to the noise vector, the classification probability distribution and the preset target classification is as follows, and this step may include:
and determining the sum of squares of the noise values of each pixel point in the noise vector corresponding to each sample image in the current training batch as a first loss.
Illustratively, the first loss L1 may be calculated by the following formula:

L_1 = \sum_{i=1}^{n} \sum_{j \in \mathrm{pixel}_{all}} y_j^{2}

where n is the number of sample images in the current training batch, y_j denotes the noise value of the j-th pixel point in the noise vector, j indexes a pixel point in the sample image, and pixel_all denotes the traversal of every pixel point in the sample image.
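The first loss is a direct sum of squared noise values over every pixel of every sample in the batch, which penalizes large, human-visible noise. A sketch under the assumption that each sample's noise vector is a nested list of pixel noise values:

```python
def first_loss(noise_vectors_batch):
    """L1 = sum over samples i of sum over pixels j of y_j^2."""
    return sum(y * y
               for sample in noise_vectors_batch  # i = 1..n samples in batch
               for row in sample                  # rows of the noise image
               for y in row)                      # j traverses every pixel
```

Minimizing this term alone would drive the noise to zero; it is balanced against the second (attack) loss in the combined target loss.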
And determining a second loss according to the target classification and the probability corresponding to the target classification in the classification probability distribution of each sample image in the current training batch.
As an example, if the noise information sub-model is of the targeted type, the target classification is the preset classification corresponding to the targeted type, and the second loss is the sum, over the sample images in the current training batch, of the cross entropy between the target classification and the probability corresponding to the target classification in each classification probability distribution.
The targeted type adds noise to the image so that the image classification model not only misclassifies the image but outputs a specified classification category. For example, if the image classification model is attacked with the designated category "dog" while the real category of the image is "panda", the image classification model is required to classify the image with the noise image added and output "dog".
In this case, the second loss L2 can be determined by the following equation:

L_2 = -\sum_{i=1}^{n} y_{target} \log \hat{y}_{target}

where y_{target} is the label value of the target classification, i.e., the preset classification specified by the targeted type, \hat{y}_{target} is the probability corresponding to the target classification in the classification probability distribution, and n is the number of sample images in the current training batch.
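A sketch of the targeted second loss under the reconstruction above: cross entropy between the designated target classification and its predicted probability, summed over the batch. Names are illustrative, and the one-hot label value y_target is taken as 1 for the target class.

```python
import math

def second_loss_targeted(probs_batch, target_class):
    """L2 = -sum_i log(p_i[target_class]); small when the model confidently
    outputs the attacker's designated class for every noised sample."""
    return -sum(math.log(p[target_class]) for p in probs_batch)
```

The loss shrinks toward zero as each sample's probability for the designated class approaches 1, which is exactly the targeted-attack objective described above.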
If the noise information sub-model is of the non-targeted type, the target classification is the annotation classification corresponding to the sample image, and the second loss is the sum, over the sample images in the current training batch, of the cross entropy between the target classification and the probability assigned to the classifications other than the target classification in each classification probability distribution.
The non-targeted type adds noise to the image so that the image classification model misclassifies it, outputting any category other than the real category of the image. For example, if the image classification model is attacked and the real category of the image is "panda", the model may be required to classify the image with the noise image added into any category other than "panda".
In this case, the second loss L2 can be determined by the following equation:
L2 = (1/n) · Σ_{i=1}^{n} y_true · log(ŷ_true^(i))

wherein y_true is a label value used to indicate the annotation classification corresponding to the sample image; ŷ_true^(i) is the probability corresponding to the annotation classification in the classification probability distribution of the i-th sample image; and n is the number of sample images in the current training batch.
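The non-directional second loss can likewise be sketched in NumPy. Again this is a hedged sketch, with the function name and batch averaging assumed for illustration:

```python
import numpy as np

def untargeted_second_loss(probs, true_labels):
    """Second loss L2 for the non-directional type.

    probs: (n, num_classes) classification probability distributions.
    true_labels: annotation classification index of each sample image.
    Minimizing this loss drives the probability of the annotation
    classification toward 0, so any other category may be output.
    """
    eps = 1e-12
    n = probs.shape[0]
    p_true = probs[np.arange(n), true_labels]  # ŷ_true per sample image
    return np.mean(np.log(p_true + eps))       # more negative as p_true -> 0
```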
The sum of the first loss and the second loss is then determined as the target loss, and the image processing model is trained according to the target loss.
When the image processing model is trained according to the target loss, if the target loss is greater than a preset loss threshold, the accuracy of the image processing model is considered insufficient. Likewise, if the number of iterations is smaller than a preset count threshold, the image processing model is considered to have undergone too few iterations and to be insufficiently accurate.
Accordingly, in the case where the above condition is satisfied, the parameters of the image processing model may be updated according to the target loss. The method for updating the parameter based on the determined target loss may adopt an updating method commonly used in the art, such as a gradient descent method, so that the target loss may gradually converge, and details are not repeated here. If the above condition is not satisfied, the accuracy of the image processing model may be considered to meet the training requirement, and at this time, the training process may be stopped to obtain the trained image processing model.
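The first loss (the sum of squares of noise values) and the stopping condition described above can be sketched as follows. The function names and the concrete threshold values are illustrative assumptions, not values from the patent:

```python
import numpy as np

def first_loss(noise_vectors):
    """L1: sum of squares of the noise value of every pixel point in
    the noise vectors of the current training batch; a small L1 keeps
    the added noise hard for the human eye to perceive."""
    return sum(float(np.sum(v ** 2)) for v in noise_vectors)

def should_keep_training(target_loss, step, loss_threshold=0.05, max_steps=1000):
    """Continue training while the target loss exceeds the preset loss
    threshold OR the iteration count is below the preset count
    threshold; otherwise the model is considered trained."""
    return target_loss > loss_threshold or step < max_steps
```

Each update itself can use any method common in the art, such as gradient descent on the target loss L1 + L2, until this condition becomes false.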
Therefore, with this scheme, the loss of the model during training of the image processing model can be determined both from the noise features added to the image and from the accuracy of the simulated attack, so that the image processing model is trained from both aspects. The noise features contained in the target image obtained from the trained model are thus smaller and less easily perceived by human eyes, while a simulated attack on image classification is still realized. An image classification model trained on such target images can then classify images and defend against image attacks automatically, even when the specific parameters of the attack mode are unknown, which ensures the safety and effectiveness of the audit performed by a content platform based on the image classification model and reduces the workload of manual identification.
In one possible embodiment, an exemplary implementation of generating a target image corresponding to a to-be-processed image from the to-be-processed image and a noise image is as follows, which may include:
and superposing the data of the pixel points corresponding to the same position in the image to be processed and the noise image to obtain the target image.
In this embodiment, the feature values of corresponding pixel points in the image to be processed and the generated noise image may be superimposed respectively to obtain the target image. Processing each pixel point of the image to be processed separately guarantees the smoothness of the noise features contained in the target image, which reduces the perceptibility of those noise features to human eyes, improves the fineness of the determined noise features, improves the simulated-attack effectiveness of the target image, and provides accurate data support for the subsequent determination of defense measures.
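The pixel-wise superposition can be sketched as follows. Clipping to [0, 255] and the 8-bit image format are assumptions made to keep the result a valid image; they are not specified by the patent:

```python
import numpy as np

def superpose(image, noise):
    """Generate the target image by adding the noise value of each
    pixel point to the pixel point at the same position in the image
    to be processed, then clipping back to the valid 8-bit range."""
    assert image.shape == noise.shape
    target = image.astype(np.int32) + noise.astype(np.int32)
    return np.clip(target, 0, 255).astype(np.uint8)
```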
The present disclosure also provides an image classification method, which may include:
receiving an image to be classified, which may be, for example, an image collected, after authorization is obtained, from content to be audited or from a video on the content platform, so that uploaded content in the content platform can be audited in real time.
And then, determining a target classification corresponding to the image to be classified according to the image to be classified and the image classification model, wherein training data of the image classification model comprises a target training image and an annotation classification corresponding to the target training image, and the target training image is an image obtained based on the image processing method.
As can be seen from the above, the target training images used for training the image classification model contain corresponding noise features, so when trained on such data the image classification model can learn the attack characteristics of those noise features. The resulting image classification model can therefore classify images containing noise attacks accurately, which improves the accuracy of its output results and the efficiency of image classification, and provides reliable data support for improving the real-time auditing of the content platform and the handling of uploaded content. This improves the robustness and safety of the image classification model, avoids the influence of image noise attacks on the model, improves the safety and robustness of the content platform applying the model, and improves the user experience.
The present disclosure also provides an image processing apparatus, as shown in fig. 5, the apparatus 10 including:
a first receiving module 100, configured to receive an image to be processed;
a first determining module 200, configured to determine a noise image corresponding to the image to be processed according to the image to be processed and an image processing model, where the image processing model includes an image feature extraction submodel and a noise information submodel, the image feature extraction submodel is used to obtain a feature vector of the image to be processed, the noise information submodel is used to determine a noise image of the image to be processed according to the feature vector of the image to be processed, the image processing model further includes a classification submodel in a training process, and a loss of the image processing model is determined based on a noise vector added by the noise information submodel and a classification probability distribution output by the classification submodel;
a generating module 300, configured to generate a target image corresponding to the image to be processed according to the image to be processed and the noise image, where the classification of the image to be processed and the classification of the target image are the same.
Optionally, the image processing model is generated by a construction module comprising:
the first obtaining sub-module is used for obtaining training samples, wherein each training sample comprises a sample image and an annotation classification corresponding to the sample image;
the first determining submodule is used for determining a feature vector sequence corresponding to the sample image through an image feature sub-model, wherein the feature vector sequence corresponding to the sample image comprises feature vectors of a plurality of sub-images of the sample image;
the second determining submodule is used for inputting the characteristic vector sequence into the noise information submodel to obtain a noise vector corresponding to each sub-image;
the classification submodules are used for classifying based on the fusion vectors obtained by fusing the characteristic vector sequences and the noise vectors to obtain the classification probability distribution of the fusion vectors;
and the first training submodule is used for training the image processing model according to the noise vector, the classification probability distribution and a preset target classification.
Optionally, the first determining module includes:
the third determining submodule is used for determining a feature vector sequence corresponding to the image to be processed through an image feature sub-model, wherein the feature vector sequence corresponding to the image to be processed comprises feature vectors of a plurality of sub-images of the image to be processed;
a fourth determining submodule, configured to input the feature vector sequence corresponding to the image to be processed into the noise information submodel, to obtain a noise vector corresponding to each sub-image of the image to be processed, where the noise information submodel is an inverse convolution model;
and the splicing submodule is used for splicing the noise vector corresponding to each sub-image of the image to be processed according to the position of the sub-image to obtain the noise image corresponding to the image to be processed.
Optionally, the sequence of feature vectors is determined by:
segmenting an image to be input, and inputting each segmented sub-image into the image feature extraction sub-model to obtain a vector of the sub-image, wherein the image to be input is the image to be processed or the sample image, and the size of each sub-image is the same;
and splicing the vectors of the sub-images according to the position of each sub-image to obtain a characteristic vector sequence corresponding to the image to be input.
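The segmentation into equally sized sub-images and the position-based splicing can be sketched as follows. This is a minimal sketch assuming the image dimensions are divisible by the patch size; the function names are illustrative:

```python
import numpy as np

def split_into_subimages(image, patch):
    """Segment the input image into equally sized sub-images, in
    row-major order (left to right, top to bottom)."""
    h, w = image.shape[:2]
    return [image[r:r + patch, c:c + patch]
            for r in range(0, h, patch)
            for c in range(0, w, patch)]

def stitch(patches, grid_shape):
    """Splice per-sub-image results back together according to each
    sub-image's position, e.g. to assemble the noise image from the
    noise vector of each sub-image."""
    rows, cols = grid_shape
    return np.block([[patches[r * cols + c] for c in range(cols)]
                     for r in range(rows)])
```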
Optionally, the first training submodule comprises:
a fifth determining submodule, configured to determine a sum of squares of noise values of each pixel point in the noise vector corresponding to each sample image in the current training batch as a first loss;
a sixth determining submodule, configured to determine a second loss according to the target classification and a probability corresponding to the target classification in the classification probability distribution of each sample image in the current training batch;
and the second training submodule is used for determining the sum of the first loss and the second loss as a target loss so as to train the image processing model according to the target loss.
Optionally, if the noise information sub-model is an orientation type, the target classification is a preset classification corresponding to the orientation type;
and if the noise information sub-model is a non-directional type, the target classification is a labeling classification corresponding to the sample image.
Optionally, the generating module includes:
and superposing the data of the pixel points corresponding to the same position in the image to be processed and the noise image to obtain the target image.
The present disclosure also provides an image classification apparatus, including:
the second receiving module is used for receiving the images to be classified;
and the second determining module is used for determining the target classification corresponding to the image to be classified according to the image to be classified and the image classification model, wherein the training data of the image classification model comprises a target training image and an annotation classification corresponding to the target training image, and the target training image is an image obtained based on the image processing method.
Referring now to FIG. 6, a block diagram of an electronic device 600 suitable for use in implementing embodiments of the present disclosure is shown. The terminal device in the embodiments of the present disclosure may include, but is not limited to, a mobile terminal such as a mobile phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable multimedia player), a vehicle terminal (e.g., a car navigation terminal), and the like, and a stationary terminal such as a digital TV, a desktop computer, and the like. The electronic device shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 6, electronic device 600 may include a processing means (e.g., central processing unit, graphics processor, etc.) 601 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage means 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data necessary for the operation of the electronic device 600 are also stored. The processing device 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
Generally, the following devices may be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 607 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 608 including, for example, tape, hard disk, etc.; and a communication device 609. The communication means 609 may allow the electronic device 600 to communicate with other devices wirelessly or by wire to exchange data. While fig. 6 illustrates an electronic device 600 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer readable medium, the computer program containing program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 609, or may be installed from the storage means 608, or may be installed from the ROM 602. The computer program, when executed by the processing device 601, performs the above-described functions defined in the methods of the embodiments of the present disclosure.
It should be noted that the computer readable medium in the present disclosure can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
In some embodiments, the clients, servers may communicate using any currently known or future developed network Protocol, such as HTTP (HyperText Transfer Protocol), and may interconnect with any form or medium of digital data communication (e.g., a communications network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), the Internet (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed network.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: receiving an image to be processed; determining a noise image corresponding to the image to be processed according to the image to be processed and an image processing model, wherein the image processing model comprises an image feature extraction submodel and a noise information submodel, the image feature extraction submodel is used for obtaining a feature vector of the image to be processed, the noise information submodel is used for determining the noise image of the image to be processed according to the feature vector of the image to be processed, the image processing model further comprises a classification submodel in a training process, and the loss of the image processing model is determined based on the noise vector added by the noise information submodel and the classification probability distribution output by the classification submodel; and generating a target image corresponding to the image to be processed according to the image to be processed and the noise image, wherein the classification of the image to be processed and the classification of the target image are the same.
Alternatively, the computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: receiving an image to be classified; determining a target classification corresponding to the image to be classified according to the image to be classified and the image classification model, wherein training data of the image classification model comprises a target training image and an annotation classification corresponding to the target training image, and the target training image is an image obtained based on the image processing method of any one of claims 1 to 7.
Computer program code for carrying out operations for the present disclosure may be written in any combination of one or more programming languages, including but not limited to an object oriented programming language such as Java, Smalltalk, C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present disclosure may be implemented by software or hardware. The name of the module does not in some cases constitute a limitation of the module itself, and for example, the first receiving module may also be described as a "module that receives an image to be processed".
The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
Example 1 provides an image processing method according to one or more embodiments of the present disclosure, wherein the method includes:
receiving an image to be processed;
determining a noise image corresponding to the image to be processed according to the image to be processed and an image processing model, wherein the image processing model comprises an image feature extraction submodel and a noise information submodel, the image feature extraction submodel is used for obtaining a feature vector of the image to be processed, the noise information submodel is used for determining the noise image of the image to be processed according to the feature vector of the image to be processed, the image processing model further comprises a classification submodel in a training process, and the loss of the image processing model is determined based on the noise vector added by the noise information submodel and the classification probability distribution output by the classification submodel;
and generating a target image corresponding to the image to be processed according to the image to be processed and the noise image, wherein the classification of the image to be processed and the classification of the target image are the same.
Example 2 provides the method of example 1, wherein the image processing model is determined by:
the image processing model is determined by:
acquiring training samples, wherein each training sample comprises a sample image and an annotation classification corresponding to the sample image;
determining a feature vector sequence corresponding to the sample image through an image feature sub-model, wherein the feature vector sequence corresponding to the sample image comprises feature vectors of a plurality of sub-images of the sample image;
inputting the characteristic vector sequence into the noise information submodel to obtain a noise vector corresponding to each sub-image;
classifying based on the fusion vector obtained by fusing the feature vector sequence and the noise vector to obtain the classification probability distribution of the fusion vector;
and training the image processing model according to the noise vector, the classification probability distribution and a preset target classification.
Example 3 provides the method of example 2, wherein the determining a noise image corresponding to the image to be processed according to the image to be processed and an image processing model, includes:
determining a feature vector sequence corresponding to the image to be processed through an image feature sub-model, wherein the feature vector sequence corresponding to the image to be processed comprises feature vectors of a plurality of sub-images of the image to be processed;
inputting the feature vector sequence corresponding to the image to be processed into the noise information submodel to obtain a noise vector corresponding to each sub-image of the image to be processed, wherein the noise information submodel is an inverse convolution model;
and splicing the noise vectors corresponding to each sub-image of the image to be processed according to the positions of the sub-images to obtain a noise image corresponding to the image to be processed.
Example 4 provides the method of example 2 or example 3, wherein the sequence of feature vectors is determined by:
segmenting an image to be input, and inputting each segmented sub-image into the image feature extraction sub-model to obtain a vector of the sub-image, wherein the image to be input is the image to be processed or the sample image, and the size of each sub-image is the same;
and splicing the vectors of the sub-images according to the position of each sub-image to obtain a characteristic vector sequence corresponding to the image to be input.
Example 5 provides the method of example 2, wherein the training the image processing model according to the noise vector, the classification probability distribution, and a preset target classification includes:
determining the sum of squares of noise values of each pixel point in noise vectors corresponding to each sample image in the current training batch as a first loss;
determining a second loss according to the probability corresponding to the target classification in the classification probability distribution of each sample image in the current training batch and the target classification;
and determining the sum of the first loss and the second loss as a target loss, and training the image processing model according to the target loss.
Example 6 provides the method of example 5, wherein, if the noise information sub-model is an orientation type, the target classification is a preset classification corresponding to the orientation type;
and if the noise information sub-model is a non-directional type, the target classification is a labeling classification corresponding to the sample image.
Example 7 provides the method of example 1, wherein the generating a target image corresponding to the image to be processed from the image to be processed and the noise image, according to one or more embodiments of the present disclosure, includes:
and superposing the data of the pixel points corresponding to the same position in the image to be processed and the noise image to obtain the target image.
Example 8 provides, in accordance with one or more embodiments of the present disclosure, a method of image classification, the method comprising:
receiving an image to be classified;
determining a target classification corresponding to the image to be classified according to the image to be classified and the image classification model, wherein training data of the image classification model comprises a target training image and an annotation classification corresponding to the target training image, and the target training image is an image obtained based on the image processing method in any one of examples 1 to 7.
Example 9 provides, in accordance with one or more embodiments of the present disclosure, an image processing apparatus, including:
the first receiving module is used for receiving an image to be processed;
the image processing system comprises a first determining module, a second determining module and a third determining module, wherein the first determining module is used for determining a noise image corresponding to the image to be processed according to the image to be processed and an image processing model, the image processing model comprises an image feature extraction submodel and a noise information submodel, the image feature extraction submodel is used for obtaining a feature vector of the image to be processed, the noise information submodel is used for determining the noise image of the image to be processed according to the feature vector of the image to be processed, the image processing model further comprises a classification submodel in a training process, and the loss of the image processing model is determined based on the noise vector added by the noise information submodel and the classification probability distribution output by the classification submodel;
and the generation module is used for generating a target image corresponding to the image to be processed according to the image to be processed and the noise image, wherein the classification of the image to be processed and the classification of the target image are the same.
Example 10 provides, in accordance with one or more embodiments of the present disclosure, an image classification apparatus, the apparatus comprising:
the second receiving module is used for receiving the images to be classified;
a second determining module, configured to determine, according to the image to be classified and the image classification model, a target classification corresponding to the image to be classified, where training data of the image classification model includes a target training image and an annotation classification corresponding to the target training image, and the target training image is an image obtained based on the image processing method in any one of examples 1 to 7.
Example 11 provides a computer-readable medium having stored thereon a computer program that, when executed by a processing apparatus, performs the steps of the method of any of examples 1-8, in accordance with one or more embodiments of the present disclosure.
Example 12 provides, in accordance with one or more embodiments of the present disclosure, an electronic device, comprising:
a storage device having a computer program stored thereon;
processing means for executing the computer program in the storage means to carry out the steps of the method of any of examples 1-8.
The foregoing description is only exemplary of the preferred embodiments of the disclosure and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the disclosure herein is not limited to the particular combination of features described above, but also encompasses other embodiments in which any combination of the features described above or their equivalents does not depart from the spirit of the disclosure. For example, the above features may be interchanged with (but are not limited to) features having similar functions disclosed in this disclosure to form alternative technical solutions.
Further, while operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limitations on the scope of the disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims. With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.

Claims (12)

1. An image processing method, characterized in that the method comprises:
receiving an image to be processed;
determining a noise image corresponding to the image to be processed according to the image to be processed and an image processing model, wherein the image processing model comprises an image feature extraction submodel and a noise information submodel, the image feature extraction submodel is used for obtaining a feature vector of the image to be processed, the noise information submodel is used for determining the noise image of the image to be processed according to the feature vector of the image to be processed, the image processing model further comprises a classification submodel in a training process, and the loss of the image processing model is determined based on the noise vector added by the noise information submodel and the classification probability distribution output by the classification submodel;
and generating a target image corresponding to the image to be processed according to the image to be processed and the noise image, wherein the classification of the image to be processed and the classification of the target image are the same.
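The inference path of claim 1 (image feature extraction submodel → noise information submodel → superposition) can be sketched end to end. This is a minimal illustration under stated assumptions, not the claimed model: the two submodels here are stand-in random linear maps (`W_feat` and `W_noise` are hypothetical), whereas the claim specifies learned networks whose training loss couples the added noise vector with a classification submodel's probability distribution.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in submodels (assumptions, not the claimed networks): a linear
# feature extractor and a linear noise-information head over a 4x4 image.
W_feat = rng.normal(size=(16, 8)) * 0.1   # flattened image -> 8-dim feature vector
W_noise = rng.normal(size=(8, 16)) * 0.1  # feature vector -> flattened noise image

def process(image: np.ndarray) -> np.ndarray:
    feature = image.reshape(-1) @ W_feat             # feature vector of the image
    noise_image = (feature @ W_noise).reshape(4, 4)  # noise image from the feature
    return image + noise_image                       # target image (same classification)

image = rng.normal(size=(4, 4))
target_image = process(image)
```

In the patent, the perturbation is trained to stay small (claim 5's first loss) while preserving the classification, so the target image can serve as augmented training data for the classifier of claim 8.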
2. The method of claim 1, wherein the image processing model is determined by:
acquiring training samples, wherein each training sample comprises a sample image and an annotation classification corresponding to the sample image;
determining a feature vector sequence corresponding to the sample image through the image feature extraction submodel, wherein the feature vector sequence corresponding to the sample image comprises feature vectors of a plurality of sub-images of the sample image;
inputting the feature vector sequence into the noise information submodel to obtain a noise vector corresponding to each sub-image;
classifying based on the fusion vector obtained by fusing the feature vector sequence and the noise vector to obtain the classification probability distribution of the fusion vector;
and training the image processing model according to the noise vector, the classification probability distribution and a preset target classification.
3. The method according to claim 2, wherein the determining a noise image corresponding to the image to be processed according to the image to be processed and an image processing model comprises:
determining a feature vector sequence corresponding to the image to be processed through the image feature extraction submodel, wherein the feature vector sequence corresponding to the image to be processed comprises feature vectors of a plurality of sub-images of the image to be processed;
inputting the feature vector sequence corresponding to the image to be processed into the noise information submodel to obtain a noise vector corresponding to each sub-image of the image to be processed, wherein the noise information submodel is a deconvolution model;
and splicing the noise vectors corresponding to each sub-image of the image to be processed according to the positions of the sub-images to obtain a noise image corresponding to the image to be processed.
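The splicing step of claim 3 stitches the per-sub-image noise vectors back by position into one noise image. A sketch, assuming row-major sub-image order and square patches (`splice_subimages` is a hypothetical helper name):

```python
import numpy as np

def splice_subimages(patches: np.ndarray, rows: int, cols: int) -> np.ndarray:
    """Stitch per-sub-image noise patches (row-major position order) back
    into a single noise image, the inverse of tiling the input image."""
    n, ph, pw = patches.shape
    assert n == rows * cols, "patch count must match the tiling"
    # reshape to (row_block, col_block, ph, pw), then interleave block and
    # in-block axes so neighbouring patches become neighbouring pixels
    return (patches.reshape(rows, cols, ph, pw)
                   .transpose(0, 2, 1, 3)
                   .reshape(rows * ph, cols * pw))

noise_patches = np.zeros((4, 2, 2))
noise_patches[0] = 1.0  # noise for the top-left sub-image only
noise_image = splice_subimages(noise_patches, rows=2, cols=2)
```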
4. The method according to claim 2 or 3, characterized in that the sequence of feature vectors is determined by:
segmenting an image to be input, and inputting each segmented sub-image into the image feature extraction submodel to obtain a vector of the sub-image, wherein the image to be input is the image to be processed or the sample image, and the sizes of the sub-images are the same;
and splicing the vectors of the sub-images according to the position of each sub-image to obtain a feature vector sequence corresponding to the image to be input.
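The segmentation step of claim 4 amounts to tiling the input into equal-size sub-images kept in position order. A minimal numpy sketch, assuming a single-channel image and square patches that tile it evenly:

```python
import numpy as np

def split_into_subimages(image: np.ndarray, patch: int) -> np.ndarray:
    """Split an H x W image into equal-size patch x patch sub-images,
    returned in row-major (position) order as shape (n, patch, patch)."""
    h, w = image.shape
    assert h % patch == 0 and w % patch == 0, "image must tile evenly"
    rows, cols = h // patch, w // patch
    # reshape + transpose keeps each patch contiguous and in position order
    return (image.reshape(rows, patch, cols, patch)
                 .transpose(0, 2, 1, 3)
                 .reshape(rows * cols, patch, patch))

img = np.arange(16, dtype=float).reshape(4, 4)
patches = split_into_subimages(img, 2)
```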
5. The method of claim 2, wherein training the image processing model based on the noise vector, the classification probability distribution, and a preset target classification comprises:
determining the sum of squares of the noise values of each pixel point in the noise vectors corresponding to each sample image in the current training batch as a first loss;
determining a second loss according to the probability corresponding to the target classification in the classification probability distribution of each sample image in the current training batch and the target classification;
and determining the sum of the first loss and the second loss as a target loss, and training the image processing model according to the target loss.
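The target loss of claim 5 can be sketched as the sum of the two terms. The first loss is exactly the stated sum of squared noise values; for the second loss the claim says only that it is determined from the probability of the target classification, so cross-entropy on that class is used here as one plausible reading, not the claimed formula:

```python
import numpy as np

def target_loss(noise_vectors: np.ndarray, class_probs: np.ndarray,
                target_idx: np.ndarray) -> float:
    """First loss: sum of squared noise values over the batch (keeps the
    perturbation small). Second loss: negative log-probability assigned to
    the target classification (an assumed cross-entropy reading)."""
    first_loss = float(np.sum(noise_vectors ** 2))
    batch = np.arange(len(class_probs))
    second_loss = float(-np.sum(np.log(class_probs[batch, target_idx])))
    return first_loss + second_loss

noise = np.array([[0.1, -0.2], [0.0, 0.3]])      # per-sample noise values
probs = np.array([[0.7, 0.3], [0.2, 0.8]])       # classification distributions
targets = np.array([0, 1])                       # target classification per sample
loss = target_loss(noise, probs, targets)
```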
6. The method of claim 5,
if the noise information submodel is of a directional type, the target classification is a preset classification corresponding to the directional type;
and if the noise information submodel is of a non-directional type, the target classification is the annotation classification corresponding to the sample image.
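The branching in claim 6 reduces to choosing which class index the second loss of claim 5 is computed against; a one-function sketch (function and argument names hypothetical):

```python
def pick_target_classification(noise_model_type: str,
                               preset_class: int,
                               annotation_class: int) -> int:
    """Claim 6: a directional noise submodel trains toward a preset
    classification; a non-directional one uses the sample's own annotation
    classification, so the added noise stays class-preserving."""
    if noise_model_type == "directional":
        return preset_class
    return annotation_class

t_directional = pick_target_classification("directional", preset_class=3, annotation_class=7)
t_plain = pick_target_classification("non-directional", preset_class=3, annotation_class=7)
```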
7. The method of claim 1, wherein generating a target image corresponding to the image to be processed from the image to be processed and the noise image comprises:
and superposing the data of the pixel points corresponding to the same position in the image to be processed and the noise image to obtain the target image.
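The superposition of claim 7 is pixel-wise addition at matching positions. A sketch; clipping back to a valid [0, 255] range is an added assumption, since the claim specifies only the superposition itself:

```python
import numpy as np

def superpose(image: np.ndarray, noise_image: np.ndarray) -> np.ndarray:
    """Add the noise image onto the image to be processed, pixel by pixel.
    The [0, 255] clip is an assumption for 8-bit images, not part of the claim."""
    return np.clip(image.astype(float) + noise_image, 0, 255)

img = np.array([[250.0, 10.0]])
noise = np.array([[10.0, -20.0]])
target = superpose(img, noise)
```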
8. A method of image classification, the method comprising:
receiving an image to be classified;
determining a target classification corresponding to the image to be classified according to the image to be classified and the image classification model, wherein training data of the image classification model comprises a target training image and an annotation classification corresponding to the target training image, and the target training image is an image obtained based on the image processing method of any one of claims 1 to 7.
9. An image processing apparatus, characterized in that the apparatus comprises:
the first receiving module is used for receiving an image to be processed;
a first determining module, configured to determine a noise image corresponding to the image to be processed according to the image to be processed and an image processing model, wherein the image processing model comprises an image feature extraction submodel and a noise information submodel, the image feature extraction submodel is used for obtaining a feature vector of the image to be processed, the noise information submodel is used for determining the noise image of the image to be processed according to the feature vector of the image to be processed, the image processing model further comprises a classification submodel in a training process, and the loss of the image processing model is determined based on the noise vector added by the noise information submodel and the classification probability distribution output by the classification submodel;
and the generation module is used for generating a target image corresponding to the image to be processed according to the image to be processed and the noise image, wherein the classification of the image to be processed and the classification of the target image are the same.
10. An image classification apparatus, characterized in that the apparatus comprises:
the second receiving module is used for receiving the images to be classified;
a second determining module, configured to determine a target classification corresponding to the image to be classified according to the image to be classified and the image classification model, where training data of the image classification model includes a target training image and an annotation classification corresponding to the target training image, and the target training image is an image obtained based on the image processing method according to any one of claims 1 to 7.
11. A computer-readable medium, on which a computer program is stored, characterized in that the program, when being executed by processing means, carries out the steps of the method of any one of claims 1 to 8.
12. An electronic device, comprising:
a storage device having a computer program stored thereon;
processing means for executing the computer program in the storage means to carry out the steps of the method according to any one of claims 1 to 8.
CN202111395579.2A 2021-11-23 2021-11-23 Image processing method, image classification method, device, medium, and electronic device Pending CN114120364A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111395579.2A CN114120364A (en) 2021-11-23 2021-11-23 Image processing method, image classification method, device, medium, and electronic device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111395579.2A CN114120364A (en) 2021-11-23 2021-11-23 Image processing method, image classification method, device, medium, and electronic device

Publications (1)

Publication Number Publication Date
CN114120364A true CN114120364A (en) 2022-03-01

Family

ID=80439977

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111395579.2A Pending CN114120364A (en) 2021-11-23 2021-11-23 Image processing method, image classification method, device, medium, and electronic device

Country Status (1)

Country Link
CN (1) CN114120364A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109242029A (en) * 2018-09-19 2019-01-18 广东省智能制造研究所 Identify disaggregated model training method and system
US20200151505A1 (en) * 2018-11-12 2020-05-14 Sap Se Platform for preventing adversarial attacks on image-based machine learning models
CN111461089A (en) * 2020-06-17 2020-07-28 腾讯科技(深圳)有限公司 Face detection method, and training method and device of face detection model
CN111612024A (en) * 2019-02-25 2020-09-01 北京嘀嘀无限科技发展有限公司 Feature extraction method and device, electronic equipment and computer-readable storage medium
CN113140012A (en) * 2021-05-14 2021-07-20 北京字节跳动网络技术有限公司 Image processing method, image processing apparatus, image processing medium, and electronic device
US20210303894A1 (en) * 2020-03-26 2021-09-30 Fujitsu Limited Image processing apparatus, image recognition system, and recording medium


Similar Documents

Publication Publication Date Title
CN108427939B (en) Model generation method and device
CN109961032B (en) Method and apparatus for generating classification model
WO2022252881A1 (en) Image processing method and apparatus, and readable medium and electronic device
CN113140012B (en) Image processing method, device, medium and electronic equipment
CN112381717A (en) Image processing method, model training method, device, medium, and apparatus
CN110021052A (en) The method and apparatus for generating model for generating eye fundus image
CN114004905B (en) Method, device, equipment and storage medium for generating character style pictogram
CN115294501A (en) Video identification method, video identification model training method, medium and electronic device
CN113723341A (en) Video identification method and device, readable medium and electronic equipment
CN114037990A (en) Character recognition method, device, equipment, medium and product
CN112418249A (en) Mask image generation method and device, electronic equipment and computer readable medium
CN117633228A (en) Model training method and device
CN110008926B (en) Method and device for identifying age
CN110689478A (en) Image stylization processing method and device, electronic equipment and readable medium
CN113033552B (en) Text recognition method and device and electronic equipment
CN118071428A (en) Intelligent processing system and method for multi-mode monitoring data
CN111402159B (en) Image processing method, image processing device, electronic equipment and computer readable medium
CN110619602B (en) Image generation method and device, electronic equipment and storage medium
CN110765304A (en) Image processing method, image processing device, electronic equipment and computer readable medium
CN115984868A (en) Text processing method, device, medium and equipment
CN112418233B (en) Image processing method and device, readable medium and electronic equipment
CN116188887A (en) Attribute recognition pre-training model generation method and attribute recognition model generation method
CN111737575B (en) Content distribution method, content distribution device, readable medium and electronic equipment
CN114120364A (en) Image processing method, image classification method, device, medium, and electronic device
CN114495080A (en) Font identification method and device, readable medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination