CN111814776B - Image processing method, device, server and storage medium - Google Patents


Info

Publication number
CN111814776B
CN111814776B
Authority
CN
China
Prior art keywords
target
feature
image
target object
target image
Prior art date
Legal status
Active
Application number
CN202010949537.8A
Other languages
Chinese (zh)
Other versions
CN111814776A (en)
Inventor
刘彦宏
王洪斌
Current Assignee
Ping An International Smart City Technology Co Ltd
Original Assignee
Ping An International Smart City Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Ping An International Smart City Technology Co Ltd
Priority to CN202010949537.8A
Publication of CN111814776A
Application granted
Publication of CN111814776B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the invention discloses an image processing method, device, server and storage medium, wherein the method comprises the following steps: acquiring a target image to be processed; inputting the target image into a target detection model for detection, so as to identify a target detection frame and a target category corresponding to the target object from the target image; inputting the target image into a robust feature extraction model to extract a first feature mask set corresponding to each component of the target object in the target image; determining a second feature mask set corresponding to the target category according to a preset correspondence between feature masks and categories, and calculating a similarity coefficient between the first feature mask set and the second feature mask set; and correcting the target category of the target object in the target image according to the similarity coefficient. By classifying and correcting the target object in an attacked target image on the basis of robust features, the method increases the difficulty of cracking by adversarial attack and effectively improves the efficiency and accuracy of image processing.

Description

Image processing method, device, server and storage medium
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to an image processing method, an image processing apparatus, a server, and a storage medium.
Background
In smart-city construction, intelligent monitoring systems built on surveillance cameras are used in many applications to detect and identify different types of target objects in the monitored images captured in a scene, for example to detect specific people and objects in community security, food supervision, environmental supervision, and traffic monitoring; in security and supervision applications, the robustness requirement on detection is high. At present, detection and identification of target objects in monitored images are realized through deep convolutional neural network technology: a target detection model is obtained by training on a predefined image data set, and the model is then used in the actual scene to predict images acquired online in real time.
However, when a deep neural network model processes an image subjected to an adversarial attack, its accuracy drops sharply. Norm-based adversarial defense methods can provide robustness only when the perturbation of each pixel is smaller than a certain threshold, and offer no effective solution for attacks whose perturbation range exceeds that threshold. Therefore, how to improve robustness in the image processing process is very important.
Disclosure of Invention
The embodiment of the invention provides an image processing method, device, server and storage medium. Classifying and correcting the target object in an attacked target image on the basis of robust features increases the difficulty of cracking by adversarial attack, since an attacker must change not only the prediction class of the model but also each robust feature. Meanwhile, the original deep neural network model can still be used to predict target images that have not been attacked, so the high prediction accuracy provided by non-robust features is retained and the efficiency and accuracy of image processing are effectively improved.
In a first aspect, an embodiment of the present invention provides an image processing method, including:
acquiring a target image to be processed, wherein the target image comprises a target object;
inputting the target image into a target detection model for detection so as to identify a target detection frame and a target category corresponding to the target object from the target image;
inputting the target image into a robust feature extraction model to extract a first feature mask set corresponding to each component of the target object in the target image;
determining a second feature mask set corresponding to the target category according to a preset corresponding relation between feature masks and categories, and calculating a similarity coefficient between the first feature mask set and the second feature mask set;
and correcting the target class of the target object in the target image according to the similarity coefficient.
Further, before inputting the target image into a target detection model for detection, the method further includes:
acquiring a sample image set, and determining a target object in each sample image in the sample image set;
adding a first class label and a detection frame to the target object in each sample image;
and inputting the sample images added with the first class labels and the detection frames into a deep neural network model for training to obtain the target detection model.
Further, before inputting the target image into the robust feature extraction model, the method further includes:
determining components of the target object in each sample image;
adding a second class label and a feature mask to each component of the target object in each sample image;
and inputting the sample images added with the second class labels and the feature masks into the deep neural network model for training to obtain the robust feature extraction model.
Further, the inputting the target image into a robust feature extraction model to extract a first feature mask set corresponding to each component of the target object in the target image includes:
inputting the target image into a robust feature extraction model to determine pixel coverage areas of all components of the target object in the target image;
and extracting the first feature mask set corresponding to the pixel coverage area of each component of the target object.
Further, the performing, according to the similarity coefficient, a correction process on the target class of the target object in the target image includes:
detecting whether the similarity coefficient is larger than a preset threshold;
if the detection result is that the similarity coefficient is larger than the preset threshold, determining that the target object in the target image has not been subjected to an adversarial attack, and not correcting the target class of the target object;
and if the detection result is that the similarity coefficient is smaller than or equal to the preset threshold, determining that the target object in the target image has been subjected to an adversarial attack, and correcting the target class of the target object in the target image.
Further, the performing rectification processing on the target class of the target object in the target image includes:
determining a feature mask corresponding to each category according to the corresponding relation between the preset feature masks and the categories;
calculating similarity coefficients of the first feature mask set and the feature masks corresponding to each category;
and determining the category corresponding to the maximum similarity coefficient as the target category of the target object.
Further, the calculating a similarity coefficient between the first set of feature masks and the second set of feature masks comprises:
acquiring an intersection feature mask of the first feature mask set and the second feature mask set;
acquiring a union feature mask of the first feature mask set and the second feature mask set;
and determining a similarity coefficient between the first feature mask set and the second feature mask set according to the absolute value of the ratio of the intersection feature mask to the union feature mask.
In a second aspect, an embodiment of the present invention provides an image processing device, including:
an acquisition unit, configured to acquire a target image to be processed, wherein the target image comprises a target object;
a detection unit, configured to input the target image into a target detection model for detection, so as to identify a target detection frame and a target category corresponding to the target object from the target image;
an extraction unit, configured to input the target image into a robust feature extraction model to extract a first feature mask set corresponding to each component of the target object in the target image;
a determination unit, configured to determine a second feature mask set corresponding to the target category according to a preset correspondence between feature masks and categories, and to calculate a similarity coefficient between the first feature mask set and the second feature mask set;
and a correction unit, configured to correct the target class of the target object in the target image according to the similarity coefficient.
In a third aspect, an embodiment of the present invention provides a server, including a processor, an input device, an output device, and a memory, which are connected to each other, wherein the memory is configured to store a computer program supporting the image processing device in executing the above method, and the processor is configured to call the program to execute the method of the first aspect.
In a fourth aspect, the present invention provides a computer-readable storage medium, which stores a computer program, where the computer program is executed by a processor to implement the method of the first aspect.
In the embodiment of the present invention, a server may acquire a target image to be processed, input the target image into a target detection model for detection, input the target image into a robust feature extraction model to extract a first feature mask set corresponding to each component of the target object in the target image, determine a second feature mask set corresponding to the target category according to a preset correspondence between feature masks and categories, and calculate a similarity coefficient between the first feature mask set and the second feature mask set, so as to correct the target class of the target object in the target image according to the similarity coefficient. Classifying and correcting the target object in an attacked target image on the basis of robust features increases the difficulty of cracking by adversarial attack: an attacker must change not only the prediction class of the model but also each robust feature. Meanwhile, the original deep neural network model can still be used to predict target images that have not been attacked, so the high prediction accuracy provided by non-robust features is retained, and the efficiency and accuracy of image processing are effectively improved.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description are merely some embodiments of the present invention; for those skilled in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a schematic flow chart diagram of an image processing method provided by an embodiment of the invention;
fig. 2 is a schematic block diagram of an image processing apparatus provided by an embodiment of the present invention;
fig. 3 is a schematic block diagram of a server according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The image processing method provided by the embodiment of the invention can be applied to an image processing device, and the image processing device can be arranged in a server.
An image processing method provided by the embodiment of the invention is schematically described below with reference to fig. 1.
Referring to fig. 1, fig. 1 is a schematic flowchart of an image processing method according to an embodiment of the present invention, and as shown in fig. 1, the method may be executed by an image processing apparatus, where the image processing apparatus is disposed in a server. Specifically, the method of the embodiment of the present invention includes the following steps.
S101: and acquiring a target image to be processed, wherein the target image comprises a target object.
In the embodiment of the invention, the image processing device may acquire a target image to be processed, wherein the target image comprises a target object. In some embodiments, the target image includes one or more target objects, and a target object may be any entity, such as a person or a thing.
In some embodiments, the target image may be captured by a monitoring device; the monitoring device may include, but is not limited to, a camera, a sensor, or similar equipment used to monitor a scene. In some embodiments, the image processing device may establish a communication connection with the monitoring device and acquire the target image captured by it.
S102: and inputting the target image into a target detection model for detection so as to identify a target detection frame and a target category corresponding to the target object from the target image.
In this embodiment of the present invention, the image processing device may input the target image into a target detection model for detection, so as to identify a target detection frame and a target category corresponding to the target object from the target image.
In an embodiment, before inputting the target image into a target detection model for detection, the image processing device may obtain a sample image set, determine a target object in each sample image in the sample image set, add a first class label and a detection frame to the target object in each sample image, and input each sample image added with the first class label and the detection frame into a deep neural network model for training to obtain the target detection model. In some embodiments, the first class label is used to indicate a class of each target object in each sample image. In some embodiments, the detection frame may be a closed frame composed of lines, wherein the closed frame composed of lines may have any shape, and in one example, the closed frame composed of lines may be a circular frame, a square frame, a polygonal frame, an irregular frame, or the like, which is not limited herein.
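As a minimal sketch of this training step, torchvision's Faster R-CNN stands in for the deep neural network model; the loader interface, function name, and hyperparameters below are assumptions for illustration, not details specified by the patent.

```python
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn

def train_target_detection_model(sample_loader, num_classes, epochs=10):
    """sample_loader yields (images, targets); each target carries the
    first class labels and detection frames added to a sample image."""
    model = fasterrcnn_resnet50_fpn(num_classes=num_classes)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.005, momentum=0.9)
    model.train()
    for _ in range(epochs):
        for images, targets in sample_loader:
            # images: list of CHW float tensors;
            # targets: [{"boxes": FloatTensor[N, 4], "labels": Int64Tensor[N]}, ...]
            loss_dict = model(images, targets)  # dict of detection losses
            loss = sum(loss_dict.values())
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```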
S103: and inputting the target image into a robust feature extraction model to extract a first feature mask set corresponding to each component of the target object in the target image.
In this embodiment of the present invention, the image processing device may input the target image into a robust feature extraction model to extract a first feature mask set corresponding to each component of the target object in the target image. In some embodiments, the first set of feature masks includes one or more feature masks.
In one embodiment, before inputting the target image into the robust feature extraction model, the image processing device may determine components of the target object in each sample image, add a second class label and a feature mask to each component of the target object in each sample image, and input each sample image added with the second class label and the feature mask into the deep neural network model for training, so as to obtain the robust feature extraction model. In some embodiments, the feature mask is composed of numbers to indicate robust features of components of the target object.
In one example, assuming that the target object in the sample image is an automobile, the components of the automobile include tires, windows, a frame, wipers, and the like.
In one embodiment, before inputting the target image into the robust feature extraction model, the image processing device may extract a portion of a subsample image including the target object from the sample image set, determine a component of the target object in each subsample image, add a second class label and a first feature mask to each component of the target object in each subsample image, and input each subsample image added with the second class label and the first feature mask into the deep neural network model for training, so as to obtain the robust feature extraction model.
In one embodiment, when the target image is input into a robust feature extraction model to extract a first feature mask set corresponding to each component of the target object in the target image, the image processing apparatus may input the target image into the robust feature extraction model to determine a pixel coverage area of each component of the target object in the target image and extract the first feature mask set corresponding to the pixel coverage area of each component of the target object.
In one example, assuming that the target object is an automobile whose components include windows, the pixel coverage area of the windows may be represented by a first feature mask rendered in a color distinct from the other components.
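As a hedged sketch of this extraction step, an instance segmentation network (here torchvision's Mask R-CNN, assumed as the robust feature extraction model) predicts the pixel coverage area of each component, which is then binarized into a feature mask; the score and binarization thresholds are illustrative assumptions.

```python
import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn

@torch.no_grad()
def extract_first_feature_mask_set(model, image, score_threshold=0.5):
    """Return {component_label: binary mask} covering the pixel area of
    each component of the target object in one target image."""
    model.eval()
    output = model([image])[0]  # keys: boxes, labels, scores, masks
    feature_masks = {}
    for label, score, mask in zip(output["labels"], output["scores"], output["masks"]):
        if score < score_threshold:
            continue
        # mask is [1, H, W] with soft values; binarize it into the
        # feature mask for this component's pixel coverage area.
        feature_masks[int(label)] = (mask[0] > 0.5).to(torch.uint8)
    return feature_masks

# Example usage (the component-class count is an assumption):
# model = maskrcnn_resnet50_fpn(num_classes=10)  # trained as described above
# masks = extract_first_feature_mask_set(model, image_tensor)
```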
S104: and determining a second feature mask set corresponding to the target category according to the preset corresponding relation between the feature masks and the categories, and calculating a similarity coefficient between the first feature mask set and the second feature mask set.
In this embodiment of the present invention, the image processing device may determine, according to a preset correspondence between feature masks and categories, a second feature mask set corresponding to the target category, and calculate a similarity coefficient between the first feature mask set and the second feature mask set. In some embodiments, the second set of feature masks includes one or more feature masks.
In one embodiment, when calculating the similarity coefficient between the first feature mask set and the second feature mask set, the image processing apparatus may obtain an intersection feature mask of the first feature mask set and the second feature mask set, obtain a union feature mask of the first feature mask set and the second feature mask set, and determine the similarity coefficient between the first feature mask set and the second feature mask set according to an absolute value of a ratio of the intersection feature mask to the union feature mask.
In some embodiments, the correspondence between the preset feature mask and the category may be represented in the form of a matrix, and the matrix is established according to the preset feature mask and the category.
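One illustrative way to hold such a matrix follows; the category names, feature names, and entries are assumptions, not values from the patent.

```python
import numpy as np

feature_names = ["tire", "window", "frame", "wiper", "door"]
categories = ["car", "bus"]
# correspondence[i, j] == 1 means category i is expected to exhibit
# robust feature j according to the preset table.
correspondence = np.array([
    [1, 1, 1, 1, 0],  # car: tires, windows, frame, wipers
    [1, 1, 1, 0, 1],  # bus: tires, windows, frame, doors
], dtype=np.uint8)
```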
In one example, assuming that the first feature mask set is $f_{ri}$ and the second feature mask set is $f_{ei}$, the similarity coefficient $J(f_{ri}, f_{ei})$ between them may be calculated according to the following equation (1):

$$J(f_{ri}, f_{ei}) = \frac{\left| f_{ri} \cap f_{ei} \right|}{\left| f_{ri} \cup f_{ei} \right|} \tag{1}$$
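A minimal sketch of equation (1) follows: the similarity coefficient is the Jaccard index, the size of the intersection over the size of the union. Representing each feature mask set by the union of its binary masks is an interpretation choice, and the helper name is an assumption.

```python
import numpy as np

def similarity_coefficient(first_masks, second_masks):
    """Jaccard similarity J = |intersection| / |union| between two
    feature mask sets, each given as a list of binary HxW arrays."""
    # Collapse each set into a single pixel coverage mask.
    fr = np.any(np.stack([np.asarray(m, dtype=bool) for m in first_masks]), axis=0)
    fe = np.any(np.stack([np.asarray(m, dtype=bool) for m in second_masks]), axis=0)
    intersection = np.logical_and(fr, fe).sum()
    union = np.logical_or(fr, fe).sum()
    # The patent takes the absolute value of the ratio; for binary
    # masks the ratio is already non-negative.
    return float(abs(intersection / union)) if union else 0.0
```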
S105: and correcting the target class of the target object in the target image according to the similarity coefficient.
In this embodiment of the present invention, the image processing device may perform a correction process on the target class of the target object in the target image according to the similarity coefficient.
In one embodiment, when correcting the target class of the target object in the target image according to the similarity coefficient, the image processing device may detect whether the similarity coefficient is greater than a preset threshold. If the similarity coefficient is greater than the preset threshold, it may determine that the target object in the target image has not been subjected to an adversarial attack and leave the target class of the target object uncorrected; if the similarity coefficient is less than or equal to the preset threshold, it may determine that the target object in the target image has been subjected to an adversarial attack and correct the target class of the target object in the target image.
For example, if the image processing device detects that the similarity coefficient J is greater than the preset threshold t, it may determine that the target object bi in the target image has not been subjected to an adversarial attack, leave the target class uncorrected, and take the first class label of the currently identified target object as its final target category; if it detects that the similarity coefficient J is less than or equal to the preset threshold t, it may determine that the target object in the target image has been subjected to an adversarial attack and that the target class of the target object in the target image needs to be corrected.
In an embodiment, when performing the rectification processing on the target category of the target object in the target image, the image processing apparatus may determine, according to the preset correspondence between the feature masks and the categories, a feature mask corresponding to each category, calculate a similarity coefficient between the first feature mask set and the feature mask corresponding to each category, and determine the category corresponding to the maximum similarity coefficient as the target category of the target object.
In an embodiment, when determining that the category corresponding to the maximum similarity coefficient is the target category of the target object, the image processing device may obtain a first category tag corresponding to the maximum similarity coefficient, and add the first category tag to the target object to determine that the category corresponding to the maximum similarity coefficient is the target category of the target object.
For example, the feature mask corresponding to each category is determined according to the preset correspondence between feature masks and categories, the similarity coefficient between the first feature mask set and the feature mask corresponding to each category is calculated, and the category of the first class label cj corresponding to the maximum similarity coefficient is determined as the target category of the target object.
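A hedged sketch of this correction step follows. It reuses the similarity_coefficient helper from the sketch above and assumes a class_to_masks mapping from each category label to its preset feature mask set; the threshold value is illustrative.

```python
def correct_target_class(first_masks, predicted_class, class_to_masks, threshold=0.5):
    """Return the (possibly corrected) target class of the target object."""
    # S105: compare the extracted masks against the preset masks of the
    # class predicted by the detection model.
    j = similarity_coefficient(first_masks, class_to_masks[predicted_class])
    if j > threshold:
        # Not adversarially attacked: keep the detector's prediction.
        return predicted_class
    # Attacked: re-assign the category whose preset feature masks have
    # the maximum similarity coefficient with the extracted masks.
    return max(class_to_masks,
               key=lambda c: similarity_coefficient(first_masks, class_to_masks[c]))
```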
In the embodiment of the present invention, the image processing apparatus may acquire a target image to be processed, input the target image into a target detection model for detection, input the target image into a robust feature extraction model to extract a first feature mask set corresponding to each component of the target object in the target image, determine a second feature mask set corresponding to the target category according to a preset correspondence between feature masks and categories, and calculate a similarity coefficient between the first feature mask set and the second feature mask set, so as to correct the target class of the target object in the target image according to the similarity coefficient. Classifying and correcting the target object in an attacked target image on the basis of robust features increases the difficulty of cracking by adversarial attack: an attacker must change not only the prediction class of the model but also each robust feature. Meanwhile, the original deep neural network model can still be used to predict target images that have not been attacked, so the high prediction accuracy provided by non-robust features is retained, and the efficiency and accuracy of image processing are effectively improved.
The embodiment of the invention also provides an image processing device, which is configured to execute the units of the method described in the foregoing embodiments. Specifically, referring to fig. 2, fig. 2 is a schematic block diagram of an image processing apparatus according to an embodiment of the present invention. The image processing apparatus of the present embodiment includes: an acquisition unit 201, a detection unit 202, an extraction unit 203, a determination unit 204, and a correction unit 205.
An obtaining unit 201, configured to obtain a target image to be processed, where the target image includes a target object;
a detection unit 202, configured to input the target image into a target detection model for detection, so as to identify a target detection frame and a target category corresponding to the target object from the target image;
an extracting unit 203, configured to input the target image into a robust feature extraction model to extract a first feature mask set corresponding to each component of the target object in the target image;
a determining unit 204, configured to determine, according to a preset correspondence between feature masks and categories, a second feature mask set corresponding to the target category, and calculate a similarity coefficient between the first feature mask set and the second feature mask set;
a correcting unit 205, configured to perform correction processing on the target class of the target object in the target image according to the similarity coefficient.
Further, before the detection unit 202 inputs the target image into a target detection model for detection, the detection unit is further configured to:
acquiring a sample image set, and determining a target object in each sample image in the sample image set;
adding a first class label and a detection frame to the target object in each sample image;
and inputting the sample images added with the first class labels and the detection frames into a deep neural network model for training to obtain the target detection model.
Further, before the extracting unit 203 inputs the target image into the robust feature extraction model, it is further configured to:
determining components of the target object in each sample image;
adding a second class label and a feature mask to each component of the target object in each sample image;
and inputting the sample images added with the second class labels and the feature masks into the deep neural network model for training to obtain the robust feature extraction model.
Further, when the extracting unit 203 inputs the target image into a robust feature extraction model to extract a first feature mask set corresponding to each component of the target object in the target image, specifically, the extracting unit is configured to:
inputting the target image into a robust feature extraction model to determine pixel coverage areas of all components of the target object in the target image;
and extracting the first feature mask set corresponding to the pixel coverage area of each component of the target object.
Further, when the correcting unit 205 performs the correction processing on the target class of the target object in the target image according to the similarity coefficient, specifically, the correcting unit is configured to:
detecting whether the similarity coefficient is larger than a preset threshold;
if the detection result is that the similarity coefficient is larger than the preset threshold, determining that the target object in the target image has not been subjected to an adversarial attack, and not correcting the target class of the target object;
and if the detection result is that the similarity coefficient is smaller than or equal to the preset threshold, determining that the target object in the target image has been subjected to an adversarial attack, and correcting the target class of the target object in the target image.
Further, when the rectification unit 205 performs the rectification processing on the target class of the target object in the target image, it is specifically configured to:
determining a feature mask corresponding to each category according to the corresponding relation between the preset feature masks and the categories;
calculating similarity coefficients of the first feature mask set and the feature masks corresponding to each category;
and determining the category corresponding to the maximum similarity coefficient as the target category of the target object.
Further, when the determining unit 204 calculates the similarity coefficient between the first feature mask set and the second feature mask set, it is specifically configured to:
acquiring an intersection feature mask of the first feature mask set and the second feature mask set;
acquiring a union feature mask of the first feature mask set and the second feature mask set;
and determining a similarity coefficient between the first feature mask set and the second feature mask set according to the absolute value of the ratio of the intersection feature mask to the union feature mask.
In the embodiment of the present invention, the image processing apparatus may acquire a target image to be processed, input the target image into a target detection model for detection, input the target image into a robust feature extraction model to extract a first feature mask set corresponding to each component of the target object in the target image, determine a second feature mask set corresponding to the target category according to a preset correspondence between feature masks and categories, and calculate a similarity coefficient between the first feature mask set and the second feature mask set, so as to correct the target class of the target object in the target image according to the similarity coefficient. Classifying and correcting the target object in an attacked target image on the basis of robust features increases the difficulty of cracking by adversarial attack: an attacker must change not only the prediction class of the model but also each robust feature. Meanwhile, the original deep neural network model can still be used to predict target images that have not been attacked, so the high prediction accuracy provided by non-robust features is retained, and the efficiency and accuracy of image processing are effectively improved.
Referring to fig. 3, fig. 3 is a schematic block diagram of a server according to an embodiment of the present invention. The server in this embodiment may include: one or more processors 301, one or more input devices 302, one or more output devices 303, and a memory 304. The processor 301, the input device 302, the output device 303, and the memory 304 are connected by a bus 305. The memory 304 is configured to store a computer program, and the processor 301 is configured to execute the program stored in the memory 304. Specifically, the processor 301 is configured to call the program to perform:
acquiring a target image to be processed, wherein the target image comprises a target object;
inputting the target image into a target detection model for detection so as to identify a target detection frame and a target category corresponding to the target object from the target image;
inputting the target image into a robust feature extraction model to extract a first feature mask set corresponding to each component of the target object in the target image;
determining a second feature mask set corresponding to the target category according to a preset corresponding relation between feature masks and categories, and calculating a similarity coefficient between the first feature mask set and the second feature mask set;
and correcting the target class of the target object in the target image according to the similarity coefficient.
Further, before the processor 301 inputs the target image into a target detection model for detection, the processor is further configured to:
acquiring a sample image set, and determining a target object in each sample image in the sample image set;
adding a first class label and a detection frame to the target object in each sample image;
and inputting the sample images added with the first class labels and the detection frames into a deep neural network model for training to obtain the target detection model.
Further, before the processor 301 inputs the target image into the robust feature extraction model, it is further configured to:
determining components of the target object in each sample image;
adding a second class label and a feature mask to each component of the target object in each sample image;
and inputting the sample images added with the second class labels and the feature masks into the deep neural network model for training to obtain the robust feature extraction model.
Further, when the processor 301 inputs the target image into a robust feature extraction model to extract a first feature mask set corresponding to each component of the target object in the target image, specifically, the processor is configured to:
inputting the target image into a robust feature extraction model to determine pixel coverage areas of all components of the target object in the target image;
and extracting the first feature mask set corresponding to the pixel coverage area of each component of the target object.
Further, when the processor 301 performs the rectification processing on the target class of the target object in the target image according to the similarity coefficient, specifically, the processor is configured to:
detecting whether the similarity coefficient is larger than a preset threshold;
if the detection result is that the similarity coefficient is larger than the preset threshold, determining that the target object in the target image has not been subjected to an adversarial attack, and not correcting the target class of the target object;
and if the detection result is that the similarity coefficient is smaller than or equal to the preset threshold, determining that the target object in the target image has been subjected to an adversarial attack, and correcting the target class of the target object in the target image.
Further, when the processor 301 performs the rectification processing on the target class of the target object in the target image, specifically, the processor is configured to:
determining a feature mask corresponding to each category according to the corresponding relation between the preset feature masks and the categories;
calculating similarity coefficients of the first feature mask set and the feature masks corresponding to each category;
and determining the category corresponding to the maximum similarity coefficient as the target category of the target object.
Further, when the processor 301 calculates the similarity coefficient between the first feature mask set and the second feature mask set, it is specifically configured to:
acquiring an intersection feature mask of the first feature mask set and the second feature mask set;
acquiring a union feature mask of the first feature mask set and the second feature mask set;
and determining a similarity coefficient between the first feature mask set and the second feature mask set according to the absolute value of the ratio of the intersection feature mask to the union feature mask.
In the embodiment of the present invention, a server may acquire a target image to be processed, input the target image into a target detection model for detection, input the target image into a robust feature extraction model to extract a first feature mask set corresponding to each component of the target object in the target image, determine a second feature mask set corresponding to the target category according to a preset correspondence between feature masks and categories, and calculate a similarity coefficient between the first feature mask set and the second feature mask set, so as to correct the target class of the target object in the target image according to the similarity coefficient. Classifying and correcting the target object in an attacked target image on the basis of robust features increases the difficulty of cracking by adversarial attack: an attacker must change not only the prediction class of the model but also each robust feature. Meanwhile, the original deep neural network model can still be used to predict target images that have not been attacked, so the high prediction accuracy provided by non-robust features is retained, and the efficiency and accuracy of image processing are effectively improved.
It should be understood that, in the embodiment of the present invention, the Processor 301 may be a Central Processing Unit (CPU), and the Processor may also be other general processors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field-Programmable gate arrays (FPGAs) or other Programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, and the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The input device 302 may include a touch pad, a microphone, etc., and the output device 303 may include a display (LCD, etc.), a speaker, etc.
The memory 304 may include a read-only memory and a random access memory, and provides instructions and data to the processor 301. A portion of the memory 304 may also include non-volatile random access memory. For example, the memory 304 may also store device type information.
In a specific implementation, the processor 301, the input device 302, and the output device 303 described in this embodiment of the present invention may execute the implementation described in the method embodiment shown in fig. 1 provided in this embodiment of the present invention, and may also execute the implementation of the image processing device described in fig. 2 in this embodiment of the present invention, which is not described again here.
The embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored; when the computer program is executed by a processor, the image processing method described in the embodiment corresponding to fig. 1 is implemented, and the image processing apparatus according to the embodiment corresponding to fig. 2 of the present invention may also be implemented, which is not described herein again.
The computer readable storage medium may be an internal storage unit of the image processing apparatus according to any of the foregoing embodiments, for example, a hard disk or a memory of the image processing apparatus. The computer-readable storage medium may also be an external storage device of the image processing apparatus, such as a plug-in hard disk provided on the image processing apparatus, a Smart Memory Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), or the like. Further, the computer-readable storage medium may also include both an internal storage unit and an external storage device of the image processing apparatus. The computer-readable storage medium is used for storing the computer program and other programs and data required by the image processing apparatus. The computer readable storage medium may also be used to temporarily store data that has been output or is to be output.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention essentially or partially contributes to the prior art, or all or part of the technical solution can be embodied in the form of a software product stored in a computer-readable storage medium, which includes several instructions for causing a computer device (which may be a personal computer, a terminal, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned computer-readable storage media comprise: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above description is only a part of the embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive various equivalent modifications or substitutions within the technical scope of the present invention, and these modifications or substitutions should be covered within the scope of the present invention.

Claims (9)

1. An image processing method, comprising:
acquiring a target image to be processed, wherein the target image comprises a target object;
inputting the target image into a target detection model for detection so as to identify a target detection frame and a target category corresponding to the target object from the target image;
inputting the target image into a robust feature extraction model to extract a first feature mask set corresponding to each component of the target object in the target image, wherein the first feature mask set comprises one or more feature masks, and each feature mask is composed of numbers and is used for indicating robust features of each component of the target object;
the inputting the target image into a robust feature extraction model to extract a first feature mask set corresponding to each component of the target object in the target image includes:
inputting the target image into a robust feature extraction model to determine pixel coverage areas of all components of the target object in the target image;
extracting the first feature mask set corresponding to the pixel coverage area of each component of the target object;
determining a second feature mask set corresponding to the target category according to a preset corresponding relation between feature masks and categories, and calculating a similarity coefficient between the first feature mask set and the second feature mask set, wherein the second feature mask set comprises one or more feature masks;
and correcting the target class of the target object in the target image according to the similarity coefficient.
2. The method of claim 1, wherein before inputting the target image into a target detection model for detection, the method comprises:
acquiring a sample image set, and determining a target object in each sample image in the sample image set;
adding a first class label and a detection frame to the target object in each sample image;
and inputting the sample images added with the first class labels and the detection frames into a deep neural network model for training to obtain the target detection model.
3. The method of claim 2, wherein before inputting the target image into a robust feature extraction model, further comprising:
determining components of the target object in each sample image;
adding a second class label and a feature mask to each component of the target object in each sample image;
and inputting the sample images added with the second class labels and the feature masks into the deep neural network model for training to obtain the robust feature extraction model.
4. The method according to claim 1, wherein the performing the rectification processing on the target class of the target object in the target image according to the similarity coefficient comprises:
detecting whether the similarity coefficient is larger than a preset threshold;
if the detection result is that the similarity coefficient is larger than the preset threshold, determining that the target object in the target image has not been subjected to an adversarial attack, and not correcting the target class of the target object;
and if the detection result is that the similarity coefficient is smaller than or equal to the preset threshold, determining that the target object in the target image has been subjected to an adversarial attack, and correcting the target class of the target object in the target image.
5. The method according to claim 4, wherein the performing of the rectification processing on the target class of the target object in the target image comprises:
determining a feature mask corresponding to each category according to the corresponding relation between the preset feature masks and the categories;
calculating similarity coefficients of the first feature mask set and the feature masks corresponding to each category;
and determining the category corresponding to the maximum similarity coefficient as the target category of the target object.
6. The method of claim 1, wherein the calculating a similarity coefficient between the first set of feature masks and the second set of feature masks comprises:
acquiring an intersection feature mask of the first feature mask set and the second feature mask set;
acquiring a union feature mask of the first feature mask set and the second feature mask set;
and determining a similarity coefficient between the first feature mask set and the second feature mask set according to the absolute value of the ratio of the intersection feature mask to the union feature mask.
7. An image processing apparatus characterized by comprising:
the device comprises an acquisition unit, a processing unit and a processing unit, wherein the acquisition unit is used for acquiring a target image to be processed, and the target image comprises a target object;
the detection unit is used for inputting the target image into a target detection model for detection so as to identify a target detection frame and a target category corresponding to the target object from the target image;
an extracting unit, configured to input the target image into a robust feature extraction model to extract a first feature mask set corresponding to each component of the target object in the target image, where the first feature mask set includes one or more feature masks, and each feature mask is composed of numbers and is used to indicate a robust feature of each component of the target object;
the extraction unit is configured to, when inputting the target image into a robust feature extraction model to extract a first feature mask set corresponding to each component of the target object in the target image, specifically, input the target image into the robust feature extraction model to determine a pixel coverage area of each component of the target object in the target image; extracting the first feature mask set corresponding to the pixel coverage area of each component of the target object;
a determining unit, configured to determine a second feature mask set corresponding to the target category according to a preset correspondence between feature masks and categories, and calculate a similarity coefficient between the first feature mask set and the second feature mask set, where the second feature mask set includes one or more feature masks;
and the correcting unit is used for correcting the target class of the target object in the target image according to the similarity coefficient.
8. A server, comprising a processor, an input device, an output device, and a memory, the processor, the input device, the output device, and the memory being interconnected, wherein the memory is configured to store a computer program, the computer program comprising a program, the processor being configured to invoke the program to perform the method of any of claims 1-6.
9. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which is executed by a processor to implement the method of any one of claims 1-6.
CN202010949537.8A 2020-09-10 2020-09-10 Image processing method, device, server and storage medium Active CN111814776B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010949537.8A CN111814776B (en) 2020-09-10 2020-09-10 Image processing method, device, server and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010949537.8A CN111814776B (en) 2020-09-10 2020-09-10 Image processing method, device, server and storage medium

Publications (2)

Publication Number Publication Date
CN111814776A CN111814776A (en) 2020-10-23
CN111814776B (en) 2020-12-15

Family

ID=72860174

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010949537.8A Active CN111814776B (en) 2020-09-10 2020-09-10 Image processing method, device, server and storage medium

Country Status (1)

Country Link
CN (1) CN111814776B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112465691A (en) * 2020-11-25 2021-03-09 北京旷视科技有限公司 Image processing method, image processing device, electronic equipment and computer readable medium
CN112364843A (en) * 2021-01-11 2021-02-12 中国科学院自动化研究所 Plug-in aerial image target positioning detection method, system and equipment
CN113033334A (en) * 2021-03-05 2021-06-25 北京字跳网络技术有限公司 Image processing method, apparatus, electronic device, medium, and computer program product

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110728188A (en) * 2019-09-11 2020-01-24 北京迈格威科技有限公司 Image processing method, device, system and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015171450A1 (en) * 2014-05-09 2015-11-12 Graphiclead LLC System and method for embedding of a two dimensional code with an image
CN109670429B (en) * 2018-12-10 2021-03-19 广东技术师范大学 Method and system for detecting multiple targets of human faces of surveillance videos based on instance segmentation
CN109815874A (en) * 2019-01-17 2019-05-28 苏州科达科技股份有限公司 A kind of personnel identity recognition methods, device, equipment and readable storage medium storing program for executing

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110728188A (en) * 2019-09-11 2020-01-24 北京迈格威科技有限公司 Image processing method, device, system and storage medium

Also Published As

Publication number Publication date
CN111814776A (en) 2020-10-23


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant