CN113743231B - Video target detection avoidance system and method - Google Patents
- Publication number: CN113743231B (application CN202110909116.7A)
- Authority: CN (China)
- Legal status: Active
Classifications
- G06F18/214 — Pattern recognition; analysing; design or setup of recognition systems or techniques; generating training patterns; bootstrap methods, e.g. bagging or boosting (G—Physics; G06—Computing; G06F—Electric digital data processing)
- Y02T10/40 — Engine management systems (Y02T—Climate change mitigation technologies related to transportation)
Abstract
The invention discloses a video target detection avoidance system and method. The model-adaptive training module detects objects to which an avoidance patch is attached and extracts the number and confidence of human-body detections, removing any dependence on the type and parameters of the actual detection model. The patch distance adaptive module updates the patch for different distances, based on a threshold specified by the user or preset by the system, so that the patch remains protective at all ranges. The multi-loss-function calculation module and the digital-world patch attachment module implement clothing-wrinkle simulation of the patch in the digital world, physical-world color transformation, and training-loss constraints on the pictures, ensuring that the patch remains robust when transferred to the physical world. The method is not tied to a specific model, is effective against different models, is robust in the physical world, and meets user-side privacy-protection requirements.
Description
Technical Field
The invention belongs to the technical field of using adversarial samples to protect against the privacy disclosure caused by target detection in computer vision. It relates to a video target detection avoidance system and method, and in particular to a universal privacy-protection system and method against human-body target detection.
Background
Computer vision has developed rapidly in fields such as behavior tracking and intelligent monitoring. Target detection and recognition, as core technologies, bring convenience to people but also pose serious security challenges and risks to personal privacy. Users often rely on camouflage such as glasses, hats and masks to avoid privacy disclosure, but these measures are inconvenient when traveling and cannot technically defeat video target detection.
The main reason target detection and recognition pose such a risk to personal privacy is that the cost of building such a platform is extremely low. According to one survey, the YOLOv3 human-body target detection and recognition model runs at 40 FPS on a Raspberry Pi 4B+ development board. This means that individuals or enterprises can build such a human-body target detection system at very low cost (a camera, a Raspberry Pi development board and open-source model code) and thereby acquire massive amounts of pedestrian data. As shown in figs. 1-2, such a human-body target detection and feature-extraction system collects not only the captured pedestrian images but also personal privacy information covering detailed features such as behavior, whereabouts, faces and clothing, obtained after the detection model analyzes and processes the data.
In recent years, interfering with target detection using adversarial samples has become a hotspot of academic research, but many problems remain. Adversarial-sample-based video target detection interference perturbs the detection network by generating a specific adversarial sample so as to protect the user's whereabouts. However, most existing adversarial samples are generated against a single white-box model, whereas real target detection models are structurally complex; such samples rarely meet users' actual privacy-protection needs and suffer from poor portability, easy detection and failure at long range. How to improve a sample's interference capability against multiple detection models, guarantee its validity over the full distance range, and improve its naturalness and portability is the key open challenge for adversarial-sample-based video target detection interference.
Disclosure of Invention
In view of the drawbacks of conventional privacy-protection schemes and the safety and performance requirements for protecting human privacy features in the real physical world, the invention provides a human-body target detection avoidance system and method, based on a surrogate model generated from multiple models, with high universality, high distance adaptability and good semantics in the physical world.
The technical scheme adopted by the system of the invention is as follows: a video target detection avoidance system comprising a multi-model model-adaptive gray-box training module, a threshold-based patch distance adaptive module, a multi-loss-function calculation module and a digital-world patch attachment module;
the multi-model model-adaptive gray-box training module (YOLO, SSD, Faster RCNN, etc.) detects target pictures to which an avoidance patch is attached and extracts the number and confidence of human-body detections;
the threshold-based patch distance adaptive module sets a user or system threshold, decides the threshold distance and adaptively updates the patch with distance;
the multi-loss-function calculation module applies training-loss constraints, including smoothness loss, pixel-change loss, non-printable-color loss and the like, where the smoothness and pixel-change losses smooth the image to preserve the picture's semantic information;
the digital-world patch attachment module simulates clothing wrinkles and physical-world color transformation, modelling the environmental changes of the physical world and applying the corresponding transformations to the avoidance patch so as to improve its robustness in the physical world.
The technical scheme adopted by the method of the invention is as follows: a video target detection avoidance method comprising the following steps:
step 1: detecting target pictures to which an avoidance patch is attached, based on the multi-model model-adaptive gray-box training module, and extracting the number and confidence of human-body detections; distributing the results to the patch distance adaptive module;
step 2: performing threshold setting and threshold-distance decision, and updating the distance-adaptive features of the relevant region of the target patch according to the result parameters distributed by the system; feeding the iteratively updated patch pictures into the digital-world patch attachment module to generate a new target detection data set bearing the patch pictures;
step 3: simulating clothing wrinkles, performing physical-world color transformation and applying training-loss constraints; calculating the loss indexes of the patch picture during training, then saving and distributing them to the patch distance adaptive module;
the clothing-wrinkle simulation generates wrinkle-simulating distortions for the patch based on a built-in two-dimensional wrinkle data set and stores the distorted patch for later use;
the physical-world color transformation uses a multi-layer perceptron to establish a mapping between digital-world colors and physical-world printable colors, then fits the patch picture's colors to the printable colors;
the training-loss constraint calculates the loss indexes of the patch picture during training based on the surrogate model's output and the loss functions generated in the model-adaptive gray-box training module (smoothness loss, pixel-change loss, non-printable-color loss and the like), and saves and distributes them to the patch distance adaptive module.
Compared with the prior art, the advantages and positive effects of the invention are mainly as follows:
(1) A model-adaptive gray-box training method is proposed, improving the universality of the generated avoidance patches in the physical world and their ability to perturb multiple models; on this basis, an avoidance patch effective against many detection models can be generated.
(2) The invention designs a distance-adaptive patch generation algorithm, improving the adversarial sample's adaptability to attack distance, guaranteeing the effectiveness of the adversarial patch over the full 2-10 m range, and improving its ability to perturb models at all distances; it is suited to the real physical world.
(3) The invention provides a smooth-image-based semantic retention mechanism, which retains the semantics of the original picture to a certain extent and improves the semantics and naturalness of the sample. The clothing pattern is thus not easily noticed by passers-by when worn in the physical world.
Drawings
FIG. 1 is a system frame diagram of an embodiment of the present invention.
FIG. 2 is a schematic diagram of a model adaptive gray box training module based on multiple models in an embodiment of the present invention.
Fig. 3 is a schematic diagram of a threshold-based patch distance adaptation module in an embodiment of the present invention.
Fig. 4 is a schematic diagram of a calculation module based on a plurality of loss functions and a digital world patch fitting module according to an embodiment of the present invention.
FIG. 5 is a block diagram of an alternative Model1 for modeling real world detection in accordance with an embodiment of the present invention.
Fig. 6 is an application scenario diagram of an embodiment of the present invention.
Detailed Description
To facilitate understanding and practice of the invention by those of ordinary skill in the art, the invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the embodiments described herein serve only to illustrate and explain the invention and are not intended to limit it.
Referring to fig. 1, the video target detection avoidance system provided by the invention comprises a model adaptive gray box training module based on multiple models, a patch distance self-adaptive module based on threshold values, a calculation module based on multiple loss functions and a digital world patch fitting module.
The multi-model model-adaptive gray-box training module (comprising YOLO, SSD, Faster RCNN and the like) detects target pictures to which an avoidance patch (also called a 'stealth clothing' patch) is attached and extracts the number and confidence of human-body detections; this process requires neither the type nor the parameters of the model used in actual detection, only the surrogate model preset in the system;
the threshold-based patch distance adaptive module sets a user or system threshold, decides the threshold distance and adaptively updates the patch with distance; in this process, the threshold set by the user or preset by the system serves as the decision basis for the patch update region, and the system updates the patch image to ensure its protective performance at different distances;
the multi-loss-function calculation module of this embodiment applies training-loss constraints, including smoothness loss, pixel-change loss, non-printable-color loss and the like, where the smoothness and pixel-change losses smooth the image to preserve the picture's semantic information;
the digital-world patch attachment module simulates clothing wrinkles and physical-world color transformation, modelling the environmental changes of the physical world and applying the corresponding transformations to the avoidance patch, thereby improving its robustness in the physical world.
The video target detection avoidance method provided by the embodiment comprises the following steps:
step 1: detecting a target picture attached with a stealth clothing patch based on a model adaptive gray box training module of a plurality of models, and extracting the number and the confidence of human body detection; distributing the patch distance to a patch distance self-adaptive module;
referring to fig. 2, the specific process of step 1 in this embodiment includes the following steps:
step A1: in the stage of detecting the target bearing the 'stealth clothing' patch, the user or system determines the model detection order and number of passes; a surrogate model is generated from the models built into the system and serial detection is performed on the target picture to which the patch is attached;
a1.1: before the system runs, the model detection order and the number of rounds (epochs) are determined from user-specified or preset parameters, and a surrogate Model1 simulating real-world detection is generated and saved for the training iterations of the patch adaptability training;
referring to FIG. 5, the surrogate Model1 of the present embodiment comprises a YOLOv2 layer, a Faster RCNN layer, a YOLOv2/SSD layer, a Faster RCNN layer and a Faster RCNN layer connected in sequence;
in this embodiment, one system iteration attaches the avoidance patch to the human-target picture through the digital-world patch attachment module, obtains confidences and prediction boxes through the multi-model model-adaptive gray-box training module, obtains the loss values through the loss-function calculation module, and finally updates the patch through the patch distance adaptive module. The user may specify the number of iterations or use the system's built-in count; the trained avoidance patch is output at the end.
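The iteration cycle described above can be sketched as follows. The module interfaces (`attach`, `detect`, `total_loss`, `update`) are hypothetical placeholders standing in for the patent's four modules, not its actual implementation:

```python
def train_patch(patch, images, attach, detect, total_loss, update, steps=10):
    """One full system iteration per step: attach the patch, run the
    surrogate detector, compute the combined loss, update the patch."""
    for _ in range(steps):
        patched = [attach(img, patch) for img in images]   # digital-world patch attachment
        dets = [detect(p) for p in patched]                # gray-box surrogate detection
        loss = total_loss(dets, patch)                     # multi-loss-function calculation
        patch = update(patch, loss)                        # distance-adaptive patch update
    return patch
```

Any concrete attachment, detection, loss and update routines can be passed in; the loop itself only fixes the order in which the four modules interact.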
A1.2: the system distributes the target pictures (in batches) bearing the 'stealth clothing' patch generated by the patch attachment module to the patch distance adaptive module, and performs serial recognition and detection using the surrogate Model1 generated in A1.1.
Step A2: the human body detection number and confidence level stage is extracted, after detection, the system acquires the human body target detection number and confidence level of a target picture output by the substitution model, and distributes the extracted result to an optimizer arranged in the module;
a2.1: through detection by the generated surrogate model, the system extracts the prediction confidences and the number of prediction boxes for the target picture bearing the 'stealth clothing' patch output by that model;
a2.2: after the surrogate-model results are collected and formatted, the system distributes them to the patch distance adaptive module, and the patch picture is actually updated according to these result parameters.
Assuming each batch contains batch pictures, the formatted results are the confidence-weighted sum probe and the number of prediction boxes pred_num. For each picture, probe is the weighted sum of the confidences of all bounding boxes whose confidence exceeds the set threshold conf_thresh and whose predicted class is 'person'; pred_num is the number of bounding boxes whose predicted class is 'person' and whose prediction is correct.
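A minimal sketch of this formatting step, assuming each detection box is a dict with hypothetical `conf`/`cls`/`correct` fields (the patent does not specify the data layout):

```python
def summarize_detections(boxes, conf_thresh=0.5, person_cls="person"):
    """probe: confidence-weighted sum over boxes whose confidence exceeds
    conf_thresh and whose predicted class is 'person';
    pred_num: number of boxes predicted as 'person' and marked correct."""
    probe = sum(b["conf"] for b in boxes
                if b["conf"] > conf_thresh and b["cls"] == person_cls)
    pred_num = sum(1 for b in boxes
                   if b["cls"] == person_cls and b.get("correct", True))
    return probe, pred_num
```

Training then drives probe and pred_num downward, since both quantify how visible the person wearing the patch still is to the surrogate model.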
Step 2: setting a threshold value and deciding a threshold distance, and updating the distance adaptability characteristic of the relevant area of the target patch to be updated according to the result parameters distributed by the system in A2.2; the patch pictures after iteration updating are sent into a digital world patch attaching module to generate a new target detection data set attached with the patch pictures;
please refer to fig. 3, the specific process of step 2 in this embodiment includes the following steps:
step B1: threshold setting: the user decides whether to set the distance threshold parameter autonomously or to adopt the built-in preset threshold S_thres, which the system submits to the distance adaptive module; on receiving the surrogate-model results distributed by the system, the distance adaptive module runs its module function based on the previously submitted threshold S_thres;
step B1.1: before the system operates, the user main body decides to autonomously set a distance threshold or adopts a built-in preset threshold for deciding a target patch image updating range;
step B1.2: the system performs legal inspection on the distance threshold parameters submitted by the user, distributes the distance threshold parameters to the patch distance self-adaptive module after passing the verification, and pre-stores the distance threshold;
step B1.3: after receiving the generation result of the substitution model distributed by the system, the patch distance self-adaptive module extracts a pre-stored distance threshold value and a target patch distributed by the system and executes the corresponding module function;
step B2: threshold distance decision: the patch distance adaptive module normalizes the length and width of the input patch picture to be updated, and determines the patch update range based on the previously set threshold. It should be noted that since the patch picture is square, the update range is also a proportionally scaled square area;
step B2.1: after the patch distance self-adaptive module receives the distributed target patch to be updated, the module normalizes the length and width of the patch picture, so that the patch updating range can be conveniently and subsequently determined;
if the size ratio of the picture identification frame attached with the patch to the patch is smaller than the threshold value, the system decides that the patch is in a long-distance scene at the moment, and decides that the patch updating range is the full-image;
if the size ratio of the picture identification frame attached with the patch to the patch exceeds a threshold value, the system decides that the patch is in a short-distance scene at the moment, decides an updating area as a picture center, and decides the updating range ratio of the patch through the system;
step B2.2: after the system makes the threshold-based decision, it determines the anchor point and anchors the patch update region;
step B2.3: the system passes the anchored update-region information to the patch distance adaptive module, which performs the distance-adaptive update of the patch;
step B3: patch distance adaptive updating, when a patch distance adaptive module obtains the range of a patch updating area to be updated based on a distance threshold value, calculating a corresponding mask M, and updating the distance adaptive characteristics of the relevant area of the target patch to be updated according to the result parameters distributed by the system in A2.2;
step B3.1: after obtaining the patch update region to be updated, the patch distance adaptive module requests from the system the result parameters of A2.2 and the indexes from the loss-function calculation module for the patch update;
step B3.2: once the result parameters and calculated indexes have been pushed to it, the patch distance adaptive module updates the features of the corresponding patch region, including patterns, textures, colors and the like;
step B3.3: the patch pictures after iterative updating are sent into a digital world patch attaching module to generate a new target detection data set attached with the patch pictures;
the L2 regularized weight attenuation method is optimized in updating optimization, and the problem of model overfitting is reduced to a certain extent.
Step 3: simulating clothes folds, carrying out physical world color transformation and training loss constraint, calculating to obtain various loss indexes of the patch picture in the training process, and storing and distributing the loss indexes to a patch distance self-adaptive module;
the method comprises the steps of simulating clothes wrinkles, generating wrinkles to simulate distortion for patches based on a built-in two-position wrinkles data set, and storing the distorted patches for later use;
the physical world color transformation, using a multi-layer perceptron method, establishing a mapping relation between the digital world colors and the physical world printable colors, and then fitting the patch picture colors to the physical world printable colors;
and training loss constraint, calculating various loss indexes of the patch picture in the training process based on the generated output result of the substitution model and various loss functions (including smoothness loss, pixel change loss, non-printable color loss and the like), and saving and distributing the loss indexes to a patch self-adaptive updating module.
Please refer to fig. 4, the specific process of step 3 in this embodiment includes the following steps:
step C1: clothing-wrinkle simulation: wrinkle-simulating distortions are generated for the patch based on the built-in two-dimensional wrinkle data set, and the distorted patches are stored for later use;
step C1.1: the clothing-wrinkle simulation function loads the system's built-in data set data_tps of two-dimensional anchor-point distortions under different human poses, together with the color-converted patch picture patch_cnv, so that the distortion function f_tps can be called to distort the patch picture and provide data support;
step C1.2: the clothing-wrinkle simulation function loads the designed distortion function f_tps, applies a two-dimensional image distortion to the target patch picture, and simulates the appearance of the patch as worn on a human body;
step C1.3: the clothing-wrinkle simulation function saves and distributes the distorted patches to the attacher in the digital-world patch attachment module, which attaches the patch pictures to digital-world human bodies to form the data set of the model-adaptive gray-box training module;
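As a rough illustration of step C1 — not the actual anchor-point TPS distortion f_tps, whose parameters are not given here — a sinusoidal row shift can stand in for cloth-fold distortion on a single-channel patch:

```python
import math

def wrinkle_warp(patch, amp=1.0, freq=0.5):
    """Shift each row horizontally by a sinusoidal offset to mimic cloth
    folds; nearest-neighbour resampling with edge clamping."""
    h, w = len(patch), len(patch[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        shift = int(round(amp * math.sin(freq * i)))   # per-row fold offset
        for j in range(w):
            src = min(max(j - shift, 0), w - 1)        # clamp at the edges
            out[i][j] = patch[i][src]
    return out
```

A real implementation would instead interpolate between the anchor-point pairs stored in data_tps, but the training pipeline consumes the result the same way: a distorted patch of unchanged size.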
step C2: the physical world color transformation, using a multi-layer perceptron method, establishing a mapping relation f between the digital world colors and the physical world printable colors, and then fitting the patch picture colors to the physical world printable colors;
step C2.1: before the system runs, the color-conversion function in the digital-world patch attachment module loads the digital-world colors color_digital and the built-in physical-world colors color_physical into a three-layer fully connected BP network to generate a color fitting between the physical and digital worlds;
step C2.2: the color conversion reads the patch picture patch_origin to be color-converted, transforms its colors through the generated color fitting to produce patch_cnv, and pushes it to the clothing-wrinkle simulation function of the digital-world patch attachment module;
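A minimal sketch of the forward pass of such a color-fitting network, assuming a plain fully connected architecture with sigmoid activations (the exact weights and activations of the patent's three-layer BP network are not specified):

```python
import math

def mlp_color_map(rgb, layers):
    """Forward pass of a small fully connected network mapping a
    digital-world RGB triple toward a printable colour; the sigmoid
    keeps every output channel in [0, 1]."""
    x = list(rgb)
    for weights, biases in layers:   # one (W, b) pair per layer
        x = [1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(row, x)) + b)))
             for row, b in zip(weights, biases)]
    return x
```

In training, the layer weights would be fitted on measured pairs of digital colors and their printed appearance, so that the loss can be computed on colors the printer can actually reproduce.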
step C3: training loss constraint, calculating various loss indexes loss of the patch picture in the training process based on the generated output result of the substitution model and various loss functions f (x), and storing and distributing the loss indexes loss to a patch self-adaptive updating module;
step C3.1: the system transmits the result parameters of the model adaptive gray box training module to the loss function calculation module;
step C3.2: the loss function calculation module calculates and obtains various loss indexes loss in patch adaptability training by loading the input result parameters and calling a plurality of loss functions f (x);
step C3.3: the loss function calculation module processes each loss index obtained through calculation and then transmits the processed loss indexes to the patch distance self-adaptive module for updating patch characteristics.
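Two of the loss terms named above can be sketched on a single-channel patch; the patent does not give their exact formulations, so these follow the common total-variation and non-printability definitions:

```python
def tv_loss(patch):
    """Total-variation smoothness loss: squared differences between
    horizontally and vertically adjacent pixels."""
    h, w = len(patch), len(patch[0])
    loss = 0.0
    for i in range(h):
        for j in range(w):
            if j + 1 < w:
                loss += (patch[i][j] - patch[i][j + 1]) ** 2
            if i + 1 < h:
                loss += (patch[i][j] - patch[i + 1][j]) ** 2
    return loss

def non_printable_loss(patch, printable):
    """Distance from every pixel to its nearest printable colour;
    zero when the whole patch already uses printable colours."""
    return sum(min(abs(px - c) for c in printable)
               for row in patch for px in row)
```

Minimizing the first term keeps the patch smooth (preserving picture semantics), while minimizing the second pulls every pixel toward a colour the physical printer can reproduce.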
Please refer to fig. 6, which shows a 'stealth clothing' application scenario of an embodiment of the invention. Facing an illegal person-detection camera, an individual user not wearing the 'stealth clothing' printed with the avoidance patch is detected, while a user wearing it is not. In the simulated camera picture, the user wearing the 'stealth clothing' is on the left and the user without it is on the right.
The invention has the advantages that:
1. the model-adaptive gray-box training method adopted in this scheme improves the universality of the adversarial patch in the physical world; the physical-world interference success rate against multiple target detection models can exceed 50%, the transferability of the adversarial patch is improved, and the privacy-protection performance of adversarial samples against mainstream recognition models in practical application is markedly improved. It is applicable to real physical-world scenarios; please see fig. 5;
2. threshold-based patch distance adaptive update mechanism: the user, as decision maker, sets the patch update threshold; according to this threshold, the system, with the distance adaptive module as the update agent, performs targeted feature updates on the central region or the whole patch;
3. the smooth-image-based semantic retention mechanism improves the semantics and naturalness of the adversarial sample. By designing a semantic loss function based on the change from the initial image, the semantics of the initial image are effectively retained in the adversarial sample during training, improving its naturalness in the physical world.
The invention can effectively protect the user from automatic identification by target recognition and detection models, and prevent target detection and extraction techniques from illegally acquiring, storing and using the user's sensitive information. In the military field, with the continuous development of modern unmanned warfare, the intelligent stealth clothing can effectively avoid the detection and locking of human targets by unmanned weapons and seize the initiative in unmanned combat. In the future, intelligent stealth clothing based on adversarial patches has broad application prospects and can bring great benefit in civil, commercial and military scenarios.
The invention can provide a reliable and convenient sensitive information protection method for users in more fields of civil use, military use and the like.
It should be understood that parts of the specification not specifically set forth herein are all prior art.
It should be understood that the foregoing description of the preferred embodiments is not intended to limit the scope of the invention; the scope of protection is defined by the appended claims, and those skilled in the art can make substitutions or modifications without departing from the scope of the invention as set forth in the appended claims.
Claims (10)
1. A video target detection avoidance method, characterized by comprising the following steps:
step 1: detecting, by a model-adaptive gray-box training module based on multiple models, target pictures to which avoidance patches are attached, and extracting the human-body detection number and confidence; distributing the results to a patch distance self-adaptive module;
step 2: performing threshold setting and threshold distance decision, and updating the distance-adaptive features of the relevant area of the target patch to be updated according to the result parameters distributed by the system; the iteratively updated patch pictures are sent to a digital-world patch attaching module to generate a new target detection data set with the patch pictures attached;
step 3: simulating clothes wrinkles, performing physical-world color transformation and training loss constraint, calculating various loss indexes of the patch picture during training, and storing and distributing them to the patch distance self-adaptive module;
the clothes wrinkle simulation generates wrinkle-simulating distortion for the patch based on a built-in two-dimensional wrinkle data set, and stores the distorted patch for later use;
the physical-world color transformation uses a three-layer fully connected BP network to establish a mapping between digital-world colors and physically printable colors, and then fits the patch picture colors to the printable colors of the physical world;
the training loss constraint calculates, based on the substitute-model output produced by the model-adaptive gray-box training module and various loss functions, including smoothness loss, pixel change loss and non-printable color loss, the loss indexes of the patch picture during training, and stores and distributes them to the patch adaptive update module, wherein the smoothness loss and the pixel change loss constitute a semantic retention mechanism based on image smoothing;
the system iterates as follows: an avoidance patch is attached to human-body target pictures by the digital-world patch attaching module, confidences and prediction boxes are obtained by the model-adaptive gray-box training module based on multiple models, the loss values are obtained by the loss function calculation module, and the patch is updated by the patch distance self-adaptive module; the user may specify the number of iterations or use the system's built-in iteration count, and the trained avoidance patch is finally output.
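The attach → detect → score → update iteration described in claim 1 can be summarized by the skeleton below. This is only an illustration of the control flow: the surrogate detector is a stub and the update rule is a simple accept-if-better random search rather than the patent's gradient-based module pipeline; all names are hypothetical.

```python
import numpy as np

def apply_patch(image, patch):
    # Digital-world patch attaching: paste the patch into the image centre (stub).
    h, w = patch.shape[:2]
    y, x = (image.shape[0] - h) // 2, (image.shape[1] - w) // 2
    out = image.copy()
    out[y:y + h, x:x + w] = patch
    return out

def surrogate_confidence(image):
    # Stand-in for the multi-model gray-box surrogate: returns a "person"
    # confidence. A placeholder score, not a real detector.
    return float(image.mean())

def train_patch(images, patch, iters=10, step=0.05, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    for _ in range(iters):
        loss = np.mean([surrogate_confidence(apply_patch(im, patch)) for im in images])
        candidate = np.clip(patch + step * rng.standard_normal(patch.shape), 0.0, 1.0)
        cand_loss = np.mean([surrogate_confidence(apply_patch(im, candidate)) for im in images])
        if cand_loss < loss:  # keep the patch that lowers detection confidence
            patch = candidate
    return patch

rng = np.random.default_rng(1)
images = [rng.random((16, 16, 3)) for _ in range(2)]
patch0 = rng.random((6, 6, 3))
patch = train_patch(images, patch0, iters=5)
```

Because candidates are only accepted when they lower the mean confidence, the surrogate score is non-increasing over iterations.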
2. The video object detection avoidance method of claim 1, wherein the specific implementation of step 1 comprises the sub-steps of:
step 1.1: before the system operates, determining the detection order and rounds of the models according to user-specified or preset parameters, and generating a substitute model for training based on the multiple models; the model is saved for iteration in patch adaptability training;
the substitution model comprises a YOLOv2 layer, a Faster RCNN layer, a YOLOv2/SSD layer, a Faster RCNN layer and a Faster RCNN layer connected in sequence;
step 1.2: the system distributes the target pictures with the avoidance patches, generated in the patch attaching module, to the adaptive training module, and performs serial recognition detection using the substitution model generated in step 1.1, obtaining the human-body target detection number and the prediction confidence of the target pictures output by the substitution model;
if there are m target pictures with avoidance patches attached in each batch, the prediction confidence is the weighted sum of the confidences of the bounding boxes, in each picture, whose confidence exceeds a set threshold and whose predicted category is person; the human-body target detection number is the number of bounding boxes in each picture that are correctly predicted with category person;
step 1.3: the system collects and merges the results obtained by the substitution model, distributes them to the patch distance self-adaptive module, and updates the patch picture according to the result parameters.
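The aggregation over a batch in claim 2 can be illustrated as follows. The exact weighting is not specified in the claim; here the retained confidences are summed and averaged over the m pictures of the batch, and the class index used for "person" is an assumption.

```python
PERSON = 0  # assumed class index for the "person" category

def aggregate(detections, conf_threshold=0.5):
    """detections: one list per picture of (class_id, confidence) pairs.
    Returns (prediction confidence averaged over the batch,
             number of bounding boxes kept as person detections)."""
    total_conf, total_count = 0.0, 0
    for boxes in detections:
        kept = [c for cls, c in boxes if cls == PERSON and c > conf_threshold]
        total_conf += sum(kept)   # sum of retained person-box confidences
        total_count += len(kept)  # boxes predicted as person above threshold
    m = max(len(detections), 1)
    return total_conf / m, total_count

conf, count = aggregate([[(0, 0.9), (0, 0.4), (1, 0.8)], [(0, 0.6)]])
# conf = (0.9 + 0.6) / 2 = 0.75, count = 2
```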
3. The video object detection avoidance method of claim 1, wherein the specific implementation of step 2 comprises the sub-steps of:
step 2.1: setting a threshold value;
the user, as decision maker, autonomously sets a distance threshold parameter or adopts the built-in preset threshold; the parameter is submitted through the system to the patch distance self-adaptive module, and when the substitution-model results distributed by the system are received, the patch distance self-adaptive module runs its module functions based on the previously submitted threshold parameter;
step 2.2: threshold distance decision;
the patch distance self-adaptive module normalizes the length and width of the input patch picture to be updated, and determines the patch picture update range based on the previously set threshold;
step 2.3: patch distance adaptive updating;
after obtaining, based on the distance threshold, the range of the patch area to be updated, the patch distance self-adaptive module updates the distance-adaptive features of the relevant area of the target patch according to the result parameters distributed by the system.
4. The video object detection avoidance method of claim 3, wherein: in step 2.1, before the system operates, the user as decision maker autonomously sets a distance threshold or adopts the built-in preset threshold, which decides the update range of the target patch image; the system performs a validity check on the distance threshold parameter submitted by the user, distributes it to the patch distance self-adaptive module after verification, and pre-stores the distance threshold; after receiving the substitution-model results distributed by the system, the patch distance self-adaptive module extracts the pre-stored distance threshold and the target patch distributed by the system and executes the corresponding module functions.
5. The video object detection avoidance method of claim 3, wherein: in step 2.2, after the distance self-adaptive module receives the distributed target patch to be updated, it normalizes the length and width of the patch picture to facilitate the subsequent determination of the patch update range; if the size ratio of the recognition box of the picture with the patch attached to the patch is smaller than the threshold, the system decides that the patch is in a long-distance scene and sets the update range to the full image; if the size ratio exceeds the threshold, the system decides that the patch is in a short-distance scene, sets the update area to the picture center, and decides the update-range proportion of the patch; after the decision is made based on the threshold, the patch update area is anchored; the system transmits the anchored update-area information to the patch distance self-adaptive module, which updates the distance-adaptive features of the patch.
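The distance decision of claim 5 — full-image update for long-distance scenes, center-only update for short-distance scenes — can be sketched as a boolean update mask. The threshold and center-fraction values below are illustrative, not taken from the patent.

```python
import numpy as np

def update_mask(box_patch_ratio, patch_shape, threshold=4.0, center_frac=0.5):
    """Decide which pixels of the (normalized) patch to update.
    Ratio below the threshold -> long-distance scene, update the whole patch;
    ratio above it -> short-distance scene, update only the center region."""
    h, w = patch_shape
    mask = np.zeros((h, w), dtype=bool)
    if box_patch_ratio < threshold:
        mask[:, :] = True                       # long distance: full image
    else:
        ch, cw = int(h * center_frac), int(w * center_frac)
        y0, x0 = (h - ch) // 2, (w - cw) // 2
        mask[y0:y0 + ch, x0:x0 + cw] = True     # short distance: center only
    return mask
```

The mask can then gate the feature update so that only the anchored region receives new patterns, textures and colors.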
6. The video object detection avoidance method of claim 3, wherein: in step 2.3, after the patch distance adaptive module obtains the range of the patch update area, it requests from the system the result parameters and the indexes of the loss function calculation module for the patch update; when the result parameters and calculated indexes are pushed to it, the module updates the features of the corresponding patch area, including patterns, textures and colors; the iteratively updated patch pictures are sent to the digital-world patch attaching module to generate a new target detection data set with the patch pictures attached.
7. The video object detection avoidance method of claim 1, wherein the garment fold simulation function in step 3 is specifically implemented by the following sub-steps:
step 3.1.1: the garment fold simulation function loads the system's built-in two-dimensional anchor-point distortion data sets for different human-body states, together with the color-transformed patch picture, providing data support for invoking the TPS distortion function to realize patch picture distortion;
step 3.1.2: the garment fold simulation function loads a designed TPS distortion function, performs two-dimensional image distortion on a target patch picture, and simulates the appearance characteristics of the patch when the patch is worn on a human body;
step 3.1.3: the clothes fold simulation function stores and distributes the distorted patches to the attacher in the digital-world patch attaching module, attaches the patch pictures to digital-world human bodies, and forms the data set of the model-adaptive gray-box training module.
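The TPS distortion of steps 3.1.1–3.1.3 can be reproduced with SciPy's thin-plate-spline interpolator: fit a TPS mapping from destination to source anchor points, then resample the patch along the warped coordinates. This is a generic TPS sketch, not the patent's distortion function; nearest-neighbour resampling is used for brevity.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

def tps_warp_coords(src_anchors, dst_anchors, out_shape):
    # Fit a thin-plate spline from destination anchors to source anchors and
    # return, for every output pixel, the source (y, x) coordinate to sample.
    tps = RBFInterpolator(dst_anchors, src_anchors, kernel="thin_plate_spline")
    ys, xs = np.mgrid[0:out_shape[0], 0:out_shape[1]]
    grid = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    return tps(grid).reshape(out_shape[0], out_shape[1], 2)

def warp_image(img, coords):
    # Nearest-neighbour resampling of img at the TPS source coordinates.
    h, w = img.shape[:2]
    yy = np.clip(np.round(coords[..., 0]).astype(int), 0, h - 1)
    xx = np.clip(np.round(coords[..., 1]).astype(int), 0, w - 1)
    return img[yy, xx]
```

With identical source and destination anchors the warp reproduces the input; perturbing the destination anchors (e.g. from a wrinkle data set) bends the patch as fabric would.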
8. The video object detection avoidance method according to claim 1, wherein the physical-world color transformation function in step 3 is specifically implemented by the following sub-steps:
step 3.2.1: before the system operates, the color conversion function in the digital-world patch attaching module loads the digital-world colors and the built-in physical-world colors into a three-layer fully connected BP network to generate a color fit between the physical world and the digital world;
step 3.2.2: the color conversion function reads the patch picture to be color-converted, converts its colors through the generated color fit, and pushes the patch picture to the clothes fold simulation function in the digital-world patch attaching module.
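A three-layer fully connected (BP) network mapping digital RGB to printable RGB can be sketched as below. The weights here are random and untrained — in the described system the network would be fitted to pairs of digital colors and their measured printed counterparts; the layer sizes and activations are assumptions.

```python
import numpy as np

def make_color_net(rng, hidden=32):
    # Untrained 3-layer fully connected network mapping RGB -> RGB.
    # Random weights stand in for what backpropagation training would learn.
    w1 = rng.standard_normal((3, hidden)) * 0.1
    b1 = np.zeros(hidden)
    w2 = rng.standard_normal((hidden, hidden)) * 0.1
    b2 = np.zeros(hidden)
    w3 = rng.standard_normal((hidden, 3)) * 0.1
    b3 = np.zeros(3)

    def forward(rgb):
        h1 = np.tanh(rgb @ w1 + b1)
        h2 = np.tanh(h1 @ w2 + b2)
        return 1.0 / (1.0 + np.exp(-(h2 @ w3 + b3)))  # sigmoid keeps colors in [0, 1]

    return forward

rng = np.random.default_rng(0)
color_net = make_color_net(rng)
out = color_net(np.array([[0.2, 0.5, 0.8], [1.0, 0.0, 0.5]]))
```

Applying the trained map to every patch pixel would pull the digital colors onto the printer's reproducible gamut before physical printing.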
9. The method for avoiding video object detection according to any one of claims 1 to 8, wherein the training loss constraint in step 3 is implemented by the following sub-steps:
step 3.3.1: the system transmits the result parameters of the model adaptive gray box training module to the loss function calculation module;
step 3.3.2: the loss function calculation module calculates and obtains various loss indexes in patch adaptability training by loading the input result parameters and calling a plurality of loss functions;
step 3.3.3: the loss function calculation module processes each loss index obtained through calculation and then transmits the processed loss indexes to the patch distance self-adaptive module for updating patch characteristics.
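Among the losses the calculation module of claim 9 would compute, the non-printable color loss can be illustrated as the mean distance from each patch pixel to its nearest color in the printable palette. This is a simplification and an assumption: published non-printability scores often use a product of distances instead.

```python
import numpy as np

def non_printability_loss(patch, printable):
    """For each pixel of an (H, W, 3) patch, the Euclidean distance to the
    nearest printable color; the loss is the mean over pixels, so it is
    zero when every color in the patch is exactly printable."""
    pixels = patch.reshape(-1, 3)                                    # (P, 3)
    d = np.linalg.norm(pixels[:, None, :] - printable[None, :, :], axis=2)
    return float(d.min(axis=1).mean())
```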
10. A video object detection avoidance system employing the method of any one of claims 1 to 9, characterized in that: the system comprises a model-adaptive gray-box training module based on multiple models, a threshold-based patch distance self-adaptive module, a calculation module based on multiple loss functions, and a digital-world patch attaching module;
the model-adaptive gray-box training module based on multiple models is used for generating a training substitution model, detecting human-body target pictures with avoidance patches attached, and extracting the human-body detection number and confidence; the multiple models include YOLO, SSD and Faster RCNN;
the patch distance self-adaptive module based on the threshold value is used for setting a user or a system threshold value, deciding a threshold distance and adaptively updating the patch distance;
the multi-term loss function-based calculation module is used for calculating losses, including smoothness losses, pixel change losses and non-printable color losses, wherein the smoothness losses and the pixel change losses are used for smoothing images to preserve picture semantic information;
the digital-world patch attaching module is used for simulating clothes wrinkles and physical-world color fitting, simulating various environmental changes of the physical world, applying the corresponding transformations to the avoidance patches, and improving the robustness of the avoidance patches in the physical world.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110909116.7A CN113743231B (en) | 2021-08-09 | 2021-08-09 | Video target detection avoidance system and method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113743231A CN113743231A (en) | 2021-12-03 |
CN113743231B true CN113743231B (en) | 2024-02-20 |
Family
ID=78730401
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110909116.7A Active CN113743231B (en) | 2021-08-09 | 2021-08-09 | Video target detection avoidance system and method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113743231B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116883520B (en) * | 2023-09-05 | 2023-11-28 | 武汉大学 | Color quantization-based multi-detector physical domain anti-patch generation method |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111340008A (en) * | 2020-05-15 | 2020-06-26 | 支付宝(杭州)信息技术有限公司 | Method and system for generation of counterpatch, training of detection model and defense of counterpatch |
CN111739016A (en) * | 2020-07-20 | 2020-10-02 | 平安国际智慧城市科技股份有限公司 | Target detection model training method and device, electronic equipment and storage medium |
CN112241790A (en) * | 2020-12-16 | 2021-01-19 | 北京智源人工智能研究院 | Small countermeasure patch generation method and device |
CN112597993A (en) * | 2020-11-24 | 2021-04-02 | 中国空间技术研究院 | Confrontation defense model training method based on patch detection |
CN113111731A (en) * | 2021-03-24 | 2021-07-13 | 浙江工业大学 | Deep neural network black box countermeasure sample generation method and system based on channel measurement information |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11205096B2 (en) * | 2018-11-19 | 2021-12-21 | Google Llc | Training image-to-image translation neural networks |
US11948292B2 (en) * | 2019-07-02 | 2024-04-02 | MakinaRocks Co., Ltd. | Systems and methods for detecting flaws on panels using images of the panels |
Non-Patent Citations (1)
Title |
---|
Research on the model of vehicles ahead on the road based on vision sensors; Chen Yong; Chen Yao; Transducer and Microsystem Technologies; 2014-12-31; Vol. 33, No. 9; full text *
Also Published As
Publication number | Publication date |
---|---|
CN113743231A (en) | 2021-12-03 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||