CN113919998B - Picture anonymization method based on semantic and pose graph guidance - Google Patents

Picture anonymization method based on semantic and pose graph guidance

Info

Publication number
CN113919998B
Authority
CN
China
Prior art keywords
semantic
graph
picture
pose
anonymization
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111196429.9A
Other languages
Chinese (zh)
Other versions
CN113919998A (en)
Inventor
张继东
吕超
曹靖城
吴宇松
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianyi Digital Life Technology Co Ltd
Original Assignee
Tianyi Digital Life Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianyi Digital Life Technology Co Ltd filed Critical Tianyi Digital Life Technology Co Ltd
Priority to CN202111196429.9A priority Critical patent/CN113919998B/en
Publication of CN113919998A publication Critical patent/CN113919998A/en
Priority to PCT/CN2022/097530 priority patent/WO2023060918A1/en
Application granted granted Critical
Publication of CN113919998B publication Critical patent/CN113919998B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00General purpose image data processing
    • G06T1/0021Image watermarking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20221Image fusion; Image merging
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30196Human being; Person
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30196Human being; Person
    • G06T2207/30201Face

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The invention relates to a picture anonymization method based on semantic graph and pose graph guidance, and to a corresponding picture anonymization system. In the system, the picture semantic anonymization module is configured to first perform semantic segmentation on the picture to obtain a semantic graph, and then to use a generative adversarial network, guided by the semantic graph, to generate a scene graph with the same semantics but different content. The character pose anonymization module is configured to further guide the generation of the persons in the picture on top of the picture semantic anonymization module: it first performs pose estimation on the persons to obtain a pose graph, and then uses a generative adversarial network, guided by the pose graph, to generate a new portrait graph with the same pose but a different person. The overlay module is configured to overlay the scene graph generated by the picture semantic anonymization module and the new portrait graph generated by the character pose anonymization module according to the semantic graph to obtain the final anonymized picture.

Description

Picture anonymization method based on semantic and pose graph guidance
Technical Field
The invention relates to the field of image processing, in particular to the field of picture anonymization.
Background
Video surveillance has evolved from the initial closed-circuit television systems (the first generation of analog surveillance), through PC-card-based digital video surveillance systems, to today's network video surveillance systems, which are built on networking and communication technologies together with embedded technology and are characterized by intelligent image analysis.
As machine learning and artificial intelligence techniques develop and continue to advance, intelligent video surveillance is becoming increasingly popular. Current intelligent video analysis technology mainly analyzes real-time video images in order to provide early warning. With the growth of online sharing, users attach more and more importance to personal privacy, and pictures, as a rich information carrier, are especially sensitive. Early picture anonymization simply masked, blurred, or pixelated the sensitive information. While these methods are easy to use, they are essentially ineffective against the currently popular deep-learning-based identification methods. In recent years, researchers have proposed more complex and effective methods, for example face anonymization using the k-same algorithm, and picture anonymization using the generative adversarial network (GAN) framework.
The patent "Face anonymity privacy protection method based on generative adversarial networks" (CN111242837A) discloses a face anonymization privacy protection method based on a generative adversarial network. The method first preprocesses face image data; it then constructs a generative adversarial network structure, establishes an anonymization objective function for the face region, establishes an objective function for preserving the scene content region, and combines the face anonymization and scene preservation objectives; finally, it trains and tests on public datasets and outputs the result. The method replaces the face region in the image with a synthesized face to achieve face anonymity, and compared with earlier mosaic-masking methods it is more efficient and more visually friendly. However, it replaces only the face: body parts other than the face and the rest of the scene are left untouched, so for pictures of indoor home scenes there is still a privacy risk. The method also depends on the accuracy of face detection, and anonymization may fail when detection fails.
The patent "Service robot visual picture privacy protection method based on generative adversarial networks" (CN110363183A) discloses a method in which data collected by a visual acquisition end is first preprocessed, a privacy recognition module then judges whether the preprocessed input contains privacy, and, if so, the picture is converted into picture data that involves no privacy and stored; training data growth and feature learning are used to update a training dataset, and a feature model obtained from the training dataset by a modified Cycle-GAN algorithm is used for the picture conversion. This makes the picture data privacy-free at the source, but because the original picture is transferred directly with Cycle-GAN and there is no fixed guidance mechanism, the style can differ considerably between processing results, making the output unsuitable as training and test data.
Therefore, there is a need for an improved technique for anonymizing pictures while maintaining the original semantic information of the pictures and the pose information of the figures.
Disclosure of Invention
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
The invention targets video surveillance scenarios and uses guided generative adversarial networks to anonymize pictures globally, protecting user privacy to the greatest extent. At the same time, the invention preserves the usability of the picture data as far as possible, so that users' practical needs for both privacy protection and development can be met.
According to one embodiment of the invention, a picture anonymization method based on semantic graph and pose graph guidance is disclosed, comprising: performing semantic segmentation on the original picture to obtain a semantic graph; generating, using a picture semantic anonymization generative adversarial network, a scene graph with the same semantics as the original picture but different content under the guidance of the semantic graph; using the portrait portion of the semantic graph as a mask to cut a portrait graph out of the original picture; extracting and estimating the pose of the person in the portrait graph to generate a pose graph; generating, using a character pose anonymization generative adversarial network, a new portrait graph with the same pose as the portrait graph but depicting a different person under the guidance of the pose graph; and superimposing the scene graph and the new portrait graph according to the semantic graph to obtain the final anonymized picture.
According to one embodiment of the invention, a picture anonymization system based on semantic graph and pose graph guidance is disclosed, comprising a picture semantic anonymization module, a character pose anonymization module, and an overlay module. The picture semantic anonymization module is configured to: perform semantic segmentation on the original picture to obtain a semantic graph; and generate, using a picture semantic anonymization generative adversarial network, a scene graph with the same semantics as the original picture but different content under the guidance of the semantic graph. The character pose anonymization module is configured to: use the portrait portion of the semantic graph as a mask to cut a portrait graph out of the original picture; extract and estimate the pose of the person in the portrait graph to generate a pose graph; and generate, using a character pose anonymization generative adversarial network, a new portrait graph with the same pose as the portrait graph but depicting a different person under the guidance of the pose graph. The overlay module is configured to superimpose the scene graph and the new portrait graph according to the semantic graph to obtain the final anonymized picture.
According to another embodiment of the invention, a computing device for semantic graph and pose graph guided picture anonymization is disclosed, comprising: a processor; and a memory storing instructions that, when executed by the processor, cause the processor to perform the method described above.
These and other features and advantages will become apparent upon reading the following detailed description and upon reference to the associated drawings. It is to be understood that both the foregoing general description and the following detailed description are explanatory only and are not restrictive of aspects as claimed.
Drawings
So that the manner in which the above recited features of the present invention can be understood in detail, a more particular description of the invention, briefly summarized above, may be had by reference to embodiments, some of which are illustrated in the appended drawings. It is to be noted, however, that the appended drawings illustrate only certain typical aspects of this invention and are therefore not to be considered limiting of its scope, for the description may admit to other equally effective aspects.
FIG. 1 illustrates a block diagram of a picture anonymization system 100 based on semantic graph and pose graph guidance according to one embodiment of the present invention;
FIG. 2 shows a diagram 200 further describing the functionality of the picture semantic anonymization module 101 according to one embodiment of the present invention;
FIG. 3 shows a diagram of a multi-channel attention selection model 300 in accordance with one embodiment of the invention;
FIG. 4 shows a diagram 400 further describing the functionality of the character pose anonymization module 102, according to one embodiment of the present invention;
FIG. 5 illustrates a data flow diagram 500 of a semantic graph and pose graph guided picture anonymization process according to one embodiment of the present invention;
FIG. 6 illustrates a flow diagram of a method 600 for picture anonymization based on semantic graph and pose graph guidance, according to one embodiment of the present invention; and
FIG. 7 illustrates a block diagram 700 of an exemplary computing device, according to one embodiment of the invention.
Detailed Description
The features of the present invention will become more apparent from the detailed description set forth below when taken in conjunction with the drawings.
User demands in the home camera field are becoming richer and richer, and the accuracy of many AI functions depends on how rich the relevant picture and video training data is. Although a large amount of very valuable real data accumulates while users use the product, such data cannot be used in actual development because of privacy protection and similar concerns. This contradiction between privacy protection and the shortage of model training data has long troubled developers.
The invention uses semantic graph guidance and pose graph guidance to globally anonymize the user's original picture, which both ensures that the user's privacy is not disclosed and preserves the picture's original semantic information and the pose information of the persons in it. The invention can provide usable training data for developing and optimizing AI algorithm models with low requirements on faces, such as human-shape detection and motion detection, and can also provide users with a privacy protection mechanism of active anonymization encryption.
FIG. 1 illustrates a block diagram of a picture anonymization system 100 based on semantic graph and pose graph guidance according to one embodiment of the present invention. As shown in FIG. 1, the system 100 is divided into modules, with communication and data exchange between the modules taking place in manners known in the art. In the present invention, each module may be implemented in software, in hardware, or in a combination thereof. The system 100 includes a picture semantic anonymization module 101, a character pose anonymization module 102, and an overlay module 103.
According to one embodiment of the invention, the picture semantic anonymization module 101 is configured to first semantically segment the picture to obtain a semantic graph, and then to use a generative adversarial network, guided by the semantic graph, to generate a scene graph with the same semantics but different content.
According to one embodiment of the present invention, the character pose anonymization module 102 is configured to further guide the generation of the persons in the picture on top of the picture semantic anonymization module 101: it first performs pose estimation on the persons to obtain a human-body keypoint pose graph, and then uses a generative adversarial network, guided by the pose graph, to generate a new portrait graph with the same pose but a different person.
According to one embodiment of the present invention, the overlay module 103 is configured to overlay the scene graph generated by the picture semantic anonymization module 101 and the new portrait graph generated by the character pose anonymization module 102 according to the semantic graph, so as to obtain the final anonymized picture. The position of the person on the picture is known from the semantic segmentation, and this information is used to perform the final superposition of the anonymized picture.
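As an illustration of this overlay step, a minimal sketch follows; the person class label id, the array shapes, and the NumPy-based compositing are assumptions for illustration rather than the patent's actual implementation:

```python
import numpy as np

def overlay_anonymized(scene_graph: np.ndarray,
                       new_portrait: np.ndarray,
                       semantic_graph: np.ndarray,
                       person_label: int = 15) -> np.ndarray:
    """Composite the anonymized scene and the new portrait into one picture.

    scene_graph    : H x W x 3 anonymized scene image
    new_portrait   : H x W x 3 anonymized portrait image, aligned with the original
    semantic_graph : H x W per-pixel class labels from semantic segmentation
    person_label   : class id of the person region (assumed value)
    """
    # The semantic graph tells us where the person is located on the picture.
    person_mask = (semantic_graph == person_label)[..., None]   # H x W x 1 boolean
    # Person pixels come from the new portrait, everything else from the scene graph.
    return np.where(person_mask, new_portrait, scene_graph)
```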
As known to those skilled in the art, the cameras involved in the intelligent video surveillance technology of the present invention generally include home cameras in the smart home field, monitoring probes in the smart city field, and camera devices installed in public places to perform a surveillance function. Such a monitoring device can photograph and capture a scene, store the acquired image data locally for subsequent processing, or send the data to a remote device (such as a smart home control platform, a central control platform, or another computing device) for processing. The manner of connection and communication between the monitoring device and the remote device is not limited here and may be carried out in a variety of ways known in the art. According to one embodiment of the invention, the system 100 may be implemented in the monitoring device or on the remote device. According to another embodiment of the invention, one or more modules of system 100 may be implemented separately in the monitoring device and in the remote device.
FIG. 2 shows a diagram 200 further describing the functionality of the picture semantic anonymization module 101 according to one embodiment of the present invention. The picture semantic anonymization module 101 is configured to implement three stages: semantic segmentation, semantic-guided reconstruction, and picture optimization.
As shown in FIG. 2, in the semantic segmentation stage, an encoder-decoder built with ShuffleNet as the backbone network is used as the semantic generator; it performs inference on the input original picture Ig to obtain the scene semantic graph Sg.
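A minimal sketch of this inference step is shown below (in PyTorch); the ShuffleNet-backbone encoder-decoder is represented only by a placeholder module, since the patent does not disclose its exact architecture, and the tensor shapes are assumptions:

```python
import torch

def semantic_segmentation(generator: torch.nn.Module,
                          original_picture: torch.Tensor) -> torch.Tensor:
    """Run the semantic generator on the original picture Ig to obtain Sg.

    generator        : encoder-decoder with a ShuffleNet backbone (assumed interface)
    original_picture : 1 x 3 x H x W tensor, the original picture Ig
    returns          : 1 x H x W tensor of per-pixel class labels (semantic graph Sg)
    """
    generator.eval()
    with torch.no_grad():
        logits = generator(original_picture)     # 1 x C x H x W class scores
        semantic_graph = logits.argmax(dim=1)    # most likely class per pixel
    return semantic_graph
```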
In the context of the present invention, the semantic-guided reconstruction stage and the picture optimization stage together logically/functionally constitute a picture semantic anonymization generative adversarial network that combines cascaded semantic guidance with a multi-channel attention selection mechanism. Within this network, the semantic-guided reconstruction stage produces a coarse-grained picture semantic anonymization result under cascaded semantic guidance, and the picture optimization stage produces a finer result through the multi-channel attention selection mechanism.
In the semantic-guided reconstruction stage, a target texture picture Ir is randomly selected from a scene texture picture library as the conditional image. The selected texture picture Ir is cascaded with the scene semantic graph Sg obtained in the semantic segmentation stage, and the cascaded result is fed into a generator Gi to produce the generated image I′g. The generator Gi is a U-Net model built on RefineNet, and during training it is optimized by minimizing a loss function between the semantic graph S′g of the generated image I′g and the original scene semantic graph Sg, where L1-L4 denote the four components of that loss function.
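A rough sketch of the cascading and generation step, assuming the semantic graph is a per-pixel label map that is one-hot encoded before channel-wise concatenation, and leaving the U-Net generator Gi as a placeholder module:

```python
import torch
import torch.nn.functional as F

def semantic_guided_reconstruction(generator_gi: torch.nn.Module,
                                   condition_ir: torch.Tensor,
                                   semantic_sg: torch.Tensor,
                                   num_classes: int) -> torch.Tensor:
    """Cascade the conditional image Ir with the semantic graph Sg and run Gi.

    condition_ir : 1 x 3 x H x W texture picture drawn from the scene library
    semantic_sg  : 1 x H x W per-pixel labels from the segmentation stage
    returns      : 1 x 3 x H x W coarse generated image I'g
    """
    # One-hot encode the semantic graph so it can be concatenated channel-wise.
    sg_onehot = F.one_hot(semantic_sg, num_classes).permute(0, 3, 1, 2).float()
    cascaded = torch.cat([condition_ir, sg_onehot], dim=1)   # 1 x (3+C) x H x W
    return generator_gi(cascaded)
```

During training, Gi would then be optimized against the loss between S′g and Sg described above.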
The picture optimization stage uses a multi-channel attention selection model to optimize the generated image I′g from the previous stage and obtain the final scene graph I″g. The purpose of the multi-channel attention selection model is to produce finer-grained results from a larger generation space and to produce uncertainty maps that guide the optimization of the pixel loss. FIG. 3 shows a diagram of a multi-channel attention selection model 300 in accordance with one embodiment of the invention.
The multi-channel attention selection model 300 includes a multi-scale spatial pooling part and a multi-channel attention selection part. The multi-scale spatial pooling part applies average pooling with a set of different kernel sizes and strides to the same input features, yielding multi-scale features with different receptive fields that perceive different spatial contexts. The multi-channel attention selection part generates a series of different intermediate pictures and combines them into the final output.
Referring to FIGS. 2 and 3, the multi-channel attention selection model 300 cascades the conditional image Ir, the generated image I′g, and the feature maps output by the last convolutional layers of the generator Gi and of the semantic segmentation stage, and feeds this cascade into the multi-scale spatial pooling part, which performs average pooling at different scales to obtain multi-scale spatial context features. The pooled features of each scale are multiplied with the input features in order to preserve useful information, and the result is convolved to produce new multi-scale features that serve as the input of the multi-channel attention selection part. The multi-channel attention selection part expands the channel representation of the image through a convolutional network and combines the attention maps to produce a more reasonable result.
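The multi-scale spatial pooling part can be sketched as follows; the pooling scales and channel counts are illustrative assumptions, not values taken from the patent:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleSpatialPooling(nn.Module):
    """Average-pool the same input features at several scales, re-weight the
    input with each pooled map, and convolve to produce new multi-scale features."""

    def __init__(self, in_channels: int, out_channels: int,
                 pool_sizes=(2, 4, 8)):                       # assumed scales
        super().__init__()
        self.pool_sizes = pool_sizes
        self.convs = nn.ModuleList(
            [nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1)
             for _ in pool_sizes])

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        h, w = features.shape[-2:]
        outputs = []
        for pool_size, conv in zip(self.pool_sizes, self.convs):
            pooled = F.avg_pool2d(features, kernel_size=pool_size, stride=pool_size)
            pooled = F.interpolate(pooled, size=(h, w), mode='bilinear',
                                   align_corners=False)
            # Multiply with the input features to preserve useful information,
            # then convolve to produce new multi-scale features.
            outputs.append(conv(features * pooled))
        return torch.cat(outputs, dim=1)
```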
Specifically, with further reference to FIGS. 2 and 3, the multi-channel attention selection model 300 concatenates the conditional image Ir, the generated image I′g, and the feature maps Fi and Fs output by the last convolutional layers of the generator Gi and of the semantic segmentation stage, and feeds this concatenation into the multi-scale spatial pooling part; the resulting multi-scale features serve as the input of the multi-channel attention selection part. The multi-channel attention selection part enlarges the channel representation of the image through a convolutional network, where the intermediate pictures and the corresponding attention maps are computed as shown in formula (1).
Finally, the learned attention maps are used to select among the intermediate pictures and combine them into the output, as shown in formula (2).
meanwhile, by learning the uncertainty map (uncertainty maps), the pixel level Loss (Loss function) optimization computation can be made more robust.
According to one embodiment of the invention, the picture semantic anonymization module 101 trains the generator Gi and the multi-channel attention selection model 300 using the inside 09 indoor scene dataset.
FIG. 4 shows a diagram 400 further describing the functionality of the character pose anonymization module 102, according to one embodiment of the present invention. As shown in FIG. 4, the character pose anonymization module 102 is implemented similarly to the picture semantic anonymization module 101, except that the semantic graph is replaced by the pose graph extracted by an OpenPose model, and the conditional image is a picture randomly selected from a public portrait picture dataset. The character pose anonymization module 102 is trained on the CUHK03 person dataset.
Specifically, the character pose anonymization module 102 is configured to implement three stages: pose estimation, pose-guided reconstruction, and picture optimization.
As shown in FIG. 4, in the pose estimation stage, the portrait portion of the semantic graph obtained by the picture semantic anonymization module 101 is used as a mask to cut the original portrait graph Ig out of the original input picture, and the pose of the person in the original portrait graph Ig is extracted and estimated with an OpenPose model to generate the pose graph Sg.
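A sketch of the pose-graph construction; the keypoint detector is abstracted behind a hypothetical `estimate_keypoints` callable (for example a wrapper around an OpenPose model), and the skeleton connectivity and confidence threshold below are assumptions used only for illustration:

```python
import numpy as np
import cv2

# Assumed simplified limb connectivity for rendering the pose graph.
LIMBS = [(0, 1), (1, 2), (2, 3), (3, 4), (1, 5), (5, 6), (6, 7),
         (1, 8), (8, 9), (9, 10), (1, 11), (11, 12), (12, 13)]

def build_pose_graph(portrait_ig: np.ndarray, estimate_keypoints) -> np.ndarray:
    """Extract body keypoints from the portrait image Ig and render a pose graph Sg.

    portrait_ig        : H x W x 3 portrait cut out of the original picture
    estimate_keypoints : callable returning a K x 3 array of (x, y, confidence)
                         keypoints, e.g. an OpenPose wrapper (assumed interface)
    """
    keypoints = estimate_keypoints(portrait_ig)
    pose_graph = np.zeros_like(portrait_ig)
    for a, b in LIMBS:
        if keypoints[a, 2] > 0.1 and keypoints[b, 2] > 0.1:   # both ends detected
            pt_a = (int(keypoints[a, 0]), int(keypoints[a, 1]))
            pt_b = (int(keypoints[b, 0]), int(keypoints[b, 1]))
            cv2.line(pose_graph, pt_a, pt_b, color=(255, 255, 255), thickness=3)
    return pose_graph
```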
In the context of the present invention, the pose-guided reconstruction stage and the picture optimization stage together logically/functionally constitute a character pose anonymization generative adversarial network that combines cascaded pose guidance with a multi-channel attention selection mechanism. Within this network, the pose-guided reconstruction stage produces a coarse-grained character pose anonymization result, and the picture optimization stage produces a finer result through the multi-channel attention selection mechanism.
In the pose-guided reconstruction stage, a portrait picture Ir is randomly selected from a portrait picture dataset as the conditional image. The selected portrait picture Ir is cascaded with the pose graph Sg obtained in the pose estimation stage, and the cascaded result is fed into a generator Gi to produce the generated image I′g. The generator Gi is a U-Net model built on RefineNet, and during training it is optimized by minimizing a loss function between the pose graph S′g of the generated image I′g and the original pose graph Sg, where L1-L4 denote the four components of that loss function.
In the picture optimization stage, the generated image I′g from the previous stage is optimized using the multi-channel attention selection model to obtain the final portrait graph I″g. For a detailed description of the multi-channel attention selection model, see the description of FIG. 3 above.
FIG. 5 illustrates a data flow diagram 500 of a semantic graph and pose graph guided picture anonymization process according to one embodiment of the present invention. The data flow 500 can be divided into a picture semantic anonymization stage 501, a character pose anonymization stage 502, and an overlay stage 503.
Referring to FIG. 5, in the picture semantic anonymization stage 501, the input picture is semantically segmented to form a semantic graph, and a scene graph is generated from the semantic graph through the picture semantic anonymization generative adversarial network described above. Once the semantic graph is available, the character pose anonymization stage 502 can begin: the portrait portion of the semantic graph is used as a mask to cut the original portrait graph out of the input picture, pose extraction and estimation is performed on the original portrait graph to generate a pose graph, and the pose graph is passed through the character pose anonymization generative adversarial network described above to generate a new portrait graph. After the picture semantic anonymization stage 501 and the character pose anonymization stage 502 are complete, the overlay stage 503 can begin, in which the scene graph generated by stage 501 and the new portrait graph generated by stage 502 are superimposed according to the semantic graph to form the anonymized picture for output.
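Putting the three stages together, the overall data flow could be expressed as in the sketch below, where every model is passed in as a callable and all names, interfaces, and the person label id are assumptions for illustration:

```python
import numpy as np

def anonymize_picture(input_picture, semantic_generator, scene_gan,
                      pose_estimator, portrait_gan, person_label=15):
    """End-to-end sketch of the semantic- and pose-graph-guided anonymization flow."""
    # Stage 501: semantic segmentation, then scene generation guided by the semantic graph.
    semantic_graph = semantic_generator(input_picture)          # H x W labels
    scene_graph = scene_gan(input_picture, semantic_graph)      # anonymized scene

    # Stage 502: cut out the person with the portrait mask, estimate the pose,
    # then generate a new portrait guided by the pose graph.
    person_mask = (semantic_graph == person_label)[..., None]
    portrait = np.where(person_mask, input_picture, 0)
    pose_graph = pose_estimator(portrait)
    new_portrait = portrait_gan(portrait, pose_graph)

    # Stage 503: overlay the new portrait onto the anonymized scene at the person's location.
    return np.where(person_mask, new_portrait, scene_graph)
```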
FIG. 6 shows a flow diagram of a method 600 for picture anonymization based on semantic graph and pose graph guidance, according to one embodiment of the present invention.
In step 601, the original picture is semantically segmented to obtain a semantic graph. According to one embodiment of the present invention, the original picture may be a picture taken by a surveillance camera, a frame of a video taken by the surveillance camera, or a picture selected by a user. According to one embodiment of the invention, an encoder-decoder built with ShuffleNet as the backbone network is used as the semantic generator to perform inference on the original picture and obtain the semantic graph. According to one embodiment of the invention, the semantic graph may indicate the position of the person on the original picture.
In step 602, a scene graph having the same semantics as the original picture but different content is generated under the guidance of the semantic graph using a picture semantic anonymization generative adversarial network. According to one embodiment of the invention, this network comprises a semantic-guided reconstruction stage and a picture optimization stage: the semantic-guided reconstruction stage generates a coarse-grained picture semantic anonymization result by applying cascaded semantic guidance based on the semantic graph, and the picture optimization stage refines that result through a multi-channel attention selection mechanism to obtain the final, finer-grained scene graph.
In step 603, the portrait portion of the semantic graph obtained in step 601 is used as a mask to cut the original portrait graph out of the original picture.
In step 604, the pose of the person in the original portrait graph is extracted and estimated to generate a pose graph. According to one embodiment of the present invention, the pose of the person in the original portrait graph obtained in step 603 is extracted and estimated using an OpenPose model to generate the pose graph.
In step 605, a new portrait graph having the same pose as the original portrait graph but depicting a different person is generated under the guidance of the pose graph using a character pose anonymization generative adversarial network. According to one embodiment of the invention, this network comprises a pose-guided reconstruction stage and a picture optimization stage: the pose-guided reconstruction stage generates a coarse-grained character pose anonymization result by applying cascaded pose guidance based on the pose graph, and the picture optimization stage refines that result through a multi-channel attention selection mechanism to obtain the final, finer-grained portrait graph.
In step 606, the scene graph generated in step 602 and the new portrait graph generated in step 605 are superimposed according to the semantic graph obtained in step 601 to obtain the final anonymized picture. According to one embodiment of the invention, the position of the person on the original picture is known from the semantic segmentation, and this information is used to superimpose the scene graph and the new portrait graph.
In summary, compared with the prior art, the main advantages of the invention are: (1) the picture is anonymized globally; the generated picture retains only the abstract semantic graph and the character pose graph of the original, while the face, the person, and the background are completely replaced, which reduces the risk of privacy leakage to the greatest extent; (2) on the basis of complete anonymization, the original semantic information, character pose information, and object motion information of the picture are preserved, so a large amount of usable training data can be provided for developing and optimizing non-identity-related AI algorithm models such as human-shape detection and motion detection; (3) the initial output of the generative adversarial network is further refined by the multi-channel attention selection model, so the output picture is of higher quality.
In addition, in practical applications the invention offers further benefits: for example, online try-on applications built on similar techniques can adjust clothing while the facial and background information is replaced, protecting user privacy to the greatest extent.
FIG. 7 illustrates a block diagram 700 of an exemplary computing device, which is one example of a hardware device that may be used in connection with aspects of the invention, according to one embodiment of the invention. For example, the monitoring devices, remote devices, and user computing devices mentioned above may all be implemented as the computing device of FIG. 7. Computing device 700 may be any machine that may be configured to implement processes and/or calculations and may be, but is not limited to, a workstation, a server, a desktop computer, a laptop computer, a tablet computer, a personal digital assistant, a smart phone, a vehicle-mounted computer, or any combination thereof. Computing device 700 may include components that may be connected to or communicate with a bus 702 via one or more interfaces. For example, computing device 700 may include a bus 702, one or more processors 704, one or more input devices 706, and one or more output devices 708. The one or more processors 704 may be any type of processor and may include, but are not limited to, one or more general-purpose processors and/or one or more special-purpose processors (e.g., dedicated processing chips). Input device 706 may be any type of device capable of inputting information to a computing device and may include, but is not limited to, a mouse, a keyboard, a touch screen, a microphone, and/or a remote controller. Output device 708 may be any type of device capable of presenting information and may include, but is not limited to, a display, speakers, a video/audio output terminal, a vibrator, and/or a printer. Computing device 700 may also include or be connected to a non-transitory storage device 710, which may be non-transitory and capable of data storage, and which may include, but is not limited to, a disk drive, an optical storage device, solid-state memory, a floppy disk, a flexible disk, a hard disk, a magnetic tape or any other magnetic medium, an optical disk or any other optical medium, a ROM (read-only memory), a RAM (random-access memory), a cache memory, and/or any memory chip or cartridge, and/or any other medium from which a computer may read data, instructions, and/or code. The non-transitory storage device 710 may be detachable from an interface. The non-transitory storage device 710 may have data/instructions/code for implementing the methods and steps described above. Computing device 700 may also include a communication device 712. Communication device 712 may be any type of device or system capable of communicating with external devices and/or with a network and may include, but is not limited to, a modem, a network card, an infrared communication device, a wireless communication device, and/or a chipset, such as a Bluetooth device, an IEEE 802.11 device, a WiFi device, a WiMax device, a cellular communication device, and/or the like.
Bus 702 can include, but is not limited to, an Industry Standard Architecture (ISA) bus, a Micro Channel Architecture (MCA) bus, an Enhanced ISA (EISA) bus, a Video Electronics Standards Association (VESA) local bus, and a Peripheral Component Interconnect (PCI) bus.
Computing device 700 may also include a working memory 714, which working memory 714 may be any type of working memory capable of storing instructions and/or data that facilitate the operation of processor 704 and may include, but is not limited to, random access memory and/or read-only memory devices.
Software components may reside in the working memory 714 and include, but are not limited to, an operating system 716, one or more application programs 718, drivers, and/or other data and code. Instructions for implementing the above-described methods and steps of the present invention may be included in the one or more application programs 718 and the above-described method 600 of the present invention may be implemented by the processor 704 reading and executing the instructions of the one or more application programs 718.
It should also be appreciated that variations may be made according to particular needs. For example, custom hardware may be used, and/or particular components may be implemented in hardware, software, firmware, middleware, microcode, hardware description languages, or any combination thereof. In addition, connections to other computing devices, such as network input/output devices, may be employed. For example, some or all of the disclosed methods and apparatus may be implemented with programmable hardware (e.g., programmable logic circuits including field-programmable gate arrays (FPGAs) and/or programmable logic arrays (PLAs)) using an assembly language or a hardware programming language (e.g., Verilog, VHDL, C++).
Although aspects of the present invention have been described with reference to the accompanying drawings, the above-described methods and apparatuses are merely examples, and the scope of the present invention is not limited to these aspects but is defined only by the appended claims and their equivalents. Various components may be omitted or replaced by equivalent components. The steps may also be carried out in an order different from that described in the present invention, and the various components may be combined in various ways. It is also important to note that, as technology advances, many of the described components may be replaced by equivalent components that appear later.

Claims (6)

1. A picture anonymization method based on semantic graph and pose graph guidance, comprising the following steps:
carrying out semantic segmentation on the original picture to obtain a semantic graph;
generating, using a picture semantic anonymization generative adversarial network, a scene graph having the same semantics as the original picture but different content under the guidance of the semantic graph, wherein the picture semantic anonymization generative adversarial network comprises a semantic-guided reconstruction stage and a first picture optimization stage, the semantic-guided reconstruction stage being used to generate a coarse-grained picture semantic anonymization result by applying cascaded semantic guidance based on the semantic graph, and the first picture optimization stage being used to refine the picture semantic anonymization result generated by the semantic-guided reconstruction stage through a multi-channel attention selection mechanism so as to obtain the scene graph at a finer granularity;
using the portrait portion of the semantic graph as a mask to cut a portrait graph out of the original picture;
extracting and estimating the pose of the person in the portrait graph to generate a pose graph;
generating, using a character pose anonymization generative adversarial network, a new portrait graph having the same pose as the portrait graph but depicting a different person under the guidance of the pose graph, wherein the character pose anonymization generative adversarial network comprises a pose-guided reconstruction stage and a second picture optimization stage, the pose-guided reconstruction stage being used to generate a coarse-grained character pose anonymization result by applying cascaded pose guidance based on the pose graph, and the second picture optimization stage being used to refine the character pose anonymization result generated by the pose-guided reconstruction stage through a multi-channel attention selection mechanism so as to obtain the new portrait graph at a finer granularity;
and superimposing the scene graph and the new portrait graph according to the semantic graph to obtain a final anonymized picture.
2. The method of claim 1, wherein semantically segmenting the original picture to obtain a semantic graph further comprises: using an encoder-decoder built with ShuffleNet as the backbone network as the semantic generator, and performing inference on the original picture to obtain the semantic graph.
3. The method of claim 1, wherein extracting and estimating the pose of the person in the portrait graph to generate a pose graph further comprises: extracting and estimating the pose of the person in the portrait graph using an OpenPose model to generate the pose graph.
4. A picture anonymization system based on semantic graph and pose graph guidance, comprising:
a picture semantic anonymization module configured to:
carrying out semantic segmentation on the original picture to obtain a semantic graph;
generating, using a picture semantic anonymization generative adversarial network, a scene graph having the same semantics as the original picture but different content under the guidance of the semantic graph, wherein the picture semantic anonymization generative adversarial network comprises a semantic-guided reconstruction stage and a first picture optimization stage, the semantic-guided reconstruction stage being used to generate a coarse-grained picture semantic anonymization result by applying cascaded semantic guidance based on the semantic graph, and the first picture optimization stage being used to refine the picture semantic anonymization result generated by the semantic-guided reconstruction stage through a multi-channel attention selection mechanism so as to obtain the scene graph at a finer granularity;
a character pose anonymization module configured to:
using the portrait portion of the semantic graph as a mask to cut a portrait graph out of the original picture;
extracting and estimating the pose of the person in the portrait graph to generate a pose graph;
generating, using a character pose anonymization generative adversarial network, a new portrait graph having the same pose as the portrait graph but depicting a different person under the guidance of the pose graph, wherein the character pose anonymization generative adversarial network comprises a pose-guided reconstruction stage and a second picture optimization stage, the pose-guided reconstruction stage being used to generate a coarse-grained character pose anonymization result by applying cascaded pose guidance based on the pose graph, and the second picture optimization stage being used to refine the character pose anonymization result generated by the pose-guided reconstruction stage through a multi-channel attention selection mechanism so as to obtain the new portrait graph at a finer granularity;
an overlay module configured to:
superimposing the scene graph and the new portrait graph according to the semantic graph to obtain a final anonymized picture.
5. The system of claim 4, wherein semantically segmenting the original picture to obtain a semantic graph further comprises: using an encoder-decoder built with ShuffleNet as the backbone network as the semantic generator, and performing inference on the original picture to obtain the semantic graph;
and wherein extracting and estimating the pose of the person in the portrait graph to generate a pose graph further comprises: extracting and estimating the pose of the person in the portrait graph using an OpenPose model to generate the pose graph.
6. A computing device for semantic graph and pose graph guided picture anonymization, comprising:
a processor;
a memory storing instructions that, when executed by the processor, cause the processor to perform the method of any of claims 1-3.
CN202111196429.9A 2021-10-14 2021-10-14 Picture anonymizing method based on semantic and gesture graph guidance Active CN113919998B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202111196429.9A CN113919998B (en) 2021-10-14 2021-10-14 Picture anonymizing method based on semantic and gesture graph guidance
PCT/CN2022/097530 WO2023060918A1 (en) 2021-10-14 2022-06-08 Image anonymization method based on guidance of semantic and pose graphs

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111196429.9A CN113919998B (en) 2021-10-14 2021-10-14 Picture anonymizing method based on semantic and gesture graph guidance

Publications (2)

Publication Number Publication Date
CN113919998A CN113919998A (en) 2022-01-11
CN113919998B true CN113919998B (en) 2024-05-14

Family

ID=79240288

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111196429.9A Active CN113919998B (en) 2021-10-14 2021-10-14 Picture anonymizing method based on semantic and gesture graph guidance

Country Status (2)

Country Link
CN (1) CN113919998B (en)
WO (1) WO2023060918A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116778564B * 2023-08-24 2023-11-17 Wuhan University Identity-maintained face anonymization method, system and equipment

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110021051A * 2019-04-01 2019-07-16 浙江大学 Text-guided object image generation method based on generative adversarial networks
KR20190119261A (en) * 2018-04-12 2019-10-22 가천대학교 산학협력단 Apparatus and method for segmenting of semantic image using fully convolutional neural network based on multi scale image and multi scale dilated convolution
CN110473266A * 2019-07-08 2019-11-19 南京邮电大学盐城大数据研究院有限公司 Pose-guided person action video generation method that preserves the source scene
CN111242837A (en) * 2020-01-03 2020-06-05 杭州电子科技大学 Face anonymous privacy protection method based on generation of countermeasure network
CN112241708A (en) * 2020-10-19 2021-01-19 戴姆勒股份公司 Method and apparatus for generating new person image from original person image
CN112651423A (en) * 2020-11-30 2021-04-13 深圳先进技术研究院 Intelligent vision system
CN113160035A (en) * 2021-04-16 2021-07-23 浙江工业大学 Human body image generation method based on posture guidance, style and shape feature constraints
CN113255813A (en) * 2021-06-02 2021-08-13 北京理工大学 Multi-style image generation method based on feature fusion
WO2021164283A1 (en) * 2020-02-18 2021-08-26 苏州科达科技股份有限公司 Clothing color recognition method, device and system based on semantic segmentation
CN113343878A (en) * 2021-06-18 2021-09-03 北京邮电大学 High-fidelity face privacy protection method and system based on generation countermeasure network

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11397462B2 (en) * 2012-09-28 2022-07-26 Sri International Real-time human-machine collaboration using big data driven augmented reality technologies
US10755479B2 (en) * 2017-06-27 2020-08-25 Mad Street Den, Inc. Systems and methods for synthesizing images of apparel ensembles on models
US10546387B2 (en) * 2017-09-08 2020-01-28 Qualcomm Incorporated Pose determination with semantic segmentation
EP3471060B1 (en) * 2017-10-16 2020-07-08 Nokia Technologies Oy Apparatus and methods for determining and providing anonymized content within images
US11179064B2 (en) * 2018-12-30 2021-11-23 Altum View Systems Inc. Method and system for privacy-preserving fall detection
US11244504B2 (en) * 2019-05-03 2022-02-08 Facebook Technologies, Llc Semantic fusion
CN110363183B * 2019-07-30 2020-05-08 Guizhou University Service robot visual image privacy protection method based on generative adversarial networks
US11475608B2 (en) * 2019-09-26 2022-10-18 Apple Inc. Face image generation with pose and expression control
DE102020203473A1 (en) * 2020-03-18 2021-09-23 Robert Bosch Gesellschaft mit beschränkter Haftung Anonymization device, monitoring device, method, computer program and storage medium
CN111539262B (en) * 2020-04-02 2023-04-18 中山大学 Motion transfer method and system based on single picture

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20190119261A (en) * 2018-04-12 2019-10-22 가천대학교 산학협력단 Apparatus and method for segmenting of semantic image using fully convolutional neural network based on multi scale image and multi scale dilated convolution
CN110021051A * 2019-04-01 2019-07-16 浙江大学 Text-guided object image generation method based on generative adversarial networks
CN110473266A * 2019-07-08 2019-11-19 南京邮电大学盐城大数据研究院有限公司 Pose-guided person action video generation method that preserves the source scene
CN111242837A (en) * 2020-01-03 2020-06-05 杭州电子科技大学 Face anonymous privacy protection method based on generation of countermeasure network
WO2021164283A1 (en) * 2020-02-18 2021-08-26 苏州科达科技股份有限公司 Clothing color recognition method, device and system based on semantic segmentation
CN112241708A (en) * 2020-10-19 2021-01-19 戴姆勒股份公司 Method and apparatus for generating new person image from original person image
CN112651423A (en) * 2020-11-30 2021-04-13 深圳先进技术研究院 Intelligent vision system
CN113160035A (en) * 2021-04-16 2021-07-23 浙江工业大学 Human body image generation method based on posture guidance, style and shape feature constraints
CN113255813A (en) * 2021-06-02 2021-08-13 北京理工大学 Multi-style image generation method based on feature fusion
CN113343878A (en) * 2021-06-18 2021-09-03 北京邮电大学 High-fidelity face privacy protection method and system based on generation countermeasure network

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
A Semantic-based Scene segmentation using convolutional neural networks; Aya M. Shaaban et al.; AEU - International Journal of Electronics and Communications; Vol. 125; full text *
Image-semantics-based visual privacy behavior recognition and protection system for service robots; Li Zhongyi et al.; Journal of Computer-Aided Design & Computer Graphics; No. 10; full text *
Text-guided person image editing method based on generative adversarial networks; Huang Tao et al.; Journal of Guangdong Polytechnic Normal University; No. 03; full text *

Also Published As

Publication number Publication date
WO2023060918A1 (en) 2023-04-20
CN113919998A (en) 2022-01-11


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant