CN111783803A - Image processing method and device for realizing privacy protection - Google Patents


Info

Publication number
CN111783803A
CN111783803A (application CN202010820688.3A)
Authority
CN
China
Prior art keywords
coding
frequency domain
characteristic
feature
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010820688.3A
Other languages
Chinese (zh)
Other versions
CN111783803B (en)
Inventor
Zhao Kai (赵凯)
Yang Chengping (杨成平)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd filed Critical Alipay Hangzhou Information Technology Co Ltd
Priority to CN202010820688.3A priority Critical patent/CN111783803B/en
Publication of CN111783803A publication Critical patent/CN111783803A/en
Application granted granted Critical
Publication of CN111783803B publication Critical patent/CN111783803B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/182 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding, the unit being a pixel
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/85 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The embodiments of this specification provide an image processing method for realizing privacy protection. Its technical idea is to desensitize and compress the privacy image at the terminal and to perform image recognition at the server on the desensitized, compressed data: the terminal extracts the frequency-domain features of the privacy image, then preprocesses and screens the frequency-domain feature maps, adaptively obtaining the feature maps most useful for image recognition. Compared with simple image compression, this improves the compression ratio during image transmission, effectively protects data privacy, and reduces the amount of data to be processed, thereby improving the effectiveness of image processing.

Description

Image processing method and device for realizing privacy protection
Technical Field
One or more embodiments of the present description relate to the field of computer technologies, and in particular, to an image processing method and apparatus for implementing privacy protection.
Background
With the development of computer technology, image recognition has penetrated ever more fields of daily life, for example driverless vehicles, smart unlocking based on face or fingerprint recognition, and terminal application login. In some cases image recognition can be performed locally: for smart unlocking, for instance, unlock control can rely on face or fingerprint images stored locally in the smart terminal or smart lock. In other cases the terminal cannot store the relevant images in advance, or its computing power is insufficient to complete the recognition, such as logging in to a web application requiring face recognition from a public computer, face-recognition checkout in a supermarket, or fingerprint payment. These cases require the image recognition module to be deployed at the server, so the terminal must transmit the image to the server, even when the scene involves personal privacy (face image privacy, fingerprint privacy). The transmitted data may be intercepted by third parties, so how to protect private data from leakage in transit from the terminal to the server is a problem worth studying.
Disclosure of Invention
One or more embodiments of the present specification describe an image processing method and apparatus for implementing privacy protection to solve one or more of the problems mentioned in the background.
According to a first aspect, there is provided an image processing method for implementing privacy protection, the method comprising: for a to-be-processed privacy image, collecting pixel coding values on a plurality of predetermined color coding channels to obtain coding feature maps respectively corresponding to the color coding channels, a single coding feature map comprising feature points in one-to-one correspondence with the pixels of the privacy image together with their coding values; for each coding feature map, extracting the frequency-domain features on the corresponding color coding channel based on the positions and coding values of its feature points, obtaining the corresponding frequency-domain feature maps; obtaining a plurality of preprocessed feature maps by performing predetermined processing on the frequency-domain feature maps; screening a predetermined number of recognition feature maps from the plurality of preprocessed feature maps according to importance; and providing the recognition feature maps to a server, so that the server processes the predetermined number of recognition feature maps with a pre-trained image recognition model to determine a recognition result for the privacy image.
In an optional implementation, when the color coding scheme is RGB, the plurality of predetermined color coding channels comprise three channels respectively corresponding to the red R, green G, and blue B color components, and on a single channel the coding value of a single feature point is the value of the corresponding pixel on the corresponding color component; when the color coding scheme is YUV, the plurality of predetermined color coding channels comprise three channels respectively corresponding to the luminance component Y and the chrominance components U and V, and on a single channel the coding value of a single feature point is the value of the corresponding pixel on the corresponding coding component.
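To make the channel-splitting step concrete, here is a minimal sketch in Python with NumPy. The array shapes and names are illustrative, not from the patent: one 2-D coding feature map is built per RGB channel, with feature points in one-to-one correspondence with the pixels.

```python
import numpy as np

# Toy stand-in for a to-be-processed privacy image: H x W x 3 RGB values in [0, 255].
rng = np.random.default_rng(0)
privacy_image = rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8)

def coding_feature_maps(image_rgb):
    """Split an RGB image into one 2-D coding feature map per color channel.

    Each feature point of a map corresponds one-to-one to a pixel; its coding
    value is the pixel's value on that channel's color component.
    """
    return {name: image_rgb[..., i].astype(np.float64)
            for i, name in enumerate(("R", "G", "B"))}

maps = coding_feature_maps(privacy_image)
```

The same pattern applies to YUV: three channel maps, one per coding component.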
According to one embodiment, the frequency-domain feature value of a single feature point on a single frequency-domain feature map is related to the positions and coding values of the feature points on the corresponding coding feature map. Extracting the frequency-domain features for each coding feature map, based on the positions and coding values of its feature points, then comprises one of the following: for a single coding feature map, obtaining the frequency-domain feature values respectively corresponding to its feature points through one of the Fourier transform and the discrete cosine transform, the frequency-domain feature values forming the single frequency-domain feature map; or, for a single coding feature map, obtaining the single frequency-domain feature map in the form of a matrix of frequency-domain feature values via the transform matrix of one of the Fourier transform and the discrete cosine transform.
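As a hedged illustration of the first option, the sketch below applies the 2-D Fourier transform to one synthetic channel map; every output value depends on the positions and coding values of all feature points, so no single frequency-domain point maps back to one pixel.

```python
import numpy as np

def frequency_feature_map(coding_map):
    """2-D Fourier transform of one coding feature map (one color channel)."""
    return np.fft.fft2(coding_map)

# One synthetic 4x4 channel map (illustrative values only, not patent data).
chan = np.arange(16, dtype=np.float64).reshape(4, 4)
freq = frequency_feature_map(chan)
```

The discrete cosine transform could be substituted here; the patent allows either.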
In one possible design, the predetermined processing comprises performing, for a single frequency-domain feature map, at least one of rounding, normalization, weighting, and block confusion; a single preprocessed feature map is the corresponding frequency-domain feature map after such preprocessing.
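A rough sketch of this kind of per-map preprocessing follows; the block size, seed, and min-max normalization are illustrative choices, not mandated by the patent.

```python
import numpy as np

def preprocess(freq_map, block=2, seed=42):
    """Round, min-max normalize, then shuffle fixed-size blocks ("block confusion")."""
    m = np.round(freq_map)                               # rounding
    m = (m - m.min()) / (m.max() - m.min() + 1e-12)      # normalization
    h, w = m.shape
    blocks = (m.reshape(h // block, block, w // block, block)
                .swapaxes(1, 2)
                .reshape(-1, block, block))              # split into blocks
    order = np.random.default_rng(seed).permutation(len(blocks))
    m = (blocks[order]                                   # shuffle block order
           .reshape(h // block, w // block, block, block)
           .swapaxes(1, 2)
           .reshape(h, w))                               # reassemble
    return m

pre = preprocess(np.arange(16, dtype=np.float64).reshape(4, 4))
```

Block confusion preserves the multiset of values while scrambling their spatial layout, which is what makes naive inversion of the map harder.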
In another possible design, the predetermined processing comprises fusing the frequency-domain feature maps with a pre-trained convolutional neural network, the number of preprocessed feature maps being determined by the size of the convolution kernels.
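The fusion idea can be sketched with a 1x1 convolution, i.e., a per-pixel weighted sum across the K input maps; the weight matrix here is a fixed illustrative stand-in for a trained kernel, and C_out kernels yield C_out preprocessed maps.

```python
import numpy as np

def fuse_maps(freq_maps, weights):
    """Fuse K frequency-domain maps into C_out preprocessed maps.

    `weights` has shape (C_out, K): each output map is a weighted sum of the
    inputs at every pixel (a 1x1 convolution across channels).
    """
    stacked = np.stack(freq_maps)                  # (K, H, W)
    return np.einsum('ok,khw->ohw', weights, stacked)

maps3 = [np.full((2, 2), float(i)) for i in range(3)]   # K = 3 input maps
w = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.5, 0.5]])                          # C_out = 2 kernels
fused = fuse_maps(maps3, w)
```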
According to one embodiment, screening a predetermined number of recognition feature maps from the plurality of preprocessed feature maps according to importance comprises: processing the plurality of preprocessed feature maps with a pre-trained screening model, whose output indicates the predetermined number of recognition feature maps with the highest importance; and determining the corresponding predetermined number of recognition feature maps from the output of the screening model.
In one embodiment, the model parameters of the screening model are determined by training the screening model and the image recognition model as a whole in advance, adjusting the screening model's parameters jointly with those of the image recognition model based on a common loss.
According to one possible design, screening a predetermined number of recognition feature maps from the plurality of preprocessed feature maps according to importance comprises: calculating an importance value for each preprocessed feature map through a predetermined rule, the predetermined rule comprising one of the following: taking the average of the feature values of the feature points in a single preprocessed feature map as its importance value, or taking a weighted sum of the feature values of the feature points in the single preprocessed feature map as its importance value; and selecting the predetermined number of preprocessed feature maps as recognition feature maps in descending order of importance value.
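The rule-based screening can be sketched as follows; the mean and weighted-sum scores are the two options the text lists, and which one to use (and any weights) is a design choice, not fixed by the patent.

```python
import numpy as np

def select_top_k(pre_maps, k, weights=None):
    """Score each preprocessed map and keep the k highest-scoring maps.

    With weights=None the score is the map's mean feature value; otherwise it
    is a weighted sum of the feature values (weights has the map's shape).
    Returns the sorted indices of the kept maps and the maps in descending
    order of importance.
    """
    if weights is None:
        scores = [m.mean() for m in pre_maps]            # average-value rule
    else:
        scores = [float((m * weights).sum()) for m in pre_maps]  # weighted-sum rule
    order = np.argsort(scores)[::-1][:k]                 # descending importance
    return sorted(order.tolist()), [pre_maps[i] for i in order]

pm = [np.full((2, 2), v) for v in (0.1, 0.9, 0.5)]
idx, picked = select_top_k(pm, k=2)
```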
According to a second aspect, there is provided an image processing apparatus that implements privacy protection, the apparatus comprising:
a color feature acquisition unit configured to, for a to-be-processed privacy image, collect pixel coding values on a plurality of predetermined color coding channels to obtain coding feature maps respectively corresponding to the color coding channels, a single coding feature map comprising feature points in one-to-one correspondence with the pixels of the privacy image together with their coding values;
a frequency-domain feature extraction unit configured to, for each coding feature map, extract the frequency-domain features on the corresponding color coding channel based on the positions and coding values of its feature points, obtaining the corresponding frequency-domain feature maps;
a preprocessing unit configured to obtain a plurality of preprocessed feature maps by performing predetermined processing on the frequency-domain feature maps;
a screening unit configured to screen a predetermined number of recognition feature maps from the plurality of preprocessed feature maps according to importance;
and a providing unit configured to provide the recognition feature maps to a server, so that the server processes the predetermined number of recognition feature maps with a pre-trained image recognition model to determine a recognition result for the privacy image.
According to a third aspect, there is provided a computer readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method of the first aspect.
According to a fourth aspect, there is provided a computing device comprising a memory and a processor, wherein the memory has stored therein executable code, and wherein the processor, when executing the executable code, implements the method of the first aspect.
According to the method and apparatus provided by the embodiments of this specification, the privacy image is desensitized and compressed at the terminal, and image recognition is performed at the server on the desensitized, compressed data: the terminal extracts the frequency-domain features of the privacy image, then preprocesses and screens the frequency-domain feature maps, adaptively obtaining the feature maps most useful for image recognition. Compared with simple image compression, this improves the compression ratio during image transmission, effectively protects data privacy, and reduces the amount of data to be processed, thereby improving the effectiveness of image processing.
Drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram illustrating a specific implementation scenario of image processing for implementing privacy protection according to the present description;
FIG. 2 illustrates a flow diagram of an image processing method to implement privacy protection according to one embodiment;
FIG. 3 illustrates a schematic diagram for image conversion from a spatial domain to a frequency domain, according to a specific example;
fig. 4 shows a schematic block diagram of an image processing apparatus implementing privacy protection according to one embodiment.
Detailed Description
The scheme provided by the specification is described below with reference to the accompanying drawings.
First, an implementation scenario is described with reference to fig. 1, which illustrates face image recognition during a face-brushing (face-scan) payment process. The process in fig. 1 is divided into two parts: image acquisition on the terminal platform and image recognition on the server platform, connected by network data transmission between the terminal and the server.
On the terminal platform, the smart terminal can acquire a face image through an image acquisition device (e.g., a camera) and then collect color features for it. Different color coding schemes yield different color features, so the color features may be referred to as coding features in this specification; for example, RGB and YUV are different image coding schemes. Under RGB color coding, the color coding features of the image can correspond to three color coding channels: red R, green G, and blue B. On a single color coding channel there are feature points in one-to-one correspondence with the pixels, and the corresponding color component value can be taken as the coding value on that channel. For example, for a pure red pixel with R component 255, G component 0, and B component 0, the pixel coding value on the red R channel is the R component 255. On a single color coding channel, the coding values of the pixels form a two-dimensional coding feature map for that channel: a map of feature points corresponding to the pixels, the coding value of each feature point being that of its pixel. Under YUV coding there are three coding channels Y, U, and V, where "Y" represents luminance (Luma) and "U" and "V" represent chrominance (Chroma). Similarly, the pixel coding values on a given coding channel form a two-dimensional coding feature map for that channel.
Then, on each color coding channel, a two-dimensional color-to-frequency-domain transform can be performed according to the positions and coding values of the feature points in the coding feature map; that is, the frequency-domain features on each channel are extracted, converting the coding feature map into a frequency-domain feature map. The transform may be, for example, the discrete cosine transform or the Fourier transform. By performing predetermined processing on the frequency-domain feature maps, a plurality of preprocessed feature maps can be obtained, from which a predetermined number of feature maps are then screened as recognition feature maps for image recognition. Because of the color-frequency-domain transform, a screened recognition feature map converted back to the original data may show some deviations, and the screening performed after the predetermined processing prevents the original data from being reconstructed, thereby protecting data privacy during network transmission. In addition, since the feature maps are screened, the amounts of data transmitted and processed are reduced, improving image recognition efficiency.
The terminal can transmit the screened recognition feature maps to the server over a wired or wireless network. The server platform receives the recognition feature maps and processes them through a pre-trained image recognition network to obtain a recognition result. The server can then perform subsequent operations, such as matching against face images stored at the server and feeding the matching result back to the terminal; issuing an action instruction for the video image so that the end user can perform the corresponding action for liveness detection, with the detection result fed back to the terminal; or completing the payment operation and notifying the terminal when the recognition result satisfies the face-payment condition.
It should be noted that, to ensure the accuracy of image recognition at the server, the data processing module deployed on the terminal platform may be trained jointly with the image recognition model of the server platform and then deployed separately. In this way, data privacy is protected without affecting recognition accuracy, improving the effectiveness of image recognition.
An image processing method for realizing privacy protection under the technical idea of the present specification is described in detail below.
FIG. 2 illustrates an image processing flow for implementing privacy protection according to one embodiment of the present description. The execution body of the flow may be a device or terminal with a certain computing capability, such as the smart terminal shown in fig. 1. As shown in fig. 2, the flow comprises: step 201, for a to-be-processed privacy image, collecting pixel coding values on a plurality of predetermined color coding channels to obtain coding feature maps respectively corresponding to the color coding channels, a single coding feature map comprising feature points in one-to-one correspondence with the pixels of the privacy image together with their coding values; step 202, for each coding feature map, extracting the frequency-domain features on the corresponding color coding channel based on the positions and coding values of its feature points to obtain the corresponding frequency-domain feature maps; step 203, obtaining a plurality of preprocessed feature maps by performing predetermined processing on the frequency-domain feature maps; step 204, screening a predetermined number of recognition feature maps from the plurality of preprocessed feature maps according to importance; step 205, providing the recognition feature maps to the server, so that the server processes the predetermined number of recognition feature maps with the pre-trained image recognition model to determine a recognition result for the privacy image.
First, in step 201, for the to-be-processed privacy image, pixel coding values on a plurality of predetermined color coding channels are collected to obtain coding feature maps respectively corresponding to those channels. The privacy image here may be any of a variety of personal images, that is, images not yet published or not intended to be disclosed, such as a fingerprint image, a face image, an unpublished photographic work, or an image of symptoms for an online medical consultation. The privacy image can be captured in real time by an acquisition device of the terminal (e.g., a camera or fingerprint collector) or selected from images stored locally on the terminal.
An image usually carries a certain color coding, according to which the displayed picture can be restored. A color coding scheme may have one or more color coding channels, the corresponding colors being combined from the coding values on the individual channels. Color coding schemes include, but are not limited to, RGB coding, YUV coding, and grayscale coding. A grayscale image may have only one color coding channel, e.g., a black channel: a coding value of 0 for a pixel indicates that the pixel is white, values greater than 0 and less than 255 indicate different shades of gray, and 255 indicates black.
The RGB and YUV schemes each represent every pixel through three color coding channels. RGB is common and is not described further here. In YUV, "Y" represents luminance, i.e., the grayscale value, and is the baseband signal, while "U" and "V" represent chrominance, describing the image's color and saturation and providing the pixel's color; U and V are not baseband signals but quadrature-modulated information. Luminance is created from the RGB input by superimposing specific proportions of the R, G, and B signals. Chrominance defines two aspects of color: as the conversion formulas show, U reflects the difference between the blue part of the RGB input signal and the luminance of the RGB signal, while V reflects the difference between the red part of the RGB input signal and the same luminance. The three YUV components can be interconverted with the R (red), G (green), and B (blue) components of RGB. For example:
Y = 0.299*R + 0.587*G + 0.114*B;
U = -0.147*R - 0.289*G + 0.436*B = 0.492*(B - Y);
V = 0.615*R - 0.515*G - 0.100*B = 0.877*(R - Y).
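The conversion above can be checked directly; a minimal sketch using the coefficients quoted in the text (BT.601-style):

```python
def rgb_to_yuv(r, g, b):
    """RGB -> YUV with the coefficients from the text."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)   # matches -0.147R - 0.289G + 0.436B up to rounding
    v = 0.877 * (r - y)   # matches  0.615R - 0.515G - 0.100B up to rounding
    return y, u, v

y, u, v = rgb_to_yuv(255.0, 0.0, 0.0)   # a pure-red pixel
```

For a pure-red pixel, U is negative (no blue excess over luminance) while V is large and positive (strong red excess), which matches the difference-signal reading of U and V.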
Different color coding schemes correspond to different color coding channels. Once the color coding scheme is fixed, the color coding channels of the image to be processed are generally fixed as well, so the predetermined color coding channels may be the individual channels of a predetermined color coding scheme. For example, under RGB the predetermined channels may correspond to red R, green G, and blue B; a pixel takes a value between 0 and 255 on each channel, e.g., an R of 255 with G and B of 0 indicates a red pixel. For a single color coding channel, each pixel corresponds to one pixel coding value on that channel, and together these form the coding feature map of the channel. The coding feature map matches the size of the to-be-processed privacy image: for a 960 x 1280-pixel privacy image, a single coding feature map contains 960 x 1280 feature points and the corresponding 960 x 1280 coding values, each coding value uniquely corresponding to one pixel.
Thus, under the predetermined color coding mode, one coding feature map can be obtained for each predetermined color coding channel. In the RGB color coding scheme, three coding feature maps corresponding to R, G, B can be obtained, and in the YUV coding scheme, three coding feature maps corresponding to Y, U, V can be obtained.
Next, in step 202, for each coding feature map, the frequency-domain features on the corresponding predetermined color coding channel are extracted based on the positions and coding values of the feature points, giving the corresponding frequency-domain feature maps. The coding feature map is the most direct description of the privacy image, describing the color features of its pixels as-is; to protect data privacy, such a coding feature map must be processed further.
As described above, a single coding feature map can be regarded as a two-dimensional planar map, or equivalently a matrix. Under the technical idea of this specification, frequency-domain features can be extracted by transforming the pixels' color coding features on the two-dimensional plane; the feature map formed by the frequency-domain features is called a frequency-domain feature map. It describes the frequency distribution of the privacy image and may also be called a spectrum map or power map. Bright points of differing intensity on the frequency-domain feature map correspond to the strength of the difference between a point on the image and its neighbors, i.e., the gradient, i.e., the frequency at that point. In general, the frequency-domain feature value of a single feature point on the frequency-domain feature map is associated with the coding values of all pixels of the image, i.e., with the positions and coding values of all feature points on the coding feature map.
Fig. 3 is a schematic diagram of a specific example of the transform from the image's spatial domain to the frequency domain, by which the frequency-domain features of the image are extracted. In fig. 3, the spatial domain uses the coordinate system (x, y) and the frequency domain uses the coordinate system (u, v). The numbers of feature points before and after the transform may be the same, e.g., N x M, equal to the number of pixels of the privacy image.
In fig. 3, the feature value of a feature point in the coding feature map is denoted f(x, y), and the feature value of a feature point in the frequency-domain feature map is denoted F(u, v): f(x, y) is the coding value of the pixel in row x and column y, and F(u, v) is the frequency-domain feature value (e.g., frequency value) at row u, column v of the corresponding frequency-domain feature map. The color-to-frequency-domain transform of the feature map is described below.
In one embodiment, a Fourier transform may be performed on a single encoding feature map to obtain the frequency domain feature map. For example:

$$F(u,v)=C(u)C(v)\sum_{x=0}^{N-1}\sum_{y=0}^{M-1}f(x,y)\,e^{-j2\pi\left(\frac{ux}{N}+\frac{vy}{M}\right)}$$

wherein:

$$C(u)=\begin{cases}\sqrt{1/N},&u=0\\\sqrt{2/N},&u\neq 0\end{cases}$$

and the expression of C(v) corresponds to that of C(u), with M in place of N. Let

$$e^{-j\theta}=\cos\theta-j\sin\theta$$

where j denotes the imaginary unit; then:

$$F(u,v)=C(u)C(v)\sum_{x=0}^{N-1}\sum_{y=0}^{M-1}f(x,y)\left[\cos 2\pi\left(\frac{ux}{N}+\frac{vy}{M}\right)-j\sin 2\pi\left(\frac{ux}{N}+\frac{vy}{M}\right)\right]$$
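As a concrete illustration, the double sum of the 2-D Fourier transform described above can be evaluated as two matrix products over 1-D kernels. The following is a minimal numpy sketch, not the patent's implementation: the function name and the toy map are illustrative, and the normalization constants C(u), C(v) are omitted so that the result matches the standard unnormalized DFT.

```python
import numpy as np

def dft2(f):
    """2-D discrete Fourier transform of an encoding feature map f (N x M),
    computed directly from the double-sum definition."""
    N, M = f.shape
    x = np.arange(N).reshape(-1, 1)   # spatial row index
    y = np.arange(M).reshape(-1, 1)   # spatial column index
    u = np.arange(N).reshape(1, -1)   # frequency row index
    v = np.arange(M).reshape(1, -1)   # frequency column index
    # 1-D kernels e^{-j 2 pi u x / N} and e^{-j 2 pi v y / M}
    Wn = np.exp(-2j * np.pi * x * u / N)
    Wm = np.exp(-2j * np.pi * y * v / M)
    # F(u, v) = sum_x sum_y f(x, y) e^{-j 2 pi (ux/N + vy/M)}
    return Wn.T @ f @ Wm

rng = np.random.default_rng(0)
f = rng.integers(0, 256, size=(8, 6)).astype(float)  # toy "encoding feature map"
F = dft2(f)
# agrees with numpy's built-in FFT
assert np.allclose(F, np.fft.fft2(f))
```

As the assertion shows, each frequency domain feature value depends on all coding values of the map, consistent with the observation below that the two maps do not correspond point by point.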
It can be seen that the feature value of each feature point in the frequency domain feature map is related to the feature values of all feature points in the encoding feature map. That is, although the frequency domain feature map and the encoding feature map have the same number of feature points, their feature points do not correspond one-to-one. In this way, the frequency domain feature map effectively performs frequency filtering and feature extraction on the color coding features of each pixel in the privacy image.
Regarding the feature values of the encoding feature map as an N × M matrix f, the frequency domain feature map may be represented as F = AfB, where A is the transformation coefficient matrix for the longitudinal traversal (row direction), B is the transformation coefficient matrix for the transverse traversal (column direction), and f is the coding value matrix corresponding to the encoding feature map. Thus, the element of matrix A corresponding to the feature point with coordinates (x, y) in the encoding feature map can be determined by:
$$A(x,y)=C(x)\,e^{-j2\pi\frac{xy}{N}}$$
the element determination manner of B may be symmetrical to that of a (e.g., x and y are interchanged in the expression), and will not be described herein again. If N is M, F is AfAT. Wherein A isTAlternatively, when N is not equal to M, assuming N is greater than M, for ease of processing, in one embodiment, the matrix of N × M may be padded with 0 to be the matrix of N × N.
In another embodiment, a discrete cosine transform (DCT) may be performed on a single encoding feature map to obtain the frequency domain feature map. The DCT is similar to the Fourier transform but keeps only the real (cosine) part. For example:
$$F(u,v)=C(u)C(v)\sum_{x=0}^{N-1}\sum_{y=0}^{M-1}f(x,y)\cos\frac{(2x+1)u\pi}{2N}\cos\frac{(2y+1)v\pi}{2M}$$
In matrix form, the elements of the coefficient matrix A may be determined by:
$$A(x,y)=C(x)\cos\frac{(2y+1)x\pi}{2N}$$
The discrete cosine transform has a strong energy compaction property: it concentrates energy in the low-frequency part, and in practice often yields a better frequency domain representation.
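The energy compaction property can be checked numerically: for a smooth image, almost all DCT energy lands in the low-frequency corner. A small numpy sketch, assuming the orthonormal DCT-II and a toy linear-ramp image (both illustrative choices):

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II coefficient matrix."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    A = np.sqrt(2.0 / n) * np.cos((2 * i + 1) * k * np.pi / (2 * n))
    A[0, :] = np.sqrt(1.0 / n)
    return A

n = 32
# A smooth gradient image: most of its variation is low-frequency.
x, y = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
f = (x + y).astype(float)

A = dct_matrix(n)
F = A @ f @ A.T
energy = F ** 2
low = energy[:4, :4].sum()   # 4 x 4 low-frequency corner
ratio = low / energy.sum()
assert ratio > 0.99          # energy is concentrated in the low-frequency part
```

Over 99% of the energy sits in a 4 × 4 corner of a 32 × 32 spectrum here, which is what makes low-frequency screening attractive for the later steps.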
In other embodiments, other reasonable transformation methods may also be used to extract frequency domain features from the color-coded feature maps. In this way, the frequency domain features of the privacy image to be processed are extracted, and the encoding feature map of each color coding channel is converted into a frequency domain feature map. In general, the size of a frequency domain feature map coincides with that of the encoding feature map and with the number of pixels of the privacy image.
Then, in step 203, several preprocessed feature maps are obtained by performing predetermined processing on the frequency domain feature maps. A preprocessed feature map is a feature map obtained by subjecting a frequency domain feature map to the predetermined processing. The predetermined processing may be performed separately on a single frequency domain feature map, or jointly on multiple frequency domain feature maps. The number of preprocessed feature maps may or may not equal the number of frequency domain feature maps.
According to one possible design, the predetermined processing may operate on a single frequency domain feature map, such as at least one of rounding, normalization, weighting, block confusion, and the like. Rounding converts a decimal into an integer, e.g., by rounding up, rounding down, or rounding off; for example, 3.141592587786979 rounds down (or rounds off) to the integer 3, which can reduce the amount of computation and introduce interference information. Normalization may be performed over all feature points of a single frequency domain feature map, the normalized value of a single feature point being the ratio of its frequency domain feature value to the sum of the frequency domain feature values of all feature points on the map. Weighting may assign a corresponding weight to each feature point of a single frequency domain feature map; the weights may be assigned in any reasonable way, for example the weight of the current feature point may be the ratio of its feature value to the sum of the feature values of its neighboring feature points. Block confusion may divide the feature points into blocks, for example a plurality of 4 × 4 blocks, and confuse the association between blocks and/or the arrangement order of the feature values within a single block, thereby effectively protecting privacy. In this manner, such predetermined processing can be performed for each frequency domain feature map. For convenience of description, a frequency domain feature map after the predetermined processing may be referred to as a preprocessed feature map.
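Three of the operations above (rounding, normalization, and block confusion) can be sketched in a few lines of numpy; the neighbor-based weighting rule is omitted for brevity, and all names and sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
F = rng.uniform(0, 100, size=(8, 8))   # toy frequency domain feature map

# Rounding: drop the fractional part to cut computation and add mild distortion.
F_round = np.floor(F)
assert np.all(F_round <= F)

# Normalization: each value divided by the sum over the whole map.
F_norm = F / F.sum()
assert np.isclose(F_norm.sum(), 1.0)

# Block confusion: split into 4 x 4 blocks and shuffle the block order.
blocks = [F[i:i + 4, j:j + 4] for i in range(0, 8, 4) for j in range(0, 8, 4)]
order = rng.permutation(len(blocks))
shuffled = [blocks[k] for k in order]
# The multiset of values survives, but the block arrangement is obscured.
assert np.isclose(sum(b.sum() for b in shuffled), F.sum())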
According to another possible design, the predetermined processing fuses the frequency domain feature maps. Since a single frequency domain feature map corresponds to a single color coding channel, the frequency domain feature maps may be processed jointly by a convolutional neural network (CNN), which can expand or compress them to a preset number of channels. For example, if each frequency domain feature map has size 960 × 1280 and there are 3 channels, a convolution operation mapping 3 input channels to 10 output channels can expand them into feature maps on 10 channels. For consistency of description, the feature maps on the resulting channels obtained by fusing the frequency domain feature maps are also called preprocessed feature maps.
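One way to read the 3-to-10 channel expansion is as a 1 × 1 convolution, i.e., a per-pixel linear mix of the input channels. A numpy sketch of that reading follows; the random kernel stands in for trained CNN weights, and the sizes are toy values, not the 960 × 1280 maps of the example.

```python
import numpy as np

rng = np.random.default_rng(3)
C_in, H, W, C_out = 3, 4, 5, 10     # 3 frequency domain maps -> 10 channels
fmaps = rng.standard_normal((C_in, H, W))
kernel = rng.standard_normal((C_out, C_in))   # a 1x1 conv is a per-pixel channel mix

out = np.einsum('oc,chw->ohw', kernel, fmaps)
assert out.shape == (C_out, H, W)
# each output pixel is a linear combination of the 3 input channels at that pixel
assert np.allclose(out[:, 0, 0], kernel @ fmaps[:, 0, 0])
```

A trained network would additionally apply nonlinearities and possibly larger kernels; the point here is only that channel expansion is a learned mixing of the per-channel frequency domain maps.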
In other possible designs, the frequency domain feature maps may be subjected to predetermined processing in other manners to obtain a plurality of preprocessed feature maps, which is not described herein again.
In step 204, a predetermined number of recognition feature maps are screened from the plurality of preprocessed feature maps according to importance. Here, a recognition feature map is a feature map used by the server for image recognition. To further reduce the amount of transmitted information and filter out part of the information that has little influence on the result, thereby protecting data privacy, the predetermined number of recognition feature maps may be determined by further screening the plurality of preprocessed feature maps.
In an alternative embodiment, the importance of each preprocessed feature map may be measured by a corresponding importance value, determined by calculation under a predetermined rule. For example, in one embodiment, the predetermined rule may take the average of the feature values of the feature points in a single preprocessed feature map as its importance value. In another embodiment, each feature point in a preprocessed feature map corresponds to a weight, and the predetermined rule may take the weighted sum of the feature values of the feature points as the importance value of the map. In other embodiments, the importance values may be calculated in other reasonable ways, which are not listed here.
In another alternative embodiment, the preprocessed feature maps may be screened by a pre-trained screening model, which processes the preprocessed feature maps and evaluates the importance of each. Where the screening model processes all preprocessed feature maps simultaneously, its output may indicate the preprocessed feature maps of higher importance; the output may be, for example, an importance value for each preprocessed feature map, or the predetermined number of recognition feature maps or their identifiers. The screening model may include, for example, at least one of a fully connected neural network, an activation function, and the like.
In this way, a predetermined number of preprocessed feature maps with a high degree of importance can be selected as recognition feature maps.
Further, in step 205, the recognition feature maps are provided to the server. It can be understood that the server may process the predetermined number of recognition feature maps with a pre-trained image recognition model to determine a recognition result for the privacy image. Referring to Fig. 1, the recognition feature maps may be sent to the server, which further processes them to perform at least one image recognition task such as target recognition or living-body detection, obtaining a corresponding recognition result that may also be fed back to the terminal.
In general, to ensure that the server performs effective image recognition based on the screened recognition feature maps, in the model training phase the processing modules deployed at the terminal (those performing the predetermined processing on the frequency domain feature maps and screening out the predetermined number of recognition feature maps by importance) are trained together with the image recognition model, and the processing parameters that affect the result are adjusted. In this way, the recognition feature maps screened by importance carry useful frequency band information determined through machine learning. That is to say, the spectrum screening and the feature map preprocessing can obtain information of different frequency bands by different screening means according to different server-side tasks, so that on the basis of the desensitization effect, image recognition tasks such as target recognition and living-body detection can achieve better accuracy. The processing parameters affecting the result in the terminal's processing modules include at least the number of recognition feature maps (the predetermined number being the final result of parameter tuning) and, depending on the actual processing architecture, may further include, for example: parameters of the convolutional neural network performing the predetermined processing on the frequency domain feature maps, parameters of the screening model, and the like.
Through the foregoing process, the image processing method for implementing privacy protection provided in the embodiments of this specification extracts the frequency domain features of the privacy image at the terminal, and preprocesses and screens the frequency domain feature maps, thereby adaptively obtaining feature maps more favorable for image recognition.
According to an embodiment of another aspect, there is also provided an image processing apparatus that realizes privacy protection. The device can be arranged on an intelligent terminal with certain computing capability. Fig. 4 shows a schematic block diagram of an image processing apparatus implementing privacy protection according to one embodiment. As shown in fig. 4, the apparatus 400 includes:
a color feature acquisition unit 41, configured to acquire pixel encoding values on a plurality of predetermined color encoding channels for a to-be-processed privacy image to obtain each encoding feature map corresponding to each color encoding channel, where a single encoding feature map includes each feature point and the encoding value of the feature point corresponding to each pixel of the privacy image one-to-one;
a frequency domain feature extraction unit 42, configured to extract, for each encoding feature map, frequency domain features on each predetermined color encoding channel based on the position of the feature point and the corresponding encoding value, respectively, to obtain each corresponding frequency domain feature map;
a preprocessing unit 43 configured to obtain a plurality of preprocessed feature maps based on predetermined processing on each frequency domain feature map;
a filtering unit 44 configured to filter a predetermined number of recognition feature maps from the plurality of preprocessed feature maps according to the degree of importance;
and the providing unit 45 is configured to provide the recognition feature maps to the server, so that the server processes a predetermined number of recognition feature maps according to the pre-trained image recognition model, and thus determines a recognition result for the privacy image.
The color coding scheme may be, for example, RGB or YUV. Once the color coding scheme is determined, the color coding channels are determined as well. For example, when the color coding scheme is RGB, the plurality of predetermined color coding channels include three channels corresponding respectively to the red R, green G, and blue B color components, and on a single channel the coding value of a single feature point is the value of the corresponding pixel in the corresponding color component. For another example, when the color coding scheme is YUV, the plurality of predetermined color coding channels include three channels corresponding respectively to the luminance Y, chroma U, and chroma V coding components, and on a single channel the coding value of a single feature point is the value of the corresponding pixel on the corresponding coding component.
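For illustration, extracting one encoding feature map per color coding channel, and converting RGB to YUV, can be sketched as below. The BT.601 conversion matrix is one common convention and is an assumption here, since the specification does not fix the exact YUV coefficients; names and sizes are illustrative.

```python
import numpy as np

def rgb_to_yuv(img):
    """Convert an H x W x 3 RGB image to YUV using BT.601-style coefficients
    (one common convention; exact coefficients vary by standard)."""
    m = np.array([[ 0.299,  0.587,  0.114],    # Y
                  [-0.147, -0.289,  0.436],    # U
                  [ 0.615, -0.515, -0.100]])   # V
    return img @ m.T

rng = np.random.default_rng(5)
img = rng.uniform(0, 1, size=(4, 4, 3))
yuv = rgb_to_yuv(img)

# one encoding feature map per color coding channel
r_map, g_map, b_map = img[..., 0], img[..., 1], img[..., 2]
y_map, u_map, v_map = yuv[..., 0], yuv[..., 1], yuv[..., 2]
assert y_map.shape == r_map.shape == (4, 4)
# a gray pixel (R = G = B) has near-zero chroma
gray = rgb_to_yuv(np.array([[[0.5, 0.5, 0.5]]]))
assert np.allclose(gray[0, 0, 1:], 0.0, atol=1e-3)
```

Each of the per-channel maps (r_map, ..., v_map) plays the role of an encoding feature map in the method: one 2-D array of coding values per predetermined color coding channel.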
According to one embodiment, the frequency domain feature values corresponding to a single feature point on a single frequency domain feature map are associated with the position of each feature point on the corresponding encoding feature map and the corresponding encoding value. As such, the frequency domain feature extraction unit 42 may be further configured to:
for a single encoding feature map, obtaining frequency domain feature values respectively corresponding to the feature points of the single frequency domain feature map through one of the Fourier transform and the discrete cosine transform, the frequency domain feature values forming the single frequency domain feature map; or
for a single encoding feature map, obtaining the single frequency domain feature map, in the form of a matrix of the frequency domain feature values of the feature points, through the matrix form of one of the Fourier transform and the discrete cosine transform.
In one embodiment, the predetermined processing performed by the preprocessing unit 43 includes: performing, on a single frequency domain feature map, at least one of rounding, normalization, weighting, and block confusion; a single preprocessed feature map being a corresponding frequency domain feature map after the predetermined processing.
In another embodiment, the predetermined processing of the preprocessing unit 43 includes: carrying out fusion processing on each frequency domain characteristic graph by using a pre-trained convolutional neural network; meanwhile, the number of preprocessed feature maps is determined based on the size of the convolution kernel.
According to one possible design, the screening unit 44 may further be configured to:
processing a plurality of preprocessing characteristic graphs by utilizing a pre-trained screening model, wherein the output result of the screening model indicates a predetermined number of recognition characteristic graphs with the highest importance degree;
and determining a corresponding preset number of recognition characteristic graphs according to the output result of the screening model.
In an alternative implementation, the model parameters of the screening model are determined by training the screening model with the image recognition model as a whole in advance and adjusting the model parameters based on the loss commonly related to the model parameters of the image recognition model.
According to an alternative embodiment, the screening unit 44 is further configured to:
calculating the importance value of each preprocessed feature map through a predetermined rule, wherein the predetermined rule comprises one of the following: taking the average of the feature values corresponding to the feature points in a single preprocessed feature map as the importance value of that map, and taking the weighted sum of the feature values corresponding to the feature points in a single preprocessed feature map as the importance value of that map;
and selecting a preset number of preprocessed feature maps as the recognition feature maps according to the descending order of the importance values.
It should be noted that the apparatus 400 shown in fig. 4 is an apparatus embodiment corresponding to the method embodiment shown in fig. 2, and the corresponding description in the method embodiment shown in fig. 2 is also applicable to the apparatus 400, and is not repeated herein.
According to an embodiment of another aspect, there is also provided a computer-readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method described in connection with fig. 2.
According to an embodiment of yet another aspect, there is also provided a computing device comprising a memory and a processor, the memory having stored therein executable code, the processor, when executing the executable code, implementing the method in conjunction with fig. 2.
Those skilled in the art will recognize that, in one or more of the examples described above, the functions described in the embodiments of this specification may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium.
The above embodiments are only intended to be specific embodiments of the technical concept of the present disclosure, and should not be used to limit the scope of the technical concept of the present disclosure, and any modification, equivalent replacement, improvement, etc. made on the basis of the technical concept of the embodiments of the present disclosure should be included in the scope of the technical concept of the present disclosure.

Claims (18)

1. An image processing method for implementing privacy protection, the method comprising:
acquiring pixel coding values on a plurality of preset color coding channels aiming at a privacy image to be processed to obtain coding feature maps respectively corresponding to the color coding channels, wherein a single coding feature map comprises feature points and coding values of the feature points which are in one-to-one correspondence with the pixels of the privacy image;
for each coding feature map, extracting frequency domain features on each preset color coding channel based on the position of the feature point of the coding feature map and the corresponding coding value to obtain each corresponding frequency domain feature map;
obtaining a plurality of preprocessing characteristic graphs based on the preset processing of each frequency domain characteristic graph;
screening a predetermined number of recognition feature maps from the plurality of preprocessing feature maps according to the importance degree;
and providing the identification feature maps to a server, so that the server processes the predetermined number of identification feature maps according to a pre-trained image identification model, thereby determining an identification result for the privacy image.
2. The method of claim 1, wherein:
under the condition that the color coding mode is an RGB mode, the preset color coding channels comprise three color coding channels respectively corresponding to three color components of red R, green G and blue B, and on a single color coding channel, the coding value of a single characteristic point is the value of a corresponding pixel in the corresponding color component;
and under the condition that the color coding mode is a YUV mode, the plurality of preset color coding channels comprise three color coding channels respectively corresponding to the three coding components of luminance Y, chroma U, and chroma V, and on a single color coding channel, the coding value of a single characteristic point is the value of a corresponding pixel on the corresponding coding component.
3. The method according to claim 1, wherein the frequency domain feature values corresponding to a single feature point on a single frequency domain feature map are associated with the position of each feature point on the corresponding coding feature map and the corresponding coding value;
for each coding feature map, extracting the frequency domain features on each preset color coding channel based on the position of the feature point and the corresponding coding value of each coding feature map respectively to obtain each corresponding frequency domain feature map, wherein the step of extracting the frequency domain features comprises the following steps:
aiming at a single coding characteristic diagram, obtaining each frequency domain characteristic value respectively corresponding to each characteristic point on the single frequency domain characteristic diagram through one of Fourier transform and discrete cosine transform, and forming the single frequency domain characteristic diagram by each frequency domain characteristic value; or
or, for a single encoding feature map, obtaining the single frequency domain feature map, in the form of a matrix of the frequency domain feature values of the feature points, through the matrix form of one of the Fourier transform and the discrete cosine transform.
4. The method of claim 1, wherein the predetermined processing comprises: aiming at a single frequency domain characteristic diagram, at least one of rounding, normalization, weight addition and block confusion is carried out;
and the single pre-processing characteristic map is the corresponding frequency domain characteristic map after the pre-processing.
5. The method of claim 1, wherein the predetermined processing comprises: carrying out fusion processing on each frequency domain characteristic graph by using a pre-trained convolutional neural network; meanwhile, the number of preprocessed feature maps is determined based on the size of the convolution kernel.
6. The method of claim 1, wherein said filtering out a predetermined number of recognition feature maps from said plurality of preprocessed feature maps by importance comprises:
processing the plurality of preprocessed feature maps by using a pre-trained screening model, wherein the output result of the screening model indicates a predetermined number of recognition feature maps with the highest degree of importance;
and determining a corresponding preset number of recognition characteristic graphs according to the output result of the screening model.
7. The method of claim 6, wherein the model parameters of the screening model are determined by training the screening model with the image recognition model as a whole in advance and adjusting based on losses commonly associated with the model parameters of the image recognition model.
8. The method of claim 1, wherein said filtering out a predetermined number of recognition feature maps from said plurality of preprocessed feature maps by importance comprises:
calculating the importance value of each preprocessed feature map through a predetermined rule, wherein the predetermined rule comprises one of the following: taking the average of the feature values corresponding to the feature points in a single preprocessed feature map as the importance value of that map, and taking the weighted sum of the feature values corresponding to the feature points in a single preprocessed feature map as the importance value of that map;
and selecting a preset number of preprocessed feature maps as the recognition feature maps according to the descending order of the importance values.
9. An image processing apparatus that implements privacy protection, the apparatus comprising:
the color feature acquisition unit is configured to acquire pixel encoding values on a plurality of preset color encoding channels aiming at a privacy image to be processed to obtain each encoding feature map corresponding to each color encoding channel, and each single encoding feature map comprises each feature point corresponding to each pixel of the privacy image one to one and the encoding value of the feature point;
the frequency domain characteristic extraction unit is configured to extract frequency domain characteristics on each preset color coding channel to obtain each corresponding frequency domain characteristic graph respectively based on the position of the characteristic point of each coding characteristic graph and the corresponding coding value;
the preprocessing unit is configured to obtain a plurality of preprocessing characteristic graphs based on preset processing of each frequency domain characteristic graph;
the screening unit is configured to screen a predetermined number of identification feature maps from the plurality of preprocessing feature maps according to the importance degree;
and the providing unit is configured to provide the identification feature maps to a server, so that the server processes the predetermined number of identification feature maps according to a pre-trained image identification model, and thus an identification result for the privacy image is determined.
10. The apparatus of claim 9, wherein:
under the condition that the color coding mode is an RGB mode, the preset color coding channels comprise three color coding channels respectively corresponding to three color components of red R, green G and blue B, and on a single color coding channel, the coding value of a single characteristic point is the value of a corresponding pixel in the corresponding color component;
and under the condition that the color coding mode is a YUV mode, the plurality of preset color coding channels comprise three color coding channels respectively corresponding to the three coding components of luminance Y, chroma U, and chroma V, and on a single color coding channel, the coding value of a single characteristic point is the value of a corresponding pixel on the corresponding coding component.
11. The apparatus according to claim 9, wherein the frequency domain feature values corresponding to a single feature point on a single frequency domain feature map are associated with the position of each feature point on the corresponding coding feature map and the corresponding coding value;
the frequency domain feature extraction unit is further configured to:
aiming at a single coding characteristic diagram, obtaining each frequency domain characteristic value respectively corresponding to each characteristic point on the single frequency domain characteristic diagram through one of Fourier transform and discrete cosine transform, and forming the single frequency domain characteristic diagram by each frequency domain characteristic value; or
or, for a single encoding feature map, obtaining the single frequency domain feature map, in the form of a matrix of the frequency domain feature values of the feature points, through the matrix form of one of the Fourier transform and the discrete cosine transform.
12. The apparatus of claim 9, wherein the predetermined process comprises: aiming at a single frequency domain characteristic diagram, at least one of rounding, normalization, weight addition and block confusion is carried out;
and the single pre-processing characteristic map is the corresponding frequency domain characteristic map after the pre-processing.
13. The apparatus of claim 9, wherein the predetermined process comprises: carrying out fusion processing on each frequency domain characteristic graph by using a pre-trained convolutional neural network; meanwhile, the number of preprocessed feature maps is determined based on the size of the convolution kernel.
14. The apparatus of claim 9, wherein the screening unit is further configured to:
processing the plurality of preprocessed feature maps by using a pre-trained screening model, wherein the output result of the screening model indicates a predetermined number of recognition feature maps with the highest degree of importance;
and determining a corresponding preset number of recognition characteristic graphs according to the output result of the screening model.
15. The apparatus of claim 14, wherein the model parameters of the screening model are determined by training the screening model with the image recognition model as a whole in advance and adjusting based on losses commonly associated with the model parameters of the image recognition model.
16. The apparatus of claim 9, wherein the screening unit is further configured to:
calculating the importance value of each preprocessed feature map through a predetermined rule, wherein the predetermined rule comprises one of the following: taking the average of the feature values corresponding to the feature points in a single preprocessed feature map as the importance value of that map, and taking the weighted sum of the feature values corresponding to the feature points in a single preprocessed feature map as the importance value of that map;
and selecting a preset number of preprocessed feature maps as the recognition feature maps according to the descending order of the importance values.
17. A computer-readable storage medium, on which a computer program is stored which, when executed in a computer, causes the computer to carry out the method of any one of claims 1-8.
18. A computing device comprising a memory and a processor, wherein the memory has stored therein executable code that, when executed by the processor, performs the method of any of claims 1-8.
CN202010820688.3A 2020-08-14 2020-08-14 Image processing method and device for realizing privacy protection Active CN111783803B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010820688.3A CN111783803B (en) 2020-08-14 2020-08-14 Image processing method and device for realizing privacy protection


Publications (2)

Publication Number Publication Date
CN111783803A true CN111783803A (en) 2020-10-16
CN111783803B CN111783803B (en) 2022-06-28

Family

ID=72762202

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010820688.3A Active CN111783803B (en) 2020-08-14 2020-08-14 Image processing method and device for realizing privacy protection

Country Status (1)

Country Link
CN (1) CN111783803B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110038556A1 (en) * 2009-08-11 2011-02-17 Microsoft Corporation Digital image compression and decompression
US20180174414A1 (en) * 2016-12-20 2018-06-21 Axis Ab Method of encoding an image including a privacy mask
CN110633650A (en) * 2019-08-22 2019-12-31 首都师范大学 Convolutional neural network face recognition method and device based on privacy protection
CN110598584A (en) * 2019-08-26 2019-12-20 天津大学 Convolutional neural network face recognition algorithm based on wavelet transform and DCT
CN110610144A (en) * 2019-08-28 2019-12-24 首都师范大学 Expression recognition method and system for privacy protection

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112257697A (en) * 2020-12-23 2021-01-22 支付宝(杭州)信息技术有限公司 Method and system for image processing, training of image recognition model and image recognition
CN112966737A (en) * 2021-03-04 2021-06-15 支付宝(杭州)信息技术有限公司 Method and system for image processing, training of image recognition model and image recognition
CN113297624A (en) * 2021-06-23 2021-08-24 支付宝(杭州)信息技术有限公司 Image preprocessing method and device
CN113297624B (en) * 2021-06-23 2023-04-18 支付宝(杭州)信息技术有限公司 Image preprocessing method and device
CN113837970A (en) * 2021-09-30 2021-12-24 北京地平线信息技术有限公司 Desensitization method and apparatus for image data
CN113837970B (en) * 2021-09-30 2024-04-26 北京地平线信息技术有限公司 Desensitizing method and device for image data

Also Published As

Publication number Publication date
CN111783803B (en) 2022-06-28

Similar Documents

Publication Publication Date Title
CN111783803B (en) Image processing method and device for realizing privacy protection
Chakrabarti Learning sensor multiplexing design through back-propagation
EP3979622A1 (en) Image fusion device and image fusion method
EP4109392A1 (en) Image processing method and image processing device
US6404918B1 (en) Image demosaicing method utilizing directional smoothing
US6807300B1 (en) Noise reduction method utilizing color information, apparatus, and program for digital image processing
US8811733B2 (en) Method of chromatic classification of pixels and method of adaptive enhancement of a color image
US7305123B2 (en) Color interpolation method of an image acquired by a digital sensor by directional filtering
CN111489346B (en) Full-reference image quality evaluation method and system
CN103607589B (en) JND threshold value computational methods based on hierarchy selection visual attention mechanism
CN113628146A (en) Image denoising method based on deep convolutional network
CN102045566A (en) Image processing apparatus and control method for the same
Priyanka et al. Low-light image enhancement by principal component analysis
CN111968057A (en) Image noise reduction method and device, storage medium and electronic device
Shutova et al. NTIRE 2023 challenge on night photography rendering
CN114627034A (en) Image enhancement method, training method of image enhancement model and related equipment
CN113554739A (en) Relighting image generation method and device and electronic equipment
Yang et al. Color image contrast enhancement by co-occurrence histogram equalization and dark channel prior
Kapah et al. Demosaicking using artificial neural networks
CN107491714B (en) Intelligent robot and target object identification method and device thereof
CN105426847A (en) Nonlinear enhancing method for low-quality natural light iris images
CN108805873A (en) Image processing method and device
CN106408565A (en) Quality evaluation method for missile-borne image
JP5286215B2 (en) Outline extracting apparatus, outline extracting method, and outline extracting program
CN104010134A (en) System and method for creating an image with a wide dynamic range

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant