CN113822129A - Image recognition method and device, computer equipment and storage medium - Google Patents


Info

Publication number
CN113822129A
CN113822129A
Authority
CN
China
Prior art keywords
channel
image
character
hidden layer
value range
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110753108.8A
Other languages
Chinese (zh)
Inventor
王欢
黄余格
沈鹏程
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202110753108.8A priority Critical patent/CN113822129A/en
Publication of CN113822129A publication Critical patent/CN113822129A/en
Pending legal-status Critical Current


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 20/00 - Payment architectures, schemes or protocols
    • G06Q 20/38 - Payment protocols; Details thereof
    • G06Q 20/40 - Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check credit lines or negative lists
    • G06Q 20/401 - Transaction verification
    • G06Q 20/4014 - Identity check for transactions
    • G06Q 20/40145 - Biometric identity checks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 9/00 - Image coding
    • G - PHYSICS
    • G07 - CHECKING-DEVICES
    • G07C - TIME OR ATTENDANCE REGISTERS; REGISTERING OR INDICATING THE WORKING OF MACHINES; GENERATING RANDOM NUMBERS; VOTING OR LOTTERY APPARATUS; ARRANGEMENTS, SYSTEMS OR APPARATUS FOR CHECKING NOT PROVIDED FOR ELSEWHERE
    • G07C 9/00 - Individual registration on entry or exit
    • G07C 9/30 - Individual registration on entry or exit not involving the use of a pass
    • G07C 9/32 - Individual registration on entry or exit not involving the use of a pass in combination with an identity check
    • G07C 9/37 - Individual registration on entry or exit not involving the use of a pass in combination with an identity check using biometric data, e.g. fingerprints, iris scans or voice recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Accounting & Taxation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Security & Cryptography (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Finance (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The application relates to an image recognition method and apparatus, a computer device, and a storage medium. The method comprises the following steps: acquiring an image to be recognized, and encoding and quantizing the image to obtain quantized features corresponding to at least one channel; compressing the quantized features of each channel through that channel's target value range and corresponding discrete cumulative probability interval to obtain hidden layer features corresponding to each channel; uploading the hidden layer features corresponding to each channel to a server, where the uploaded hidden layer features instruct the server to decompress them based on a pre-stored target value range and corresponding discrete cumulative probability interval for each channel and to perform identity recognition based on the decompression result; and receiving the identity recognition result corresponding to the image to be recognized, fed back by the server. By adopting the method, the security of user privacy information can be effectively improved.

Description

Image recognition method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to an image recognition method and apparatus, a computer device, and a storage medium.
Background
With the development of computer technology, image recognition algorithms have become increasingly mature: the authenticity of user images can be verified through image recognition, and corresponding services can be provided to users who pass identity verification, for example, face-based payment and business handling.
In the traditional approach to image recognition, images acquired by a terminal are often uploaded to the server side for recognition in order to save terminal resources. However, there is a risk that the image is leaked, stolen, or misused during transmission, and how to improve the security of user privacy information has become a problem of great concern.
Disclosure of Invention
In view of the above technical problems, it is necessary to provide an image recognition method, an apparatus, a computer device, and a storage medium that can effectively improve the security of user privacy information.
An image recognition method, the method comprising:
acquiring an image to be identified, and coding and quantizing the image to be identified to obtain quantization characteristics corresponding to at least one channel;
compressing the quantization characteristics of the corresponding channels through the target value range and the corresponding discrete cumulative probability interval of each channel to obtain hidden layer characteristics corresponding to each channel;
uploading hidden layer characteristics corresponding to each channel to a server, wherein the uploaded hidden layer characteristics instruct the server to decompress the hidden layer characteristics based on a pre-stored target value range and corresponding discrete cumulative probability interval of each channel, and to perform identity recognition based on the decompression result;
and receiving an identity recognition result which is fed back by the server and corresponds to the image to be recognized.
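The encoding-and-quantization step in the method above produces discrete per-channel characters. The patent does not spell out the quantizer; a common choice, and the assumption in this sketch (all names illustrative), is rounding each continuous encoder output to the nearest integer and clipping it into the channel's target value range:

```python
def quantize(features, value_range):
    """Round continuous encoder outputs to discrete characters and clip
    them into the channel's target value range. This is an illustrative
    quantizer; the patent leaves the exact scheme unspecified."""
    lo, hi = min(value_range), max(value_range)
    return [max(lo, min(hi, round(x))) for x in features]
```

For example, `quantize([0.4, 1.7, -3.2], [-2, -1, 0, 1, 2])` returns `[0, 2, -2]`: each value is rounded, and the out-of-range `-3` is clipped to the range's lower limit `-2`.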
An image recognition apparatus, the apparatus comprising:
the device comprises an acquisition module, a processing module and a processing module, wherein the acquisition module is used for acquiring an image to be identified, and coding and quantizing the image to be identified to obtain quantization characteristics corresponding to at least one channel;
the compression module is used for compressing the quantization characteristics of the corresponding channels through the target value range of each channel and the corresponding discrete cumulative probability interval to obtain hidden layer characteristics corresponding to each channel;
the uploading module is used for uploading the hidden layer characteristics corresponding to each channel to a server, and the uploaded hidden layer characteristics instruct the server to decompress the hidden layer characteristics based on the pre-stored target value range and corresponding discrete cumulative probability interval of each channel and to perform identity recognition based on the decompression result;
and the result receiving module is used for receiving the identity recognition result which is fed back by the server and corresponds to the image to be recognized.
In an embodiment, the compression module is further configured to determine, for the quantized feature corresponding to each channel, a probability interval corresponding to each character in the quantized feature corresponding to the corresponding channel through the target value range and the discrete cumulative probability interval corresponding to the corresponding channel; determining a coding interval corresponding to each character respectively based on a probability interval corresponding to each character; and for each channel, determining the hidden layer characteristics corresponding to the corresponding channel according to the coding interval corresponding to each character corresponding to the corresponding channel.
In one embodiment, the compression module is further configured to determine, for the current character in the quantized features of the corresponding channel, a matching character in the target value range corresponding to that channel; take the interval of the matching character within the corresponding discrete cumulative probability interval as the probability interval of the current character; determine the coding interval of the current character according to the probability interval of the current character and the coding interval of the adjacent preceding character; and continue processing the next character in the quantized features of the channel until a coding interval is obtained for each character in the quantized features.
In one embodiment, the compression module is further configured to, for each channel, randomly select a value from the coding interval corresponding to the last character of the quantization feature corresponding to the corresponding channel; and taking the randomly selected numerical value as the hidden layer characteristics corresponding to the corresponding channel until the hidden layer characteristics corresponding to each channel are obtained.
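The interval-narrowing and value-selection steps above describe arithmetic coding. A minimal sketch, assuming each character of a channel's target value range is mapped to a `[p_lo, p_hi)` slice of the discrete cumulative probability interval; the function and variable names are illustrative, not from the patent:

```python
def compress_channel(quantized, cum_prob):
    """Arithmetic-code one channel's quantized characters into a single
    number (the hidden layer feature).

    quantized: sequence of characters from the channel's target value range
    cum_prob:  dict mapping each character to its [p_lo, p_hi) slice of the
               channel's discrete cumulative probability interval
    """
    low, high = 0.0, 1.0                # initial coding interval
    for ch in quantized:
        p_lo, p_hi = cum_prob[ch]       # probability interval of the matched character
        span = high - low
        # narrow the coding interval using the preceding character's interval
        low, high = low + span * p_lo, low + span * p_hi
    # any number inside the final interval identifies the whole sequence;
    # the patent picks one at random, the midpoint is used here for determinism
    return (low + high) / 2.0
```

With `cum_prob = {-1: (0.0, 0.25), 0: (0.25, 0.75), 1: (0.75, 1.0)}`, `compress_channel([0, 0, 1], cum_prob)` narrows the interval to `[0.5625, 0.625)` and returns its midpoint `0.59375`: three characters become one number.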
In one embodiment, the image recognition apparatus is implemented by a target recognition model comprising a first sub-model and a second sub-model; the first sub-model is deployed at a terminal, and the second sub-model is deployed at a server; the first sub-model comprises an encoder, a quantizer, and the target value range and corresponding discrete cumulative probability interval of each channel; the second sub-model comprises the target value range and corresponding discrete cumulative probability interval of each channel, and a feature recognition network.
In one embodiment, the apparatus further comprises:
the model determining module is used for determining a recognition model to be trained, and the recognition model to be trained comprises an image encoder, a quantizer, an entropy network and a feature recognition network;
the processing module is used for acquiring a sample image and a corresponding identity label, and sequentially coding and quantizing the sample image through a coder and a quantizer in the identification model to be trained to obtain sample quantization characteristics;
a probability distribution determining module, configured to determine, through the entropy network and based on the sample quantization feature, sample probability distributions corresponding to the pixels in the sample image, and determine, according to the sample probability distributions, sample image entropies corresponding to the sample images;
the calculation module is used for determining the value ranges respectively corresponding to all the channels in the identification model to be trained through the entropy network and based on the sample quantization characteristics, and calculating the channel probability distribution respectively corresponding to all the value ranges;
the probability loss determining module is used for determining the probability loss of the channel based on the probability distribution of the channel corresponding to each value range;
the identification module is used for identifying and processing the quantized characteristics of the sample through the characteristic identification network to obtain a sample identification result;
the construction module is used for determining image identification loss based on the sample identification result and the identity label and constructing a target loss function according to the image identification loss, the channel probability loss and the sample image entropy;
and the training module is used for training the recognition model to be trained through the target loss function until the training stopping condition is reached, so as to obtain the trained target recognition model.
The trained target recognition model comprises target value ranges respectively corresponding to the channels, and the discrete cumulative probability intervals respectively corresponding to the target value ranges of the channels are determined by the entropy network in the trained target recognition model based on the target value ranges of the corresponding channels.
In one embodiment, the image to be recognized is a face image; the acquisition module is further configured to acquire a face image in response to a resource transfer trigger operation for a resource amount, and to encode and quantize the face image to obtain quantized features corresponding to at least one channel;
the device further comprises a resource transfer module, configured to execute a resource transfer operation when the identity recognition result indicates successful recognition; the resource transfer operation transfers the resource amount from the resource account of the operation initiator to the resource account of the receiver.
In one embodiment, the image to be recognized is a face image; the acquisition module is further configured to acquire a face image in response to an access control trigger operation;
the device further comprises an access control module, configured to control the access control terminal to perform the door-opening action when the identity recognition result corresponding to the face image indicates successful recognition.
A computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:
acquiring an image to be identified, and coding and quantizing the image to be identified to obtain quantization characteristics corresponding to at least one channel;
compressing the quantization characteristics of the corresponding channels through the target value range and the corresponding discrete cumulative probability interval of each channel to obtain hidden layer characteristics corresponding to each channel;
uploading hidden layer characteristics corresponding to each channel to a server, wherein the uploaded hidden layer characteristics instruct the server to decompress the hidden layer characteristics based on a pre-stored target value range and corresponding discrete cumulative probability interval of each channel, and to perform identity recognition based on the decompression result;
and receiving an identity recognition result which is fed back by the server and corresponds to the image to be recognized.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:
acquiring an image to be identified, and coding and quantizing the image to be identified to obtain quantization characteristics corresponding to at least one channel;
compressing the quantization characteristics of the corresponding channels through the target value range and the corresponding discrete cumulative probability interval of each channel to obtain hidden layer characteristics corresponding to each channel;
uploading hidden layer characteristics corresponding to each channel to a server, wherein the uploaded hidden layer characteristics instruct the server to decompress the hidden layer characteristics based on a pre-stored target value range and corresponding discrete cumulative probability interval of each channel, and to perform identity recognition based on the decompression result;
and receiving an identity recognition result which is fed back by the server and corresponds to the image to be recognized.
According to the image recognition method, apparatus, computer device, and storage medium, the image to be recognized is acquired and then encoded and quantized to obtain quantized features corresponding to at least one channel; the quantized features of each channel are compressed through that channel's target value range and corresponding discrete cumulative probability interval, and the hidden layer features obtained after compression are uploaded to the server, so that user privacy can be effectively protected while the transmitted data volume and bandwidth are reduced. Because what is uploaded to the server is the compressed hidden layer features, the accurate image features can be recovered only through the corresponding decompression processing, which avoids the problem that the image to be recognized is directly exposed by a data leak when it is uploaded to the server as-is. The server decompresses the hidden layer features based on the pre-stored target value range and corresponding discrete cumulative probability interval of each channel, accurately obtains the decompressed features for identity recognition, and returns the identity recognition result to the terminal, so that the accuracy of user identity recognition and the security of the recognition process are effectively ensured.
An image recognition method, the method comprising:
receiving hidden layer characteristics corresponding to at least one channel, wherein the hidden layer characteristics corresponding to the at least one channel are obtained by coding and quantizing an image to be identified to obtain quantized characteristics corresponding to each channel, and compressing the quantized characteristics of the corresponding channel through a target value range and a corresponding discrete cumulative probability interval of each channel;
decompressing the hidden layer characteristics of the corresponding channel through the target value range of each channel and the corresponding discrete cumulative probability interval to obtain the decompressed characteristics of the corresponding channel;
and performing feature recognition based on the decompression features respectively corresponding to the channels to obtain an identity recognition result corresponding to the image to be recognized, and feeding back the identity recognition result to the terminal.
An image recognition apparatus, the apparatus comprising:
the characteristic receiving module is used for receiving hidden layer characteristics corresponding to at least one channel, wherein the hidden layer characteristics corresponding to the at least one channel are obtained by coding and quantizing an image to be identified to obtain quantized characteristics corresponding to each channel, and compressing the quantized characteristics of the corresponding channel through a target value range and a corresponding discrete cumulative probability interval of each channel;
the decompression module is used for decompressing the hidden layer characteristics of the corresponding channels through the target value ranges of the channels and the corresponding discrete cumulative probability intervals to obtain the decompression characteristics of the corresponding channels;
and the feedback module is used for carrying out characteristic identification on the basis of the decompression characteristics respectively corresponding to the channels to obtain an identity identification result corresponding to the image to be identified and feeding the identity identification result back to the terminal.
In one embodiment, the decompression module is further configured to: for the hidden layer feature corresponding to each channel, determine the character probability interval that contains the hidden layer feature within the discrete cumulative probability interval of the corresponding channel, and determine, based on that character probability interval, the first character of the hidden layer feature from the corresponding target value range; take the first character as the current character, and update the discrete cumulative probability interval based on the character probability interval and the initial probability interval of the current character; determine the next character from the target value range based on the character probability interval that contains the hidden layer feature within the updated discrete cumulative probability interval; take that next character as the current character and return to the step of updating the discrete cumulative probability interval based on the character probability interval and the initial probability interval of the current character, continuing until the last character of the hidden layer feature is obtained; and take the characters of the hidden layer feature as the decompressed feature of the corresponding channel.
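The stepwise recovery described above is arithmetic decoding. A minimal sketch under the same illustrative conventions as the terminal-side compression, with one stated assumption: the number of characters per channel is known to the server (for example, fixed by the model architecture), since the patent does not say how it is conveyed:

```python
def decompress_channel(value, cum_prob, length):
    """Recover a channel's quantized characters from its hidden layer feature.

    value:    the number produced by the arithmetic encoder
    cum_prob: dict mapping each character of the channel's target value range
              to its [p_lo, p_hi) slice of the discrete cumulative probability
              interval
    length:   number of characters to recover (assumed known in advance)
    """
    chars = []
    for _ in range(length):
        # find the character whose probability interval contains the value
        for ch, (p_lo, p_hi) in cum_prob.items():
            if p_lo <= value < p_hi:
                chars.append(ch)
                # rescale the value so the next character can be read the same way
                value = (value - p_lo) / (p_hi - p_lo)
                break
    return chars
```

With `cum_prob = {-1: (0.0, 0.25), 0: (0.25, 0.75), 1: (0.75, 1.0)}`, `decompress_channel(0.59375, cum_prob, 3)` recovers `[0, 0, 1]`, inverting the compression of that sequence.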
A computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:
receiving hidden layer characteristics corresponding to at least one channel, wherein the hidden layer characteristics corresponding to the at least one channel are obtained by coding and quantizing an image to be identified to obtain quantized characteristics corresponding to each channel, and compressing the quantized characteristics of the corresponding channel through a target value range and a corresponding discrete cumulative probability interval of each channel;
decompressing the hidden layer characteristics of the corresponding channel through the target value range of each channel and the corresponding discrete cumulative probability interval to obtain the decompressed characteristics of the corresponding channel;
and performing feature recognition based on the decompression features respectively corresponding to the channels to obtain an identity recognition result corresponding to the image to be recognized, and feeding back the identity recognition result to the terminal.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:
receiving hidden layer characteristics corresponding to at least one channel, wherein the hidden layer characteristics corresponding to the at least one channel are obtained by coding and quantizing an image to be identified to obtain quantized characteristics corresponding to each channel, and compressing the quantized characteristics of the corresponding channel through a target value range and a corresponding discrete cumulative probability interval of each channel;
decompressing the hidden layer characteristics of the corresponding channel through the target value range of each channel and the corresponding discrete cumulative probability interval to obtain the decompressed characteristics of the corresponding channel;
and performing feature recognition based on the decompression features respectively corresponding to the channels to obtain an identity recognition result corresponding to the image to be recognized, and feeding back the identity recognition result to the terminal.
According to the image recognition method, apparatus, computer device, and storage medium, the terminal encodes and quantizes the image to be recognized to obtain quantized features corresponding to at least one channel, compresses the quantized features of each channel through that channel's target value range and corresponding discrete cumulative probability interval, and uploads the hidden layer features obtained after compression to the server, so that user privacy can be effectively protected while the transmitted data volume and bandwidth are reduced. Because what is uploaded to the server is the compressed hidden layer features, the accurate image features can be recovered only through the corresponding decompression processing, which avoids the problem that the image to be recognized is directly exposed by a data leak when it is uploaded to the server as-is. The server decompresses the hidden layer features based on the pre-stored target value range and corresponding discrete cumulative probability interval of each channel and accurately obtains the decompressed features for identity recognition, so that the accuracy of user identity recognition and the security of the recognition process are effectively improved.
A recognition model training method, the method comprising:
determining a recognition model to be trained, wherein the recognition model to be trained comprises an image encoder, a quantizer, an entropy network and a feature recognition network;
acquiring a sample image and a corresponding identity label, and sequentially encoding and quantizing the sample image through an encoder and a quantizer in the identification model to be trained to obtain a sample quantization characteristic;
determining sample probability distribution corresponding to each pixel in the sample image based on the sample quantization characteristics through the entropy network, and determining sample image entropy corresponding to the sample image according to the sample probability distribution;
determining the value ranges respectively corresponding to all channels in the recognition model to be trained based on the sample quantization characteristics through the entropy network, and calculating the channel probability distribution respectively corresponding to all the value ranges;
determining the probability loss of the channel based on the probability distribution of the channel corresponding to each value range;
identifying and processing the quantitative characteristics of the sample through the characteristic identification network to obtain a sample identification result;
determining image recognition loss based on the sample recognition result and the identity label, and constructing a target loss function according to the image recognition loss, the channel probability loss and the sample image entropy;
training the recognition model to be trained through the target loss function until the training stopping condition is reached, and obtaining a trained target recognition model; the target recognition model is used for carrying out identity recognition on the image to be recognized.
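The training objective above combines three terms: the image recognition loss, the channel probability loss, and the sample image entropy. A sketch of how they might be assembled; the entropy formula is the standard Shannon entropy, while the weights and the linear combination are assumptions, since the patent does not give the exact form of the target loss function:

```python
import math

def sample_image_entropy(pixel_probs):
    """Entropy (in bits) of the sample image, computed from the per-pixel
    probabilities produced by the entropy network: H = -sum(p * log2 p)."""
    return -sum(p * math.log2(p) for p in pixel_probs if p > 0)

def target_loss(recognition_loss, channel_prob_loss, image_entropy,
                w_prob=1.0, w_entropy=0.01):
    """Weighted sum of the three terms named in the method; the weights
    here are illustrative placeholders, not values from the patent."""
    return recognition_loss + w_prob * channel_prob_loss + w_entropy * image_entropy
```

For instance, two equally likely pixel outcomes give an entropy of exactly 1 bit; minimizing the entropy term pushes the quantized features toward representations that compress into shorter hidden layer features.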
A recognition model training apparatus, the apparatus comprising:
the model determining module is used for determining a recognition model to be trained, and the recognition model to be trained comprises an image encoder, a quantizer, an entropy network and a feature recognition network;
the processing module is used for acquiring a sample image and a corresponding identity label, and sequentially coding and quantizing the sample image through a coder and a quantizer in the identification model to be trained to obtain sample quantization characteristics;
a probability distribution determining module, configured to determine, through the entropy network and based on the sample quantization feature, sample probability distributions corresponding to the pixels in the sample image, and determine, according to the sample probability distributions, sample image entropies corresponding to the sample images;
the calculation module is used for determining the value ranges respectively corresponding to all the channels in the identification model to be trained through the entropy network and based on the sample quantization characteristics, and calculating the channel probability distribution respectively corresponding to all the value ranges;
the probability loss determining module is used for determining the probability loss of the channel based on the probability distribution of the channel corresponding to each value range;
the identification module is used for identifying and processing the quantized characteristics of the sample through the characteristic identification network to obtain a sample identification result;
the construction module is used for determining image identification loss based on the sample identification result and the identity label and constructing a target loss function according to the image identification loss, the channel probability loss and the sample image entropy;
the training module is used for training the recognition model to be trained through the target loss function until the training stopping condition is reached, and obtaining a trained target recognition model; the target recognition model is used for carrying out identity recognition on the image to be recognized.
In one embodiment, the calculating module is further configured to calculate channel probability distributions corresponding to the upper limit value and the lower limit value in each value range;
the probability loss determining module is further configured to determine the channel probability loss according to the channel probability distribution corresponding to the upper limit value and the lower limit value of each value range.
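The patent does not give the formula for this loss. One plausible reading, used purely as an illustration, is that the channel probability loss penalizes probability mass falling outside each channel's value range, computed from the cumulative distribution evaluated at the range's lower and upper limit values:

```python
def channel_prob_loss(cdf_lower, cdf_upper):
    """Hypothetical channel probability loss: for each channel, the mass
    below the lower limit (cdf_lower[i]) plus the mass above the upper
    limit (1 - cdf_upper[i]), summed over channels. This reading is an
    assumption, not the patent's actual formula."""
    return sum(lo + (1.0 - hi) for lo, hi in zip(cdf_lower, cdf_upper))
```

Under this reading, the loss is zero when every channel's value range covers its entire probability mass, which is what the decompression step requires.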
In one embodiment, the trained target recognition model includes a target value range corresponding to each of the channels; the probability distribution determining module is further configured to determine, through the entropy network in the target recognition model, a channel probability distribution corresponding to each value in each target value range based on a target value range corresponding to each channel; for the target value range corresponding to each channel, calculating a discrete cumulative probability interval corresponding to the corresponding target value range according to the channel probability distribution corresponding to each value in the corresponding target value range; the target value range and the corresponding discrete cumulative probability interval are used for compressing the quantization features corresponding to the image to be recognized into hidden layer features, and decompressing the hidden layer features corresponding to the image to be recognized so as to obtain the identity recognition result corresponding to the image to be recognized based on the decompression result.
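Turning each channel's probability distribution over its target value range into the discrete cumulative probability intervals described above amounts to building a running cumulative distribution. A minimal sketch with illustrative names:

```python
def cumulative_intervals(value_range, probs):
    """Map each character of a channel's target value range to its
    [cum_lo, cum_hi) probability interval.

    value_range: ordered characters, e.g. [-1, 0, 1]
    probs:       channel probability of each character, same order,
                 summing to 1
    """
    intervals, cum = {}, 0.0
    for ch, p in zip(value_range, probs):
        intervals[ch] = (cum, cum + p)  # each interval starts where the previous ended
        cum += p
    return intervals
```

For example, `cumulative_intervals([-1, 0, 1], [0.25, 0.5, 0.25])` yields `{-1: (0.0, 0.25), 0: (0.25, 0.75), 1: (0.75, 1.0)}`, the per-character intervals that the compression and decompression steps described in this document consume.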
In one embodiment, the target recognition model comprises a first sub-model and a second sub-model, the first sub-model is deployed at a terminal, and the second sub-model is deployed at a server;
the first sub-model comprises an encoder, a quantizer, and a target value range and a corresponding discrete cumulative probability interval of each channel, and the target value range and the corresponding discrete cumulative probability interval of each channel in the first sub-model are used for compressing quantization features corresponding to an image to be identified into hidden layer features; the second submodel comprises a target value range and a corresponding discrete cumulative probability interval of each channel and a feature identification network; and the target value range and the corresponding discrete cumulative probability interval of each channel in the second submodel are used for decompressing the hidden layer characteristics corresponding to the image to be recognized.
A computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:
determining a recognition model to be trained, wherein the recognition model to be trained comprises an image encoder, a quantizer, an entropy network and a feature recognition network;
acquiring a sample image and a corresponding identity label, and sequentially encoding and quantizing the sample image through an encoder and a quantizer in the identification model to be trained to obtain a sample quantization characteristic;
determining sample probability distribution corresponding to each pixel in the sample image based on the sample quantization characteristics through the entropy network, and determining sample image entropy corresponding to the sample image according to the sample probability distribution;
determining the value ranges respectively corresponding to all channels in the recognition model to be trained based on the sample quantization characteristics through the entropy network, and calculating the channel probability distribution respectively corresponding to all the value ranges;
determining the probability loss of the channel based on the probability distribution of the channel corresponding to each value range;
identifying and processing the quantitative characteristics of the sample through the characteristic identification network to obtain a sample identification result;
determining image recognition loss based on the sample recognition result and the identity label, and constructing a target loss function according to the image recognition loss, the channel probability loss and the sample image entropy;
training the recognition model to be trained through the target loss function until the training stopping condition is reached, and obtaining a trained target recognition model; the target recognition model is used for carrying out identity recognition on the image to be recognized.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:
determining a recognition model to be trained, wherein the recognition model to be trained comprises an image encoder, a quantizer, an entropy network and a feature recognition network;
acquiring a sample image and a corresponding identity label, and sequentially encoding and quantizing the sample image through an encoder and a quantizer in the identification model to be trained to obtain a sample quantization characteristic;
determining sample probability distribution corresponding to each pixel in the sample image based on the sample quantization characteristics through the entropy network, and determining sample image entropy corresponding to the sample image according to the sample probability distribution;
determining the value ranges respectively corresponding to all channels in the recognition model to be trained based on the sample quantization characteristics through the entropy network, and calculating the channel probability distribution respectively corresponding to all the value ranges;
determining the probability loss of the channel based on the probability distribution of the channel corresponding to each value range;
identifying and processing the quantitative characteristics of the sample through the characteristic identification network to obtain a sample identification result;
determining image recognition loss based on the sample recognition result and the identity label, and constructing a target loss function according to the image recognition loss, the channel probability loss and the sample image entropy;
training the recognition model to be trained through the target loss function until the training stopping condition is reached, and obtaining a trained target recognition model; the target recognition model is used for carrying out identity recognition on the image to be recognized.
In the above embodiments of the recognition model training method and apparatus, the computer device and the storage medium, the sample quantization features of the sample image are processed through the entropy network to obtain the sample probability distribution corresponding to each pixel in the sample image, so as to determine the sample image entropy corresponding to the sample image; the sample image entropy is used as part of the target loss function to measure the degree of loss of key information in the sample image. Meanwhile, the channel probability loss, calculated from the channel probability distribution corresponding to the value range of each channel of the entropy network, serves as another part of the target loss function, so that the value range of each channel is optimized during training and the output features of the entropy network are constrained. In addition, the loss between the prediction result of the feature recognition network on the sample quantization features and the real result forms a further part of the target loss function, so that the feature recognition network is trained. Training the recognition model as a whole under this series of constraints gives the trained target recognition model higher prediction precision and accuracy.
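The composition of the target loss function described above can be sketched as a weighted sum of the three terms; the function name `target_loss` and the balancing coefficients `alpha` and `beta` below are illustrative assumptions, not values disclosed in this application:

```python
def target_loss(recognition_loss, channel_prob_loss, sample_image_entropy,
                alpha=1.0, beta=1.0):
    # Weighted sum of the three loss terms: image recognition loss,
    # channel probability loss, and sample image entropy. alpha and
    # beta are assumed balancing coefficients.
    return recognition_loss + alpha * channel_prob_loss + beta * sample_image_entropy

# Illustrative scalar loss values for one training step.
loss = target_loss(0.52, 0.13, 0.07)
print(loss)
```

In an actual training loop, the three inputs would be tensors produced by the feature recognition network and the entropy network, and the combined scalar would be back-propagated through the whole recognition model.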
Drawings
FIG. 1 is a diagram of an exemplary embodiment of an application of an image recognition method;
FIG. 2 is a flow diagram illustrating an image recognition method in one embodiment;
FIG. 3 is a schematic flow chart diagram illustrating the training steps of the object recognition model in one embodiment;
FIG. 4 is a schematic flow chart of face recognition in another embodiment;
FIG. 5 is a flow chart illustrating an image recognition method according to another embodiment;
fig. 6 is a schematic flow chart illustrating a step of decompressing hidden layer features of respective channels according to a target value range and a corresponding discrete cumulative probability interval of each channel in one embodiment;
FIG. 7 is a flowchart illustrating the testing steps for identifying a model in one embodiment;
FIG. 8 is a flowchart illustrating the training steps of the object recognition model in another embodiment;
FIG. 9 is a schematic diagram illustrating the structure of a portion of a feature recognition network in one embodiment;
FIG. 10 is a block diagram of a recognition model to be trained in one embodiment;
FIG. 11 is a block diagram showing the structure of an image recognition apparatus according to an embodiment;
FIG. 12 is a block diagram showing the construction of an image recognizing apparatus according to another embodiment;
FIG. 13 is a block diagram showing the structure of a recognition model training apparatus according to an embodiment;
FIG. 14 is a diagram showing an internal structure of a computer device in one embodiment;
fig. 15 is an internal structural view of a computer device in another embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The present application relates to the field of Artificial Intelligence (AI) technology. Artificial intelligence is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new kind of intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that machines have the functions of perception, reasoning and decision making. The scheme provided by the embodiments of the present application relates to an artificial intelligence image recognition method, which is specifically explained by the following embodiments.
The image recognition method provided by the application can be applied to the image recognition system shown in fig. 1. As shown in fig. 1, the image recognition system includes a terminal 110 and a server 120. In one embodiment, the terminal 110 and the server 120 may each separately perform the image recognition method provided in the embodiments of the present application. The terminal 110 and the server 120 may also cooperate to perform the image recognition method provided in the embodiments of the present application. When the terminal 110 and the server 120 cooperate to execute the image recognition method provided in the embodiment of the present application, the terminal 110 obtains an image to be recognized, and performs encoding and quantization processing on the image to be recognized to obtain quantization features corresponding to at least one channel. The terminal 110 compresses the quantization features of each channel through the target value range of that channel and the corresponding discrete cumulative probability interval to obtain the hidden layer features corresponding to each channel. The terminal 110 uploads the hidden layer features corresponding to each channel to the server 120, and the server 120 receives the hidden layer features corresponding to the at least one channel; these hidden layer features are obtained by encoding and quantizing the image to be recognized into quantization features and then compressing the quantization features of each channel through the target value range and the corresponding discrete cumulative probability interval of that channel.
The server 120 decompresses the hidden layer characteristics of the corresponding channel through the target value range and the corresponding discrete cumulative probability interval of each channel to obtain the decompressed characteristics of the corresponding channel. The server 120 performs feature recognition based on the decompression features respectively corresponding to the channels to obtain an identity recognition result corresponding to the image to be recognized, and feeds back the identity recognition result to the terminal. The terminal 110 receives the identity recognition result corresponding to the image to be recognized fed back by the server.
The server 120 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing cloud computing services or a cloud server cluster formed by a plurality of cloud servers. The terminal 110 may be, but is not limited to, a smart phone, a tablet computer, a laptop computer, a desktop computer, a smart speaker, a smart watch, a vehicle-mounted terminal, a smart television, and the like. The terminal 110 and the server 120 may be directly or indirectly connected through wired or wireless communication, and the application is not limited thereto.
In one embodiment, multiple servers may be grouped into a blockchain, with servers being nodes on the blockchain.
In one embodiment, data related to the image recognition method may be stored in a blockchain, for example, data such as a target value range and a corresponding discrete cumulative probability interval of each channel, an image to be recognized, a quantization feature, a hidden layer feature, a decompression result, an identity recognition result, and the like may be stored in the blockchain. Similarly, data related to the recognition model training method may also be saved on the blockchain.
In one embodiment, as shown in fig. 2, an image recognition method is provided, which is described by taking the example that the method is applied to the terminal in fig. 1, and includes the following steps:
step S202, acquiring an image to be identified, and coding and quantizing the image to be identified to obtain quantization characteristics corresponding to at least one channel.
The image to be recognized refers to an image related to the privacy of the user, such as a human face image, a half-body image, a whole body image, a certificate image, a signature image, and the like, but is not limited thereto. The image to be identified can be an image acquired in real time, a pre-stored image or an image acquired from other equipment. The quantization process is to convert the image features into integers, for example, to convert the floating-point feature values into integers.
Specifically, the terminal may obtain an image to be identified, and perform encoding processing on the image to be identified to obtain an encoding characteristic corresponding to at least one channel. And quantizing the coding features corresponding to the at least one channel to convert each coding feature into an integer to obtain the quantization features corresponding to the at least one channel.
In one embodiment, the terminal can acquire images to be identified in real time through the camera. The terminal can also select the image to be identified from the images shot and stored in advance, and can also acquire the image to be identified from other equipment.
In one embodiment, the coding features obtained by the encoder may be quantized by the following formula, where round represents a rounding operation:

ŷ = round(y)

where ŷ is the quantization feature mapped from the coding feature y, and round(y) represents rounding the coding feature y to the nearest integers. For example, the coding features [0.27, 0.41, 0.50, 0.75, 0.84, 0.38] become [0, 0, 1, 1, 1, 0] after quantization.
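The quantization step can be sketched as follows; the half-up rounding convention is an assumption chosen to match the example above, since the text does not specify how ties at .5 are resolved:

```python
import math

def quantize(coding_features):
    # Element-wise rounding of floating-point coding features to integers.
    # Half-up rounding is used so that 0.50 maps to 1, matching the
    # example in the text (Python's built-in round() uses banker's
    # rounding and would map 0.5 to 0).
    return [math.floor(x + 0.5) for x in coding_features]

y = [0.27, 0.41, 0.50, 0.75, 0.84, 0.38]
y_hat = quantize(y)
print(y_hat)  # [0, 0, 1, 1, 1, 0]
```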
And step S204, compressing the quantization characteristics of the corresponding channels through the target value range of each channel and the corresponding discrete cumulative probability interval to obtain hidden layer characteristics corresponding to each channel.
The target value range refers to a constraint term of an output value of a channel, and the discrete cumulative probability interval refers to a set formed by probability intervals corresponding to each value in the target value range. Each channel corresponds to a target value range and a discrete cumulative probability interval.
In one embodiment, the target value range is integer-valued, i.e., each value in the target value range is an integer. Each value in the discrete cumulative probability interval is greater than the previous value, and the lower limit of the discrete cumulative probability interval is 0 and the upper limit is 1, for example [0, 1].
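The stated properties of a discrete cumulative probability interval (strictly increasing values, lower limit 0, upper limit 1) can be checked with a small sketch; the function name and the intervals tested are hypothetical examples:

```python
def is_valid_cumulative_interval(cdf):
    # A valid discrete cumulative probability interval starts at 0,
    # ends at 1, and each value is greater than the previous one.
    return (cdf[0] == 0
            and cdf[-1] == 1
            and all(b > a for a, b in zip(cdf, cdf[1:])))

print(is_valid_cumulative_interval([0, 0.2, 0.3, 0.5, 0.8, 1]))  # True
print(is_valid_cumulative_interval([0, 0.3, 0.2, 1]))            # False: not increasing
```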
Compression processing refers to a processing mechanism that reduces the size of computer data through a particular algorithm. This mechanism reduces the total number of bytes of data, so that data transmission is faster and less space is needed to store the data. Compression can be divided into lossy compression and lossless compression; in either case the data is processed so that most of the information is preserved while the data becomes smaller. Lossy compression allows some information to be lost during compression, where the lost part has little impact on understanding the original data. Lossless compression exploits the statistical redundancy of the data and can restore the original data completely without any distortion.
Specifically, the terminal stores in advance the target value range and the corresponding discrete cumulative probability interval of each channel. For the quantization features of each channel, the terminal can compress the quantization features of the corresponding channel through the target value range and the corresponding discrete cumulative probability interval of that channel, so as to obtain the hidden layer features corresponding to each channel. For example, if there are 192 channels, there are 192 corresponding target value ranges and 192 discrete cumulative probability intervals. The quantization features of the 1st channel are compressed through the target value range and the discrete cumulative probability interval of the 1st channel to obtain the hidden layer features corresponding to the 1st channel. By the same process, the 192 hidden layer features corresponding to the 192 channels can be obtained.
And step S206, uploading the hidden layer characteristics corresponding to each channel to a server, wherein the uploaded hidden layer characteristics are used for instructing the server to decompress the hidden layer characteristics based on the pre-stored target value range and corresponding discrete cumulative probability interval of each channel, and to perform identity recognition based on the decompression result.
The decompression processing refers to decompressing the compressed data by a processing mechanism corresponding to the compression processing to restore the compressed data to the data before compression.
The server may be a physical server or a cloud server.
Specifically, the terminal uploads the hidden layer characteristics corresponding to each channel to the server. The server stores the target value range and the corresponding discrete cumulative probability interval of each channel in advance. And the server receives the hidden layer characteristics corresponding to each channel, and decompresses the hidden layer characteristics of the corresponding channel through the target value range and the corresponding discrete cumulative probability interval of each channel to obtain the decompression characteristics of the corresponding channel. And the server performs characteristic identification on the basis of the decompression characteristics respectively corresponding to the channels to obtain an identity identification result corresponding to the image to be identified. And the server feeds the identification result back to the terminal.
In one embodiment, the terminal encodes each hidden layer feature into a binary file, and uploads the binary file to the server. And the server receives the binary file and decodes the binary file to obtain the hidden layer characteristics of each channel.
In one embodiment, the server stores the real image of the user and the corresponding image features in advance, and the server performs feature recognition based on the decompression features respectively corresponding to the channels to obtain recognition features. The similarity between the recognition features and the pre-stored image features is calculated, and the identity recognition result is determined according to the similarity. Further, when the similarity between the recognition features and the pre-stored image features is greater than a similarity threshold, it is determined that the image to be recognized and the pre-stored image correspond to the same user, and the identity recognition succeeds. When the similarity between the recognition features and the pre-stored image features is less than or equal to the similarity threshold, it is determined that the image to be recognized and the pre-stored image do not correspond to the same user, indicating that the identity recognition fails.
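The threshold comparison described above can be sketched as follows. Cosine similarity and the threshold value 0.8 are illustrative assumptions, as the application does not fix a particular similarity measure or threshold; the feature vectors are hypothetical:

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two feature vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def identify(recognition_feature, stored_feature, threshold=0.8):
    # Same user only when similarity strictly exceeds the threshold,
    # mirroring the "greater than" condition in the text.
    return cosine_similarity(recognition_feature, stored_feature) > threshold

print(identify([0.9, 0.1, 0.4], [0.88, 0.12, 0.41]))  # True: nearly parallel vectors
print(identify([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))     # False: orthogonal vectors
```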
For example, the image to be recognized and the pre-stored image are both face images, and when the similarity between the recognition feature corresponding to the image to be recognized and the image feature of the pre-stored image is greater than the similarity threshold, it indicates that the image to be recognized and the pre-stored image are face images of the same person. And when the similarity between the identification feature corresponding to the image to be identified and the image feature of the pre-stored image is smaller than or equal to the similarity threshold value, indicating that the image to be identified and the pre-stored image are not the face image of the same person.
And step S208, receiving an identity recognition result corresponding to the image to be recognized and fed back by the server.
And the identity recognition result comprises successful identity recognition and failed identity recognition.
Specifically, the terminal receives an identity recognition result corresponding to the image to be recognized and fed back by the server, and performs corresponding processing based on the identity recognition result. For example, when the identity recognition result is that the identity recognition is successful, the terminal allows the user to execute the corresponding operation, and when the identity recognition result is that the identity recognition is failed, the user is prohibited from executing the corresponding operation.
According to the image recognition method, the image to be recognized is obtained and is encoded and quantized to obtain the quantization features corresponding to at least one channel; the quantization features of each channel are compressed through the target value range and the corresponding discrete cumulative probability interval of that channel, and the hidden layer features obtained after compression are uploaded to the server, so that the privacy of the user can be effectively protected while the amount of data transmitted and the bandwidth required are reduced. What is uploaded to the server is the hidden layer features after compression, so the accurate image features can be obtained only through the corresponding decompression processing, which avoids the problem that directly uploading the image to be recognized would expose it in the event of a data leak. The server decompresses the hidden layer features based on the pre-stored target value range and corresponding discrete cumulative probability interval of each channel, accurately obtains the decompressed features to perform identity recognition, and returns the identity recognition result to the terminal, thereby effectively ensuring the accuracy of user identity recognition and the security of the recognition process.
In one embodiment, compressing the quantization features of the corresponding channels according to the target value ranges of the channels and the corresponding discrete cumulative probability intervals to obtain hidden layer features corresponding to each channel, includes:
for the quantization feature corresponding to each channel, determining a probability interval corresponding to each character in the quantization feature corresponding to the corresponding channel through a target value range and a discrete cumulative probability interval corresponding to the corresponding channel; determining a coding interval corresponding to each character respectively based on the probability interval corresponding to each character; and for each channel, determining the hidden layer characteristics corresponding to the corresponding channel according to the coding interval corresponding to each character corresponding to the corresponding channel.
Specifically, for the quantization feature corresponding to a single channel, the terminal obtains the target value range and the discrete cumulative probability interval corresponding to the channel. Characters are sequentially read from the quantization feature, and each time a character is read, the probability interval corresponding to the character is determined from the discrete cumulative probability interval based on the target value range. The coding interval corresponding to the character is then determined according to the probability interval corresponding to the character. By the same processing, the coding interval corresponding to each character in the quantization feature can be obtained, and the hidden layer feature corresponding to the quantization feature can be determined according to the coding intervals corresponding to the characters. The hidden layer feature corresponding to the quantization feature is the hidden layer feature corresponding to the corresponding channel.
And according to the same processing, obtaining hidden layer characteristics corresponding to each quantization characteristic respectively, namely obtaining hidden layer characteristics corresponding to each channel respectively.
In this embodiment, for the quantization feature corresponding to each channel, a probability interval corresponding to each character in the quantization feature corresponding to the corresponding channel is determined through a target value range and a discrete cumulative probability interval corresponding to the corresponding channel, a coding interval corresponding to each character is determined based on the probability interval corresponding to each character, for each channel, a hidden layer feature corresponding to the corresponding channel is determined according to the coding interval corresponding to each character corresponding to the corresponding channel, the quantization feature corresponding to each channel can be compressed, so as to reduce the data amount of the quantization feature, facilitate uploading the quantization feature to a server in the form of a compressed feature, and avoid leakage of the quantization feature in a transmission process.
In one embodiment, determining a probability interval corresponding to each character in the quantization feature corresponding to the corresponding channel according to the target value range and the discrete cumulative probability interval corresponding to the corresponding channel includes:
determining a matched character matched with the current character in a target value range corresponding to the corresponding channel for the current character in the quantization characteristics of the corresponding channel; taking the corresponding interval of the matched character in the corresponding discrete cumulative probability interval as the probability interval corresponding to the current character;
determining the coding interval corresponding to each character respectively based on the probability interval corresponding to each character, including: and determining the coding interval corresponding to the current character according to the probability interval corresponding to the current character and the coding interval corresponding to the adjacent previous character, and continuously processing the next character in the quantization characteristics of the corresponding channel until the coding interval corresponding to each character in the quantization characteristics is obtained.
Specifically, for the quantization feature corresponding to a single channel, the terminal obtains a target value range and a discrete cumulative probability interval corresponding to the channel. And reading characters from the quantization features in sequence, and taking the read characters as current characters. And matching the current character with each character in the target value range to determine a matched character matched with the current character in the target value range. And determining the corresponding interval of the matched character in the discrete cumulative probability interval corresponding to the target value range, and taking the determined interval as the probability interval corresponding to the current character.
And determining a coding interval corresponding to the adjacent previous character of the current character, and determining the coding interval corresponding to the current character according to the probability interval corresponding to the current character and the coding interval corresponding to the adjacent previous character. Further, the terminal can obtain a compression formula, and substitutes the probability interval of the current character and the coding interval corresponding to the previous character into the compression formula to obtain the coding interval corresponding to the current character.
And after the coding interval corresponding to the current character is obtained, reading the next character from the quantization feature, and taking the read character as the current character to perform the same processing as the above until the coding interval corresponding to each character in the quantization feature is obtained.
For example, the compression formula is:

low_i = low_{i-1} + (high_{i-1} - low_{i-1}) * L_i
high_i = low_{i-1} + (high_{i-1} - low_{i-1}) * H_i

where low_i is the lower limit and high_i is the upper limit of the coding interval corresponding to the current character i, so that the coding interval corresponding to the current character i is (low_i, high_i]. low_{i-1} and high_{i-1} are respectively the lower limit and the upper limit of the coding interval (low_{i-1}, high_{i-1}] corresponding to the adjacent previous character i-1. L_i and H_i are respectively the lower limit and the upper limit of the probability interval corresponding to the current character i.
In one embodiment, when the current character is the first character in the quantization feature, the first character has no adjacent previous character and therefore no preceding coding interval, so the initial coding interval [0, 1] is used: the probability interval of the first character and the initial coding interval are substituted into the compression formula to obtain the coding interval corresponding to the first character. That is, when the coding interval corresponding to the first character in the quantization feature is calculated, (low_{i-1}, high_{i-1}] in the compression formula is [0, 1].
When the current character is not the first character, the coding interval corresponding to the adjacent previous character of the current character is determined, and the probability interval of the current character and the coding interval corresponding to the previous character are substituted into the compression formula to obtain the coding interval corresponding to the current character. By the same processing, the coding interval corresponding to the last character in the quantization feature can be obtained.
After the coding interval corresponding to the last character in the quantization characteristics is obtained, a numerical value can be randomly selected from the coding interval corresponding to the last character; and taking the randomly selected numerical value as the hidden layer characteristic corresponding to the quantization characteristic, wherein the hidden layer characteristic corresponding to the quantization characteristic is the hidden layer characteristic corresponding to the single channel.
And processing the quantization features corresponding to each channel according to the method to obtain the hidden layer features corresponding to each channel.
In one embodiment, after a matching character matched with the current character in the target value range corresponding to the corresponding channel is determined, the adjacent previous character of the matching character in the target value range and the matching character are used as the upper limit and the lower limit of the matching interval, so that the matching interval corresponding to the current character is obtained. And determining a numerical value corresponding to the lower limit value in the matching interval in the discrete cumulative probability interval corresponding to the target value range and a numerical value corresponding to the upper limit value in the matching interval in the discrete cumulative probability interval corresponding to the target value range, taking the numerical value corresponding to the lower limit value as the lower limit value of the probability interval, and taking the numerical value corresponding to the upper limit value as the upper limit value of the probability interval, thereby forming the probability interval corresponding to the current character.
For example, suppose the quantized feature of one channel is [1, 2, 3, 4], the target value range corresponding to the channel is [-1, 0, 1, 2, 3, 4], and the discrete cumulative probability interval corresponding to the target value range is [0, 0.2, 0.3, 0.5, 0.8, 1]. The matching character of the first character 1 of the quantized feature in the target value range is 1, so the matching interval of the first character 1 in the target value range is (0, 1], and the probability interval corresponding to the matching interval (0, 1] in the discrete cumulative probability interval is (0.2, 0.3]. Taking the initial coding interval [0, 1], the probability interval (0.2, 0.3] of the first character 1 and the initial coding interval [0, 1] are substituted into the compression formula low_i = low_{i-1} + (high_{i-1} - low_{i-1}) * L_i; high_i = low_{i-1} + (high_{i-1} - low_{i-1}) * H_i, namely low_i = 0 + (1 - 0) * 0.2 and high_i = 0 + (1 - 0) * 0.3, so the coding interval corresponding to the first character 1 is (0.2, 0.3].
The 2nd character in the quantized feature is 2; the matching character of character 2 in the target value range is 2, the matching interval of character 2 in the target value range is (1, 2], and the probability interval corresponding to the matching interval (1, 2] in the discrete cumulative probability interval is (0.3, 0.5].
Taking 0.2 and 0.3 from the coding interval (0.2, 0.3] of the first character as low_{i-1} and high_{i-1} in the compression formula gives low_i = 0.2 + (0.3 - 0.2) * 0.3 and high_i = 0.2 + (0.3 - 0.2) * 0.5, so the coding interval corresponding to character 2 is (0.23, 0.25].
By the same processing, the coding interval corresponding to each character in the quantized feature can be calculated; after the coding interval corresponding to the last character 4 is calculated, any value is selected from the coding interval of character 4 as the hidden layer feature corresponding to the quantized feature.
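The walkthrough above can be condensed into a short sketch. This is a hypothetical illustration of the described arithmetic-coding compression, not code from the patent; the function and variable names are assumptions.

```python
def coding_interval(quantized, values, cum):
    """Fold the compression formula over a channel's quantized feature.

    `values` is the channel's target value range and `cum` its discrete
    cumulative probability interval (one entry per value, last entry 1).
    """
    low, high = 0.0, 1.0  # initial coding interval [0, 1]
    for ch in quantized:
        k = values.index(ch)              # matching character in the value range
        L = cum[k - 1] if k > 0 else 0.0  # lower limit of the probability interval
        H = cum[k]                        # upper limit of the probability interval
        # compression formula: shrink the coding interval into the sub-range
        low, high = low + (high - low) * L, low + (high - low) * H
    return low, high
```

Running it on the example channel reproduces the intervals (0.2, 0.3] and (0.23, 0.25] computed above; any value inside the final interval can then serve as the channel's hidden layer feature.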
In this embodiment, for the current character in the quantized feature of a channel, the matching character that matches the current character in the channel's target value range is determined, and the interval corresponding to the matching character in the channel's discrete cumulative probability interval is taken as the probability interval of the current character; the coding interval of the current character is then determined from its probability interval and the coding interval of the adjacent previous character. The next character in the channel's quantized feature is processed in the same way until the coding interval corresponding to every character in the quantized feature is obtained. Compressing the quantized features of each channel in this way reduces their data volume, allows them to be conveniently uploaded to the server in compressed form, reduces the bandwidth used, and improves security during data transmission.
In one embodiment, for each channel, determining hidden layer features corresponding to the corresponding channel according to the coding interval corresponding to each character corresponding to the corresponding channel, includes:
for each channel, randomly selecting a numerical value from a coding interval corresponding to the last character of the quantization characteristic corresponding to the corresponding channel; and taking the randomly selected numerical value as the hidden layer characteristics corresponding to the corresponding channel until the hidden layer characteristics corresponding to each channel are obtained.
Specifically, for a single channel, after a coding interval corresponding to a last character in the quantization features corresponding to the channel is obtained, a numerical value is randomly selected from the coding interval corresponding to the last character, and the randomly selected numerical value is used as the hidden layer feature corresponding to the channel. And according to the same processing mode, the hidden layer characteristics corresponding to each channel can be obtained.
In this embodiment, a value is randomly selected from the coding interval corresponding to the last character in the quantized feature and used as the hidden layer feature corresponding to the channel, which increases the randomness of data selection within a constrained range. The randomly selected value does not affect subsequent decompression: any value in the coding interval corresponding to the last character, taken as the hidden layer feature, yields the same decompressed feature after decompression, so decompression accuracy is ensured.
In one embodiment, the image recognition method is implemented by a target recognition model comprising a first sub-model and a second sub-model; the first sub-model is deployed at the terminal, and the second sub-model is deployed at the server; the first sub-model comprises an encoder, a quantizer, a target value range of each channel and a corresponding discrete accumulation probability interval; the second submodel comprises a target value range and a corresponding discrete accumulation probability interval of each channel and a feature identification network.
The image recognition method is realized by a target recognition model, and the target recognition model can comprise a first sub-model and a second sub-model. The first sub-model is deployed at the terminal, and the second sub-model is deployed at the server. And the terminal inputs the image to be recognized into the first submodel to obtain the hidden layer characteristics corresponding to each channel output by the first submodel. And the terminal uploads the hidden layer characteristics corresponding to each channel to the server, and the server inputs the hidden layer characteristics corresponding to each channel into a corresponding channel in the second submodel for processing to obtain an identity recognition result output by the second submodel.
The first submodel includes an encoder, a quantizer, a target value range of each channel, and a corresponding discrete cumulative probability interval. And the terminal inputs the image to be identified into the encoder to perform feature coding and outputs the coding features corresponding to each channel. And the coding characteristics corresponding to each channel output by the encoder are used as the input characteristics corresponding to each channel of the quantizer, namely the coding characteristics of each channel output by the encoder are input into the corresponding channel of the quantizer to carry out quantization processing on the coding characteristics so as to obtain the quantization characteristics corresponding to each channel output by the quantizer. And compressing the quantization characteristics corresponding to each channel output by the quantizer by using the target value range and the discrete cumulative probability interval of the corresponding channel respectively to obtain the hidden layer characteristics corresponding to each output channel.
And the terminal uploads the hidden layer characteristics corresponding to each channel to the server, and the server inputs the hidden layer characteristics corresponding to each channel into the second submodel. And the second submodel decompresses the hidden layer characteristics of the corresponding channels by using the target value range and the discrete cumulative probability interval of each channel to obtain the decompressed characteristics corresponding to each channel. And taking the decompression characteristics corresponding to each channel as the input of the corresponding channel in the characteristic identification network, and carrying out characteristic identification on each decompression characteristic through the characteristic identification network to obtain the identification characteristics. The feature recognition network outputs an identity recognition result based on the similarity between the recognition feature and the pre-stored image feature. And the server feeds back the identity recognition result input by the feature recognition network to the terminal.
In this embodiment, the image recognition method is implemented by a target recognition model including a first sub-model and a second sub-model, which are respectively deployed on the terminal and the server, so that the terminal processes the image to be recognized into hidden layer features through the first sub-model. The data uploaded to the server are hidden layer features compressed through the target value range and discrete cumulative probability interval of each channel in the first sub-model; even if the hidden layer features are leaked during data transmission, the data before compression cannot be accurately recovered without the corresponding decompression, so the user's private information can be effectively protected.
In one embodiment, the target recognition model is determined by a training step; as shown in fig. 3, the training step includes:
step S302, determining a recognition model to be trained, wherein the recognition model to be trained comprises an image encoder, a quantizer, an entropy network and a feature recognition network.
Specifically, a recognition model to be trained may be determined, including an image encoder, a quantizer, an entropy network, and a feature recognition network.
The recognition model to be trained can be deployed at the terminal so as to be trained at the terminal. The recognition model to be trained can also be deployed in a server to be trained in the server.
And step S304, obtaining a sample image and a corresponding identity label, and sequentially coding and quantizing the sample image through a coder and a quantizer in the identification model to be trained to obtain sample quantization characteristics.
Specifically, the terminal obtains a sample image and an identity label corresponding to the sample image, and inputs the sample image and the corresponding identity label into an identification model to be trained.
And the identification model to be trained encodes the sample image through an encoder and outputs the sample encoding characteristics corresponding to each channel. The sample coding features corresponding to each channel output by the encoder are used as input features corresponding to each channel of the quantizer, that is, the sample coding features of each channel output by the encoder are input into the corresponding channel of the quantizer to perform quantization processing on the coding features, so as to obtain the sample quantization features corresponding to each channel output by the quantizer.
Step S306, determining sample probability distribution corresponding to each pixel in the sample image based on the sample quantization characteristics through an entropy network, and determining sample image entropy corresponding to the sample image according to the sample probability distribution.
The probability distribution represents the information entropy contained by a pixel in the image: the larger the probability, the less useful information the pixel contains; the smaller the probability, the more useful information it contains. The sample probability distribution represents the information entropy contained by the pixels in the sample image.
Specifically, the sample quantization characteristics corresponding to each channel output by the quantizer are input into the corresponding channel of the entropy network. And calculating sample probability distribution corresponding to each pixel in the corresponding sample quantization characteristics through each channel of the entropy network so as to obtain the sample probability distribution corresponding to each pixel output by each channel. And after the sample probability distribution corresponding to each pixel in the sample image is obtained, calculating the sample image entropy corresponding to the sample image according to the sample probability distribution corresponding to each pixel.
In one embodiment, determining sample image entropy corresponding to a sample image from a sample probability distribution comprises: and averaging the sample probability distribution corresponding to each pixel in the sample image, and taking the average as the sample image entropy corresponding to the sample image.
In one embodiment, determining the sample image entropy corresponding to the sample image from the sample probability distribution includes: carrying out weighted summation on the sample probability distributions corresponding to the pixels in the sample image, averaging the result, and taking the average as the sample image entropy corresponding to the sample image.
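The two averaging variants above can be sketched as follows; this is a hypothetical illustration, and the function name and the optional per-pixel weights are assumptions.

```python
def sample_image_entropy(pixel_probs, weights=None):
    # Plain average of the per-pixel sample probability distributions
    # (first variant); with weights, a weighted sum divided by the
    # pixel count (second variant).
    if weights is None:
        return sum(pixel_probs) / len(pixel_probs)
    return sum(w * p for w, p in zip(weights, pixel_probs)) / len(pixel_probs)
```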
And S308, determining the value ranges respectively corresponding to the channels in the recognition model to be trained through the entropy network and based on the sample quantization characteristics, and calculating the channel probability distribution respectively corresponding to the value ranges.
Specifically, the entropy network is provided with a value range corresponding to each channel. And for each value range, respectively taking the lower limit value and the upper limit value in the value range as the input of the entropy network to obtain the channel probability distribution corresponding to the lower limit value and the upper limit value in each value range.
Step S310, determining the channel probability loss based on the channel probability distribution respectively corresponding to each value range.
Specifically, the channel probability loss is calculated according to the channel probability distribution corresponding to the lower limit value and the upper limit value in each value range.
And step S312, identifying the quantized features of the sample through a feature identification network to obtain a sample identification result.
Specifically, sample quantization features respectively output by each channel of the quantizer are respectively used as corresponding inputs of corresponding channels in the feature identification network, and the feature identification network performs feature extraction, pooling, full connection and other processing on the sample quantization features to obtain identification features. The feature recognition network outputs a sample recognition result based on the recognition feature.
And step S314, determining image recognition loss based on the sample recognition result and the identity label, and constructing a target loss function according to the image recognition loss, the channel probability loss and the sample image entropy.
Specifically, the image recognition loss is calculated according to the sample recognition result and the corresponding identity label, and the image recognition loss, the channel probability loss and the sample image entropy are summed to obtain the target loss function.
In one embodiment, the image identification loss, the channel probability loss and the sample image entropy are respectively multiplied by corresponding weights, and the products are summed to be used as a target loss function.
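The weighted variant above amounts to a simple linear combination. A minimal sketch, with illustrative weight names (the actual weight values would be chosen during training and are not specified by the patent):

```python
def target_loss(image_recognition_loss, channel_prob_loss, sample_image_entropy,
                w_id=1.0, w_ch=1.0, w_ent=1.0):
    # Weighted sum of the three loss terms; with all weights equal to 1
    # this reduces to the plain sum described in the previous paragraph.
    return (w_id * image_recognition_loss
            + w_ch * channel_prob_loss
            + w_ent * sample_image_entropy)
```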
Step S316, training the recognition model to be trained through a target loss function until the training stopping condition is reached, and obtaining a trained target recognition model;
the trained target recognition model comprises target value ranges respectively corresponding to the channels, and the discrete cumulative probability intervals respectively corresponding to the target value ranges of the channels are determined by the entropy network in the trained target recognition model based on the target value ranges of the corresponding channels.
The training stop condition may be at least one of a loss error of the recognition model being less than or equal to a loss threshold, a number of iterations of the recognition model reaching a preset number of iterations, and a time of the iterations reaching a preset iteration time.
Specifically, the terminal can train the recognition model to be trained through the target loss function, adjust parameters of the recognition model in the training process and continue training until the recognition model meets the target training stopping condition, and the trained target recognition model is obtained. The trained target recognition model is used for carrying out identity recognition on the image to be recognized so as to output an identity recognition result corresponding to the image to be recognized.
In one embodiment, parameters of the encoder, the quantizer, the entropy network and the feature recognition network in the recognition model are adjusted in the training process, and the training is continued until the recognition model meets the target training stopping condition, so that the trained target recognition model is obtained. The trained target recognition network comprises a trained encoder, a trained quantizer, a trained entropy network and a trained feature recognition network.
Further, the trained entropy network not only includes processing parameters of each layer, but also includes target value ranges respectively corresponding to each channel.
In one embodiment, the trained target recognition model includes target value ranges respectively corresponding to the channels, and then channel probability distribution corresponding to each value in each target value range is determined based on the target value range respectively corresponding to each channel through an entropy network in the target recognition model; and for the target value range corresponding to each channel, calculating a discrete cumulative probability interval corresponding to the corresponding target value range according to the channel probability distribution corresponding to each value in the corresponding target value range.
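The step of turning a channel's per-value probability distribution into its discrete cumulative probability interval can be sketched as follows. This is a hypothetical illustration (names assumed); the distribution is normalized so the last entry is 1, and the lower bound 0 of the first interval is left implicit.

```python
def discrete_cumulative_interval(channel_probs):
    # Accumulate the channel probability distribution over the channel's
    # target value range into a running cumulative probability list.
    total = sum(channel_probs)
    cum, acc = [], 0.0
    for p in channel_probs:
        acc += p / total  # normalize so the final entry is 1
        cum.append(acc)
    return cum
```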
In this embodiment, the sample quantization features of the sample image are processed through an entropy network to obtain sample probability distributions corresponding to pixels in the sample image, so as to determine a sample image entropy corresponding to the sample image, and the sample image entropy is used as a part of a target loss function to determine a loss degree of key information of the sample image. Meanwhile, the channel probability loss is calculated according to the channel probability distribution corresponding to the value range of each channel of the entropy network, and can be used as a part of a target loss function, so that the value range of each channel is optimized in the training process, and the constraint on the output characteristics of the entropy network is realized. And the loss of the sample quantization characteristics between the prediction result and the real result in the characteristic identification network is used as a part of a target loss function, so that the training of the characteristic identification network is realized. The overall training of the recognition model is realized through a series of constraints, so that the trained target recognition model has higher prediction precision and accuracy.
In one embodiment, the image to be recognized is a face image; acquiring an image to be identified, coding and quantizing the image to be identified to obtain quantization characteristics corresponding to at least one channel, wherein the quantization characteristics comprise:
responding to a resource transfer triggering operation of the resource amount, and acquiring a face image; coding and quantizing the face image to obtain quantization characteristics corresponding to at least one channel;
after receiving the identification result corresponding to the image to be identified fed back by the server, the method further comprises the following steps: when the identification result is successful, executing resource transfer operation; the resource transfer operation is used for transferring the resource amount from the resource account of the operation initiator to the resource account of the receiver.
In particular, the image recognition method can be applied to resource transfer scenes, and the image to be recognized can be a human face image. When a user needs to execute the resource transfer operation, the amount of the resource to be transferred can be acquired through the terminal, and the resource transfer operation of the amount of the resource is triggered.
In one embodiment, a resource transfer application is installed on a terminal, a user enters the resource transfer application, a graphic code acquisition function is called in the resource transfer application to acquire a graphic code of a receiver so as to enter a resource transfer interface, and the amount of a resource to be transferred is input in the resource transfer interface, or the graphic code of the receiver is acquired to enter the resource transfer interface, wherein the resource transfer interface already comprises the amount of the resource to be transferred. The user can trigger the amount of the resource, so that the resource transfer application calls the camera to acquire the current face image.
In one embodiment, a user may enter a resource account of a recipient in a resource transfer application to enter a resource transfer interface where an amount of a resource to be transferred is entered.
And the terminal responds to the resource transfer triggering operation of the user on the resource amount and acquires the current face image through the camera. And the terminal carries out coding and quantization processing on the face image to obtain quantization characteristics corresponding to at least one channel. And the terminal compresses the quantization characteristics of the corresponding channels through the target value range and the corresponding discrete cumulative probability interval of each channel to obtain hidden layer characteristics respectively corresponding to each channel. The terminal can encode the hidden layer characteristics corresponding to each channel into a binary file and upload the binary file to a server corresponding to the terminal.
And the server receives the binary file for decoding to obtain the hidden layer characteristics corresponding to each channel. And the server decompresses the hidden layer characteristics of the corresponding channels through the target value ranges of the channels and the corresponding discrete cumulative probability intervals to obtain the decompressed characteristics of the corresponding channels. And performing feature recognition based on the decompression features respectively corresponding to the channels to obtain an identity recognition result corresponding to the face image, and feeding back the identity recognition result to the terminal.
And the terminal receives the identity recognition result, and executes resource transfer operation to transfer the resource amount from the resource account of the operation initiator to the resource account of the receiver when the identity recognition result is that the identity recognition is successful. And when the identity recognition result is that the identity recognition fails, not executing the resource transfer operation and prompting the identity recognition error.
In this embodiment, the image recognition method is applied to resource transfer: a face image is collected in response to a resource transfer triggering operation on the resource amount, and the face image is encoded, quantized and compressed by the terminal to obtain compressed hidden layer features. What is uploaded to the server are the compressed hidden layer features, so accurate image features can only be recovered with the corresponding decompression processing; this avoids the situation where directly uploading the face image to the server leaks the user's face image if the data is leaked. The server decompresses the hidden layer features to accurately obtain the decompressed features for identification; when the identification result corresponding to the face image indicates that identification succeeded, the resource transfer operation is executed, and when identification fails it is not, so the security of resource transfer can be improved.
Fig. 4 is a schematic flow chart of face recognition in one embodiment.
Step S402, deploying the target value range of the encoder, the quantizer and each channel and the corresponding discrete cumulative probability interval at the terminal; and deploying the target value range of each channel, the corresponding discrete cumulative probability interval and the characteristic identification network in a server.
Step S404, the terminal collects a face image a of user A through the camera, and obtains the hidden layer feature f through the encoding, quantization and compression processing of the encoder, the quantizer, and the target value range of each channel with its corresponding discrete cumulative probability interval.
And step S406, uploading the hidden layer features f to a server, and decompressing and identifying the hidden layer features by the server through the target value range of each channel, the corresponding discrete cumulative probability interval and the feature identification network to obtain the face features e.
Step S408, selecting the user B in the registry, and comparing the face feature e of the user A with the face feature of the user B to obtain the similarity j.
Step S410, judging whether the similarity j is higher than a preset threshold t; if so, executing step S412, that is, judging that user A and user B are the same person; otherwise, executing step S414, that is, judging that user A and user B are not the same person, and then executing step S416.
Step S416, determining whether the registered feature library further includes facial features of other users, if yes, selecting another user B from the registered feature library, returning to step S408 and continuing to execute, otherwise, ending the identification process.
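Steps S408–S416 form a registry search loop, which can be sketched as follows. This is a hypothetical illustration: the patent does not specify the similarity measure, so cosine similarity is assumed here, and the function and variable names are illustrative.

```python
import math

def cosine_similarity(a, b):
    # Assumed similarity measure between two face feature vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def identify(face_feature, registry, t=0.8):
    # Compare the recognized feature e against each registered user's
    # feature in turn; stop at the first similarity j above threshold t.
    for user, feature in registry.items():
        if cosine_similarity(face_feature, feature) > t:
            return user
    return None  # no registered user matched; identification fails
```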
In one embodiment, the image recognition method may be applied to a face payment scenario. And installing a payment application on the mobile phone, enabling the user to enter the payment application, and calling a graphic code acquisition function to acquire a graphic code of the receiver in the payment application so as to enter a payment interface. And inputting the payment amount in a payment interface, or entering the payment interface by acquiring the graphic code of the receiver, wherein the payment amount exists in the payment interface. The user can confirm the payment amount, so that the payment application calls the camera to acquire the current face image.
The mobile phone acquires a face image, and the face image is coded by the coder to obtain coding features corresponding to each channel. And inputting the coding features corresponding to the channels into a quantizer for quantization processing to obtain the quantization features corresponding to the channels.
And the mobile phone determines a matched character matched with the current character in the target value range corresponding to the corresponding channel for the current character in the quantization characteristics of the corresponding channel. And taking the corresponding interval of the matched character in the corresponding discrete cumulative probability interval as the probability interval corresponding to the current character.
And the mobile phone determines the coding interval corresponding to the current character according to the probability interval corresponding to the current character and the coding interval corresponding to the adjacent previous character, and continues to process the next character in the quantization characteristics of the corresponding channel until the coding interval corresponding to each character in the quantization characteristics is obtained.
For each channel, the mobile phone randomly selects a numerical value from the coding interval corresponding to the last character of the quantization feature corresponding to the corresponding channel. And taking the randomly selected numerical value as the hidden layer characteristics corresponding to the corresponding channel until the hidden layer characteristics corresponding to each channel are obtained.
The mobile phone uploads the hidden layer characteristics corresponding to each channel to the server, and the server receives the hidden layer characteristics corresponding to each channel.
The server determines a character probability interval corresponding to the hidden layer feature in the discrete cumulative probability interval of the corresponding channel for the hidden layer feature corresponding to each channel, and determines the first character in the hidden layer feature from the corresponding target value range based on the character probability interval.
The server takes the first character as a current character, and updates the discrete cumulative probability interval based on the character probability interval and the initial probability interval of the current character.
And the server determines the next character of the current character from the target value range based on the character probability interval corresponding to the hidden layer feature in the updated discrete cumulative probability interval.
And the server takes the next character as the current character, returns to the step of updating the discrete cumulative probability interval based on the character probability interval and the initial probability interval of the current character, and continues to execute until the last character in the hidden layer feature is obtained.
And the server takes each character in the hidden layer characteristics as the decompression characteristics of the corresponding channel.
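The server-side decoding described in the steps above can be sketched as a hypothetical inverse of the compression formula; the character count per channel is assumed to be known (for instance, fixed by the model architecture), and the names are illustrative.

```python
def decompress_channel(code, values, cum, length):
    # Repeatedly locate the character probability interval containing the
    # hidden layer value, emit the matching character from the target
    # value range, and rescale the value into that sub-interval.
    out = []
    for _ in range(length):
        for k, v in enumerate(values):
            low = cum[k - 1] if k > 0 else 0.0
            high = cum[k]
            if low < code <= high:
                out.append(v)
                code = (code - low) / (high - low)  # rescale for next character
                break
    return out
```

For the example channel from the compression section, a hidden layer value such as 0.245 (taken from the final coding interval) decodes back to the quantized feature [1, 2, 3, 4].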
The server inputs the decompression characteristics of each channel into the characteristic recognition network to perform characteristic recognition based on the decompression characteristics corresponding to each channel, so as to obtain an identity recognition result corresponding to the face image, and the identity recognition result is fed back to the mobile phone.
And the mobile phone receives the identity recognition result, and executes a payment amount transfer operation when the identity recognition result is that the identity recognition is successful so as to transfer the payment amount from the payment account of the operation initiator to the payment account of the receiver. And when the identity recognition result is that the identity recognition fails, not executing the payment operation and prompting the identity recognition error on the mobile phone.
In this embodiment, the image recognition method is applied to mobile phone face payment. The current face image of the user operating the mobile phone is collected before the payment amount is transferred, and the face image is encoded, quantized and compressed by the encoder and quantizer of the terminal to obtain compressed hidden layer features. What is uploaded to the server is the hidden layer features after compression processing, so the accurate image features can be obtained only through the corresponding decompression processing, which avoids the situation where the face image of the user is directly leaked due to data leakage when the face image is uploaded to the server directly. The server correspondingly decompresses the hidden layer features, accurately obtains the decompressed features for identity recognition, executes the payment operation when the identity recognition result corresponding to the face image indicates that identity recognition succeeded, and does not execute the payment operation when identity recognition failed, so the security of mobile phone face payment can be improved, and the private data of the user can be effectively protected during face payment.
In one embodiment, the image to be recognized is a face image; acquiring an image to be identified, comprising: responding to the triggering operation of the access control, and acquiring a face image;
after receiving the identification result corresponding to the image to be identified fed back by the server, the method further comprises the following steps: and when the identity recognition result corresponding to the face image is successful, controlling the access control terminal to execute an access control opening action.
Specifically, the image recognition method can be applied to an entrance guard control scene, and the image to be recognized can be a human face image. When the user needs to open the entrance guard, the entrance guard control flow of the terminal can be triggered.
The terminal responds to the triggering operation of the user on the access control, and acquires the current face image through the camera. And the terminal carries out coding and quantization processing on the face image to obtain quantization characteristics corresponding to at least one channel. And the terminal compresses the quantization characteristics of the corresponding channels through the target value range and the corresponding discrete cumulative probability interval of each channel to obtain hidden layer characteristics respectively corresponding to each channel. The terminal can encode the hidden layer characteristics corresponding to each channel into a binary file and upload the binary file to a server corresponding to the terminal.
And the server receives the binary file for decoding to obtain the hidden layer characteristics corresponding to each channel. And the server decompresses the hidden layer characteristics of the corresponding channels through the target value ranges of the channels and the corresponding discrete cumulative probability intervals to obtain the decompressed characteristics of the corresponding channels. And performing feature recognition based on the decompression features respectively corresponding to the channels to obtain an identity recognition result corresponding to the face image, and feeding back the identity recognition result to the terminal.
And the terminal receives the identity recognition result, and when the identity recognition result is that the identity recognition is successful, the terminal controls the entrance guard terminal to execute the entrance guard opening action. And when the identity recognition result is that the identity recognition fails, the entrance guard opening action is not executed, and the identity recognition error is prompted.
It is understood that the terminal may be an access terminal, or a mobile phone with an access control function, but is not limited thereto. When the terminal is an entrance guard terminal, the entrance guard terminal receives an identity recognition result fed back by the server, and when the identity recognition result is that the identity recognition is successful, the entrance guard terminal executes an entrance guard opening action.
In this embodiment, the image recognition method is applied to access control. A face image is collected in response to a trigger operation on the access control, and the face image is encoded, quantized and compressed by the terminal to obtain compressed hidden layer features. What is uploaded to the server is the hidden layer features after compression processing, so the accurate image features can be obtained only through the corresponding decompression processing, which avoids the face image of the user being directly leaked due to data leakage when it is uploaded to the server directly. The server correspondingly decompresses the hidden layer features and accurately obtains the decompressed features for identity recognition, and when the identity recognition result corresponding to the face image indicates that identity recognition succeeded, the access control terminal executes the access control opening action, so both the security of access control and the security of identity verification when opening the access control can be improved.
In one embodiment, as shown in fig. 5, an image recognition method is provided, which is described by taking the method as an example applied to the server in fig. 1, and includes the following steps:
step S502, receiving hidden layer characteristics corresponding to at least one channel, wherein the hidden layer characteristics corresponding to the at least one channel are obtained by coding and quantizing an image to be identified to obtain quantized characteristics corresponding to each channel, and compressing the quantized characteristics of the corresponding channel through a target value range and a corresponding discrete cumulative probability interval of each channel.
Specifically, the terminal acquires an image to be identified, and codes and quantizes the image to be identified to obtain a quantization feature corresponding to at least one channel. Compressing the quantization characteristics of the corresponding channels through the target value range and the corresponding discrete cumulative probability interval of each channel to obtain hidden layer characteristics respectively corresponding to each channel, and uploading the hidden layer characteristics respectively corresponding to each channel to a server. And the server receives all the hidden layer characteristics uploaded by the terminal.
In one embodiment, the terminal encodes each hidden layer feature into a binary file, and uploads the binary file to the server. And the server receives the binary file and decodes the binary file to obtain the hidden layer characteristics of each channel.
And step S504, decompressing the hidden layer characteristics of the corresponding channel through the target value range of each channel and the corresponding discrete cumulative probability interval to obtain the decompressed characteristics of the corresponding channel.
The server stores the target value range and the corresponding discrete cumulative probability interval of each channel in advance. And for the hidden layer characteristics of each channel, the server can decompress the hidden layer characteristics of the corresponding channel through the target value range and the corresponding discrete cumulative probability interval of the corresponding channel to obtain corresponding decompression characteristics, so that the decompression characteristics corresponding to each channel are obtained.
And S506, performing feature recognition based on the decompression features respectively corresponding to the channels to obtain an identity recognition result corresponding to the image to be recognized, and feeding back the identity recognition result to the terminal.
The identity recognition result comprises identity recognition success and identity recognition failure. The server stores the real image of the user and the corresponding image characteristics in advance, and performs characteristic identification on the basis of the decompression characteristics corresponding to the channels respectively to obtain identification characteristics. And calculating the similarity between the identification feature and the pre-stored image feature, and determining an identity identification result according to the similarity. Further, when the similarity between the identification feature and the pre-stored image feature is greater than a similarity threshold, it is determined that the image to be identified and the pre-stored image correspond to the same user, and the identity identification is successful. And when the similarity between the identification feature and the pre-stored image feature is smaller than or equal to a similarity threshold value, judging that the image to be identified and the pre-stored image are not the same user, and indicating that the identity identification fails. And after the identity recognition result corresponding to the image to be recognized is obtained, the server feeds the identity recognition result back to the terminal.
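The threshold comparison described above can be sketched as follows. This is a minimal illustration assuming a cosine-similarity measure and a threshold of 0.8; the embodiment does not fix a particular metric or threshold value, and the names `identify` and `threshold` are hypothetical.

```python
import numpy as np

def identify(recognition_feature, stored_feature, threshold=0.8):
    # Cosine similarity between the recognized feature and the pre-stored
    # image feature; identity recognition succeeds only when the similarity
    # exceeds the threshold, otherwise it fails.
    a = np.asarray(recognition_feature, dtype=float)
    b = np.asarray(stored_feature, dtype=float)
    similarity = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return "success" if similarity > threshold else "failure"
```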
In this embodiment, the terminal encodes and quantizes the image to be recognized to obtain the quantization feature corresponding to at least one channel, compresses the quantization feature of the corresponding channel according to the target value range and the corresponding discrete cumulative probability interval of each channel, and uploads the hidden layer feature of each channel obtained after compression to the server, so that user privacy can be effectively protected and the amount of transmitted data and the required bandwidth can be reduced. What is uploaded to the server is the hidden layer features after compression processing, so the accurate image features can be obtained only through the corresponding decompression processing, which avoids the image to be recognized being directly leaked due to data leakage when it is uploaded to the server directly. The server correspondingly decompresses the hidden layer features based on the pre-stored target value ranges of the channels and the corresponding discrete cumulative probability intervals, and accurately obtains the decompressed features for identity recognition, which effectively improves the accuracy of user identity recognition and the security of the recognition process, while also reducing the amount of storage on the server.
In an embodiment, as shown in fig. 6, decompressing the hidden layer feature of the corresponding channel according to the target value range of each channel and the corresponding discrete cumulative probability interval, to obtain the decompressed feature of the corresponding channel, includes:
step S602, for the hidden layer feature corresponding to each channel, determining a character probability interval corresponding to the hidden layer feature in the discrete cumulative probability interval of the corresponding channel, and determining a first character in the hidden layer feature from the corresponding target value range based on the character probability interval.
Specifically, for a hidden layer feature of a channel, the server determines a discrete cumulative probability interval corresponding to the channel, determines a range to which the hidden layer feature belongs in the discrete cumulative probability interval, and takes the range as a character probability interval corresponding to the first character. And determining a numerical value corresponding to the character probability interval in the target value range, and taking the numerical value as a first character in the hidden layer characteristic. In the same way, the first character in each hidden layer feature is obtained.
In step S604, the first character is taken as the current character, and the discrete cumulative probability interval is updated based on the character probability interval and the initial probability interval of the current character.
Specifically, the server uses the first character as a current character, determines an initial probability interval corresponding to the current character in the discrete cumulative probability interval, updates a value corresponding to the current character in the discrete cumulative probability interval according to the character probability interval and the initial probability interval of the current character, determines an initial probability interval corresponding to each character in the discrete cumulative probability interval for each character in the target value range, and updates a corresponding interval of each character in the discrete cumulative probability interval according to the character probability interval of the current character and the initial probability interval corresponding to each character to obtain an updated discrete cumulative probability interval.
Step S606, based on the character probability interval corresponding to the hidden layer feature in the updated discrete cumulative probability interval, determining the next character of the current character from the target value range.
Specifically, the range to which the hidden layer feature belongs in the updated discrete cumulative probability interval is used as the character probability interval corresponding to the next character of the current character. And determining a numerical value corresponding to the character probability interval in the target value range, and taking the numerical value as the next character of the current character. In the same way, the next character for each current character is obtained.
Step S608, taking the next character as the current character, returning to the step of updating the discrete cumulative probability interval based on the character probability interval and the initial probability interval of the current character, and continuing to execute until the last character in the hidden layer feature is obtained; and taking each character in the hidden layer feature as the decompression feature of the corresponding channel.
Specifically, the next character is taken as the current character, and the step of updating the discrete cumulative probability interval based on the character probability interval and the initial probability interval of the current character is returned to and executed again, stopping once the last character in the hidden layer feature has been obtained. In the same processing manner, each character in the hidden layer feature corresponding to each channel is obtained, and the characters in each hidden layer feature are taken as the decompression feature of the corresponding channel, thereby obtaining the decompression features corresponding to the channels.
In this embodiment, for the hidden layer feature corresponding to each channel, the character probability interval corresponding to the hidden layer feature within the discrete cumulative probability interval of the corresponding channel is determined, and the first character of the hidden layer feature is determined from the corresponding target value range based on that character probability interval. The first character is taken as the current character, and the discrete cumulative probability interval is updated based on the character probability interval and the initial probability interval of the current character. The next character is then determined from the target value range based on the character probability interval corresponding to the hidden layer feature in the updated discrete cumulative probability interval, that character becomes the current character, and the step of updating the discrete cumulative probability interval is repeated until the last character in the hidden layer feature is obtained. Each character in the hidden layer feature is taken as the decompression feature of the corresponding channel. Because the decompression exactly mirrors the compression processing, the decompression features corresponding to each channel can be recovered accurately, so that identity recognition can be carried out accurately on the server.
For example, the quantization feature of a channel is [1, 3], the target value range corresponding to the channel is [-1, 0, 1, 2, 3, 4], and the discrete cumulative probability interval corresponding to the target value range is [0, 0.2, 0.3, 0.5, 0.8, 1].
Compression treatment:
For the first character 1 in the quantization feature, the corresponding matching interval in the target value range is [0, 1] and the corresponding probability interval is [0.2, 0.3]. Taking the initial coding interval as [0, 1], the probability interval [0.2, 0.3] of the first character 1 and the initial coding interval [0, 1] are substituted into the compression formula, namely: low_i = low_(i-1) + (high_(i-1) - low_(i-1)) * L = 0 + (1 - 0) * 0.2 = 0.2; high_i = low_(i-1) + (high_(i-1) - low_(i-1)) * H = 0 + (1 - 0) * 0.3 = 0.3, resulting in the coding interval [0.2, 0.3] corresponding to the first character 1.
The 2 nd character in the quantization feature is 3, the matching interval corresponding to the character 3 in the target value range is [3, 4], and the probability interval corresponding to the discrete cumulative probability interval is [0.5, 0.8 ].
Taking 0.2 and 0.3 from the coding interval [0.2, 0.3] of the previous character as low_(i-1) and high_(i-1) in the compression formula respectively gives low_i = 0.2 + (0.3 - 0.2) * 0.5 = 0.25 and high_i = 0.2 + (0.3 - 0.2) * 0.8 = 0.28, resulting in the coding interval [0.25, 0.28] for character 3. The value 0.25 is selected from the coding interval [0.25, 0.28] as the hidden layer feature, and the hidden layer feature is uploaded to the server.
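The compression steps of this worked example can be sketched as a short routine implementing the interval-narrowing formula low_i = low_(i-1) + (high_(i-1) - low_(i-1)) * L (and similarly for high_i). The function name `compress` and the list-based lookup are illustrative assumptions, not the patent's implementation.

```python
def compress(quantized, value_range, cumulative):
    # Arithmetic-coding style interval narrowing over one channel's
    # quantization feature. `cumulative` has one entry per value in
    # `value_range`; the probability interval of the character at index i
    # is [cumulative[i-1], cumulative[i]], so the first value (-1 here)
    # only serves as a lower boundary, as in the worked example.
    low, high = 0.0, 1.0  # initial coding interval [0, 1]
    for ch in quantized:
        i = value_range.index(ch)
        L, H = cumulative[i - 1], cumulative[i]
        low, high = low + (high - low) * L, low + (high - low) * H
    return low, high

value_range = [-1, 0, 1, 2, 3, 4]
cumulative = [0, 0.2, 0.3, 0.5, 0.8, 1]
# Matches the worked example's coding interval [0.25, 0.28] for [1, 3].
low, high = compress([1, 3], value_range, cumulative)
```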
And (3) decompression processing:
The server deploys the target value range [-1, 0, 1, 2, 3, 4] corresponding to the channel and the discrete cumulative probability interval [0, 0.2, 0.3, 0.5, 0.8, 1] corresponding to the target value range. After the hidden layer feature 0.25 is received, it can be determined that the character probability interval of 0.25 in the discrete cumulative probability interval is (0.2, 0.3), and the value pair corresponding to the character probability interval (0.2, 0.3) in the target value range is (0, 1), so the first character is 1.
Taking 0.2 and 0.3 in the character probability interval (0.2 and 0.3) as the lower limit value and the upper limit value in the new discrete cumulative probability interval respectively, then for [ -1 and 0] in the target value range, the corresponding initial probability interval in the original discrete cumulative probability interval is [0 and 0.2], and the [ -1 and 0] corresponds to the new character probability interval as follows:
low+(high-low)*L=0.2+(0.3-0.2)*0=0.2
low+(high-low)*H=0.2+(0.3-0.2)*0.2=0.22
for [0, 1], the initial probability interval in the discrete cumulative probability interval is [0.2, 0.3], then the corresponding new character probability interval:
low+(high-low)*L=0.2+(0.3-0.2)*0.2=0.22
low+(high-low)*H=0.2+(0.3-0.2)*0.3=0.23
for [1, 2], the initial probability interval is [0.3,0.5], then the corresponding new character probability interval:
low+(high-low)*L=0.2+(0.3-0.2)*0.3=0.23
low+(high-low)*H=0.2+(0.3-0.2)*0.5=0.25
for [2, 3], the initial probability interval is [0.5, 0.8], then the corresponding new character probability interval:
low+(high-low)*L=0.2+(0.3-0.2)*0.5=0.25
low+(high-low)*H=0.2+(0.3-0.2)*0.8=0.28
for [3, 4], the initial probability interval is [0.8, 1], then the corresponding new character probability interval:
low+(high-low)*L=0.2+(0.3-0.2)*0.8=0.28
low+(high-low)*H=0.2+(0.3-0.2)*1=0.30
Therefore, the updated discrete cumulative probability interval becomes (0.2, 0.22, 0.23, 0.25, 0.28, 0.3). The character probability interval corresponding to the hidden layer feature 0.25 in the updated discrete cumulative probability interval is (0.25, 0.28), the value pair corresponding to the probability interval (0.25, 0.28) in the target value range is (2, 3), and the second character is 3, so the decompressed feature is [1, 3].
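The decompression side of the worked example can be sketched symmetrically: at each step, the sub-interval of the current coding interval that contains the received code value is found, the boundary value pair is mapped back to a character, and the interval is narrowed. `decompress` and its loop structure are illustrative assumptions.

```python
def decompress(code, value_range, cumulative, length):
    # Recover `length` characters from a single code value by repeatedly
    # locating the sub-interval of [low, high) that contains the code.
    low, high = 0.0, 1.0
    chars = []
    for _ in range(length):
        for k in range(len(cumulative) - 1):
            lo = low + (high - low) * cumulative[k]
            hi = low + (high - low) * cumulative[k + 1]
            if lo <= code < hi:
                # The interval (cumulative[k], cumulative[k+1]) maps to the
                # value pair (value_range[k], value_range[k+1]); the decoded
                # character is the upper value of that pair.
                chars.append(value_range[k + 1])
                low, high = lo, hi
                break
    return chars

value_range = [-1, 0, 1, 2, 3, 4]
cumulative = [0, 0.2, 0.3, 0.5, 0.8, 1]
# 0.25 is the value that was selected from the coding interval [0.25, 0.28].
assert decompress(0.25, value_range, cumulative, 2) == [1, 3]
```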
FIG. 7 is an architectural diagram illustrating an embodiment of testing with a target recognition model.
The image to be identified is input into the encoder to obtain the coding feature y, and the coding feature y is quantized by the quantizer to obtain the quantization feature ŷ corresponding to each channel. The quantization feature ŷ of each channel is input into the entropy network, and the entropy network compresses the quantization feature ŷ through the target value range of the corresponding channel and the corresponding discrete cumulative probability interval to obtain the hidden layer feature corresponding to each channel. The corresponding hidden layer features are then decompressed based on the target value range of the corresponding channel and the corresponding discrete cumulative probability interval to obtain the quantization feature ŷ corresponding to each channel. The quantization features ŷ corresponding to the channels output by the entropy network are taken as the input of the feature recognition model, and the identity recognition result corresponding to the image to be recognized is obtained as the output of the feature recognition model.
In one embodiment, as shown in fig. 8, a recognition model training method is provided, which can be implemented on a terminal or a server as shown in fig. 1. For example, taking the application of the method to the terminal in fig. 1 as an example, the method includes the following steps:
step S802, determining a recognition model to be trained, wherein the recognition model to be trained comprises an image encoder, a quantizer, an entropy network and a feature recognition network.
Specifically, a recognition model to be trained may be determined, including an image encoder, a quantizer, an entropy network, and a feature recognition network.
The recognition model to be trained can be deployed at the terminal so as to be trained at the terminal. The recognition model to be trained can also be deployed in a server to be trained in the server.
Step S804, a sample image and a corresponding identity label are obtained, and the sample image is sequentially coded and quantized through a coder and a quantizer in the recognition model to be trained, so that sample quantization characteristics are obtained.
Specifically, the terminal obtains a sample image and an identity label corresponding to the sample image, and inputs the sample image and the corresponding identity label into an identification model to be trained.
And the identification model to be trained encodes the sample image through an encoder and outputs the sample encoding characteristics corresponding to each channel. The sample coding features corresponding to each channel output by the encoder are used as input features corresponding to each channel of the quantizer, that is, the sample coding features of each channel output by the encoder are input into the corresponding channel of the quantizer to perform quantization processing on the coding features, so as to obtain the sample quantization features corresponding to each channel output by the quantizer.
In one embodiment, the encoder is a VAE (Variational Auto-Encoder) based encoder, which downsamples the sample image by 16 times, mainly by alternately stacking convolutional layers with a stride of 2 and a convolution kernel size of 5 and GDN layers. GDN is similar to Batch Normalization (BN) in that it uses learnable parameters to constrain the activations to a certain range of sizes; however, whereas the BN parameters are the same at every spatial position, the GDN constraint term at each spatial position is computed from all channels at that position, so GDN can better learn the statistical characteristics of the data. Moreover, GDN does not degenerate into a linear function at the test stage as BN does, so GDN is spatially adaptive and highly nonlinear. For example, the GDN layer of the encoder is processed by the following formula:
w_i^(k+1)(m, n) = w_i^(k)(m, n) / sqrt( β_i + Σ_j γ_ij * ( w_j^(k)(m, n) )^2 )
where w_i^(k)(m, n) represents the eigenvalue at position (m, n) on the ith channel of the kth stage, w_i^(k+1)(m, n) represents the eigenvalue at (m, n) on the ith channel of the (k+1)th stage, and β and γ are learnable parameters. i is the current channel of the current stage, j ranges over the channels of the current stage, and w_j^(k)(m, n) represents the eigenvalue at (m, n) on the jth channel of the kth stage. The output of each layer in the encoder is used as the input of the next layer until the output of the last layer of the encoder is obtained, i.e., the latent representation y.
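Under the GDN formula above, a single normalization layer can be sketched as follows. The shapes (channel-first feature map, per-channel β, channel-pair matrix γ) are illustrative assumptions; the patent does not specify the exact parameterization.

```python
import numpy as np

def gdn(w, beta, gamma):
    # Generalized divisive normalization over channels: each activation is
    # divided by a pooled term computed from the squared activations of ALL
    # channels at the same spatial position (m, n).
    # w: (C, H, W) feature map, beta: (C,), gamma: (C, C).
    norm = np.sqrt(beta[:, None, None] +
                   np.tensordot(gamma, w ** 2, axes=([1], [0])))
    return w / norm
```

With β fixed at 1 and γ at 0 the layer reduces to the identity, which is a convenient sanity check on the shapes.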
In one embodiment, the quantizer adds uniform noise to the latent representation y to approximate rounding-based quantization, thereby making the entire training process differentiable:
ŷ = y + Δy
where ŷ is the quantization feature corresponding to the coding feature y, and Δy is the added random noise value, whose value range is (0, 1), namely Δy ~ U(0, 1).
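A sketch of this additive-noise quantizer, assuming the noise range (0, 1) stated above (many published implementations use (-0.5, 0.5) instead) and assuming rounding is used at test time:

```python
import numpy as np

def quantize(y, training=True, rng=np.random.default_rng(0)):
    # During training, add uniform noise so the quantization step stays
    # differentiable; at test time, apply true rounding.
    if training:
        return y + rng.uniform(0.0, 1.0, size=np.shape(y))
    return np.round(y)
```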
Step S806, determining sample probability distribution corresponding to each pixel in the sample image based on the sample quantization characteristics through an entropy network, and determining sample image entropy corresponding to the sample image according to the sample probability distribution.
Specifically, the sample quantization characteristics corresponding to each channel output by the quantizer are input into the corresponding channel of the entropy network. And calculating sample probability distribution corresponding to each pixel in the corresponding sample quantization characteristics through each channel of the entropy network so as to obtain the sample probability distribution corresponding to each pixel output by each channel. And after the sample probability distribution corresponding to each pixel in the sample image is obtained, calculating the sample image entropy corresponding to the sample image according to the sample probability distribution corresponding to each pixel.
In one embodiment, determining sample image entropy corresponding to a sample image from a sample probability distribution comprises: and averaging the sample probability distribution corresponding to each pixel in the sample image, and taking the average as the sample image entropy corresponding to the sample image.
In one embodiment, determining sample image entropy corresponding to a sample image from a sample probability distribution comprises: and carrying out weighted summation on the sample probability distribution corresponding to each pixel in the sample image, and solving the average value, wherein the average value is used as the sample image entropy corresponding to the sample image.
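As a sketch of computing an image entropy from the per-pixel probability distributions, the information-theoretic average of -log2 p(i) can be used, which matches the entropy term of the target loss function later in this document; the plain-averaging embodiment above would simply average the probabilities instead. The function name is hypothetical.

```python
import numpy as np

def sample_image_entropy(pixel_probs):
    # Average information content of the pixels: mean of -log2 p(i).
    p = np.asarray(pixel_probs, dtype=float)
    return float(np.mean(-np.log2(p)))
```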
In one embodiment, the entropy network is processed as follows:
c = f_K ∘ f_(K-1) ∘ … ∘ f_1
p = f'_K · f'_(K-1) · … · f'_1
where c is the cumulative distribution function, i.e., the composition of the function of each layer of the entropy network, and p is the probability density function, i.e., the product of the derivatives of those functions.
The functional relationships in the entropy network are expressed as follows:
f_k(x) = g_k(H^(k) x + b^(k))
f_K(x) = sigmoid(H^(K) x + b^(K))
g_k(x) = x + a^(k) ⊙ tanh(x)
where K denotes the last layer and k denotes an intermediate layer; f_k(x) is the eigenvalue output by the intermediate layer, i.e., the g_k(x) calculated for that layer; and H^(k), b^(k), a^(k) and H^(K), b^(K) are learnable parameters.
The first layer: H^(k) x + b^(k) is first calculated from the sample quantization feature x output by the quantizer; then, denoting (H^(k) x + b^(k)) by x, g_k(x) = x + a^(k) ⊙ tanh(x) is calculated.
The second layer: the g_k(x) output by the previous layer is denoted by x, H^(k) x + b^(k) is calculated, (H^(k) x + b^(k)) is denoted by x, and g_k(x) = x + a^(k) ⊙ tanh(x) is calculated; this g_k(x) is f_k(x).
Each intermediate layer is processed in the same way: the output of the previous layer is denoted by x, H^(k) x + b^(k) is calculated, (H^(k) x + b^(k)) is denoted by x, and g_k(x) = x + a^(k) ⊙ tanh(x) is calculated; the g_k(x) output by the layer is f_k(x).
The last layer: the g_k(x) output by the previous layer is denoted by x, (H^(K) x + b^(K)) is calculated, and then f_K(x) = sigmoid(H^(K) x + b^(K)) is calculated; f_K(x) is the eigenvalue output by the last layer, and its value range is [0, 1].
In one embodiment, to ensure that the derivative (i.e., p = f'_K · f'_(K-1) · … · f'_1) is non-negative, c must be guaranteed to be monotonically increasing. The cumulative function is designed so that the following conditions are satisfied: the elements of each H^(k) are non-negative, and the elements of each a^(k) are no less than -1.
The probability distribution of the final sample quantization feature ŷ can then be obtained by the following formula, where U denotes the density function of the random noise added in the quantization step and p_y is the probability distribution of y:
p_ŷ(ŷ) = (p_y * U)(ŷ) = c(ŷ + 1/2) - c(ŷ - 1/2)
That is, (ŷ + 1/2) is denoted by the above x, and the processing from the first layer to the last layer is used to calculate the eigenvalue c(ŷ + 1/2) output by the last layer; (ŷ - 1/2) is likewise denoted by x, and the processing from the first layer to the last layer is used to calculate the eigenvalue c(ŷ - 1/2) output by the last layer. The difference between the two eigenvalues is taken as the probability distribution p(ŷ) of the pixel.
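The cumulative function and the bin-difference probability can be sketched with scalar per-layer parameters. The layer count, the scalar parameters and the ±1/2 bin boundaries are illustrative assumptions consistent with the formulas above, not the patent's exact network.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cumulative(x, Hs, bs, As):
    # c = f_K ∘ … ∘ f_1 with f_k(x) = g_k(H_k * x + b_k),
    # g_k(x) = x + a_k * tanh(x), and a sigmoid on the last layer.
    # Keeping Hs non-negative and As no less than -1 keeps c monotone
    # increasing, i.e., its derivative (the density p) non-negative.
    for H, b, a in zip(Hs[:-1], bs[:-1], As):
        x = H * x + b
        x = x + a * np.tanh(x)
    return sigmoid(Hs[-1] * x + bs[-1])

def pixel_prob(y_hat, Hs, bs, As):
    # p(ŷ) = c(ŷ + 1/2) - c(ŷ - 1/2): evaluate the cumulative function at
    # the quantization-bin boundaries and take the difference.
    return cumulative(y_hat + 0.5, Hs, bs, As) - cumulative(y_hat - 0.5, Hs, bs, As)
```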
And step S808, determining value ranges respectively corresponding to all channels in the recognition model to be trained through the entropy network and based on the sample quantization characteristics, and calculating the channel probability distribution respectively corresponding to all the value ranges.
Specifically, the entropy network is provided with a value range corresponding to each channel. And for each value range, respectively taking the lower limit value and the upper limit value in the value range as the input of the entropy network to obtain the channel probability distribution corresponding to the lower limit value and the upper limit value in each value range.
Step S810, determining the channel probability loss based on the channel probability distribution respectively corresponding to each value range.
Specifically, the channel probability loss is calculated according to the channel probability distribution corresponding to the lower limit value and the upper limit value in each value range.
And step S812, identifying the sample quantization characteristics through a characteristic identification network to obtain a sample identification result.
Specifically, sample quantization features respectively output by each channel of the quantizer are respectively used as corresponding inputs of corresponding channels in the feature identification network, and the feature identification network performs feature extraction, pooling, full connection and other processing on the sample quantization features to obtain identification features. The feature recognition network outputs a sample recognition result based on the recognition feature.
Step S814, determining image recognition loss based on the sample recognition result and the identity label, and constructing a target loss function according to the image recognition loss, the channel probability loss and the sample image entropy.
Specifically, the image recognition loss is calculated according to the sample recognition result and the corresponding identity label, and the image recognition loss, the channel probability loss and the sample image entropy are summed to obtain the target loss function.
In one embodiment, the image recognition loss, the channel probability loss and the sample image entropy are weighted and summed to obtain the target loss function.
For example, the target loss function is as follows:

L = L_id + λ · (−(1/N) ∑ log2 p(i)) + L_aux

where L is the target loss function, L_id is the image recognition loss, −(1/N) ∑ log2 p(i) is the sample image entropy, N is the number of pixels in the sample image, p(i) is the probability distribution of pixel i, L_aux is the channel probability loss, and λ is the weight corresponding to the sample image entropy.
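A direct transcription of this weighted sum can be sketched as follows (the function and argument names are illustrative, not from the patent):

```python
import math

def target_loss(l_id, pixel_probs, l_aux, lam):
    """L = L_id + lam * H + L_aux, where the sample image entropy is
    H = -(1/N) * sum_i log2 p(i) over the N pixel probabilities."""
    n = len(pixel_probs)
    entropy = -sum(math.log2(p) for p in pixel_probs) / n
    return l_id + lam * entropy + l_aux
```

For instance, with two pixels of probability 0.5 each, the entropy term is exactly 1 bit per pixel.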
Step S816, training the recognition model to be trained through the target loss function until the training stopping condition is reached, and obtaining a trained target recognition model; the target recognition model is used for carrying out identity recognition on the image to be recognized.
The training stopping condition may be at least one of: the loss error of the recognition model being less than or equal to a loss threshold, the number of iterations of the recognition model reaching a preset number of iterations, the iteration time reaching a preset duration, and the like.
Specifically, the terminal can train the recognition model to be trained through the target loss function, adjust parameters of the recognition model in the training process and continue training until the recognition model meets the target training stopping condition, and the trained target recognition model is obtained. The trained target recognition model is used for carrying out identity recognition on the image to be recognized so as to output an identity recognition result corresponding to the image to be recognized.
In one embodiment, parameters of the encoder, the quantizer, the entropy network and the feature recognition network in the recognition model are adjusted in the training process, and the training is continued until the recognition model meets the target training stopping condition, so that the trained target recognition model is obtained. The trained target recognition network comprises a trained encoder, a trained quantizer, a trained entropy network and a trained feature recognition network.
Further, the trained entropy network not only includes processing parameters of each layer, but also includes target value ranges respectively corresponding to each channel.
In this embodiment, the sample quantization features of the sample image are processed through an entropy network to obtain sample probability distributions corresponding to pixels in the sample image, so as to determine a sample image entropy corresponding to the sample image, and the sample image entropy is used as a part of a target loss function to determine a loss degree of key information of the sample image. Meanwhile, the channel probability loss is calculated according to the channel probability distribution corresponding to the value range of each channel of the entropy network, and can be used as a part of a target loss function, so that the value range of each channel is optimized in the training process, and the constraint on the output characteristics of the entropy network is realized. And the loss of the sample quantization characteristics between the prediction result and the real result in the characteristic identification network is used as a part of a target loss function, so that the training of the characteristic identification network is realized. The overall training of the recognition model is realized through a series of constraints, so that the trained target recognition model has higher prediction precision and accuracy.
The feature recognition network in this embodiment may be obtained by modifying the conventional IR18 network structure. To accommodate the changed number of input channels and feature size, the input-layer convolution of the conventional IR18 network is changed to a deconvolution, and the number of input channels is changed from 3 to 192. The feature recognition network in this embodiment also removes the first-stage module; the parameter changes of each layer are shown in the following table:
(Table of per-layer parameters of the modified IR18 feature recognition network; the table content is not recoverable from the source images.)
In the feature recognition network of this embodiment, a deconvolution-based residual structure is shown in fig. 9. The input features are first batch-normalized, and a 3 × 3 deconvolution with a stride of 2 is applied to the processed features. The deconvolved features are batch-normalized again, a 3 × 3 convolution with a stride of 1 is applied to the processed features, and batch normalization is performed once more. The input features are also resampled by a factor of 2, and the resampled features are fused with the batch-normalized features to obtain the output features of the residual structure.
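The spatial arithmetic of this residual structure can be checked with the standard (de)convolution size formulas. The padding and output-padding values below are assumptions, chosen so that the stride-2 deconvolution exactly doubles the feature size and therefore matches a 2×-resampled skip path:

```python
def deconv_out(size, k=3, stride=2, pad=1, out_pad=1):
    """Spatial size after a k x k transposed convolution (deconvolution)."""
    return (size - 1) * stride - 2 * pad + k + out_pad

def conv_out(size, k=3, stride=1, pad=1):
    """Spatial size after a k x k ordinary convolution."""
    return (size - k + 2 * pad) // stride + 1

def residual_block_out(size):
    """Main path: BN -> 3x3 deconv (stride 2) -> BN -> 3x3 conv (stride 1)
    -> BN; skip path: resample the input by 2x. Both paths must agree
    before fusion."""
    main = conv_out(deconv_out(size))
    skip = size * 2
    assert main == skip, "paths must have matching sizes before fusion"
    return main
```

With these choices a 7 × 7 input yields a 14 × 14 output on both paths, so the element-wise fusion is well defined.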
In one embodiment, calculating the channel probability distribution corresponding to each value range includes: calculating the channel probability distribution corresponding to the upper limit value and the lower limit value in each value range;
determining channel probability loss based on the channel probability distribution respectively corresponding to each value range, comprising: and determining the channel probability loss according to the channel probability distribution corresponding to the upper limit value and the lower limit value of each value range.
Specifically, the entropy network is provided with a value range corresponding to each channel. And for each value range, respectively taking the lower limit value and the upper limit value in the value range as the input of the entropy network to obtain the channel probability distribution corresponding to the lower limit value and the upper limit value in each value range.
For each upper limit value, the difference between the channel probability distribution corresponding to that upper limit value and a preset value is calculated, giving the difference corresponding to each upper limit value. The differences corresponding to the upper limit values are summed, the channel probability distributions corresponding to the lower limit values are summed, and the sum of the two is taken as the channel probability loss of the entropy network.
For example, the channel probability loss is:
Laux=∑(cumulative(lower)-0)+∑(cumulative(upper)-1)
where L_aux is the channel probability loss, cumulative(lower) is the channel probability distribution corresponding to the lower limit value, and cumulative(upper) is the channel probability distribution corresponding to the upper limit value.
In this embodiment, for the value ranges of each channel in the entropy network, only the channel probability distributions corresponding to the upper and lower limits of each value range are calculated, so that the channel probability loss is calculated according to the channel probability distributions corresponding to the upper and lower limits, thereby reducing the calculation amount and increasing the processing speed.
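The loss over the interval endpoints can be sketched as follows. The patent writes sum(cumulative(lower) − 0) + sum(cumulative(upper) − 1); absolute values are used here, which is an assumption, so that both terms penalize deviation in either direction rather than rewarding it:

```python
def channel_probability_loss(cum_lowers, cum_uppers):
    """L_aux pushes the cumulative value at each channel's lower limit
    toward 0 and at each upper limit toward 1.

    Assumption: absolute deviations are summed, a common non-negative
    variant of the sum(c(lower) - 0) + sum(c(upper) - 1) form above."""
    return (sum(abs(c - 0.0) for c in cum_lowers)
            + sum(abs(c - 1.0) for c in cum_uppers))
```

Only two cumulative evaluations per channel are needed, which matches the reduced calculation amount described above.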
FIG. 10 is a block diagram of a recognition model to be trained according to an embodiment. Each sample image A, B, C, D or the like is input to the encoder to obtain a coding feature corresponding to each sample image, and each coding feature is quantized by the quantizer to obtain a corresponding quantization feature.
And respectively inputting the quantization characteristics into an entropy network and a characteristic identification network, determining sample probability distribution corresponding to each pixel in the sample image based on the sample quantization characteristics through the entropy network, and determining sample image entropy corresponding to the sample image according to the sample probability distribution.
And determining the value ranges respectively corresponding to all channels in the recognition model to be trained based on the sample quantization characteristics through an entropy network, and calculating the channel probability distribution respectively corresponding to all the value ranges. And determining the probability loss of the channel based on the probability distribution of the channel corresponding to each value range.
The quantized features of each sample are recognized through the feature recognition network to obtain the sample recognition result corresponding to each sample image. The image recognition loss is determined based on each sample recognition result and the corresponding identity label, and the target loss function is constructed according to the image recognition loss, the channel probability loss and the sample image entropy.
Training the recognition model to be trained through a target loss function until the training stopping condition is reached, and obtaining a trained target recognition model; the target recognition model is used for carrying out identity recognition on the image to be recognized.
In one embodiment, the trained target recognition model includes target value ranges respectively corresponding to the channels; the method further comprises the following steps:
determining channel probability distribution corresponding to each value in each target value range based on the target value range corresponding to each channel through an entropy network in the target identification model; for the target value range corresponding to each channel, calculating a discrete cumulative probability interval corresponding to the corresponding target value range according to the channel probability distribution corresponding to each value in the corresponding target value range;
the target value range and the corresponding discrete cumulative probability interval are used for compressing the quantization features corresponding to the image to be recognized into the hidden layer features, and decompressing the hidden layer features corresponding to the image to be recognized so as to obtain the identity recognition result corresponding to the image to be recognized based on the decompression result.
The trained target recognition network includes a trained entropy network. The trained entropy network includes not only the processing parameters of each layer but also the target value range corresponding to each channel. Each value in a target value range is an integer. After the target value range corresponding to each channel is obtained, for each target value range, each value in the range is used as the input of the trained entropy network to obtain the channel probability distribution corresponding to that value. A discrete cumulative probability interval is formed according to the channel probability distributions corresponding to the values in the target value range. By performing this processing for every target value range, the discrete cumulative probability interval corresponding to each target value range is obtained.
In one embodiment, calculating a discrete cumulative probability interval corresponding to a corresponding target value range according to a channel probability distribution corresponding to each value in the corresponding target value range includes:
for each target value range, calculating discrete cumulative probability corresponding to the current value according to the channel probability distribution corresponding to the current value in the target value range and the channel probability distribution corresponding to each value before the current value; and forming a discrete cumulative probability interval according to the discrete cumulative probability corresponding to each value in the target value range. Each discrete accumulation probability in the discrete accumulation probability interval corresponds to each value in the target value range one by one.
Further, the channel probability distribution corresponding to the current value in the target value range is determined, the channel probability distributions corresponding to the values before the current value are determined, and the channel probability distribution corresponding to the current value and the channel probability distributions corresponding to the values before the current value are summed to obtain the discrete cumulative probability corresponding to the current value.
That is:

cdf(i) = ∑_{j ≤ i} p(j)

where cdf(i) is the discrete cumulative probability corresponding to the current value i, and p(j) is the channel probability distribution corresponding to the j-th value in the target value range.
For example, if the target value range is [6, 7, 8, 9, 10] and the corresponding channel probability distribution set is [0, 0.2, 0.2, 0.2, 0.4], the corresponding discrete cumulative probability interval is [0, 0.2, 0.4, 0.6, 1]. The target value range, the channel probability distribution set and the discrete cumulative probability interval correspond to one another element by element: the channel probability distribution corresponding to the value 6 in the target value range is 0 and its discrete cumulative probability is 0; the channel probability distribution corresponding to the value 7 is 0.2 and its discrete cumulative probability is 0.2.
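The running-sum construction of the discrete cumulative probability interval can be sketched as follows (function name is illustrative):

```python
def discrete_cdf(pmf):
    """cdf(i) = sum_{j <= i} p(j): running sums over the channel
    probability distribution of one target value range."""
    cdf, total = [], 0.0
    for p in pmf:
        total += p
        cdf.append(round(total, 10))  # rounding hides tiny float drift
    return cdf
```

Applied to the channel probability distribution set of the example above, it reproduces the stated discrete cumulative probability interval.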
The trained target recognition model includes the target value range corresponding to each channel, and the discrete cumulative probability interval corresponding to each target value range can be obtained through the entropy network. After these intervals are obtained, they can be deployed on both the terminal and the server. The terminal can then accurately compress the quantized features corresponding to the image to be recognized into hidden layer features by using the discrete cumulative probability intervals corresponding to the target value ranges, and upload the hidden layer features to the server. The target value ranges and corresponding discrete cumulative probability intervals on the server are used to decompress the hidden layer features so as to accurately recover the compressed data, and the identity recognition result corresponding to the image to be recognized is obtained based on the decompression result.
In one embodiment, the target recognition model comprises a first sub-model and a second sub-model, wherein the first sub-model is deployed at the terminal, and the second sub-model is deployed at the server;
the first sub-model comprises an encoder, a quantizer, a target value range of each channel and a corresponding discrete accumulation probability interval, and the target value range of each channel and the corresponding discrete accumulation probability interval in the first sub-model are used for compressing quantization characteristics corresponding to the image to be recognized into hidden layer characteristics; the second submodel comprises a target value range of each channel, a corresponding discrete accumulation probability interval and a feature identification network; and the target value range and the corresponding discrete accumulation probability interval of each channel in the second submodel are used for decompressing the hidden layer characteristics corresponding to the image to be recognized.
The trained target recognition model comprises a first submodel and a second submodel. And deploying the first sub-model at the terminal, and deploying the second sub-model at the server. And the terminal inputs the image to be recognized into the first submodel to obtain the hidden layer characteristics corresponding to each channel output by the first submodel. And the terminal uploads the hidden layer characteristics corresponding to each channel to the server, and the server inputs the hidden layer characteristics corresponding to each channel into a corresponding channel in the second submodel for processing to obtain an identity recognition result output by the second submodel.
Further, the first sub-model comprises an encoder, a quantizer, a target value range of each channel and a corresponding discrete cumulative probability interval. And the terminal inputs the image to be identified into the encoder to perform feature coding and outputs the coding features corresponding to each channel. And the coding characteristics corresponding to each channel output by the encoder are used as the input characteristics corresponding to each channel of the quantizer, namely the coding characteristics of each channel output by the encoder are input into the corresponding channel of the quantizer to carry out quantization processing on the coding characteristics so as to obtain the quantization characteristics corresponding to each channel output by the quantizer. And compressing the quantization characteristics corresponding to each channel output by the quantizer by using the target value range and the discrete cumulative probability interval of the corresponding channel respectively to obtain the hidden layer characteristics corresponding to each output channel.
And the terminal uploads the hidden layer characteristics corresponding to each channel to the server, and the server inputs the hidden layer characteristics corresponding to each channel into the second submodel. And the second submodel decompresses the hidden layer characteristics of the corresponding channels by using the target value range and the discrete cumulative probability interval of each channel to obtain the decompressed characteristics corresponding to each channel. And taking the decompression characteristics corresponding to each channel as the input of the corresponding channel in the characteristic identification network, and carrying out characteristic identification on each decompression characteristic through the characteristic identification network to obtain the identification characteristics. The feature recognition network outputs an identity recognition result based on the similarity between the recognition feature and the pre-stored image feature. And the server feeds back the identity recognition result input by the feature recognition network to the terminal.
In this embodiment, the target recognition model includes a first sub-model and a second sub-model, which are deployed on the terminal and the server respectively, so that the terminal processes the image to be recognized into hidden layer features through the first sub-model. The data uploaded to the server are hidden layer features that have been compressed using the target value range and the discrete cumulative probability interval of each channel in the first sub-model. Because the target value range and the corresponding discrete cumulative probability interval of each channel are deployed in advance in the first sub-model and the second sub-model, even if the hidden layer features are leaked during data transmission, the data before compression cannot be accurately recovered without these value ranges and intervals for decompression, so the privacy information of the user can be effectively protected.
In one embodiment, there is provided an image recognition method including:
and the terminal acquires an image to be identified, and codes the image to be identified through the coder to obtain the coding characteristics corresponding to each channel.
And the terminal inputs the coding characteristics corresponding to each channel into the quantizer to carry out quantization processing, so as to obtain the quantization characteristics corresponding to each channel.
And the terminal determines a matched character matched with the current character in the target value range corresponding to the corresponding channel for the current character in the quantization characteristics of the corresponding channel.
And the terminal takes the corresponding interval of the matched character in the corresponding discrete cumulative probability interval as the probability interval corresponding to the current character.
And the terminal determines the coding interval corresponding to the current character according to the probability interval corresponding to the current character and the coding interval corresponding to the adjacent previous character, and continues to process the next character in the quantization characteristics of the corresponding channel until the coding interval corresponding to each character in the quantization characteristics is obtained.
For each channel, the terminal randomly selects a value from the coding interval corresponding to the last character of the quantization feature corresponding to the corresponding channel.
And the terminal takes the randomly selected numerical value as the hidden layer characteristics corresponding to the corresponding channel until the hidden layer characteristics corresponding to each channel are obtained.
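The terminal-side steps above describe an arithmetic-coding-style interval narrowing. A toy exact-arithmetic sketch follows; the names are illustrative, and the final value is taken as the interval midpoint rather than a random draw (the patent notes any value in the final interval decompresses identically):

```python
from fractions import Fraction

def compress_channel(chars, value_range, cdf):
    """Narrow [low, high) once per character of the quantized feature;
    any number inside the final coding interval identifies the whole
    sequence and serves as the channel's hidden layer feature."""
    low, high = Fraction(0), Fraction(1)
    for ch in chars:
        i = value_range.index(ch)                      # matched character
        p_lo = Fraction(cdf[i - 1]).limit_denominator() if i > 0 else Fraction(0)
        p_hi = Fraction(cdf[i]).limit_denominator()    # probability interval
        width = high - low
        low, high = low + width * p_lo, low + width * p_hi
    return (low + high) / 2     # one value from the last coding interval
```

Using the example above (range [6, 7, 8, 9, 10], cdf [0, 0.2, 0.4, 0.6, 1]), the sequence [7, 9] narrows to the interval [0.08, 0.12), so a single number such as 0.1 encodes both characters.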
The terminal uploads the hidden layer characteristics corresponding to each channel to the server, and the server receives the hidden layer characteristics corresponding to each channel.
The server determines a character probability interval corresponding to the hidden layer feature in the discrete cumulative probability interval of the corresponding channel for the hidden layer feature corresponding to each channel, and determines the first character in the hidden layer feature from the corresponding target value range based on the character probability interval.
The server takes the first character as a current character, and updates the discrete cumulative probability interval based on the character probability interval and the initial probability interval of the current character.
And the server determines the next character of the current character from the target value range based on the character probability interval corresponding to the hidden layer feature in the updated discrete cumulative probability interval.
The server takes the next character as the current character, returns to the step of updating the discrete cumulative probability interval based on the character probability interval and the initial probability interval of the current character, and continues to execute until the last character in the hidden layer feature is obtained.
And the server takes each character in the hidden layer characteristics as the decompression characteristics of the corresponding channel.
The server inputs the decompression characteristics of each channel into the characteristic recognition network to perform characteristic recognition based on the decompression characteristics corresponding to each channel, so as to obtain an identity recognition result corresponding to the image to be recognized, and the identity recognition result is fed back to the terminal.
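The server-side steps invert that interval narrowing. A matching sketch (again with illustrative names and exact rational arithmetic as an assumption; the number of characters per channel is assumed known to the server):

```python
from fractions import Fraction

def decompress_channel(code, n_chars, value_range, cdf):
    """Recover n_chars characters from one hidden-layer value: find the
    probability sub-interval containing the code, emit the matching
    character from the target value range, and rescale."""
    bounds = [Fraction(0)] + [Fraction(c).limit_denominator() for c in cdf]
    out, low, high = [], Fraction(0), Fraction(1)
    for _ in range(n_chars):
        pos = (code - low) / (high - low)
        # largest character index whose cumulative lower bound is <= pos
        i = max(j for j in range(len(value_range)) if bounds[j] <= pos)
        out.append(value_range[i])
        width = high - low
        low, high = low + width * bounds[i], low + width * bounds[i + 1]
    return out
```

Fed the value 0.1 produced by the compression example above, it recovers the original sequence [7, 9], illustrating why any value inside the final coding interval decompresses to the same characters.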
In this embodiment, the image to be recognized is encoded and quantized through the encoder and the quantizer to obtain the quantized features corresponding to each channel, the quantized features of each channel are compressed through the target value range and the corresponding discrete cumulative probability interval of that channel, and the hidden layer features obtained after compression are uploaded to the server, which effectively protects user privacy and reduces the transmitted data volume and bandwidth. What is uploaded to the server are the compressed hidden layer features, so accurate image features can be obtained only through the corresponding decompression processing; this avoids the problem that the image to be recognized is directly exposed if data leaks while the image itself is uploaded to the server. The server decompresses the hidden layer features based on the pre-stored target value range and corresponding discrete cumulative probability interval of each channel, accurately obtaining the decompressed features corresponding to each channel. Identity recognition is performed through the feature recognition network based on the decompressed features of each channel, and the identity recognition result is returned to the terminal, effectively ensuring the accuracy of user identity recognition and the security of the recognition process.
It should be understood that although the various steps in the flowcharts of fig. 2-10 are shown in sequence as indicated by the arrows, these steps are not necessarily performed in that sequence. Unless explicitly stated otherwise herein, the order of the steps is not strictly limited, and they may be performed in other orders. Moreover, at least some of the steps in fig. 2-10 may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 11, an image recognition apparatus 1100 is provided, which may be a part of a computer device using a software module or a hardware module, or a combination of the two modules, and specifically includes: an acquisition module 1102, a compression module 1104, an upload module 1106, and a result receiving module 1108, wherein:
the obtaining module 1102 is configured to obtain an image to be identified, and perform coding and quantization processing on the image to be identified to obtain a quantization feature corresponding to at least one channel.
And the compression module 1104 is configured to compress the quantization features of the corresponding channels according to the target value ranges of the channels and the corresponding discrete cumulative probability intervals to obtain hidden layer features corresponding to each channel.
An uploading module 1106, configured to upload the hidden layer features corresponding to each channel to the server, where the uploaded hidden layer features are used to instruct the server to decompress the hidden layer features based on a pre-stored target value range and a corresponding discrete cumulative probability interval of each channel, and perform identity recognition based on a decompression result.
And a result receiving module 1108, configured to receive an identity recognition result corresponding to the image to be recognized, where the identity recognition result is fed back by the server.
In this apparatus, the image to be recognized is obtained and then encoded and quantized to obtain the quantized feature corresponding to at least one channel; the quantized feature of each channel is compressed through the target value range and the corresponding discrete cumulative probability interval of that channel, and the hidden layer features obtained after compression are uploaded to the server, which effectively protects user privacy and reduces the transmitted data volume and bandwidth. What is uploaded to the server are the compressed hidden layer features, so accurate image features can be obtained only through the corresponding decompression processing; this avoids the problem that the image to be recognized is directly exposed if data leaks while the image itself is uploaded to the server. The server decompresses the hidden layer features based on the pre-stored target value range and corresponding discrete cumulative probability interval of each channel, accurately obtains the decompressed features for identity recognition, and returns the identity recognition result to the terminal, effectively ensuring the accuracy of user identity recognition and the security of the recognition process.
In an embodiment, the compressing module 1104 is further configured to, for the quantized feature corresponding to each channel, determine a probability interval corresponding to each character in the quantized feature corresponding to the corresponding channel through the target value range and the discrete cumulative probability interval corresponding to the corresponding channel; determining a coding interval corresponding to each character respectively based on the probability interval corresponding to each character; and for each channel, determining the hidden layer characteristics corresponding to the corresponding channel according to the coding interval corresponding to each character corresponding to the corresponding channel.
In this embodiment, for the quantization feature corresponding to each channel, a probability interval corresponding to each character in the quantization feature corresponding to the corresponding channel is determined through a target value range and a discrete cumulative probability interval corresponding to the corresponding channel, a coding interval corresponding to each character is determined based on the probability interval corresponding to each character, for each channel, a hidden layer feature corresponding to the corresponding channel is determined according to the coding interval corresponding to each character corresponding to the corresponding channel, the quantization feature corresponding to each channel can be compressed, so as to reduce the data amount of the quantization feature, facilitate uploading the quantization feature to a server in the form of a compressed feature, and avoid leakage of the quantization feature in a transmission process.
In one embodiment, the compressing module 1104 is further configured to, for a current character in the quantization feature of the corresponding channel, determine a matching character, which matches the current character, in the target value range corresponding to the corresponding channel; taking the corresponding interval of the matched character in the corresponding discrete cumulative probability interval as the probability interval corresponding to the current character; and determining the coding interval corresponding to the current character according to the probability interval corresponding to the current character and the coding interval corresponding to the adjacent previous character, and continuously processing the next character in the quantization characteristics of the corresponding channel until the coding interval corresponding to each character in the quantization characteristics is obtained.
In this embodiment, for the current character in the quantization feature of the corresponding channel, a matching character that matches the current character is determined in the target value range of that channel, and the interval corresponding to the matching character in the corresponding discrete cumulative probability interval is used as the probability interval of the current character. The coding interval of the current character is then determined according to its probability interval and the coding interval of the adjacent previous character. The next character in the quantization feature of the channel is processed in the same way until the coding interval of every character in the quantization feature is obtained. Compressing the quantization feature of each channel in this manner reduces the data volume of the quantization feature, makes it convenient to upload the quantization feature to the server in compressed form, reduces the bandwidth used, and improves security during data transmission.
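A minimal sketch of this interval-narrowing step (classic arithmetic coding over one channel's quantized characters) may clarify the flow. The function name, the example value range and the probability bounds below are illustrative assumptions, not values taken from the embodiment:

```python
def encode_intervals(characters, value_range, cumulative):
    """For each character, look up its probability interval in the
    channel's discrete cumulative probability interval, then narrow
    the coding interval accordingly.  The first character's coding
    interval is derived from the initial interval (0, 1); each later
    one from the coding interval of the adjacent previous character."""
    low, high = 0.0, 1.0          # initial coding interval
    intervals = []
    for ch in characters:
        i = value_range.index(ch)                 # matching character
        p_low, p_high = cumulative[i], cumulative[i + 1]
        width = high - low
        # new coding interval: the sub-interval of the previous coding
        # interval that corresponds to this character's probability interval
        low, high = low + width * p_low, low + width * p_high
        intervals.append((low, high))
    return intervals

# Assumed target value range and cumulative probability bounds for one channel
value_range = [-2, -1, 0, 1, 2]
cumulative = [0.0, 0.1, 0.3, 0.7, 0.9, 1.0]
quantized = [0, 1, 0]             # example quantized feature of one channel
print(encode_intervals(quantized, value_range, cumulative))
```

Each successive coding interval lies strictly inside the previous one, so the final interval uniquely identifies the whole character sequence.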
In one embodiment, the compressing module 1104 is further configured to, for each channel, randomly select a value from the coding interval corresponding to the last character of the quantization feature corresponding to the corresponding channel; and taking the randomly selected numerical value as the hidden layer characteristics corresponding to the corresponding channel until the hidden layer characteristics corresponding to each channel are obtained.
In this embodiment, a value is randomly selected from the coding interval corresponding to the last character in the quantization feature, and the randomly selected value is used as the hidden layer feature of the channel, which increases the randomness of data selection within a constrained range. The particular value selected from the coding interval of the last character does not affect subsequent decompression: whichever value in that coding interval is taken as the hidden layer feature, decompression recovers the same character sequence, so the accuracy of decompression is guaranteed.
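As a sketch of this selection step, assuming the final coding interval has already been computed (the numbers below are illustrative):

```python
import random

def pick_hidden_feature(final_interval, rng=random.random):
    """Randomly select one value from the coding interval of the last
    character.  Any value inside this interval maps back to the same
    character sequence during decompression, so the random choice does
    not affect the decompression result."""
    low, high = final_interval
    return low + (high - low) * rng()

# Deterministic demonstration with the RNG pinned for reproducibility
print(pick_hidden_feature((0.604, 0.636), rng=lambda: 0.5))
```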
In one embodiment, the image recognition apparatus is implemented by a target recognition model comprising a first sub-model and a second sub-model; the first sub-model is deployed at the terminal, and the second sub-model is deployed at the server. The first sub-model comprises an encoder, a quantizer, and the target value range and corresponding discrete cumulative probability interval of each channel; the second sub-model comprises the target value range and corresponding discrete cumulative probability interval of each channel and a feature recognition network.
In this embodiment, the image recognition method is implemented by a target recognition model comprising a first sub-model and a second sub-model, which are deployed on the terminal and the server respectively, so that the terminal processes the image to be processed into hidden layer features through the first sub-model. The data uploaded to the server are hidden layer features, which have been compressed through the target value range and discrete cumulative probability interval of each channel in the first sub-model. Even if the hidden layer features are leaked during data transmission, the data before compression cannot be accurately recovered without the corresponding decompression scheme, so the private information of the user can be effectively protected.
In one embodiment, the apparatus further comprises:
the model determining module is used for determining a recognition model to be trained, and the recognition model to be trained comprises an image encoder, a quantizer, an entropy network and a feature recognition network;
the processing module is used for acquiring a sample image and a corresponding identity label, and sequentially coding and quantizing the sample image through a coder and a quantizer in the identification model to be trained to obtain sample quantization characteristics;
the probability distribution determining module is used for determining sample probability distribution corresponding to each pixel in the sample image based on the sample quantization characteristics through an entropy network and determining sample image entropy corresponding to the sample image according to the sample probability distribution;
the calculation module is used for determining the value ranges respectively corresponding to all channels in the recognition model to be trained through the entropy network and based on the sample quantization characteristics, and calculating the channel probability distribution respectively corresponding to all the value ranges;
the probability loss determining module is used for determining the probability loss of the channel based on the probability distribution of the channel corresponding to each value range;
the identification module is used for identifying and processing the quantized characteristics of the sample through a characteristic identification network to obtain a sample identification result;
the construction module is used for determining image identification loss based on the sample identification result and the identity label and constructing a target loss function according to the image identification loss, the channel probability loss and the sample image entropy;
and the training module is used for training the recognition model to be trained through the target loss function until the training stopping condition is reached, so as to obtain the trained target recognition model.
The trained target recognition model comprises target value ranges respectively corresponding to the channels, and the discrete cumulative probability intervals respectively corresponding to the target value ranges of the channels are determined by the entropy network in the trained target recognition model based on the target value ranges of the corresponding channels.
In this embodiment, the sample quantization features of the sample image are processed through an entropy network to obtain the sample probability distribution corresponding to each pixel in the sample image, from which the sample image entropy corresponding to the sample image is determined; the sample image entropy serves as one part of the target loss function and measures the degree of loss of key information of the sample image. Meanwhile, the channel probability loss, calculated from the channel probability distributions corresponding to the value ranges of the channels of the entropy network, serves as another part of the target loss function, so that the value range of each channel is optimized during training, realizing a constraint on the output features of the entropy network. Finally, the loss between the prediction result of the feature recognition network on the sample quantization features and the ground-truth identity label serves as a further part of the target loss function, realizing the training of the feature recognition network. Through this series of constraints, the recognition model is trained as a whole, so that the trained target recognition model has higher prediction precision and accuracy.
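The composition of the target loss function described above can be sketched as a weighted sum of the three named terms. The entropy formulation and the weights `alpha` and `beta` are assumptions for illustration; the patent does not give concrete formulas or weights:

```python
import math

def sample_image_entropy(pixel_probs):
    """One plausible entropy term: negative log-probability summed over
    the per-pixel sample probability distribution (assumed form)."""
    return -sum(math.log2(max(p, 1e-12)) for p in pixel_probs)

def target_loss(id_loss, channel_prob_loss, pixel_probs,
                alpha=1.0, beta=0.01):
    """Target loss = image recognition loss + channel probability loss
    + sample image entropy, with assumed weighting coefficients."""
    return id_loss + alpha * channel_prob_loss + beta * sample_image_entropy(pixel_probs)

print(target_loss(1.0, 0.5, [0.5, 0.5]))
```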
In one embodiment, the image to be recognized is a face image; the obtaining module 1102 is further configured to respond to a resource transfer triggering operation on the resource amount, and acquire a face image; coding and quantizing the face image to obtain quantization characteristics corresponding to at least one channel;
the device also includes: a resource transfer module; the resource transfer module is used for executing resource transfer operation when the identification result is successful; and the resource transfer operation is used for transferring the resource amount from the resource account of the operation initiator to the resource account of the receiver.
In this embodiment, the image recognition method is applied to resource transfer. A face image is collected in response to a resource transfer triggering operation on a resource amount, and the face image is encoded, quantized and compressed by the terminal to obtain compressed hidden layer features. What is uploaded to the server are the compressed hidden layer features, so accurate image features can only be obtained through the corresponding decompression processing; this avoids the situation where the user's face image is directly leaked through data leakage, as could occur if the face image itself were uploaded to the server. The server decompresses the hidden layer features accordingly to accurately obtain the decompressed features for identity recognition. The resource transfer operation is executed only when the identity recognition result corresponding to the face image indicates that identity recognition succeeded, and is not executed otherwise, which improves the security of resource transfer.
In one embodiment, the image to be recognized is a face image; the obtaining module 1102 is further configured to collect a face image in response to a trigger operation for controlling the access control;
the device also includes: an access control module; and the access control module is used for controlling the access control terminal to execute the access control opening action when the identification result corresponding to the face image is successful.
In this embodiment, the image recognition method is applied to access control. A face image is collected in response to a trigger operation on the access control, and the face image is encoded, quantized and compressed by the terminal to obtain compressed hidden layer features. What is uploaded to the server are the compressed hidden layer features, so accurate image features can only be obtained through the corresponding decompression processing; this avoids the situation where the user's face image is directly leaked through data leakage, as could occur if the face image itself were uploaded to the server. The server decompresses the hidden layer features accordingly and accurately obtains the decompressed features for identity recognition; when the identity recognition result corresponding to the face image indicates success, the access control terminal executes the access control opening action, which improves the security of access control and of identity verification when the access control is opened.
In one embodiment, as shown in fig. 12, an image recognition apparatus 1200 is provided, which may be a part of a computer device using a software module or a hardware module, or a combination of the two modules, and specifically includes: a feature receiving module 1202, a decompression module 1204, and a feedback module 1206, wherein:
the feature receiving module 1202 is configured to receive hidden layer features corresponding to at least one channel, where the hidden layer features corresponding to the at least one channel are obtained by encoding and quantizing an image to be recognized to obtain quantized features corresponding to the channels, and compressing the quantized features of the corresponding channels according to a target value range and a corresponding discrete cumulative probability interval of each channel;
a decompression module 1204, configured to decompress the hidden layer characteristics of the corresponding channel according to the target value range of each channel and the corresponding discrete cumulative probability interval, to obtain decompressed characteristics of the corresponding channel;
the feedback module 1206 is configured to perform feature recognition based on the decompression features respectively corresponding to the channels, obtain an identity recognition result corresponding to the image to be recognized, and feed back the identity recognition result to the terminal.
In this embodiment, the terminal encodes and quantizes the image to be recognized to obtain the quantization feature corresponding to at least one channel, compresses the quantization feature of each channel according to the target value range and corresponding discrete cumulative probability interval of that channel, and uploads the resulting hidden layer features to the server, which effectively protects user privacy and reduces the amount of transmitted data and the bandwidth used. What is uploaded to the server are the compressed hidden layer features, so accurate image features can only be obtained through the corresponding decompression processing; this avoids the image to be recognized being directly leaked through data leakage, as could happen if it were uploaded to the server directly. The server decompresses the hidden layer features based on the pre-stored target value range and corresponding discrete cumulative probability interval of each channel, and accurately obtains the decompressed features for identity recognition, effectively improving both the accuracy of user identity recognition and the security of the recognition process.
In one embodiment, the decompression module 1204 is further configured to determine, for each hidden layer feature corresponding to each channel, a character probability interval corresponding to the hidden layer feature in the discrete cumulative probability interval of the corresponding channel, and determine, based on the character probability interval, a first character in the hidden layer feature from a corresponding target value range; taking the first character as a current character, and updating a discrete cumulative probability interval based on a character probability interval and an initial probability interval of the current character; determining the next character of the current character from the target value range based on the character probability interval corresponding to the hidden layer feature in the updated discrete cumulative probability interval; taking the next character as the current character, returning to the probability interval and the initial probability interval based on the current character, updating the step of discrete cumulative probability interval and continuing to execute until the last character in the hidden layer characteristic is obtained; and taking each character in the hidden layer characteristics as the decompression characteristics of the corresponding channel.
In this embodiment, for the hidden layer feature of each channel, the character probability interval corresponding to the hidden layer feature within the discrete cumulative probability interval of that channel is determined, and the first character of the hidden layer feature is determined from the corresponding target value range based on that character probability interval. Taking the first character as the current character, the discrete cumulative probability interval is updated based on the character probability interval and initial probability interval of the current character, and the next character is determined from the target value range based on the character probability interval corresponding to the hidden layer feature within the updated discrete cumulative probability interval. This step is repeated, taking each next character as the current character, until the last character of the hidden layer feature is obtained, and the characters of the hidden layer feature are taken as the decompressed feature of the corresponding channel. Because this decompression mirrors the compression processing, the decompressed feature of each channel can be recovered accurately, so that identity recognition can be performed accurately on the server.
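A sketch of this decompression loop may help; it uses an equivalent formulation that rescales the received value instead of the interval on each step (names, value range and probability bounds are illustrative assumptions). Each iteration locates the character whose probability interval contains the value, which mirrors the interval-narrowing done during compression:

```python
def decode_hidden_feature(value, value_range, cumulative, length):
    """Recover the character sequence of one channel from a single
    hidden-layer value inside the final coding interval."""
    low, high = 0.0, 1.0
    out = []
    for _ in range(length):
        width = high - low
        scaled = (value - low) / width          # position inside current interval
        # find the character whose probability interval contains 'scaled'
        for i in range(len(value_range)):
            if cumulative[i] <= scaled < cumulative[i + 1]:
                out.append(value_range[i])
                # narrow to that character's sub-interval, as the encoder did
                low, high = (low + width * cumulative[i],
                             low + width * cumulative[i + 1])
                break
    return out

# Assumed tables shared by terminal and server for one channel
print(decode_hidden_feature(0.62, [-2, -1, 0, 1, 2],
                            [0.0, 0.1, 0.3, 0.7, 0.9, 1.0], 3))
```

Any value inside the final coding interval of the encoder yields the same character sequence here, which is why the terminal may pick that value at random.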
In one embodiment, as shown in fig. 13, there is provided a recognition model training apparatus 1300, which may be a part of a computer device using a software module or a hardware module, or a combination of the two, the apparatus specifically includes: model determination module 1302, processing module 1304, probability distribution determination module 1306, calculation module 1308, probability loss determination module 1310, recognition module 1312, construction module 1314, and training module 1316, wherein:
the model determining module 1302 is configured to determine a recognition model to be trained, where the recognition model to be trained includes an image encoder, a quantizer, an entropy network, and a feature recognition network.
And the processing module 1304 is configured to obtain a sample image and a corresponding identity tag, and sequentially encode and quantize the sample image through an encoder and a quantizer in the recognition model to be trained to obtain a sample quantization feature.
And a probability distribution determining module 1306, configured to determine, through the entropy network and based on the sample quantization features, sample probability distributions respectively corresponding to the pixels in the sample image, and determine, according to the sample probability distributions, sample image entropies corresponding to the sample images.
The calculating module 1308 is configured to determine, through the entropy network and based on the sample quantization features, value ranges respectively corresponding to the channels in the recognition model to be trained, and calculate channel probability distributions respectively corresponding to the value ranges.
A probability loss determining module 1310, configured to determine a channel probability loss based on channel probability distributions respectively corresponding to the value ranges.
And the identifying module 1312 is configured to perform identification processing on the quantized features of the sample through a feature identification network to obtain a sample identification result.
And a constructing module 1314, configured to determine an image recognition loss based on the sample recognition result and the identity tag, and construct a target loss function according to the image recognition loss, the channel probability loss, and the sample image entropy.
The training module 1316 is configured to train the recognition model to be trained through the target loss function until a training stop condition is reached, so as to obtain a trained target recognition model; the target recognition model is used for performing identity recognition on the image to be recognized.
In this embodiment, the sample quantization features of the sample image are processed through an entropy network to obtain the sample probability distribution corresponding to each pixel in the sample image, from which the sample image entropy corresponding to the sample image is determined; the sample image entropy serves as one part of the target loss function and measures the degree of loss of key information of the sample image. Meanwhile, the channel probability loss, calculated from the channel probability distributions corresponding to the value ranges of the channels of the entropy network, serves as another part of the target loss function, so that the value range of each channel is optimized during training, realizing a constraint on the output features of the entropy network. Finally, the loss between the prediction result of the feature recognition network on the sample quantization features and the ground-truth identity label serves as a further part of the target loss function, realizing the training of the feature recognition network. Through this series of constraints, the recognition model is trained as a whole, so that the trained target recognition model has higher prediction precision and accuracy.
In an embodiment, the calculating module 1308 is further configured to calculate channel probability distributions corresponding to the upper limit value and the lower limit value in each value range;
the probability loss determining module 1310 is further configured to determine the channel probability loss according to the channel probability distributions corresponding to the upper limit value and the lower limit value of each value range.
In this embodiment, for the value ranges of each channel in the entropy network, only the channel probability distributions corresponding to the upper and lower limits of each value range are calculated, so that the channel probability loss is calculated according to the channel probability distributions corresponding to the upper and lower limits, thereby reducing the calculation amount and increasing the processing speed.
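One plausible reading of this boundary-only computation is that only the cumulative distribution at the lower and upper limit of each channel's value range is evaluated, and the loss penalises probability mass falling outside the range. The exact formula is not given in the text, so the sketch below is an assumption:

```python
def channel_probability_loss(cdf, value_ranges):
    """Assumed formulation: for each channel's value range, evaluate
    the CDF only at the two limits and sum the mass outside the range
    (mass below the lower limit plus mass above the upper limit)."""
    loss = 0.0
    for lower, upper in value_ranges:
        loss += cdf(lower) + (1.0 - cdf(upper))
    return loss

# Toy CDF of a uniform distribution on [-3, 3], for demonstration only
uniform_cdf = lambda x: min(max((x + 3.0) / 6.0, 0.0), 1.0)
print(channel_probability_loss(uniform_cdf, [(-2, 2)]))
```

Evaluating only two points per channel, rather than the whole range, is what keeps the computation cheap and the processing fast.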
In one embodiment, the trained target recognition model includes target value ranges respectively corresponding to the channels; the probability distribution determining module 1306 is further configured to determine, through the entropy network in the target recognition model, a channel probability distribution corresponding to each value in each target value range based on the target value range corresponding to each channel; for the target value range corresponding to each channel, calculating a discrete cumulative probability interval corresponding to the corresponding target value range according to the channel probability distribution corresponding to each value in the corresponding target value range; the target value range and the corresponding discrete cumulative probability interval are used for compressing the quantization features corresponding to the image to be recognized into the hidden layer features, and decompressing the hidden layer features corresponding to the image to be recognized so as to obtain the identity recognition result corresponding to the image to be recognized based on the decompression result.
The trained target recognition model includes the target value range corresponding to each channel, and the discrete cumulative probability interval corresponding to each target value range can be obtained through the entropy network. The discrete cumulative probability interval of each target value range is deployed at both the terminal and the server, so that the terminal, using these intervals, can accurately compress the quantization features corresponding to the image to be recognized into hidden layer features and upload them to the server. The target value ranges and corresponding discrete cumulative probability intervals at the server are used to decompress the hidden layer features, so that the compressed data can be accurately recovered and the identity recognition result corresponding to the image to be recognized can be obtained based on the decompression result.
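The step of turning per-value channel probabilities into a discrete cumulative probability interval can be sketched as a normalised running sum. The function name is illustrative; the key point is that the terminal and the server deploy identical bounds so compression and decompression agree exactly:

```python
def cumulative_intervals(channel_probs):
    """Convert the channel probability of each value in a target value
    range into cumulative bounds covering [0, 1].  The i-th character's
    probability interval is (bounds[i], bounds[i+1])."""
    total = sum(channel_probs)
    bounds, acc = [0.0], 0.0
    for p in channel_probs:
        acc += p / total
        bounds.append(acc)
    bounds[-1] = 1.0   # guard against floating-point drift at the top end
    return bounds

# Assumed per-value probabilities for a 5-value target range
print(cumulative_intervals([1, 2, 4, 2, 1]))
```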
In one embodiment, the target recognition model comprises a first sub-model and a second sub-model, wherein the first sub-model is deployed at the terminal, and the second sub-model is deployed at the server; the first sub-model comprises an encoder, a quantizer, a target value range of each channel and a corresponding discrete accumulation probability interval, and the target value range of each channel and the corresponding discrete accumulation probability interval in the first sub-model are used for compressing quantization characteristics corresponding to the image to be recognized into hidden layer characteristics; the second submodel comprises a target value range of each channel, a corresponding discrete accumulation probability interval and a feature identification network; and the target value range and the corresponding discrete accumulation probability interval of each channel in the second submodel are used for decompressing the hidden layer characteristics corresponding to the image to be recognized.
In this embodiment, the target recognition model includes a first sub-model and a second sub-model, which are deployed on the terminal and the server respectively, so that the terminal processes the image to be processed into hidden layer features through the first sub-model. The data uploaded to the server are hidden layer features, which have been compressed through the target value range and discrete cumulative probability interval of each channel in the first sub-model. Since the target value range and corresponding discrete cumulative probability interval of each channel are pre-deployed in the first sub-model and the second sub-model, even if the hidden layer features are leaked during data transmission, the data before compression cannot be accurately recovered without these value ranges and intervals to perform the decompression, so the private information of the user can be effectively protected.
For specific limitations of the image recognition apparatus and the recognition model training apparatus, reference may be made to the above limitations of the image recognition method and the recognition model training method, which are not described herein again. The modules in the image recognition device and the recognition model training device can be wholly or partially realized by software, hardware and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a server, and its internal structure diagram may be as shown in fig. 14. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used to store image recognition data and recognition model training data. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement an image recognition method and a recognition model training method.
In one embodiment, a computer device is provided, which may be a terminal, and its internal structure diagram may be as shown in fig. 15. The computer device includes a processor, a memory, a communication interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The communication interface of the computer device is used for carrying out wired or wireless communication with an external terminal, and the wireless communication can be realized through WIFI, an operator network, NFC (near field communication) or other technologies. The computer program is executed by a processor to implement an image recognition method and a recognition model training method. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
It will be appreciated by those skilled in the art that the configurations shown in fig. 14 and 15 are only block diagrams of partial configurations relevant to the present application, and do not constitute a limitation on the computer device to which the present application is applied, and a particular computer device may include more or less components than those shown in the figures, or may combine some components, or have a different arrangement of components.
In one embodiment, a computer device is further provided, which includes a memory and a processor, the memory stores a computer program, and the processor implements the steps of the above method embodiments when executing the computer program.
In an embodiment, a computer-readable storage medium is provided, in which a computer program is stored which, when being executed by a processor, carries out the steps of the above-mentioned method embodiments.
In one embodiment, a computer program product or computer program is provided that includes computer instructions stored in a computer-readable storage medium. The computer instructions are read by a processor of a computer device from a computer-readable storage medium, and the computer instructions are executed by the processor to cause the computer device to perform the steps in the above-mentioned method embodiments.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware; the computer program can be stored in a non-volatile computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database or other medium used in the embodiments provided herein can include at least one of non-volatile and volatile memory. Non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical storage, or the like. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM), among others.
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments express only several implementations of the present application, and their description is comparatively specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, and these fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (15)

1. An image recognition method, characterized in that the method comprises:
acquiring an image to be identified, and coding and quantizing the image to be identified to obtain quantization characteristics corresponding to at least one channel;
compressing the quantization characteristics of the corresponding channels through the target value range and the corresponding discrete cumulative probability interval of each channel to obtain hidden layer characteristics corresponding to each channel;
uploading the hidden layer characteristics corresponding to each channel to a server, wherein the uploaded hidden layer characteristics are used for instructing the server to decompress the hidden layer characteristics based on a pre-stored target value range and corresponding discrete cumulative probability interval of each channel, and to perform identity recognition based on the decompression result;
and receiving an identity recognition result which is fed back by the server and corresponds to the image to be recognized.
2. The method according to claim 1, wherein the compressing the quantized features of the corresponding channels through the target value ranges of the channels and the corresponding discrete cumulative probability intervals to obtain hidden layer features corresponding to each channel respectively comprises:
for the quantization feature corresponding to each channel, determining a probability interval corresponding to each character in the quantization feature corresponding to the corresponding channel through a target value range and a discrete cumulative probability interval corresponding to the corresponding channel;
determining a coding interval corresponding to each character respectively based on a probability interval corresponding to each character;
and for each channel, determining the hidden layer characteristics corresponding to the corresponding channel according to the coding interval corresponding to each character corresponding to the corresponding channel.
3. The method of claim 2, wherein determining the probability interval corresponding to each character in the quantized features corresponding to the corresponding channel according to the target value range and the discrete cumulative probability interval corresponding to the corresponding channel comprises:
determining a matched character matched with the current character in a target value range corresponding to a corresponding channel for the current character in the quantization characteristics of the corresponding channel;
taking the corresponding interval of the matched character in the corresponding discrete cumulative probability interval as the probability interval corresponding to the current character;
the determining, based on the probability interval corresponding to each character, the coding interval corresponding to each character includes:
and determining the coding interval corresponding to the current character according to the probability interval corresponding to the current character and the coding interval corresponding to the adjacent previous character, and continuing to process the next character in the quantization characteristics of the corresponding channel until the coding interval corresponding to each character in the quantization characteristics is obtained.
4. The method according to claim 2, wherein for each channel, determining the hidden layer feature corresponding to the corresponding channel according to the coding interval corresponding to each character corresponding to the corresponding channel respectively comprises:
for each channel, randomly selecting a numerical value from a coding interval corresponding to the last character of the quantization characteristic corresponding to the corresponding channel;
and taking the randomly selected numerical value as the hidden layer characteristics corresponding to the corresponding channel until the hidden layer characteristics corresponding to each channel are obtained.
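Claims 2–4 together describe an arithmetic-coding-style scheme: each character of a channel's quantization feature is looked up in the channel's discrete cumulative probability table, the resulting probability interval narrows the running coding interval, and a single value drawn from the final interval serves as that channel's hidden layer feature. A minimal illustrative sketch in Python (the function names, the flat probability table, and the midpoint selection are assumptions, not the patented implementation):

```python
def build_intervals(value_range, probs):
    """Map each character in the target value range to its discrete
    cumulative probability interval [low, high)."""
    intervals, low = {}, 0.0
    for ch, p in zip(value_range, probs):
        intervals[ch] = (low, low + p)
        low += p
    return intervals

def compress(chars, intervals):
    """Narrow the coding interval character by character (claims 2-3),
    then pick one value from the final interval (claim 4 selects it
    randomly; the midpoint is used here for determinism)."""
    low, high = 0.0, 1.0
    for ch in chars:
        c_low, c_high = intervals[ch]
        width = high - low
        low, high = low + width * c_low, low + width * c_high
    return (low + high) / 2  # the channel's "hidden layer feature"
```

For example, encoding the three-character sequence [0, 1, 2] under probabilities [0.5, 0.25, 0.25] narrows the interval to [0.34375, 0.375), and any value in that range decodes back to the same sequence.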
5. The method of claim 1, wherein the image recognition method is implemented by a target recognition model comprising a first sub-model and a second sub-model; the first sub-model is deployed at a terminal, and the second sub-model is deployed at a server; the first sub-model comprises an encoder, a quantizer, and the target value range and corresponding discrete cumulative probability interval of each channel; the second sub-model comprises the target value range and corresponding discrete cumulative probability interval of each channel, and a feature recognition network.
6. The method of claim 5, wherein the target recognition model is determined by a training step; the training step comprises:
determining a recognition model to be trained, wherein the recognition model to be trained comprises an image encoder, a quantizer, an entropy network and a feature recognition network;
acquiring a sample image and a corresponding identity label, and sequentially encoding and quantizing the sample image through the image encoder and the quantizer in the recognition model to be trained to obtain sample quantization characteristics;
determining sample probability distribution corresponding to each pixel in the sample image based on the sample quantization characteristics through the entropy network, and determining sample image entropy corresponding to the sample image according to the sample probability distribution;
determining the value ranges respectively corresponding to all channels in the recognition model to be trained based on the sample quantization characteristics through the entropy network, and calculating the channel probability distribution respectively corresponding to all the value ranges;
determining the probability loss of the channel based on the probability distribution of the channel corresponding to each value range;
performing recognition processing on the sample quantization characteristics through the feature recognition network to obtain a sample recognition result;
determining image recognition loss based on the sample recognition result and the identity label, and constructing a target loss function according to the image recognition loss, the channel probability loss and the sample image entropy;
training the recognition model to be trained through the target loss function until the training stopping condition is reached, and obtaining a trained target recognition model;
the trained target recognition model comprises target value ranges respectively corresponding to the channels, and the discrete cumulative probability intervals respectively corresponding to the target value ranges of the channels are determined by the entropy network in the trained target recognition model based on the target value ranges of the corresponding channels.
7. The method according to any one of claims 1 to 6, characterized in that the image to be recognized is a human face image; the acquiring an image to be identified, and encoding and quantizing the image to be identified to obtain quantization characteristics corresponding to at least one channel includes:
responding to a resource transfer triggering operation for a resource amount, and acquiring a face image;
coding and quantizing the face image to obtain quantization characteristics corresponding to at least one channel;
after the receiving of the identity recognition result, fed back by the server, corresponding to the image to be recognized, the method further comprises:
when the identity recognition result indicates success, executing a resource transfer operation; the resource transfer operation is used for transferring the resource amount from a resource account of the operation initiator to a resource account of the receiver.
8. The method according to any one of claims 1 to 6, characterized in that the image to be recognized is a human face image; the acquiring of the image to be recognized includes:
responding to the triggering operation of the access control, and acquiring a face image;
after the receiving of the identity recognition result, fed back by the server, corresponding to the image to be recognized, the method further comprises:
when the identity recognition result corresponding to the face image indicates success, controlling the access control terminal to execute an access control opening action.
9. An image recognition method, characterized in that the method comprises:
receiving hidden layer characteristics corresponding to at least one channel, wherein the hidden layer characteristics corresponding to the at least one channel are obtained by coding and quantizing an image to be recognized to obtain quantization characteristics corresponding to each channel, and then compressing the quantization characteristics of the corresponding channel through the target value range and corresponding discrete cumulative probability interval of each channel;
decompressing the hidden layer characteristics of the corresponding channel through the target value range of each channel and the corresponding discrete cumulative probability interval to obtain the decompressed characteristics of the corresponding channel;
and performing feature recognition based on the decompression features respectively corresponding to the channels to obtain an identity recognition result corresponding to the image to be recognized, and feeding back the identity recognition result to the terminal.
10. The method according to claim 9, wherein the decompressing the hidden layer feature of the corresponding channel through the target value range and the corresponding discrete cumulative probability interval of each channel to obtain the decompressed feature of the corresponding channel comprises:
for the hidden layer characteristics corresponding to each channel, determining a character probability interval corresponding to the hidden layer characteristics in the discrete cumulative probability interval of the corresponding channel, and determining the first character in the hidden layer characteristics from the corresponding target value range based on the character probability interval;
taking the first character as the current character, and updating the discrete cumulative probability interval based on the character probability interval and the initial probability interval of the current character;
determining the next character after the current character from the target value range based on the character probability interval corresponding to the hidden layer characteristics in the updated discrete cumulative probability interval;
taking the next character as the current character, returning to the step of updating the discrete cumulative probability interval based on the character probability interval and the initial probability interval of the current character, and continuing to execute until the last character in the hidden layer characteristics is obtained;
and taking each character in the hidden layer characteristics as the decompression characteristics of the corresponding channel.
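Claim 10's decompression mirrors the encoder: find which character's cumulative interval contains the encoded value, emit that character, renormalize the interval, and repeat until the full sequence is recovered. A sketch under hypothetical assumptions (the interval table and character count are supplied externally here purely for illustration):

```python
def decompress(code, intervals, n_chars):
    """Recover the character sequence from one encoded value by
    repeatedly locating the sub-interval that contains the value and
    renormalizing (the 'update the discrete cumulative probability
    interval' step of claim 10)."""
    chars, low, high = [], 0.0, 1.0
    for _ in range(n_chars):
        width = high - low
        pos = (code - low) / width  # position within the current interval
        for ch, (c_low, c_high) in intervals.items():
            if c_low <= pos < c_high:
                chars.append(ch)
                low, high = low + width * c_low, low + width * c_high
                break
    return chars
```

With the table {0: (0.0, 0.5), 1: (0.5, 0.75), 2: (0.75, 1.0)}, the encoded value 0.359375 decodes back to the sequence [0, 1, 2].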
11. A recognition model training method, the method comprising:
determining a recognition model to be trained, wherein the recognition model to be trained comprises an image encoder, a quantizer, an entropy network and a feature recognition network;
acquiring a sample image and a corresponding identity label, and sequentially encoding and quantizing the sample image through the image encoder and the quantizer in the recognition model to be trained to obtain sample quantization characteristics;
determining sample probability distribution corresponding to each pixel in the sample image based on the sample quantization characteristics through the entropy network, and determining sample image entropy corresponding to the sample image according to the sample probability distribution;
determining the value ranges respectively corresponding to all channels in the recognition model to be trained based on the sample quantization characteristics through the entropy network, and calculating the channel probability distribution respectively corresponding to all the value ranges;
determining the probability loss of the channel based on the probability distribution of the channel corresponding to each value range;
performing recognition processing on the sample quantization characteristics through the feature recognition network to obtain a sample recognition result;
determining image recognition loss based on the sample recognition result and the identity label, and constructing a target loss function according to the image recognition loss, the channel probability loss and the sample image entropy;
training the recognition model to be trained through the target loss function until the training stopping condition is reached, and obtaining a trained target recognition model; the target recognition model is used for carrying out identity recognition on the image to be recognized.
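Claim 11 builds the target loss from three named terms: the image recognition loss, the channel probability loss, and the sample image entropy. The claim states only that they are combined; a hypothetical weighted sum (the weights alpha and beta are assumptions, not part of the claim):

```python
def target_loss(recognition_loss, channel_prob_loss, image_entropy,
                alpha=1.0, beta=1.0):
    """Combine the three loss terms named in claim 11 into one objective.
    alpha and beta are hypothetical trade-off weights balancing
    compression rate (entropy terms) against recognition accuracy."""
    return recognition_loss + alpha * channel_prob_loss + beta * image_entropy
```

Lowering beta would bias training toward recognition accuracy at the cost of a larger compressed (hidden layer) representation, and vice versa.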
12. The method according to claim 11, wherein the calculating the channel probability distribution corresponding to each value range includes:
calculating the channel probability distribution corresponding to the upper limit value and the lower limit value in each value range;
the determining the channel probability loss based on the channel probability distribution respectively corresponding to each value range includes:
and determining the channel probability loss according to the channel probability distribution corresponding to the upper limit value and the lower limit value of each value range.
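Claim 12 derives the channel probability loss from the channel probability distribution evaluated at the upper and lower limit values of each channel's value range. The exact formula is not given; one purely illustrative reading penalizes probability mass that the entropy model places outside each range (the function name, CDF interface, and log penalty are all assumptions):

```python
import math

def channel_probability_loss(cdf, value_ranges):
    """Score each channel by the probability mass its entropy model
    places between the lower and upper limit of the channel's value
    range; mass leaking outside the range is penalized."""
    loss = 0.0
    for low, high in value_ranges:
        mass = cdf(high) - cdf(low)        # mass inside [low, high]
        loss -= math.log(max(mass, 1e-9))  # 0 when all mass is inside
    return loss / len(value_ranges)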
13. The method of claim 11, wherein the trained target recognition model includes a target value range corresponding to each of the channels; the method further comprises the following steps:
determining channel probability distribution corresponding to each value in each target value range based on the target value range corresponding to each channel through an entropy network in the target identification model;
for the target value range corresponding to each channel, calculating a discrete cumulative probability interval corresponding to the corresponding target value range according to the channel probability distribution corresponding to each value in the corresponding target value range;
the target value range and the corresponding discrete cumulative probability interval are used for compressing the quantization features corresponding to the image to be recognized into hidden layer features, and decompressing the hidden layer features corresponding to the image to be recognized so as to obtain the identity recognition result corresponding to the image to be recognized based on the decompression result.
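Claim 13's conversion from per-value channel probabilities to discrete cumulative probability intervals can be sketched as a running sum; the normalization so that the intervals tile [0, 1) is an assumption (the claim does not state it), and the function name is hypothetical:

```python
def cumulative_intervals(value_range, channel_probs):
    """Turn the channel probability distribution over a target value
    range into the discrete cumulative probability intervals shared by
    the compressor (first sub-model) and decompressor (second sub-model)."""
    total = sum(channel_probs)
    intervals, cum = {}, 0.0
    for value, p in zip(value_range, channel_probs):
        nxt = cum + p / total  # normalize so the intervals tile [0, 1)
        intervals[value] = (cum, nxt)
        cum = nxt
    return intervals
```

Because both sub-models hold the same target value ranges and run this same conversion, the terminal's compressor and the server's decompressor agree on every interval boundary without exchanging the table itself.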
14. The method of claim 13, wherein the target recognition model comprises a first sub-model and a second sub-model, the first sub-model being deployed at a terminal, the second sub-model being deployed at a server;
the first sub-model comprises an encoder, a quantizer, and the target value range and corresponding discrete cumulative probability interval of each channel, wherein the target value range and corresponding discrete cumulative probability interval of each channel in the first sub-model are used for compressing the quantization features corresponding to an image to be recognized into hidden layer features; the second sub-model comprises the target value range and corresponding discrete cumulative probability interval of each channel, and a feature recognition network; and the target value range and corresponding discrete cumulative probability interval of each channel in the second sub-model are used for decompressing the hidden layer features corresponding to the image to be recognized.
15. An image recognition apparatus, characterized in that the apparatus comprises:
the device comprises an acquisition module, a processing module and a processing module, wherein the acquisition module is used for acquiring an image to be identified, and coding and quantizing the image to be identified to obtain quantization characteristics corresponding to at least one channel;
the compression module is used for compressing the quantization characteristics of the corresponding channels through the target value range of each channel and the corresponding discrete cumulative probability interval to obtain hidden layer characteristics corresponding to each channel;
the uploading module is used for uploading the hidden layer characteristics corresponding to each channel to a server, wherein the uploaded hidden layer characteristics are used for instructing the server to decompress the hidden layer characteristics based on the pre-stored target value range and corresponding discrete cumulative probability interval of each channel, and to perform identity recognition based on the decompression result;
and the result receiving module is used for receiving the identity recognition result which is fed back by the server and corresponds to the image to be recognized.
CN202110753108.8A 2021-07-02 2021-07-02 Image recognition method and device, computer equipment and storage medium Pending CN113822129A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110753108.8A CN113822129A (en) 2021-07-02 2021-07-02 Image recognition method and device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110753108.8A CN113822129A (en) 2021-07-02 2021-07-02 Image recognition method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113822129A true CN113822129A (en) 2021-12-21

Family

ID=78924128

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110753108.8A Pending CN113822129A (en) 2021-07-02 2021-07-02 Image recognition method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113822129A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115052174A (en) * 2022-06-13 2022-09-13 北京达佳互联信息技术有限公司 Resource transfer method, device, electronic equipment and storage medium
CN115052174B (en) * 2022-06-13 2023-12-19 北京达佳互联信息技术有限公司 Resource transfer method, device, electronic equipment and storage medium
CN117668797A (en) * 2024-02-01 2024-03-08 北京国旺盛源智能终端科技有限公司 Identity recognition method and system for network convenient service terminal
CN117668797B (en) * 2024-02-01 2024-04-09 北京国旺盛源智能终端科技有限公司 Identity recognition method and system for network convenient service terminal

Similar Documents

Publication Publication Date Title
CN107665364B (en) Neural network method and apparatus
KR102608467B1 (en) Method for lightening neural network and recognition method and apparatus using the same
CN111818346B (en) Image encoding method and apparatus, image decoding method and apparatus
CN113822129A (en) Image recognition method and device, computer equipment and storage medium
US20230164336A1 (en) Training a Data Coding System Comprising a Feature Extractor Neural Network
CN111641832B (en) Encoding method, decoding method, device, electronic device and storage medium
Jankowski et al. Deep joint source-channel coding for wireless image retrieval
Agustsson et al. Soft-to-hard vector quantization for end-to-end learned compression of images and neural networks
CN108776832B (en) Information processing method, information processing device, computer equipment and storage medium
US20220215595A1 (en) Systems and methods for image compression at multiple, different bitrates
CN112418292B (en) Image quality evaluation method, device, computer equipment and storage medium
CN113132723B (en) Image compression method and device
US11893762B2 (en) Method and data processing system for lossy image or video encoding, transmission and decoding
CN113748605A (en) Method and apparatus for compressing parameters of neural network
US20230076017A1 (en) Method for training neural network by using de-identified image and server providing same
CN114071141A (en) Image processing method and equipment
TW202134958A (en) Neural network representation formats
Ngo et al. Adaptive anomaly detection for IoT data in hierarchical edge computing
CN110766048A (en) Image content identification method and device, computer equipment and storage medium
EP3849180A1 (en) Encoding or decoding data for dynamic task switching
EP3598343A1 (en) Method and apparatus for processing audio data
WO2023118317A1 (en) Method and data processing system for lossy image or video encoding, transmission and decoding
Song et al. Partial gated feedback recurrent neural network for data compression type classification
CN115361559A (en) Image encoding method, image decoding method, image encoding device, image decoding device, and storage medium
CN112749539A (en) Text matching method and device, computer readable storage medium and computer equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination