CN117079058B - Image processing method and device, storage medium and electronic equipment - Google Patents

Image processing method and device, storage medium and electronic equipment

Info

Publication number
CN117079058B
CN117079058B (application number CN202311313702.0A)
Authority
CN
China
Prior art keywords
sample
dimension
pixel point
classification
matrix
Prior art date
Legal status
Active
Application number
CN202311313702.0A
Other languages
Chinese (zh)
Other versions
CN117079058A (en)
Inventor
许剑清
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202311313702.0A priority Critical patent/CN117079058B/en
Publication of CN117079058A publication Critical patent/CN117079058A/en
Application granted granted Critical
Publication of CN117079058B publication Critical patent/CN117079058B/en

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G06V10/20 - Image preprocessing
    • G06V10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/40 - Extraction of image or video features
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18 - Eye characteristics, e.g. of the iris
    • G06V40/197 - Matching; Classification

Abstract

The application discloses an image processing method and device, a storage medium, and electronic equipment, which can be applied to various scenarios such as cloud technology, artificial intelligence, intelligent transportation, and assisted driving. The method comprises the following steps: extracting a feature matrix with dimension s×n from the target image, where s represents the number of pixel points in the target image and n represents the number of features of each pixel point in the target image; determining, according to the feature matrix with dimension s×n, a classification weight vector with dimension 1×n for each pixel point in the target image, to obtain a classification weight matrix with dimension s×n; determining, according to the feature matrix with dimension s×n and the classification weight matrix with dimension s×n, a confidence matrix with dimension n×n for each pixel point in the target image, to obtain s confidence matrices; and determining the confidence of the target image according to the s confidence matrices. The method and device solve the technical problem of low accuracy in the image segmentation process.

Description

Image processing method and device, storage medium and electronic equipment
Technical Field
The present invention relates to the field of computer technologies, and in particular, to an image processing method and apparatus, a storage medium, and an electronic device.
Background
With the rapid development of artificial intelligence technology, neural networks have opened a new research direction for applications in the field of image segmentation. In the related art, image segmentation based on a neural network can be treated either as the problem of classifying image pixels with semantic labels or as the problem of segmenting individual objects in an image; the former, semantic segmentation, is the main approach, and its basic form is to label the pixels of an image with a set of object classes so as to form segmented images of different objects.
Image segmentation can be applied to many different fields, for example, the authentication of user identity in the field of information security. The iris is unique to each human body, does not change over a person's lifetime, and cannot be modified; owing to these characteristics of high uniqueness and immutability, iris recognition is considered to be one of the safer biometric recognition technologies.
A complete iris authentication system generally includes processes such as iris image acquisition, iris segmentation and positioning, and iris feature extraction. Iris segmentation is an important part of the iris authentication system, and its accuracy directly affects the reliability of the authentication result. However, the iris images acquired in practical application scenarios usually contain a large amount of noise and interference (e.g., occlusion, defocus); in this case, conventional semantic segmentation maps their features into spatial regions with poor discriminative ability, which makes the iris segmentation result inaccurate.
In other words, the semantic segmentation methods in the related art express well only the segmentation results of iris images of good quality (for example, high-definition images without occlusion); for iris images that are blurred or affected by various kinds of noise, the iris segmentation result is prone to be inaccurate, which causes the technical problem of low accuracy in the image segmentation process.
In view of the above problems, no effective solution has been proposed at present.
Disclosure of Invention
The embodiment of the application provides an image processing method and device, a storage medium and electronic equipment, so as to at least solve the technical problem of low accuracy in the image segmentation process.
According to an aspect of an embodiment of the present application, there is provided an image processing method including: extracting features of the target image to obtain a feature matrix with dimension s×n, where s represents the number of pixel points in the target image, n represents the number of features of each pixel point in the target image, and s and n are positive integers greater than or equal to 2; determining, according to the feature matrix with dimension s×n, a classification weight vector with dimension 1×n for each pixel point in the target image, to obtain a classification weight matrix with dimension s×n, where, in the case that the kth pixel point in the target image is determined to belong to the y_k-th category among the c preset categories, the classification weight vector of the kth pixel point is the classification weight vector corresponding to the y_k-th category among the c classification weight vectors corresponding to the kth pixel point, c is a positive integer greater than or equal to 2, k is a positive integer greater than or equal to 1 and less than or equal to s, and y_k is a positive integer greater than or equal to 1 and less than or equal to c; determining, according to the feature matrix with dimension s×n and the classification weight matrix with dimension s×n, a confidence matrix with dimension n×n for each pixel point in the target image, to obtain s confidence matrices; and determining the confidence of the target image according to the s confidence matrices.
Optionally, determining, according to the feature matrix with dimension s×n, a classification weight vector with dimension 1×n for each pixel point in the target image to obtain a classification weight matrix with dimension s×n includes: determining a classification result for each pixel point in the target image according to the feature matrix with dimension s×n and a predetermined classification weight matrix with dimension c×n, to obtain s classification results, where the classification result of the kth pixel point in the target image is used to indicate that the kth pixel point belongs to the y_k-th category among the c categories, the classification result of the kth pixel point is determined based on the feature vector with dimension 1×n representing the kth pixel point and the classification weight vector with dimension 1×n corresponding to the y_k-th category, the feature matrix with dimension s×n includes the feature vector with dimension 1×n representing the kth pixel point, and the classification weight matrix with dimension c×n includes the classification weight vector with dimension 1×n corresponding to the y_k-th category; and determining the classification weight vector with dimension 1×n corresponding to each of the s classification results as the classification weight vector with dimension 1×n of each pixel point in the target image, where the classification weight vector with dimension 1×n corresponding to the kth classification result among the s classification results is the classification weight vector with dimension 1×n corresponding to the y_k-th category, and the kth classification result is the classification result of the kth pixel point in the target image.
Optionally, determining the classification result of each pixel point in the target image according to the feature matrix with dimension s×n and the predetermined classification weight matrix with dimension c×n to obtain s classification results includes: inputting the feature matrix with dimension s×n into a target classification result determining module in a trained target image recognition model to obtain s×c candidate classification results, where the target classification result determining module determines the s×c candidate classification results according to the feature matrix with dimension s×n and the predetermined classification weight matrix with dimension c×n, the c candidate classification results corresponding to the kth pixel point among the s×c candidate classification results represent the probability that the kth pixel point belongs to each of the c categories, and the c candidate classification results corresponding to the kth pixel point are respectively determined according to the feature vector with dimension 1×n representing the kth pixel point and the classification weight matrix with dimension c×n; and, among the c candidate classification results corresponding to each pixel point in the target image, determining the candidate classification result representing the largest probability as the classification result of that pixel point, where the classification result of the kth pixel point is the candidate classification result representing the largest probability among the c candidate classification results corresponding to the kth pixel point, that is, the probability that the kth pixel point belongs to the y_k-th category among the c categories is the largest.
Optionally, determining the confidence of the target image according to the s confidence matrices includes: performing a trace operation on the confidence matrix corresponding to each pixel point among the s confidence matrices to obtain s values; and performing a summation operation on the s values to obtain the confidence of the target image.
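As an illustration of this trace-and-sum computation, the following minimal NumPy sketch assumes the s confidence matrices are stacked into a single array of shape (s, n, n); the array layout and the toy values are assumptions, not part of the patent.

```python
import numpy as np

def image_confidence(confidence_matrices: np.ndarray) -> float:
    """confidence_matrices: array of shape (s, n, n), one n x n matrix per pixel point.

    Returns the image-level confidence as the sum of the per-pixel traces."""
    per_pixel_values = np.trace(confidence_matrices, axis1=1, axis2=2)  # shape (s,)
    return float(per_pixel_values.sum())

# toy usage: s = 4 pixel points, n = 3 feature dimensions
sigma = np.stack([np.eye(3) * 0.5 for _ in range(4)])
print(image_confidence(sigma))  # 4 pixels x trace(0.5 * I_3) = 4 x 1.5 = 6.0
```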
Optionally, determining the confidence matrix with dimension n×n of each pixel point in the target image according to the feature matrix with dimension s×n and the classification weight matrix with dimension s×n to obtain s confidence matrices includes: inputting the feature matrix with dimension s×n and the classification weight matrix with dimension s×n into a trained target confidence estimation model to obtain the s confidence matrices, where the target confidence estimation model is a model obtained by training a confidence estimation model to be trained using a sample image set.
Optionally, the method further comprises: in the case that the sample image set includes M sample images, extracting features of each of the M sample images to obtain M sample feature matrices, where the number of pixel points in each sample image is s, the t-th sample feature matrix among the M sample feature matrices is a sample feature matrix with dimension s×n obtained by extracting features of the t-th sample image among the M sample images, M is a positive integer greater than or equal to 2, and t is a positive integer greater than or equal to 1 and less than or equal to M; determining, according to the M sample feature matrices, a sample classification weight matrix corresponding to each of the M sample images, to obtain M sample classification weight matrices, where the t-th sample classification weight matrix among the M sample classification weight matrices is a sample classification weight matrix with dimension s×n determined according to the t-th sample feature matrix, the t-th sample classification weight matrix includes a sample classification weight vector with dimension 1×n for each pixel point in the t-th sample image, and, in the case that the p-th pixel point in the t-th sample image is determined to belong to the y_p-th category among the c categories, the sample classification weight vector of the p-th pixel point is the sample classification weight vector corresponding to the y_p-th category among the c classification weight vectors corresponding to the p-th pixel point, p is a positive integer greater than or equal to 1 and less than or equal to s, and y_p is a positive integer greater than or equal to 1 and less than or equal to c; and training the confidence estimation model to be trained according to the M sample feature matrices and the M sample classification weight matrices to obtain the target confidence estimation model.
Optionally, training the confidence estimation model to be trained according to the M sample feature matrices and the M sample classification weight matrices to obtain the target confidence estimation model includes: performing the m-th round of training on the confidence estimation model to be trained through the following steps, where m is a positive integer greater than or equal to 1: inputting the q-th sample feature matrix and the q-th sample classification weight matrix used in the m-th round of training into the confidence estimation model of the m-th round of training to obtain a sample confidence matrix with dimension n×n for each pixel point in the q-th sample image, thus obtaining s sample confidence matrices, where q is a positive integer greater than or equal to 1 and less than or equal to M, the M sample feature matrices include the q-th sample feature matrix, the M sample classification weight matrices include the q-th sample classification weight matrix, and the M sample images include the q-th sample image; determining the loss value of the m-th round of training according to the q-th sample feature matrix, the q-th sample classification weight matrix and the s sample confidence matrices; in the case that the loss value of the m-th round of training does not satisfy a preset convergence condition, adjusting the parameters in the confidence estimation model of the m-th round of training to obtain the confidence estimation model of the (m+1)-th round of training; and in the case that the loss value of the m-th round of training satisfies the convergence condition, ending the training and determining the confidence estimation model of the m-th round of training as the target confidence estimation model.
Optionally, determining the loss value of the m-th round of training according to the q-th sample feature matrix, the q-th sample classification weight matrix and the s sample confidence matrices includes: in the case that the q-th sample feature matrix includes a sample feature vector with dimension 1×n corresponding to each pixel point in the q-th sample image, the q-th sample classification weight matrix includes a sample classification weight vector with dimension 1×n corresponding to each pixel point in the q-th sample image, and the s sample confidence matrices include a sample confidence matrix with dimension n×n corresponding to each pixel point in the q-th sample image, determining a loss value corresponding to each pixel point in the q-th sample image according to the sample feature vector, the sample classification weight vector and the sample confidence matrix corresponding to that pixel point, to obtain s loss values; and determining the loss value of the m-th round of training according to the s loss values.
Optionally, determining the loss value of the mth training according to the q-th sample feature matrix, the q-th sample classification weight matrix and the s-th sample confidence coefficient matrix includes: the loss value L of the mth training round is determined by the following formula.
where s = w×h, w and h are positive integers greater than or equal to 2, w represents the number of rows of pixel points in the q-th sample image, h represents the number of columns of pixel points in the q-th sample image, L represents the loss value of the m-th round of training, i is greater than or equal to 1 and less than or equal to w, j is greater than or equal to 1 and less than or equal to h, w_{y_ij} represents the sample classification weight vector corresponding to the pixel point in the i-th row and j-th column of the q-th sample image, z_ij represents the sample feature vector corresponding to the pixel point in the i-th row and j-th column of the q-th sample image, and Σ_ij represents the sample confidence matrix corresponding to the pixel point in the i-th row and j-th column of the q-th sample image.
Optionally, the extracting features of each of the M sample images to obtain M sample feature matrices includes: inputting each sample image in the M sample images into a target feature extraction module in a trained target image recognition model to obtain M sample feature matrixes, wherein the target image recognition model is a model obtained by training an image recognition model to be trained by using at least part of sample images in a sample image set.
Optionally, determining a sample classification weight matrix corresponding to each sample image in the M sample images according to the M sample feature matrices to obtain the M sample classification weight matrices includes: inputting each sample feature matrix in the M sample feature matrices into a target classification result determining module in a trained target image recognition model to obtain M sample classification weight matrices, wherein the target image recognition model is a model obtained by training an image recognition model to be trained by using at least part of sample images in a sample image set.
Optionally, before extracting the features of each sample image in the M sample images to obtain M sample feature matrices, the method further includes: acquiring M iris images, wherein at least part of the M iris images comprise corresponding noise information; determining M iris images as M sample images; or obtaining M groups of iris images, wherein at least part of the iris images in the M groups of iris images comprise corresponding noise information, and each group of iris images in the M groups of iris images comprises a plurality of iris images; each of the M sets of iris images is determined to be one of the M sample images.
Optionally, after determining the confidence of the target image according to the s confidence matrices, the method further includes: in the case that the confidence of the target image is greater than or equal to a preset threshold, determining the s classification results as valid classification results, where the s classification results are the classification results of each pixel point in the target image obtained according to the feature matrix with dimension s×n and the predetermined classification weight matrix with dimension c×n; and/or, in the case that the confidence of the target image is less than the preset threshold, determining the s classification results as invalid classification results.
Optionally, the method further comprises: in the case that the target image is an iris image, performing iris segmentation on the target image to obtain f regions, where f is a positive integer greater than or equal to 1 and less than or equal to c, through the following steps: in the case that the confidence of the target image is greater than or equal to the preset threshold and the categories represented by the classification results of the s pixel points in the target image include f of the c categories, determining, among the s pixel points in the target image, the pixel points whose classification result is each of the f categories, to obtain f groups of pixel points, where the c categories include pupil, iris, sclera and skin; and determining the f regions as the positions of the f groups of pixel points, where the f regions correspond one-to-one to the f categories.
Optionally, determining the f areas as the positions of the f groups of pixel points includes: in the case that the category of the ith group of pixels in the f groups of pixels is the ith category in the f categories, determining the position of the ith group of pixels as the ith area in the f areas, wherein i is a positive integer greater than or equal to 1 and less than or equal to f.
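The grouping described above can be sketched as follows, assuming the per-pixel classification results are given as integer class labels in a flat array of length s; the labels and the array layout are illustrative assumptions.

```python
import numpy as np

def group_pixels_by_class(labels: np.ndarray) -> dict:
    """labels: shape (s,), the classification result (class label) of each pixel point.

    Returns a mapping from each class present in the image to the positions
    (indices) of the pixel points assigned to that class."""
    return {int(c): np.flatnonzero(labels == c) for c in np.unique(labels)}

# hypothetical labels: 1 = pupil, 2 = iris, 3 = sclera, 4 = skin
labels = np.array([1, 1, 3, 2, 2, 2, 4])
regions = group_pixels_by_class(labels)   # f = 4 groups of pixel points here
print(regions[2])                         # positions of the iris pixels: [3 4 5]
```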
According to still another aspect of the embodiments of the present application, there is also provided an image processing apparatus including: a feature extraction unit, configured to extract features of the target image to obtain a feature matrix with dimension s×n, where s represents the number of pixel points in the target image, n represents the number of features of each pixel point in the target image, and s and n are positive integers greater than or equal to 2; a first processing unit, configured to determine, according to the feature matrix with dimension s×n, a classification weight vector with dimension 1×n for each pixel point in the target image to obtain a classification weight matrix with dimension s×n, where, in the case that the kth pixel point in the target image is determined to belong to the y_k-th category among the c preset categories, the classification weight vector of the kth pixel point is the classification weight vector corresponding to the y_k-th category among the c classification weight vectors corresponding to the kth pixel point, c is a positive integer greater than or equal to 2, k is a positive integer greater than or equal to 1 and less than or equal to s, and y_k is a positive integer greater than or equal to 1 and less than or equal to c; a second processing unit, configured to determine, according to the feature matrix with dimension s×n and the classification weight matrix with dimension s×n, a confidence matrix with dimension n×n for each pixel point in the target image, to obtain s confidence matrices; and a third processing unit, configured to determine the confidence of the target image according to the s confidence matrices.
According to still another aspect of the embodiments of the present application, there is also provided a computer-readable storage medium having a computer program stored therein, wherein the computer program is configured to perform the above-described image processing method when run.
According to yet another aspect of the embodiments of the present application, there is also provided a computer program product comprising a computer program/instruction which, when executed by a processor, implements the steps of the above-described image processing method.
According to still another aspect of the embodiments of the present application, there is also provided an electronic device including a memory in which a computer program is stored, and a processor configured to execute the image processing method described above by the computer program.
In the above manner, the target image is input into a pre-trained target image recognition model to obtain, in turn, the feature matrix with dimension s×n and the classification weight matrix with dimension s×n representing each pixel point in the target image; the confidence matrix of each pixel point in the target image is then determined according to the feature matrix with dimension s×n and the classification weight matrix with dimension s×n, and the confidence of the target image is determined. On this basis, the validity of the classification result of the target image is determined according to the confidence of the target image, which assists the image segmentation and achieves the technical effect of improving the accuracy of image segmentation.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute an undue limitation to the application.
Fig. 1 is a schematic diagram of an application scenario of an alternative image processing method according to an embodiment of the present application.
Fig. 2 is a flow chart of an alternative image processing method according to an embodiment of the present application.
FIG. 3 is an overall schematic diagram of an alternative method for improving image segmentation accuracy using image confidence analysis according to an embodiment of the present application.
FIG. 4 is a flow chart of an alternative image recognition model training process according to an embodiment of the present application.
FIG. 5 is a flow chart of an alternative training process for an image confidence estimation model according to an embodiment of the present application.
FIG. 6 is a schematic diagram of the architecture of an alternative target confidence estimation model according to an embodiment of the present application.
FIG. 7 is a flow chart of an alternative module deployment phase according to an embodiment of the present application.
Fig. 8 is a schematic illustration of an alternative sample image according to an embodiment of the present application.
Fig. 9 is a schematic structural view of an alternative image processing apparatus according to an embodiment of the present application.
Fig. 10 is a schematic structural diagram of an alternative electronic device according to an embodiment of the present application.
Detailed Description
In order to enable those skilled in the art to better understand the solution of the present application, the technical solutions in the embodiments of the present application are described below clearly and completely with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments herein without making any inventive effort shall fall within the scope of the present application.
It should be noted that the terms "first," "second," and the like in the description and claims of the present application and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that embodiments of the present application described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The technical solutions in the embodiments of the present application comply with applicable laws and regulations during implementation; when operations are performed according to these technical solutions, the data used does not involve user privacy, so that the security of the data is ensured while the operation process remains compliant.
The term mIoU is explained below.
mIoU (mean Intersection over Union) is the average ratio of the intersection to the union between the predicted region and the real region; this index is used to represent the accuracy of image segmentation.
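A generic sketch of how mIoU can be computed for a pair of label maps is given below; this is standard evaluation code, not part of the patent itself.

```python
import numpy as np

def mean_iou(pred: np.ndarray, gt: np.ndarray, num_classes: int) -> float:
    """pred, gt: integer label maps of the same shape; returns the mean IoU over classes."""
    ious = []
    for c in range(num_classes):
        intersection = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:                      # skip classes absent from both maps
            ious.append(intersection / union)
    return float(np.mean(ious))
```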
According to one aspect of embodiments of the present application, an image processing method is provided. As an alternative embodiment, the above image processing method may be applied to, but is not limited to, the application scenario shown in fig. 1. In the application scenario shown in fig. 1, the terminal device 102 may, but is not limited to, communicate with the server 106 via the network 104, and the server 106 may, but is not limited to, perform operations on the database 108, for example, write data operations or read data operations. The terminal device 102 may include, but is not limited to, a man-machine interaction screen, a processor, and a memory. The man-machine interaction screen may be, but is not limited to being, used to display the target image, the confidence of the target image, and the like on the terminal device 102. The processor may be, but is not limited to being, configured to perform a corresponding operation in response to the man-machine interaction, or to generate a corresponding instruction and send the generated instruction to the server 106. The memory is used to store related processing data, such as the feature matrix with dimension s×n, the classification weight matrix with dimension s×n, the s confidence matrices, and the like.
As an alternative, the following steps in the image processing method may be performed on the server 106: step S102, extracting features of a target image to obtain a feature matrix with dimension s×n, where s represents the number of pixel points in the target image, n represents the number of features of each pixel point in the target image, and s and n are positive integers greater than or equal to 2; step S104, determining, according to the feature matrix with dimension s×n, a classification weight vector with dimension 1×n for each pixel point in the target image to obtain a classification weight matrix with dimension s×n, where, in the case that the kth pixel point in the target image is determined to belong to the y_k-th category among the preset c categories, the classification weight vector of the kth pixel point is the classification weight vector corresponding to the y_k-th category among the c classification weight vectors corresponding to the kth pixel point, c is a positive integer greater than or equal to 2, k is a positive integer greater than or equal to 1 and less than or equal to s, and y_k is a positive integer greater than or equal to 1 and less than or equal to c; step S106, determining, according to the feature matrix with dimension s×n and the classification weight matrix with dimension s×n, a confidence matrix with dimension n×n for each pixel point in the target image, to obtain s confidence matrices; step S108, determining the confidence of the target image according to the s confidence matrices.
In the above manner, the target image is input into a pre-trained target image recognition model to obtain, in turn, the feature matrix with dimension s×n and the classification weight matrix with dimension s×n representing each pixel point in the target image; the confidence matrix of each pixel point in the target image is then determined according to the feature matrix with dimension s×n and the classification weight matrix with dimension s×n, and the confidence of the target image is determined. On this basis, the validity of the classification result of the target image is determined according to the confidence of the target image, which assists the image segmentation and achieves the technical effect of improving the accuracy of image segmentation.
It should be noted that the above technical solution can be applied, but is not limited to, to iris segmentation as well as other image segmentation scenarios; for ease of understanding, this embodiment describes the image processing method by taking iris segmentation as an example.
In the iris segmentation process, the sample images input into the iris segmentation model are generally subject to noise interference of different degrees. Existing semantic segmentation methods express well only the segmentation results of iris images of good quality; for iris images that are blurred or affected by other kinds of noise, their features are mapped into spatial regions with relatively poor discriminative ability, which makes the iris segmentation result inaccurate and in turn affects the accuracy of subsequent processes such as gaze estimation.
In order to solve the above-mentioned problems, a method for evaluating the confidence of iris segmentation results is proposed in the embodiments of the present application, and the validity of the classification result of iris images participating in iris segmentation can be determined by the evaluation result of the confidence.
Fig. 2 is a flowchart of an image processing method according to an embodiment of the present application, where the flow includes the following steps S202 to S208.
Step S202, extracting features of the target image to obtain a feature matrix with dimension s×n, where s represents the number of pixel points in the target image, n represents the number of features of each pixel point in the target image, and s and n are positive integers greater than or equal to 2.
Step S204, determining, according to the feature matrix with dimension s×n, a classification weight vector with dimension 1×n for each pixel point in the target image to obtain a classification weight matrix with dimension s×n, where, in the case that the kth pixel point in the target image is determined to belong to the y_k-th category among the preset c categories, the classification weight vector of the kth pixel point is the classification weight vector corresponding to the y_k-th category among the c classification weight vectors corresponding to the kth pixel point, c is a positive integer greater than or equal to 2, k is a positive integer greater than or equal to 1 and less than or equal to s, and y_k is a positive integer greater than or equal to 1 and less than or equal to c.
Step S206, determining, according to the feature matrix with dimension s×n and the classification weight matrix with dimension s×n, a confidence matrix with dimension n×n for each pixel point in the target image, to obtain s confidence matrices.
Step S208, determining the confidence of the target image according to the s confidence matrices.
Before explaining the steps S202 to S208, the overall flow of the above technical solution is described with reference to a schematic diagram of a method for improving the accuracy of iris segmentation by using the uncertainty (confidence) analysis of iris images as shown in fig. 3.
As shown in fig. 3, the overall flow of the technical solution of the present application is mainly divided into two stages: a module training stage and a module deployment stage. In the module training stage, the depth network unit module for iris segmentation and the segmentation class weight module are trained first; the parameters in the depth network unit module and the class weight module are then fixed, and their outputs are used as the input of the uncertainty (confidence) estimation module to train the confidence estimation module. In this process, only the parameters in the confidence estimation module are trained, and the parameters in the depth network unit module and the class weight module are not updated.
After the depth network unit module, the segmentation class weight module and the confidence estimation module have been trained, these modules are integrated in the module deployment stage and combined with a threshold screening module to form a complete iris segmentation module. The above steps are described in detail below with reference to specific embodiments.
For step S202, feature extraction is performed on the target image using the trained target image recognition model to obtain a feature matrix with dimension s×n; for example, assume that the number of pixel points in the target image is s = 256 and that each pixel point corresponds to one n = 512-dimensional feature vector.
Then, the target classification result determining module in the trained target image recognition model and the feature matrix with dimension s×n obtained by the feature extraction are used to obtain the 1×n classification weight vector of each pixel point in the target image. That is, in the case of n = 512, the class weight corresponding to each value of the 512-dimensional feature of each pixel point in the target image is obtained.
As shown in fig. 4, the target image recognition model includes, but is not limited to, a depth network unit module and a segmentation class weight unit module (which can also be understood as a class weight unit module), where the depth network unit module can be, but is not limited to being, used for feature extraction of the target image, and the segmentation class weight unit module is used to determine the classification weight of each pixel point, the classification weight of each image, and the like.
The confidence estimation module is trained according to the feature matrix with dimension s×n extracted by the depth network unit module in the target image recognition model and the classification weight matrix with dimension s×n determined by the segmentation class weight unit module, yielding s confidence matrices that correspond one-to-one to the s pixel points in the target image, where each confidence matrix is an n×n diagonal matrix; finally, the confidence of the target image is determined according to the s confidence matrices.
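The shapes involved in this pipeline can be summarized in a short NumPy sketch; the values below are random placeholders (small dimensions are used instead of s = 256, n = 512 only to keep the example light), and only the matrix layouts follow the description.

```python
import numpy as np

s, n = 6, 4                              # in the example above, s = 256 and n = 512
features = np.random.randn(s, n)         # feature matrix, dimension s x n
class_weights = np.random.randn(s, n)    # classification weight matrix, dimension s x n

# one n x n diagonal confidence matrix per pixel point (placeholder values; in the
# patent these come from the trained confidence estimation module)
confidences = np.stack([np.diag(np.abs(np.random.randn(n))) for _ in range(s)])

print(features.shape, class_weights.shape, confidences.shape)     # (6, 4) (6, 4) (6, 4, 4)
image_confidence = np.trace(confidences, axis1=1, axis2=2).sum()  # scalar image confidence
```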
It should be noted that the confidence of the target image in the above embodiment can be applied to image segmentation scenarios such as iris segmentation, fingerprint segmentation, and the segmentation of other biometric features.
How to obtain the trained target image recognition model is described below in conjunction with the flowchart shown in fig. 4.
As shown in FIG. 4, the method comprises the following steps S402-S414.
S402, training data preparation module.
The iris training data is read in the training process, and the read data is combined into a batch to be sent to the deep network unit module for processing.
S404, extracting spatial features of the target image by using the depth network unit module.
The output feature map retains the position and type information of each pixel of the input image (for example, the target image). The module typically has a convolutional neural network (CNN) structure, including convolution computation, nonlinear activation function (ReLU) computation, pooling computation, and the like.
The feature map size output by the depth network unit module is s×n (which can also be understood as a feature matrix of s×n), where s = w×h, w and h are the width and height of the input image, and n is the feature dimension extracted for each pixel, for example, 512.
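A minimal PyTorch sketch of such a depth network unit module is shown below; the layer counts and channel sizes are assumptions, pooling/upsampling details are omitted so that one n-dimensional feature is kept per pixel, and only the (s, n) output layout follows the description.

```python
import torch
import torch.nn as nn

class DepthNetworkUnit(nn.Module):
    """Maps an input image of shape (B, 3, h, w) to per-pixel features of shape (B, h*w, n)."""
    def __init__(self, n: int = 512):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(128, n, kernel_size=3, padding=1), nn.ReLU(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        f = self.backbone(x)                                 # (B, n, h, w)
        b, n, h, w = f.shape
        return f.permute(0, 2, 3, 1).reshape(b, h * w, n)    # (B, s, n) with s = w*h
```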
S406, inputting the feature matrix output by the depth network unit module into the segmentation class weight unit module.
The function of this module is to map the features extracted by the depth network unit module, which contain the spatial structure information, into an s×c matrix, where c is the number of segmentation classes; for example, for iris images the segmentation classes mainly include 4 classes, namely pupil, iris, sclera and skin, so c = 4.
Each 1×c vector in the s×c matrix represents, for one pixel point in the target image, the weight corresponding to each of the c = 4 classes.
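Under the assumption that this mapping is a single linear layer over the per-pixel features (the patent only fixes the input and output shapes), the segmentation class weight unit module can be sketched as follows; the rows of `fc.weight` then play the role of the predetermined c×n classification weight matrix.

```python
import torch
import torch.nn as nn

class SegClassWeightUnit(nn.Module):
    """Maps per-pixel features (B, s, n) to per-pixel class scores (B, s, c)."""
    def __init__(self, n: int = 512, c: int = 4):
        super().__init__()
        self.fc = nn.Linear(n, c, bias=False)   # fc.weight has shape (c, n): one 1 x n weight vector per class

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        return self.fc(features)                # (B, s, c) class scores per pixel point
```

With this formulation, the classification weight vector of a pixel point is simply the row of `fc.weight` selected by that pixel's predicted class.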
S408, calculating iris segmentation objective functions.
The class probability f of each pixel point generated by the segmentation class weight module, together with the pixel-level label information of the iris image used to generate that vector, is taken as input, and the objective function value, i.e., the loss value, is calculated.
The objective function may be, but is not limited to, a classification function (e.g., softmax), or may be an objective function of other segmentation types, which is not limited in this embodiment.
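A sketch of such a per-pixel softmax classification objective, implemented as standard cross-entropy (one possible choice; the patent leaves the exact objective open):

```python
import torch
import torch.nn.functional as F

def segmentation_loss(class_scores: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """class_scores: (B, s, c) per-pixel class scores; labels: (B, s) integer class labels."""
    b, s, c = class_scores.shape
    return F.cross_entropy(class_scores.reshape(b * s, c), labels.reshape(b * s))
```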
S410, determining whether the model training condition is satisfied according to the objective function value in the step S408.
In the case that the termination training condition is satisfied, step S412 is performed; otherwise, step S414 is performed.
Training optimization can be performed on the whole network based on gradient descent (for example, stochastic gradient descent with a momentum term, Adam, or Adagrad). In the actual training process, steps S402 to S408 are repeatedly executed until the training result satisfies the training termination condition.
The condition for ending model training is generally set such that the number of iterations reaches a set value, or the loss value calculated by the iris segmentation objective function is smaller than a set value, at which point model training is completed.
S412, the optimized depth network unit module is obtained.
S414, optimization by the iris segmentation objective function.
In the case that the loss value of the objective function (loss function) in step S410 does not satisfy the convergence condition, the configuration parameters in the depth network unit module are adjusted.
After the training of the depth network unit module and the segmentation class weight unit module is completed, the confidence estimation module is trained; the specific training process is described in detail in the following embodiments.
The confidence of the target image output by the trained confidence estimation module is used to determine whether the classification result of the target image is valid, and further to decide whether to retain a target image with low confidence, thereby improving the accuracy of iris segmentation.
That is, by inputting the target image into the pre-trained target image recognition model, the feature matrix with dimension s×n and the classification weight matrix with dimension s×n representing each pixel point in the target image are obtained in turn; the confidence matrix of each pixel point in the target image is then determined according to the feature matrix with dimension s×n and the classification weight matrix with dimension s×n, and the confidence of the target image is determined. On this basis, the validity of the classification result of the target image is determined according to the confidence of the target image, which assists the image segmentation and achieves the technical effect of improving the accuracy of image segmentation.
As an optional example, determining, according to the feature matrix with dimension s×n, a classification weight vector with dimension 1×n for each pixel point in the target image to obtain a classification weight matrix with dimension s×n includes: determining a classification result for each pixel point in the target image according to the feature matrix with dimension s×n and a predetermined classification weight matrix with dimension c×n, to obtain s classification results, where the classification result of the kth pixel point in the target image is used to indicate that the kth pixel point belongs to the y_k-th category among the c categories, the classification result of the kth pixel point is determined based on the feature vector with dimension 1×n representing the kth pixel point and the classification weight vector with dimension 1×n corresponding to the y_k-th category, the feature matrix with dimension s×n includes the feature vector with dimension 1×n representing the kth pixel point, and the classification weight matrix with dimension c×n includes the classification weight vector with dimension 1×n corresponding to the y_k-th category; and determining the classification weight vector with dimension 1×n corresponding to each of the s classification results as the classification weight vector with dimension 1×n of each pixel point in the target image, where the classification weight vector with dimension 1×n corresponding to the kth classification result among the s classification results is the classification weight vector with dimension 1×n corresponding to the y_k-th category, and the kth classification result is the classification result of the kth pixel point in the target image.
Assuming c = 4 and n = 512, a classification weight matrix with dimension 4×512 is predetermined; this classification weight matrix includes the classification weight vector with dimension 1×512 corresponding to the y_k-th category among the 4 categories, where y_k is a constant and k denotes the kth pixel point in the target image.
For example, in the case where the 4 categories are pupil, iris, sclera and skin, respectively, with the category label of the pupil being 1, the category label of the iris being 2, the category label of the sclera being 3 and the category label of the skin being 4, if y_k = 1, the category of the kth pixel point is the pupil, and the classification weight matrix with dimension 4×512 includes the classification weight vector with dimension 1×512 corresponding to the 1st category.
It should be noted that, for the kth pixel point, the feature vector with dimension 1×512 extracted by the depth network unit module contains 512 feature values; each value in the classification weight vector with dimension 1×512 corresponding to the y_k-th category then represents the weight of the corresponding one of the 512 feature values.
That is, the classification result of the kth pixel point is determined first: for example, a classification result of 1 indicates that the category of the kth pixel point is the pupil, and a classification result of 2 indicates that the category of the kth pixel point is the iris, and so on. Once the classification result of the kth pixel point has been determined, the classification weight vector with dimension 1×512 corresponding to that category can be obtained for the pixel point.
As an alternative implementation, determining a classification result for each pixel point in the target image according to the above feature matrix with dimension s×n and the predetermined classification weight matrix with dimension c×n to obtain s classification results includes: inputting the feature matrix with dimension s×n into the target classification result determining module in the trained target image recognition model to obtain s×c candidate classification results, where the target classification result determining module determines the s×c candidate classification results according to the feature matrix with dimension s×n and the predetermined classification weight matrix with dimension c×n, the c candidate classification results corresponding to the kth pixel point represent the probability that the kth pixel point belongs to each of the c categories, and the c candidate classification results corresponding to the kth pixel point are respectively determined according to the feature vector with dimension 1×n representing the kth pixel point and the classification weight matrix with dimension c×n; and, among the c candidate classification results corresponding to each pixel point in the target image, determining the candidate classification result representing the largest probability as the classification result of that pixel point, where the classification result of the kth pixel point is the candidate classification result representing the largest probability among the c candidate classification results corresponding to the kth pixel point, that is, the probability that the kth pixel point belongs to the y_k-th category among the c categories is the largest.
For example, assume that the target classification result determining module is the segmentation class weight unit module shown in fig. 4; the probability that each pixel point belongs to each of the c = 4 classes is determined using this module and the feature matrix with dimension s×n extracted by the depth network unit module.
In a specific embodiment, assume that the probabilities that the 1st pixel point belongs to each of the 4 categories are f_11, f_12, f_13 and f_14, that the probabilities that the 2nd pixel point belongs to each of the 4 categories are f_21, f_22, f_23 and f_24, and so on. Then, according to f_11, f_12, f_13 and f_14, the classification result of the 1st pixel point is determined to be the 1st category "pupil" corresponding to the largest probability f_11; according to f_21, f_22, f_23 and f_24, the classification result of the 2nd pixel point is determined to be the 3rd category "sclera" corresponding to the largest probability f_23.
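The same selection rule, written over a full probability matrix (a toy NumPy example with made-up probability values):

```python
import numpy as np

# probabilities of 2 pixel points over c = 4 classes (pupil, iris, sclera, skin)
probs = np.array([[0.7, 0.1, 0.1, 0.1],    # pixel 1 -> largest probability f_11
                  [0.1, 0.2, 0.6, 0.1]])   # pixel 2 -> largest probability f_23
predicted_classes = probs.argmax(axis=1) + 1   # 1-based class labels: [1, 3] = pupil, sclera
```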
After the feature matrix with dimension s×n and the classification weight matrix with dimension s×n corresponding to the target image are obtained according to the above method, the confidence estimation model to be trained (which can also be understood as the confidence estimation module) is trained using the process shown in fig. 5, so as to obtain the target confidence estimation model.
As an alternative implementation, the specific training process includes: in the case that the sample image set includes M sample images, extracting features of each of the M sample images to obtain M sample feature matrices, where the number of pixel points in each sample image is s, the t-th sample feature matrix among the M sample feature matrices is a sample feature matrix with dimension s×n obtained by extracting features of the t-th sample image, M is a positive integer greater than or equal to 2, and t is a positive integer greater than or equal to 1 and less than or equal to M; determining, according to the M sample feature matrices, a sample classification weight matrix corresponding to each of the M sample images to obtain M sample classification weight matrices, where the t-th sample classification weight matrix among the M sample classification weight matrices is a sample classification weight matrix with dimension s×n determined according to the t-th sample feature matrix, the t-th sample classification weight matrix includes a sample classification weight vector with dimension 1×n for each pixel point in the t-th sample image, and, in the case that the p-th pixel point in the t-th sample image is determined to belong to the y_p-th category among the c categories, the sample classification weight vector of the p-th pixel point is the sample classification weight vector corresponding to the y_p-th category among the c classification weight vectors corresponding to the p-th pixel point, p is a positive integer greater than or equal to 1 and less than or equal to s, and y_p is a positive integer greater than or equal to 1 and less than or equal to c; and training the confidence estimation model to be trained according to the M sample feature matrices and the M sample classification weight matrices to obtain the target confidence estimation model.
How the target confidence estimation model is obtained is further explained below in conjunction with fig. 5.
The training of the confidence estimation model needs to be combined with the trained depth network unit module shown in fig. 4; during the training of the confidence estimation model, the parameters of the trained depth network unit module are not updated.
S502, acquiring iris image data required in a training process through a training data preparation module.
After the iris image data is read, the read data is combined into one batch to be input to the depth network unit module.
S504, combining the read data into a batch and inputting it to the depth network unit module.
Features of the iris image are extracted by the depth network unit module to obtain a feature matrix with dimension s×n.
Assuming that ij denotes the coordinate position of a pixel in the image, then for each pixel point a feature vector z_ij with dimension 1×n is extracted by the depth network unit module.
As an optional implementation manner, the extracting features of each of the M sample images to obtain M sample feature matrices includes: inputting each sample image in the M sample images into a target feature extraction module in a trained target image recognition model to obtain M sample feature matrixes, wherein the target image recognition model is a model obtained by training an image recognition model to be trained by using at least part of sample images in a sample image set.
The images in the sample image set are not limited in this embodiment, and may include, for example, images with at least partial noise interference, images with all noise interference, and the like.
S506, obtaining a classification weight matrix with dimension s×n corresponding to the iris image through a training sample segmentation class weight acquisition module.
Assuming that ij represents the coordinate position of a pixel in the image, for each pixel point, the segmentation class weight w_{y_ij} of the pixel point at position ij belonging to one of the 4 categories is obtained (which can also be understood as the classification weight of that pixel point), where y_ij indicates the class to which the pixel belongs (e.g., 0 indicates the pupil).
The training sample segmentation class weight acquisition module uses the classification weight of each class obtained in the trained segmentation class weight unit module as its calculated value.
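A minimal sketch of this lookup is given below, assuming the trained segmentation class weight unit exposes a c×n matrix of class weights and that per-pixel class labels y_p are available; the names and shapes are hypothetical:

```python
import numpy as np

def build_sample_weight_matrix(class_weights: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """Select, for every pixel, the 1×n weight vector of its class.

    class_weights: c × n matrix of segmentation class weights (one row per class),
                   assumed to come from the trained segmentation class weight unit.
    labels:        length-s integer array, labels[p] = y_p in {0, ..., c-1}.
    Returns the s × n sample classification weight matrix described above.
    """
    return class_weights[labels]  # row y_p of the c×n matrix for pixel p
```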
As an optional implementation manner, determining a sample classification weight matrix corresponding to each sample image in the M sample images according to the M sample feature matrices, to obtain M sample classification weight matrices, includes: inputting each sample feature matrix in the M sample feature matrices into a target classification result determining module in a trained target image recognition model to obtain M sample classification weight matrices, wherein the target image recognition model is a model obtained by training an image recognition model to be trained by using at least part of sample images in a sample image set.
The images in the sample image set are not limited in this embodiment, and may include, for example, images with at least partial noise interference, images with all noise interference, and the like.
S508, estimating a confidence estimation matrix of each pixel point of the iris image input into the iris segmentation network in the high-dimensional feature space.
Wherein, according to the feature z_ij of each pixel point and the category weight w_{y_ij}, the confidence estimation matrix of each pixel point is obtained, and the confidence estimation matrix of each pixel point is a diagonal matrix of dimension n×n.
In addition, the structure of the confidence estimation model to be trained may be, but is not limited to, a CNN structure as shown in fig. 6, including a convolution layer, an activation function Relu, a pooling layer, a full connection layer, and the like.
As can be seen from fig. 6, the input of the confidence estimation model to be trained is the sample feature matrix with dimension s×n corresponding to one sample image and the classification weight matrix with dimension s×n, and the output is s confidence matrices of dimension n×n, where each of the s confidence matrices corresponds to one pixel point in the sample image.
Obviously, the other middle layers except the input layer and the output layer in the confidence estimation model to be trained can be allocated according to the needs, and the confidence estimation model is not limited in the embodiment of the application.
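Purely as an illustration of the input/output contract just described (two s×n inputs, s diagonal n×n confidence matrices as output), the following sketch uses per-pixel fully connected layers instead of the convolution/pooling stack of fig. 6; the layer sizes, the concatenation of the two inputs and the softplus used to keep the diagonal entries positive are all assumptions:

```python
import torch
import torch.nn as nn

class ConfidenceEstimator(nn.Module):
    """Maps (s×n features, s×n class weights) to s diagonal n×n confidence matrices."""

    def __init__(self, n: int, hidden: int = 128):
        super().__init__()
        # per-pixel network over the concatenated feature and weight vectors (2n -> n)
        self.net = nn.Sequential(
            nn.Linear(2 * n, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n),
        )

    def forward(self, features: torch.Tensor, weights: torch.Tensor) -> torch.Tensor:
        # features, weights: (s, n); output: (s, n, n) diagonal confidence matrices
        diag = torch.nn.functional.softplus(self.net(torch.cat([features, weights], dim=-1)))
        return torch.diag_embed(diag)
```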
S510, training a confidence estimation model to be trained according to the sample feature matrix of the extracted training sample (iris image) and the acquired sample classification weight.
As an optional implementation manner, training the confidence coefficient estimation model to be trained according to the M sample feature matrices and the M sample classification weight matrices to obtain a target confidence coefficient estimation model, including: performing an mth round of training on the confidence estimation model to be trained by the following steps, wherein m is a positive integer greater than or equal to 1: inputting a q sample feature matrix and a q sample classification weight matrix used in the mth training to a confidence estimation model of the mth training to obtain a sample confidence matrix with the dimension of n multiplied by n of each pixel point in the q sample image, and obtaining s sample confidence matrices, wherein q is a positive integer which is greater than or equal to 1 and less than or equal to M, the M sample feature matrices comprise the q sample feature matrix, the M sample classification weight matrices comprise the q sample classification weight matrix, and the M sample images comprise the q sample image; determining a loss value of the mth training according to the q sample feature matrix, the q sample classification weight matrix and the s sample confidence coefficient matrix; under the condition that the loss value of the mth training does not meet the preset convergence condition, parameters in the confidence coefficient estimation model of the mth training are adjusted to obtain a confidence coefficient estimation model of the (m+1) th training; and ending the training when the loss value of the mth training meets the convergence condition, and determining the confidence coefficient estimation model of the mth training as a target confidence coefficient estimation model.
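The m-th training round described above can be pictured with the following hedged sketch, in which the sample pairs produced by the frozen depth network unit and segmentation class weight unit are assumed to be available as a list, and the convergence test on the loss value is simplified to a fixed tolerance:

```python
import torch

def train_confidence_model(model, loss_fn, samples, lr=1e-3, max_rounds=1000, tol=1e-4):
    """Hypothetical training loop for the confidence estimation model.

    samples: list of (feature_matrix, weight_matrix) pairs, each of shape (s, n),
    produced by the frozen depth network unit and segmentation class weight unit.
    """
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for m in range(1, max_rounds + 1):
        features, weights = samples[m % len(samples)]       # q-th sample for round m
        confidences = model(features, weights)              # s matrices of shape n×n
        loss = loss_fn(features, weights, confidences)      # loss value of the m-th round
        if loss.item() < tol:                               # convergence condition (assumed form)
            break
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return model
```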
How the loss value is calculated and how it is determined whether the loss value satisfies the convergence condition are described below in connection with specific embodiments.
S512, calculating the loss value of each training round by using the confidence objective function calculation module, and determining whether the condition for terminating the training of the confidence estimation module is met.
Specifically, the objective function (which can also be understood as the target loss value) is calculated according to the feature z_ij of each pixel point obtained by the depth network unit module and the category weight w_{y_ij}.
As an optional implementation manner, determining the loss value of the mth training according to the q-th sample feature matrix, the q-th sample classification weight matrix and the s-th sample confidence coefficient matrix includes: in the case that the q-th sample feature matrix includes a sample feature vector having a dimension of 1×n corresponding to each pixel point in the q-th sample image, the q-th sample classification weight matrix includes a sample classification weight vector having a dimension of 1×n corresponding to each pixel point in the q-th sample image, and the s-th sample confidence matrix includes a sample confidence matrix having a dimension of n×n corresponding to each pixel point in the q-th sample image, determining a loss value corresponding to each pixel point in the q-th sample image according to the sample feature vector having a dimension of 1×n corresponding to each pixel point in the q-th sample image, the sample classification weight vector having a dimension of 1×n corresponding to each pixel point in the q-th sample image, and the sample confidence matrix having a dimension of n×n corresponding to each pixel point in the q-th sample image, obtaining s-th loss value; and determining the loss value of the mth training round according to the s loss values.
As an optional example, determining the loss value of the mth training according to the q-th sample feature matrix, the q-th sample classification weight matrix and the s-th sample confidence matrix includes: the loss value of the mth round of training is determined by the following formula (1).
Where s = w×h, w and h are positive integers greater than or equal to 2, w represents the number of rows of pixel points in the q-th sample image, h represents the number of columns of pixel points in the q-th sample image, L represents the loss value of the m-th training round, i is greater than or equal to 1 and less than or equal to w, j is greater than or equal to 1 and less than or equal to h, w_{y_ij} represents the sample classification weight vector corresponding to the pixel point in the i-th row and j-th column of the q-th sample image, z_ij represents the sample feature vector corresponding to the pixel point in the i-th row and j-th column of the q-th sample image, and Σ_x(z_ij) represents the sample confidence matrix corresponding to the pixel point in the i-th row and j-th column of the q-th sample image.
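The body of formula (1) is not reproduced in this text. Purely as an assumed illustration consistent with the quantities listed above (and compatible with the loss_fn signature used in the earlier training-loop sketch), one plausible form is a Gaussian negative log-likelihood that measures how well the per-pixel confidence matrix explains the gap between the feature vector and its class weight vector; this is not necessarily the patent's exact formula:

```python
import torch

def gaussian_nll_loss(features, weights, confidences):
    """Assumed per-pixel loss: negative log-likelihood of w_{y_ij} under a Gaussian
    centred at z_ij with per-pixel diagonal covariance Sigma_x(z_ij).

    features, weights: (s, n); confidences: (s, n, n) diagonal matrices.
    """
    diff = (weights - features).unsqueeze(-1)                     # (s, n, 1)
    inv = torch.linalg.inv(confidences)                           # (s, n, n)
    mahalanobis = (diff.transpose(1, 2) @ inv @ diff).squeeze()   # (s,)
    logdet = torch.logdet(confidences)                            # (s,)
    return (mahalanobis + logdet).mean()
```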
Wherein, if the training condition of the termination confidence estimation module is satisfied, step S514 is executed; otherwise, step S516 is performed.
And S514, optimizing the deep network unit module, and obtaining a trained confidence estimation model.
S516, repeating steps S502-S512 until the training condition (which can be understood as a convergence condition) of the termination confidence estimation module is satisfied.
The confidence objective function optimization module shown in fig. 5 is mainly based on gradient descent (for example, stochastic gradient descent with a momentum term, Adam, or Adagrad), optimizes the whole confidence estimation module to be trained, and specifically adjusts the structural parameters x in the confidence estimation module to be trained.
Steps S502-S512 are repeated until the set training end condition is satisfied, where the training end condition includes, but is not limited to, the number of iterations reaching a set value, or the loss value calculated by the confidence estimation objective function being smaller than a set value; training of the model is then ended, and the trained target confidence estimation model is obtained.
As an optional example, the s confidence matrices are obtained by inputting the feature matrix with dimension s×n and the classification weight matrix with dimension s×n into the trained target confidence estimation model, where the target confidence estimation model is a model obtained by training the confidence estimation model to be trained by using the sample image set.
The structure of the trained target confidence estimation model is shown in fig. 6, and the detailed description may refer to the description of the foregoing embodiment, which is not repeated herein.
In the training process of the confidence estimation model to be trained, the trained depth network element module and the segmentation class weight unit are used, and in the training process, parameters in the depth network element module and the segmentation class weight unit module are kept unchanged, so that the complexity of the whole model is reduced, the training time of the model is shortened, and the technical effect of improving the efficiency of determining the image confidence is realized.
As an optional example, before extracting features from each of the M sample images to obtain M sample feature matrices, the method further includes: acquiring M iris images, wherein at least part of the M iris images comprise corresponding noise information; determining M iris images as M sample images; or obtaining M groups of iris images, wherein at least part of the iris images in the M groups of iris images comprise corresponding noise information, and each group of iris images in the M groups of iris images comprises a plurality of iris images; each of the M sets of iris images is determined to be one of the M sample images.
Fig. 8 is an example of one sample image described in the above embodiment, and it can be intuitively seen that one sample image includes a plurality of iris images in which noise interference exists to different extents.
As can be seen from the description of the above embodiments, the technical solution of the present application provides a scheme for estimating the confidence level of the iris segmentation result, which is used for estimating the confidence level of the iris segmentation model on each segmentation result, specifically, by learning the distribution form of the iris image features in the high-dimensional space, and then performing confidence level analysis on the extracted features according to the learned distribution information, so as to assist the confidence level of the iris segmentation model on the segmentation result, thereby improving the overall segmentation accuracy.
In addition, as can be seen from the description in the above embodiments, the training sample data (iris image) used in the training process of the two modules (the depth network unit module and the segmentation class weight unit module) in the image recognition model is identical to the training sample data used in the training process of the confidence estimation module, so that the process of acquiring the training sample data for multiple times is avoided, the time for determining the confidence result is saved, and the technical effects of improving the accuracy and the segmentation efficiency of iris segmentation are realized.
In other words, the method for assisting iris segmentation by using the confidence analysis of the iris image segmentation features provided by the embodiment of the application is based on the existing iris segmentation method, and a group of models for estimating the distribution confidence of the segmentation features in a high-dimensional feature space are trained by using training images. The method does not need to retrain the deployed iris segmentation (method) model, and only needs to train the model with uncertainty estimation by using the original training data. The cost of collecting data can be saved without adopting new training data, and the uncertainty of the segmentation features in the high-dimensional feature space is estimated by theoretical basis. The technical scheme can be applied to all iris segmentation (method) models, and is not limited by application scenes or methods.
Specifically, through the module deployment phase flowchart shown in fig. 7, the relevant modules obtained in module training are combined and deployed to form a complete solution, so as to obtain the confidence matrix with dimension n×n of each pixel point and the confidence of the target image, which specifically includes the following steps S702-S712.
S702, acquiring iris images to be segmented.
S704, outputting, by using the depth network unit module, the feature z_ij corresponding to each pixel point in the iris image to be segmented.
Wherein z_ij is a feature vector of dimension 1×n.
S706, using the segmentation class weight module, determining for each pixel point in the image a classification weight vector w_{y_ij} with dimension 1×n (which can also be understood as the class weight).
S708, determining the confidence matrix with dimension n×n of each pixel point according to the classification weight vector w_{y_ij} with dimension 1×n of each pixel point and the feature z_ij, to obtain s confidence matrices.
S710, determining the confidence coefficient of the target image according to the S confidence coefficient matrixes.
As an alternative implementation, the specific process of determining the confidence level of the target image includes: executing tracing operation on the confidence coefficient matrix corresponding to each pixel point in the s confidence coefficient matrixes to obtain s values; and performing summation operation on the s values to obtain the confidence coefficient of the target image. Specifically, the confidence coefficient P of the target image is obtained by the following formula (2).
P = ∑_{i=1}^{w} ∑_{j=1}^{h} tr(Σ_x(z_ij))    (2)

Where tr() performs a trace operation on a confidence matrix, Σ_x(z_ij) represents the confidence matrix of the pixel point at position ij obtained by the steps in the above embodiments (the s confidence matrices together cover all pixel points), and x represents the parameters in the confidence estimation model; the two notations used for this matrix in the above embodiments have the same meaning.
By the above formula (2), the confidence level P of each iris image is obtained, and then step S712 is performed.
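A direct reading of formula (2), assuming the s per-pixel confidence matrices are stacked into an (s, n, n) tensor, is sketched below:

```python
import torch

def image_confidence(confidence_matrices: torch.Tensor) -> float:
    """Sum of traces of the s per-pixel n×n confidence matrices, as in formula (2)."""
    # trace of each matrix: sum of its diagonal entries, shape (s,)
    traces = confidence_matrices.diagonal(dim1=-2, dim2=-1).sum(-1)
    return traces.sum().item()
```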
S712, determining whether the classification result of the target image is a valid classification result by using the screening judging unit module.
The module is used for judging whether the segmentation result of the image meets the application condition according to the image sample confidence level output by the segmentation confidence level calculation module.
Illustratively, the screening is performed by: under the condition that the confidence coefficient of the target image is larger than or equal to a preset threshold value, determining s classification results as effective classification results, wherein the s classification results are the classification results of each pixel point in the obtained target image according to a feature matrix with dimension s multiplied by n and a predetermined classification weight matrix with dimension c multiplied by n; and/or determining the s classification results as invalid classification results under the condition that the confidence coefficient of the target image is smaller than a preset threshold value.
Optionally, the output of the screening judgment unit module is as shown in the following formula (3):

output = valid classification result, if P ≥ th; output = invalid classification result, if P < th    (3)
The threshold th is determined according to the accuracy required in the actual application scenario.
The determination result obtained by the above formula (3), and the above method for determining the confidence, may be applied to, but are not limited to, the following two scenarios.
First kind: if the confidence of the target image is low (the result is determined to be unavailable), the target image is directly discarded.
In the case of performing iris segmentation with a preset number of sample images, if the sample images corresponding to segmentation results with lower confidence are discarded and only the sample images corresponding to segmentation results with higher confidence are retained, the accuracy of iris segmentation can clearly be improved.
Second kind: if the confidence of the target image is low (the result is determined to be unavailable), the class weight of the target image is reduced.
Under the condition that iris segmentation is carried out by utilizing a preset number of sample images, the weight of a part of sample images corresponding to the segmentation result with lower confidence is reduced, the weight of the sample images corresponding to the segmentation result with higher confidence is kept unchanged, and therefore the overall accuracy of iris segmentation is improved.
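The two scenarios above can be summarised by the following hedged sketch; the re-weighting factor used for low-confidence images is arbitrary and only illustrates the idea of reducing, rather than discarding, their contribution:

```python
import numpy as np

def screen_images(confidences: np.ndarray, th: float, mode: str = "discard") -> np.ndarray:
    """Two assumed ways of using the per-image confidence P against threshold th.

    confidences: length-N array of image confidences.
    mode == "discard":  weight 0 for low-confidence images, 1 otherwise (first scenario).
    mode == "reweight": keep high-confidence weights at 1 and reduce the weight of
                        low-confidence images (second scenario; 0.5 is an arbitrary factor).
    """
    valid = confidences >= th
    if mode == "discard":
        return valid.astype(float)
    return np.where(valid, 1.0, 0.5)
```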
After the confidence of the target image is obtained through the descriptions in the above embodiments, the iris image may be further segmented by using the confidence of the target image, which specifically includes: in the case that the target image is an iris image, iris segmentation is performed on the target image in the following manner to obtain f regions, where f is a positive integer greater than or equal to 1 and less than or equal to c: in the case that the confidence of the target image is greater than or equal to a preset threshold and the categories represented by the classification results of the s pixel points in the target image include f categories among the c categories, determining, among the s pixel points in the target image, the pixel points whose classification result is each of the f categories, to obtain f groups of pixel points, where the c categories include pupil, iris, sclera and skin; and determining the f regions as the positions of the f groups of pixel points, where the f regions are in one-to-one correspondence with the f categories.
As an optional implementation manner, determining the f areas as the positions of the f groups of pixel points includes: in the case that the category of the ith group of pixels in the f groups of pixels is the ith category in the f categories, determining the position of the ith group of pixels as the ith area in the f areas, wherein i is a positive integer greater than or equal to 1 and less than or equal to f.
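A minimal sketch of this grouping step is given below, assuming the per-pixel classification results and pixel coordinates are available as arrays; the function and variable names are hypothetical:

```python
import numpy as np

def group_regions(labels: np.ndarray, positions: np.ndarray) -> dict:
    """Group pixel positions by predicted class to form the f regions.

    labels:    length-s array of per-pixel classification results (0..c-1).
    positions: s × 2 array of (row, col) coordinates for the s pixels.
    Returns a dict mapping each of the f occurring classes to the positions
    of its pixel points, i.e. the corresponding region.
    """
    return {cls: positions[labels == cls] for cls in np.unique(labels)}
```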
The above-described determination of the confidence level of the target image and the overall implementation process of iris segmentation of the target image using the confidence level of the target image are described in detail below in connection with specific embodiments.
Assuming that the target image is an iris image, the iris image segmentation process may include, but is not limited to, c=4 categories, such as pupil, iris, sclera and skin, and specific steps of iris segmentation of the target image are as follows S11 to S15.
S11, extracting features of the target image by using the depth network unit module shown in FIG. 4 to obtain a feature matrix with dimension s×n.
S12, determining a classification weight vector with a dimension of 1×n for each pixel point in the target image and a probability that each pixel point belongs to each of c=4 categories by using the partition category weight unit module shown in fig. 4.
In a specific embodiment, the above c=4 categories are pupil, iris, sclera and skin, respectively. The probability that the 1st pixel point in the target image belongs to each of the 4 categories is f_11, f_12, f_13, f_14; the probability that the 2nd pixel point belongs to each of the 4 categories is f_21, f_22, f_23, f_24, and so on. Then, according to f_11, f_12, f_13, f_14, the classification result of the 1st pixel point is determined as the 1st category "pupil" corresponding to the maximum probability f_11; according to f_21, f_22, f_23, f_24, the classification result of the 2nd pixel point is determined as the 3rd category "sclera" corresponding to the maximum probability f_23.
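The per-pixel decision in this example is simply an argmax over the c probabilities; the following sketch illustrates it, with the category order assumed:

```python
import numpy as np

CATEGORIES = ["pupil", "iris", "sclera", "skin"]  # assumed order for c = 4

def classify_pixels(probs: np.ndarray):
    """Pick, for each pixel, the category with the largest probability.

    probs: s × c matrix; probs[k, y] is the probability f_{k+1, y+1} that the
    (k+1)-th pixel point belongs to the (y+1)-th category.
    Returns the per-pixel class indices and the corresponding category names.
    """
    idx = probs.argmax(axis=1)
    return idx, [CATEGORIES[y] for y in idx]
```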
S13, obtaining a confidence coefficient matrix with dimension of n multiplied by n of each pixel point in the target image by using the confidence coefficient estimation model shown in FIG. 5, obtaining S confidence coefficient matrixes, and determining the confidence coefficient of the t-th iris image according to the S confidence coefficient matrixes.
The method for determining the confidence level of the target image may refer to the description of the image processing method section, which is not repeated here.
S14, judging whether the confidence coefficient of the target image is larger than or equal to a preset threshold value.
If yes, go to step S15; otherwise, the flow will end.
In order to improve the accuracy of iris segmentation, only the images whose confidence level satisfies the preset requirement (greater than or equal to the preset threshold value) are subjected to iris segmentation, and the images which do not satisfy the preset requirement are directly regarded as invalid images.
Specifically, by using the above formula (3), the magnitude relation between the confidence of the target image and the preset threshold th is compared, wherein the preset threshold is determined according to the scene of iris recognition, and may be, but not limited to, related to the security level of the scene, for example, th1 is set as the preset threshold when entering public places by using the iris recognition method, th2 is set as the preset threshold when performing payment operation by using the iris recognition method, and th1 is smaller than th2.
And S15, dividing the target image according to S classification results to obtain f areas when the confidence coefficient of the target image is greater than or equal to a preset threshold value.
When c=4 and f=2, the classification results of the target image include only 2 of the preset 4 categories; when c=4 and f=4, the classification results of the target image include all of the preset 4 categories.
For the above two cases, iris segmentation is achieved in two ways, respectively.
(1) When f=c, dividing the target image according to s classification results to obtain c areas, wherein the c areas are in one-to-one correspondence with the c categories.
For example, assume that the number of pixels s=100 in the target image, where the classification results of the 1 st to 20 th pixels are pupils, the classification results of the 21 st to 50 th pixels are irises, the classification results of the 51 st to 80 th pixels are scleras, and the classification results of the 81 st to 100 th pixels are skin.
The position of the 1 st to 20 th pixel points is determined as a 1 st area, the position of the 21 st to 50 th pixel points is determined as a 2 nd area, the position of the 51 st to 80 th pixel points is determined as a 3 rd area, the position of the 81 st to 100 th pixel points is determined as a 4 th area, and the 4 th area is the iris segmentation result.
(2) When f < c, the f categories to which the s pixel points in the target image belong are first determined. For example, when s=1000, if the category of 500 pixel points is pupil and the category of the other 500 pixel points is iris, then f=2.
For example, assume that the number of pixels s=100 in the target image, where the classification results of the 1 st to 20 th pixels are pupils, the classification results of the 21 st to 50 th pixels are irises, the classification results of the 51 st to 80 th pixels are pupils, and the classification results of the 81 st to 100 th pixels are irises.
The positions of the 1st to 20th pixel points and the positions of the 51st to 80th pixel points are determined as one region, the positions of the 21st to 50th pixel points and the positions of the 81st to 100th pixel points are determined as the other region, and the two regions are the iris segmentation result.
By the above method, whether to perform iris segmentation on an image is determined by comparing the confidence of the iris image to be segmented with the preset threshold, and the iris images with lower confidence in the image set are regarded as invalid images. Iris segmentation is performed on the images with higher confidence, and only the iris segmentation results of the images with higher confidence are counted, so that the accuracy of iris segmentation is improved.
It should be noted that, for simplicity of description, the foregoing method embodiments are all expressed as a series of action combinations, but it should be understood by those skilled in the art that the present application is not limited by the order of actions described, as some steps may be performed in other order or simultaneously in accordance with the present application. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily required in the present application.
According to another aspect of the embodiments of the present application, there is also provided an image processing apparatus as shown in fig. 9, including: a feature extraction unit 902, configured to perform feature extraction on a target image to obtain a feature matrix with dimension s×n, where s represents the number of pixel points in the target image, n represents the number of features of each pixel point in the target image, and s and n are positive integers greater than or equal to 2; a first processing unit 904, configured to determine a classification weight vector with dimension 1×n for each pixel point in the target image according to the feature matrix with dimension s×n, to obtain a classification weight matrix with dimension s×n, where the classification weight vector of the kth pixel point in the target image is, in the case that the kth pixel point is determined to belong to the y_k-th category of the preset c categories, the classification weight vector corresponding to the y_k-th category among the c classification weight vectors corresponding to the kth pixel point, c is a positive integer greater than or equal to 2, k is a positive integer greater than or equal to 1 and less than or equal to s, and y_k is a positive integer greater than or equal to 1 and less than or equal to c; a second processing unit 906, configured to determine a confidence matrix with dimension n×n for each pixel point in the target image according to the feature matrix with dimension s×n and the classification weight matrix with dimension s×n, to obtain s confidence matrices; and a third processing unit 908, configured to determine the confidence of the target image according to the s confidence matrices.
Optionally, the first processing unit 904 includes: a second processing module, configured to determine a classification result of each pixel point in the target image according to the feature matrix with dimension s×n and a predetermined classification weight matrix with dimension c×n, to obtain s classification results, where the classification result of the kth pixel point in the target image is used to indicate that the kth pixel point belongs to the y_k-th category of the c categories, the classification result of the kth pixel point in the target image is a classification result determined according to the feature vector with dimension 1×n representing the kth pixel point and the classification weight vector with dimension 1×n corresponding to the y_k-th category, the feature matrix with dimension s×n includes the feature vector with dimension 1×n representing the kth pixel point, and the classification weight matrix with dimension c×n includes the classification weight vector with dimension 1×n corresponding to the y_k-th category; and a third processing module, configured to determine the classification weight vector with dimension 1×n corresponding to each of the s classification results as the classification weight vector with dimension 1×n of each pixel point in the target image, where the classification weight vector with dimension 1×n corresponding to the kth classification result of the s classification results is the classification weight vector with dimension 1×n corresponding to the y_k-th category, and the kth classification result is the classification result of the kth pixel point in the target image.
Optionally, the second processing module includes: a first processing sub-module, configured to input the feature matrix with dimension s×n into a target classification result determining module in the trained target image recognition model to obtain s×c candidate classification results, where the target classification result determining module determines the s×c candidate classification results according to the feature matrix with dimension s×n and the predetermined classification weight matrix with dimension c×n, the c candidate classification results corresponding to the kth pixel point among the s×c candidate classification results represent the probability that the kth pixel point belongs to each of the c categories, and the c candidate classification results corresponding to the kth pixel point are respectively determined according to the feature vector with dimension 1×n representing the kth pixel point and the classification weight matrix with dimension c×n; and a second processing sub-module, configured to determine, among the c candidate classification results corresponding to each pixel point in the target image in the s×c candidate classification results, the candidate classification result with the highest represented probability as the classification result of that pixel point, where the classification result of the kth pixel point is the candidate classification result with the highest represented probability among the c candidate classification results corresponding to the kth pixel point, that is, the probability that the kth pixel point belongs to the y_k-th category of the c categories is the largest.
Optionally, the third processing unit 908 includes: the fourth processing module is used for executing tracing operation on the confidence coefficient matrix corresponding to each pixel point in the s confidence coefficient matrixes to obtain s values; and a fifth processing module, configured to perform a summation operation on the s values to obtain a confidence level of the target image.
Optionally, the second processing unit 906 includes: a sixth processing module, configured to input the feature matrix with dimension s×n and the classification weight matrix with dimension s×n into a trained target confidence estimation model to obtain the s confidence matrices, where the target confidence estimation model is a model obtained by training the confidence estimation model to be trained by using a sample image set.
Optionally, the apparatus further includes: a fifth processing unit, configured to, in the case that the sample image set includes M sample images, perform feature extraction on each of the M sample images to obtain M sample feature matrices, where the number of pixel points in each sample image is s, the t-th sample feature matrix in the M sample feature matrices is a sample feature matrix with dimension s×n obtained by performing feature extraction on the t-th sample image in the M sample images, M is a positive integer greater than or equal to 2, and t is a positive integer greater than or equal to 1 and less than or equal to M; a sixth processing unit, configured to determine, according to the M sample feature matrices, a sample classification weight matrix corresponding to each of the M sample images, to obtain M sample classification weight matrices, where the t-th sample classification weight matrix in the M sample classification weight matrices is a sample classification weight matrix with dimension s×n determined according to the t-th sample feature matrix, the t-th sample classification weight matrix includes a sample classification weight vector with dimension 1×n for each pixel point in the t-th sample image, the sample classification weight vector of the p-th pixel point in the t-th sample image is, in the case that the p-th pixel point is determined to belong to the y_p-th category of the c categories, the sample classification weight vector corresponding to the y_p-th category among the c classification weight vectors corresponding to the p-th pixel point, p is a positive integer greater than or equal to 1 and less than or equal to s, and y_p is a positive integer greater than or equal to 1 and less than or equal to c; and a first training unit, configured to train the confidence estimation model to be trained according to the M sample feature matrices and the M sample classification weight matrices to obtain the target confidence estimation model.
Optionally, the first training unit includes: a seventh processing module, configured to perform an mth round of training on the confidence estimation model to be trained by: inputting a q sample feature matrix and a q sample classification weight matrix used in the mth training to a confidence estimation model of the mth training to obtain a sample confidence matrix with the dimension of n multiplied by n of each pixel point in the q sample image, and obtaining s sample confidence matrices, wherein q is a positive integer which is greater than or equal to 1 and less than or equal to M, the M sample feature matrices comprise the q sample feature matrix, the M sample classification weight matrices comprise the q sample classification weight matrix, and the M sample images comprise the q sample image; determining a loss value of the mth training according to the q sample feature matrix, the q sample classification weight matrix and the s sample confidence coefficient matrix; under the condition that the loss value of the mth training does not meet the preset convergence condition, parameters in the confidence coefficient estimation model of the mth training are adjusted to obtain a confidence coefficient estimation model of the (m+1) th training; and ending the training when the loss value of the mth training meets the convergence condition, and determining the confidence coefficient estimation model of the mth training as a target confidence coefficient estimation model.
Optionally, the seventh processing module includes: a third processing sub-module, configured to determine, when the q-th sample feature matrix includes a sample feature vector having a dimension of 1×n corresponding to each pixel point in the q-th sample image, the q-th sample classification weight matrix includes a sample classification weight vector having a dimension of 1×n corresponding to each pixel point in the q-th sample image, and the s-th sample confidence matrix includes a sample confidence matrix having a dimension of n×n corresponding to each pixel point in the q-th sample image, a loss value corresponding to each pixel point in the q-th sample image according to the sample feature vector having a dimension of 1×n corresponding to each pixel point in the q-th sample image, the sample classification weight vector having a dimension of 1×n corresponding to each pixel point in the q-th sample image, and the sample confidence matrix having a dimension of n×n corresponding to each pixel point in the q-th sample image, and obtain s-th loss value; and the fourth processing submodule is used for determining the loss value of the mth training round according to the s loss values.
Optionally, the seventh processing module includes: and a fifth processing sub-module, configured to determine a loss value L of the mth training round through the following formula.
Where s = w×h, w and h are positive integers greater than or equal to 2, w represents the number of rows of pixel points in the q-th sample image, h represents the number of columns of pixel points in the q-th sample image, L represents the loss value of the m-th training round, i is greater than or equal to 1 and less than or equal to w, j is greater than or equal to 1 and less than or equal to h, w_{y_ij} represents the sample classification weight vector corresponding to the pixel point in the i-th row and j-th column of the q-th sample image, z_ij represents the sample feature vector corresponding to the pixel point in the i-th row and j-th column of the q-th sample image, and Σ_x(z_ij) represents the sample confidence matrix corresponding to the pixel point in the i-th row and j-th column of the q-th sample image.
Optionally, the fifth processing unit includes: and the eighth processing module is used for inputting each sample image in the M sample images into the target feature extraction module in the trained target image recognition model to obtain M sample feature matrixes, wherein the target image recognition model is a model obtained by training the image recognition model to be trained by using at least part of sample images in the sample image set.
Optionally, the fifth processing unit includes: and the ninth processing module is used for inputting each sample feature matrix in the M sample feature matrices into the target classification result determining module in the trained target image recognition model to obtain M sample classification weight matrices, wherein the target image recognition model is a model obtained by training an image recognition model to be trained by using at least part of sample images in the sample image set.
Optionally, the fifth processing unit includes: the acquisition module is used for acquiring M iris images, wherein at least part of the iris images in the M iris images comprise corresponding noise information; determining M iris images as M sample images; or obtaining M groups of iris images, wherein at least part of the iris images in the M groups of iris images comprise corresponding noise information, and each group of iris images in the M groups of iris images comprises a plurality of iris images; each of the M sets of iris images is determined to be one of the M sample images.
Optionally, the apparatus further includes: a seventh processing unit, configured to, after determining the confidence coefficient of the target image according to the s confidence coefficient matrices, perform iris segmentation on the target image to obtain f regions if the target image is an iris image, where f is a positive integer greater than or equal to 1 and less than or equal to c: under the condition that the confidence coefficient of the target image is larger than or equal to a preset threshold value and the class represented by the classification result of s pixel points in the target image comprises f classes in c classes, determining that the classification result is the pixel point of each class in the f classes in the s pixel points in the target image, and obtaining f groups of pixel points, wherein the c classes comprise pupils, irises, scleras and skin; and determining f areas as positions of f groups of pixel points, wherein the f areas are in one-to-one correspondence with the f categories.
Optionally, the seventh processing unit includes: a tenth processing module, configured to determine, when the class of the ith group of pixels in the f groups of pixels is the ith class in the f classes, where i is a positive integer greater than or equal to 1 and less than or equal to f, a position of the ith group of pixels as the ith area in the f areas.
With the above apparatus, a target image is input into a target image recognition model trained in advance, and a feature matrix with dimension s×n and a classification weight matrix with dimension s×n representing each pixel point in the target image are obtained in sequence; a confidence matrix of each pixel point in the target image is then determined according to the feature matrix with dimension s×n and the classification weight matrix with dimension s×n, and the confidence of the target image is determined. On this basis, the validity of the classification result of the target image is determined according to the confidence of the target image, so as to assist image segmentation, thereby achieving the technical effect of improving the accuracy of image segmentation.
According to still another aspect of the embodiments of the present application, there is also provided an electronic device for implementing the above image processing method, where the electronic device may be a terminal device shown in fig. 10. The present embodiment is described taking the electronic device as a background device as an example. As shown in fig. 10, the electronic device comprises a memory 1002 and a processor 1004, the memory 1002 having stored therein a computer program, the processor 1004 being arranged to perform the steps of any of the method embodiments described above by means of the computer program.
Alternatively, in this embodiment, the electronic device may be located in at least one network device of a plurality of network devices of the computer network.
Alternatively, in the present embodiment, the above-mentioned processor may be configured to execute the following steps S1 to S4 by a computer program.
S1, extracting features of a target image to obtain a feature matrix with dimensions of S multiplied by n, wherein S represents the number of pixel points in the target image, n represents the number of features of each pixel point in the target image, and S and n are positive integers which are larger than or equal to 2.
S2, determining a classification weight vector with dimension 1×n for each pixel point in the target image according to the feature matrix with dimension s×n, to obtain a classification weight matrix with dimension s×n, where the classification weight vector of the kth pixel point in the target image is, in the case that the kth pixel point is determined to belong to the y_k-th category of the preset c categories, the classification weight vector corresponding to the y_k-th category among the c classification weight vectors corresponding to the kth pixel point, c is a positive integer greater than or equal to 2, k is a positive integer greater than or equal to 1 and less than or equal to s, and y_k is a positive integer greater than or equal to 1 and less than or equal to c.
S3, determining a confidence coefficient matrix with the dimension of n x n of each pixel point in the target image according to the feature matrix with the dimension of S x n and the classification weight matrix with the dimension of S x n, and obtaining S confidence coefficient matrices.
S4, determining the confidence coefficient of the target image according to the S confidence coefficient matrixes.
Alternatively, it will be appreciated by those skilled in the art that the structure shown in fig. 10 is merely illustrative, and the electronic device includes, but is not limited to, a mobile phone, a computer, an intelligent voice interaction device, an intelligent home appliance, and a vehicle-mounted terminal. Fig. 10 is not limited to the structure of the electronic device and the electronic apparatus described above. For example, the electronics may also include more or fewer components (e.g., network interfaces, etc.) than shown in FIG. 10, or have a different configuration than shown in FIG. 10.
The memory 1002 may be configured to store software programs and modules, such as program instructions/modules corresponding to the apparatus and the determination of the image confidence in the embodiments of the present application, and the processor 1004 executes the software programs and modules stored in the memory 1002 to perform various functional applications and data processing, that is, implement the above-mentioned image processing method or iris segmentation method. The memory 1002 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid state memory. In some examples, the memory 1002 may further include memory located remotely from the processor 1004, which may be connected to the terminal via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof. The memory 1002 may be, but is not limited to, storing a target image, a feature matrix with dimensions s×n, s confidence matrices, and the like. As an example, as shown in fig. 10, the memory 1002 may include, but is not limited to, the feature extraction unit 902, the first processing unit 904, the second processing unit 906, and the third processing unit 908 in the image processing apparatus. In addition, other module units in the image processing apparatus or other module units in the iris segmentation apparatus may be included, but are not limited to, and are not described in detail in this example.
Optionally, the transmission device 1006 is configured to receive or transmit data via a network. Specific examples of the network described above may include wired networks and wireless networks. In one example, the transmission means 1006 includes a network adapter (Network Interface Controller, NIC) that can be connected to other network devices and routers via a network cable to communicate with the internet or a local area network. In one example, the transmission device 1006 is a Radio Frequency (RF) module for communicating with the internet wirelessly.
In addition, the electronic device further includes: a display 1008 for displaying the target image; and a connection bus 1010 for connecting the respective module parts in the above-described electronic apparatus.
In other embodiments, the target terminal or the server may be a node in a distributed system, where the distributed system may be a blockchain system, and the blockchain system may be a distributed system formed by connecting the plurality of nodes through a network communication. Among them, the nodes may form a Peer-To-Peer (Peer To Peer) network, and any type of computing device, such as a server, a terminal, etc., may become a node in the blockchain system by joining the Peer-To-Peer network.
According to yet another aspect of the present application, a computer program product or computer program is provided, comprising computer instructions stored in a computer readable storage medium. The computer instructions are read from a computer readable storage medium by a processor of a computer device, and executed by the processor, to cause the computer device to perform the image processing method provided in various alternative implementations of the server verification process described above, wherein the computer program is configured to perform the steps of any of the method embodiments described above when run.
Alternatively, in the present embodiment, the above-described computer-readable storage medium may be configured to store a computer program for executing the following steps.
S1, extracting features of a target image to obtain a feature matrix with dimensions of S multiplied by n, wherein S represents the number of pixel points in the target image, n represents the number of features of each pixel point in the target image, and S and n are positive integers which are larger than or equal to 2.
S2, determining a classification weight vector with dimension 1×n for each pixel point in the target image according to the feature matrix with dimension s×n, to obtain a classification weight matrix with dimension s×n, where the classification weight vector of the kth pixel point in the target image is, in the case that the kth pixel point is determined to belong to the y_k-th category of the preset c categories, the classification weight vector corresponding to the y_k-th category among the c classification weight vectors corresponding to the kth pixel point, c is a positive integer greater than or equal to 2, k is a positive integer greater than or equal to 1 and less than or equal to s, and y_k is a positive integer greater than or equal to 1 and less than or equal to c.
S3, determining a confidence coefficient matrix with the dimension of n x n of each pixel point in the target image according to the feature matrix with the dimension of S x n and the classification weight matrix with the dimension of S x n, and obtaining S confidence coefficient matrices.
S4, determining the confidence coefficient of the target image according to the S confidence coefficient matrixes.
Alternatively, in this embodiment, it will be understood by those skilled in the art that all or part of the steps in the methods of the above embodiments may be performed by a program for instructing the target terminal related hardware, and the program may be stored in a computer readable storage medium, where the storage medium may include: flash disk, read-Only Memory (ROM), random-access Memory (Random Access Memory, RAM), magnetic or optical disk, and the like.
The foregoing embodiment numbers of the present application are merely for describing, and do not represent advantages or disadvantages of the embodiments.
The integrated units in the above embodiments may be stored in the above-described computer-readable storage medium if implemented in the form of software functional units and sold or used as separate products. Based on such understanding, the technical solution of the present application may be embodied in essence or a part contributing to the prior art or all or part of the technical solution in the form of a software product stored in a storage medium, including several instructions to cause one or more computer devices (which may be personal computers, servers or network devices, etc.) to perform all or part of the steps of the methods of the various embodiments of the present application.
In the foregoing embodiments of the present application, the descriptions of the embodiments are emphasized, and for a portion of this disclosure that is not described in detail in this embodiment, reference is made to the related descriptions of other embodiments.
In several embodiments provided in the present application, it should be understood that the disclosed client may be implemented in other manners. The above-described embodiments of the apparatus are merely exemplary, and are merely a logical functional division, and there may be other manners of dividing the apparatus in actual implementation, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be through some interfaces, units or modules, or may be in electrical or other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The foregoing is merely a preferred embodiment of the present application and it should be noted that modifications and adaptations to those skilled in the art may be made without departing from the principles of the present application and are intended to be comprehended within the scope of the present application.

Claims (15)

1. An image processing method, comprising:
extracting features of a target image to obtain a feature matrix with dimensions of s multiplied by n, wherein s represents the number of pixel points in the target image, n represents the number of features of each pixel point in the target image, and s and n are positive integers larger than or equal to 2;
according to the feature matrix with the dimension s multiplied by n, determining a classification weight vector with the dimension 1 multiplied by n of each pixel point in the target image to obtain a classification weight matrix with the dimension s multiplied by n, wherein the classification weight vector of the kth pixel point in the target image is, in the case that the kth pixel point is determined to belong to the y_k-th category of the preset c categories, the classification weight vector corresponding to the y_k-th category among the c classification weight vectors corresponding to the kth pixel point, c is a positive integer greater than or equal to 2, k is a positive integer greater than or equal to 1 and less than or equal to s, and y_k is a positive integer greater than or equal to 1 and less than or equal to c;
determining a confidence coefficient matrix with the dimension of n x n of each pixel point in the target image according to the feature matrix with the dimension of s x n and the classification weight matrix with the dimension of s x n to obtain s confidence coefficient matrices;
and determining the confidence coefficient of the target image according to the s confidence coefficient matrixes.
2. The method according to claim 1, wherein determining a classification weight vector with a dimension of 1×n for each pixel point in the target image according to the feature matrix with a dimension of s×n, to obtain a classification weight matrix with a dimension of s×n includes:
determining a classification result of each pixel point in the target image according to the feature matrix with the dimension of s×n and a predetermined classification weight matrix with the dimension of c×n to obtain s classification results, wherein the classification result of the kth pixel point in the target image is used for indicating that the kth pixel point belongs to the y_k-th category of the c categories, the classification result of the kth pixel point in the target image is a classification result determined according to the feature vector with dimension 1×n representing the kth pixel point and the classification weight vector with dimension 1×n corresponding to the y_k-th category, the feature matrix with dimension s×n comprises the feature vector with dimension 1×n representing the kth pixel point, and the classification weight matrix with dimension c×n comprises the classification weight vector with dimension 1×n corresponding to the y_k-th category;
determining a classification weight vector with a dimension of 1×n corresponding to each classification result of the s classification results as the classification weight vector with a dimension of 1×n of each pixel point in the target image, wherein the classification weight vector with a dimension of 1×n corresponding to the kth classification result of the s classification results is the classification weight vector with a dimension of 1×n corresponding to the y_k-th category, and the kth classification result is the classification result of the kth pixel point in the target image.
3. The method according to claim 2, wherein determining the classification result of each pixel point in the target image according to the feature matrix with the dimension s×n and the predetermined classification weight matrix with the dimension c×n, to obtain s classification results includes:
inputting the feature matrix with the dimension of s×n into a target classification result determining module in a trained target image recognition model to obtain s×c candidate classification results, wherein the target classification result determining module determines the s×c candidate classification results according to the feature matrix with the dimension of s×n and a predetermined classification weight matrix with the dimension of c×n, the c candidate classification results corresponding to the kth pixel point among the s×c candidate classification results represent the probabilities that the kth pixel point belongs to each of the c categories, and the c candidate classification results corresponding to the kth pixel point are respectively determined according to the feature vector with a dimension of 1×n used for representing the kth pixel point and the classification weight matrix with the dimension of c×n;
determining, as the classification result of each pixel point in the target image, the candidate classification result representing the highest probability among the c candidate classification results corresponding to that pixel point in the s×c candidate classification results, wherein the classification result of the kth pixel point is the candidate classification result representing the highest probability among the c candidate classification results corresponding to the kth pixel point, that is, the probability that the kth pixel point belongs to the y_k-th category is the largest.
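A hedged sketch of claims 2 and 3: the candidate classification results are assumed here to be inner-product scores of each pixel point's feature vector with the c category weight vectors, read as probabilities via a softmax; the claims only require that the highest-probability category is selected and that its weight vector is gathered per pixel point.

```python
import numpy as np

def select_pixel_weight_vectors(features: np.ndarray,
                                category_weights: np.ndarray):
    """Pick, for every pixel point, the weight vector of its predicted category.

    features         : (s, n) feature matrix
    category_weights : (c, n) predetermined classification weight matrix

    Returns (labels, pixel_weights):
      labels        : (s,)  index y_k of the winning category per pixel point
      pixel_weights : (s, n) classification weight matrix of the image
    """
    scores = features @ category_weights.T                 # (s, c) candidate results
    # Reading the scores as probabilities via a softmax is an assumption,
    # not something the claims prescribe.
    probs = np.exp(scores - scores.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    labels = probs.argmax(axis=1)                          # highest-probability category
    pixel_weights = category_weights[labels]               # gather the y_k-th weight vector
    return labels, pixel_weights
```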
4. The method of claim 1, wherein determining the confidence level of the target image based on the s confidence matrices comprises:
performing a trace operation on the confidence coefficient matrix corresponding to each pixel point among the s confidence coefficient matrixes to obtain s values;
and performing summation operation on the s values to obtain the confidence coefficient of the target image.
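A purely illustrative numerical check of this trace-and-sum step, with hypothetical values for s = 2 pixel points and n = 2:

```python
import numpy as np

# Two hypothetical 2 x 2 confidence matrices, one per pixel point (s = 2, n = 2).
conf = np.array([[[0.9, 0.1],
                  [0.2, 0.8]],
                 [[0.6, 0.3],
                  [0.4, 0.7]]])

traces = np.trace(conf, axis1=1, axis2=2)   # [0.9 + 0.8, 0.6 + 0.7] = [1.7, 1.3]
image_confidence = traces.sum()             # 1.7 + 1.3 = 3.0
```

Under these made-up values the confidence of the two-pixel image would come out as 3.0.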
5. The method according to claim 1, wherein determining a confidence matrix with dimension n x n for each pixel point in the target image according to the feature matrix with dimension s x n and the classification weight matrix with dimension s x n, to obtain s confidence matrices, includes:
inputting the feature matrix with the dimension of s×n and the classification weight matrix with the dimension of s×n into a trained target confidence estimation model to obtain the s confidence matrixes, wherein the target confidence estimation model is a model obtained by training a confidence estimation model to be trained by using a sample image set.
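The claims do not fix the internal form of the target confidence estimation model; the stand-in below only demonstrates the input and output shapes, using a per-pixel outer product that is not the claimed model.

```python
import numpy as np

def toy_confidence_model(features: np.ndarray,
                         pixel_weights: np.ndarray) -> np.ndarray:
    """Placeholder with the right shapes, not the claimed model.

    Maps an (s, n) feature matrix and an (s, n) classification weight matrix
    to an (s, n, n) stack of per-pixel confidence matrices.
    """
    # For each pixel point k, take the outer product of its feature vector and
    # its classification weight vector, giving an n x n matrix per pixel point.
    return np.einsum('si,sj->sij', features, pixel_weights)
```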
6. The method of claim 5, wherein the method further comprises:
under the condition that the sample image set comprises M sample images, extracting features of each sample image in the M sample images to obtain M sample feature matrixes, wherein the number of pixel points in each sample image is s, the t sample feature matrix in the M sample feature matrixes is a sample feature matrix with dimensions of s multiplied by n, which is obtained by extracting features of the t sample image in the M sample images, M is a positive integer greater than or equal to 2, and t is a positive integer greater than or equal to 1 and less than or equal to M;
according to the M sample feature matrices, determining a sample classification weight matrix corresponding to each sample image in the M sample images to obtain M sample classification weight matrices, wherein the t-th sample classification weight matrix in the M sample classification weight matrices is a sample classification weight matrix with a dimension of s×n determined according to the t-th sample feature matrix, the t-th sample classification weight matrix includes a sample classification weight vector with a dimension of 1×n for each pixel point in the t-th sample image, and, in the case where the p-th pixel point in the t-th sample image is determined to belong to the y_p-th category, the sample classification weight vector of the p-th pixel point is the sample classification weight vector corresponding to the y_p-th category among the c sample classification weight vectors corresponding to the p-th pixel point, p is a positive integer greater than or equal to 1 and less than or equal to s, and y_p is a positive integer greater than or equal to 1 and less than or equal to c;
and training the confidence coefficient estimation model to be trained according to the M sample feature matrixes and the M sample classification weight matrixes to obtain the target confidence coefficient estimation model.
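A rough sketch of assembling the M sample feature matrices and M sample classification weight matrices of claim 6; `extract_features` and `category_weights` are assumed to come from an already trained image recognition model (as in claims 9 and 10), and their exact form is not specified here.

```python
import numpy as np

def build_training_matrices(sample_images, extract_features, category_weights):
    """Assemble per-sample feature and classification weight matrices.

    sample_images    : iterable of M sample images, each with s pixel points
    extract_features : assumed feature extractor returning an (s, n) matrix
    category_weights : assumed (c, n) classification weight matrix of a
                       trained image recognition model
    """
    feature_mats, weight_mats = [], []
    for image in sample_images:                                  # M sample images
        feats = extract_features(image)                          # t-th sample feature matrix, (s, n)
        labels = (feats @ category_weights.T).argmax(axis=1)     # y_p per pixel point
        weight_mats.append(category_weights[labels])             # t-th sample classification weight matrix, (s, n)
        feature_mats.append(feats)
    return feature_mats, weight_mats
```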
7. The method of claim 6, wherein training the confidence estimation model to be trained based on the M sample feature matrices and the M sample classification weight matrices to obtain the target confidence estimation model comprises:
performing an mth round of training on the confidence estimation model to be trained by the following steps, wherein m is a positive integer greater than or equal to 1:
inputting the q-th sample feature matrix and the q-th sample classification weight matrix used in the mth training into the confidence estimation model of the mth training to obtain a sample confidence matrix with the dimension of n×n for each pixel point in the q-th sample image, so as to obtain s sample confidence matrices, wherein q is a positive integer greater than or equal to 1 and less than or equal to M, the M sample feature matrices include the q-th sample feature matrix, the M sample classification weight matrices include the q-th sample classification weight matrix, and the M sample images include the q-th sample image;
determining a loss value of the mth training according to the q-th sample feature matrix, the q-th sample classification weight matrix and the s sample confidence matrices;
under the condition that the loss value of the mth training does not meet a preset convergence condition, adjusting parameters in the confidence estimation model of the mth training to obtain a confidence estimation model of the (m+1)th training;
and ending training when the loss value of the mth training meets the convergence condition, and determining the confidence coefficient estimation model of the mth training as the target confidence coefficient estimation model.
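A schematic of the round-based training of claim 7, with the per-image loss, the parameter update and the convergence threshold all left as placeholders (the claims specify only the loop structure and the convergence check):

```python
def train_confidence_model(model, feature_mats, weight_mats,
                           loss_fn, update_fn, tol=1e-4, max_rounds=1000):
    """Round-based training sketch.

    model     : confidence estimation model to be trained, called as
                model(feature_matrix, weight_matrix) -> (s, n, n) confidences
    loss_fn   : placeholder for the per-image loss of claim 8
    update_fn : placeholder for the parameter adjustment between rounds
    tol       : hypothetical convergence threshold on the loss value
    """
    M = len(feature_mats)
    for m in range(1, max_rounds + 1):                    # m-th round of training
        q = (m - 1) % M                                   # pick the q-th sample image
        conf = model(feature_mats[q], weight_mats[q])     # s sample confidence matrices
        loss = loss_fn(feature_mats[q], weight_mats[q], conf)
        if loss < tol:                                    # convergence condition met
            return model                                  # target confidence estimation model
        model = update_fn(model, loss)                    # model for the (m+1)-th round
    return model
```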
8. The method of claim 7, wherein the determining the loss value for the mth round of training based on the q-th sample feature matrix, the q-th sample classification weight matrix, and the s sample confidence matrices comprises:
in the case that the q-th sample feature matrix includes a sample feature vector with a dimension of 1×n corresponding to each pixel point in the q-th sample image, the q-th sample classification weight matrix includes a sample classification weight vector with a dimension of 1×n corresponding to each pixel point in the q-th sample image, and the s sample confidence matrices include a sample confidence matrix with a dimension of n×n corresponding to each pixel point in the q-th sample image, determining a loss value corresponding to each pixel point in the q-th sample image according to the sample feature vector with a dimension of 1×n corresponding to that pixel point, the sample classification weight vector with a dimension of 1×n corresponding to that pixel point, and the sample confidence matrix with a dimension of n×n corresponding to that pixel point, so as to obtain s loss values;
And determining the loss value of the mth training according to the s loss values.
9. The method of claim 6, wherein the feature extracting each of the M sample images to obtain M sample feature matrices comprises:
inputting each sample image in the M sample images into a target feature extraction module in a trained target image recognition model to obtain the M sample feature matrixes, wherein the target image recognition model is a model obtained by training the image recognition model to be trained by using at least part of sample images in the sample image set.
10. The method of claim 6, wherein determining a sample classification weight matrix corresponding to each of the M sample images according to the M sample feature matrices, to obtain M sample classification weight matrices, comprises:
inputting each sample feature matrix in the M sample feature matrices into a target classification result determining module in a trained target image recognition model to obtain M sample classification weight matrices, wherein the target image recognition model is a model obtained by training the image recognition model to be trained by using at least part of sample images in the sample image set.
11. The method according to any one of claims 1 to 10, wherein after determining the confidence level of the target image from the s confidence matrices, the method further comprises:
in the case that the target image is an iris image, iris segmentation is performed on the target image to obtain f regions, wherein f is a positive integer greater than or equal to 1 and less than or equal to c, by:
when the confidence coefficient of the target image is greater than or equal to a preset threshold value, and the categories represented by the classification results of the s pixel points in the target image include f categories of the c categories, determining, among the s pixel points in the target image, the pixel points whose classification result indicates each of the f categories, so as to obtain f groups of pixel points, wherein the c categories include pupil, iris, sclera and skin;
and determining the positions of the f groups of pixel points as the f regions, wherein the f regions are in one-to-one correspondence with the f categories.
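An illustrative sketch of the iris segmentation step of claim 11, assuming per-pixel category labels are already available and using a hypothetical preset threshold:

```python
import numpy as np

CATEGORIES = ['pupil', 'iris', 'sclera', 'skin']   # the c = 4 categories named in claim 11

def segment_iris_regions(labels: np.ndarray, confidence: float,
                         threshold: float) -> dict:
    """Group pixel points by predicted category when the image is trusted.

    labels     : (s,) predicted category index per pixel point
    confidence : confidence coefficient of the target image
    threshold  : hypothetical preset threshold value
    """
    if confidence < threshold:
        return {}                                        # image not confident enough, no regions
    return {CATEGORIES[c]: np.flatnonzero(labels == c)   # f groups of pixel points,
            for c in np.unique(labels)}                  # one region per present category
```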
12. The method of claim 11, wherein the determining the positions of the f groups of pixel points as the f regions comprises:
and determining the position of the ith group of pixel points as the ith region of the f regions under the condition that the category of the ith group of pixel points among the f groups of pixel points is the ith category of the f categories, wherein i is a positive integer greater than or equal to 1 and less than or equal to f.
13. An image processing apparatus, comprising:
the feature extraction unit is used for extracting features of a target image to obtain a feature matrix with a dimension of s multiplied by n, wherein s represents the number of pixel points in the target image, n represents the number of features of each pixel point in the target image, and s and n are positive integers which are larger than or equal to 2;
a first processing unit, configured to determine a classification weight vector with a dimension of 1×n for each pixel point in the target image according to the feature matrix with the dimension of s×n, to obtain a classification weight matrix with a dimension of s×n, wherein, in the case where the kth pixel point in the target image is determined to belong to the y_k-th category of preset c categories, the classification weight vector of the kth pixel point is the classification weight vector corresponding to the y_k-th category among the c classification weight vectors corresponding to the kth pixel point, c is a positive integer greater than or equal to 2, k is a positive integer greater than or equal to 1 and less than or equal to s, and y_k is a positive integer greater than or equal to 1 and less than or equal to c;
the second processing unit is used for determining a confidence coefficient matrix with the dimension of n multiplied by n of each pixel point in the target image according to the feature matrix with the dimension of s multiplied by n and the classification weight matrix with the dimension of s multiplied by n to obtain s confidence coefficient matrices;
and the third processing unit is used for determining the confidence coefficient of the target image according to the s confidence coefficient matrixes.
14. A computer-readable storage medium, characterized in that the computer-readable storage medium comprises a stored program, wherein the program is executable by a terminal device or a computer to perform the method of any one of claims 1 to 12.
15. An electronic device comprising a memory and a processor, characterized in that the memory has stored therein a computer program, the processor being arranged to execute the method according to any of the claims 1 to 12 by means of the computer program.
CN202311313702.0A 2023-10-11 2023-10-11 Image processing method and device, storage medium and electronic equipment Active CN117079058B (en)

Publications (2)

Publication Number Publication Date
CN117079058A CN117079058A (en) 2023-11-17
CN117079058B (en) 2024-01-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant