CN117115900B - Image segmentation method, device, equipment and storage medium - Google Patents

Image segmentation method, device, equipment and storage medium

Info

Publication number: CN117115900B
Authority: CN (China)
Prior art keywords: iris, image, segmentation, subgraph, model
Legal status: Active
Application number: CN202311370361.0A
Other languages: Chinese (zh)
Other versions: CN117115900A (en)
Inventors: 许剑清, 姚炜鹏, 张菁芸
Current Assignee: Tencent Technology Shenzhen Co Ltd
Original Assignee: Tencent Technology Shenzhen Co Ltd
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202311370361.0A
Publication of CN117115900A
Application granted
Publication of CN117115900B

Classifications

    • G06V 40/193: Eye characteristics, e.g. of the iris; preprocessing, feature extraction
    • G06V 10/774: Arrangements for image or video recognition or understanding using pattern recognition or machine learning; generating sets of training patterns, bootstrap methods (e.g. bagging or boosting)
    • G06V 40/18: Eye characteristics, e.g. of the iris

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • Ophthalmology & Optometry (AREA)
  • Human Computer Interaction (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

An image segmentation method, an image segmentation apparatus, image segmentation equipment and a storage medium relate to the technical fields of image processing and artificial intelligence, and can be applied to various scenes such as cloud technology, artificial intelligence, intelligent transportation and assisted driving. In the application, an iris region in an image to be processed is obtained; the image to be processed is sampled based on the iris region to obtain an iris subgraph; attribute recognition is performed on each pixel point in the iris subgraph to obtain a corresponding recognition result, where the recognition result characterizes the pixel point as iris or noise; the recognition result of each pixel point in the iris subgraph is mapped to the corresponding pixel point in the image to be processed; and the iris and the noise are segmented based on the mapping result. The iris subgraph is obtained from an iris region located by iris positioning, and the segmentation result is determined from the recognition result of the iris subgraph. Because the iris subgraph is smaller than the original image and excludes most irrelevant interference noise, both the segmentation speed and the segmentation accuracy are improved.

Description

Image segmentation method, device, equipment and storage medium
Technical Field
The present disclosure relates to the field of image processing technologies, and in particular, to an image segmentation method, apparatus, device, and storage medium.
Background
With the development of technology, determining the operation instruction triggered by an object through iris recognition has matured, and iris segmentation is an essential technology for iris recognition. For example, when iris recognition is used to determine a device unlocking operation triggered by an object, or when iris recognition is used to estimate the object's line-of-sight direction and a game screen update operation triggered by the object is determined based on that direction, iris segmentation is first performed on the image to obtain the iris, and further recognition is then performed based on the iris.
Iris segmentation is currently realized with iris segmentation models, but current iris segmentation models cannot balance segmentation speed and segmentation accuracy; namely: when the segmentation accuracy is high, the segmentation delay is large, and when the segmentation delay is small, the segmentation accuracy is low.
Therefore, how to improve the segmentation speed and accuracy is a technical problem to be solved at present.
Disclosure of Invention
The embodiment of the application provides an image segmentation method, device, equipment and storage medium, which are used for improving segmentation speed and accuracy.
In a first aspect, an embodiment of the present application provides an image segmentation method, including:
iris positioning is carried out on the image to be processed, and an iris area in the image to be processed is obtained;
Sampling the image to be processed based on the iris region to obtain an iris subgraph;
respectively carrying out attribute identification on each pixel point in the iris subgraph to obtain a corresponding identification result;
the recognition result represents that the pixel point is an iris or noise, and the noise comprises other eye structures and background noise;
mapping the identification results of each pixel point in the iris subgraph to corresponding pixel points in the image to be processed;
based on the mapping result, iris and noise are segmented within the image to be processed.
In a second aspect, an embodiment of the present application provides an image segmentation apparatus, including:
the positioning unit is used for positioning the iris of the image to be processed to obtain an iris area in the image to be processed;
the sampling unit is used for sampling the image to be processed based on the iris region to obtain an iris subgraph;
the identification unit is used for respectively carrying out attribute identification on each pixel point in the iris subgraph to obtain a corresponding identification result; the recognition result represents that the pixel point is an iris or noise, and the noise comprises other eye structures and background noise;
the mapping unit is used for mapping the identification results of each pixel point in the iris subgraph to corresponding pixel points in the image to be processed;
And the segmentation unit is used for segmenting the iris and the noise in the image to be processed based on the mapping result.
In one possible implementation, the sampling unit is specifically configured to:
based on the iris region, determining a sampling transformation matrix according to the initial scaling coefficient and a preset size;
and based on the sampling transformation matrix, sampling the image to be processed to obtain an iris subgraph.
In one possible implementation, the sampling unit is specifically configured to:
determining an initial offset based on a target size of the iris region and the initial scaling factor;
determining a transformation scaling factor based on the initial offset and a preset size; and determining a pixel offset based on the initial offset and the center point coordinates of the iris region;
a sampling transformation matrix is determined based on the transformation scaling coefficients and the pixel offset.
In one possible implementation, the initial scaling factor is preset, or the initial scaling factor is obtained by a target factor recognition model;
the target coefficient identification model is obtained by training with a prediction sample set in combination with a pre-training segmentation model, and each prediction sample in the prediction sample set comprises: a pre-segmented image marked with the iris region, obtained through processing by a detection model.
In one possible implementation, the identification unit is specifically configured to:
respectively carrying out attribute identification on each pixel point in the iris subgraph through a target segmentation model to obtain a corresponding identification result; the target segmentation model is determined based on a pre-training segmentation model obtained by training a prediction sample set;
wherein each prediction sample in the prediction sample set comprises: a pre-segmented image marked with the iris region, obtained through processing by a detection model.
In one possible implementation, the mapping unit is specifically configured to:
determining an inverse transformation matrix of the sampling transformation matrix;
based on the inverse transformation matrix, mapping the identification result to corresponding pixel points in the iris region of the image to be processed, and setting the pixel points in other regions of the image to be processed as background noise.
In one possible implementation, each prediction sample further includes: a first labeling category of each pixel point in the pre-segmentation image; the pre-training segmentation model is obtained by training in the following way:
based on the prediction sample set, performing loop iteration training on the segmentation model to be trained to obtain a pre-training segmentation model, wherein the loop iteration training is performed in a loop iteration process:
Selecting a prediction sample from a prediction sample set, and performing regional sampling on the pre-segmented image according to a preset scaling factor and a preset size based on an iris region in the prediction sample to obtain a prediction subgraph;
inputting the prediction subgraph into a segmentation model to be trained, predicting the initial prediction category of each pixel point in the prediction subgraph, and determining the standard prediction category of each pixel point in the pre-segmentation image according to the initial prediction category;
and carrying out parameter adjustment by adopting a first loss function constructed based on the standard prediction category and the first labeling category.
In one possible implementation, determining the target segmentation model based on the pre-trained segmentation model includes:
performing loop iteration training on the pre-training segmentation model based on the target sample set to obtain a target segmentation model, wherein the loop iteration training is performed in a loop iteration process:
selecting a target sample from the target sample set; wherein the target sample comprises: a target segmentation image marked with the iris region, obtained through detection model processing, and the second labeling category of each pixel point in the target segmentation image;
based on iris areas in the target segmentation images, carrying out area sampling on the target segmentation images according to the random scaling factors and the preset size to obtain target subgraphs; the random scaling factor is randomly selected from the scaling factor interval;
Inputting the target subgraph into a pre-training segmentation model, predicting the initial segmentation class of each pixel point in the target subgraph, and determining the prediction segmentation class of each pixel point in the target segmentation image according to the initial segmentation class;
and performing parameter fine adjustment by adopting a second loss function constructed based on the prediction segmentation category and the second labeling category.
In one possible implementation, the positioning unit is specifically configured to:
iris positioning is carried out on the image to be processed through a target detection model, and an iris area is obtained; the target detection model is obtained by performing loop iteration training on a detection model to be trained based on a detection sample set, wherein the loop iteration training is performed in a loop iteration process:
selecting a detection sample from the detection sample set; wherein detecting the sample comprises: a sample image and an annotation region of the iris in the sample image;
inputting a detection sample into a detection model to be trained, and predicting a prediction area of an iris in a sample image;
and constructing a third loss function based on the labeling region and the prediction region, and performing parameter adjustment using the third loss function.
In a third aspect, embodiments of the present application provide a computing device comprising: a memory and a processor, wherein the memory is used for storing a computer program; and a processor for executing a computer program to implement the steps of the image segmentation method provided in the embodiments of the present application.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium storing a computer program, which when executed by a processor, implements the steps of the image segmentation method provided by embodiments of the present application.
In a fifth aspect, embodiments of the present application provide a computer program product comprising a computer program stored in a computer readable storage medium; when the processor of the computing device reads the computer program from the computer-readable storage medium, the processor executes the computer program, so that the computing device performs the steps of the image segmentation method provided in the embodiments of the present application.
The beneficial effects of the application are as follows:
The embodiment of the application provides an image segmentation method, apparatus, device and storage medium, relating to the fields of image processing technology and artificial intelligence technology, and applicable to various scenes such as cloud technology, artificial intelligence, intelligent transportation and assisted driving. The method segments the image based on an iris subgraph, which improves both the accuracy and the speed of segmentation. In the embodiment of the application, iris positioning is first performed on the image to be processed to obtain an iris region in the image to be processed, and the image to be processed is sampled based on the iris region to obtain an iris subgraph. Then, attribute recognition is performed on the pixel points in the iris subgraph to obtain recognition results, where a recognition result characterizes a pixel point as iris or noise, and the noise comprises other eye structures and background noise. Because the iris region occupies only a small area, sampling the iris subgraph based on the iris region meets the requirement of reducing the amount of calculation and thus improves the segmentation speed; moreover, because the iris subgraph is sampled based on the iris region, most irrelevant interference noise is excluded from the iris subgraph, which reduces the segmentation difficulty and improves the segmentation accuracy. Then, the recognition results of the pixel points in the iris subgraph are mapped to the corresponding pixel points in the image to be processed; finally, based on the mapping result, the iris and the noise are segmented in the image to be processed, thereby realizing the segmentation of the image to be processed.
Additional features and advantages of the application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the application. The objectives and other advantages of the application will be realized and attained by the structure particularly pointed out in the written description and claims thereof as well as the appended drawings.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed in the description of the embodiments are briefly described below. It is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained from these drawings by a person skilled in the art without inventive effort.
Fig. 1 is a schematic view of an application scenario provided in an embodiment of the present application;
fig. 2 is a schematic diagram of another application scenario provided in an embodiment of the present application;
fig. 3 is a schematic structural diagram of a detection model according to an embodiment of the present application;
FIG. 4 is a flowchart of a method for training a detection model according to an embodiment of the present application;
FIG. 5 is a schematic diagram of detection model training according to an embodiment of the present disclosure;
Fig. 6 is a schematic structural diagram of a segmentation model according to an embodiment of the present application;
FIG. 7 is a flowchart of a method for pre-training a segmentation model to be trained according to an embodiment of the present application;
FIG. 8 is a schematic diagram of a prediction subgraph obtained for a pre-segmented image according to an embodiment of the present application;
FIG. 9 is a flowchart of a method for fine tuning a pre-trained segmentation model according to an embodiment of the present application;
FIG. 10 is a schematic diagram of a segmentation model training provided in an embodiment of the present application;
FIG. 11 is a schematic diagram of a joint training of a coefficient identification model and a pre-training segmentation model according to an embodiment of the present application;
FIG. 12 is a schematic diagram of a model deployment application provided in an embodiment of the present application;
FIG. 13 is a schematic diagram of another model deployment application provided in an embodiment of the present application;
fig. 14 is a flowchart of an image segmentation method according to an embodiment of the present application;
FIG. 15 is a diagram of an embodiment of image segmentation according to an embodiment of the present application;
FIG. 16 is a diagram of an embodiment of image segmentation according to an embodiment of the present application;
fig. 17 is a block diagram of an image segmentation apparatus according to an embodiment of the present application;
fig. 18 is a block diagram of a computing device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantageous effects of the present application clearer, the technical solutions in the embodiments of the present application are described below clearly and completely with reference to the drawings in the embodiments of the present application. It is obvious that the described embodiments are only some, but not all, embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the present disclosure without inventive effort fall within the scope of the present disclosure.
Some of the terms in the embodiments of the present application are explained below to facilitate understanding by those skilled in the art.
The iris is part of the structure of the eye and can be used for identity recognition and line-of-sight recognition.
Other eye structures are the structures of the eye other than the iris, for example: the pupil, the sclera, and the periocular skin.
Fine tuning refers to further training a pre-trained model on a specific task so that the model is adapted to that task. The pre-training segmentation model in the embodiments of the present application is such a pre-trained model; after parameter adjustment by the method in the embodiments of the present application, a target segmentation model can be obtained and used for the task of performing attribute recognition on each pixel point in the iris subgraph.
The word "exemplary" is used hereinafter to mean "serving as an example, embodiment, or illustration. Any embodiment described as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
The terms "first," "second," and the like herein are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or the number of technical features indicated. Thus, a feature defined by "first" or "second" may explicitly or implicitly include one or more such features. In the description of the embodiments of the present application, unless otherwise indicated, "a plurality" means two or more.
With the development of technology, determining the operation instruction triggered by an object through iris recognition has matured, and iris segmentation is an essential technology for iris recognition. For example, when iris recognition is used to determine a device unlocking operation triggered by an object, or when iris recognition is used to estimate the object's line-of-sight direction and a game screen update operation triggered by the object is determined based on that direction, iris segmentation is first performed on the image to obtain the iris, and further recognition is then performed based on the iris.
Iris segmentation is currently realized with iris segmentation models, but current iris segmentation models cannot balance segmentation speed and segmentation accuracy; namely: when the segmentation accuracy is high, the segmentation delay is large, and when the segmentation delay is small, the segmentation accuracy is low.
Illustratively, the related art either segments the original image after scaling it, or segments the original image with a dual-branch model.
When the original image is segmented by scaling, the original image is first scaled to a certain size, the scaled image is then segmented, and the segmentation result is finally mapped back to the original image size. Reducing the input size can reduce the time consumed by the model; however, because the segmentation result matches the input size and must be mapped back to the original size, the mapping produces pronounced jagged edges, which reduces the segmentation accuracy. Meanwhile, because the original image is large, the large scaling ratio blurs the details of the iris region, the segmentation model cannot accurately distinguish the blurred details, and the segmentation accuracy is reduced.
When the original image is segmented with a dual-branch model, the model structure consists of two branches, one dominated by details and the other by categories, which can improve the accuracy of iris segmentation. However, the detail branch in the dual-branch model requires a model structure with a large amount of calculation, so ensuring segmentation accuracy makes the segmentation time long. Meanwhile, the dual-branch model structure is aimed at semantic segmentation tasks with many categories, whereas iris segmentation involves only 4 types of targets, so considerable calculation redundancy exists.
Therefore, how to improve the segmentation speed and accuracy is a technical problem to be solved at present.
In view of this, the embodiments of the present application provide an image segmentation method, apparatus, device and storage medium, relating to the fields of image processing technology and artificial intelligence technology, and applicable to various scenes such as cloud technology, artificial intelligence, intelligent transportation and assisted driving. In the embodiments of the present application, image segmentation is performed based on an iris subgraph, which improves both the accuracy and the speed of segmentation. Firstly, iris positioning is performed on the image to be processed to obtain an iris region in the image to be processed, and the image to be processed is sampled based on the iris region to obtain an iris subgraph. Then, attribute recognition is performed on the pixel points in the iris subgraph to obtain recognition results, where a recognition result characterizes a pixel point as iris or noise, and the noise comprises other eye structures and background noise. Because the iris region occupies only a small area, the iris subgraph sampled based on the iris region meets the requirement of reducing the amount of calculation without any scaling processing, which improves the segmentation speed; moreover, because the iris subgraph is sampled based on the iris region, most irrelevant interference noise is excluded from the iris subgraph, which reduces the segmentation difficulty and improves the segmentation accuracy. Then, the recognition results of the pixel points in the iris subgraph are mapped to the corresponding pixel points in the image to be processed; finally, based on the mapping result, the iris and the noise are segmented in the image to be processed, thereby realizing the segmentation of the image to be processed.
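Before the detailed embodiments, the overall flow can be summarized in the following minimal sketch; all function names are illustrative placeholders for the models and transforms described later, not part of the original disclosure:

```python
def segment_image(image, detect_iris, sample_subgraph, classify_pixels, map_back):
    """End-to-end flow of the method described above.

    detect_iris:      image -> iris region (x0, y0, w, h)      (iris positioning)
    sample_subgraph:  (image, region) -> (subgraph, matrix)    (region sampling)
    classify_pixels:  subgraph -> per-pixel class map          (attribute recognition)
    map_back:         (labels, matrix, size) -> full-size map  (result mapping)
    """
    region = detect_iris(image)
    subgraph, matrix = sample_subgraph(image, region)
    labels = classify_pixels(subgraph)
    full_labels = map_back(labels, matrix, image.shape[:2])
    # pixels labelled as iris vs. noise can now be separated in the original image
    return full_labels
```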
Embodiments of the present application relate to artificial intelligence (Artificial Intelligence, AI) and Machine Learning techniques, designed based on speech technology, natural language processing technology, and Machine Learning (ML) in artificial intelligence.
Artificial intelligence is the theory, method, technique and application system that uses a digital computer or a digital computer-controlled machine to simulate, extend and expand human intelligence, sense the environment, acquire knowledge and use the knowledge to obtain optimal results. In other words, artificial intelligence is an integrated technology of computer science that attempts to understand the essence of intelligence and to produce a new intelligent machine that can react in a similar way to human intelligence.
Artificial intelligence is the study of the design principles and implementation methods of various intelligent machines, so that machines have the functions of perception, reasoning and decision-making. Artificial intelligence techniques mainly include computer vision, natural language processing, machine learning/deep learning, and other major directions. With the research and progress of artificial intelligence technology, artificial intelligence has been developed and applied in many fields, such as smart homes, intelligent customer service, virtual assistants, smart speakers, smart marketing, autonomous driving, robots and smart medical care; it is believed that with the development of technology, artificial intelligence will be applied in more fields and show increasingly important value.
Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, at both the hardware level and the software level. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, pre-training model technologies, operation/interaction systems, mechatronics, and the like. The pre-training model, also called a large model or a foundation model, can be widely applied to downstream tasks in all major directions of artificial intelligence after fine tuning. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.
Computer Vision (CV) is a science that studies how to make machines "see"; more specifically, it replaces human eyes with cameras and computers to recognize, track and measure targets, and further performs graphic processing so that the result becomes an image more suitable for human eyes to observe or for transmission to instruments for detection. As a scientific discipline, computer vision studies related theories and technologies in an attempt to build artificial intelligence systems that can acquire information from images or multidimensional data. Large-model technology has brought important innovation to the development of computer vision; pre-trained models in the vision field such as Swin-Transformer, ViT, V-MoE and MAE can be quickly and widely applied to specific downstream tasks through fine tuning. Computer vision technologies typically include image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D technology, virtual reality, augmented reality, and simultaneous localization and mapping, as well as common biometric technologies such as face recognition and fingerprint recognition.
The application scenario set in the present application is briefly described below. It should be noted that the following scenario is only for illustrating the embodiments of the present application, and is not limiting. In specific implementation, the technical scheme provided by the embodiment of the application can be flexibly applied according to actual needs.
Referring to fig. 1, fig. 1 is a schematic view of an application scenario provided in an embodiment of the present application. The application scene comprises a target object and terminal equipment, wherein the target object wears the terminal equipment; the terminal equipment is Virtual Reality (VR)/augmented Reality (Augmented Reality, AR) equipment; the terminal device may acquire an eye image of the target object, and may perform iris segmentation on the acquired eye image.
In a possible application scenario, when the target object wears the terminal device to play a game, the terminal device acquires an eye image of the target object, performs iris segmentation on the acquired eye image by using the image segmentation method provided by the embodiments of the present application to obtain an iris segmentation result, estimates the line-of-sight direction of the target object from the iris segmentation result, and, according to the estimation result, presents the game picture in that line-of-sight direction to the target object on the terminal device, so as to improve the target object's experience with the terminal device.
Referring to fig. 2, another application scenario is provided in the embodiment of the present application, where the application scenario includes a terminal device 210 and a server 220, and communication between the terminal device 210 and the server 220 may be performed through a communication network.
In an alternative embodiment, the communication network may be a wired network or a wireless network. Accordingly, the terminal device 210 and the server 220 may be directly or indirectly connected through wired or wireless communication. For example, the terminal device 210 may be indirectly connected to the server 220 through a wireless access point, or the terminal device 210 may be directly connected to the server 220 through the internet, which is not limited herein.
The terminal device 210 includes, but is not limited to, a mobile phone, a tablet computer, a notebook computer, a desktop computer, an electronic book reader, an intelligent voice interaction device, an intelligent home appliance, a vehicle-mounted terminal, and the like; various clients can be installed on the terminal device, and the clients support the iris segmentation function.
The server 220 is a backend server corresponding to a client installed in the terminal apparatus 210. The server 220 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (Content Delivery Network, CDN), basic cloud computing services such as big data and artificial intelligence platforms, and the like.
In a possible application scenario, the terminal device 210 acquires an iris image of a target object and uploads the acquired iris image to the server 220 for verification. After receiving the iris image, the server 220 segments the iris image by using the image segmentation method provided by the embodiments of the present application to obtain an iris segmentation result, performs verification based on the iris segmentation result, and, after determining that the verification passes, returns the verification result to the terminal device 210; the terminal device 210 receives the verification result and performs the next operation, for example: logging in to a certain application.
The illustration in fig. 2 is merely exemplary, and the number of terminal devices 210 and servers 220 is not limited in practice, and is not specifically limited in the embodiments of the present application. In the embodiment, when the number of servers 220 is plural, plural servers 220 may be configured as a blockchain, and the servers 220 are nodes on the blockchain.
It should be noted that, the image segmentation method and the model training method involved in the image segmentation process in the embodiment of the present application may be performed by a computing device, which may be the server 220 or the terminal device 210, that is, the method may be performed by the server 220 or the terminal device 210 alone, or may be performed by both the server 220 and the terminal device 210 together.
In order to further explain the technical solutions provided in the embodiments of the present application, the following takes the terminal device performing the method alone as an example, and describes, with reference to the accompanying drawings, the image segmentation method and the training method of the models used in the image segmentation process according to exemplary embodiments of the present application. It should be noted that the above application scenarios are only shown for the convenience of understanding the spirit and principles of the present application, and the embodiments of the present application are not limited in any way in this respect. Furthermore, although the operations of the methods of the present application are depicted in the drawings in a particular order, this does not require or suggest that these operations must be performed in that particular order, or that all of the illustrated operations must be performed, to achieve the desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps may be combined into one step, and/or one step may be decomposed into multiple steps.
In the embodiment of the application, in order to improve the accuracy and the speed of iris segmentation, an implementation manner for segmenting an image to be processed based on an iris subgraph is provided. Therefore, when the image is segmented, the iris area is determined through the target detection model to determine the iris subgraph according to the iris area, then the iris subgraph is segmented through the target segmentation model to obtain the segmentation result of the iris subgraph, and finally the segmentation result of the image to be processed is determined according to the segmentation result of the iris subgraph.
Therefore, in order to ensure the accuracy of the iris subgraph and the accuracy of the segmentation result, the detection model and the segmentation model should be trained first, and then the trained target detection model and the trained target segmentation model should be applied to segment the image to be processed.
The overall implementation of the embodiments of the present application is described below in terms of the model training process and the model application process, respectively.
The training process performs repeated loop iterative training on a model to be trained using training samples, and mainly comprises a model design stage, a data preparation stage and an iterative training stage. The training processes of the detection model and the segmentation model according to the embodiments of the present application are described below, beginning with the detection model.
1. Model design stage
Referring to fig. 3, fig. 3 is a schematic structural diagram of a detection model provided in an embodiment of the present application. The structure of the detection model is a convolutional neural network (CNN), which comprises one or more convolution layers, pooling layers, fully-connected layers and the like; the convolution layers are used for convolution calculation, the pooling layers are used for pooling calculation, and the fully-connected layers are used for nonlinear activation function (ReLU) calculation.
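By way of illustration only, a minimal PyTorch sketch of such a detection model follows; the layer sizes and the 320×240 grayscale input (mentioned in the training description below) are assumptions, not taken from the patent:

```python
import torch
import torch.nn as nn

class IrisDetector(nn.Module):
    """Minimal CNN that regresses the iris bounding box (x0, y0, w, h)
    from a 320x240 grayscale input."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 30 * 40, 128), nn.ReLU(),  # 240/8 x 320/8 feature map
            nn.Linear(128, 4),   # coordinate vector of the predicted region
        )

    def forward(self, x):        # x: (N, 1, 240, 320)
        return self.head(self.features(x))
```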
2. Data preparation phase
The preparation of training data is arguably the most important link in machine learning. The data preparation stage in the embodiment of the present application mainly includes: acquiring sample images containing the iris region and labeling the iris region in the sample images.
In one possible implementation, the acquired sample image containing the iris region may also be subjected to enhancement processing, such as random cropping, rotation, random flipping, etc., during the data preparation stage.
3. Iterative training phase
In the embodiment of the application, the target detection model is obtained by performing loop iterative training on the detection model to be trained. In the model training process, multiple (e.g., 100) rounds of loop iteration are performed on the full set of sample images (i.e., the detection sample set); one pass of the full sample set through the detection model to be trained is called one round. In each round, because the memory resources of the training machine are limited, the full set of sample images cannot be input into the detection model to be trained at one time, so the detection sample set needs to be trained batch by batch. Each batch of detection samples is generated by random division or a similar method and is input into the detection model to be trained for forward calculation, backward calculation, model parameter updating and other training operations.
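A sketch of this batch organisation (names and the batch size are illustrative):

```python
import random

def random_batches(sample_set, batch_size):
    """Randomly divide the full sample set into batches; iterating once over
    all batches corresponds to one round of loop iteration."""
    order = random.sample(range(len(sample_set)), len(sample_set))
    for i in range(0, len(order), batch_size):
        yield [sample_set[j] for j in order[i:i + batch_size]]
```

Each yielded batch is then fed to the model for forward calculation, backward calculation and parameter updating.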
In one possible implementation, a detection sample is selected from the detection sample set and input into the detection model to be trained to predict the prediction region of the iris in the sample image; a third loss function is then constructed based on the labeling region and the prediction region, and parameter adjustment is performed using the third loss function.
Because the operation executed by each loop iteration is consistent, taking one loop iteration as an example, the training of the detection model to be trained is described. Referring to fig. 4, a flowchart of a method for training a detection model according to an embodiment of the present application includes the following steps:
step S400, selecting a detection sample from a detection sample set; wherein detecting the sample comprises: a sample image, and an annotated region of the iris in the sample image.
Step S401, inputting a detection sample into a detection model to be trained, and predicting a prediction area of the iris in a sample image.
Illustratively, detection samples are randomly selected from the detection sample set and formed into a batch that is input into the detection model to be trained, so that the iris position features of the detection samples are extracted through convolution calculation, pooling calculation, nonlinear activation function (ReLU) calculation and the like, and the coordinate vector of the predicted region of the iris in the sample image is output.
Step S402, a third loss function is constructed based on the labeling area and the prediction area, and parameter adjustment is performed by adopting the third loss function.
In one possible implementation manner, the coordinates of the prediction area extracted by the detection model to be trained and the coordinates of the labeling area of the iris in the sample image are taken as inputs, a third loss function is adopted, a loss function value is calculated, and the parameter adjustment is performed on the detection model to be trained by adopting a mode that the loss function value is reduced based on gradient.
It should be noted that the third loss function may be selected from various distance metric functions, for example: the L1 loss function, the L2 loss function, or the smooth L1 loss function; other types of loss functions may also be employed. Auxiliary information such as iris-box classification and confidence classification can also be added to make model training more accurate. The gradient descent methods include, but are not limited to: stochastic gradient descent (SGD), SGD with momentum, the adaptive moment estimation (Adam) optimizer, and the adaptive-learning-rate gradient descent algorithm (Adagrad).
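Under one such choice (smooth L1 loss, Adam optimizer), a single training step could be sketched as follows, assuming the IrisDetector sketch above; the learning rate is an illustrative value:

```python
import torch
import torch.nn.functional as F

model = IrisDetector()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(images, gt_boxes):
    """images: (N, 1, 240, 320) batch of scaled sample images;
    gt_boxes: (N, 4) labeling regions. Returns the loss value."""
    pred_boxes = model(images)                     # forward calculation
    loss = F.smooth_l1_loss(pred_boxes, gt_boxes)  # third loss function (one option)
    optimizer.zero_grad()
    loss.backward()                                # backward calculation
    optimizer.step()                               # model parameter update
    return loss.item()
```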
In one possible implementation, in order to increase the detection speed, after a detection sample is selected and before it is input into the detection model to be trained, the sample image is scaled to a smaller size; for example, W×H = 320×240, i.e., the width of the picture is set to 320 and the height to 240.
In one possible implementation, before the parameter adjustment is performed, it may also be determined whether the model convergence condition is satisfied. For example, the model convergence condition may include at least one of the following conditions: model loss is not greater than a preset loss value threshold; the iteration number reaches a preset number upper limit value. For example: setting an iteration training frequency upper limit value, and stopping training when the iteration training frequency reaches a preset frequency upper limit value; or setting a loss value threshold, and stopping training when the loss calculated by the third loss function is smaller than or equal to a preset loss value threshold.
In order to facilitate a clearer understanding of the training process of the detection model, the embodiment of the present application proposes a schematic diagram of training the detection model, see fig. 5:
firstly, a detection sample is selected from the detection sample set; then the width and height of the sample image in the selected detection sample are scaled to 320×240, the scaled sample image is input into the detection model to be trained, the position of the iris in the sample image is predicted by the detection model to be trained, and the coordinate values of the prediction region of the iris in the sample image are output; then a third loss function is constructed based on the coordinate values of the prediction region and the coordinate values of the labeling region of the iris in the sample image, and a loss value is calculated; finally, whether the model training stop condition is met is determined based on the loss value and the preset loss value threshold: when the stop condition is met, the trained target detection model is obtained, and when it is not met, parameter adjustment is performed on the detection model to be trained based on gradient descent.
The training process of the segmentation model is described next.
1. Model design stage
In one possible implementation, the structure of the segmentation model is a convolutional neural network, which includes one or more convolution layers, pooling layers, fully-connected layers, and the like; the convolution layers are used for convolution calculation, the pooling layers are used for pooling calculation, and the fully-connected layers are used for nonlinear activation function (ReLU) calculation. When the structure of the segmentation model is a convolutional neural network, reference may be made to the structure of the detection model, which is not repeated here.
In another possible implementation, the structure of the segmentation model is a U-Net model that includes a plurality of cross-attention (QKV) modules (also called cross-attention layers). Referring to fig. 6, a schematic structural diagram of a segmentation model according to an embodiment of the present application is provided; fig. 6 illustrates the structure of different layers of the U-Net model and, limited by space, shows only part of the structure. The model is divided into three parts: an input section (IN) 61, an intermediate section (MID) 62 and an output section (OUT) 63.
As shown in fig. 6, the input section 61 is simply illustrated with 4 layers, namely a residual module 611, an attention module 612, a residual module 613, and an attention module 614; the middle section 62 is simply illustrated with 3 layers, namely a residual module 621, an attention module 622, and a residual module 623; the output section 63 is simply illustrated with 4 layers, namely a residual module 631, an attention module 632, a residual module 633, and an attention module 634. Note that the U-Net structure shown in fig. 6 is only a simple example.
The U-Net model can also include skip connection structures: each downsampling has a skip connection concatenated with the corresponding upsampling, so that at each upsampling the U-Net model fuses, on the channel dimension, the encoder features of the corresponding position; the fusion of features of different sizes improves detection precision.
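A compact sketch of such an encoder-decoder with skip connections follows; the channel counts and depth are assumptions, and the residual/attention modules of fig. 6 are omitted for brevity:

```python
import torch
import torch.nn as nn

def block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU())

class TinyUNet(nn.Module):
    """Two-level U-Net: each downsampling has a skip connection concatenated
    with the corresponding upsampling, fusing encoder features per channel."""
    def __init__(self, num_classes=5):        # pupil/iris/sclera/skin/background
        super().__init__()
        self.enc1, self.enc2 = block(1, 16), block(16, 32)
        self.pool = nn.MaxPool2d(2)
        self.mid = block(32, 32)
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.dec2, self.dec1 = block(64, 16), block(32, 16)
        self.head = nn.Conv2d(16, num_classes, 1)  # per-pixel class scores

    def forward(self, x):
        e1 = self.enc1(x)                      # full resolution
        e2 = self.enc2(self.pool(e1))          # 1/2 resolution
        m = self.mid(self.pool(e2))            # 1/4 resolution
        d2 = self.dec2(torch.cat([self.up(m), e2], dim=1))   # skip connection
        d1 = self.dec1(torch.cat([self.up(d2), e1], dim=1))  # skip connection
        return self.head(d1)                   # (N, num_classes, H, W)
```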
2. Data preparation phase
The preparation of training data is arguably the most important link in machine learning. The data preparation stage in the embodiment of the present application mainly includes: obtaining pre-segmented images marked with the iris region and labeling the first labeling category of each pixel point of the pre-segmented images.
In one possible implementation, a pre-segmented image labeled with iris regions is obtained by the trained object detection model described above.
3. Iterative training phase
In the embodiment of the application, the target segmentation model is obtained by performing loop iterative training on the segmentation model to be trained. In the model training process, the segmentation model to be trained is pre-trained based on a prediction sample set to obtain a pre-trained segmentation model; to ensure the robustness of the segmentation model, the pre-trained segmentation model is then fine-tuned to obtain the trained target segmentation model.
It should be noted that in both the pre-training process and the fine-tuning process, multiple rounds of iterative training are performed on the sample set; one pass of the sample set through the model forms one round of iteration. In each round, because the memory resources of the training machine are limited, the sample set cannot be input into the model at one time, so the sample set needs to be trained batch by batch; each batch of samples is generated by random division or a similar method and is input into the model to be trained for forward calculation, backward calculation, model parameter updating and other training operations. Because the operations executed in each loop iteration are consistent, the training of the segmentation model to be trained is described below by taking one loop iteration as an example.
Referring to fig. 7, fig. 7 is a flowchart of a method for pre-training a segmentation model to be trained according to an embodiment of the present application, including the following steps:
step 700, selecting a prediction sample from the prediction sample set, and based on the iris region in the prediction sample, performing region sampling on the pre-segmented image according to a preset scaling factor and a preset size to obtain a prediction subgraph.
In one possible implementation, after a prediction sample is selected from the prediction sample set and before a prediction subgraph is obtained based on the iris region of the pre-segmented image in the prediction sample, the pre-segmented image in the selected prediction sample is processed; for example, gamma correction transformation, histogram equalization and the like are performed on the pre-segmented image.
In the embodiment of the present application, based on an iris region in a prediction sample, performing region sampling processing on a pre-segmented image according to a preset scaling factor and a preset size, and executing the following steps A1 to A5 when obtaining a prediction subgraph:
and A1, acquiring an iris region in the pre-segmentation image.
Illustratively, the starting point coordinates of the iris region are determined, and the starting point coordinates of the upper left corner of the iris region are generally taken as the starting point coordinates of the iris regionThe method comprises the steps of carrying out a first treatment on the surface of the Determining the target size of the iris region +.>The general iris region is a regular rectangle, and the width and height of the rectangle are the target dimensions of the iris region.
And step A2, determining an initial offset based on the target size of the iris region and a preset scaling factor.
Exemplary, let the initial offset be,/>Wherein->For the target size of the iris area +.>For the preset scaling factor, it is generally set to 1.2.
A3, determining a transformation scaling factor based on the initial offset and a preset size; and determining a pixel offset based on the initial offset and the center point coordinates of the iris region.
Illustratively, the transform scaling factor is set to,/>,/>For determining an initial offset based on a target size of the iris region and a preset scaling factor,/for >Is a pixel value determined according to a preset size, and +.>Wherein->For the preset size, the preset size is +.>
Exemplary, let the pixel offset beAnd->Wherein->Is the central point coordinate of the iris region, and the central point coordinate is based on the starting point coordinate of the iris region and the large iris regionSmall deterministic; the specific determination mode is as follows:
and step A4, determining a sampling transformation matrix based on the transformation scaling coefficient and the pixel offset.
Exemplary, the sampling transformation matrix provided in the embodiments of the present application is:
and step A5, carrying out region sampling processing on the pre-segmented image based on the sampling transformation matrix to obtain a prediction subgraph.
It should be noted that the size of the prediction subgraph may be adjusted according to practical situations, for example, the size of the prediction subgraph is 96×96.
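For illustration, the following is a minimal sketch of steps A1 to A4 plus the warp of step A5, assuming OpenCV-style affine sampling; where the original formulas are garbled, the concrete forms chosen below (d = s·max(w, h)/2 and p = D/2) are assumptions:

```python
import cv2
import numpy as np

def sampling_matrix(x0, y0, w, h, s=1.2, size=96):
    """Steps A1-A4: build the 2x3 sampling transformation matrix from the
    iris region (top-left (x0, y0), target size (w, h)), the preset scaling
    factor s and the preset subgraph size. Offset formulas are assumed."""
    cx, cy = x0 + w / 2.0, y0 + h / 2.0   # center point of the iris region
    d = s * max(w, h) / 2.0               # initial offset (assumed form)
    k = (size / 2.0) / d                  # transformation scaling factor k = p / d
    tx = size / 2.0 - k * cx              # pixel offset, x component
    ty = size / 2.0 - k * cy              # pixel offset, y component
    return np.float32([[k, 0.0, tx],
                       [0.0, k, ty]])     # sampling transformation matrix M

def sample_subgraph(image, box, s=1.2, size=96):
    """Step A5: region-sample the pre-segmented image with the matrix."""
    m = sampling_matrix(*box, s=s, size=size)
    return cv2.warpAffine(image, m, (size, size)), m
```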
Referring to fig. 8, fig. 8 is a schematic diagram of obtaining a prediction subgraph for a pre-segmented image according to an embodiment of the present application, and as can be seen from fig. 8:
firstly, processing a pre-segmentation image containing an iris region through a target detection model to obtain a pre-segmentation image marked with the iris region; and inputting the pre-segmented image marked with the iris region into a sampling module, setting a preset scaling factor and a preset size in the sampling module, and sampling the pre-segmented image according to the preset scaling factor and the preset size based on the iris region to obtain a prediction subgraph.
Step S701, inputting a prediction sub-graph into a segmentation model to be trained, predicting initial prediction categories of all pixel points in the prediction sub-graph, and determining standard prediction categories of all pixel points in a pre-segmentation image according to the initial prediction categories.
After the prediction subgraph is obtained, it is input into the segmentation model to be trained, iris segmentation is performed on the prediction subgraph, and the initial prediction category of each pixel point in the prediction subgraph is predicted. There are five categories in total: pupil, iris, sclera, periocular skin, and background noise. At this point, it is specifically identified which of the pupil, iris, sclera, periocular skin, and background noise each pixel point belongs to.
In one possible implementation, after the initial prediction category is obtained, a standard prediction category for each pixel in the pre-segmented image is determined based on the initial prediction category.
Illustratively, an inverse transform matrix of the sampling transform matrix is determined; and mapping the initial prediction category of each pixel point in the prediction subgraph to the corresponding pixel point in the iris region of the pre-segmentation image based on the inverse transformation matrix, setting the pixel points in other regions of the pre-segmentation image as background noise, and determining the standard prediction category of each pixel point in the pre-segmentation image.
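A sketch of this inverse mapping under the same OpenCV assumptions; nearest-neighbour interpolation keeps the discrete category labels intact, and the constant border sets the pixel points of the other regions to background noise:

```python
import cv2
import numpy as np

def map_back(labels, m, out_w, out_h, background=0):
    """Map the per-pixel categories of the subgraph back onto the original
    image via the inverse of the sampling transformation matrix m."""
    inv = cv2.invertAffineTransform(m)    # inverse transformation matrix
    return cv2.warpAffine(labels.astype(np.uint8), inv, (out_w, out_h),
                          flags=cv2.INTER_NEAREST,
                          borderMode=cv2.BORDER_CONSTANT,
                          borderValue=background)  # other regions -> background noise
```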
Step S702, parameter adjustment is performed by using a first loss function constructed based on the standard prediction category and the first labeling category.
In one possible implementation, the standard prediction category and the first labeling category are taken as inputs, a first loss function is adopted, a loss function value is calculated, and a mode that the loss function value is reduced based on gradient is adopted to carry out parameter adjustment on the segmentation model to be trained.
It should be noted that the first loss function may be selected from various distance metric functions, for example: the L1 loss function, the L2 loss function, or the smooth L1 loss function; other types of loss functions may also be employed. The gradient descent methods include, but are not limited to: stochastic gradient descent (SGD), SGD with momentum, the adaptive moment estimation (Adam) optimizer, and the adaptive-learning-rate gradient descent algorithm (Adagrad).
In one possible implementation, before the parameter adjustment is performed, it may also be determined whether the model convergence condition is satisfied. For example, the model convergence condition may include at least one of the following conditions: model loss is not greater than a preset loss value threshold; the iteration number reaches a preset number upper limit value. For example: setting an iteration training frequency upper limit value, and stopping training when the iteration training frequency reaches a preset frequency upper limit value; or setting a loss value threshold, and stopping training when the loss calculated by the first loss function is smaller than or equal to a preset loss value threshold.
In the embodiment of the application, to ensure that the segmentation model is robust to multi-scale iris input, an implementation is provided in which iris sampling and the segmentation model are fine-tuned jointly. In this process, a scaling-factor interval is set for the scaling factor used during iris sampling, typically 0.8 to 1.4.
Referring to fig. 9, fig. 9 is a flowchart of a method for fine tuning a pre-trained segmentation model according to an embodiment of the present application, including the following steps:
Step S900, selecting a target sample from a target sample set; the target sample comprises a target segmented image annotated with an iris region, obtained through processing by the detection model, and the second labeling category of each pixel point in the target segmented image.
Step S901, based on the iris region in the target segmented image, performing region sampling on the target segmented image according to a random scaling factor and a preset size to obtain a target subgraph.
In one possible implementation, the random scaling factor is randomly selected from the scaling-factor interval.
In another possible implementation, the random scaling factor is produced by the coefficient identification model, and it lies within the scaling-factor interval. Both options are sketched below.
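A minimal sketch of the two options, assuming the 0.8-1.4 interval given above; the helper name and the linear mapping are assumptions, chosen to be consistent with the mapping described later for the coefficient identification model.

```python
import random

SCALE_MIN, SCALE_MAX = 0.8, 1.4  # the scaling-factor interval

# Option 1: draw the factor uniformly at random from the interval.
random_factor = random.uniform(SCALE_MIN, SCALE_MAX)

# Option 2: obtain the factor from a coefficient identification model
# whose raw output lies in (0, 1); this mapping mirrors the
# 0.6 * x + 0.8 rule described later and stays inside the interval.
def to_scaling_factor(model_output_0_to_1):
    return SCALE_MIN + (SCALE_MAX - SCALE_MIN) * model_output_0_to_1

assert SCALE_MIN <= to_scaling_factor(0.5) <= SCALE_MAX
```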
Step S902, inputting the target subgraph into a pre-training segmentation model, predicting the initial segmentation class of each pixel point in the target subgraph, and determining the predicted segmentation class of each pixel point in the target segmented image according to the initial segmentation class.
Step S903, performing parameter fine-tuning by adopting a second loss function constructed based on the predicted segmentation category and the second labeling category.
In one possible implementation, when the random scaling factor is randomly selected from the scaling-factor interval, only the pre-trained segmentation model has its parameters adjusted when tuning against the second loss function. In this case pre-training and fine-tuning follow the same principle, so to make the training process of the segmentation model easier to follow, the embodiment of the present application explains it via pre-training and provides a schematic diagram of segmentation-model training, see fig. 10:
First, a prediction sample is selected from the prediction sample set. Then, based on the iris region in the prediction sample, region sampling is performed on the pre-segmented image according to the preset scaling factor and preset size to obtain a prediction subgraph. The prediction subgraph is input into the segmentation model to be trained, which performs iris segmentation on it and outputs the initial prediction category of each pixel point; the standard prediction categories of the pixel points in the pre-segmented image are then determined from the initial prediction categories. Next, a first loss function is constructed based on the standard prediction categories and the first labeling categories, and its loss value is calculated; whether the model training stop condition is met is determined from the loss value and the preset loss threshold. When the stop condition is met, the pre-trained segmentation model is obtained; when it is not, the parameters of the segmentation model to be trained are adjusted by gradient descent. One iteration of this loop is sketched below.
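A single pre-training iteration under these steps might look like the following; cross-entropy is used here only as a stand-in per-pixel loss (the text lists distance losses), and the model and optimizer are assumed to have been built as in the earlier sketches.

```python
import torch.nn as nn

def pretrain_step(model, optimizer, subgraph, target_classes):
    """One pre-training iteration: forward pass on the prediction
    subgraph, loss against the labeling categories, gradient step.
    `target_classes` is an int64 tensor of shape (N, H, W)."""
    criterion = nn.CrossEntropyLoss()     # stand-in for the first loss
    logits = model(subgraph)              # (N, 5, H, W) class logits
    loss = criterion(logits, target_classes)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```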
In another possible implementation, when the random scaling factor is obtained from the coefficient identification model, tuning against the second loss function adjusts the parameters not only of the pre-trained segmentation model but also of the coefficient identification model; that is, the coefficient identification model is trained jointly with the pre-trained segmentation model. In this case, the implementation shown in fig. 10 is first used to obtain the pre-trained segmentation model, after which the coefficient identification model and the segmentation model are trained jointly. To make this joint training easier to follow, the embodiment of the present application provides a schematic diagram of the joint training of the coefficient identification model and the pre-trained segmentation model, see fig. 11:
First, a target sample is selected from the target sample set; the target sample comprises a target segmented image annotated with an iris region, obtained through processing by the detection model, and the second labeling category of each pixel point in the target segmented image. The target segmented image is then input into the coefficient identification model and the sampling module respectively; the coefficient identification model determines a random scaling coefficient associated with the image and passes it to the sampling module. In the sampling module, based on the iris region in the target segmented image, region sampling is performed on the image according to the random scaling coefficient and the preset size to obtain a target subgraph. The target subgraph is input into the pre-trained segmentation model, which predicts the initial segmentation category of each pixel point in the subgraph; the predicted segmentation categories of the pixel points in the target segmented image are then determined from the initial segmentation categories. Next, a second loss function is constructed based on the predicted segmentation categories and the second labeling categories, and its loss value is calculated; whether the model training stop condition is met is determined from the loss value and the preset loss threshold. When the stop condition is met, the target segmentation model and the target coefficient identification model are obtained; when it is not, the parameters of both the pre-trained segmentation model and the coefficient identification model are adjusted by gradient descent, for example via a shared optimizer as sketched below.
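One simple way to make a single loss update both models is to hand their parameters to one optimizer; this is a sketch under that assumption, not the embodiment's prescribed mechanism.

```python
import itertools
import torch

def make_joint_optimizer(seg_model, coeff_model, lr=0.001):
    """For the joint stage, one optimizer updates both the pre-trained
    segmentation model and the coefficient identification model, so
    the second loss back-propagates into each of them."""
    params = itertools.chain(seg_model.parameters(),
                             coeff_model.parameters())
    return torch.optim.Adam(params, lr=lr)
```

Note that for gradients to actually reach the coefficient identification model, the sampling step between it and the segmentation model must be differentiable (e.g. implemented with a differentiable warp such as torch.nn.functional.grid_sample); the text does not specify this detail.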
In the embodiment of the application, the coefficient identification model may be a fully connected network, a convolutional network, or the like. It maps the input image to an output value in the 0-1 interval, and the scaling coefficient is then determined from that value: specifically, the mapped output is multiplied by 0.6 and 0.8 is added, which yields the scaling coefficient required for region sampling.
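A minimal sketch of such a model follows, assuming a convolutional trunk in PyTorch; the layer sizes are placeholders, while the 0.6 * x + 0.8 mapping is the one just described (a sigmoid supplies the 0-1 output).

```python
import torch
import torch.nn as nn

class CoefficientNet(nn.Module):
    """Sketch of the coefficient identification model: a small
    convolutional trunk, a sigmoid output in (0, 1), then the affine
    mapping 0.6 * out + 0.8 so the factor lands in [0.8, 1.4]."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 8, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(8, 1), nn.Sigmoid(),
        )

    def forward(self, image):
        return 0.6 * self.net(image) + 0.8

scale = CoefficientNet()(torch.randn(1, 3, 240, 320))
# `scale` has shape (1, 1), with a value inside [0.8, 1.4].
```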
It should be noted that the training method shown in fig. 9 differs from that shown in fig. 7 in that, in fig. 9, the scaling factor used to sample the target subgraph is selected from the scaling-factor interval, whereas in fig. 7 the scaling factor used to sample the prediction subgraph is fixed; different scaling factors yield different target subgraphs. Configuring a random scaling factor for every picture during training ensures the robustness of the segmentation model to iris pictures of different scales. This stage also trains on top of the pre-trained segmentation model obtained in fig. 7, and the learning rate during optimization is set lower than in fig. 7, typically 0.001. The rest of the implementation is similar to the training method shown in fig. 7 and is not repeated here.
With this approach, the sub-sampled region eliminates the interference of irrelevant noise and reduces the difficulty the model faces in iris segmentation, so the segmentation model achieves higher segmentation accuracy with a smaller amount of computation, improving its running speed.
In one possible implementation, if only the target detection model and the target segmentation model are obtained, the two are deployed in combination to implement a complete image segmentation scheme. Referring to fig. 12, fig. 12 is a schematic diagram of a model deployment application according to an embodiment of the present application. As fig. 12 shows, the image to be processed first passes through the target detection model; the sampling module then performs region sampling to obtain the iris subgraph; the target segmentation model produces a segmentation result for the iris subgraph; finally, a post-processing module post-processes that result into the segmentation result of the image to be processed. A functional sketch of this chain is given below.
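The deployment chain can be written as simple function composition; every callable below is an assumed stand-in for the corresponding trained component, and the sampler is assumed to return the sampling matrix it used so that post-processing can invert it (the image is assumed to be a NumPy array).

```python
def segment_image(image, detector, sampler, segmenter, postprocess):
    """Deployment chain of fig. 12: detect the iris box, sample the
    iris subgraph, segment it, then post-process back to full size."""
    box = detector(image)                  # iris region, e.g. (x, y, w, h)
    subgraph, M = sampler(image, box)      # subgraph plus sampling matrix
    sub_result = segmenter(subgraph)       # per-pixel classes on subgraph
    return postprocess(sub_result, M, image.shape[:2])
```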
In another possible implementation, if the target detection model, the target segmentation model, and the target coefficient identification model are all obtained, the three are deployed in combination to implement a complete image segmentation scheme. Referring to fig. 13, fig. 13 is a schematic diagram of another model deployment application provided in an embodiment of the present application. As fig. 13 shows: first, the iris region in the image to be processed is detected by the target detection model, producing an image annotated with the iris region; next, an initial scaling coefficient for the image is determined by the target coefficient identification model and passed to the sampling module; the sampling module then performs region sampling on the image based on the initial scaling coefficient to obtain an iris subgraph; the iris subgraph is segmented by the target segmentation model, and finally the post-processing module post-processes the segmentation result into the segmentation result of the image to be processed.
To better understand how the target detection model and the target segmentation model are applied for image segmentation, an image segmentation method flowchart is provided in the embodiment of the present application; referring to fig. 14, the method includes the following steps:
step S1400, iris positioning is performed on the image to be processed, and an iris region in the image to be processed is obtained.
In one possible implementation, iris positioning processing is performed on an image to be processed through a trained target detection model, and an iris region in the image to be processed is obtained.
In this embodiment of the present application, the image to be processed may be the initial image; in that case, before iris-region positioning is performed, the image needs to be scaled to a fixed size, for example 320×240. Alternatively, the image to be processed may be an image obtained by scaling the initial image.
Step S1401, sampling the image to be processed based on the iris region to obtain an iris subgraph.
In one possible implementation, when the image to be processed is sampled based on the iris region to obtain an iris subgraph, a sampling transformation matrix is first determined according to an initial scaling factor and a preset size, and the iris region of the image to be processed is then sampled based on that matrix to obtain the iris subgraph.
Illustratively, in determining the sampling transformation matrix, an initial offset is first determined based on the target size of the iris region and the initial scaling factor; then a transformation scaling factor is determined based on the initial offset and the preset size, and a pixel offset is determined based on the initial offset and the center-point coordinates of the iris region; finally, the sampling transformation matrix is determined based on the transformation scaling factor and the pixel offset.
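The exact formulas are not spelled out in the text; the sketch below is one plausible reading under stated assumptions (a square window centred on the iris box, the box given as (x, y, w, h), and a square output of side out_size), producing a 2x3 matrix usable with cv2.warpAffine.

```python
import numpy as np

def build_sampling_matrix(box, scale, out_size):
    """Assemble a 2x3 affine sampling matrix from the steps above,
    under the stated assumptions about window shape and box format."""
    x, y, w, h = box
    cx, cy = x + w / 2.0, y + h / 2.0
    offset = max(w, h) * scale / 2.0     # initial offset (half-width)
    k = out_size / (2.0 * offset)        # transformation scaling factor
    tx, ty = cx - offset, cy - offset    # pixel offset: window origin
    return np.array([[k, 0.0, -k * tx],
                     [0.0, k, -k * ty]], dtype=np.float32)
```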
It should be noted that, the manner of constructing the sampling transformation matrix is similar to that in the training process, and will not be described here again.
In one possible implementation, if the manner shown in fig. 12 is adopted, the initial scaling factor is preset; if the manner shown in fig. 13 is employed, the initial scaling factor is obtained by the target factor recognition model.
Step S1402, respectively carrying out attribute recognition on each pixel point in the iris subgraph to obtain a corresponding recognition result; the recognition result represents that the pixel point is an iris or noise, and the noise comprises other eye structures and background noise.
In one possible implementation manner, attribute identification is performed on each pixel point in the iris subgraph through the target segmentation model, and a corresponding identification result is obtained.
Step S1403, mapping the recognition results of the pixel points in the iris subgraph to the corresponding pixel points in the image to be processed.
In one possible implementation, when the recognition results of the pixel points in the iris subgraph are mapped to the corresponding pixel points in the image to be processed: an inverse transformation matrix of the sampling transformation matrix is determined; based on the inverse transformation matrix, the recognition results are mapped to the corresponding pixel points in the iris region of the image to be processed, and the pixel points in the other regions of the image are set as background noise.
Step S1404, segmenting the iris and the noise within the image to be processed based on the mapping result.
For ease of understanding, with respect to the manner shown in fig. 12, embodiments of the present application provide a specific implementation example of image segmentation, involving model training and model deployment applications, see fig. 15:
First, the image to be processed is processed by the target detection model to obtain an image annotated with the iris region. The annotated image is input into the sampling module, in which an initial scaling factor and a preset size are configured; based on the iris region, the image is sampled according to the initial scaling factor and preset size to obtain an iris subgraph. The iris subgraph is then input into the target segmentation model to obtain the segmentation result of the subgraph, and the segmentation result of the image to be processed is obtained based on the segmentation result of the iris subgraph.
Similarly, for the manner shown in fig. 13, another example of a specific implementation of image segmentation is provided in the embodiments of the present application, where model training and model deployment applications are involved, see fig. 16:
First, the image to be processed is processed by the target detection model to obtain an image annotated with the iris region. The annotated image is input into the target coefficient identification model and the sampling module respectively; the target coefficient identification model yields an initial scaling coefficient, which is transmitted to the sampling module, in which a preset size is configured; based on the iris region, the image is sampled according to the initial scaling coefficient and preset size to obtain an iris subgraph. The iris subgraph is then input into the target segmentation model to obtain the segmentation result of the subgraph, and the segmentation result of the image to be processed is obtained based on the segmentation result of the iris subgraph.
In the application, iris positioning is first performed on the image to be processed to obtain its iris region, and the image is sampled based on that region to obtain an iris subgraph. Attribute recognition is then performed on the pixel points in the subgraph, the recognition result indicating whether each pixel point is iris or noise, where noise covers other eye structures and background noise. Because the iris region occupies a small area, sampling an iris subgraph from it reduces the amount of computation and improves segmentation speed; at the same time, it removes much of the irrelevant interference noise, lowering the segmentation difficulty and improving segmentation accuracy. The recognition results of the pixel points in the iris subgraph are then mapped to the corresponding pixel points in the image to be processed; finally, based on the mapping result, the iris and the noise are segmented within the image to be processed, completing its segmentation.
Based on the same inventive concept, the embodiment of the application also provides an image segmentation apparatus; as shown in fig. 17, the image segmentation apparatus 1700 includes:
the positioning unit 1701 is configured to perform iris positioning on an image to be processed to obtain an iris region in the image to be processed;
the sampling unit 1702 is configured to sample an image to be processed based on an iris region to obtain an iris subgraph;
the identifying unit 1703 is configured to identify attributes of each pixel point in the iris subgraph, so as to obtain a corresponding identifying result; the recognition result represents that the pixel point is an iris or noise, and the noise comprises other eye structures and background noise;
the mapping unit 1704 is configured to map the recognition results of each pixel point in the iris subgraph onto a corresponding pixel point in the image to be processed;
a segmentation unit 1705, configured to segment the iris and the noise in the image to be processed based on the mapping result.
In one possible implementation, the sampling unit 1702 is specifically configured to:
based on the iris region, determining a sampling transformation matrix according to the initial scaling coefficient and a preset size;
and based on the sampling transformation matrix, sampling the image to be processed to obtain an iris subgraph.
In one possible implementation, the sampling unit 1702 is specifically configured to:
determining an initial offset based on a target size of the iris region and the initial scaling factor;
determining a transformation scaling factor based on the initial offset and a preset size; and determining a pixel offset based on the initial offset and the center point coordinates of the iris region;
a sampling transformation matrix is determined based on the transformation scaling coefficients and the pixel offset.
In one possible implementation, the initial scaling factor is preset, or the initial scaling factor is obtained by a target factor recognition model;
the target coefficient identification model is trained using a prediction sample set in combination with the pre-trained segmentation model, and each prediction sample in the prediction sample set comprises: a pre-segmented image annotated with an iris region, obtained through processing by a detection model.
In one possible implementation, the identifying unit 1703 is specifically configured to:
respectively carrying out attribute identification on each pixel point in the iris subgraph through a target segmentation model to obtain a corresponding identification result; the target segmentation model is determined based on a pre-training segmentation model obtained by training a prediction sample set;
wherein each prediction sample in the prediction sample set comprises: a pre-segmented image annotated with an iris region, obtained through processing by a detection model.
In one possible implementation, the mapping unit 1704 is specifically configured to:
determining an inverse transformation matrix of the sampling transformation matrix;
based on the inverse transformation matrix, mapping the identification result to corresponding pixel points in the iris region of the image to be processed, and setting the pixel points in other regions of the image to be processed as background noise.
In one possible implementation, each prediction sample further includes: a first labeling category of each pixel point in the pre-segmentation image; the pre-training segmentation model is obtained by training in the following way:
based on the prediction sample set, performing loop iteration training on the segmentation model to be trained to obtain a pre-training segmentation model, wherein the loop iteration training is performed in a loop iteration process:
selecting a prediction sample from a prediction sample set, and performing regional sampling on the pre-segmented image according to a preset scaling factor and a preset size based on an iris region in the prediction sample to obtain a prediction subgraph;
inputting the prediction subgraph into a segmentation model to be trained, predicting the initial prediction category of each pixel point in the prediction subgraph, and determining the standard prediction category of each pixel point in the pre-segmentation image according to the initial prediction category;
And carrying out parameter adjustment by adopting a first loss function constructed based on the standard prediction category and the first labeling category.
In one possible implementation, determining the target segmentation model based on the pre-trained segmentation model includes:
performing loop iteration training on the pre-training segmentation model based on the target sample set to obtain a target segmentation model, wherein the loop iteration training is performed in a loop iteration process:
selecting a target sample from the target sample set; wherein the target sample comprises: a target segmented image annotated with an iris region, obtained through processing by the detection model, and the second labeling category of each pixel point in the target segmented image;
based on iris areas in the target segmentation images, carrying out area sampling on the target segmentation images according to the random scaling factors and the preset size to obtain target subgraphs; the random scaling factor is randomly selected from the scaling factor interval;
inputting the target subgraph into a pre-training segmentation model, predicting the initial segmentation class of each pixel point in the target subgraph, and determining the prediction segmentation class of each pixel point in the target segmentation image according to the initial segmentation class;
and performing parameter fine adjustment by adopting a second loss function constructed based on the prediction segmentation category and the second labeling category.
In one possible implementation, the positioning unit 1701 is specifically configured to:
iris positioning is carried out on the image to be processed through a target detection model, and an iris area is obtained; the target detection model is obtained by performing loop iteration training on a detection model to be trained based on a detection sample set, wherein the loop iteration training is performed in a loop iteration process:
selecting a detection sample from the detection sample set; wherein detecting the sample comprises: a sample image and an annotation region of the iris in the sample image;
inputting a detection sample into a detection model to be trained, and predicting a prediction area of an iris in a sample image;
and constructing a third loss function based on the annotation region and the prediction region, and adopting the third loss function to carry out parameter adjustment.
It should be noted that although several units (or modules) of the apparatus are mentioned in the above detailed description, this division is merely exemplary and not mandatory. Indeed, the features and functions of two or more units (or modules) described above may be embodied in one unit (or module) in accordance with embodiments of the present application. Conversely, the features and functions of one unit (or module) described above may be further divided into a plurality of units (or modules) to be embodied. Of course, when implementing the present application, the functions of each unit (or module) may be implemented in the same or multiple pieces of software or hardware.
It should be noted that in the specific embodiments of the present application, data related to the user is involved, and when the above embodiments of the present application are applied to specific products or technologies, user permission or consent is required, and the collection, use and processing of related data is required to comply with relevant laws and regulations and standards of relevant countries and regions.
Having described the image segmentation method and apparatus of the exemplary embodiments of the present application, a computing device according to another exemplary embodiment of the present application is described next.
Those skilled in the art will appreciate that the various aspects of the present application may be implemented as a system, method, or program product. Accordingly, aspects of the present application may be embodied in the following forms, namely: an entirely hardware embodiment, an entirely software embodiment (including firmware, micro-code, etc.), or an embodiment combining hardware and software aspects, which may be referred to herein generally as a "circuit," "module," or "system."
In one possible implementation, a computing device provided by an embodiment of the present application may include at least a processor and a memory. The memory stores therein program code that, when executed by the processor, causes the processor to perform any of the steps of the image segmentation methods of the various exemplary embodiments herein.
In this embodiment, the architecture of the computing device may include a memory 1801, a communication module 1803, and one or more processors 1802, as shown in fig. 18.
A memory 1801 for storing computer programs for execution by the processor 1802. The memory 1801 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, a program required for running an instant communication function, and the like; the storage data area can store various instant messaging information, operation instruction sets and the like.
The memory 1801 may be a volatile memory such as a random-access memory (RAM); it may also be a non-volatile memory, such as a read-only memory, a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); or the memory 1801 may be any other medium that can be used to carry or store a desired computer program in the form of instructions or data structures and that can be accessed by a computer, but is not limited to these. The memory 1801 may also be a combination of the above memories.
The processor 1802 may include one or more central processing units (central processing unit, CPU) or digital processing units, or the like. A processor 1802 for implementing the above-described image segmentation method when invoking a computer program stored in a memory 1801.
The communication module 1803 is used for communicating with a terminal device and other servers.
The specific connection medium among the memory 1801, the communication module 1803, and the processor 1802 is not limited in the above embodiments. In fig. 18, the embodiment of the present application illustrates the memory 1801 and the processor 1802 connected by the bus 1804, drawn as a thick line; the connections between the other components are merely illustrative and not limiting. The bus 1804 may be divided into an address bus, a data bus, a control bus, and so on. For ease of description, only one thick line is drawn in fig. 18, but this does not mean there is only one bus or only one type of bus.
The memory 1801 stores therein a computer storage medium in which computer-executable instructions for implementing the image segmentation method of the embodiment of the present application are stored. The processor 1802 is configured to perform the image segmentation method described above.
In some possible embodiments, aspects of the image segmentation method provided herein may also be implemented in the form of a program product comprising program code for causing a computing device to carry out the steps of the image segmentation method according to various exemplary embodiments of the present application as described herein above, when the program product is run on the computing device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a random-access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The program product of embodiments of the present application may employ a portable compact disc read only memory (CD-ROM) and include program code and may run on a computing device. However, the program product of the present application is not limited thereto, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with a command execution system, apparatus, or device.
The readable signal medium may include a data signal propagated in baseband or as part of a carrier wave with readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with a command execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present application may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java or C++ as well as conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device; partly on the user's computing device as a stand-alone software package; partly on the user's computing device and partly on a remote computing device; or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, over the Internet using an Internet service provider).
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in them, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable device to produce a machine, such that the instructions executed by the processor of the computer or other programmable device produce a means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be loaded onto a computer or other programmable apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various modifications and variations can be made in the present application without departing from the spirit or scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims and the equivalents thereof, the present application is intended to cover such modifications and variations.

Claims (10)

1. An image segmentation method, the method comprising:
iris positioning is carried out on an image to be processed, and an iris area in the image to be processed is obtained;
sampling the image to be processed based on the iris region to obtain an iris subgraph;
respectively carrying out attribute identification on each pixel point in the iris subgraph to obtain a corresponding identification result; the recognition result represents that the pixel point is an iris or noise, and the noise comprises other eye structures and background noise;
mapping the identification results of all the pixel points in the iris subgraph to corresponding pixel points in the image to be processed;
dividing the iris and the noise in the image to be processed based on a mapping result;
the step of sampling the image to be processed based on the iris region to obtain an iris subgraph comprises the following steps:
Based on the iris region, determining a sampling transformation matrix according to the initial scaling coefficient and a preset size;
sampling the image to be processed based on the sampling transformation matrix to obtain an iris subgraph;
the mapping the identification result of each pixel point in the iris subgraph to the corresponding pixel point in the image to be processed includes:
determining an inverse transformation matrix of the sampling transformation matrix;
and mapping the identification result to corresponding pixel points in the iris region of the image to be processed based on the inverse transformation matrix, and setting the pixel points in other regions of the image to be processed as background noise.
2. The method of claim 1, wherein the determining a sampling transformation matrix based on the iris region from an initial scaling factor and a preset size comprises:
determining an initial offset based on a target size of the iris region and the initial scaling factor;
determining a transformation scaling factor based on the initial offset and the preset size; and determining a pixel offset based on the initial offset and a center point coordinate of the iris region;
The sampling transformation matrix is determined based on the transformation scaling factor and the pixel offset.
3. The method of claim 1, wherein the initial scaling factor is preset, or the initial scaling factor is obtained by a target coefficient identification model;

the target coefficient identification model is trained using a prediction sample set in combination with a pre-trained segmentation model, and each prediction sample in the prediction sample set comprises: a pre-segmented image annotated with an iris region, obtained through processing by a detection model.
4. The method of claim 1, wherein the performing attribute recognition on each pixel point in the iris subgraph to obtain a corresponding recognition result includes:
respectively carrying out attribute identification on each pixel point in the iris subgraph through a target segmentation model to obtain a corresponding identification result; the target segmentation model is determined based on a pre-training segmentation model obtained by training a prediction sample set;
wherein each prediction sample in the set of prediction samples comprises: a pre-segmented image annotated with an iris region, obtained through processing by a detection model.
5. The method of claim 4, wherein each prediction sample further comprises: a first labeling category of each pixel point in the pre-segmented image;
the pre-training segmentation model is obtained through training in the following way:
performing loop iteration training on the segmentation model to be trained based on the prediction sample set to obtain the pre-training segmentation model, wherein the loop iteration training is performed in a loop iteration process:
selecting a prediction sample from the prediction sample set, and performing regional sampling on the pre-segmented image according to a preset scaling factor and a preset size based on an iris region in the prediction sample to obtain a prediction subgraph;
inputting the prediction subgraph into the segmentation model to be trained, predicting the initial prediction category of each pixel point in the prediction subgraph, and determining the standard prediction category of each pixel point in the pre-segmentation image according to the initial prediction category;
and carrying out parameter adjustment by adopting a first loss function constructed based on the standard prediction category and the first labeling category.
6. The method of claim 4 or 5, wherein determining the target segmentation model based on the pre-trained segmentation model comprises:
Performing loop iteration training on the pre-training segmentation model based on a target sample set to obtain the target segmentation model, wherein the loop iteration training is performed in a loop iteration process:
selecting a target sample from the target sample set; wherein the target sample comprises: a target segmented image annotated with an iris region, obtained through processing by the detection model, and a second labeling category of each pixel point in the target segmented image;
based on the iris region in the target segmentation image, performing region sampling on the target segmentation image according to a random scaling coefficient and a preset size to obtain a target subgraph; the random scaling coefficient is selected from a scaling coefficient interval;
inputting the target subgraph into the pre-training segmentation model, predicting the initial segmentation class of each pixel point in the target subgraph, and determining the prediction segmentation class of each pixel point in the target segmentation image according to the initial segmentation class;
and performing parameter fine-tuning by adopting a second loss function constructed based on the predicted segmentation category and the second labeling category.
7. The method of claim 1, wherein the iris positioning the image to be processed to obtain an iris region in the image to be processed comprises:
Iris positioning is carried out on the image to be processed through a target detection model, and the iris area is obtained; the target detection model is obtained by performing loop iteration training on a detection model to be trained based on a detection sample set, wherein the loop iteration training is performed in a loop iteration process:
selecting a detection sample from the detection sample set; wherein the detecting the sample comprises: a sample image and an annotation region of the iris in the sample image;
inputting the detection sample into the detection model to be trained, and predicting a prediction area of the iris in the sample image;
and constructing a third loss function based on the annotation region and the prediction region, and adopting the third loss function to carry out parameter adjustment.
8. An image segmentation apparatus, the apparatus comprising:
the positioning unit is used for positioning the iris of the image to be processed and obtaining an iris area in the image to be processed;
the sampling unit is used for sampling the image to be processed based on the iris region to obtain an iris subgraph;
the identification unit is used for respectively carrying out attribute identification on each pixel point in the iris subgraph to obtain a corresponding identification result; the recognition result represents that the pixel point is an iris or noise, and the noise comprises other eye structures and background noise;
The mapping unit is used for mapping the identification results of all the pixel points in the iris subgraph to corresponding pixel points in the image to be processed;
the segmentation unit is used for segmenting the iris and the noise in the image to be processed based on a mapping result;
the sampling unit is specifically configured to:
based on the iris region, determining a sampling transformation matrix according to the initial scaling coefficient and a preset size;
sampling the image to be processed based on the sampling transformation matrix to obtain an iris subgraph;
the mapping unit is specifically configured to:
determining an inverse transformation matrix of the sampling transformation matrix;
and mapping the identification result to corresponding pixel points in the iris region of the image to be processed based on the inverse transformation matrix, and setting the pixel points in other regions of the image to be processed as background noise.
9. A computing device, the computing device comprising: a processor and a memory, wherein:
the memory is used for storing a computer program;
the processor being adapted to execute the computer program for implementing the method of any of claims 1-7.
10. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program which, when executed by a processor, implements the method of any of claims 1-7.
CN202311370361.0A 2023-10-23 2023-10-23 Image segmentation method, device, equipment and storage medium Active CN117115900B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311370361.0A CN117115900B (en) 2023-10-23 2023-10-23 Image segmentation method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311370361.0A CN117115900B (en) 2023-10-23 2023-10-23 Image segmentation method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN117115900A CN117115900A (en) 2023-11-24
CN117115900B true CN117115900B (en) 2024-02-02

Family

ID=88800488

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311370361.0A Active CN117115900B (en) 2023-10-23 2023-10-23 Image segmentation method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117115900B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117689660B (en) * 2024-02-02 2024-05-14 杭州百子尖科技股份有限公司 Vacuum cup temperature quality inspection method based on machine vision

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101882222A (en) * 2009-06-26 2010-11-10 哈尔滨工业大学 Iris partitioning and sunlight radiating canal extracting method based on basic-element structure definition and region growing technology
CN102429637A (en) * 2011-08-17 2012-05-02 北京百纳威尔科技有限公司 Mobile terminal iris detection method and mobile terminal
CN102844766A (en) * 2011-04-20 2012-12-26 中国科学院自动化研究所 Human eyes images based multi-feature fusion identification method
CN113780239A (en) * 2021-09-27 2021-12-10 上海聚虹光电科技有限公司 Iris recognition method, iris recognition device, electronic equipment and computer readable medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7277561B2 (en) * 2000-10-07 2007-10-02 Qritek Co., Ltd. Iris identification
US10311300B2 (en) * 2016-05-18 2019-06-04 Eyelock Llc Iris recognition systems and methods of using a statistical model of an iris for authentication
TWI754806B (en) * 2019-04-09 2022-02-11 栗永徽 System and method for locating iris using deep learning

Also Published As

Publication number Publication date
CN117115900A (en) 2023-11-24


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant