CN114926635A - Method for segmenting target in multi-focus image combined with deep learning method - Google Patents

Method for segmenting target in multi-focus image combined with deep learning method

Info

Publication number
CN114926635A
CN114926635A (application CN202210427559.7A)
Authority
CN
China
Prior art keywords
target
image
segmentation
contour
definition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210427559.7A
Other languages
Chinese (zh)
Inventor
徐靖翔
李娟
李建强
王全增
赵琳娜
罗锦涛
高正凯
刘朝磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Technology filed Critical Beijing University of Technology
Priority to CN202210427559.7A
Publication of CN114926635A
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/20: Image preprocessing
    • G06V 10/26: Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis
    • G06T 7/30: Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T 7/33: Determination of transform parameters for the alignment of images, i.e. image registration, using feature-based methods
    • G06T 7/337: Determination of transform parameters for the alignment of images, i.e. image registration, using feature-based methods involving reference images or patches
    • G06V 10/40: Extraction of image or video features
    • G06V 10/42: Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation
    • G06V 10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V 10/46: Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V 10/469: Contour-based spatial representations, e.g. vector-coding
    • G06V 10/473: Contour-based spatial representations, e.g. vector-coding, using gradient analysis
    • G06V 10/56: Extraction of image or video features relating to colour
    • G06V 10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/764: Arrangements for image or video recognition or understanding using pattern recognition or machine learning, using classification, e.g. of video objects
    • G06V 10/766: Arrangements for image or video recognition or understanding using pattern recognition or machine learning, using regression, e.g. by projecting features on hyperplanes

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method for segmenting targets in multi-focus images in combination with a deep learning method. The multi-focus images are first registered. Coarse segmentation based on the color and contour of each target then yields local images that each contain a single target, and a positioning label is formed for each target from the position of its center and the image in which it lies. Two evaluation scales are provided: a target definition (sharpness) judgment module and a target desirability judgment module. The scores of the two modules are combined into a segmentation value coefficient for each target. Finally, for targets that share the same positioning label, the segmentation value coefficients are compared and only the targets worth segmenting are passed to a semantic segmentation module, yielding all clear, non-repeated targets in the multi-focus images. This addresses the coarse segmentation of traditional methods and the low efficiency and poor results of deep learning semantic segmentation used alone, so that both the quality and the efficiency of segmentation are taken into account.

Description

Method for segmenting target in multi-focus image combined with deep learning method
Technical Field
The invention relates to a method for extracting a target in a multi-focus image based on the combination of a traditional method and a deep learning method, and belongs to the field of computer vision.
Background
In the information age, images have become a main carrier of recorded information and their application scenarios are very wide. However, when an image capture device acquires an image, its focal point is fixed; in general, objects near the focal point are sharper, while objects farther from the focal point are less sharp. The more clear objects an image contains, the higher its value in practical applications. Consider a group of multi-focus images in which the capture device, the capture background and the captured subjects are all the same, and only the capture parameters differ, so that the sharpness of each object varies from image to image; the individual sharp objects required by the application are then scattered across different pictures. If a method can identify, segment and extract every clear target, it will greatly help subsequent applications.
When traditional methods are used for target segmentation, the following approaches are common. One is threshold-based: the gray value of each pixel is compared with a threshold, and the pixel is assigned to the appropriate class according to the result. Another is region-based segmentation, which searches for and classifies regions directly, either by region growing, starting from single pixels and gradually merging them into the required segmented regions, or by starting from the whole image and gradually splitting it down to the required regions. The watershed algorithm is also commonly used; it is a segmentation method based on the mathematical morphology of topology, whose basic idea is to regard the image as a topographic surface in which the gray value of each pixel represents the altitude of that point. Each local minimum and its zone of influence is called a catchment basin, and the boundaries of the catchment basins form the watershed. The most common of the existing methods, however, is segmentation based on edge detection, which tries to solve the segmentation problem by detecting the edges between different regions. The gray values of pixels on the boundaries between regions change sharply, and if the picture is transformed from the spatial domain to the frequency domain by a Fourier transform, the edges correspond to the high-frequency components, so the image can be segmented along the target edges. However, the accuracy of existing methods in judging target edges is very low, and incomplete targets after segmentation, or background wrongly segmented as target, occur frequently.
When deep learning is used for target segmentation, semantic segmentation algorithms are mainly adopted. Problems in the source images, such as unclear targets or partially out-of-focus targets, make the semantic segmentation results unsatisfactory or even wrong.
In general, current target segmentation methods have three problems. First, traditional methods segment targets mainly according to the color difference between target and background and the distinctive contour features of the target, but the resulting segmentation is coarse, often contains useless background information, and struggles with unclear targets. Second, the evaluation systems for targets segmented by traditional methods are varied, and whether a segmented target meets the requirements can only be judged manually. Third, when a semantic segmentation algorithm alone is applied to all images that may contain targets, some source images contain targets of no practical value, so the efficiency of semantic segmentation is low and the many invalid segmented targets affect subsequent results.
Therefore, based on the characteristics of multi-focus images, the invention generalizes previous work on traditional methods and deep learning semantic segmentation into a method for extracting clear targets from multi-focus images. The aim is to form a segmentation value coefficient from the definition (sharpness) and desirability of each target, and to extract the non-repeated, clear, required targets in the multi-focus images according to this coefficient.
Disclosure of Invention
At present, traditional target extraction methods extract targets mainly according to their color and contour features. Because acquired images often show little difference between the target color and the background color, and the target contour may be unclear, the separation of target and background can be incomplete, which degrades the final segmentation. In addition, the existing evaluation systems for segmented targets are imperfect: the evaluation indicators are numerous, most evaluation is done manually, and the efficiency is low. Deep learning semantic segmentation places very high demands on the quality of the source images; if the image quality fed to the semantic segmentation model is too low, the segmentation result is poor, and mixing low-quality images into the input also reduces the efficiency of semantic segmentation.
To address these problems, the invention designs a method for extracting targets from multi-focus images that combines traditional methods with deep learning. The multi-focus images are first registered using traditional methods. Coarse segmentation based on the color and contour of each target yields local images that each contain a single target, and a positioning label is formed for each target from the position of its center and the image in which it lies. To overcome the low efficiency and inconsistent standards of current evaluation methods, two evaluation scales are provided, a target definition judgment module and a target desirability judgment module, which respectively yield a definition coefficient and a desirability coefficient for each target. The scores of the two modules are combined into a segmentation value coefficient. Finally, for targets that share the same positioning label, the segmentation value coefficients are compared and only the targets worth segmenting are passed to a semantic segmentation module, yielding all clear, non-repeated targets in the multi-focus images. This addresses the coarse segmentation of traditional methods and the low efficiency and poor results of deep learning semantic segmentation used alone, so that both the quality and the efficiency of segmentation are taken into account.
The specific scheme of the invention is shown in figure 2.
Step 1: image registration;
Image registration matches multiple images acquired at different times, by different capture devices, or under different capture conditions. Its main objective is to find the most suitable transformation in the transformation space so that the two images become aligned. Specifically, one picture in the image group is randomly selected as the reference, and the other pictures in the group undergo feature selection, feature matching, image transformation and similar operations against the reference picture, so that the features of all pictures in the group become consistent.
Step 2: coarse segmentation based on color and contour features;
After the image features in the group have been made consistent in step 1, the boundary edges between foreground and background are sought, starting from the color difference between the required target and the irrelevant background or irrelevant targets, and from the distinctive contour of the required target. The existing images are thereby coarsely segmented into local images that each contain a candidate target.
Step 2.1, searching for pixels where foreground and background colors differ;
The image is converted to a gray-scale image; gradient operators in the x and y directions are derived from its gray-scale characteristics, the gradient is computed with first-order finite differences, and the operators are combined with the differences to obtain the magnitude and direction, from which the target edge contour is determined.
Step 2.2, detecting and judging the target contour features;
Global image features are used to connect the edge pixels into a closed region boundary; the image space is converted into a parameter space and points are described there, which achieves the detection of the image edges. All points that may lie on the edge are evaluated statistically, and the statistics determine whether the target contour features conform to the standard features.
Step 3: positioning the target;
From the target contour edges formed by the coarse segmentation of step 2, the central moments of the edge points are computed to obtain the coordinates of the center point; the center coordinates of the target then form a label associated with the target image.
Step 4: evaluating the definition of the target image;
Traditional evaluation of image definition applies a mathematical measure to the numerical matrix of the whole image, for example the variance, the image entropy or the gradient. In this step several traditional measures are standardized and then fused by training a machine learning model, so that the evaluation of image definition becomes more objective and accurate. The coarsely segmented targets with positioning information produced by steps 1, 2 and 3 are input into the model to evaluate their definition.
Step 4.1, training a multi-parameter fused definition evaluation regression model;
The scores of the training set on each parameter index are computed, the training set is given two-class labels, and the multi-parameter fused definition evaluation regression model is then trained.
Step 4.2, acquiring the definition score of the target image;
The target image is input into the trained model to compute its definition score.
Step 5: predicting the desirability of the target image;
The main purpose of this step is to determine whether the features of the target image are consistent with those of the required target, so a two-class deep learning model that recognizes the required target features is trained to predict the desirability of the target image. The coarsely segmented targets with positioning information produced by steps 1, 2 and 3 are input into the model to evaluate their desirability. This step runs in parallel with step 4.
Step 5.1, training a two-class deep learning model for required-target recognition based on texture features;
Target images that contain only the required target, and target images whose contour and color resemble the required target but whose texture features differ, are selected at a suitable scale to build a data set with associated labels for training the two-class deep learning model that recognizes the required target by its texture features.
Step 5.2, acquiring the desirability confidence of the target image;
The target image is input into the trained model to obtain its required (desirability) confidence.
Step 6: comprehensively judging the segmentation value of the target image;
The segmentation value of each image is judged from the image definition score obtained in step 4 and the required confidence obtained in step 5.
Step 6.1, calculating a segmentation value coefficient according to the definition score and the demand confidence coefficient;
step 6.2, dividing a high segmentation value target image group;
and setting a threshold according to the subsequent required data scale, and dividing the high segmentation value target image group.
Step 7: screening the non-repeated target images in the high-segmentation-value group according to their positioning information;
The positioning label associated with each target image is retrieved and compared with the positioning labels of the other targets; image groups with the same position are formed, and the non-repeated target images in the high-segmentation-value group are screened out according to the threshold set on the segmentation value coefficient obtained in step 6.
Step 8: finely segmenting the high-value target images;
A semantic segmentation model is used to finely segment the contours of the target images obtained in step 7.
Step 8.1, training a contour-based semantic segmentation model;
The target contours of the training set are drawn at pixel level to train the contour-based semantic segmentation model.
Step 8.2, drawing the fine contour of the target;
The target image is input into the trained semantic segmentation model to draw the fine contour of the target.
Step 9: obtaining the non-repeated clear required targets in the multi-focus images.
The non-repeated images with high segmentation value are put into the model trained in step 8.1 for fine semantic segmentation to obtain contour mask images; comparing the mask images with the source images separates out the non-repeated, clear, required targets in the multi-focus images.
Compared with the prior art, the invention has the advantages that:
1. and a multi-parameter definition evaluation model is established, so that the definition score of the image is more reasonable and closer to the reality. Meanwhile, the numerical range of the definition score is controlled to ensure the reasonability of the calculation of the subsequent segmentation value coefficient.
2. And the required confidence coefficient of the definition score secondary coupling classification model is used for enhancing the influence of the definition of the texture features on the classification confidence coefficient of the model, so that a more reasonable and scientific segmentation value coefficient is obtained.
3. The images screened by the target definition judging module and the target desirability judging module are sent to the semantic segmentation model, so that the semantic segmentation efficiency and the desirability of the segmented targets can be greatly improved.
4. Compared with the existing segmentation method, the semantic segmentation model trained by the training data labeled at the pixel level can segment the target more accurately, basically can completely remove useless background information, and improve the subsequent availability of the target image.
Drawings
Fig. 1 is a diagram illustrating a problem in a conventional method.
FIG. 2 is an overall model diagram of the method of the present invention.
Detailed Description
The following detailed description of embodiments of the invention is provided in conjunction with the accompanying drawings:
the invention relates to a method for extracting a target in a multi-focus image based on the combination of a traditional method and a deep learning method. The method includes the steps of conducting difference value standardization on definition scores and threshold values to obtain definition influence factors, then using the definition influence factors to correct required confidence coefficients to obtain segmentation value coefficients, and then determining whether to extract a target or not according to the coefficients and other labels of the target such as positioning information. And then, describing a specific flow of the multi-focus image target extraction method by taking the pollen image as a case. Firstly, registering the multifocal pollen images to keep the overall characteristics of the non-background targets of the images in a group of images consistent; then, roughly dividing the pollen-like target, the invalid background and the non-pollen target according to the color and contour difference; forming a related label according to the positioning information of each pollen-like target for later use; obtaining a segmentation value coefficient by performing definition judgment and model verification of demand judgment on the preliminarily screened pollen-like target; secondly, pollen targets with real fine segmentation value are screened and then placed into a semantic segmentation model to draw a contour; obtaining a single pollen image meeting the requirements.
Specifically, the method comprises the following steps:
Step 1: image registration:
First, the ORB algorithm is used to extract and describe feature points: FAST feature points are extracted with the FAST algorithm, a scale factor scale and a number of pyramid layers n are then set to build an image pyramid, and the original image is reduced into n images according to the scale factor. The scaled image is:
$$ I'_k = \frac{I}{\mathrm{scale}^{k}}, \qquad k = 1, 2, \ldots, n $$
where I is the original image, I' is the processed image, and k = 1, 2, …, n. The moments within a radius r around each feature point are then computed to find the centroid, and the vector from the feature point coordinates to the centroid gives the direction of the feature point. In this way all feature points and their directions are obtained. The corresponding key points in two images are then matched with a K-nearest-neighbor algorithm, and after at least four pairs of key points have been matched, the image transformation is performed with a homography.
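A minimal sketch of this registration step with OpenCV is given below, assuming grayscale inputs; the parameter values, the 0.75 matching ratio and the function name are illustrative assumptions rather than the patent's exact settings (the scale factor and pyramid layer count correspond to ORB's scaleFactor and nlevels arguments).

```python
import cv2
import numpy as np

def register_to_reference(reference, image, scale_factor=1.2, n_levels=8):
    """Align `image` to `reference` using ORB features, kNN matching and a homography."""
    orb = cv2.ORB_create(nfeatures=2000, scaleFactor=scale_factor, nlevels=n_levels)
    kp_ref, des_ref = orb.detectAndCompute(reference, None)
    kp_img, des_img = orb.detectAndCompute(image, None)

    # K-nearest-neighbour matching of descriptors, keeping only reliable pairs.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
    good = []
    for pair in matcher.knnMatch(des_img, des_ref, k=2):
        if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance:
            good.append(pair[0])
    if len(good) < 4:
        raise ValueError("at least four matched key point pairs are needed for a homography")

    src = np.float32([kp_img[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp_ref[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

    h, w = reference.shape[:2]
    return cv2.warpPerspective(image, H, (w, h))
```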
Step 2: coarse segmentation of pollen-like targets:
The main purpose of this step is to locate, in the image group registered in step 1, the targets whose color and contour features resemble those of the required target, and to segment them out of the image. Because existing methods still have considerable limitations in segmenting such targets, this step is called coarse segmentation.
Step 2.1, filtering and denoising the image:
Gaussian filtering is applied to the gray-scale image to remove interference noise and prevent noise from affecting the color and contour recognition in the subsequent steps.
Step 2.2, computing magnitude and direction with finite differences:
The two matrices of partial derivatives of the image in the x and y directions are obtained by computing the gradient with first-order finite differences; combining them with the Sobel operator gives the gradient magnitude and gradient direction, respectively:
$$ G(x, y) = \sqrt{\big(S_x * H(x, y)\big)^{2} + \big(S_y * H(x, y)\big)^{2}} $$
$$ \theta = \arctan \frac{S_y * H(x, y)}{S_x * H(x, y)} $$
where H(x, y) is the image, x and y are the horizontal and vertical pixel coordinates, S_x and S_y are the Sobel operator matrices set according to prior knowledge, G(x, y) is the gradient magnitude, and θ is the gradient direction.
Step 2.3, non-maximum suppression:
The 8 surrounding pixels centered on the current pixel are grouped, and linear interpolation is performed according to the computed gradient value and gradient direction, with larger weight given to the gradients closer to the gradient direction. The local maximum is then searched for, and the gray value of every non-maximum point is set to 0. After non-maximum suppression a binary image is obtained: non-edge points have gray value 0, and the local gray maxima that may be edges can be set to 128. This yields a rough edge contour.
Step 2.4, connecting edges with high and low thresholds:
Two thresholds are set. The high threshold removes false contours; where the edge points of the image cannot be closed with the high threshold, the low threshold is used to search the 8-pixel neighborhood around the unclosed pixel for points that can close the edge, finally forming a complete edge contour.
Step 2.5, circle-likeness detection of the edge:
Using the Hough gradient method, possible circle centers are searched along the gradient vector of each edge point, exploiting the fact that these vectors pass through the center; the radius is determined from the support that the non-zero edge pixels give to each candidate center, and the degree of coincidence between the drawn circle and the edge then decides whether the target edge contour is circle-like.
Step 2.6, cutting the target to form an independent pollen-like target image;
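Steps 2.1 to 2.6 together amount to a Canny-style edge pipeline (Gaussian denoising, Sobel gradients, non-maximum suppression, high/low hysteresis thresholds) followed by a Hough-gradient circle check and cropping of the circle-like targets. A minimal OpenCV sketch follows; every threshold, kernel size and radius limit is an illustrative assumption that would be tuned to the actual pollen data.

```python
import cv2
import numpy as np

def rough_edge_contour(gray, low_thresh=50, high_thresh=150):
    """Steps 2.1-2.4: Gaussian denoising, then Canny edge detection (Sobel gradients,
    non-maximum suppression and high/low hysteresis thresholds)."""
    blurred = cv2.GaussianBlur(gray, (5, 5), 1.4)
    return cv2.Canny(blurred, low_thresh, high_thresh)

def cut_round_targets(gray, min_radius=20, max_radius=200):
    """Steps 2.5-2.6: Hough-gradient circle detection, then crop every circle-like
    target into an independent patch. Returns (patch, centre) pairs."""
    blurred = cv2.GaussianBlur(gray, (5, 5), 1.4)
    circles = cv2.HoughCircles(blurred, cv2.HOUGH_GRADIENT, dp=1.5,
                               minDist=2 * min_radius,
                               param1=150,   # high Canny threshold used internally
                               param2=40,    # accumulator support needed per centre
                               minRadius=min_radius, maxRadius=max_radius)
    patches = []
    if circles is not None:
        for x, y, r in np.round(circles[0]).astype(int):
            x0, y0 = max(x - r, 0), max(y - r, 0)
            patches.append((gray[y0:y + r, x0:x + r], (x, y)))
    return patches
```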
and step 3: pollen-like target positioning:
in the step 2, the center of the target obtained by rough segmentation is associated with the target image to form a position tag, so that subsequent targets can be conveniently screened and retained.
Step 3.1, obtaining the target center coordinates to form a label:
The central moments of the edge points are computed to obtain the coordinates of the center point; the formulas for the moments and the center point are:
$$ m_{ij} = \sum_{x} \sum_{y} x^{i}\, y^{j}\, \mathrm{array}(x, y) $$
$$ (\bar{x}, \bar{y}) = \left( \frac{m_{10}}{m_{00}},\; \frac{m_{01}}{m_{00}} \right) $$
where i and j are the orders of the image moment, x and y are the horizontal and vertical pixel coordinates in the image, array(x, y) is the pixel value at the current (x, y) coordinate, and (x̄, ȳ) is the finally computed center point coordinate.
The target center coordinates are associated with the target image.
Step 3.2, forming a label by the name of the picture to which the target belongs;
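A sketch of steps 3.1 and 3.2 using OpenCV's moment computation; the label structure and function name are illustrative assumptions.

```python
import cv2

def positioning_label(contour, picture_name):
    """Compute the contour centroid from its spatial moments (x = m10/m00, y = m01/m00)
    and pair it with the name of the picture the target comes from."""
    m = cv2.moments(contour)
    cx = m["m10"] / m["m00"]
    cy = m["m01"] / m["m00"]
    return {"center": (cx, cy), "source_picture": picture_name}
```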
and 4, step 4: evaluating the definition of the pollen-like image:
the purpose of the step is to calculate the definition score of the pollen-like image, and because the existing methods for evaluating the definition of the image are numerous, the step adopts a fusion method to fuse various evaluation systems together to obtain the most objective definition evaluation score of the target after the processing of the steps 1,2 and 3.
Step 4.1, training the multi-parameter fusion definition evaluation regression model:
Based on investigation and experiment, the following parameters are selected as the basic parameters for model fusion in this step:
1. Tenengrad gradient function:
$$ D(f) = \sum_{y} \sum_{x} \lvert G(x, y) \rvert^{2}, \qquad G(x, y) > T $$
$$ G(x, y) = \sqrt{G_x^{2}(x, y) + G_y^{2}(x, y)} $$
where T is a given edge-detection threshold, G(x, y) is the gradient magnitude with Sobel components G_x and G_y, and x and y are the horizontal and vertical pixel coordinates in the image.
2. Vollath function:
$$ D(f) = \sum_{y} \sum_{x} f(x, y)\, f(x+1, y) - M\, N\, \mu^{2} $$
$$ \mu = \frac{1}{M N} \sum_{y} \sum_{x} f(x, y) $$
where μ is the average gray value of the whole image, M and N are the width and height of the image, f(x, y) is the pixel value at the current coordinate, and x and y are the horizontal and vertical pixel coordinates in the image.
3. Variance function:
$$ D(f) = \sum_{y} \sum_{x} \big( f(x, y) - \mu \big)^{2} $$
$$ \mu = \frac{1}{M N} \sum_{y} \sum_{x} f(x, y) $$
where μ is the average gray value of the whole image, M and N are the width and height of the image, f(x, y) is the pixel value at the current coordinate, and x and y are the horizontal and vertical pixel coordinates in the image.
4. Grayscale difference product function:
$$ D(f) = \sum_{y} \sum_{x} \lvert f(x, y) - f(x+1, y) \rvert \cdot \lvert f(x, y) - f(x, y+1) \rvert $$
where f(x, y) is the pixel value at the current coordinate, and x and y are the horizontal and vertical pixel coordinates in the image.
5. Brenner gradient function:
$$ D(f) = \sum_{y} \sum_{x} \lvert f(x+2, y) - f(x, y) \rvert^{2} $$
where f(x, y) is the pixel value at the current coordinate, and x and y are the horizontal and vertical pixel coordinates in the image.
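The five basic sharpness measures above can be written directly with NumPy and OpenCV; a minimal sketch, where image arrays are indexed as [row, column] (i.e. [y, x]) and the Tenengrad threshold T is an assumed value.

```python
import cv2
import numpy as np

def tenengrad(f, T=50):
    gx = cv2.Sobel(f, cv2.CV_64F, 1, 0)
    gy = cv2.Sobel(f, cv2.CV_64F, 0, 1)
    g = np.sqrt(gx ** 2 + gy ** 2)
    return float(np.sum(g[g > T] ** 2))

def vollath(f):
    f = f.astype(np.float64)
    return float(np.sum(f[:, :-1] * f[:, 1:]) - f.size * f.mean() ** 2)

def variance(f):
    f = f.astype(np.float64)
    return float(np.sum((f - f.mean()) ** 2))

def smd2(f):
    # grayscale difference product: |f(x,y)-f(x+1,y)| * |f(x,y)-f(x,y+1)|
    f = f.astype(np.float64)
    dx = np.abs(f[:-1, :-1] - f[:-1, 1:])
    dy = np.abs(f[:-1, :-1] - f[1:, :-1])
    return float(np.sum(dx * dy))

def brenner(f):
    f = f.astype(np.float64)
    return float(np.sum((f[:, 2:] - f[:, :-2]) ** 2))
```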
Each parameter is normalized according to its data scale and data attributes to obtain the basic element coefficients of the definition evaluation system:
(The normalization formula is given only as an image in the source document; f denotes an image input to the evaluation system.)
Finally, the definition evaluation system is formed:
$$ D(f) = \sum_{i} \mu_{i}\, D_{i}(f) $$
where μ_i is the trainable regression coefficient of the i-th basic element coefficient D_i(f) of the definition evaluation system. The values of the training set under each of the above functions are computed and used as the training input of the regression model, which is trained after labeling.
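A sketch of how the normalized measures could be fused into a trainable model with scikit-learn; because the patent gives the normalization only as an image, min-max scaling is used here purely as an assumption, and a logistic regression stands in for the two-class-labelled regression model.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import MinMaxScaler

def train_sharpness_model(raw_scores, labels):
    """raw_scores: one row per training image, columns = the five sharpness measures;
    labels: the two-class sharp/blurred annotation of the training set."""
    scaler = MinMaxScaler()                       # assumed normalisation
    features = scaler.fit_transform(raw_scores)   # basic element coefficients D_i(f)
    model = LogisticRegression()                  # trainable coefficients mu_i
    model.fit(features, labels)
    return scaler, model

def sharpness_score(scaler, model, raw_scores_one_image):
    features = scaler.transform(np.asarray(raw_scores_one_image).reshape(1, -1))
    return float(model.predict_proba(features)[0, 1])  # definition score in [0, 1]
```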
Step 4.2, obtaining the definition score of the pollen-like image:
inputting the target image into a model to calculate the definition score of the image.
Step 5: judging the pollen confidence of the pollen-like target based on texture features;
A ResNet model is trained to judge the pollen confidence of the pollen-like targets obtained after the processing of steps 1, 2 and 3.
Step 5.1, training a pollen recognition binary deep learning model based on texture features;
and performing two-classification labeling on the training set, and then training a ResNet model based on the texture features.
Step 5.2, calculating the pollen confidence of the pollen-like target;
The target image is input into the model to calculate its pollen confidence P_pollen.
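A sketch of the two-class ResNet confidence step with PyTorch/torchvision; the choice of resnet18, the input size and the preprocessing are assumptions, not the patent's exact configuration.

```python
import torch
import torch.nn as nn
from torchvision import models, transforms

model = models.resnet18(weights=None)            # any ResNet variant could be used
model.fc = nn.Linear(model.fc.in_features, 2)    # class 1 = pollen, class 0 = other
# ... train with cross-entropy on the two-class labelled patches, then:
model.eval()

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

def pollen_confidence(pil_patch):
    """Return P_pollen, the softmax probability of the pollen class."""
    x = preprocess(pil_patch).unsqueeze(0)
    with torch.no_grad():
        logits = model(x)
    return torch.softmax(logits, dim=1)[0, 1].item()
```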
Step 6: comprehensively judging the segmentation value of the pollen-like image;
The segmentation value of each pollen-like target is calculated by combining the scores from steps 4 and 5.
Step 6.1, calculating the segmentation value coefficient from the definition score and the pollen confidence:
Because deep learning is a black-box model, the influence of image definition on the pollen confidence cannot be derived analytically; this step therefore amplifies the influence of image definition on the pollen confidence with the following formula:
(The formula is given only as an image in the source document: the difference between the definition score and the threshold S_standard is standardized into a definition influence factor, which is then used to amplify the pollen confidence.)
S_standard is set manually, according to the scale and quality of the data required subsequently: a high threshold S_standard is set when high-quality data are required, and a low threshold S_standard is set when a large amount of data is required.
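Because the exact formula appears only as an image in the source, the following is purely an illustrative guess at its shape, based on the stated idea of standardizing the difference between the definition score and S_standard into an influence factor that amplifies the pollen confidence.

```python
def segmentation_value(definition_score, p_pollen, s_standard):
    """Illustrative only: amplify the pollen confidence by a definition influence
    factor derived from the standardized difference to the threshold S_standard."""
    influence = (definition_score - s_standard) / s_standard
    return p_pollen * (1.0 + influence)
```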
Step 6.2, dividing pollen image group with high segmentation value
And setting a segmentation value coefficient threshold value to divide the pollen image group with high segmentation value.
Step 7: screening the non-repeated pollen images among those with high segmentation value according to their positioning information;
Pollen targets whose positioning information is similar but which come from different images are grouped together, and the single high-quality pollen target in each group is screened out based on the segmentation value coefficient calculated in step 6.
Step 8: finely segmenting the high-value pollen images;
The main purpose of this step is to cut away the residual background information left after the preceding processing and coarse segmentation, so as to ensure the accuracy of the fine segmentation and the completeness of the pollen after fine segmentation.
Step 8.1, training a semantic segmentation model based on the pollen outline;
performing pixel-level contour labeling on the training set image to ensure the accuracy of model segmentation, and then training a semantic segmentation model based on the pollen contour;
step 8.2, drawing a fine outline of the pollen target;
and (4) placing the pollen image screened in the step into a model for fine segmentation to obtain a fine outline of the pollen image.
Step 9: obtaining the non-repeated clear pollen images in the multi-focus images.
The contour mask image obtained in the previous step is overlaid on the original image to cut out the non-repeated clear pollen images in the multi-focus images.
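A sketch of the final overlay, assuming the semantic segmentation model returns a binary contour mask of the same size as the source picture; pixels outside the mask are zeroed so that only the clear pollen grain remains.

```python
import cv2
import numpy as np

def cut_out_target(source_image, contour_mask):
    """Keep only the source pixels covered by the predicted mask."""
    binary = (contour_mask > 0).astype(np.uint8)
    return cv2.bitwise_and(source_image, source_image, mask=binary)
```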

Claims (5)

1. A method for segmenting targets in multi-focus images combined with a deep learning method, characterized in that the method comprises the following steps:
step 1: image registration;
randomly selecting one picture in the image group as a reference, and performing feature selection, feature matching and image transformation operations on other images in the image group by contrasting the reference image to finally realize the feature consistency of all images in the image group;
step 2: coarse segmentation based on color and contour features;
after image features in the image group are adjusted to be consistent through the step 1, a boundary edge between a foreground and a background is searched by taking the color difference between a required target and an irrelevant background and between the irrelevant target and a special outline of the required target as starting points; therefore, the existing image is roughly segmented to obtain a local image containing a class seeking target;
and step 3: positioning a target;
calculating the center distance of the edge points of the target according to the edge of the target contour formed after rough segmentation in the step 2 so as to obtain the coordinates of the center point, and forming a label by the center coordinates of the target to be associated with the target image;
and 4, step 4: evaluating the definition of a target image;
inputting the roughly segmented target with positioning information processed in the previous steps 1,2 and 3 into a model to evaluate the image definition;
and 5: predicting the demand of the target image;
inputting the roughly segmented target with the positioning information processed in the steps 1,2 and 3 into a model to evaluate the image demand degree; this step is parallel to step 4;
step 6: comprehensively judging the segmentation value of the target image;
solving segmentation value judgment of the image according to the image definition score obtained in the step 4 and the requirement confidence coefficient of the image obtained in the step 5;
and 7: screening non-repetitive target images in a high segmentation value target image group according to the positioning information of the target images;
acquiring a positioning label associated with a target image, comparing the positioning label with positioning labels of other targets, dividing an image group with the same positioning, and screening out a non-repetitive target image in a high-segmentation-value target image group according to a segmentation value coefficient set threshold obtained in the step 6 of the image;
and 8: finely dividing a high-value target image;
carrying out contour fine segmentation on the target image obtained in the step (7) by adopting a semantic segmentation model;
step 8.1, training a semantic segmentation model based on the contour;
performing pixel-level drawing on a target contour of a training set to train a semantic segmentation model based on the contour;
step 8.2, drawing a target fine contour;
inputting the target image into a trained semantic segmentation model to draw a target fine contour;
and step 9: obtaining a non-repetitive clear demand target in a multi-focus image;
and (3) placing the calculated non-repetitive image with high segmentation value into the model trained in the step 8.1 for semantic fine segmentation to obtain a contour mask image, and comparing the mask image with the source image to separate out a non-repetitive clear demand target in the multi-focus image.
2. The method for segmenting the target in the multi-focus image combined with the deep learning method according to claim 1, characterized in that: the implementation of step 2 is as follows,
step 2.1, searching pixel points with different foreground and background colors;
converting the image into a gray scale image, then calculating gradient operators in the x and y directions based on gray scale characteristics of the image, then calculating the gradient by using first-order finite difference, and multiplying the gradient operators by difference to calculate the amplitude and the direction so as to determine the edge profile of the target;
step 2.2, detecting and judging the target contour features;
connecting edge pixels by using the global features of the image to form a region closed boundary, converting an image space into a parameter space, and describing points in the parameter space to achieve the purpose of detecting the edge of the image; and performing statistical calculation on all the points which possibly fall on the edge, and determining whether the target contour feature meets the standard feature according to the statistical result of the data.
3. The method for segmenting the target in the multi-focus image combined with the deep learning method according to claim 1, characterized in that: the implementation of step 4 is as follows,
step 4.1, training a multi-parameter fusion definition evaluation regression model;
calculating scores of the training set on each parameter index, then carrying out two-class labeling on the training set, and then training a multi-parameter fusion definition evaluation regression model;
step 4.2, acquiring the definition score of the target image;
and inputting the target image into the trained model to calculate the definition score of the target image.
4. The method for segmenting the target in the multi-focus image combined with the deep learning method according to claim 1, characterized in that: the implementation of step 5 is as follows,
step 5.1, training a two-classification deep learning model based on textural feature demand target recognition;
selecting a target image which only contains a required target image and has similar contour color and required target but different texture characteristics at a proper scale to make a data set and associating a label for training a two-classification deep learning model based on texture characteristic required target identification;
step 5.2, acquiring the required confidence of the target image;
and inputting the target image into the trained model to obtain the required confidence of the target.
5. The method for segmenting the target in the multi-focus image combined with the deep learning method according to claim 1, characterized in that: the implementation of step 6 is as follows,
step 6.1, calculating a segmentation value coefficient according to the definition score and the demand confidence coefficient;
step 6.2, dividing a high segmentation value target image group;
and setting a threshold according to the subsequent required data scale, and dividing the high segmentation value target image group.
CN202210427559.7A 2022-04-21 2022-04-21 Method for segmenting target in multi-focus image combined with deep learning method Pending CN114926635A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210427559.7A CN114926635A (en) 2022-04-21 2022-04-21 Method for segmenting target in multi-focus image combined with deep learning method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210427559.7A CN114926635A (en) 2022-04-21 2022-04-21 Method for segmenting target in multi-focus image combined with deep learning method

Publications (1)

Publication Number Publication Date
CN114926635A true CN114926635A (en) 2022-08-19

Family

ID=82806541

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210427559.7A Pending CN114926635A (en) 2022-04-21 2022-04-21 Method for segmenting target in multi-focus image combined with deep learning method

Country Status (1)

Country Link
CN (1) CN114926635A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115578734A (en) * 2022-09-23 2023-01-06 神州数码系统集成服务有限公司 Single character image matching and identifying method based on pyramid features


Similar Documents

Publication Publication Date Title
CN107545239B (en) Fake plate detection method based on license plate recognition and vehicle characteristic matching
CN104751187B (en) Meter reading automatic distinguishing method for image
US8340420B2 (en) Method for recognizing objects in images
Khan et al. An efficient contour based fine-grained algorithm for multi category object detection
CN109801267B (en) Inspection target defect detection method based on feature point detection and SVM classifier
CN112686812B (en) Bank card inclination correction detection method and device, readable storage medium and terminal
CN107610114A (en) Optical satellite remote sensing image cloud snow mist detection method based on SVMs
CN108537751B (en) Thyroid ultrasound image automatic segmentation method based on radial basis function neural network
CN111145209A (en) Medical image segmentation method, device, equipment and storage medium
CN109523524B (en) Eye fundus image hard exudation detection method based on ensemble learning
Thalji et al. Iris Recognition using robust algorithm for eyelid, eyelash and shadow avoiding
Chuang et al. Supervised and unsupervised feature extraction methods for underwater fish species recognition
CN110020692A (en) A kind of handwritten form separation and localization method based on block letter template
CN108509950B (en) Railway contact net support number plate detection and identification method based on probability feature weighted fusion
Wazalwar et al. A design flow for robust license plate localization and recognition in complex scenes
CN111460884A (en) Multi-face recognition method based on human body tracking
CN108073940B (en) Method for detecting 3D target example object in unstructured environment
CN111161295A (en) Background stripping method for dish image
Lodh et al. Flower recognition system based on color and GIST features
CN109389165A (en) Oil level gauge for transformer recognition methods based on crusing robot
CN111695373A (en) Zebra crossing positioning method, system, medium and device
Wu et al. Contour restoration of text components for recognition in video/scene images
WO2022061922A1 (en) Method and apparatus for analyzing microstructure of material
CN114926635A (en) Method for segmenting target in multi-focus image combined with deep learning method
CN114708645A (en) Object identification device and object identification method

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination