WO2021057848A1 - Network training method, image processing method, network, terminal device and medium - Google Patents

Network training method, image processing method, network, terminal device and medium

Info

Publication number
WO2021057848A1
WO2021057848A1 · PCT/CN2020/117470 · CN2020117470W
Authority
WO
WIPO (PCT)
Prior art keywords
sample
image
mask
edge
neural network
Prior art date
Application number
PCT/CN2020/117470
Other languages
French (fr)
Chinese (zh)
Inventor
刘钰安
Original Assignee
Oppo广东移动通信有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Oppo广东移动通信有限公司 filed Critical Oppo广东移动通信有限公司
Publication of WO2021057848A1 publication Critical patent/WO2021057848A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/12Edge-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/136Segmentation; Edge detection involving thresholding
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/194Segmentation; Edge detection involving foreground-background segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30196Human being; Person
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Definitions

  • This application relates to the field of image processing technology, and in particular to an image segmentation network training method, image processing method, image segmentation network, terminal equipment, and computer-readable storage medium.
  • The currently common approach is to use a trained image segmentation network to output a mask representing the area where the target object (i.e. the foreground, such as a portrait) is located, use that mask to segment the target object out of the image, and then change the image background.
  • the mask output by the current image segmentation network cannot accurately represent the contour edge of the target object, so that the target object cannot be accurately segmented, and the effect of replacing the image background is poor. Therefore, how to enable the mask output by the image segmentation network to more accurately represent the contour edge of the target object is a technical problem that needs to be solved urgently.
  • The purpose of the embodiments of this application is to provide an image segmentation network training method, an image processing method, an image segmentation network, terminal equipment, and a computer-readable storage medium, which can, to a certain extent, enable the mask output by the trained image segmentation network to more accurately represent the contour edge of the target object.
  • In a first aspect, an image segmentation network training method is provided, which includes steps S101-S105:
  • S101: Obtain sample images each containing a target object, a sample mask corresponding to each sample image, and sample edge information corresponding to each sample mask, where each sample mask is used to indicate the image area where the target object is located in the corresponding sample image, and each piece of sample edge information is used to indicate the contour edge of the image area where the target object indicated by the corresponding sample mask is located;
  • S102: For each sample image, input the sample image to an image segmentation network, and obtain a generated mask output by the image segmentation network for indicating the area where the target object in the sample image is located;
  • S103: For each generated mask, input the generated mask to a trained edge neural network to obtain generated edge information output by the edge neural network, where the generated edge information is used to indicate the contour edge of the area where the target object indicated by the generated mask is located;
  • S104: Determine the loss function of the image segmentation network, where the loss function is used to measure the gap between the sample mask corresponding to each sample image and the generated mask, and the loss function is also used to measure the gap between the generated edge information and the sample edge information corresponding to each sample image;
  • S105 Adjust various parameters of the image segmentation network, and then return to execute S102 until the loss function of the image segmentation network is less than a first preset threshold, thereby obtaining a trained image segmentation network.
  • In a second aspect, an image processing method is provided, including:
  • obtaining an image to be processed, and inputting the image to be processed into a trained image segmentation network to obtain a mask corresponding to the image to be processed, wherein the trained image segmentation network is trained using a trained edge neural network, and the trained edge neural network is used to output, according to an input mask, the contour edge of the area where the target object indicated by that mask is located;
  • based on the mask corresponding to the image to be processed, segmenting out the target object contained in the image to be processed.
  • In a third aspect, an image segmentation network is provided, which is obtained by training with the training method described in the first aspect.
  • In a fourth aspect, a terminal device is provided, including a memory, a processor, and a computer program that is stored in the memory and can run on the processor; when the processor executes the computer program, the steps of the method described in the first aspect or the second aspect are implemented.
  • A computer-readable storage medium is also provided, which stores a computer program; when the computer program is executed by a processor, the steps of the method described in the first aspect or the second aspect are implemented.
  • A computer program product is also provided, which includes a computer program; when the computer program is executed by one or more processors, the steps of the method described in the first aspect or the second aspect are implemented.
  • In this application, when the image segmentation network is trained, a trained edge neural network is used to assist the training.
  • As shown in Fig. 1, the trained edge neural network 001 takes the generated mask 002 as input and, according to the image area indicated by the generated mask 002 (the pure white area), outputs generated edge information 003; the edge information is used to indicate the location of the contour edge of that image area, and in Fig. 1 the generated edge information 003 is presented in the form of an image.
  • The training method provided by this application includes the following steps. First, for each sample image, the sample image is input to the image segmentation network to obtain the generated mask output by the image segmentation network, and the generated mask is input to the trained edge neural network to obtain the generated edge information output by the edge neural network. Secondly, the loss function of the image segmentation network is determined; the loss function is positively correlated with the mask gap corresponding to each sample image (the mask gap corresponding to a sample image is the gap between the sample mask corresponding to that sample image and the generated mask), and the loss function is also positively correlated with the edge gap corresponding to each sample image (the edge gap corresponding to a sample image is the gap between the sample edge information corresponding to that sample image and the generated edge information). Finally, the various parameters of the image segmentation network are adjusted until the loss function is less than the first preset threshold.
  • Because the above training method ensures not only that the generated mask output by the image segmentation network is close to the sample mask, but also that the contour edge of the target object represented in the generated mask is close to the actual contour edge, the mask image output by the image segmentation network provided by this application can more accurately represent the contour edge of the target object.
  • Fig. 1 is a schematic diagram of the working principle of a trained edge neural network provided by the present application
  • FIG. 2 is a schematic diagram of a training method of an image segmentation network provided by Embodiment 1 of the present application;
  • FIG. 3 is a schematic diagram of a sample image, sample mask, and sample edge information provided in Embodiment 1 of the present application;
  • FIG. 4 is a schematic structural diagram of an image segmentation network provided by Embodiment 1 of the present application.
  • FIG. 5 is a schematic diagram of the connection relationship between the image segmentation network provided in the first embodiment of the present application and the trained edge neural network;
  • FIG. 6 is a schematic diagram of the structure of the edge neural network provided in the first embodiment of the present application.
  • FIG. 7 is a schematic diagram of another image segmentation network training method provided in Embodiment 2 of the present application.
  • FIG. 8(a) is a schematic diagram of the training process of the edge segmentation network provided in the second embodiment of the present application.
  • Fig. 8(b) is a schematic diagram of the training process of the image segmentation network provided in the second embodiment of the present application.
  • FIG. 9 is a schematic diagram of the work flow of the image processing method provided in the third embodiment of the present application.
  • FIG. 10 is a schematic structural diagram of an image segmentation network training device provided in the fourth embodiment of the present application.
  • FIG. 11 is a schematic structural diagram of an image processing apparatus according to Embodiment 5 of the present application.
  • FIG. 12 is a schematic structural diagram of a terminal device according to Embodiment 6 of the present application.
  • the method provided in the embodiments of the present application may be applicable to terminal devices.
  • the terminal devices include, but are not limited to: smart phones, tablet computers, notebooks, desktop computers, cloud servers, and the like.
  • Depending on the context, the term "if" can be interpreted as "when", "once", "in response to determining" or "in response to detecting".
  • Similarly, depending on the context, the phrase "if it is determined" or "if [the described condition or event] is detected" can be interpreted as "once it is determined", "in response to determining", "once [the described condition or event] is detected" or "in response to detecting [the described condition or event]".
  • the training method includes:
  • In step S101, sample images each containing the target object, the sample mask corresponding to each sample image, and the sample edge information corresponding to each sample mask are obtained, where each sample mask is used to indicate the image area where the target object is located in the corresponding sample image, and each piece of sample edge information is used to indicate the contour edge of the image area where the target object indicated by the corresponding sample mask is located.
  • Specifically, a portion of the sample images can first be obtained from a data set, and the number of sample images used for training the image segmentation network can then be expanded by applying mirror flipping, scaling and/or gamma transformations, etc. to the pre-obtained sample images, so as to obtain the sample images described in step S101.
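  • As an illustration only (not part of the original disclosure), the following sketch shows one way such augmentation could be implemented, assuming the sample image and sample mask are NumPy arrays and using OpenCV; the scale and gamma values are arbitrary example parameters.

```python
# Hedged sketch of the augmentation step: mirror flip, scaling, and a gamma change.
# Assumptions: image is an 8-bit numpy array, mask is a binary numpy array of the same size.
import cv2
import numpy as np

def augment_pair(image, mask, scale=1.2, gamma=0.8):
    pairs = [(image, mask)]
    # Mirror inversion (horizontal flip), applied to image and mask together.
    pairs.append((cv2.flip(image, 1), cv2.flip(mask, 1)))
    # Scale change; the mask uses nearest-neighbour interpolation so it stays binary.
    scaled_img = cv2.resize(image, None, fx=scale, fy=scale, interpolation=cv2.INTER_LINEAR)
    scaled_msk = cv2.resize(mask, None, fx=scale, fy=scale, interpolation=cv2.INTER_NEAREST)
    pairs.append((scaled_img, scaled_msk))
    # Gamma change on the image only; the mask is unchanged.
    gamma_img = (255.0 * (image / 255.0) ** gamma).astype(np.uint8)
    pairs.append((gamma_img, mask))
    return pairs
```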
  • the sample mask described in this application is a binary image.
  • The sample edge information corresponding to a given sample mask in step S101 may be obtained as follows: perform a dilation (expansion) operation on the sample mask to obtain a dilated mask image, and then subtract the sample mask from the dilated mask image to obtain the sample edge information corresponding to that sample mask.
  • The sample edge information obtained in this way is, like the sample mask, a binary image.
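  • A minimal sketch of this dilation-and-subtraction step, assuming the sample mask is a binary NumPy array with foreground pixels equal to 1 and using OpenCV (the 5×5 kernel size is an illustrative assumption):

```python
# Dilate the sample mask, then subtract the original mask: what remains is a thin
# binary band along the contour of the target object, i.e. the sample edge information.
import cv2
import numpy as np

def sample_edge_from_mask(sample_mask, kernel_size=5):
    kernel = np.ones((kernel_size, kernel_size), np.uint8)
    dilated = cv2.dilate(sample_mask.astype(np.uint8), kernel, iterations=1)
    return dilated - sample_mask.astype(np.uint8)
```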
  • As shown in Fig. 3, image 201 is a sample image containing a target object (i.e. a portrait), image 202 may be the sample mask corresponding to sample image 201, and image 203 may be the sample edge information corresponding to sample mask 202.
  • The above sample edge information is not necessarily a binary image; it can also take other forms of expression, as long as it can reflect the contour edge of the image area where the target object indicated by the sample mask is located.
  • the above-mentioned target object may be any subject, such as a portrait, a dog, a cat, etc., and this application does not limit the category of the target object.
  • the image content contained in each sample image should be as different as possible.
  • the image content contained in sample image 1 can be a frontal portrait of Xiao Ming.
  • the image content contained in the sample image 2 may be a half-profile portrait of Xiaohong.
  • step S102 for each sample image, the sample image is input to the image segmentation network, and a generation mask output by the image segmentation network for indicating the area of the target object in the sample image is obtained.
  • An image segmentation network needs to be established in advance; the image segmentation network is used to output a mask corresponding to the input image (that is, a generated mask).
  • the image segmentation network may be CNN (Convolutional Neural Networks, Convolutional Neural Network), or FPN (Feature Pyramid Networks, Feature Pyramid Network), and this application does not limit the specific network structure of the image segmentation network.
  • the image segmentation network using the FPN structure can be specifically referred to in Figure 4.
  • The training of the image segmentation network starts from step S102.
  • each sample image needs to be input to the image segmentation network to obtain each generation mask output by the image segmentation network, where each generation mask corresponds to a sample image.
  • the "generating mask" described in this step is the same as the sample mask described in step S101, and may be a binary image.
  • In step S103, for each generated mask, the generated mask is input to the trained edge neural network to obtain the generated edge information output by the edge neural network; the generated edge information is used to indicate the contour edge of the area where the target object indicated by the generated mask is located.
  • Before performing step S103, a trained edge neural network needs to be obtained.
  • The trained edge neural network is used to output generated edge information according to the input generated mask.
  • The generated edge information is used to indicate the contour edge of the area where the target object indicated by the input generated mask is located.
  • the edge neural network after training may be as shown in FIG. 1.
  • As shown in FIG. 1, after the generated mask shown at 002 is input into the trained edge neural network shown at 001, the network outputs the generated edge information shown at 003.
  • In this step, each of the generated masks described in step S102 is input to the trained edge neural network, and each piece of generated edge information output by the trained edge neural network is obtained, where each piece of generated edge information corresponds to one generated mask and is used to represent the contour edge of the image area where the target object indicated by that generated mask is located.
  • the connection between the image segmentation network and the trained edge neural network is shown in FIG. 5.
  • In step S104, the loss function of the above image segmentation network is determined.
  • The loss function is used to measure the gap between the sample mask and the generated mask corresponding to each sample image, and the loss function is also used to measure the gap between the generated edge information and the sample edge information corresponding to each sample image.
  • After steps S101-S103, each sample image corresponds to a sample mask, sample edge information, a generated mask, and generated edge information.
  • For each sample image, it is necessary to calculate the gap between the sample mask corresponding to the sample image and the generated mask (for convenience of description, for a given sample image, the gap between the sample mask corresponding to that sample image and the generated mask is defined as the mask gap corresponding to that sample image), and it is also necessary to calculate the gap between the sample edge information corresponding to the sample image and the generated edge information (for convenience of description, for a given sample image, the gap between the sample edge information corresponding to that sample image and the generated edge information is defined as the edge gap corresponding to that sample image).
  • In step S104, the loss function of the aforementioned image segmentation network needs to be calculated; the loss function is positively correlated with the mask gap corresponding to each sample image, and is also positively correlated with the edge gap corresponding to each sample image.
  • the calculation process of the aforementioned loss function may be:
  • Step A: For each sample image, calculate the image difference between the generated mask corresponding to that sample image and the sample mask corresponding to that sample image (the image difference can be computed from the per-pixel differences, where m1_i is the pixel value of the i-th pixel of the generated mask, m2_i is the pixel value of the i-th pixel of the sample mask, and M is the total number of pixels in the generated mask).
  • Step B If the above sample edge information and generated edge information are both images, then for each sample image, calculate the image difference between the sample edge information corresponding to the sample image and the generated edge information corresponding to the sample image (calculation of the image difference Refer to step A).
  • Step C: The image differences obtained in step A and the image differences obtained in step B can be averaged (if the number of sample images is N, the image differences obtained in step A and step B can be summed and then divided by 2N) to obtain the loss function.
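  • A minimal sketch of steps A-C, assuming the "image difference" is taken as the mean absolute per-pixel difference (the exact difference measure is not fixed by the text above):

```python
# Steps A-C: per-sample mask differences plus per-sample edge differences,
# averaged over 2N terms to obtain the loss value.
import numpy as np

def image_difference(a, b):
    return float(np.mean(np.abs(a.astype(np.float32) - b.astype(np.float32))))

def segmentation_loss(generated_masks, sample_masks, generated_edges, sample_edges):
    n = len(sample_masks)
    diffs_a = [image_difference(g, s) for g, s in zip(generated_masks, sample_masks)]  # step A
    diffs_b = [image_difference(g, s) for g, s in zip(generated_edges, sample_edges)]  # step B
    return (sum(diffs_a) + sum(diffs_b)) / (2 * n)                                     # step C
```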
  • the calculation method of the aforementioned loss function is not limited to the aforementioned step A-step C.
  • the aforementioned loss function can also be calculated by the following formula (1):
  • where LOSS_1 is the loss function of the above image segmentation network, N is the total number of sample images, F1_j is used to measure the gap between the sample mask corresponding to the j-th sample image and the generated mask, and F2_j is used to measure the gap between the sample edge information corresponding to the j-th sample image and the generated edge information.
  • The calculation method of F1_j may be: calculate the cross-entropy loss between the sample mask and the generated mask corresponding to the j-th sample image, i.e. formula (2): F1_j = -∑_{i=1..M} [y_ji·log_x(p_ji) + (1-y_ji)·log_x(1-p_ji)], in which M is the total number of pixels in the j-th sample image, the value of y_ji is determined according to the sample mask corresponding to the j-th sample image, y_ji is used to indicate whether the i-th pixel in the j-th sample image is located in the image area where the target object is located, p_ji is the probability, predicted by the image segmentation network, that the i-th pixel in the j-th sample image is located in the image area where the target object is located, and x is the base of the logarithm log.
  • The value of y_ji is determined according to the sample mask corresponding to the j-th sample image. For example, if the sample mask corresponding to the j-th sample image indicates that the i-th pixel in the j-th sample image is located in the image area where the target object is located, y_ji can be 1; if the sample mask indicates that the i-th pixel is not located in the image area where the target object is located, y_ji can be 0. Those skilled in the art should understand that the value of y_ji is not limited to 1 and 0 and can also be other preset values.
  • That is, the value of y_ji when the sample mask indicates that the i-th pixel is located in the image area where the target object is located is greater than the value of y_ji when the sample mask indicates that the i-th pixel is not located in that image area. For example, if the sample mask indicates that the i-th pixel is located in the image area where the target object is located, y_ji is 1, otherwise y_ji is 0; or, y_ji is 2, otherwise 1; or, y_ji is 0.8, otherwise 0.2.
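  • A minimal sketch of this per-pixel cross-entropy term for one sample image, assuming y_ji takes the values 1/0, that the natural logarithm is used, and that the sum is averaged over the M pixels (the base x and the normalization are left open above, so these are assumptions):

```python
# F1_j: binary cross entropy between the sample-mask labels y and the probabilities p
# predicted by the image segmentation network for one sample image.
import numpy as np

def f1_cross_entropy(p, y, eps=1e-7):
    # p: predicted probabilities, shape (H, W), values in (0, 1)
    # y: sample-mask labels, shape (H, W), values in {0, 1}
    p = np.clip(p, eps, 1.0 - eps)
    ce = -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
    return float(ce.mean())
```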
  • The calculation method of F2_j may be similar to the above formula (2), that is, the cross-entropy loss between the sample edge information corresponding to the j-th sample image and the generated edge information is calculated.
  • Alternatively, formula (3) may be used, in which mask_1 is the generated mask corresponding to the j-th sample image, mask_2 is the sample mask corresponding to the j-th sample image, h_c(mask_1) is the output of the c-th convolutional block of the trained edge neural network when its input is mask_1, h_c(mask_2) is the output of the c-th convolutional block when the input is mask_2, and λ_c is a constant.
  • The gap between the sample edge information and the generated edge information can be measured by the above formula (3).
  • the edge neural network can be formed by cascading three convolutional blocks, and each convolutional block is a convolutional layer.
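  • The following sketch illustrates such an edge neural network of three cascaded convolutional blocks together with a feature-matching form of F2_j in the spirit of formula (3); the layer widths, the ReLU activations, the L1 distance and the λ_c weights are illustrative assumptions, not values taken from the original disclosure.

```python
# Edge network of three cascaded convolutional blocks (one conv layer each) and a
# formula (3)-style term: a weighted sum over blocks c of the difference between
# h_c(mask1) (generated mask) and h_c(mask2) (sample mask).
import torch
import torch.nn as nn

class EdgeNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.blocks = nn.ModuleList([
            nn.Conv2d(1, 16, 3, padding=1),
            nn.Conv2d(16, 16, 3, padding=1),
            nn.Conv2d(16, 1, 3, padding=1),
        ])

    def forward(self, x):
        for block in self.blocks:
            x = torch.relu(block(x))
        return x

    def features(self, x):
        outs = []
        for block in self.blocks:
            x = torch.relu(block(x))
            outs.append(x)          # h_c(x) for c = 1, 2, 3
        return outs

def f2_feature_loss(edge_net, mask1, mask2, lambdas=(1.0, 1.0, 1.0)):
    # mask1: generated masks, mask2: sample masks, both of shape (N, 1, H, W).
    feats1 = edge_net.features(mask1)
    feats2 = edge_net.features(mask2)
    return sum(l * (f1 - f2).abs().mean() for l, f1, f2 in zip(lambdas, feats1, feats2))
```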
  • step S105 it is determined whether the aforementioned loss function is less than the first preset threshold, if so, step S107 is executed, otherwise, step S106 is executed.
  • step S106 adjust each parameter of the above-mentioned image segmentation network, and then return to perform step S102.
  • step S107 a trained image segmentation network is obtained.
  • the parameters of the image segmentation network are continuously adjusted until the loss function is less than the first preset threshold.
  • the parameter adjustment method is not specifically limited, and a gradient descent algorithm, a power update algorithm, etc. can be used, and the method used for adjusting the parameters is not limited here.
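  • Putting steps S102-S107 together, a minimal training-loop sketch is given below; it assumes PyTorch, a segmentation_net that maps an image to a mask, a frozen trained edge_net, a loss_fn implementing the loss function described above, and an Adam optimizer, all of which are illustrative choices rather than requirements of the method.

```python
# S102: generate masks; S103: pass them through the frozen edge network;
# S104/S105: compute the loss and check it against the first preset threshold;
# S106: adjust the parameters of the segmentation network and repeat.
import torch

def train_segmentation(segmentation_net, edge_net, loader, loss_fn,
                       first_threshold=0.05, lr=1e-3, max_epochs=100):
    edge_net.eval()                                   # the edge network stays fixed
    for p in edge_net.parameters():
        p.requires_grad_(False)
    opt = torch.optim.Adam(segmentation_net.parameters(), lr=lr)
    for _ in range(max_epochs):
        for images, sample_masks, sample_edges in loader:
            generated_masks = segmentation_net(images)           # S102
            generated_edges = edge_net(generated_masks)          # S103
            loss = loss_fn(generated_masks, sample_masks,
                           generated_edges, sample_edges)        # S104
            opt.zero_grad()
            loss.backward()
            opt.step()                                           # S106: adjust parameters
        if loss.item() < first_threshold:                        # S105/S107
            break
    return segmentation_net
```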
  • When the image segmentation network is trained, before a sample image is input to the image segmentation network, the sample image can first be preprocessed, and the preprocessed sample image is then input to the image segmentation network.
  • the above-mentioned preprocessing may include: image cropping and/or normalization processing and so on.
  • the test set can also be used to evaluate the trained image segmentation network.
  • the method of obtaining the test set can be referred to the prior art, and will not be repeated here.
  • The evaluation function can be the intersection-over-union (IoU) of X and Y, where X is the image area of the target object indicated by the generated mask output by the trained image segmentation network after a sample image from the test set is input to it, and Y is the image area of the target object indicated by the sample mask corresponding to that sample image; the IoU of X and Y is used to evaluate the trained image segmentation network.
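  • A minimal IoU sketch, assuming X and Y are available as binary masks stored in NumPy arrays:

```python
# IoU(X, Y) = |X intersection Y| / |X union Y|, used to evaluate the trained segmentation network.
import numpy as np

def iou(x_mask, y_mask):
    x = x_mask.astype(bool)
    y = y_mask.astype(bool)
    union = np.logical_or(x, y).sum()
    if union == 0:
        return 1.0                      # both masks empty: treat as a perfect match
    return float(np.logical_and(x, y).sum() / union)
```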
  • By evaluating the trained image segmentation network, it can be further determined whether its performance meets the requirements; for example, if it is determined that the performance does not meet the requirements, training of the image segmentation network is continued.
  • The training method provided in the first embodiment of this application ensures that the generated mask output by the image segmentation network is close to the sample mask and, at the same time, further ensures that the contour edge of the target object represented in the generated mask output by the image segmentation network is closer to the true contour edge. Therefore, the image corresponding to the generated mask output by the image segmentation network provided by this application can more accurately represent the contour edge of the target object.
  • In the second embodiment, the training method includes the training process of the edge neural network; please refer to Fig. 7.
  • the training method includes:
  • Sample images each containing the target object, the sample mask corresponding to each sample image, and the sample edge information corresponding to each sample mask are obtained, where each sample mask is used to indicate the image area where the target object is located in the corresponding sample image, and each piece of sample edge information is used to indicate the contour edge of the image area where the target object indicated by the corresponding sample mask is located.
  • For the details of step S301, please refer to the description of step S101 in the first embodiment, which will not be repeated here.
  • In step S302, for each sample mask, the sample mask is input to the edge neural network to obtain the edge information output by the edge neural network; the edge information is used to indicate the contour edge of the area where the target object indicated by the sample mask is located.
  • Steps S302 to S306 constitute the training process of the edge neural network and yield the trained edge neural network.
  • steps S302-S306 are executed before the subsequent step S308, and need not be executed before the step S307.
  • an edge neural network needs to be established in advance, and the edge neural network is used to obtain the contour edge of the area where the target object indicated by the input sample mask is located.
  • the edge neural network can be formed by cascading three convolutional layers.
  • each sample mask is input to the edge neural network to obtain each edge information output by the edge neural network, wherein each sample mask corresponds to one edge information output by the edge neural network.
  • In step S303, the loss function of the edge neural network is determined; the loss function is used to measure the gap between the sample edge information corresponding to each sample mask and the edge information output by the edge neural network.
  • That is, step S303 determines the loss function of the above edge neural network, where the loss function is positively correlated with the edge gap corresponding to each sample mask (the edge gap corresponding to a sample mask is the gap between the sample edge information corresponding to that sample mask and the edge information output by the edge neural network after the sample mask is input to it).
  • The loss function of the above edge neural network may be, for example, the image difference between the sample edge information and the edge information output by the edge neural network (for the image-difference calculation method, refer to step A in the first embodiment, which will not be repeated here).
  • the calculation method of the loss function of the edge neural network may be: for each sample mask, calculate the cross-entropy loss of the corresponding sample edge information and the edge information output by the edge neural network, and then calculate the average.
  • the specific calculation formula is as follows:
  • LOSS_2 = -(1/N)·∑_{j=1..N} ∑_{i=1..M} [r_ji·log_x(q_ji) + (1-r_ji)·log_x(1-q_ji)], where LOSS_2 is the loss function of the aforementioned edge neural network, N is the total number of sample masks (those skilled in the art will readily understand that the total numbers of sample images, sample masks and pieces of sample edge information are all the same, namely N), M is the total number of pixels in the j-th sample mask, the value of r_ji is determined according to the sample edge information corresponding to the j-th sample mask, r_ji is used to indicate whether the i-th pixel in the j-th sample mask is a contour edge, q_ji is the probability, predicted by the edge neural network, that the i-th pixel in the j-th sample mask is a contour edge, and x is the base of the logarithm log.
  • The value of r_ji is determined according to the sample edge information corresponding to the j-th sample mask. For example, if the sample edge information corresponding to the j-th sample mask indicates that the i-th pixel in the j-th sample mask is a contour edge, r_ji can be 1; if the sample edge information indicates that the i-th pixel is not a contour edge, r_ji can be 0. Those skilled in the art should understand that the value of r_ji is not limited to 1 and 0 and can also be other preset values.
  • That is, the value of r_ji when the sample edge information indicates that the i-th pixel is a contour edge is greater than the value of r_ji when the sample edge information indicates that the i-th pixel is not a contour edge. For example, if the sample edge information indicates that the i-th pixel is a contour edge, r_ji is 1, otherwise r_ji is 0; or, r_ji is 2, otherwise 1; or, r_ji is 0.8, otherwise 0.2.
  • step S304 it is determined whether the loss function of the aforementioned edge neural network is less than a second preset threshold, if not, step S305 is executed, and if yes, step S306 is executed.
  • step S305 adjust each parameter of the above-mentioned edge neural network model, and then return to step S302.
  • step S306 a trained edge neural network is obtained.
  • the parameters of the edge neural network are continuously adjusted until the loss function is less than the second preset threshold.
  • the parameter adjustment method is not specifically limited, and a gradient descent algorithm, a power update algorithm, etc. can be used, and the method used for adjusting the parameters is not limited here.
  • step S307 for each sample image, the sample image is input to the image segmentation network, and the generated mask output by the image segmentation network for indicating the area of the target object in the sample image is obtained.
  • In step S308, for each generated mask, the generated mask is input to the trained edge neural network to obtain the generated edge information output by the edge neural network; the generated edge information is used to indicate the contour edge of the area where the target object indicated by the generated mask is located.
  • step S309 the loss function of the above-mentioned image segmentation network is determined.
  • The loss function is used to measure the gap between the sample mask corresponding to each sample image and the generated mask, and the loss function is also used to measure the gap between the generated edge information and the sample edge information corresponding to each sample image.
  • step S310 it is determined whether the aforementioned loss function is less than a first preset threshold, if so, step S312 is executed, otherwise, step S311 is executed.
  • In step S311, the various parameters of the above image segmentation network are adjusted, and the process then returns to step S307.
  • step S312 a trained image segmentation network is obtained.
  • The training process of the edge neural network is summarized as follows. First, the sample masks are input into the edge neural network to obtain the edge information output by the edge neural network. Secondly, the cross-entropy loss is calculated from the edge information output by the edge neural network and the sample edge information of each sample (the sample edge information is obtained from the sample mask by a dilation operation followed by a subtraction; for details, refer to the description of the first embodiment, which will not be repeated here). Then, the cross-entropy losses are averaged to obtain the loss function. Finally, the various parameters of the edge neural network are adjusted continuously until the loss function is sufficiently small, so as to obtain the trained edge neural network.
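  • A minimal sketch of this edge-network pre-training stage (steps S302-S306), assuming PyTorch, a per-pixel binary cross-entropy loss on the network logits and an Adam optimizer; the threshold and learning rate are placeholder assumptions:

```python
# S302: predict edge maps from sample masks; S303: cross-entropy against the sample
# edge information; S305: adjust parameters until the loss drops below the second
# preset threshold (S304/S306).
import torch
import torch.nn.functional as F

def train_edge_net(edge_net, loader, second_threshold=0.05, lr=1e-3, max_epochs=100):
    opt = torch.optim.Adam(edge_net.parameters(), lr=lr)
    for _ in range(max_epochs):
        for sample_masks, sample_edges in loader:
            pred_logits = edge_net(sample_masks)                                 # S302
            loss = F.binary_cross_entropy_with_logits(pred_logits, sample_edges) # S303 (LOSS_2)
            opt.zero_grad()
            loss.backward()
            opt.step()                                                           # S305
        if loss.item() < second_threshold:                                       # S304/S306
            break
    return edge_net
```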
  • Compared with the first embodiment, the training method described in the second embodiment of this application adds the training process of the edge neural network, which makes the samples used for training the edge neural network consistent with the samples used for training the image segmentation network. Therefore, the accuracy of the edges of the masks output by the image segmentation network can be better measured from the output of the edge neural network, so that the image segmentation network can be trained better.
  • the third embodiment of the present application provides an image processing method. Please refer to FIG. 9.
  • the image processing method includes:
  • In step S401, an image to be processed is obtained, and the image to be processed is input to the trained image segmentation network to obtain the mask corresponding to the image to be processed, wherein the trained image segmentation network is trained using a trained edge neural network, and the trained edge neural network is used to output, according to an input mask, the contour edge of the area where the target object indicated by that mask is located.
  • the trained edge neural network described in this step S401 is a neural network obtained by training using the method described in the first or second embodiment above.
  • step S402 the target objects contained in the image to be processed are segmented based on the mask corresponding to the image to be processed.
  • After step S402, a specific operation of changing the background can also be performed; this operation belongs to the prior art and will not be repeated here.
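  • A minimal sketch of steps S401-S402 followed by a background change, assuming the mask output by the network is a probability map in [0, 1] and that the image and the new background are NumPy arrays of the same size (the 0.5 threshold is an illustrative assumption):

```python
# Threshold the mask, keep the target object (foreground) from the original image,
# and fill the remaining pixels from the new background.
import numpy as np

def replace_background(image, mask, new_background, threshold=0.5):
    binary = (mask > threshold).astype(np.float32)[..., None]    # (H, W, 1)
    foreground = image.astype(np.float32) * binary                # S402: segment the target object
    composite = foreground + new_background.astype(np.float32) * (1.0 - binary)
    return composite.astype(np.uint8)
```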
  • the method described in the third embodiment can be a method applied in a terminal device (such as a mobile phone).
  • This method makes it convenient for the user to replace the background in the image to be processed.
  • Because this method can accurately segment the target object, the background can be replaced more accurately, which can improve the user experience to a certain extent.
  • the fourth embodiment of the present application provides a training device for an image segmentation network. For ease of description, only the parts related to the present application are shown. As shown in FIG. 10, the training device 500 includes:
  • The sample acquisition module 501 is used to acquire sample images each containing the target object, the sample mask corresponding to each sample image, and the sample edge information corresponding to each sample mask, wherein each sample mask is used to indicate the image area where the target object is located in the corresponding sample image, and each piece of sample edge information is used to indicate the contour edge of the image area where the target object indicated by the corresponding sample mask is located.
  • the generation mask acquisition module 502 is configured to input the sample image to the image segmentation network for each sample image, and obtain the generation mask output by the image segmentation network for indicating the area of the target object in the sample image.
  • The generated edge acquisition module 503 is used, for each generated mask, to input the generated mask to the trained edge neural network to obtain the generated edge information output by the edge neural network; the generated edge information is used to indicate the contour edge of the area where the target object indicated by the generated mask is located.
  • the loss determination module 504 is used to determine the loss function of the image segmentation network.
  • The loss function is used to measure the gap between the sample mask and the generated mask corresponding to each sample image, and the loss function is also used to measure the gap between the generated edge information and the sample edge information corresponding to each sample image.
  • the parameter adjustment module 505 is used to adjust various parameters of the image segmentation network, and then trigger the generation mask acquisition module to continue to perform corresponding steps until the loss function of the image segmentation network is less than the first preset threshold, thereby Get the trained image segmentation network.
  • the aforementioned loss determination module 504 is specifically configured to:
  • where LOSS_1 is the loss function of the image segmentation network, N is the total number of sample images, F1_j is used to measure the gap between the sample mask corresponding to the j-th sample image and the generated mask, F2_j is used to measure the gap between the sample edge information corresponding to the j-th sample image and the generated edge information, M is the total number of pixels in the j-th sample image, the value of y_ji is determined according to the sample mask corresponding to the j-th sample image, y_ji is used to indicate whether the i-th pixel in the j-th sample image is located in the image area where the target object is located, p_ji is the probability, predicted by the image segmentation network, that the i-th pixel in the j-th sample image is located in the image area where the target object is located, and x is the base of the logarithm log.
  • The value of y_ji when the sample mask indicates that the i-th pixel is located in the image area where the target object is located is greater than the value of y_ji when the sample mask indicates that the i-th pixel is not located in that image area.
  • the above-mentioned trained edge neural network is formed by cascading A convolutional blocks, and each convolutional block is composed of B convolutional layers.
  • where mask_1 is the generated mask corresponding to the j-th sample image, mask_2 is the sample mask corresponding to the j-th sample image, h_c(mask_1) is the output of the c-th convolutional block of the trained edge neural network when its input is mask_1, h_c(mask_2) is the output of the c-th convolutional block when the input is mask_2, and λ_c is a constant.
  • the above-mentioned training device further includes an edge neural network training module, and the edge neural network training module includes:
  • the edge information acquisition unit is used to input the sample mask to the edge neural network for each sample mask to obtain the edge information output by the edge neural network, and the edge information is used to indicate the target object indicated by the sample mask The contour edge of the area.
  • the edge loss determining unit is used to determine the loss function of the edge neural network, and the loss function is used to measure the difference between the sample edge information corresponding to each sample mask and the edge information output by the edge neural network.
  • the edge parameter adjustment unit is used to adjust various parameters of the edge neural network, and then trigger the edge information acquisition unit to continue to perform corresponding steps until the loss function value of the edge neural network is less than the second preset threshold, thereby obtaining training After the edge of the neural network.
  • the aforementioned edge loss determining unit is specifically used for:
  • where LOSS_2 is the loss function of the edge neural network, N is the total number of sample images, M is the total number of pixels in the j-th sample mask, the value of r_ji is determined according to the sample edge information corresponding to the j-th sample mask, r_ji is used to indicate whether the i-th pixel in the j-th sample mask is a contour edge, q_ji is the probability, predicted by the edge neural network, that the i-th pixel in the j-th sample mask is a contour edge, and x is the base of the logarithm log.
  • The value of r_ji when the sample edge information indicates that the i-th pixel is a contour edge is greater than the value of r_ji when the sample edge information indicates that the i-th pixel is not a contour edge.
  • the image processing apparatus 600 includes:
  • The mask acquisition module 601 is used to acquire an image to be processed and input the image to be processed into a trained image segmentation network to obtain the mask corresponding to the image to be processed, wherein the trained image segmentation network is trained using a trained edge neural network, and the trained edge neural network is used to output, according to an input mask, the contour edge of the area where the target object indicated by that mask is located (specifically, the trained image segmentation network is obtained by training with the training method described in the first embodiment or the second embodiment).
  • the target object segmentation module 602 is configured to segment the target object contained in the image to be processed based on the mask corresponding to the image to be processed.
  • FIG. 12 is a schematic diagram of a terminal device provided in Embodiment 6 of the present application.
  • the terminal device 700 of this embodiment includes a processor 701, a memory 702, and a computer program 703 that is stored in the memory 702 and can run on the processor 701.
  • the above-mentioned processor 701 implements the steps in the above-mentioned method embodiments when the above-mentioned computer program 703 is executed.
  • When the processor 701 executes the computer program 703, the functions of the modules/units in the foregoing device embodiments are realized.
  • the foregoing computer program 703 may be divided into one or more modules/units, and the foregoing one or more modules/units are stored in the foregoing memory 702 and executed by the foregoing processor 701 to complete the present application.
  • the foregoing one or more modules/units may be a series of computer program instruction segments capable of completing specific functions, and the instruction segments are used to describe the execution process of the foregoing computer program 703 in the foregoing terminal device 700.
  • the aforementioned computer program 703 can be divided into a sample acquisition module, a mask generation module, an edge generation module, a loss determination module, and a parameter adjustment module.
  • the specific functions of each module are as follows:
  • S101: Obtain sample images each containing a target object, a sample mask corresponding to each sample image, and sample edge information corresponding to each sample mask, where each sample mask is used to indicate the image area where the target object is located in the corresponding sample image, and each piece of sample edge information is used to indicate the contour edge of the image area where the target object indicated by the corresponding sample mask is located.
  • S102: For each sample image, input the sample image to an image segmentation network, and obtain a generated mask output by the image segmentation network for indicating the area where the target object in the sample image is located.
  • S103: For each generated mask, input the generated mask to the trained edge neural network to obtain generated edge information output by the edge neural network, where the generated edge information is used to indicate the contour edge of the area where the target object indicated by the generated mask is located.
  • S104: Determine the loss function of the image segmentation network, where the loss function is used to measure the gap between the sample mask corresponding to each sample image and the generated mask, and the loss function is also used to measure the gap between the generated edge information and the sample edge information corresponding to each sample image.
  • S105 Adjust various parameters of the image segmentation network, and then return to execute S102 until the loss function of the image segmentation network is less than a first preset threshold, thereby obtaining a trained image segmentation network.
  • the aforementioned computer program 703 can be divided into a mask acquisition module and a target object segmentation module, and the specific functions of each module are as follows:
  • The target object segmentation module segments out the target object contained in the image to be processed based on the mask corresponding to the image to be processed.
  • the foregoing terminal device may include, but is not limited to, a processor 701 and a memory 702.
  • FIG. 12 is only an example of the terminal device 700 and does not constitute a limitation on the terminal device 700; it may include more or fewer components than shown in the figure, combine certain components, or have different components.
  • the aforementioned terminal device may also include input and output devices, network access devices, buses, and so on.
  • the so-called processor 701 may be a central processing unit (Central Processing Unit, CPU), other general-purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (ASIC), Field-Programmable Gate Array (FPGA) or other programmable logic devices, discrete gates or transistor logic devices, discrete hardware components, etc.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the foregoing memory 702 may be an internal storage unit of the foregoing terminal device 700, such as a hard disk or a memory of the terminal device 700.
  • The memory 702 may also be an external storage device of the terminal device 700, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the terminal device 700, etc.
  • the aforementioned memory 702 may also include both an internal storage unit of the aforementioned terminal device 700 and an external storage device.
  • the above-mentioned memory 702 is used to store the above-mentioned computer program and other programs and data required by the above-mentioned terminal device.
  • the aforementioned memory 702 can also be used to temporarily store data that has been output or will be output.
  • the disclosed device/terminal device and method may be implemented in other ways.
  • the device/terminal device embodiments described above are only illustrative.
  • the division of the above-mentioned modules or units is only a logical function division, and there may be other division methods in actual implementation, such as multiple units or Components can be combined or integrated into another system, or some features can be omitted or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described above as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
  • If the above integrated modules/units are implemented in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the foregoing method embodiments of this application can also be completed by instructing the relevant hardware through a computer program.
  • The computer program may be stored in a computer-readable storage medium, and when the program is executed by a processor, the steps of the foregoing method embodiments can be implemented.
  • the above-mentioned computer program includes computer program code, and the above-mentioned computer program code may be in the form of source code, object code, executable file, or some intermediate forms.
  • The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and so on.
  • The content contained in the computer-readable medium can be appropriately added or deleted in accordance with the requirements of legislation and patent practice in the jurisdiction. For example, in some jurisdictions, according to legislation and patent practice, computer-readable media may not include electrical carrier signals and telecommunication signals.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The present application provides a network training method, an image processing method, a network, a terminal device, and a medium. The training method comprises: S1, acquiring sample images containing a target object, a sample mask corresponding to each sample image, and sample edge information corresponding to the sample mask; S2, inputting each sample image to an image segmentation network to obtain a generated mask output by the image segmentation network; S3, inputting the generated mask to a trained edge neural network to obtain generated edge information output by the edge neural network; S4, determining a loss function according to the gap between the sample mask and the generated mask, and the gap between the generated edge information and the sample edge information; S5, adjusting parameters of the image segmentation network, and returning to S2 until the loss function is less than a threshold. The present application enables the contour edge of the target object to be more accurately represented on the mask image output by the image segmentation network.

Description

网络的训练方法、图像处理方法、网络、终端设备及介质Network training method, image processing method, network, terminal equipment and medium 技术领域Technical field
本申请涉及图像处理技术领域,尤其涉及一种图像分割网络的训练方法、图像处理方法、图像分割网络、终端设备及计算机可读存储介质。This application relates to the field of image processing technology, and in particular to an image segmentation network training method, image processing method, image segmentation network, terminal equipment, and computer-readable storage medium.
背景技术Background technique
用户在拍摄完图像之后,往往希望变更图像中的背景(比如,将背景更换为室外沙滩场景,或者,将背景更换为用于拍摄证件照的纯色背景)。为了实现上述技术效果,目前常用的做法为:利用训练后的图像分割网络输出用于表示目标对象(也即是前景,比如,人像)所在区域的掩膜,然后利用该掩膜将目标对象分割出来,进而更换图像背景。After the user has taken the image, he often wants to change the background in the image (for example, to change the background to an outdoor beach scene, or to change the background to a solid color background for taking ID photos). In order to achieve the above technical effects, the current commonly used method is: use the trained image segmentation network to output a mask that represents the area where the target object (that is, the foreground, such as a portrait) is located, and then use the mask to segment the target object Come out, and then change the image background.
然而,目前图像分割网络输出的掩膜并不能精确表示目标对象的轮廓边缘,从而不能将目标对象进行精准分割,造成更换图像背景的效果较差。因此,如何使得图像分割网络所输出的掩膜能够更加精确地表示目标对象的轮廓边缘,是目前亟待解决的技术问题。However, the mask output by the current image segmentation network cannot accurately represent the contour edge of the target object, so that the target object cannot be accurately segmented, and the effect of replacing the image background is poor. Therefore, how to enable the mask output by the image segmentation network to more accurately represent the contour edge of the target object is a technical problem that needs to be solved urgently.
Summary
The purpose of the embodiments of this application is to provide a training method for an image segmentation network, an image processing method, an image segmentation network, a terminal device, and a computer-readable storage medium, which can, to a certain extent, make the mask output by the trained image segmentation network represent the contour edge of the target object more accurately.
The technical solutions adopted in the embodiments of this application are as follows:
In a first aspect, a training method for an image segmentation network is provided, including steps S101 to S105:
S101: obtaining sample images containing a target object, a sample mask corresponding to each sample image, and sample edge information corresponding to each sample mask, where each sample mask indicates the image region in which the target object is located in the corresponding sample image, and each piece of sample edge information indicates the contour edge of the image region, indicated by the corresponding sample mask, in which the target object is located;
S102: for each sample image, inputting the sample image into an image segmentation network to obtain a generated mask output by the image segmentation network for indicating the region in which the target object is located in the sample image;
S103: for each generated mask, inputting the generated mask into a trained edge neural network to obtain generated edge information output by the edge neural network, where the generated edge information indicates the contour edge of the region in which the target object indicated by the generated mask is located;
S104: determining a loss function of the image segmentation network, where the loss function measures the gap between the sample mask and the generated mask corresponding to each sample image, and also measures the gap between the generated edge information and the sample edge information corresponding to each sample image;
S105: adjusting the parameters of the image segmentation network and then returning to S102 until the loss function of the image segmentation network is less than a first preset threshold, thereby obtaining a trained image segmentation network.
In a second aspect, an image processing method is provided, including:
obtaining an image to be processed and inputting the image to be processed into a trained image segmentation network to obtain a mask corresponding to the image to be processed, where the trained image segmentation network is trained with the aid of a trained edge neural network, and the trained edge neural network outputs, according to an input mask, the contour edge of the region in which the target object indicated by that mask is located; and
segmenting out the target object contained in the image to be processed based on the mask corresponding to the image to be processed.
In a third aspect, an image segmentation network is provided, where the image segmentation network is trained using the training method described in the first aspect.
In a fourth aspect, a terminal device is provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the computer program, implements the steps of the method described in the first aspect or the second aspect.
In a fifth aspect, a computer-readable storage medium is provided, where the computer-readable storage medium stores a computer program that, when executed by a processor, implements the steps of the method described in the first aspect or the second aspect.
In a sixth aspect, a computer program product is provided, where the computer program product includes a computer program that, when executed by one or more processors, implements the steps of the method described in the first aspect or the second aspect.
As can be seen from the above, in the training method provided by this application, a trained edge neural network is used when the image segmentation network is trained.
The trained edge neural network is first described with reference to FIG. 1. As shown in FIG. 1, according to the image region (the pure white region) in which the target object indicated by the input generated mask 002 is located, the trained edge neural network 001 outputs generated edge information 003, which indicates the location of the contour edge of that image region; the generated edge information 003 in FIG. 1 is presented in the form of an image.
The training method provided by this application includes the following steps. First, for each sample image, the sample image is input into the image segmentation network to obtain the generated mask output by the image segmentation network, and the generated mask is input into the trained edge neural network to obtain the generated edge information output by the edge neural network. Second, the loss function of the image segmentation network is determined; the loss function is positively correlated with the mask gap of each sample image (the mask gap of a sample image is the gap between its sample mask and its generated mask) and is also positively correlated with the edge gap of each sample image (the edge gap of a sample image is the gap between its sample edge information and its generated edge information). Finally, the parameters of the image segmentation network are adjusted until the loss function is less than the first preset threshold.
It can thus be seen that, while ensuring that the generated mask output by the image segmentation network approximates the sample mask, the above training method further ensures that the contour edge of the target object represented in the generated mask approximates the actual contour edge. Therefore, the mask image output by the image segmentation network provided by this application can represent the contour edge of the target object more accurately.
Brief Description of the Drawings
In order to describe the technical solutions in the embodiments of this application more clearly, the drawings needed in the description of the embodiments or the exemplary technology are briefly introduced below. Obviously, the drawings described below show only some embodiments of this application.
FIG. 1 is a schematic diagram of the working principle of a trained edge neural network provided by this application;
FIG. 2 is a schematic diagram of a training method for an image segmentation network provided in Embodiment 1 of this application;
FIG. 3 is a schematic diagram of a sample image, a sample mask, and sample edge information provided in Embodiment 1 of this application;
FIG. 4 is a schematic structural diagram of an image segmentation network provided in Embodiment 1 of this application;
FIG. 5 is a schematic diagram of the connection between the image segmentation network and the trained edge neural network provided in Embodiment 1 of this application;
FIG. 6 is a schematic structural diagram of the edge neural network provided in Embodiment 1 of this application;
FIG. 7 is a schematic diagram of another training method for an image segmentation network provided in Embodiment 2 of this application;
FIG. 8(a) is a schematic diagram of the training process of the edge neural network provided in Embodiment 2 of this application;
FIG. 8(b) is a schematic diagram of the training process of the image segmentation network provided in Embodiment 2 of this application;
FIG. 9 is a schematic flowchart of the image processing method provided in Embodiment 3 of this application;
FIG. 10 is a schematic structural diagram of a training apparatus for an image segmentation network provided in Embodiment 4 of this application;
FIG. 11 is a schematic structural diagram of an image processing apparatus provided in Embodiment 5 of this application;
FIG. 12 is a schematic structural diagram of a terminal device provided in Embodiment 6 of this application.
Detailed Description
In the following description, specific details such as particular system structures and technologies are set forth for the purpose of illustration rather than limitation, so as to provide a thorough understanding of the embodiments of this application. However, it should be clear to those skilled in the art that this application can also be implemented in other embodiments without these specific details. In other cases, detailed descriptions of well-known systems, apparatuses, circuits, and methods are omitted so that unnecessary details do not obscure the description of this application.
The methods provided in the embodiments of this application are applicable to terminal devices; by way of example, such terminal devices include, but are not limited to, smart phones, tablet computers, notebook computers, desktop computers, cloud servers, and the like.
It should be understood that, when used in this specification and the appended claims, the term "comprising" indicates the presence of the described features, wholes, steps, operations, elements, and/or components, but does not exclude the presence or addition of one or more other features, wholes, steps, operations, elements, components, and/or collections thereof.
It should be further understood that the term "and/or" used in this specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes these combinations.
As used in this specification and the appended claims, the term "if" may be interpreted, depending on the context, as "when", "once", "in response to determining", or "in response to detecting". Similarly, the phrase "if it is determined" or "if [the described condition or event] is detected" may be interpreted, depending on the context, as "once it is determined", "in response to determining", "once [the described condition or event] is detected", or "in response to detecting [the described condition or event]".
In addition, in the description of this application, the terms "first", "second", and the like are used only to distinguish the descriptions and cannot be understood as indicating or implying relative importance.
In order to explain the technical solutions provided by this application, detailed descriptions are given below with reference to the specific drawings and embodiments.
Embodiment 1
The following describes the training method for an image segmentation network provided in Embodiment 1 of this application. Referring to FIG. 2, the training method includes:
In step S101, sample images containing a target object, a sample mask corresponding to each sample image, and sample edge information corresponding to each sample mask are obtained, where each sample mask indicates the image region in which the target object is located in the corresponding sample image, and each piece of sample edge information indicates the contour edge of the image region, indicated by the corresponding sample mask, in which the target object is located.
In the embodiments of this application, a portion of the sample images may first be obtained from a data set, and the number of sample images used for training the image segmentation network may then be expanded as follows: mirror flipping, scaling, and/or gamma transformation are applied to the pre-obtained sample images to increase the number of sample images, thereby obtaining the sample images described in step S101.
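As an illustration of the augmentation described above, the following is a minimal sketch assuming OpenCV and NumPy, with a sample image and its sample mask stored as NumPy arrays; the function name and the specific scale and gamma values are illustrative assumptions rather than values given by this application:

```python
import cv2
import numpy as np

def augment_sample(image, mask):
    """Return extra (image, mask) pairs via mirror flip, scaling and gamma change."""
    augmented = []

    # Mirror (horizontal) flip: the mask must be flipped together with the image.
    augmented.append((cv2.flip(image, 1), cv2.flip(mask, 1)))

    # Scale the pair to 80% of the original size (nearest-neighbour keeps the mask binary).
    h, w = image.shape[:2]
    new_size = (int(w * 0.8), int(h * 0.8))
    small = cv2.resize(image, new_size)
    small_mask = cv2.resize(mask, new_size, interpolation=cv2.INTER_NEAREST)
    augmented.append((small, small_mask))

    # Gamma change applied to the image only; the mask is unchanged.
    gamma = 1.5
    table = np.array([(i / 255.0) ** gamma * 255 for i in range(256)], dtype=np.uint8)
    augmented.append((cv2.LUT(image, table), mask.copy()))

    return augmented
```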
The sample mask described in this application is a binary image. The sample edge information corresponding to a sample mask in step S101 may be obtained as follows: a dilation operation is performed on the sample mask to obtain a dilated mask image, and the sample mask is then subtracted from the dilated mask image to obtain the sample edge information corresponding to that sample mask. The sample edge information obtained in this way is, like the sample mask, a binary image.
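A minimal sketch of this dilation-and-subtraction step, assuming OpenCV, a uint8 binary sample mask with values 0/255, and an illustrative 5x5 kernel (the kernel size is an assumption, not a value given by this application):

```python
import cv2
import numpy as np

def sample_edge_from_mask(sample_mask, kernel_size=5):
    """Dilate the binary sample mask and subtract the original mask;
    the remaining ring of pixels is the sample edge information."""
    kernel = np.ones((kernel_size, kernel_size), np.uint8)
    dilated = cv2.dilate(sample_mask, kernel, iterations=1)
    edge = cv2.subtract(dilated, sample_mask)  # binary image marking the contour band
    return edge
```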
To give those skilled in the art a more intuitive understanding of the sample image, the sample mask, and the sample edge information, FIG. 3 is used for illustration. As shown in FIG. 3, image 201 is a sample image containing a target object (namely a portrait), image 202 may be the sample mask corresponding to sample image 201, and image 203 may be the sample edge information corresponding to sample mask 202. In addition, those skilled in the art should understand that the sample edge information is not necessarily a binary image and may take other forms, as long as it can reflect the contour edge of the image region, indicated by the sample mask, in which the target object is located.
In addition, those skilled in the art should understand that the target object may be any photographic subject, such as a person, a dog, or a cat; this application does not limit the category of the target object.
Moreover, in order to train the image segmentation network better, the image content of the sample images should differ as much as possible. For example, if the target object is a person, sample image 1 may contain a frontal portrait of Xiao Ming, while sample image 2 may contain a half-profile portrait of Xiao Hong.
In step S102, for each sample image, the sample image is input into the image segmentation network to obtain the generated mask output by the image segmentation network for indicating the region in which the target object is located in the sample image.
In the embodiments of this application, before step S102 is performed, an image segmentation network needs to be established in advance; the image segmentation network outputs, according to an input image, the mask corresponding to that image (that is, the generated mask). The image segmentation network may be a CNN (Convolutional Neural Network) or an FPN (Feature Pyramid Network); this application does not limit the specific structure of the image segmentation network. An image segmentation network using the FPN structure is shown in FIG. 4.
After the above image segmentation network is established, step S102 is performed to train the image segmentation network.
During training, each sample image needs to be input into the image segmentation network to obtain the generated masks output by the image segmentation network, where each generated mask corresponds to one sample image. In addition, those skilled in the art will readily understand that the generated mask described in this step, like the sample mask described in step S101, may be a binary image.
In step S103, for each generated mask, the generated mask is input into the trained edge neural network to obtain the generated edge information output by the edge neural network, where the generated edge information indicates the contour edge of the region in which the target object indicated by the generated mask is located.
Before step S103 is performed, a trained edge neural network needs to be obtained; the trained edge neural network outputs, according to an input generated mask, generated edge information indicating the contour edge of the region in which the target object indicated by that generated mask is located. In the embodiments of this application, the trained edge neural network may be as shown in FIG. 1; after the generated mask shown at 002 is input into the trained edge neural network shown at 001, the trained edge neural network outputs the generated edge information shown at 003.
After the trained edge neural network is obtained, each generated mask described in step S102 is input into the trained edge neural network to obtain the generated edge information output by the trained edge neural network, where each piece of generated edge information corresponds to one generated mask and represents the contour edge of the image region in which the target object indicated by that generated mask is located.
In the embodiments of this application, during the training of the image segmentation network, the connection between the image segmentation network and the trained edge neural network is shown in FIG. 5.
In step S104, the loss function of the image segmentation network is determined; the loss function measures the gap between the sample mask and the generated mask corresponding to each sample image, and also measures the gap between the generated edge information and the sample edge information corresponding to each sample image.
Those skilled in the art will readily understand that each sample image corresponds to a sample mask, sample edge information, a generated mask, and generated edge information. To obtain the loss function described in step S104, it is necessary, for each sample image, to calculate the gap between the sample mask and the generated mask corresponding to that sample image (for ease of subsequent description, this gap is defined as the mask gap of that sample image) and the gap between the sample edge information and the generated edge information corresponding to that sample image (defined as the edge gap of that sample image).
In step S104, the loss function of the image segmentation network needs to be calculated. The loss function measures, for each sample image, the gap between its sample mask and its generated mask, and also measures the gap between its generated edge information and its sample edge information; that is, the loss function is positively correlated with the mask gap of each sample image and is also positively correlated with the edge gap of each sample image.
In the embodiments of this application, the loss function may be calculated as follows:
Step A: For each sample image, compute the image difference between the generated mask corresponding to that sample image and the sample mask corresponding to that sample image, which may be, for example,
$$\frac{1}{M}\sum_{i=1}^{M}\left|m1_i-m2_i\right|,$$
where m1_i is the pixel value of the i-th pixel of the generated mask, m2_i is the pixel value of the i-th pixel of the sample mask, and M is the total number of pixels in the generated mask.
Step B: If the sample edge information and the generated edge information are both images, then for each sample image, compute the image difference between the sample edge information corresponding to that sample image and the generated edge information corresponding to that sample image (this image difference can be computed as described in Step A).
Step C: The image differences obtained in Step A and the image differences obtained in Step B may be averaged (if the number of sample images is N, the image differences obtained in Step A and those obtained in Step B are summed and then divided by 2N) to obtain the loss function.
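The following sketch illustrates Steps A to C, assuming the masks and edge images are float arrays with values in [0, 1] and reading the image difference of Step A as the mean absolute pixel difference; these assumptions are illustrative only:

```python
import numpy as np

def image_diff(a, b):
    """Step A / Step B: mean absolute per-pixel difference between two images."""
    return float(np.mean(np.abs(a.astype(np.float32) - b.astype(np.float32))))

def loss_steps_abc(generated_masks, sample_masks, generated_edges, sample_edges):
    """Step C: average the mask differences and edge differences over all N samples."""
    n = len(sample_masks)
    mask_diffs = [image_diff(g, s) for g, s in zip(generated_masks, sample_masks)]
    edge_diffs = [image_diff(g, s) for g, s in zip(generated_edges, sample_edges)]
    return (sum(mask_diffs) + sum(edge_diffs)) / (2 * n)
```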
However, the calculation of the loss function is not limited to Steps A to C above. In the embodiments of this application, the loss function may also be calculated by the following formula (1):
$$\mathrm{LOSS}_1=\sum_{j=1}^{N}\left(F1_j+F2_j\right)\qquad(1)$$
where LOSS_1 is the loss function of the image segmentation network, N is the total number of sample images, F1_j measures the gap between the sample mask and the generated mask corresponding to the j-th sample image, and F2_j measures the gap between the sample edge information and the generated edge information corresponding to the j-th sample image.
In the embodiments of this application, F1_j may be calculated as the cross-entropy loss between the sample mask and the generated mask corresponding to the j-th sample image, with the following formula (2):
$$F1_j=-\sum_{i=1}^{M}\left[y_{ji}\log_x(p_{ji})+(1-y_{ji})\log_x(1-p_{ji})\right]\qquad(2)$$
where M is the total number of pixels in the j-th sample image, the value of y_ji is determined according to the sample mask corresponding to the j-th sample image and indicates whether the i-th pixel of the j-th sample image is in the image region in which the target object is located, p_ji is the probability, predicted by the image segmentation network, that the i-th pixel of the j-th sample image is in the image region in which the target object is located, and x is the base of the logarithm.
In the embodiments of this application, the value of y_ji is determined according to the sample mask corresponding to the j-th sample image. For example, if the sample mask corresponding to the j-th sample image indicates that the i-th pixel of the j-th sample image is located in the image region in which the target object is located, y_ji may be 1; if that sample mask indicates that the i-th pixel of the j-th sample image is not located in that image region, y_ji may be 0. Those skilled in the art should understand that the values of y_ji are not limited to 1 and 0 and may be other values. The value of y_ji is preset, for example, to 1 or 0.
Those skilled in the art should note that the value of y_ji when the sample mask indicates that the i-th pixel is located in the image region in which the target object is located must be greater than the value of y_ji when the sample mask indicates that the i-th pixel is not located in that image region. That is, y_ji is 1 if the sample mask indicates that the i-th pixel is located in the image region in which the target object is located, and 0 otherwise; or y_ji is 2 in the former case and 1 otherwise; or y_ji is 0.8 in the former case and 0.2 otherwise.
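A minimal sketch of F1_j as the per-pixel cross-entropy of formula (2), assuming y_ji takes the values 1/0, p_ji is the per-pixel probability map predicted by the segmentation network, and the natural logarithm is used for the base x:

```python
import numpy as np

def f1_cross_entropy(p, y, eps=1e-7):
    """Cross-entropy between the predicted probability map p (values in (0, 1))
    and the sample mask y (values in {0, 1}) for one sample image."""
    p = np.clip(p, eps, 1.0 - eps)
    return float(-np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))
```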
In the embodiments of this application, F2_j may be calculated in a manner similar to formula (2), that is, as the cross-entropy loss between the sample edge information and the generated edge information corresponding to the j-th sample image.
Alternatively, if the trained edge neural network is formed by cascading A convolutional blocks, each consisting of B convolutional layers, then F2_j may be calculated by the following formula (3):
$$F2_j=\sum_{c=1}^{A}\lambda_c\left\|h_c(\mathrm{mask}_1)-h_c(\mathrm{mask}_2)\right\|\qquad(3)$$
where mask_1 is the generated mask corresponding to the j-th sample image, mask_2 is the sample mask corresponding to the j-th sample image, h_c(mask_1) is the output of the c-th convolutional block when the input of the trained edge neural network is mask_1, h_c(mask_2) is the output of the c-th convolutional block when the input is mask_2, and λ_c is a constant.
In the above formula for F2_j, when the input of the trained edge neural network is mask_2, the output of the last convolutional block can be regarded as equivalent to the sample edge information; therefore, formula (3) can be used to measure the gap between the sample edge information and the generated edge information.
As shown in FIG. 6, the edge neural network may be formed by cascading three convolutional blocks, each of which is a single convolutional layer.
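For illustration, the following PyTorch sketch builds an edge neural network with three cascaded convolutional blocks and a feature-matching term in the spirit of formula (3); the channel widths, the use of the L1 distance for the norm, and the weights λ_c are assumptions, not values given by this application:

```python
import torch
import torch.nn as nn

class EdgeNet(nn.Module):
    """Three cascaded convolutional blocks; each block here is a single conv layer."""
    def __init__(self):
        super().__init__()
        self.block1 = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU())
        self.block2 = nn.Sequential(nn.Conv2d(16, 16, 3, padding=1), nn.ReLU())
        self.block3 = nn.Conv2d(16, 1, 3, padding=1)

    def forward(self, mask):
        h1 = self.block1(mask)
        h2 = self.block2(h1)
        h3 = torch.sigmoid(self.block3(h2))   # per-pixel edge probabilities
        return [h1, h2, h3]                   # outputs of every convolutional block

def f2_feature_matching(edge_net, generated_mask, sample_mask, lambdas=(1.0, 1.0, 1.0)):
    """Compare the block outputs for the generated mask and the sample mask."""
    feats_gen = edge_net(generated_mask)
    with torch.no_grad():                     # the edge network is already trained and fixed
        feats_ref = edge_net(sample_mask)
    return sum(l * torch.mean(torch.abs(a - b))
               for l, a, b in zip(lambdas, feats_gen, feats_ref))
```

In this sketch the edge network is kept fixed when the feature-matching term is evaluated, mirroring the fact that the edge neural network is already trained by the time the image segmentation network is trained.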
In step S105, it is determined whether the loss function is less than the first preset threshold; if so, step S107 is performed; otherwise, step S106 is performed.
In step S106, the parameters of the image segmentation network are adjusted, and the process returns to step S102.
In step S107, the trained image segmentation network is obtained.
That is, the parameters of the image segmentation network are adjusted continuously until the loss function is less than the first preset threshold. In addition, the embodiments of this application do not specifically limit the parameter adjustment method; a gradient descent algorithm, a momentum update algorithm, or the like may be used, and the method used for adjusting the parameters is not limited here.
Furthermore, in the embodiments of this application, when training the image segmentation network, the sample images may be preprocessed before being input into the image segmentation network, and the preprocessed sample images are then input into the network. The preprocessing may include image cropping and/or normalization.
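A minimal sketch of the preprocessing mentioned above (cropping plus normalization), where the center crop and the normalization to [0, 1] are illustrative assumptions:

```python
import numpy as np

def preprocess(image, crop_size=512):
    """Center-crop the sample image and normalize its pixel values to [0, 1]."""
    h, w = image.shape[:2]
    top = max((h - crop_size) // 2, 0)
    left = max((w - crop_size) // 2, 0)
    cropped = image[top:top + crop_size, left:left + crop_size]
    return cropped.astype(np.float32) / 255.0
```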
After step S107, a test set may also be used to evaluate the trained image segmentation network. The test set may be obtained in the manner known from the prior art, which is not repeated here.
For a single sample image in the test set, the evaluation function may be:
$$\mathrm{IoU}=\frac{|X\cap Y|}{|X\cup Y|}$$
where X is the image region of the target object indicated by the generated mask output by the trained image segmentation network after the sample image is input into it,
and Y is the image region of the target object indicated by the sample mask corresponding to the sample image.
The trained image segmentation network is evaluated by the IoU (Intersection-over-Union) of X and Y: the closer the IoU value is to 1, the better the performance of the trained image segmentation network. By evaluating the trained image segmentation network, it can further be assessed whether its performance meets the requirements. For example, if it is determined that the performance of the trained image segmentation network does not meet the requirements, training of the network is continued.
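A minimal sketch of the IoU evaluation, assuming X and Y are boolean NumPy arrays marking the region predicted by the trained network and the region given by the sample mask:

```python
import numpy as np

def iou(pred_region, gt_region):
    """Intersection-over-Union of the predicted region X and the sample region Y."""
    intersection = np.logical_and(pred_region, gt_region).sum()
    union = np.logical_or(pred_region, gt_region).sum()
    return intersection / union if union > 0 else 1.0
```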
The training method provided in Embodiment 1 of this application, while ensuring that the generated mask output by the image segmentation network approximates the sample mask, further ensures that the contour edge of the target object represented in the generated mask output by the image segmentation network approximates the real contour edge. Therefore, the image corresponding to the generated mask output by the image segmentation network provided by this application can represent the contour edge of the target object more accurately.
Embodiment 2
The following describes another training method for an image segmentation network provided in Embodiment 2 of this application. Compared with the training method described in Embodiment 1, this training method additionally includes the training process of the edge neural network. Referring to FIG. 7, the training method includes:
In step S301, sample images containing a target object, a sample mask corresponding to each sample image, and sample edge information corresponding to each sample mask are obtained, where each sample mask indicates the image region in which the target object is located in the corresponding sample image, and each piece of sample edge information indicates the contour edge of the image region, indicated by the corresponding sample mask, in which the target object is located.
For the specific implementation of step S301, refer to step S101 in Embodiment 1, which is not repeated here.
In step S302, for each sample mask, the sample mask is input into the edge neural network to obtain the edge information output by the edge neural network, where the edge information indicates the contour edge of the region in which the target object indicated by that sample mask is located.
In the embodiments of this application, step S302 to the subsequent step S306 constitute the training process of the edge neural network, which yields the trained edge neural network. Those skilled in the art should understand that steps S302 to S306 are performed before the subsequent step S308, but do not have to be performed before step S307.
Before step S302 is performed, the edge neural network needs to be established in advance; the edge neural network obtains the contour edge of the region in which the target object indicated by an input sample mask is located. As shown in FIG. 6, the edge neural network may be formed by cascading three convolutional layers.
In step S302, each sample mask is input into the edge neural network to obtain the edge information output by the edge neural network, where each sample mask corresponds to one piece of edge information output by the network.
In step S303, the loss function of the edge neural network is determined; the loss function measures the gap between the sample edge information corresponding to each sample mask and the edge information output by the edge neural network.
In the embodiments of this application, the specific meaning of step S303 is to determine the loss function of the edge neural network, where the loss function is positively correlated with the edge gap of each sample mask (the edge gap of a sample mask is the gap between its sample edge information and the edge information output by the edge neural network after that sample mask is input into the network).
In the embodiments of this application, the loss function of the edge neural network may be calculated as follows:
If the sample edge information and the edge information output by the edge neural network are both images such as the one shown at 003 in FIG. 1, the loss function of the edge neural network may be the image difference between the sample edge information and the edge information output by the edge neural network (this image difference can be computed as described in Step A of Embodiment 1, which is not repeated here).
In addition, the loss function of the edge neural network may also be calculated as follows: for each sample mask, the cross-entropy loss between the corresponding sample edge information and the edge information output by the edge neural network is calculated, and the results are then averaged. The specific formula is:
$$\mathrm{LOSS}_2=-\frac{1}{N}\sum_{j=1}^{N}\sum_{i=1}^{M}\left[r_{ji}\log_x(q_{ji})+(1-r_{ji})\log_x(1-q_{ji})\right]$$
where LOSS_2 is the loss function of the edge neural network, N is the total number of sample masks (those skilled in the art will readily understand that the total numbers of sample images, sample masks, and pieces of sample edge information are all the same, namely N), M is the total number of pixels in the j-th sample mask, the value of r_ji is determined according to the sample edge information corresponding to the j-th sample image and indicates whether the i-th pixel of the j-th sample mask is a contour edge, q_ji is the probability, predicted by the edge neural network, that the i-th pixel of the j-th sample mask is a contour edge, and x is the base of the logarithm.
In the embodiments of this application, the value of r_ji is determined according to the sample edge information corresponding to the j-th sample mask. For example, if the sample edge information corresponding to the j-th sample mask indicates that the i-th pixel of the j-th sample mask is a contour edge, r_ji may be 1; if it indicates that the i-th pixel is not a contour edge, r_ji may be 0. Those skilled in the art should understand that the values of r_ji are not limited to 1 and 0 and may be other values. The value of r_ji is preset, for example, to 1 or 0.
Those skilled in the art should note that the value of r_ji when the sample edge information indicates that the i-th pixel is a contour edge must be greater than the value of r_ji when the sample edge information indicates that the i-th pixel is not a contour edge. That is, r_ji is 1 if the sample edge information indicates that the i-th pixel is a contour edge, and 0 otherwise; or r_ji is 2 in the former case and 1 otherwise; or r_ji is 0.8 in the former case and 0.2 otherwise.
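A minimal sketch of this edge-network loss, assuming r_ji takes the values 1/0, q_ji is the per-pixel edge probability output by the edge neural network, and the natural logarithm is used for the base x:

```python
import numpy as np

def edge_network_loss(q_list, r_list, eps=1e-7):
    """Average, over the N sample masks, of the per-pixel cross-entropy between the
    predicted edge probabilities q and the sample edge information r."""
    losses = []
    for q, r in zip(q_list, r_list):
        q = np.clip(q, eps, 1.0 - eps)
        losses.append(-np.sum(r * np.log(q) + (1.0 - r) * np.log(1.0 - q)))
    return float(np.mean(losses))
```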
In step S304, it is determined whether the loss function of the edge neural network is less than the second preset threshold; if not, step S305 is performed; if so, step S306 is performed.
In step S305, the parameters of the edge neural network model are adjusted, and the process returns to step S302.
In step S306, the trained edge neural network is obtained.
That is, the parameters of the edge neural network are adjusted continuously until the loss function is less than the second preset threshold. In addition, the embodiments of this application do not specifically limit the parameter adjustment method; a gradient descent algorithm, a momentum update algorithm, or the like may be used, and the method used for adjusting the parameters is not limited here.
In step S307, for each sample image, the sample image is input into the image segmentation network to obtain the generated mask output by the image segmentation network for indicating the region in which the target object is located in the sample image.
In step S308, for each generated mask, the generated mask is input into the trained edge neural network to obtain the generated edge information output by the edge neural network, where the generated edge information indicates the contour edge of the region in which the target object indicated by the generated mask is located.
In step S309, the loss function of the image segmentation network is determined; the loss function measures the gap between the sample mask and the generated mask corresponding to each sample image, and also measures the gap between the generated edge information and the sample edge information corresponding to each sample image.
In step S310, it is determined whether the loss function is less than the first preset threshold; if so, step S312 is performed; otherwise, step S311 is performed.
In step S311, the parameters of the image segmentation network are adjusted, and the process returns to step S307.
In step S312, the trained image segmentation network is obtained.
The specific implementations of steps S307 to S312 above are identical to those of steps S102 to S107 in Embodiment 1; for details, refer to the description of Embodiment 1, which is not repeated here.
The training process described in Embodiment 2 of this application is briefly explained below with reference to FIG. 8.
FIG. 8(a) shows the training process of the edge neural network. First, the sample masks are input into the edge neural network to obtain the edge information output by the network. Second, the cross-entropy loss is calculated from the edge information output by the edge neural network and each piece of sample edge information, where the sample edge information is obtained from the sample mask through the dilation and subtraction operations (for details, refer to the description of Embodiment 1, which is not repeated here). Then, the cross-entropy losses are averaged to obtain the loss function. Finally, the parameters of the edge neural network are adjusted continuously until the loss function is sufficiently small, thereby obtaining the trained edge neural network.
After the trained edge neural network is obtained, the training of the image segmentation network can be carried out as shown in FIG. 8(b).
FIG. 8(b) shows the training process of the image segmentation network. First, a sample image is input into the image segmentation network to obtain the generated mask output by the network, and the cross-entropy loss between the generated mask and the sample mask is calculated. Second, the generated mask is input into the trained edge neural network to obtain the output of each convolutional layer, and the sample mask is also input into the trained edge neural network to obtain the output of each convolutional layer. Then, the loss function of the image segmentation network is calculated from the cross-entropy loss calculated above, the outputs of the convolutional layers when the generated mask is the input, and the outputs of the convolutional layers when the sample mask is the input (for the specific calculation, refer to the description of Embodiment 1). Finally, the parameters of the image segmentation network are adjusted continuously until the loss function is sufficiently small, thereby obtaining the trained image segmentation network.
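The procedure of FIG. 8(b) might be summarized by the following PyTorch sketch; the segmentation network seg_net (assumed to output a per-pixel probability map), the data loader, the Adam optimizer, the learning rate, and the weighting of the edge term are all illustrative assumptions, not choices specified by this application:

```python
import torch
import torch.nn.functional as F

def train_segmentation(seg_net, edge_net, loader, epochs=10, lr=1e-3, edge_weight=1.0):
    """One possible training loop: BCE between generated and sample masks plus an edge
    term computed with the trained (frozen) edge neural network, whose forward pass is
    assumed to return the outputs of its convolutional blocks."""
    edge_net.eval()                                    # the edge network is fixed at this stage
    optimizer = torch.optim.Adam(seg_net.parameters(), lr=lr)
    for _ in range(epochs):
        for image, sample_mask in loader:              # sample_mask: float tensor in {0, 1}
            generated_mask = seg_net(image)            # per-pixel probabilities
            mask_loss = F.binary_cross_entropy(generated_mask, sample_mask)

            feats_gen = edge_net(generated_mask)       # block outputs for the generated mask
            with torch.no_grad():
                feats_ref = edge_net(sample_mask)      # block outputs for the sample mask
            edge_loss = sum(torch.mean(torch.abs(a - b))
                            for a, b in zip(feats_gen, feats_ref))

            loss = mask_loss + edge_weight * edge_loss
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return seg_net
```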
Compared with Embodiment 1, the training method described in Embodiment 2 of this application additionally includes the training process of the edge neural network. This makes the samples used to train the edge neural network consistent with the samples used to train the image segmentation network, so that the accuracy of the edges of the masks output by the image segmentation network can be better measured from the output of the edge neural network, thereby training the image segmentation network better.
Embodiment 3
Embodiment 3 of this application provides an image processing method. Referring to FIG. 9, the image processing method includes:
In step S401, an image to be processed is obtained and input into the trained image segmentation network to obtain the mask corresponding to the image to be processed, where the trained image segmentation network is trained with the aid of a trained edge neural network, and the trained edge neural network outputs, according to an input mask, the contour edge of the region in which the target object indicated by the mask is located.
Specifically, the trained image segmentation network described in step S401 is obtained by training using the method described in Embodiment 1 or Embodiment 2 above.
In step S402, the target object contained in the image to be processed is segmented out based on the mask corresponding to the image to be processed.
Those skilled in the art will readily understand that, after step S402, the specific operation of replacing the background may further be performed; this operation belongs to the prior art and is not repeated here.
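As an illustration of one way the mask could be used to replace the background after step S402 (this compositing step is only an example, not a step defined by this application), assuming OpenCV, a BGR image, and a uint8 mask with values 0/255:

```python
import cv2
import numpy as np

def replace_background(image, mask, new_background):
    """Keep the target object where the mask is foreground and take the new
    background elsewhere; a small blur softens the transition at the contour."""
    new_background = cv2.resize(new_background, (image.shape[1], image.shape[0]))
    alpha = cv2.GaussianBlur(mask, (5, 5), 0).astype(np.float32) / 255.0
    alpha = alpha[..., None]  # broadcast the single-channel mask over the color channels
    composite = (alpha * image.astype(np.float32)
                 + (1.0 - alpha) * new_background.astype(np.float32))
    return composite.astype(np.uint8)
```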
The method described in Embodiment 3 may be applied in a terminal device (such as a mobile phone). The method makes it convenient for the user to replace the background of an image to be processed; because the target object can be segmented accurately, the background can be replaced more precisely, which can improve the user experience to a certain extent.
Embodiment 4
Embodiment 4 of this application provides a training apparatus for an image segmentation network. For ease of description, only the parts related to this application are shown. As shown in FIG. 10, the training apparatus 500 includes:
a sample acquisition module 501, configured to obtain sample images containing a target object, a sample mask corresponding to each sample image, and sample edge information corresponding to each sample mask, where each sample mask indicates the image region in which the target object is located in the corresponding sample image, and each piece of sample edge information indicates the contour edge of the image region, indicated by the corresponding sample mask, in which the target object is located;
a generated mask acquisition module 502, configured to input, for each sample image, the sample image into the image segmentation network to obtain the generated mask output by the image segmentation network for indicating the region in which the target object is located in that sample image;
a generated edge acquisition module 503, configured to input, for each generated mask, the generated mask into the trained edge neural network to obtain the generated edge information output by the edge neural network, where the generated edge information indicates the contour edge of the region in which the target object indicated by that generated mask is located;
a loss determination module 504, configured to determine the loss function of the image segmentation network, where the loss function measures the gap between the sample mask and the generated mask corresponding to each sample image and also measures the gap between the generated edge information and the sample edge information corresponding to each sample image; and
a parameter adjustment module 505, configured to adjust the parameters of the image segmentation network and then trigger the generated mask acquisition module to continue performing the corresponding steps until the loss function of the image segmentation network is less than the first preset threshold, thereby obtaining the trained image segmentation network.
Optionally, the loss determination module 504 is specifically configured to:
determine the loss function of the image segmentation network, where the loss function is calculated as:
$$\mathrm{LOSS}_1=\sum_{j=1}^{N}\left(F1_j+F2_j\right)$$
where LOSS_1 is the loss function of the image segmentation network, N is the total number of sample images, F1_j measures the gap between the sample mask and the generated mask corresponding to the j-th sample image, and F2_j measures the gap between the sample edge information and the generated edge information corresponding to the j-th sample image.
Optionally, F1_j is calculated as follows:
$$F1_j=-\sum_{i=1}^{M}\left[y_{ji}\log_x(p_{ji})+(1-y_{ji})\log_x(1-p_{ji})\right]$$
where M is the total number of pixels in the j-th sample image, the value of y_ji is determined according to the sample mask corresponding to the j-th sample image and indicates whether the i-th pixel of the j-th sample image is in the image region in which the target object is located, p_ji is the probability, predicted by the image segmentation network, that the i-th pixel of the j-th sample image is in the image region in which the target object is located, and x is the base of the logarithm.
In addition, the value of y_ji when the sample mask indicates that the i-th pixel is located in the image region in which the target object is located is greater than the value of y_ji when the sample mask indicates that the i-th pixel is not located in that image region.
Optionally, the trained edge neural network is formed by cascading A convolutional blocks, each consisting of B convolutional layers.
Correspondingly, F2_j is calculated as follows:
$$F2_j=\sum_{c=1}^{A}\lambda_c\left\|h_c(\mathrm{mask}_1)-h_c(\mathrm{mask}_2)\right\|$$
where mask_1 is the generated mask corresponding to the j-th sample image, mask_2 is the sample mask corresponding to the j-th sample image, h_c(mask_1) is the output of the c-th convolutional block when the input of the trained edge neural network is mask_1, h_c(mask_2) is the output of the c-th convolutional block when the input is mask_2, and λ_c is a constant.
Optionally, the training apparatus further includes an edge neural network training module, and the edge neural network training module includes:

an edge information acquisition unit, configured to, for each sample mask, input the sample mask into the edge neural network to obtain the edge information output by the edge neural network, the edge information being used to indicate the contour edge of the region where the target object indicated by the sample mask is located;

an edge loss determination unit, configured to determine the loss function of the edge neural network, the loss function being used to measure the gap between the sample edge information corresponding to each sample mask and the edge information output by the edge neural network; and

an edge parameter adjustment unit, configured to adjust the parameters of the edge neural network and then trigger the edge information acquisition unit to continue performing the corresponding steps until the loss function value of the edge neural network is less than the second preset threshold, thereby obtaining the trained edge neural network.
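Taken together, these three units describe an ordinary supervised training loop for the edge neural network. A hedged PyTorch sketch follows; the Adam optimizer, the binary cross-entropy form of the loss, and the assumption that the model returns an edge probability map plus per-block features (as in the EdgeNet sketch above) are choices made for the example, not details given in the text.

```python
import torch

def train_edge_network(edge_net, data_loader, threshold: float, lr: float = 1e-3):
    """Train the edge neural network until its loss falls below `threshold`.

    data_loader yields (sample_mask, sample_edge) pairs as (N, 1, H, W) tensors
    with values in [0, 1]. edge_net(sample_mask) is assumed to return
    (edge probability map, per-block features).
    """
    optimizer = torch.optim.Adam(edge_net.parameters(), lr=lr)
    bce = torch.nn.BCELoss()
    loss_value = float("inf")
    while loss_value >= threshold:                  # second preset threshold
        for sample_mask, sample_edge in data_loader:
            edge_prob, _ = edge_net(sample_mask)    # predicted edge information
            loss = bce(edge_prob, sample_edge)      # gap to the sample edge information
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            loss_value = loss.item()
    return edge_net
```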
Optionally, the edge loss determination unit is specifically configured to:

determine the loss function of the edge neural network, where the loss function is calculated as:

LOSS_2 = − (1/N) · Σ_{j=1}^{N} Σ_{i=1}^{M} [ r_ji · log_x(q_ji) + (1 − r_ji) · log_x(1 − q_ji) ]

where LOSS_2 is the loss function of the edge neural network, N is the total number of sample images, M is the total number of pixels in the j-th sample mask, the value of r_ji is determined from the sample edge information corresponding to the j-th sample image and indicates whether the i-th pixel of the j-th sample mask is a contour edge, q_ji is the probability, predicted by the edge neural network, that the i-th pixel of the j-th sample mask is a contour edge, and x is the base of the logarithm.

In addition, the value of r_ji when the sample edge information indicates that the i-th pixel is a contour edge is greater than the value of r_ji when the sample edge information indicates that the i-th pixel is not a contour edge.
It should be noted that, since the information exchange and execution processes among the above devices/units are based on the same concept as method embodiment one and method embodiment two of this application, their specific functions and technical effects can be found in the corresponding method embodiment parts and are not repeated here.
Embodiment Five
Embodiment five of the present application provides an image processing apparatus. For ease of description, only the parts related to the present application are shown. As shown in FIG. 11, the image processing apparatus 600 includes:
a mask acquisition module 601, configured to acquire an image to be processed and input the image to be processed into a trained image segmentation network to obtain a mask corresponding to the image to be processed, where the trained image segmentation network is obtained by training with a trained edge neural network, and the trained edge neural network is used to output, according to an input mask, the contour edge of the region where the target object indicated by that mask is located (specifically, the trained image segmentation network is trained with the training method described in embodiment one or embodiment two); and
a target object segmentation module 602, configured to segment the target object contained in the image to be processed based on the mask corresponding to the image to be processed.
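As an illustration of what the target object segmentation module 602 might do, the following NumPy sketch cuts the target object out of the image with a binarized mask; the 0.5 threshold and the zeroed-out background are assumptions for the example, not requirements of the text.

```python
import numpy as np

def segment_target(image: np.ndarray, mask: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Cut the target object out of `image` using `mask`.

    image: (H, W, 3) array; mask: (H, W) array of probabilities or {0, 1} values
    produced by the trained image segmentation network.
    """
    binary = (mask >= threshold).astype(image.dtype)
    return image * binary[..., None]   # keep target pixels, zero out the background
```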
It should be noted that, since the information exchange and execution processes among the above devices/units are based on the same concept as method embodiment three of this application, their specific functions and technical effects can be found in the method embodiment three part and are not repeated here.
Embodiment Six
FIG. 12 is a schematic diagram of a terminal device provided in embodiment six of the present application. As shown in FIG. 12, the terminal device 700 of this embodiment includes a processor 701, a memory 702, and a computer program 703 stored in the memory 702 and executable on the processor 701. When the processor 701 executes the computer program 703, the steps in each of the above method embodiments are implemented. Alternatively, when the processor 701 executes the computer program 703, the functions of the modules/units in each of the above apparatus embodiments are implemented.
Exemplarily, the computer program 703 may be divided into one or more modules/units, and the one or more modules/units are stored in the memory 702 and executed by the processor 701 to complete the present application. The one or more modules/units may be a series of computer program instruction segments capable of completing specific functions, and the instruction segments are used to describe the execution process of the computer program 703 in the terminal device 700. For example, the computer program 703 may be divided into a sample acquisition module, a generated-mask acquisition module, a generated-edge acquisition module, a loss determination module and a parameter adjustment module, with the specific functions of each module as follows:
S101: acquire sample images each containing a target object, a sample mask corresponding to each sample image, and sample edge information corresponding to each sample mask, where each sample mask is used to indicate the image region in which the target object is located in the corresponding sample image, and each piece of sample edge information is used to indicate the contour edge of the image region, indicated by the corresponding sample mask, in which the target object is located.

S102: for each sample image, input the sample image into an image segmentation network to obtain a generated mask output by the image segmentation network and used to indicate the region in which the target object is located in the sample image.

S103: for each generated mask, input the generated mask into a trained edge neural network to obtain generated edge information output by the edge neural network, the generated edge information being used to indicate the contour edge of the region where the target object indicated by the generated mask is located.

S104: determine a loss function of the image segmentation network, where the loss function is used to measure the gap between the sample mask and the generated mask corresponding to each sample image, and the loss function is further used to measure the gap between the generated edge information and the sample edge information corresponding to each sample image.

S105: adjust the parameters of the image segmentation network and then return to S102 until the loss function of the image segmentation network is less than a first preset threshold, thereby obtaining a trained image segmentation network.
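The five steps above correspond to a standard supervised training loop. The sketch below ties them together under stated assumptions: PyTorch, an Adam optimizer, a segmentation network that outputs per-pixel probabilities, an edge network that returns an edge map plus per-block features (as in the earlier EdgeNet sketch), and unit weights λ_c; none of these specifics are fixed by the original text.

```python
import torch
import torch.nn.functional as F

def train_segmentation_network(seg_net, edge_net, data_loader, threshold: float, lr: float = 1e-3):
    """A sketch of steps S101-S105: train seg_net until LOSS_1 < first preset threshold."""
    optimizer = torch.optim.Adam(seg_net.parameters(), lr=lr)
    for p in edge_net.parameters():      # the trained edge network is only used for inference here
        p.requires_grad_(False)

    loss_value = float("inf")
    while loss_value >= threshold:                               # S105: repeat until below threshold
        for image, sample_mask, _sample_edge in data_loader:     # S101: samples, masks, edge info
            gen_mask = seg_net(image)                            # S102: generated mask (probabilities)
            _gen_edge, gen_feats = edge_net(gen_mask)            # S103: generated edge information
            _ref_edge, ref_feats = edge_net(sample_mask)
            f1 = F.binary_cross_entropy(gen_mask, sample_mask)                       # mask gap (F1)
            f2 = sum((a - b).abs().mean() for a, b in zip(gen_feats, ref_feats))     # edge gap (F2)
            loss = f1 + f2                                       # S104: combined loss
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            loss_value = loss.item()
    return seg_net
```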
Alternatively, the computer program 703 may be divided into a mask acquisition module and a target object segmentation module, with the specific functions of each module as follows:
acquire an image to be processed and input the image to be processed into a trained image segmentation network to obtain a mask corresponding to the image to be processed, where the trained image segmentation network is trained with the training method described in embodiment one or embodiment two; and

segment the target object contained in the image to be processed based on the mask corresponding to the image to be processed.
The terminal device may include, but is not limited to, the processor 701 and the memory 702. Those skilled in the art will understand that FIG. 12 is merely an example of the terminal device 700 and does not constitute a limitation on the terminal device 700; it may include more or fewer components than shown, combine certain components, or use different components. For example, the terminal device may further include input and output devices, network access devices, buses, and the like.
The processor 701 may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 702 may be an internal storage unit of the terminal device 700, such as a hard disk or memory of the terminal device 700. The memory 702 may also be an external storage device of the terminal device 700, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the terminal device 700. Further, the memory 702 may include both an internal storage unit of the terminal device 700 and an external storage device. The memory 702 is used to store the computer program and other programs and data required by the terminal device, and may also be used to temporarily store data that has been output or is to be output.
Those skilled in the art can clearly understand that, for convenience and brevity of description, the division into the above functional units and modules is merely used as an example. In practical applications, the above functions may be assigned to different functional units and modules as needed; that is, the internal structure of the above apparatus may be divided into different functional units or modules to complete all or part of the functions described above. The functional units and modules in the embodiments may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit; the integrated unit may be implemented in the form of hardware or in the form of a software functional unit. In addition, the specific names of the functional units and modules are only used to distinguish them from each other and are not used to limit the protection scope of the present application. For the specific working process of the units and modules in the above system, reference may be made to the corresponding process in the foregoing method embodiments, which is not repeated here.
In the above embodiments, the description of each embodiment has its own emphasis. For parts that are not described or recorded in detail in one embodiment, reference may be made to the related descriptions of other embodiments.
A person of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented by electronic hardware, or by a combination of computer software and electronic hardware. Whether these functions are performed by hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of this application.
In the embodiments provided in this application, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other ways. For example, the apparatus/terminal device embodiments described above are merely illustrative; the division into the above modules or units is only a logical function division, and there may be other division manners in actual implementation, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, apparatuses or units, and may be electrical, mechanical or in other forms.
The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated module/unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the above method embodiments of this application may also be completed by a computer program instructing relevant hardware. The computer program may be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of each of the above method embodiments can be implemented. The computer program includes computer program code, and the computer program code may be in the form of source code, object code, an executable file, some intermediate forms, or the like. The computer-readable medium may include: any entity or apparatus capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable medium may be appropriately added or deleted according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunication signals.
The above embodiments are only used to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application, and shall all fall within the protection scope of the present application.

Claims (20)

  1. A method for training an image segmentation network, characterized by comprising:

    S101: acquiring sample images each containing a target object, a sample mask corresponding to each sample image, and sample edge information corresponding to each sample mask, wherein each sample mask is used to indicate the image region in which the target object is located in the corresponding sample image, and each piece of sample edge information is used to indicate the contour edge of the image region, indicated by the corresponding sample mask, in which the target object is located;

    S102: for each sample image, inputting the sample image into an image segmentation network to obtain a generated mask output by the image segmentation network and used to indicate the region in which the target object is located in the sample image;

    S103: for each generated mask, inputting the generated mask into a trained edge neural network to obtain generated edge information output by the edge neural network, the generated edge information being used to indicate the contour edge of the region where the target object indicated by the generated mask is located;

    S104: determining a loss function of the image segmentation network, wherein the loss function is used to measure the gap between the sample mask and the generated mask corresponding to each sample image, and the loss function is further used to measure the gap between the generated edge information and the sample edge information corresponding to each sample image; and

    S105: adjusting the parameters of the image segmentation network and then returning to S102 until the loss function of the image segmentation network is less than a first preset threshold, thereby obtaining a trained image segmentation network.
  2. The training method according to claim 1, wherein determining the loss function of the image segmentation network, the loss function being used to measure the gap between the sample mask and the generated mask corresponding to each sample image and further used to measure the gap between the generated edge information and the sample edge information corresponding to each sample image, comprises:

    determining the loss function of the image segmentation network, wherein the loss function is positively correlated with the mask gap corresponding to each sample image, and the loss function is positively correlated with the edge gap corresponding to each sample image, wherein the mask gap corresponding to each sample image is the gap between the sample mask corresponding to that sample image and the generated mask, and the edge gap corresponding to each sample image is the gap between the sample edge information corresponding to that sample image and the generated edge information.
  3. The training method according to claim 2, wherein determining the loss function of the image segmentation network, the loss function being positively correlated with the mask gap corresponding to each sample image and positively correlated with the edge gap corresponding to each sample image, comprises:

    determining the loss function of the image segmentation network, where the loss function is calculated as:

    LOSS_1 = (1/N) · Σ_{j=1}^{N} (F1_j + F2_j)

    where LOSS_1 is the loss function of the image segmentation network, N is the total number of sample images, F1_j measures the gap between the sample mask corresponding to the j-th sample image and the generated mask, and F2_j measures the gap between the sample edge information corresponding to the j-th sample image and the generated edge information, j = 1, 2, …, N.
  4. The training method according to claim 3, wherein F1_j is calculated as follows:

    F1_j = − Σ_{i=1}^{M} [ y_ji · log_x(p_ji) + (1 − y_ji) · log_x(1 − p_ji) ]

    where M is the total number of pixels in the j-th sample image, the value of y_ji is determined from the sample mask corresponding to the j-th sample image and indicates whether the i-th pixel of the j-th sample image lies in the image region where the target object is located, p_ji is the probability, predicted by the image segmentation network, that the i-th pixel of the j-th sample image lies in the image region where the target object is located, and x is the base of the logarithm; and

    the value of y_ji when the sample mask indicates that the i-th pixel lies in the image region where the target object is located is greater than the value of y_ji when the sample mask indicates that the i-th pixel does not lie in that image region.
  5. The training method according to claim 1, wherein determining the loss function of the image segmentation network comprises:

    for each sample image, calculating the image difference between the generated mask corresponding to the sample image and the sample mask corresponding to the sample image;

    if the sample edge information and the generated edge information are both images, for each sample image, calculating the image difference between the sample edge information corresponding to the sample image and the generated edge information corresponding to the sample image; and

    averaging the image differences of the sample masks and the image differences of the generated edge information to obtain the loss function of the image segmentation network.
  6. The training method according to claim 3, wherein the trained edge neural network is formed by cascading A convolutional blocks, each of which consists of B convolutional layers; and

    correspondingly, F2_j is calculated as follows:

    F2_j = Σ_{c=1}^{A} λ_c · ‖ h_c(mask_1) − h_c(mask_2) ‖

    where mask_1 is the generated mask corresponding to the j-th sample image, mask_2 is the sample mask corresponding to the j-th sample image, h_c(mask_1) is the output of the c-th convolutional block when the input of the trained edge neural network is mask_1, h_c(mask_2) is the output of the c-th convolutional block when the input of the trained edge neural network is mask_2, and λ_c is a constant.
  7. The training method according to any one of claims 1 to 6, wherein before step S103, the training method further comprises a training process for the edge neural network, the training process for the edge neural network being as follows:

    for each sample mask, inputting the sample mask into the edge neural network to obtain edge information output by the edge neural network, the edge information being used to indicate the contour edge of the region where the target object indicated by the sample mask is located;

    determining a loss function of the edge neural network, the loss function being used to measure the gap between the sample edge information corresponding to each sample mask and the edge information output by the edge neural network; and

    adjusting the parameters of the edge neural network and then returning to the step of, for each sample mask, inputting the sample mask into the edge neural network to obtain the edge information output by the edge neural network, and the subsequent steps, until the loss function value of the edge neural network is less than a second preset threshold, thereby obtaining the trained edge neural network.
  8. The training method according to claim 7, wherein determining the loss function of the edge neural network, the loss function being used to measure the gap between the sample edge information corresponding to each sample mask and the edge information output by the edge neural network, comprises:

    determining the loss function of the edge neural network, where the loss function is calculated as:

    LOSS_2 = − (1/N) · Σ_{j=1}^{N} Σ_{i=1}^{M} [ r_ji · log_x(q_ji) + (1 − r_ji) · log_x(1 − q_ji) ]

    where LOSS_2 is the loss function of the edge neural network, N is the total number of sample images, M is the total number of pixels in the j-th sample mask, the value of r_ji is determined from the sample edge information corresponding to the j-th sample image and indicates whether the i-th pixel of the j-th sample mask is a contour edge, q_ji is the probability, predicted by the edge neural network, that the i-th pixel of the j-th sample mask is a contour edge, and x is the base of the logarithm; and

    the value of r_ji when the sample edge information indicates that the i-th pixel is a contour edge is greater than the value of r_ji when the sample edge information indicates that the i-th pixel is not a contour edge.
  9. The training method according to any one of claims 1 to 6, wherein after obtaining the trained image segmentation network, the method comprises:

    evaluating the trained image segmentation network according to sample images of a test set and an evaluation function, where the evaluation function is:

    IoU = |X ∩ Y| / |X ∪ Y|

    where X is the image region of the target object indicated by the generated mask output by the image segmentation network after any sample image of the test set is input into the trained image segmentation network;

    Y is the image region of the target object indicated by the sample mask corresponding to the sample image input into the trained image segmentation network; and

    the closer the value of IoU is to 1, the better the performance of the trained image segmentation network.
  10. An image processing method, characterized by comprising:

    acquiring an image to be processed and inputting the image to be processed into a trained image segmentation network to obtain a mask corresponding to the image to be processed, wherein the trained image segmentation network is obtained by training with a trained edge neural network, and the trained edge neural network is used to output, according to an input mask, the contour edge of the region where the target object indicated by that mask is located; and

    segmenting the target object contained in the image to be processed based on the mask corresponding to the image to be processed.
  11. The image processing method according to claim 10, wherein the trained image segmentation network being obtained by training with a trained edge neural network, and the trained edge neural network being used to output, according to an input mask, the contour edge of the region where the target object indicated by that mask is located, comprises:

    the trained image segmentation network being trained with the training method according to any one of claims 1 to 9.
  12. An image segmentation network, characterized in that the image segmentation network is trained with the training method according to any one of claims 1 to 9.
  13. A terminal device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the following steps:

    S101: acquiring sample images each containing a target object, a sample mask corresponding to each sample image, and sample edge information corresponding to each sample mask, wherein each sample mask is used to indicate the image region in which the target object is located in the corresponding sample image, and each piece of sample edge information is used to indicate the contour edge of the image region, indicated by the corresponding sample mask, in which the target object is located;

    S102: for each sample image, inputting the sample image into an image segmentation network to obtain a generated mask output by the image segmentation network and used to indicate the region in which the target object is located in the sample image;

    S103: for each generated mask, inputting the generated mask into a trained edge neural network to obtain generated edge information output by the edge neural network, the generated edge information being used to indicate the contour edge of the region where the target object indicated by the generated mask is located;

    S104: determining a loss function of the image segmentation network, wherein the loss function is used to measure the gap between the sample mask and the generated mask corresponding to each sample image, and the loss function is further used to measure the gap between the generated edge information and the sample edge information corresponding to each sample image; and

    S105: adjusting the parameters of the image segmentation network and then returning to S102 until the loss function of the image segmentation network is less than a first preset threshold, thereby obtaining a trained image segmentation network.
  14. The terminal device according to claim 13, wherein when the processor performs the determining of the loss function of the image segmentation network, the determining comprises:

    determining the loss function of the image segmentation network, wherein the loss function is positively correlated with the mask gap corresponding to each sample image, and the loss function is positively correlated with the edge gap corresponding to each sample image, wherein the mask gap corresponding to each sample image is the gap between the sample mask corresponding to that sample image and the generated mask, and the edge gap corresponding to each sample image is the gap between the sample edge information corresponding to that sample image and the generated edge information.
  15. The terminal device according to claim 14, wherein when the processor performs the determining of the loss function of the image segmentation network, the loss function being positively correlated with the mask gap corresponding to each sample image and positively correlated with the edge gap corresponding to each sample image, the determining comprises:

    determining the loss function of the image segmentation network, where the loss function is calculated as:

    LOSS_1 = (1/N) · Σ_{j=1}^{N} (F1_j + F2_j)

    where LOSS_1 is the loss function of the image segmentation network, N is the total number of sample images, F1_j measures the gap between the sample mask corresponding to the j-th sample image and the generated mask, and F2_j measures the gap between the sample edge information corresponding to the j-th sample image and the generated edge information, j = 1, 2, …, N.
  16. The terminal device according to claim 15, wherein F1_j is calculated as follows:

    F1_j = − Σ_{i=1}^{M} [ y_ji · log_x(p_ji) + (1 − y_ji) · log_x(1 − p_ji) ]

    where M is the total number of pixels in the j-th sample image, the value of y_ji is determined from the sample mask corresponding to the j-th sample image and indicates whether the i-th pixel of the j-th sample image lies in the image region where the target object is located, p_ji is the probability, predicted by the image segmentation network, that the i-th pixel of the j-th sample image lies in the image region where the target object is located, and x is the base of the logarithm; and

    the value of y_ji when the sample mask indicates that the i-th pixel lies in the image region where the target object is located is greater than the value of y_ji when the sample mask indicates that the i-th pixel does not lie in that image region.
  17. The terminal device according to claim 15, wherein the trained edge neural network is formed by cascading A convolutional blocks, each of which consists of B convolutional layers; and

    correspondingly, F2_j is calculated as follows:

    F2_j = Σ_{c=1}^{A} λ_c · ‖ h_c(mask_1) − h_c(mask_2) ‖

    where mask_1 is the generated mask corresponding to the j-th sample image, mask_2 is the sample mask corresponding to the j-th sample image, h_c(mask_1) is the output of the c-th convolutional block when the input of the trained edge neural network is mask_1, h_c(mask_2) is the output of the c-th convolutional block when the input of the trained edge neural network is mask_2, and λ_c is a constant.
  18. The terminal device according to any one of claims 13 to 17, wherein the execution of the computer program by the processor comprises a training process for the edge neural network, the training process for the edge neural network being as follows:

    for each sample mask, inputting the sample mask into the edge neural network to obtain edge information output by the edge neural network, the edge information being used to indicate the contour edge of the region where the target object indicated by the sample mask is located;

    determining a loss function of the edge neural network, the loss function being used to measure the gap between the sample edge information corresponding to each sample mask and the edge information output by the edge neural network; and

    adjusting the parameters of the edge neural network and then returning to the step of, for each sample mask, inputting the sample mask into the edge neural network to obtain the edge information output by the edge neural network, and the subsequent steps, until the loss function value of the edge neural network is less than a second preset threshold, thereby obtaining the trained edge neural network.
  19. The terminal device according to claim 18, wherein when the processor performs the determining of the loss function of the edge neural network, the loss function being used to measure the gap between the sample edge information corresponding to each sample mask and the edge information output by the edge neural network, the determining comprises:

    determining the loss function of the edge neural network, where the loss function is calculated as:

    LOSS_2 = − (1/N) · Σ_{j=1}^{N} Σ_{i=1}^{M} [ r_ji · log_x(q_ji) + (1 − r_ji) · log_x(1 − q_ji) ]

    where LOSS_2 is the loss function of the edge neural network, N is the total number of sample images, M is the total number of pixels in the j-th sample mask, the value of r_ji is determined from the sample edge information corresponding to the j-th sample image and indicates whether the i-th pixel of the j-th sample mask is a contour edge, q_ji is the probability, predicted by the edge neural network, that the i-th pixel of the j-th sample mask is a contour edge, and x is the base of the logarithm; and

    the value of r_ji when the sample edge information indicates that the i-th pixel is a contour edge is greater than the value of r_ji when the sample edge information indicates that the i-th pixel is not a contour edge.
  20. A computer-readable storage medium storing a computer program, characterized in that, when the computer program is executed by a processor, the steps of the method according to any one of claims 1 to 11 are implemented.
PCT/CN2020/117470 2019-09-29 2020-09-24 Network training method, image processing method, network, terminal device and medium WO2021057848A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910931784.2A CN110660066B (en) 2019-09-29 2019-09-29 Training method of network, image processing method, network, terminal equipment and medium
CN201910931784.2 2019-09-29

Publications (1)

Publication Number Publication Date
WO2021057848A1 true WO2021057848A1 (en) 2021-04-01

Family

ID=69039787

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/117470 WO2021057848A1 (en) 2019-09-29 2020-09-24 Network training method, image processing method, network, terminal device and medium

Country Status (2)

Country Link
CN (1) CN110660066B (en)
WO (1) WO2021057848A1 (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113177606A (en) * 2021-05-20 2021-07-27 上海商汤智能科技有限公司 Image processing method, device, equipment and storage medium
US20210279883A1 (en) * 2020-03-05 2021-09-09 Alibaba Group Holding Limited Image processing method, apparatus, electronic device, and storage medium
CN113378948A (en) * 2021-06-21 2021-09-10 梅卡曼德(北京)机器人科技有限公司 Image mask generation method and device, electronic equipment and storage medium
CN113724163A (en) * 2021-08-31 2021-11-30 平安科技(深圳)有限公司 Image correction method, device, equipment and medium based on neural network
CN115100404A (en) * 2022-05-17 2022-09-23 阿里巴巴(中国)有限公司 Method for processing image and image segmentation model and method for setting image background
CN115223171A (en) * 2022-03-15 2022-10-21 腾讯科技(深圳)有限公司 Text recognition method, device, equipment and storage medium
CN116823864A (en) * 2023-08-25 2023-09-29 锋睿领创(珠海)科技有限公司 Data processing method, device, equipment and medium based on balance loss function
CN117315263A (en) * 2023-11-28 2023-12-29 杭州申昊科技股份有限公司 Target contour segmentation device, training method, segmentation method and electronic equipment

Families Citing this family (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110660066B (en) * 2019-09-29 2023-08-04 Oppo广东移动通信有限公司 Training method of network, image processing method, network, terminal equipment and medium
CN111311485B (en) * 2020-03-17 2023-07-04 Oppo广东移动通信有限公司 Image processing method and related device
CN111415358B (en) * 2020-03-20 2024-03-12 Oppo广东移动通信有限公司 Image segmentation method, device, electronic equipment and storage medium
CN111462086B (en) * 2020-03-31 2024-04-26 推想医疗科技股份有限公司 Image segmentation method and device, and training method and device of neural network model
CN113744293A (en) * 2020-05-13 2021-12-03 Oppo广东移动通信有限公司 Image processing method, image processing apparatus, electronic device, and readable storage medium
CN111899273A (en) * 2020-06-10 2020-11-06 上海联影智能医疗科技有限公司 Image segmentation method, computer device and storage medium
CN111754521B (en) * 2020-06-17 2024-06-25 Oppo广东移动通信有限公司 Image processing method and device, electronic equipment and storage medium
CN113808003B (en) * 2020-06-17 2024-02-09 北京达佳互联信息技术有限公司 Training method of image processing model, image processing method and device
CN111488876B (en) * 2020-06-28 2020-10-23 平安国际智慧城市科技股份有限公司 License plate recognition method, device, equipment and medium based on artificial intelligence
CN112070793A (en) * 2020-09-11 2020-12-11 北京邮电大学 Target extraction method and device
CN112132847A (en) * 2020-09-27 2020-12-25 北京字跳网络技术有限公司 Model training method, image segmentation method, device, electronic device and medium
CN112465843A (en) * 2020-12-22 2021-03-09 深圳市慧鲤科技有限公司 Image segmentation method and device, electronic equipment and storage medium
CN112669228B (en) * 2020-12-22 2024-05-31 厦门美图之家科技有限公司 Image processing method, system, mobile terminal and storage medium
CN112580567B (en) * 2020-12-25 2024-04-16 深圳市优必选科技股份有限公司 Model acquisition method, model acquisition device and intelligent equipment
CN113159074B (en) * 2021-04-26 2024-02-09 京东科技信息技术有限公司 Image processing method, device, electronic equipment and storage medium
CN113643311B (en) * 2021-06-28 2024-04-09 清华大学 Image segmentation method and device with robust boundary errors
CN113327210B (en) * 2021-06-30 2023-04-07 中海油田服务股份有限公司 Well logging image filling method, device, medium and electronic equipment
CN113822287B (en) * 2021-11-19 2022-02-22 苏州浪潮智能科技有限公司 Image processing method, system, device and medium
CN118266013A (en) * 2021-11-30 2024-06-28 华为技术有限公司 Method for training model, method and device for constructing three-dimensional auricle structure
CN114419086A (en) * 2022-01-20 2022-04-29 北京字跳网络技术有限公司 Edge extraction method and device, electronic equipment and storage medium
CN114998168B (en) * 2022-05-19 2024-07-23 清华大学 Ultrasonic image sample generation method and device
CN114758136B (en) * 2022-06-13 2022-10-18 深圳比特微电子科技有限公司 Target removal model establishing method and device and readable storage medium
CN117237397B (en) * 2023-07-13 2024-05-28 天翼爱音乐文化科技有限公司 Portrait segmentation method, system, equipment and storage medium based on feature fusion
CN117474932B (en) * 2023-12-27 2024-03-19 苏州镁伽科技有限公司 Object segmentation method and device, electronic equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109325954A (en) * 2018-09-18 2019-02-12 北京旷视科技有限公司 Image partition method, device and electronic equipment
CN109726644A (en) * 2018-12-14 2019-05-07 重庆邮电大学 A kind of nucleus dividing method based on generation confrontation network
US20190156154A1 (en) * 2017-11-21 2019-05-23 Nvidia Corporation Training a neural network to predict superpixels using segmentation-aware affinity loss
CN110660066A (en) * 2019-09-29 2020-01-07 Oppo广东移动通信有限公司 Network training method, image processing method, network, terminal device, and medium

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106846336B (en) * 2017-02-06 2022-07-15 腾讯科技(上海)有限公司 Method and device for extracting foreground image and replacing image background
CN110838124B (en) * 2017-09-12 2021-06-18 深圳科亚医疗科技有限公司 Method, system, and medium for segmenting images of objects having sparse distribution
CN108647588A (en) * 2018-04-24 2018-10-12 广州绿怡信息科技有限公司 Goods categories recognition methods, device, computer equipment and storage medium
CN109377445B (en) * 2018-10-12 2023-07-04 北京旷视科技有限公司 Model training method, method and device for replacing image background and electronic system
CN109685067B (en) * 2018-12-26 2022-05-03 江西理工大学 Image semantic segmentation method based on region and depth residual error network
CN110084234B (en) * 2019-03-27 2023-04-18 东南大学 Sonar image target identification method based on example segmentation
CN110188760B (en) * 2019-04-01 2021-10-22 上海卫莎网络科技有限公司 Image processing model training method, image processing method and electronic equipment
CN110176016B (en) * 2019-05-28 2021-04-30 招远市国有资产经营有限公司 Virtual fitting method based on human body contour segmentation and skeleton recognition

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190156154A1 (en) * 2017-11-21 2019-05-23 Nvidia Corporation Training a neural network to predict superpixels using segmentation-aware affinity loss
CN109325954A (en) * 2018-09-18 2019-02-12 北京旷视科技有限公司 Image partition method, device and electronic equipment
CN109726644A (en) * 2018-12-14 2019-05-07 重庆邮电大学 A kind of nucleus dividing method based on generation confrontation network
CN110660066A (en) * 2019-09-29 2020-01-07 Oppo广东移动通信有限公司 Network training method, image processing method, network, terminal device, and medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
CHEN XU; WILLIAMS BRYAN M.; VALLABHANENI SRINIVASA R.; CZANNER GABRIELA; WILLIAMS RACHEL; ZHENG YALIN: "Learning Active Contour Models for Medical Image Segmentation", 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), IEEE, 15 June 2019 (2019-06-15), pages 11624 - 11632, XP033686570, DOI: 10.1109/CVPR.2019.01190 *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210279883A1 (en) * 2020-03-05 2021-09-09 Alibaba Group Holding Limited Image processing method, apparatus, electronic device, and storage medium
US11816842B2 (en) * 2020-03-05 2023-11-14 Alibaba Group Holding Limited Image processing method, apparatus, electronic device, and storage medium
CN113177606A (en) * 2021-05-20 2021-07-27 上海商汤智能科技有限公司 Image processing method, device, equipment and storage medium
CN113177606B (en) * 2021-05-20 2023-11-28 上海商汤智能科技有限公司 Image processing method, device, equipment and storage medium
CN113378948A (en) * 2021-06-21 2021-09-10 梅卡曼德(北京)机器人科技有限公司 Image mask generation method and device, electronic equipment and storage medium
CN113724163B (en) * 2021-08-31 2024-06-07 平安科技(深圳)有限公司 Image correction method, device, equipment and medium based on neural network
CN113724163A (en) * 2021-08-31 2021-11-30 平安科技(深圳)有限公司 Image correction method, device, equipment and medium based on neural network
CN115223171A (en) * 2022-03-15 2022-10-21 腾讯科技(深圳)有限公司 Text recognition method, device, equipment and storage medium
CN115100404A (en) * 2022-05-17 2022-09-23 阿里巴巴(中国)有限公司 Method for processing image and image segmentation model and method for setting image background
CN116823864A (en) * 2023-08-25 2023-09-29 锋睿领创(珠海)科技有限公司 Data processing method, device, equipment and medium based on balance loss function
CN116823864B (en) * 2023-08-25 2024-01-05 锋睿领创(珠海)科技有限公司 Data processing method, device, equipment and medium based on balance loss function
CN117315263B (en) * 2023-11-28 2024-03-22 杭州申昊科技股份有限公司 Target contour device, training method, segmentation method, electronic equipment and storage medium
CN117315263A (en) * 2023-11-28 2023-12-29 杭州申昊科技股份有限公司 Target contour segmentation device, training method, segmentation method and electronic equipment

Also Published As

Publication number Publication date
CN110660066B (en) 2023-08-04
CN110660066A (en) 2020-01-07

Similar Documents

Publication Publication Date Title
WO2021057848A1 (en) Network training method, image processing method, network, terminal device and medium
CN108765278B (en) Image processing method, mobile terminal and computer readable storage medium
WO2020207190A1 (en) Three-dimensional information determination method, three-dimensional information determination device, and terminal apparatus
CN109345553B (en) Palm and key point detection method and device thereof, and terminal equipment
WO2021164269A1 (en) Attention mechanism-based disparity map acquisition method and apparatus
CN109166156B (en) Camera calibration image generation method, mobile terminal and storage medium
CN111489290B (en) Face image super-resolution reconstruction method and device and terminal equipment
WO2021098618A1 (en) Data classification method and apparatus, terminal device and readable storage medium
WO2022105608A1 (en) Rapid face density prediction and face detection method and apparatus, electronic device, and storage medium
WO2018228310A1 (en) Image processing method and apparatus, and terminal
CN108898082B (en) Picture processing method, picture processing device and terminal equipment
CN110853068B (en) Picture processing method and device, electronic equipment and readable storage medium
CN109657543B (en) People flow monitoring method and device and terminal equipment
EP4447008A1 (en) Facial recognition method and apparatus
CN110956131A (en) Single-target tracking method, device and system
WO2024041108A1 (en) Image correction model training method and apparatus, image correction method and apparatus, and computer device
CN112070682A (en) Method and device for compensating image brightness
CN110717405B (en) Face feature point positioning method, device, medium and electronic equipment
CN108932703B (en) Picture processing method, picture processing device and terminal equipment
US20110038509A1 (en) Determining main objects using range information
CN109165648B (en) Image processing method, image processing device and mobile terminal
CN113298122A (en) Target detection method and device and electronic equipment
CN110544221B (en) Training method and device, rain removing method, terminal device and storage medium
CN115731442A (en) Image processing method, image processing device, computer equipment and storage medium
CN111754411B (en) Image noise reduction method, image noise reduction device and terminal equipment

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20867569

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20867569

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 20867569

Country of ref document: EP

Kind code of ref document: A1