CN110599514A - Image segmentation method and device, electronic equipment and storage medium - Google Patents

Image segmentation method and device, electronic equipment and storage medium

Info

Publication number
CN110599514A
CN110599514A (application CN201910901524.0A; granted as CN110599514B)
Authority
CN
China
Prior art keywords
image
information
image segmentation
edge
segmentation network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910901524.0A
Other languages
Chinese (zh)
Other versions
CN110599514B (en)
Inventor
黄慧娟
郭益林
赵松涛
宋丛礼
郑文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd
Priority to CN201910901524.0A
Publication of CN110599514A
Application granted
Publication of CN110599514B
Legal status: Active (granted)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/13 Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The disclosure relates to an image segmentation method and apparatus, an electronic device, and a storage medium, and belongs to the field of image processing. The method comprises the following steps: acquiring a sample image, together with reference position information and reference edge information of a detection object in the sample image; inputting the sample image into a feature extraction module to obtain feature information of the sample image; determining position information of the detection object included in the sample image based on the feature information and an object position detection module, and determining edge information of the detection object based on the feature information and an object edge detection module; and training the image segmentation network to be trained based on the position information, the edge information, the reference position information, and the reference edge information to obtain the trained image segmentation network. With the method and apparatus, the accuracy with which the image segmentation network identifies the position of an object can be improved.

Description

Image segmentation method and device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of image processing, and in particular, to a method and an apparatus for image segmentation, an electronic device, and a storage medium.
Background
With the development of artificial intelligence, image segmentation has become a widely used technique. Its aim is to detect the region of an image that belongs to a target object and to distinguish the part of the image belonging to the target object from the part that does not, where the target object may be a person, a vehicle, an animal, or the like.
A current image segmentation technique is to establish an image segmentation network and train it on a large number of sample images so that it can detect the target object. The image segmentation network comprises a feature extraction module and a position detection module. The feature extraction module extracts feature information from a picture and passes it to the position detection module; the position detection module extracts position information of the target object and compares it with reference position information of the target object to obtain a loss value. Adjustment values for the parameters in the feature extraction module and the position detection module are determined based on the loss value, and the parameters are adjusted accordingly. Through repeated training, the trained feature extraction module and position detection module can output, for any image, a binary image in which the pixel values of the part belonging to the target object differ from those of the part not belonging to it, thereby distinguishing the two parts of the image.
In the course of implementing the present disclosure, the inventors found that the prior art has at least the following problem: when the image content is complex and noisy, the detected position information is also noisy; for example, a region outside the object may be taken as the region where the object is located, so the position information has low accuracy.
Disclosure of Invention
The present disclosure provides an image segmentation method, apparatus, electronic device, and storage medium to at least solve the problem of low accuracy in detecting position information in the related art. The technical solution of the disclosure is as follows:
according to a first aspect of the embodiments of the present disclosure, there is provided a training method for an image segmentation network, including:
acquiring a sample image, and reference position information and reference edge information of a detection object in the sample image;
inputting the sample image into the feature extraction module to obtain feature information of the sample image;
determining position information of a detection object included in the sample image based on the feature information and the object position detection module, and determining edge information of the detection object included in the sample image based on the feature information and the object edge detection module;
and training the image segmentation network to be trained based on the position information, the edge information, the reference position information and the reference edge information to obtain the trained image segmentation network.
Optionally, the step of training the image segmentation network to be trained based on the position information, the edge information, the reference position information, and the reference edge information includes:
determining a first loss value based on a difference of the position information and the reference position information, and determining a second loss value based on a difference of the edge information and the reference edge information;
and training the image segmentation network to be trained based on the first loss value and the second loss value.
Optionally, the step of training the image segmentation network to be trained based on the first loss value and the second loss value includes:
summing the first loss value and the second loss value to obtain a third loss value;
adjusting model parameters of the feature extraction module, the object position detection module, and the object edge detection module based on the third loss value.
Optionally, the image segmentation network further includes a Softmax processing module, and the step of determining, based on the feature information and the object position detection module, position information of a detection object included in the sample image, and determining, based on the feature information and the object edge detection module, edge information of the detection object of the sample image includes:
inputting the characteristic information into an object position detection module to obtain initial position information of the detection object included in the sample image, and processing the initial position information of the detection object included in the sample image based on a Softmax processing module to obtain position information of the detection object included in the sample image;
inputting the characteristic information into an object edge detection module to obtain initial edge information of the detection object included in the sample image, and processing the initial edge information of the detection object included in the sample image based on a Softmax processing module to obtain the edge information of the detection object included in the sample image.
Optionally, after the training of the image segmentation network to be trained is performed based on the position information, the edge information, the reference position information, and the reference edge information, and a trained image segmentation network is obtained, the method further includes:
and deleting the object edge detection module in the trained image segmentation network.
According to a second aspect of the embodiments of the present disclosure, there is provided an image segmentation method, including:
acquiring a target image to be segmented;
target position information of the detection object in the target image is determined based on the target image and the trained image segmentation network as described above.
Optionally, the determining target position information of the detection object in the target image based on the target image and the trained image segmentation network includes:
and determining target characteristic information based on the target image and the trained characteristic extraction module, and determining target position information of the detection object in the target image based on the target characteristic information and the trained object position detection module.
Optionally, the image segmentation network further includes a Softmax processing module, and the step of determining, based on the target feature information and the trained object position detection module, the target position information of the detected object in the target image includes:
inputting the target characteristic information into a trained object position detection module to obtain initial position information of a detection object included in the target image;
and processing the initial position information of the detection object included in the target image based on a Softmax processing module to obtain the position information of the detection object included in the target image.
According to a third aspect of the embodiments of the present disclosure, there is provided a training apparatus for an image segmentation network, including:
An acquisition unit configured to acquire a sample image, reference position information and reference edge information of a detection object in the sample image;
the input unit is configured to input the sample image into the feature extraction module to obtain feature information of the sample image;
a determination unit configured to determine position information of a detection object included in the sample image based on the feature information and the object position detection module, and determine edge information of the detection object included in the sample image based on the feature information and the object edge detection module;
and the training unit is configured to train the image segmentation network to be trained on the basis of the position information, the edge information, the reference position information and the reference edge information to obtain the trained image segmentation network.
Optionally, the training unit is configured to:
determining a first loss value based on a difference of the position information and the reference position information, and determining a second loss value based on a difference of the edge information and the reference edge information;
and training the image segmentation network to be trained based on the first loss value and the second loss value.
Optionally, the training unit is configured to:
summing the first loss value and the second loss value to obtain a third loss value;
adjusting model parameters of the feature extraction module, the object position detection module, and the object edge detection module based on the third loss value.
Optionally, the image segmentation network further includes a Softmax processing module, and the determining unit is configured to:
inputting the characteristic information into an object position detection module to obtain initial position information of the detection object included in the sample image, and processing the initial position information of the detection object included in the sample image based on a Softmax processing module to obtain position information of the detection object included in the sample image;
inputting the characteristic information into an object edge detection module to obtain initial edge information of the detection object included in the sample image, and processing the initial edge information of the detection object included in the sample image based on a Softmax processing module to obtain the edge information of the detection object included in the sample image.
Optionally, the apparatus further includes a deleting unit, where the deleting unit is configured to:
and deleting the object edge detection module in the trained image segmentation network.
According to a fourth aspect of the embodiments of the present disclosure, there is provided an apparatus for image segmentation, including:
an acquisition unit configured to acquire a target image to be segmented;
a determination unit configured to determine target position information of a detection object in the target image based on the target image and the trained image segmentation network as described above.
Optionally, the determining unit is configured to:
and determining target characteristic information based on the target image and the trained characteristic extraction module, and determining target position information of the detection object in the target image based on the target characteristic information and the trained object position detection module.
Optionally, the image segmentation network further includes a Softmax processing module, and the determining unit is further configured to:
inputting the target characteristic information into a trained object position detection module to obtain initial position information of a detection object included in the target image;
and processing the initial position information of the detection object included in the target image based on a Softmax processing module to obtain the position information of the detection object included in the target image.
According to a fifth aspect of the embodiments of the present disclosure, there is provided an electronic apparatus including:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the method of training of an image segmentation network according to the first aspect or the method of image segmentation according to the second aspect.
According to a sixth aspect of embodiments of the present disclosure, there is provided a storage medium, wherein instructions, when executed by a processor of an electronic device, enable the electronic device to perform the method of training an image segmentation network according to the first aspect or the method of image segmentation according to the second aspect.
According to a seventh aspect of embodiments of the present disclosure, there is provided a computer program product, which, when run on a server, causes the server to perform the method of training an image segmentation network according to the first aspect or the method of image segmentation according to the second aspect.
The technical scheme provided by the embodiment of the disclosure at least brings the following beneficial effects:
according to the image segmentation network training method and device, when the image segmentation network is trained, the object edge detection module is additionally arranged, the image segmentation network is trained from two angles of position information and edge information of the detection object, and because the edge and the position of the detection object are directly related, the accuracy of the image segmentation network in identifying the position of the object can be improved, the influence of noise in the image on the identification result can be reduced, the influence of a complex background image on the identification result can be reduced, the influence of a strong edge on the identification result can be reduced, and meanwhile, the operation amount of the image in segmentation can not be increased.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.
Fig. 1 is an architecture diagram of an image segmentation network in a training method of the image segmentation network according to an exemplary embodiment.
FIG. 2 is a flowchart illustrating a method of training an image segmentation network, according to an example embodiment.
FIG. 3 is a flowchart illustrating a method of image segmentation in accordance with an exemplary embodiment.
Fig. 4a is an effect diagram of image segmentation by a general image segmentation network.
FIG. 4b is an illustration of an effect of image segmentation, according to an exemplary embodiment.
FIG. 5 is a block diagram illustrating an apparatus for training of an image segmentation network, according to an example embodiment.
FIG. 6 is a block diagram illustrating an apparatus for image segmentation in accordance with an exemplary embodiment.
Fig. 7 is a schematic structural diagram of an electronic device shown in accordance with an example embodiment.
Detailed Description
The training method of the image segmentation network provided by the present disclosure may be implemented by a computer device, which may be a server or a terminal; the method may be implemented by the server or the terminal alone, or by the two together. The terminal may be a mobile phone, a tablet computer, a smart wearable device, a desktop computer, a notebook computer, or the like. The server may be a single server or a server group. If it is a single server, that server is responsible for all of the processing in the following scheme; if it is a server group, different servers in the group may be responsible for different parts of the processing, with the specific allocation set by a technician according to actual needs, which is not repeated here. In this embodiment, the scheme is described taking a server performing object detection as an example; other cases are similar and are not described again.
Fig. 1 is an architecture diagram of an image segmentation network in a training method of the image segmentation network according to an exemplary embodiment. Referring to fig. 1, the training method provided by the present disclosure performs learning training on the image segmentation network to be trained using a large number of sample images, so that the network can segment a target object from an image. The image segmentation network to be trained comprises a feature extraction module, a position detection module, and an edge detection module. The feature extraction module extracts feature information from a picture and passes it to the position detection module and the edge detection module; these two modules extract the position information and edge information of the target object and compare them with the reference position information and reference edge information of the target object to obtain a loss value. The loss value is back-propagated using gradient descent, and the parameters in the feature extraction module are adjusted. Through repeated learning and training, the network finally detects the accurate region of the target object in the picture.
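For illustration only, and not as part of the claimed disclosure, the following is a minimal PyTorch sketch of the three-module architecture just described. The patent does not specify concrete layers, so all layer sizes, channel counts, and module names here are assumptions.

```python
# A minimal sketch of the three-module image segmentation network:
# a shared feature extractor plus position and edge detection heads.
import torch
import torch.nn as nn

class SegmentationNetwork(nn.Module):
    def __init__(self, num_classes: int = 2):
        super().__init__()
        # Feature extraction module: a small convolutional encoder (illustrative).
        self.feature_extractor = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
        )
        # Object position detection module: per-pixel object/background logits.
        self.position_head = nn.Conv2d(64, num_classes, kernel_size=1)
        # Object edge detection module: per-pixel edge/non-edge logits.
        # Used during training; it can be removed afterwards (see below).
        self.edge_head = nn.Conv2d(64, num_classes, kernel_size=1)

    def forward(self, x):
        features = self.feature_extractor(x)       # shared feature information
        position_logits = self.position_head(features)
        edge_logits = self.edge_head(features)
        return position_logits, edge_logits
```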
The image segmentation method provided by the embodiments of the present disclosure can detect the position of each object contained in any image, where the object may be a human body, a human face, an animal, an automobile, or another object. The scheme can be applied in various scenes involving image recognition. For example, when a public security organization searches for criminal suspects through checkpoint cameras, the scheme can accurately extract the image of a human body or vehicle captured by the camera, facilitating comparison and investigation by the relevant personnel. It can also be applied in the field of autonomous driving, improving the accuracy with which a vehicle recognizes surrounding objects, and in picture editing software, enabling automatic and accurate matting of the object to be edited and improving editing efficiency. In the embodiments of the present disclosure, the scheme is described in detail taking a human body as an example; other situations are similar and are not described again.
Fig. 2 is a flowchart illustrating a training method of an image segmentation network according to an exemplary embodiment, wherein the image segmentation network to be trained includes a feature extraction module, an object position detection module, and an object edge detection module, and as shown in fig. 2, the method includes the following steps.
In step S21, the sample image, the reference position information and the reference edge information of the detection object in the sample image are acquired.
The reference position information of the detection object, and the position information in the subsequent content, may take various forms. For example, the reference position information may be the position information of the four vertices of the minimum bounding rectangle of the detection object, or it may be a position calibration image: a binary image in which the pixel value of a pixel belonging to the detection object is a first value, such as 255, and the pixel value of a pixel not belonging to the detection object is a second value, such as 0.
The reference edge information of the detection object, and the edge information in the subsequent content, may likewise take various forms. For example, the reference edge information may be the coordinates of all points on the edge line, or it may be an edge calibration image: a binary image in which the pixel value of a pixel on the edge of the detection object is a first value, such as 255, and the pixel value of a pixel not on the edge is a second value, such as 0.
In the embodiments of the present disclosure, the scheme is described in detail taking the reference position information and position information as position calibration images, and the reference edge information and edge information as edge calibration images; other cases are similar and are not repeated in this embodiment.
In implementation, the image segmentation network to be trained is trained on a large number of sample images so that it can segment any detection object from a target image. The sample images include images that contain the detection object and images that do not. During training, the reference position information and reference edge information of the detection object can be acquired for each sample image, for example by manual calibration.
For example, take a human body as the detection object. First, a large number of sample images are prepared, including images containing a human body and images not containing one. Pixels belonging to the human body and pixels not belonging to it are then marked manually in each picture, with the pixel value of pixels belonging to the human body set to 255 and that of the other pixels set to 0, yielding the position calibration image of the human body. Edge extraction is performed on this position calibration image with the Canny operator (a multi-stage edge detection algorithm) to determine the pixels on the edge of the human body; the pixel values on the edge are then set to 255 and all others to 0, yielding the edge calibration image of the human body.
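As an illustration of this labelling step, the sketch below derives the edge calibration image from a position calibration image using OpenCV's Canny operator; the file names and thresholds are assumptions, not part of the disclosure.

```python
# Sketch: derive the reference edge calibration image from a manually
# labelled position calibration image (body pixels 255, others 0).
import cv2

position_mask = cv2.imread("position_calibration.png", cv2.IMREAD_GRAYSCALE)
# The Canny operator outputs 255 on edge pixels and 0 elsewhere, which is
# exactly the binary edge calibration image described above.
edges = cv2.Canny(position_mask, threshold1=100, threshold2=200)
cv2.imwrite("edge_calibration.png", edges)
```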
In step S22, the sample image is input to the feature extraction module, and feature information of the sample image is obtained.
In implementation, the sample image may be input into the feature extraction module, which processes the image data and outputs the feature information of the sample image.
In step S23, based on the feature information and the object position detection module, position information of the detection object included in the sample image is determined, and based on the feature information and the object edge detection module, edge information of the detection object included in the sample image is determined.
In implementation, the feature information output by the feature extraction module may be input into the object position detection module to obtain initial position information of the detection object in the sample image. The initial position information is an initial position calibration image, and a technician may divide the pixel values in this image into ranges to determine the pixels belonging to the detection object. For example, two numerical ranges may be set; for each pixel position in the initial position calibration image, if its value falls in the first numerical range the pixel belongs to the detection object, and if it falls in the second numerical range the pixel does not. Similarly, the feature information may be input into the object edge detection module to obtain initial edge information of the detection object in the sample image. The initial edge information is an initial edge calibration image, and the pixel values in this image may likewise be divided into two ranges: a pixel whose value falls in the first range belongs to the edge of the detection object, and a pixel whose value falls in the second range does not.
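A small sketch of this range division, assuming NumPy arrays and an illustrative threshold of 128 separating the two ranges:

```python
# Sketch: split the initial position calibration image into two numerical
# ranges; pixels in the first range are taken to belong to the detection
# object. The threshold of 128 is an illustrative assumption.
import numpy as np

initial_position = np.random.rand(256, 256) * 255    # stand-in for the module output
belongs_to_object = initial_position >= 128           # first numerical range: [128, 255]
belongs_to_background = initial_position < 128        # second numerical range: [0, 128)
```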
Optionally, the image segmentation network further includes a Softmax processing module, and accordingly, the processing of step S23 may be as follows: inputting the characteristic information into an object position detection module to obtain initial position information of the detection object included in the sample image, and processing the initial position information of the detection object included in the sample image based on a Softmax processing module to obtain position information of the detection object included in the sample image; inputting the characteristic information into an object edge detection module to obtain initial edge information of the detection object included in the sample image, and processing the initial edge information of the detection object included in the sample image based on a Softmax processing module to obtain the edge information of the detection object included in the sample image.
In implementation, a Softmax processing module may further be provided in the image segmentation network. The Softmax processing module processes each pixel position in the initial position calibration image with a Softmax function, uniformly mapping the values of the pixel positions in the initial position information into a preset range, thereby obtaining the position information of the detection object in the sample image. Similarly, each pixel position in the initial edge calibration image can be processed with the Softmax function, uniformly mapping the values in the initial edge information into the preset range, thereby obtaining the edge information of the detection object in the sample image.
For example, the feature information extracted by the feature extraction module is input into the object position detection module to obtain an initial position calibration image of the human body in the sample image, which can be regarded as a matrix of different values. After processing by the Softmax function, the values are uniformly mapped into the range 0 to 1, giving the position calibration image of the human body. Most of the values in this image are close to either 0 or 1, so it can be regarded approximately as a binary image; the values may also be multiplied by 255 for display. Similarly, the feature information is input into the object edge detection module to obtain an initial edge calibration image of the human body, i.e. a matrix of initial edge information; after processing by the Softmax function, the values are uniformly mapped into the range 0 to 1, giving the edge calibration image of the human body. Again, most of the values are close to 0 or 1, so the edge calibration image can be regarded approximately as a binary image, and its values may be multiplied by 255 to obtain edge information in the same data format as the position information, facilitating subsequent computation.
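For illustration, here is a minimal sketch of this Softmax step in PyTorch; the assumption that the detection head outputs two channels per pixel (background, object) is ours, not the patent's.

```python
# Sketch: normalize the two-channel initial position calibration output so
# that each pixel's values lie in the 0-1 range and sum to 1, then scale
# the object channel by 255 for display.
import torch
import torch.nn.functional as F

initial_position = torch.randn(1, 2, 256, 256)       # [batch, {background, object}, H, W]
probabilities = F.softmax(initial_position, dim=1)   # per-pixel values in 0-1
object_probability = probabilities[:, 1]             # probability of belonging to the object
display_image = (object_probability * 255).byte()    # approximately binary calibration image
```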
Optionally, the Softmax processing module may instead be configured inside the object position detection module of the image segmentation network; that is, the object position detection module includes a Softmax processing module and an object position detection submodule. Accordingly, the process of step S23 may be as follows: input the feature information into the object position detection submodule to obtain the initial position information of the detection object included in the sample image, and process this initial position information with the Softmax processing module to obtain the position information of the detection object included in the sample image; input the feature information into the object edge detection submodule to obtain the initial edge information of the detection object included in the sample image, and process this initial edge information with the Softmax processing module to obtain the edge information of the detection object included in the sample image.
In implementation, the feature information of the sample image is input into the object position detection submodule to obtain the initial position information, which can then be processed by the Softmax processing module. Likewise, the feature information of the sample image is input into the object edge detection submodule to obtain the initial edge information, which can then be processed by the Softmax processing module. The manner in which the Softmax processing module processes the initial position information and the initial edge information is the same as described above and is not repeated here.
In step S24, the image segmentation network to be trained is trained based on the position information, the edge information, the reference position information, and the reference edge information, to obtain a trained image segmentation network.
In practice, the position detection module may be trained according to the difference between the obtained position information and the reference position information, and the edge detection module according to the difference between the obtained edge information and the reference edge information. The training process adjusts the parameter values in the object position detection module and the object edge detection module according to these differences, so that the differences become smaller and smaller.
Optionally, the image segmentation network to be trained may be trained as follows: determine a first loss value based on the difference between the position information and the reference position information, determine a second loss value based on the difference between the edge information and the reference edge information, sum the first loss value and the second loss value to obtain a third loss value, and adjust the model parameters of the feature extraction module, the object position detection module, and the object edge detection module based on the third loss value.
In implementation, after the position information and the edge information of the target object are obtained, the first loss value is computed from the position information and the reference position information, the second loss value is computed from the edge information and the reference edge information, and the two are summed to obtain the third loss value. The third loss value is back-propagated using gradient descent to train the feature extraction module, the object position detection module, and the object edge detection module, adjusting the parameters in each module. By repeatedly inputting a large number of samples into the image segmentation network and continuously adjusting the parameters in the three modules, the first and second loss values become smaller and smaller; when the detection accuracy of the image segmentation network reaches a certain threshold, such as 97%, training can end. The trained image segmentation network obtained in this way can segment any detection object from a target image.
For example, the obtained human body position information is compared with the position calibration image of the human body to obtain a cross entropy loss1 (the first loss value), and the obtained human body edge information is compared with the edge calibration image of the human body to obtain another cross entropy loss2 (the second loss value); loss1 and loss2 are summed to obtain loss (the third loss value). Using gradient descent, loss is substituted into the training function, and the parameters in the feature extraction module, the object position detection module, and the object edge detection module are adjusted according to the result. Other sample pictures can then be input into the network to obtain new loss values and further train the parameters in each module. Through repeated training, the parameters in the three modules are continuously adjusted, and when the detection accuracy of the image segmentation network reaches a certain threshold, such as 97%, training can end.
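Under the same assumptions as the architecture sketch above, one training iteration could look like the following. Note that PyTorch's CrossEntropyLoss applies the Softmax internally, so it is fed the raw logits; this is a sketch, not the patent's own implementation.

```python
# Sketch: one training step summing the position and edge cross entropies
# (loss1 + loss2 = loss) and back-propagating with gradient descent.
import torch
import torch.nn as nn

model = SegmentationNetwork()                              # from the sketch above
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)   # gradient descent
criterion = nn.CrossEntropyLoss()                          # per-pixel cross entropy

def train_step(image, position_target, edge_target):
    # position_target / edge_target: [batch, H, W] long tensors with values
    # in {0, 1}, derived from the position and edge calibration images.
    position_logits, edge_logits = model(image)
    loss1 = criterion(position_logits, position_target)    # first loss value
    loss2 = criterion(edge_logits, edge_target)            # second loss value
    loss = loss1 + loss2                                   # third loss value
    optimizer.zero_grad()
    loss.backward()                                        # back propagation
    optimizer.step()                                       # adjust all three modules
    return loss.item()
```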
Optionally, in practical application of the image segmentation network, the object edge detection module may be deleted from the network, or may be masked. Accordingly, after training is completed, the object edge detection module can be deleted from the trained image segmentation network.
In implementation, the third loss value is the sum of the first loss value, i.e. the difference between the position information output by the object position detection module and the reference position information, and the second loss value, i.e. the difference between the edge information output by the object edge detection module and the reference edge information. Because the image segmentation network is trained with the third loss value, the object position detection module is influenced by the second loss value: when detecting the position information of the object it also takes the object's edge information into account, and the final position information integrates both. In application, the object edge detection module may therefore be removed from the image segmentation network.
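A minimal sketch of discarding the edge module after training, using the names assumed in the architecture sketch above:

```python
# Sketch: remove the object edge detection module once training is finished,
# so inference adds no extra computation.
model = SegmentationNetwork()
# ... train as in the previous sketch ...
del model.edge_head   # nn.Module supports attribute deletion; alternatively,
                      # model.edge_head = torch.nn.Identity() keeps the interface
```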
In the present disclosure, when the image segmentation network is trained, an object edge detection module is added, so that the network is trained from two angles: the position information and the edge information of the detection object. Because the edge and the position of the detection object are directly related, this improves the accuracy with which the image segmentation network identifies the position of the object, and reduces the influence of image noise, complex backgrounds, and strong edges on the identification result, without increasing the amount of computation required to segment an image.
Fig. 3 is a flowchart illustrating an image segmentation method according to an exemplary embodiment. As shown in fig. 3, the method includes the following steps.
In step S31, a target image to be segmented is acquired.
In step S32, target position information of the detection object in the target image is determined based on the target image and the trained image segmentation network.
In practice, once the image segmentation network trained in step S24 can segment any detection object from a target image, it can be applied. In application, a target image is input into the image segmentation network, which outputs and highlights the detected object in the target image, achieving the image segmentation effect.
Optionally, in practical application of the image segmentation network, the object edge detection module may be deleted from the network, or may be masked. Accordingly, target feature information is determined based on the target image and the trained feature extraction module, and the target position information of the detection object in the target image is determined based on the target feature information and the trained object position detection module.
In practice, the object edge detection module may be deleted from the network or masked. The feature information extracted by the feature extraction module is input into the object position detection module to obtain the initial position information of the detection object included in the target image, where the initial position information is an initial position calibration image. According to the set numerical ranges, for each pixel position in the initial position calibration image, if its value falls in the first numerical range the pixel belongs to the detection object, and if it falls in the second numerical range the pixel does not.
Optionally, the image segmentation network further includes a Softmax processing module, and when determining the target position information of the detection object in the target image, the following processing manner may also be adopted: and inputting the target characteristic information into the trained object position detection module to obtain the initial position information of the detection object included in the target image. And processing the initial position information of the detection object included in the target image based on a Softmax processing module to obtain the position information of the detection object included in the target image.
In implementation, the image segmentation network further includes a Softmax processing module, which processes the initial position information of the detection object with a Softmax function, uniformly mapping the values of the pixel positions in the initial position information into a preset range, thereby obtaining the position information of the detection object included in the target image.
For example, the target image is input into the trained image segmentation network: the feature extraction module extracts the feature information of the image, which is then input into the object position detection module to obtain an initial position calibration image of the human body in the target image. This initial position calibration image is processed by the Softmax function, which maps each pixel value into the range 0 to 1; pixels whose value is close to 1 belong to the human body, so the position information of the human body in the image is obtained. Fig. 4a shows the segmentation result of a general image segmentation network on a human body, and fig. 4b shows the segmentation result of the image segmentation network provided by the present disclosure.
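For illustration, an inference sketch under the same assumptions as the earlier code: only the feature extraction and position detection modules are run, followed by the Softmax and a 0.5 cutoff (the cutoff value is an assumption, not from the disclosure).

```python
# Sketch: segment a target image with the trained network after the edge
# detection module has been removed.
import torch
import torch.nn.functional as F

@torch.no_grad()
def segment(model, target_image):
    # target_image: [1, 3, H, W] float tensor
    features = model.feature_extractor(target_image)     # target feature information
    initial_position = model.position_head(features)     # initial position information
    probabilities = F.softmax(initial_position, dim=1)   # values mapped into 0-1
    return probabilities[:, 1] > 0.5                     # pixels belonging to the object
```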
Optionally, the Softmax processing module may instead be configured inside the object position detection module of the image segmentation network; that is, the object position detection module includes a Softmax processing module and an object position detection submodule. Correspondingly, the target feature information is input into the trained object position detection submodule to obtain the initial position information of the detection object included in the target image, and this initial position information is processed by the Softmax processing module to obtain the position information of the detection object included in the target image.
In implementation, the feature information of the target image is input into the object position detection submodule to obtain the initial position information, which can then be processed by the Softmax processing module. The manner in which the Softmax processing module processes the initial position information is the same as described above and is not repeated here.
In the present disclosure, when the image segmentation network is trained, an object edge detection module is added, so that the network is trained from two angles: the position information and the edge information of the detection object. Because the edge and the position of the detection object are directly related, this improves the accuracy with which the image segmentation network identifies the position of the object, and reduces the influence of image noise, complex backgrounds, and strong edges on the identification result, without increasing the amount of computation required to segment an image.
FIG. 5 is a block diagram illustrating a training apparatus for an image segmentation network according to an example embodiment. Referring to fig. 5, the apparatus includes an acquisition unit 151, an input unit 152, a determination unit 153, and a training unit 154.
An obtaining unit 151 configured to obtain a sample image, reference position information and reference edge information of a detection object in the sample image, and may specifically implement the obtaining function in step S21, and other implicit steps;
an input unit 152, configured to input the sample image into the feature extraction module, to obtain feature information of the sample image, which may specifically implement the input function in step S22 described above, and other implicit steps;
the determining unit 153 is configured to determine the position information of the detection object included in the sample image based on the feature information and the object position detection module, and determine the edge information of the detection object included in the sample image based on the feature information and the object edge detection module, and may specifically implement the determining function in step S23, and other implicit steps;
the training unit 154 is configured to train the image segmentation network to be trained based on the position information, the edge information, the reference position information, and the reference edge information, so as to obtain a trained image segmentation network, which may specifically implement the training function in step S24, and other implicit steps.
Optionally, the training unit 154 is configured to:
determining a first loss value based on a difference of the position information and the reference position information, and determining a second loss value based on a difference of the edge information and the reference edge information;
and training the image segmentation network to be trained based on the first loss value and the second loss value.
Optionally, the training unit 154 is configured to:
summing the first loss value and the second loss value to obtain a third loss value;
adjusting model parameters of the feature extraction module, the object position detection module, and the object edge detection module based on the third loss value.
Optionally, the image segmentation network further includes a Softmax processing module, and the determining unit 153 is configured to:
inputting the characteristic information into an object position detection module to obtain initial position information of the detection object included in the sample image, and processing the initial position information of the detection object included in the sample image based on a Softmax processing module to obtain position information of the detection object included in the sample image;
inputting the characteristic information into an object edge detection module to obtain initial edge information of the detection object included in the sample image, and processing the initial edge information of the detection object included in the sample image based on a Softmax processing module to obtain the edge information of the detection object included in the sample image.
Optionally, the apparatus further includes a deleting unit, where the deleting unit is configured to:
and deleting the object edge detection module in the trained image segmentation network.
FIG. 6 is a block diagram illustrating an apparatus for image segmentation according to an exemplary embodiment. Referring to fig. 6, the apparatus includes an acquisition unit 161 and a determination unit 162.
an obtaining unit 161, configured to obtain a target image to be segmented, which may specifically implement the obtaining function in step S31 above, and other implicit steps;
the determining unit 162 is configured to determine the target position information of the detected object in the target image based on the target image and the trained image segmentation network as described above, and may specifically implement the determining function in step S32 described above, and other implicit steps.
Optionally, the determining unit 162 is configured to:
and determining target characteristic information based on the target image and the trained characteristic extraction module, and determining target position information of the detection object in the target image based on the target characteristic information and the trained object position detection module.
Optionally, the image segmentation network further includes a Softmax processing module, and the determining unit 162 is further configured to:
inputting the target characteristic information into a trained object position detection module to obtain initial position information of a detection object included in the target image;
and processing the initial position information of the detection object included in the target image based on a Softmax processing module to obtain the position information of the detection object included in the target image.
It should be noted that: in the training apparatus for an image segmentation network provided in the above embodiment, when the image segmentation network is trained, or when the image segmentation apparatus provided in the above embodiment performs image segmentation, only the division of each functional unit is illustrated, and in practical applications, the function distribution may be completed by different functional units, that is, the internal structure of the apparatus may be divided into different functional units, as needed, to complete all or part of the functions described above. In addition, with regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated herein.
Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure. The electronic device 700 may vary considerably in configuration and performance, and may include one or more processors (CPUs) 701 and one or more memories 702, where the memory 702 stores at least one instruction that is loaded and executed by the processor 701 to implement the methods described above.
In an exemplary embodiment, a storage medium comprising instructions, such as the memory 702 comprising instructions, executable by the processor 701 of the electronic device 700 to perform the above-described method is also provided. Alternatively, the storage medium may be a non-transitory computer readable storage medium, which may be, for example, a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (10)

1. A training method of an image segmentation network is characterized in that the image segmentation network to be trained comprises a feature extraction module, an object position detection module and an object edge detection module, and the method comprises the following steps:
acquiring a sample image, and reference position information and reference edge information of a detection object in the sample image;
inputting the sample image into the feature extraction module to obtain feature information of the sample image;
determining position information of a detection object included in the sample image based on the feature information and the object position detection module, and determining edge information of the detection object included in the sample image based on the feature information and the object edge detection module;
and training the image segmentation network to be trained based on the position information, the edge information, the reference position information and the reference edge information to obtain the trained image segmentation network.
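For illustration only, the three-module arrangement recited in claim 1 can be sketched as a shared backbone feeding two heads. The framework (PyTorch), class and attribute names, layer counts, and channel widths below are assumptions made for the sketch, not the claimed implementation:

```python
import torch
import torch.nn as nn

class SegmentationNetwork(nn.Module):
    """Hypothetical layout: a shared feature extraction module feeding an
    object position detection head and an object edge detection head."""
    def __init__(self, in_channels=3, feat_channels=64, num_classes=2):
        super().__init__()
        # Feature extraction module (backbone depth/widths are assumptions)
        self.feature_extractor = nn.Sequential(
            nn.Conv2d(in_channels, feat_channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(feat_channels, feat_channels, 3, padding=1),
            nn.ReLU(inplace=True),
        )
        # Object position detection module: per-pixel class logits
        self.position_head = nn.Conv2d(feat_channels, num_classes, 1)
        # Object edge detection module: per-pixel edge logits
        self.edge_head = nn.Conv2d(feat_channels, 1, 1)

    def forward(self, x):
        feats = self.feature_extractor(x)      # feature information
        position = self.position_head(feats)   # position information
        edge = self.edge_head(feats)           # edge information
        return position, edge
```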
2. The method of claim 1, wherein the step of training the image segmentation network to be trained based on the position information, the edge information, the reference position information, and the reference edge information comprises:
determining a first loss value based on a difference of the position information and the reference position information, and determining a second loss value based on a difference of the edge information and the reference edge information;
and training the image segmentation network to be trained based on the first loss value and the second loss value.
3. The method of claim 2, wherein the step of training the image segmentation network to be trained based on the first loss value and the second loss value comprises:
summing the first loss value and the second loss value to obtain a third loss value;
adjusting model parameters of the feature extraction module, the object position detection module, and the object edge detection module based on the third loss value.
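A minimal sketch of the loss computation in claims 2 and 3, continuing the hypothetical SegmentationNetwork above. The specific loss functions (cross-entropy for position, binary cross-entropy for edges) and the use of a gradient-based optimizer are assumptions; the claims only require losses derived from the respective differences:

```python
import torch.nn as nn

position_criterion = nn.CrossEntropyLoss()  # assumption: position loss
edge_criterion = nn.BCEWithLogitsLoss()     # assumption: edge loss

def training_step(model, optimizer, image, ref_position, ref_edge):
    # ref_position: (N, H, W) long tensor; ref_edge: (N, 1, H, W) float tensor
    position, edge = model(image)
    first_loss = position_criterion(position, ref_position)  # claim 2: first loss value
    second_loss = edge_criterion(edge, ref_edge)              # claim 2: second loss value
    third_loss = first_loss + second_loss                     # claim 3: summed third loss value
    optimizer.zero_grad()
    third_loss.backward()  # gradients flow back through all three modules
    optimizer.step()       # claim 3: adjust the model parameters
    return third_loss.item()
```

Because the two heads share the feature extractor, backpropagating the summed loss adjusts the feature extraction, object position detection, and object edge detection modules jointly, which is the parameter adjustment recited in claim 3.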
4. The method according to any one of claims 1 to 3, wherein after the training of the image segmentation network to be trained based on the position information, the edge information, the reference position information, and the reference edge information to obtain a trained image segmentation network, the method further comprises:
and deleting the object edge detection module in the trained image segmentation network.
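Under this scheme the edge branch serves only as training-time auxiliary supervision, so the deletion step of claim 4 leaves a lighter network for deployment. A sketch, reusing the hypothetical class above (the wrapper class name is invented for illustration):

```python
import torch.nn as nn

class InferenceNetwork(nn.Module):
    """Trained network with the object edge detection module removed."""
    def __init__(self, trained):
        super().__init__()
        self.feature_extractor = trained.feature_extractor
        self.position_head = trained.position_head
        # edge_head intentionally not kept: it only provided auxiliary
        # edge supervision during training (claim 4)

    def forward(self, x):
        feats = self.feature_extractor(x)
        return self.position_head(feats)  # target position information only
```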
5. A method of image segmentation, the method comprising:
acquiring a target image to be segmented;
determining target position information of a detection object in the target image based on the target image and an image segmentation network trained by the method of any one of claims 1 to 4.
6. The method of claim 5, wherein the determining target position information of a detection object in the target image based on the target image and the image segmentation network trained by the method of any one of claims 1 to 4 comprises:
and determining target characteristic information based on the target image and the trained characteristic extraction module, and determining target position information of the detection object in the target image based on the target characteristic information and the trained object position detection module.
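An end-to-end usage sketch of the inference path in claims 5 and 6, reusing the hypothetical classes above; the input size and the argmax decoding of the position logits are assumptions:

```python
import torch

model = SegmentationNetwork()
# ... a training loop calling training_step(...) would run here ...
deploy = InferenceNetwork(model).eval()   # claim 4: edge module deleted

image = torch.randn(1, 3, 256, 256)       # stand-in for a target image to be segmented
with torch.no_grad():
    logits = deploy(image)                 # target feature -> position information
    mask = logits.argmax(dim=1)            # per-pixel target position information
```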
7. An apparatus for training an image segmentation network, wherein the image segmentation network to be trained includes a feature extraction module, an object position detection module, and an object edge detection module, the apparatus comprising:
an acquisition unit configured to acquire a sample image, and reference position information and reference edge information of a detection object in the sample image;
an input unit configured to input the sample image into the feature extraction module to obtain feature information of the sample image;
a determination unit configured to determine position information of a detection object included in the sample image based on the feature information and the object position detection module, and determine edge information of the detection object included in the sample image based on the feature information and the object edge detection module;
and a training unit configured to train the image segmentation network to be trained based on the position information, the edge information, the reference position information, and the reference edge information to obtain a trained image segmentation network.
8. An apparatus for image segmentation, the apparatus comprising:
an acquisition unit configured to acquire a target image to be segmented;
a determination unit configured to determine target position information of a detection object in the target image based on the target image and an image segmentation network trained by the apparatus of claim 7.
9. An electronic device, comprising:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the method of training an image segmentation network according to any one of claims 1 to 4 or a method of image segmentation according to any one of claims 5 to 6.
10. A storage medium in which instructions, when executed by a processor of an electronic device, enable the electronic device to perform a method of training an image segmentation network as claimed in any one of claims 1 to 4 or a method of image segmentation as claimed in any one of claims 5 to 6.
CN201910901524.0A 2019-09-23 2019-09-23 Image segmentation method and device, electronic equipment and storage medium Active CN110599514B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910901524.0A CN110599514B (en) 2019-09-23 2019-09-23 Image segmentation method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910901524.0A CN110599514B (en) 2019-09-23 2019-09-23 Image segmentation method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110599514A true CN110599514A (en) 2019-12-20
CN110599514B CN110599514B (en) 2022-10-04

Family

ID=68862492

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910901524.0A Active CN110599514B (en) 2019-09-23 2019-09-23 Image segmentation method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110599514B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108229504A (en) * 2018-01-29 2018-06-29 深圳市商汤科技有限公司 Method for analyzing image and device
CN109360222A (en) * 2018-10-25 2019-02-19 北京达佳互联信息技术有限公司 Image partition method, device and storage medium
CN110059769A (en) * 2019-04-30 2019-07-26 福州大学 The semantic segmentation method and system rebuild are reset based on pixel for what streetscape understood
CN110059698A (en) * 2019-04-30 2019-07-26 福州大学 The semantic segmentation method and system based on the dense reconstruction in edge understood for streetscape

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
DONGCAI CHENG ET AL.: "FusionNet: Edge Aware Deep Convolutional Networks for Semantic Segmentation of Remote Sensing Harbor Images", 《IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING》 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111402164A (en) * 2020-03-18 2020-07-10 北京市商汤科技开发有限公司 Training method and device for correcting network model, and text recognition method and device
CN111402164B (en) * 2020-03-18 2023-10-24 北京市商汤科技开发有限公司 Training method and device for correction network model, text recognition method and device
CN113537209A (en) * 2021-06-02 2021-10-22 浙江吉利控股集团有限公司 Image processing method, device, equipment and computer readable storage medium
WO2024021825A1 (en) * 2022-07-26 2024-02-01 腾讯科技(深圳)有限公司 Physiological image processing method and apparatus, model training method and apparatus, and device and medium

Also Published As

Publication number Publication date
CN110599514B (en) 2022-10-04

Similar Documents

Publication Publication Date Title
US10534957B2 (en) Eyeball movement analysis method and device, and storage medium
US6661907B2 (en) Face detection in digital images
CN109753928B (en) Method and device for identifying illegal buildings
CN110599514B (en) Image segmentation method and device, electronic equipment and storage medium
CN108734185B (en) Image verification method and device
JP2017531883A (en) Method and system for extracting main subject of image
US9767383B2 (en) Method and apparatus for detecting incorrect associations between keypoints of a first image and keypoints of a second image
CN109255802B (en) Pedestrian tracking method, device, computer equipment and storage medium
CN112884782B (en) Biological object segmentation method, apparatus, computer device, and storage medium
US12014498B2 (en) Image enhancement processing method, device, equipment, and medium based on artificial intelligence
CN111126393A (en) Vehicle appearance refitting judgment method and device, computer equipment and storage medium
CN109255792B (en) Video image segmentation method and device, terminal equipment and storage medium
CN111368698A (en) Subject recognition method, subject recognition device, electronic device, and medium
CN112991159B (en) Face illumination quality evaluation method, system, server and computer readable medium
CN108647605B (en) Human eye gaze point extraction method combining global color and local structural features
CN111582278B (en) Portrait segmentation method and device and electronic equipment
CN110751163B (en) Target positioning method and device, computer readable storage medium and electronic equipment
CN111353330A (en) Image processing method, image processing device, electronic equipment and storage medium
CN116052230A (en) Palm vein recognition method, device, equipment and storage medium
CN113255549B (en) Intelligent recognition method and system for behavior state of wolf-swarm hunting
CN115311680A (en) Human body image quality detection method and device, electronic equipment and storage medium
CN108549855A (en) Real-time humanoid detection method towards Intelligent household scene
CN112183383A (en) Information processing method and device for measuring face image in complicated environment
CN109934045B (en) Pedestrian detection method and device
CN113761987A (en) Pedestrian re-identification method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant