WO2022105125A1 - Image segmentation method and apparatus, computer device, and storage medium - Google Patents


Info

Publication number
WO2022105125A1
WO2022105125A1 PCT/CN2021/090817 CN2021090817W WO2022105125A1 WO 2022105125 A1 WO2022105125 A1 WO 2022105125A1 CN 2021090817 W CN2021090817 W CN 2021090817W WO 2022105125 A1 WO2022105125 A1 WO 2022105125A1
Authority
WO
WIPO (PCT)
Prior art keywords
layer
network
image
result
training
Prior art date
Application number
PCT/CN2021/090817
Other languages
French (fr)
Chinese (zh)
Inventor
汪淼
Original Assignee
平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Priority date
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Publication of WO2022105125A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/136 Segmentation; Edge detection involving thresholding

Definitions

  • the present application relates to the technical field of artificial intelligence, and in particular, to an image segmentation method, apparatus, computer equipment and storage medium.
  • Synthetic aperture radar (SAR), as an imaging radar with high range resolution and azimuth resolution, has a wide range of applications in military and civilian fields. Detecting the target of interest in a SAR image and segmenting it from the background according to the contour of the target lays the foundation for subsequent understanding, analysis and planning.
  • At present, common segmentation methods include the maximum inter-class variance method, edge detection algorithms based on local hybrid filtering, and the bias-corrected fuzzy c-means algorithm.
  • A research focus in recent years is deep-learning-based segmentation, which learns image features through deep neural networks; such highly abstract features are more conducive to image segmentation.
  • The inventor realized that this approach classifies pixels through an end-to-end deep neural network; its disadvantage, however, is that linear interpolation is used, so detailed structural information is lost during segmentation, resulting in blurred boundaries.
  • Although the pooling layer expands the receptive field, it causes the loss of position information, which often needs to be preserved during semantic segmentation. As a result, information extraction during image segmentation ends up being inaccurate.
  • the purpose of the embodiments of the present application is to provide an image segmentation method, apparatus, computer device and storage medium, so as to solve the technical problem that information extraction during image segmentation is not sufficiently accurate.
  • the embodiments of the present application provide an image segmentation method, which adopts the following technical solutions:
  • acquiring a target image, and performing two-layer wavelet decomposition on the target image to obtain a multi-dimensional image block;
  • acquiring a preset atrous convolutional neural network, wherein the atrous convolutional neural network includes a first-layer network and a second-layer network, encoding the multi-dimensional image block based on an encoder in the first-layer network to obtain an encoding result, and decoding the encoding result based on a decoder of the first-layer network to obtain a binary segmentation result map of the target image;
  • Multi-layer convolution calculation is performed on the binary segmentation result graph based on the second-layer network to obtain a semantic segmentation result graph of the target image.
  • the embodiments of the present application also provide an image segmentation device, which adopts the following technical solutions:
  • a decomposition module used for acquiring a target image, and performing two-layer wavelet decomposition on the target image to obtain a multi-dimensional image block;
  • the processing module is configured to obtain a preset atrous convolutional neural network, wherein the atrous convolutional neural network includes a first-layer network and a second-layer network; based on the encoder in the first-layer network, the multi-dimensional image block is subjected to encoding processing to obtain an encoding result, and a decoder of the first-layer network performs decoding processing on the encoding result to obtain a binary segmentation result map of the target image;
  • the computing module is configured to perform multi-layer convolution calculation on the binary segmentation result graph based on the second-layer network to obtain the semantic segmentation result graph of the target image.
  • an embodiment of the present application further provides a computer device, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor; when the processor executes the computer-readable instructions, the following steps are implemented:
  • acquiring a target image, and performing two-layer wavelet decomposition on the target image to obtain a multi-dimensional image block;
  • acquiring a preset atrous convolutional neural network, wherein the atrous convolutional neural network includes a first-layer network and a second-layer network, encoding the multi-dimensional image block based on an encoder in the first-layer network to obtain an encoding result, and decoding the encoding result based on a decoder of the first-layer network to obtain a binary segmentation result map of the target image;
  • Multi-layer convolution calculation is performed on the binary segmentation result graph based on the second-layer network to obtain a semantic segmentation result graph of the target image.
  • an embodiment of the present application further provides a computer-readable storage medium, where the computer-readable storage medium stores computer-readable instructions, and when the computer-readable instructions are executed by a processor, the processor performs the following steps:
  • acquiring a target image, and performing two-layer wavelet decomposition on the target image to obtain a multi-dimensional image block;
  • acquiring a preset atrous convolutional neural network, wherein the atrous convolutional neural network includes a first-layer network and a second-layer network, encoding the multi-dimensional image block based on an encoder in the first-layer network to obtain an encoding result, and decoding the encoding result based on a decoder of the first-layer network to obtain a binary segmentation result map of the target image;
  • Multi-layer convolution calculation is performed on the binary segmentation result graph based on the second-layer network to obtain a semantic segmentation result graph of the target image.
  • The above image segmentation method acquires a target image and performs two-layer wavelet decomposition on it to obtain a multi-dimensional image block; the decomposed multi-dimensional image block can improve the accuracy of image processing.
  • A preset atrous convolutional neural network is then obtained, where the atrous convolutional neural network includes a first-layer network and a second-layer network.
  • The encoder in the first-layer network encodes the multi-dimensional image block to obtain an encoding result, and the decoder of the first-layer network decodes the encoding result to obtain a binary segmentation result map of the target image. Processing the multi-dimensional image block with the preset atrous convolutional neural network increases the receptive field within the controllable range of the network parameters, so that each feature map contains more and more information, which helps extract global image information and avoids the loss of image information.
  • Finally, multi-layer convolution calculation is performed on the binary segmentation result map based on the second-layer network to obtain the semantic segmentation result map of the target image.
  • FIG. 1 is an exemplary system architecture diagram to which the present application can be applied;
  • FIG. 2 is a flowchart of an embodiment of an image segmentation method according to the present application.
  • FIG. 3 is a schematic structural diagram of an embodiment of an image segmentation apparatus according to the present application.
  • FIG. 4 is a schematic structural diagram of an embodiment of a computer device according to the present application.
  • Reference numerals: image segmentation device 300, decomposition module 301, processing module 302, and calculation module 303.
  • the system architecture 100 may include terminal devices 101 , 102 , and 103 , a network 104 and a server 105 .
  • the network 104 is a medium used to provide a communication link between the terminal devices 101 , 102 , 103 and the server 105 .
  • the network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
  • the user can use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 to receive or send messages and the like.
  • Various communication client applications may be installed on the terminal devices 101 , 102 and 103 , such as web browser applications, shopping applications, search applications, instant messaging tools, email clients, social platform software, and the like.
  • the terminal devices 101, 102, and 103 can be various electronic devices that have a display screen and support web browsing, including but not limited to smart phones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop computers, desktop computers, and the like.
  • the server 105 may be a server that provides various services, such as a background server that provides support for the pages displayed on the terminal devices 101 , 102 , and 103 .
  • the image segmentation method provided by the embodiments of the present application is generally performed by a server/terminal device, and accordingly, an image segmentation apparatus is generally set in the server/terminal device.
  • terminal devices, networks and servers in FIG. 1 are merely illustrative. There can be any number of terminal devices, networks and servers according to implementation needs.
  • the described image segmentation method includes the following steps:
  • Step S201 acquiring a target image, and performing two-layer wavelet decomposition on the target image to obtain a multi-dimensional image block;
  • a target image is acquired, and the target image is an image including target segmentation information.
  • the image is subjected to two-layer wavelet decomposition.
  • A wavelet is usually a signal whose local features are nonzero only within a limited interval.
  • The first layer of wavelet decomposition decomposes the image into low-frequency information and high-frequency information.
  • High-frequency information is the part of the image where the intensity changes sharply, such as the image contours; low-frequency information is the part where the intensity changes gently, such as large color blocks in the image.
  • The low-frequency information is then decomposed again into low-frequency and high-frequency information, which constitutes the second level of the wavelet decomposition.
  • In this way, the target image is decomposed by the two-layer wavelet transform to obtain multi-dimensional image blocks.
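  • As an illustration of the two-layer wavelet decomposition described above, the following sketch uses the PyWavelets library (the patent does not name a library, and the choice of the Haar wavelet is likewise an assumption); it decomposes an image once, decomposes the resulting low-frequency part again, and returns the sub-bands that can then be arranged into a multi-dimensional image block:

```python
# Minimal sketch, not the patent's exact implementation: PyWavelets and the 'haar'
# wavelet are assumptions made for illustration only.
import numpy as np
import pywt

def two_level_wavelet_decomposition(image: np.ndarray):
    # Level 1: low-frequency approximation cA1 and high-frequency details (cH1, cV1, cD1),
    # e.g. image contours live in the detail sub-bands, large flat regions in cA1.
    cA1, details1 = pywt.dwt2(image, "haar")
    # Level 2: decompose the low-frequency part again.
    cA2, details2 = pywt.dwt2(cA1, "haar")
    # The seven sub-bands (cA2, three level-2 details, three level-1 details) can be
    # stacked along an extra axis to form the multi-dimensional image block.
    return cA2, details2, details1

cA2, (cH2, cV2, cD2), (cH1, cV1, cD1) = two_level_wavelet_decomposition(
    np.random.rand(128, 128).astype(np.float32))
print(cA2.shape, cH1.shape)  # (32, 32) (64, 64)
```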
  • Step S202, obtaining a preset atrous convolutional neural network, wherein the atrous convolutional neural network includes a first-layer network and a second-layer network, encoding the multi-dimensional image block based on the encoder in the first-layer network to obtain an encoding result, and decoding the encoding result based on the decoder of the first-layer network to obtain a binary segmentation result map of the target image;
  • Specifically, a preset atrous convolutional neural network is obtained, wherein the atrous convolutional neural network includes a first-layer network and a second-layer network, and the first-layer network includes an encoder and a decoder.
  • The encoder includes three first convolutional layers, three first atrous convolutional layers and two pooling layers, and the multi-dimensional image block is encoded by the encoder.
  • The decoder includes two upsampling layers, two second convolutional layers and two second atrous convolutional layers; the encoding result output by the encoder is decoded by the decoder to finally obtain a binary segmentation result map.
  • the second layer network includes multiple convolution layers.
  • the binary segmentation result map corresponding to the target image can be obtained, and according to the second layer network, the multi-layer convolution calculation can be performed on the obtained binary segmentation result map to obtain the semantic segmentation map corresponding to the target image.
  • Step S203 performing multi-layer convolution calculation on the binary segmentation result graph based on the second-layer network to obtain the semantic segmentation result graph of the target image.
  • Specifically, the second-layer network includes a third convolutional layer, a third atrous convolutional layer and a fourth convolutional layer.
  • The first convolution result of the first-layer network is obtained, where the first convolution result is produced by applying a further convolution calculation to the first sub-atrous-convolution result obtained from the first first atrous convolution in the encoder of the first-layer network. The first convolution result is multiplied with the binary segmentation result map to obtain a multiplication result, and the multiplication result is input to the third convolutional layer.
  • Following the order of the third convolutional layer, the third atrous convolutional layer and the fourth convolutional layer, the output of each layer is used as the input of the next layer, and the final semantic segmentation result map, i.e. the final segmentation result map of the target image, is calculated.
  • the above-mentioned semantic segmentation result graph information may also be stored in a node of a blockchain.
  • The blockchain referred to in this application is a new application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and encryption algorithms.
  • A blockchain is essentially a decentralized database: a chain of data blocks associated with one another by cryptographic methods. Each data block contains a batch of network transaction information, used to verify the validity of the information (anti-counterfeiting) and to generate the next block.
  • the blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.
  • This embodiment realizes that more image information can be obtained during image segmentation, improves the accuracy of image signal description when extracting local feature information, and greatly increases the receptive field within the controllable range of network parameters.
  • the amount of information contained in each feature is further improved, which further makes the segmentation of image information more accurate, and the obtained image information is more complete.
  • In some embodiments of this application, the encoder includes a first convolutional layer, a first atrous convolutional layer, and a pooling layer, and encoding the multi-dimensional image block by the encoder in the first-layer network to obtain the encoding result includes:
  • passing the multi-dimensional image block sequentially through the first convolutional layer, the first atrous convolutional layer and the pooling layer to obtain a pooling result; and
  • performing dropout on the pooling result through a preset dropout layer to obtain the encoding result corresponding to the multi-dimensional image block.
  • the encoder in the first-layer network includes a first convolutional layer, a first atrous convolutional layer, and a pooling layer.
  • When the multi-dimensional image block is obtained, it is convolved and activated by the first convolutional layer to obtain a first sub-convolution result; the first sub-convolution result is then convolved and activated by the first atrous convolutional layer to obtain a first sub-atrous-convolution result; finally, the first sub-atrous-convolution result is processed by the pooling layer to obtain a sub-pooling result.
  • The first convolutional layer, the first atrous convolutional layer and the pooling layer are all multi-dimensional operators, such as three-dimensional convolution (conv 3*3*3), three-dimensional atrous convolution (3-dilated conv 3*3*3) and three-dimensional pooling (max pool 2*2*1).
  • The results computed by the first convolutional layer and the first atrous convolutional layer are each processed by a ReLU activation function, finally yielding the first sub-convolution result and the first sub-atrous-convolution result, respectively.
  • The sub-pooling result is then used as the input of the second first convolutional layer for further encoding, and, with the output of each layer serving as the input of the next layer, the final pooling result is calculated.
  • When the pooling result is obtained, it is processed by a preset dropout layer to obtain the encoding result corresponding to the multi-dimensional image block.
  • The dropout layer here includes a first convolutional layer, a first atrous convolutional layer and a dropout sublayer (dropout 0.5).
  • The pooling result is used as the input of the first convolutional layer in the dropout layer.
  • The output of each layer is used as the input of the next layer, and the encoding result corresponding to the multi-dimensional image block is calculated.
  • The multi-dimensional image block is thus encoded by the encoder, which further improves the accuracy of the image processing; the receptive field is increased through the atrous convolutions, thereby increasing the amount of information contained in the output image.
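  • A hedged sketch of such an encoder is given below in PyTorch (an assumption, since the embodiment does not name a framework): it stacks conv 3*3*3 and 3-dilated conv 3*3*3 layers with ReLU activations, max pool 2*2*1 between stages, and dropout 0.5 at the end; the channel widths, padding and the dilation rate of 3 are inferred for illustration only.

```python
# Hedged sketch of the first-layer encoder; channel sizes, padding and dilation=3 are
# assumptions, only the kernel shapes (3*3*3 convs, 2*2*1 pooling, dropout 0.5) come
# from the text above.
import torch
import torch.nn as nn

class EncoderStage(nn.Module):
    """conv 3*3*3 + ReLU followed by 3-dilated conv 3*3*3 + ReLU (spatial size preserved)."""
    def __init__(self, in_ch: int, out_ch: int, dilation: int = 3):
        super().__init__()
        self.conv = nn.Conv3d(in_ch, out_ch, kernel_size=3, padding=1)
        self.atrous = nn.Conv3d(out_ch, out_ch, kernel_size=3,
                                padding=dilation, dilation=dilation)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        sub_conv = self.relu(self.conv(x))             # sub-convolution result
        sub_atrous = self.relu(self.atrous(sub_conv))  # sub-atrous-convolution result
        return sub_conv, sub_atrous

class FirstLayerEncoder(nn.Module):
    """Three conv/atrous stages, two 2*2*1 max-pooling layers, dropout 0.5 at the end."""
    def __init__(self, in_ch: int = 1, ch=(16, 32, 64)):
        super().__init__()
        self.stage1 = EncoderStage(in_ch, ch[0])
        self.stage2 = EncoderStage(ch[0], ch[1])
        self.stage3 = EncoderStage(ch[1], ch[2])          # stage inside the dropout block
        self.pool = nn.MaxPool3d(kernel_size=(2, 2, 1))   # max pool 2*2*1
        self.dropout = nn.Dropout3d(p=0.5)

    def forward(self, x):
        _, a1 = self.stage1(x)
        p1 = self.pool(a1)              # sub-pooling result
        _, a2 = self.stage2(p1)
        p2 = self.pool(a2)              # final pooling result
        _, a3 = self.stage3(p2)
        encoding = self.dropout(a3)     # encoding result after dropout 0.5
        return encoding, a1, a2         # a1, a2 kept as skip features for the decoder
```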
  • In some embodiments of this application, the decoder includes an upsampling layer, a second convolutional layer, and a second atrous convolutional layer, and decoding the encoding result by the decoder of the first-layer network to obtain the binary segmentation result map of the target image includes:
  • calculating the encoding result through the upsampling layer, the second convolutional layer and the second atrous convolutional layer to obtain an atrous convolution result; and
  • calculating the atrous convolution result with a preset activation function to obtain the binary segmentation result map of the target image.
  • the decoder includes an upsampling layer, a second convolutional layer, and a second atrous convolutional layer.
  • The encoding result is first processed by the first upsampling layer in the decoder to obtain a first upsampling result, which is spliced with the corresponding feature map from the encoder to obtain a first splicing result. The first splicing result is used as the input of the second convolutional layer and, following the order of the second convolutional layer and the second atrous convolutional layer, the output of each layer is used as the input of the next layer to calculate a second sub-atrous-convolution result.
  • The second sub-atrous-convolution result is processed by the second upsampling layer to obtain a second upsampling result, which is spliced with the result of the first first atrous convolution in the encoder to obtain a second splicing result. The second splicing result is then passed through the second second convolutional layer in the decoder and, again following the order of the second convolutional layer and the second atrous convolutional layer, the output of each layer is used as the input of the next layer to obtain the final atrous convolution result.
  • The final atrous convolution result is then calculated with a preset activation function (such as a sigmoid function) to obtain the binary segmentation result map of the target image.
  • The upsampling layer, the second convolutional layer and the second atrous convolutional layer are also multi-dimensional operators: the upsampling layer can be computed as up-conv 2*2*1, and the second convolutional layer and the second atrous convolutional layer take the same form as the first convolutional layer and the first atrous convolutional layer.
  • Likewise, the results of the second convolutional layer and the second atrous convolutional layer are each processed by a ReLU activation function to obtain the final second sub-convolution result and second sub-atrous-convolution result, respectively.
  • The decoder thus processes the encoding result to obtain a binary segmentation result map, which enables efficient image segmentation and increases both the amount of information contained in the binary segmentation result map and the accuracy of image segmentation.
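  • Continuing the encoder sketch above (and reusing its imports and EncoderStage block), a possible decoder could look as follows; the U-Net-style skip connections, channel widths and the final 1*1*1 projection before the sigmoid are assumptions consistent with the splicing steps described in this embodiment:

```python
# Hedged sketch of the first-layer decoder: up-conv 2*2*1, splicing with encoder features,
# second conv / second atrous conv stages, and a sigmoid to produce the binary map.
class FirstLayerDecoder(nn.Module):
    def __init__(self, ch=(64, 32, 16), out_ch: int = 1):
        super().__init__()
        # up-conv 2*2*1 implemented here as a transposed 3-D convolution (an assumption)
        self.up1 = nn.ConvTranspose3d(ch[0], ch[1], kernel_size=(2, 2, 1), stride=(2, 2, 1))
        self.stage1 = EncoderStage(ch[1] * 2, ch[1])   # second conv + second atrous conv
        self.up2 = nn.ConvTranspose3d(ch[1], ch[2], kernel_size=(2, 2, 1), stride=(2, 2, 1))
        self.stage2 = EncoderStage(ch[2] * 2, ch[2])
        self.head = nn.Conv3d(ch[2], out_ch, kernel_size=1)  # projection before the sigmoid

    def forward(self, encoding, skip1, skip2):
        u1 = self.up1(encoding)                      # first upsampling result
        x = torch.cat([u1, skip2], dim=1)            # first splicing result
        _, x = self.stage1(x)                        # second sub-atrous-convolution result
        u2 = self.up2(x)                             # second upsampling result
        x = torch.cat([u2, skip1], dim=1)            # second splicing result
        _, x = self.stage2(x)                        # final atrous convolution result
        return torch.sigmoid(self.head(x))           # binary segmentation result map

encoder = FirstLayerEncoder()
decoder = FirstLayerDecoder()
blocks = torch.randn(1, 1, 64, 64, 7)                # batch of multi-dimensional image blocks
encoding, a1, a2 = encoder(blocks)
binary_map = decoder(encoding, a1, a2)
print(binary_map.shape)                              # torch.Size([1, 1, 64, 64, 7])
```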
  • In some embodiments of this application, performing multi-layer convolution calculation on the binary segmentation result map based on the second-layer network to obtain the semantic segmentation result map of the target image includes:
  • obtaining a first convolution result of the first-layer network, and performing mask constraint on the binary segmentation result map according to the first convolution result to obtain a mask result; and
  • performing multi-layer convolution calculation on the mask result based on the second-layer network to obtain the semantic segmentation result map of the target image.
  • Specifically, the first convolution result is obtained as follows: when the first sub-atrous-convolution result is produced by the first first atrous convolution calculation in the encoder of the first-layer network, that result is passed through one further convolution (conv 1*1*9) and a ReLU activation function.
  • A mask constraint is then applied to the binary segmentation result map according to the first convolution result. Specifically, the mask constraint multiplies the first convolution result with the obtained binary segmentation result map to obtain an image of the region of interest, and this region-of-interest image is the mask result.
  • The mask result is calculated in the order of the third convolutional layer, the third atrous convolutional layer and the fourth convolutional layer to obtain the semantic segmentation result map of the target image.
  • The third convolutional layer and the third atrous convolutional layer use the same convolution and activation calculations as the first convolutional layer and the first atrous convolutional layer, while the fourth convolutional layer uses conv 1*1*1 followed by a ReLU activation function.
  • the information of the obtained semantic segmentation result map is more complete through mask constraints, and the accuracy of image segmentation is further improved.
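  • Continuing the same sketch (PyTorch, reusing the imports above), the second-layer network and mask constraint could be expressed as follows; the channel widths and the padding used to keep the conv 1*1*9 output aligned with the binary map are assumptions:

```python
# Hedged sketch of the second-layer network: conv 1*1*9 + ReLU on the encoder's first
# atrous-convolution output gives the "first convolution result", which is multiplied
# with the binary segmentation map (mask constraint) and then passed through the third
# conv, third atrous conv and the final conv 1*1*1 with ReLU.
class SecondLayerNetwork(nn.Module):
    def __init__(self, feat_ch: int = 16, mid_ch: int = 16, out_ch: int = 1):
        super().__init__()
        self.first_conv = nn.Conv3d(feat_ch, 1, kernel_size=(1, 1, 9), padding=(0, 0, 4))
        self.third_conv = nn.Conv3d(1, mid_ch, kernel_size=3, padding=1)
        self.third_atrous = nn.Conv3d(mid_ch, mid_ch, kernel_size=3, padding=3, dilation=3)
        self.fourth_conv = nn.Conv3d(mid_ch, out_ch, kernel_size=1)   # conv 1*1*1
        self.relu = nn.ReLU(inplace=True)

    def forward(self, first_atrous_feat, binary_map):
        first_conv_result = self.relu(self.first_conv(first_atrous_feat))
        mask_result = first_conv_result * binary_map    # mask constraint (element-wise product)
        x = self.relu(self.third_conv(mask_result))
        x = self.relu(self.third_atrous(x))
        return self.relu(self.fourth_conv(x))           # semantic segmentation result map

second_layer = SecondLayerNetwork()
semantic_map = second_layer(a1, binary_map)              # a1, binary_map from the sketches above
print(semantic_map.shape)                                # torch.Size([1, 1, 64, 64, 7])
```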
  • In some embodiments of this application, before obtaining the preset atrous convolutional neural network, the method further includes: selecting a preset number of images from a preset image library as training images, and using the remaining images in the preset image library as test images; obtaining a basic training network, and training the basic training network according to the training images to obtain a trained basic training network; and testing the trained basic training network according to the test images, and when the recognition success rate of the trained basic training network on the test images is greater than or equal to a preset success rate, determining that the trained basic training network is the atrous convolutional neural network.
  • the basic training network needs to be trained to obtain the atrous convolutional neural network.
  • The basic training network is a model with the same structure as the atrous convolutional neural network but with different parameters. A preset number of images in a preset image library are selected in advance as training images, and the remaining images in the preset image library are used as test images; a basic training network is obtained, the training images are input into the basic training network, and the parameters of the basic training network are adjusted according to the training images and the standard segmentation maps corresponding to the training images to obtain the trained basic training network.
  • The trained basic training network is then tested on the test images: when the similarity between the recognition result of the trained basic training network for a test image and the standard segmentation map corresponding to that test image is greater than or equal to a preset threshold, it is determined that the trained basic training network successfully recognizes the test image.
  • When the success rate of the trained basic training network on the test images is greater than or equal to the preset success rate, the trained basic training network is determined to be the preset atrous convolutional neural network.
  • the basic training network is trained in advance, so that when the target image is obtained, the image segmentation can be quickly performed according to the trained network, which improves the efficiency and accuracy of image segmentation.
  • In some embodiments of this application, training the basic training network according to the training images to obtain the trained basic training network includes:
  • decomposing the training image into training image blocks, and inputting the training image blocks into the basic training network to obtain a training segmentation image; and
  • acquiring a standard segmentation image of the training image, and training the basic training network according to the training segmentation image and the standard segmentation image to obtain the trained basic training network.
  • Specifically, two-layer wavelet decomposition is performed on each training image to obtain the corresponding training image blocks, the training image blocks are input into the basic training network, and the training segmentation image corresponding to the training image is obtained as output.
  • A standard segmentation image of the training image is then acquired, where the standard segmentation image is a preset segmentation image associated with the training image.
  • the loss function of the basic training network can be calculated according to the standard segmentation image and the training segmentation image. When the loss function converges, the basic training network is the trained basic training network.
  • the basic training network is trained by training image blocks, so that the trained network can accurately segment the image, avoid the error of image segmentation, and further improve the accuracy of image segmentation.
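  • A hedged sketch of this training loop is given below (still in PyTorch, reusing the imports above); the optimizer, learning rate and the simple "stable total loss" convergence test are assumptions, and `compute_loss` is a placeholder for the pixel-count-based loss described in the next embodiment:

```python
# Illustrative training loop: training_blocks are tensors obtained by two-layer wavelet
# decomposition of the training images, standard_maps are the preset standard segmentation
# images associated with them.
def train_basic_network(model, training_blocks, standard_maps, compute_loss,
                        max_epochs: int = 50, tol: float = 1e-4):
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    previous_total = float("inf")
    for _ in range(max_epochs):
        total = 0.0
        for block, standard in zip(training_blocks, standard_maps):
            prediction = model(block.unsqueeze(0))            # training segmentation image
            loss = compute_loss(prediction, standard.unsqueeze(0))
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            total += loss.item()
        if abs(previous_total - total) < tol:                 # treat a stable loss as convergence
            break
        previous_total = total
    return model
```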
  • In some embodiments of this application, training the basic training network according to the training segmentation image and the standard segmentation image to obtain the trained basic training network includes: acquiring a first pixel number of the training segmentation image and a second pixel number of the standard segmentation image; and calculating the loss function of the basic training network according to the first pixel number and the second pixel number, and when the loss function converges, determining that the basic training network is the trained basic training network.
  • the loss function of the basic training network can be calculated according to the first pixel number of the training segmented image and the second pixel number of the standard segmented image.
  • The loss function is calculated from these two quantities according to a specific formula (presented as an equation in the original disclosure and not reproduced here), where L 1 represents the second pixel number of the standard segmentation image and L 2 represents the first pixel number of the training segmentation image.
  • the trained basic training network is constrained by the loss function, which reduces the training time and improves the efficiency of model training.
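  • Since the exact formula is not reproduced in the text above, the sketch below is illustrative only: it builds a simple loss from the two pixel counts named by this embodiment (L 1 for the standard segmentation image and L 2 for the training segmentation image), using soft counts so the loss stays differentiable; the patent's actual formula may differ.

```python
# Illustrative pixel-count-based loss; the real loss formula in the disclosure may differ.
def pixel_count_loss(prediction: torch.Tensor, standard: torch.Tensor,
                     eps: float = 1e-6) -> torch.Tensor:
    l2 = prediction.sum()   # first pixel number (soft foreground count of the training image)
    l1 = standard.sum()     # second pixel number (foreground count of the standard image)
    return torch.abs(l1 - l2) / (l1 + eps)
```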
  • the aforementioned storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disk, a read-only memory (Read-Only Memory, ROM), or a random access memory (Random Access Memory, RAM) or the like.
  • the present application provides an embodiment of an image segmentation apparatus.
  • the apparatus embodiment corresponds to the method embodiment shown in FIG. 2 .
  • The apparatus may specifically be applied to various electronic devices.
  • The image segmentation apparatus 300 in this embodiment includes a decomposition module 301, a processing module 302, and a calculation module 303, wherein:
  • a decomposition module 301 is used to acquire a target image, and perform two-layer wavelet decomposition on the target image to obtain a multi-dimensional image block;
  • a target image is acquired, and the target image is an image including target segmentation information.
  • the image is subjected to two-layer wavelet decomposition.
  • A wavelet is usually a signal whose local features are nonzero only within a limited interval.
  • The first layer of wavelet decomposition decomposes the image into low-frequency information and high-frequency information.
  • High-frequency information is the part of the image where the intensity changes sharply, such as the image contours; low-frequency information is the part where the intensity changes gently, such as large color blocks in the image.
  • The low-frequency information is then decomposed again into low-frequency and high-frequency information, which constitutes the second level of the wavelet decomposition.
  • In this way, the target image is decomposed by the two-layer wavelet transform to obtain multi-dimensional image blocks.
  • The processing module 302 is configured to obtain a preset atrous convolutional neural network, wherein the atrous convolutional neural network includes a first-layer network and a second-layer network; the encoder in the first-layer network encodes the multi-dimensional image block to obtain an encoding result, and a decoder of the first-layer network decodes the encoding result to obtain a binary segmentation result map of the target image;
  • processing module 302 includes:
  • a first processing unit configured to sequentially pass the multi-dimensional image block through the first convolutional layer, the first atrous convolutional layer and the pooling layer to obtain a pooling result
  • a dropout unit configured to perform dropout on the pooling result through a preset dropout layer to obtain the encoding result corresponding to the multi-dimensional image block;
  • a second processing unit configured to, when the encoding result is obtained, calculate the encoding result through the upsampling layer, the second convolutional layer and the second atrous convolutional layer to obtain an atrous convolution result; and
  • a third processing unit configured to calculate the atrous convolution result with a preset activation function to obtain the binary segmentation result map of the target image.
  • Specifically, a preset atrous convolutional neural network is obtained, wherein the atrous convolutional neural network includes a first-layer network and a second-layer network, and the first-layer network includes an encoder and a decoder.
  • The encoder includes three first convolutional layers, three first atrous convolutional layers and two pooling layers, and the multi-dimensional image block is encoded by the encoder.
  • The decoder includes two upsampling layers, two second convolutional layers and two second atrous convolutional layers; the encoding result output by the encoder is decoded by the decoder to finally obtain a binary segmentation result map.
  • the second layer network includes multiple convolution layers.
  • the binary segmentation result map corresponding to the target image can be obtained, and according to the second layer network, the multi-layer convolution calculation can be performed on the obtained binary segmentation result map to obtain the semantic segmentation map corresponding to the target image.
  • the calculation module 303 is configured to perform multi-layer convolution calculation on the binary segmentation result graph based on the second-layer network to obtain a semantic segmentation result graph of the target image.
  • the computing module 303 includes:
  • a first constraining unit configured to obtain a first convolution result of the first layer network, and perform mask constraint on the binary segmentation result graph according to the first convolution result to obtain a mask result
  • the second constraint unit is configured to perform multi-layer convolution calculation on the mask result based on the second layer network to obtain a semantic segmentation result map of the target image.
  • Specifically, the second-layer network includes a third convolutional layer, a third atrous convolutional layer and a fourth convolutional layer.
  • The first convolution result of the first-layer network is obtained, where the first convolution result is produced by applying a further convolution calculation to the first sub-atrous-convolution result obtained from the first first atrous convolution in the encoder of the first-layer network. The first convolution result is multiplied with the binary segmentation result map to obtain a multiplication result, and the multiplication result is input to the third convolutional layer.
  • Following the order of the third convolutional layer, the third atrous convolutional layer and the fourth convolutional layer, the output of each layer is used as the input of the next layer, and the final semantic segmentation result map, i.e. the final segmentation result map of the target image, is calculated.
  • the above-mentioned semantic segmentation result graph information may also be stored in a node of a blockchain.
  • The blockchain referred to in this application is a new application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and encryption algorithms.
  • A blockchain is essentially a decentralized database: a chain of data blocks associated with one another by cryptographic methods. Each data block contains a batch of network transaction information, used to verify the validity of the information (anti-counterfeiting) and to generate the next block.
  • the blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.
  • an acquisition module used for selecting a preset number of images in the preset image library as training images, and using the remaining images in the preset image library as test images;
  • a training module for acquiring a basic training network, and training the basic training network according to the training image to obtain a trained basic training network
  • the test module is configured to test the trained basic training network according to the test images, and, when the recognition success rate of the trained basic training network on the test images is greater than or equal to a preset success rate, determine that the trained basic training network is the atrous convolutional neural network.
  • the training module includes:
  • a decomposition unit configured to decompose the training image into training image blocks, and input the training image blocks into the basic training network to obtain training segmentation images
  • a training unit configured to acquire a standard segmented image of the training image, and train the basic training network according to the training segmented image and the standard segmented image to obtain a trained basic training network.
  • the training unit further includes:
  • an acquisition subunit for acquiring the first pixel number of the training segmentation image and the second pixel number of the standard segmentation image
  • a confirmation subunit configured to calculate the loss function of the basic training network according to the first pixel number and the second pixel number, and when the loss function converges, determine that the basic training network is the trained basic training network.
  • the basic training network needs to be trained to obtain the atrous convolutional neural network.
  • The basic training network is a model with the same structure as the atrous convolutional neural network but with different parameters. A preset number of images in a preset image library are selected in advance as training images, and the remaining images in the preset image library are used as test images; a basic training network is obtained, the training images are input into the basic training network, and the parameters of the basic training network are adjusted according to the training images and the standard segmentation maps corresponding to the training images to obtain the trained basic training network.
  • The trained basic training network is then tested on the test images: when the similarity between the recognition result of the trained basic training network for a test image and the standard segmentation map corresponding to that test image is greater than or equal to a preset threshold, it is determined that the trained basic training network successfully recognizes the test image.
  • When the success rate of the trained basic training network on the test images is greater than or equal to the preset success rate, the trained basic training network is determined to be the preset atrous convolutional neural network.
  • The image segmentation apparatus proposed in this embodiment realizes that more image information can be obtained during image segmentation, improves the accuracy of image signal description when extracting local feature information, and greatly increases the receptive field within the controllable range of the network parameters.
  • The amount of information contained in each feature is thereby increased, which further makes the segmentation of image information more accurate and the obtained image information more complete.
  • FIG. 4 is a block diagram of a basic structure of a computer device according to this embodiment.
  • The computer device 6 includes a memory 61, a processor 62, and a network interface 63 that communicate with each other through a system bus. It should be pointed out that only the computer device 6 with components 61-63 is shown in the figure, but it should be understood that it is not required to implement all of the shown components, and more or fewer components may be implemented instead.
  • The computer device here is a device that can automatically perform numerical calculation and/or information processing according to preset or stored instructions, and its hardware includes but is not limited to microprocessors, application-specific integrated circuits (ASIC), field-programmable gate arrays (FPGA), digital signal processors (DSP), embedded devices, and the like.
  • the computer equipment may be a desktop computer, a notebook computer, a palmtop computer, a cloud server and other computing equipment.
  • the computer device can perform human-computer interaction with the user through a keyboard, a mouse, a remote control, a touch pad or a voice control device.
  • the memory 61 includes at least one type of readable storage medium, and the readable storage medium includes flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory, etc.), random access memory (RAM), static Random Access Memory (SRAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), Programmable Read Only Memory (PROM), Magnetic Memory, Magnetic Disk, Optical Disk, etc.
  • the computer-readable storage medium may be non-volatile or volatile.
  • the memory 61 may be an internal storage unit of the computer device 6 , such as a hard disk or a memory of the computer device 6 .
  • the memory 61 may also be an external storage device of the computer device 6, such as a plug-in hard disk, a smart memory card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, flash memory card (Flash Card), etc.
  • the memory 61 may also include both the internal storage unit of the computer device 6 and its external storage device.
  • the memory 61 is generally used to store the operating system and various application software installed on the computer device 6, such as computer-readable instructions for an image segmentation method.
  • the memory 61 can also be used to temporarily store various types of data that have been output or will be output.
  • the processor 62 may be a central processing unit (Central Processing Unit, CPU), a controller, a microcontroller, a microprocessor, or other data processing chips. This processor 62 is typically used to control the overall operation of the computer device 6 . In this embodiment, the processor 62 is configured to execute computer-readable instructions stored in the memory 61 or process data, for example, computer-readable instructions for executing the image segmentation method.
  • the network interface 63 may include a wireless network interface or a wired network interface, and the network interface 63 is generally used to establish a communication connection between the computer device 6 and other electronic devices.
  • The computer device proposed in this embodiment realizes that more image information can be obtained during image segmentation, improves the accuracy of image signal description when extracting local feature information, and greatly increases the receptive field within the controllable range of the network parameters.
  • The amount of information contained in each feature is thereby increased, which further makes the segmentation of image information more accurate and the obtained image information more complete.
  • the present application also provides another embodiment, that is, to provide a computer-readable storage medium, where the computer-readable storage medium stores computer-readable instructions, and the computer-readable instructions can be executed by at least one processor to The at least one processor is caused to perform the steps of the image segmentation method as described above.
  • The computer-readable storage medium proposed in this embodiment realizes that more image information can be obtained during image segmentation, improves the accuracy of image signal description when extracting local feature information, and, within the controllable range of the network parameters, greatly increases the receptive field and the amount of information contained in each feature, further making the segmentation of image information more accurate and the obtained image information more complete.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

An image segmentation method and apparatus, a computer device, and a storage medium, relating to the field of artificial intelligence. The image segmentation method comprises: obtaining a target image, and performing two-layer wavelet decomposition on the target image to obtain a multi-dimensional image block (S201); obtaining a preset dilated convolutional neural network, the dilated convolutional neural network comprising a first-layer network and a second-layer network, performing encoding processing on the multi-dimensional image block on the basis of an encoder in the first-layer network to obtain an encoding result, and performing decoding processing on the encoding result on the basis of a decoder in the first-layer network to obtain a binary segmentation result graph of the target image (S202); and performing classification, identification, and multi-layer convolution calculation on the binary segmentation result graph on the basis of the second-layer network to obtain a semantic segmentation result graph of the target image (S203). The semantic segmentation result graph can be stored in a blockchain. The method achieves accurate segmentation for an image.

Description

Image segmentation method, apparatus, computer device and storage medium
This application claims priority to the Chinese patent application with application number 202011288874.3, titled "Image segmentation method, apparatus, computer device and storage medium" and filed with the China Patent Office on November 17, 2020, the entire contents of which are incorporated by reference in this application.
Technical Field
The present application relates to the technical field of artificial intelligence, and in particular to an image segmentation method, apparatus, computer device and storage medium.
Background
Synthetic aperture radar (SAR), as an imaging radar with high range resolution and azimuth resolution, has a wide range of applications in military and civilian fields. Detecting the target of interest in a SAR image and segmenting it from the background according to the contour of the target lays the foundation for subsequent understanding, analysis and planning.
At present, common segmentation methods include the maximum inter-class variance method, edge detection algorithms based on local hybrid filtering, and the bias-corrected fuzzy c-means algorithm. A research focus in recent years is deep-learning-based segmentation, which learns image features through deep neural networks; such highly abstract features are more conducive to image segmentation. The inventor realized that this approach classifies pixels through an end-to-end deep neural network; its disadvantage, however, is that linear interpolation is used, so detailed structural information is lost during segmentation, resulting in blurred boundaries. Moreover, although the pooling layer expands the receptive field, it causes the loss of position information, which often needs to be preserved during semantic segmentation. As a result, information extraction during image segmentation ends up being inaccurate.
Summary of the Invention
The purpose of the embodiments of the present application is to provide an image segmentation method, apparatus, computer device and storage medium, so as to solve the technical problem that information extraction during image segmentation is not sufficiently accurate.
In order to solve the above technical problem, an embodiment of the present application provides an image segmentation method, which adopts the following technical solution:
acquiring a target image, and performing two-layer wavelet decomposition on the target image to obtain a multi-dimensional image block;
acquiring a preset atrous convolutional neural network, wherein the atrous convolutional neural network includes a first-layer network and a second-layer network, encoding the multi-dimensional image block based on an encoder in the first-layer network to obtain an encoding result, and decoding the encoding result based on a decoder of the first-layer network to obtain a binary segmentation result map of the target image; and
performing multi-layer convolution calculation on the binary segmentation result map based on the second-layer network to obtain a semantic segmentation result map of the target image.
In order to solve the above technical problem, an embodiment of the present application further provides an image segmentation apparatus, which adopts the following technical solution:
a decomposition module, configured to acquire a target image and perform two-layer wavelet decomposition on the target image to obtain a multi-dimensional image block;
a processing module, configured to obtain a preset atrous convolutional neural network, wherein the atrous convolutional neural network includes a first-layer network and a second-layer network, encode the multi-dimensional image block based on the encoder in the first-layer network to obtain an encoding result, and decode the encoding result based on a decoder of the first-layer network to obtain a binary segmentation result map of the target image; and
a calculation module, configured to perform multi-layer convolution calculation on the binary segmentation result map based on the second-layer network to obtain the semantic segmentation result map of the target image.
In order to solve the above technical problem, an embodiment of the present application further provides a computer device, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor; when the processor executes the computer-readable instructions, the following steps are implemented:
acquiring a target image, and performing two-layer wavelet decomposition on the target image to obtain a multi-dimensional image block;
acquiring a preset atrous convolutional neural network, wherein the atrous convolutional neural network includes a first-layer network and a second-layer network, encoding the multi-dimensional image block based on an encoder in the first-layer network to obtain an encoding result, and decoding the encoding result based on a decoder of the first-layer network to obtain a binary segmentation result map of the target image; and
performing multi-layer convolution calculation on the binary segmentation result map based on the second-layer network to obtain a semantic segmentation result map of the target image.
In order to solve the above technical problem, an embodiment of the present application further provides a computer-readable storage medium storing computer-readable instructions; when the computer-readable instructions are executed by a processor, the processor performs the following steps:
acquiring a target image, and performing two-layer wavelet decomposition on the target image to obtain a multi-dimensional image block;
acquiring a preset atrous convolutional neural network, wherein the atrous convolutional neural network includes a first-layer network and a second-layer network, encoding the multi-dimensional image block based on an encoder in the first-layer network to obtain an encoding result, and decoding the encoding result based on a decoder of the first-layer network to obtain a binary segmentation result map of the target image; and
performing multi-layer convolution calculation on the binary segmentation result map based on the second-layer network to obtain a semantic segmentation result map of the target image.
The above image segmentation method acquires a target image and performs two-layer wavelet decomposition on it to obtain a multi-dimensional image block; the decomposed multi-dimensional image block can improve the accuracy of image processing. A preset atrous convolutional neural network is then obtained, where the atrous convolutional neural network includes a first-layer network and a second-layer network; the encoder in the first-layer network encodes the multi-dimensional image block to obtain an encoding result, and the decoder of the first-layer network decodes the encoding result to obtain a binary segmentation result map of the target image. Processing the multi-dimensional image block with the preset atrous convolutional neural network increases the receptive field within the controllable range of the network parameters, so that each feature map contains more and more information, which helps extract global image information and avoids the loss of image information. Finally, multi-layer convolution calculation is performed on the binary segmentation result map based on the second-layer network to obtain the semantic segmentation result map of the target image. In this way, more image information can be obtained during image segmentation, the accuracy of image signal description when extracting local feature information is improved, and the receptive field is greatly increased within the controllable range of the network parameters, which increases the amount of information contained in each feature, further makes the segmentation of image information more accurate, and makes the obtained image information more complete.
Description of the Drawings
In order to illustrate the solutions in the present application more clearly, the accompanying drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is an exemplary system architecture diagram to which the present application can be applied;
FIG. 2 is a flowchart of an embodiment of an image segmentation method according to the present application;
FIG. 3 is a schematic structural diagram of an embodiment of an image segmentation apparatus according to the present application;
FIG. 4 is a schematic structural diagram of an embodiment of a computer device according to the present application.
Reference numerals: image segmentation apparatus 300, decomposition module 301, processing module 302, and calculation module 303.
Detailed Description of the Embodiments
Unless otherwise defined, all technical and scientific terms used herein have the same meanings as commonly understood by those skilled in the technical field to which the present application belongs. The terms used herein in the specification of the present application are intended only to describe specific embodiments and are not intended to limit the present application. The terms "comprising" and "having" and any variations thereof in the description and claims of the present application and in the above description of the drawings are intended to cover a non-exclusive inclusion. The terms "first", "second" and the like in the description and claims of the present application or in the above drawings are used to distinguish different objects rather than to describe a specific order.
在本文中提及“实施例”意味着,结合实施例描述的特定特征、结构或特性可以包含在本申请的至少一个实施例中。在说明书中的各个位置出现该短语并不一定均是指相同的实施例,也不是与其它实施例互斥的独立的或备选的实施例。本领域技术人员显式地和隐式地理解的是,本文所描述的实施例可以与其它实施例相结合。Reference herein to an "embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the present application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor a separate or alternative embodiment that is mutually exclusive of other embodiments. It is explicitly and implicitly understood by those skilled in the art that the embodiments described herein may be combined with other embodiments.
为了使本技术领域的人员更好地理解本申请方案,下面将结合附图,对本申请实施例中的技术方案进行清楚、完整地描述。In order to make those skilled in the art better understand the solutions of the present application, the technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the accompanying drawings.
如图1所示,系统架构100可以包括终端设备101、102、103,网络104和服务器105。网络104用以在终端设备101、102、103和服务器105之间提供通信链路的介质。网络104可以包括各种连接类型,例如有线、无线通信链路或者光纤电缆等等。As shown in FIG. 1 , the system architecture 100 may include terminal devices 101 , 102 , and 103 , a network 104 and a server 105 . The network 104 is a medium used to provide a communication link between the terminal devices 101 , 102 , 103 and the server 105 . The network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
用户可以使用终端设备101、102、103通过网络104与服务器105交互,以接收或发送消息等。终端设备101、102、103上可以安装有各种通讯客户端应用,例如网页浏览器应用、购物类应用、搜索类应用、即时通信工具、邮箱客户端、社交平台软件等。The user can use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 to receive or send messages and the like. Various communication client applications may be installed on the terminal devices 101 , 102 and 103 , such as web browser applications, shopping applications, search applications, instant messaging tools, email clients, social platform software, and the like.
终端设备101、102、103可以是具有显示屏并且支持网页浏览的各种电子设备,包括但不限于智能手机、平板电脑、电子书阅读器、MP3播放器(Moving Picture Experts Group Audio Layer III,动态影像专家压缩标准音频层面3)、MP4(Moving Picture Experts Group Audio Layer IV,动态影像专家压缩标准音频层面4)播放器、膝上型便携计算机和台式计算机等等。The terminal devices 101, 102, and 103 can be various electronic devices that have a display screen and support web browsing, including but not limited to smart phones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III, dynamic Picture Experts Compression Standard Audio Layer 3), MP4 (Moving Picture Experts Group Audio Layer IV, Moving Picture Experts Compression Standard Audio Layer 4) Players, Laptops and Desktops, etc.
服务器105可以是提供各种服务的服务器,例如对终端设备101、102、103上显示的页面提供支持的后台服务器。The server 105 may be a server that provides various services, such as a background server that provides support for the pages displayed on the terminal devices 101 , 102 , and 103 .
需要说明的是,本申请实施例所提供的图像分割方法一般由服务器/终端设备执行,相应地,图像分割装置一般设置于服务器/终端设备中。It should be noted that the image segmentation method provided by the embodiments of the present application is generally performed by a server/terminal device, and accordingly, an image segmentation apparatus is generally set in the server/terminal device.
应该理解,图1中的终端设备、网络和服务器的数目仅仅是示意性的。根据实现需要,可以具有任意数目的终端设备、网络和服务器。It should be understood that the numbers of terminal devices, networks and servers in FIG. 1 are merely illustrative. There can be any number of terminal devices, networks and servers according to implementation needs.
继续参考图2,示出了根据本申请的图像分割的方法的一个实施例的流程图。所述的图像分割方法,包括以下步骤:Continuing to refer to FIG. 2 , a flowchart of one embodiment of the method for image segmentation according to the present application is shown. The described image segmentation method includes the following steps:
步骤S201,获取目标图像,并对所述目标图像进行二层小波分解,得到多维图像块;Step S201, acquiring a target image, and performing two-layer wavelet decomposition on the target image to obtain a multi-dimensional image block;
In this embodiment, a target image is acquired, the target image being an image that includes information to be segmented. When the target image is obtained, two-layer wavelet decomposition is performed on it. Specifically, a wavelet is generally a signal whose local features take non-zero values only within a finite interval. The first level of the wavelet decomposition divides the image into low-frequency information and high-frequency information: the high-frequency information corresponds to parts of the image where the intensity changes sharply, such as image contours, while the low-frequency information corresponds to parts where the intensity changes gently, such as large uniform color regions. On the basis of the first level, the low-frequency information is further decomposed into low-frequency and high-frequency components, which constitutes the second level of the wavelet decomposition. The two-layer wavelet decomposition of the target image can be carried out, for example, in MATLAB, thereby obtaining multi-dimensional image blocks.
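By way of illustration only, the following is a minimal sketch of the two-layer wavelet decomposition described above, written in Python with the PyWavelets library rather than MATLAB (an assumption for illustration; the embodiment only states that MATLAB can be used). The wavelet family and the way the sub-bands are stacked into a multi-dimensional image block are illustrative choices that are not specified in the present application.

```python
import numpy as np
import pywt

def two_level_wavelet_blocks(image: np.ndarray) -> np.ndarray:
    """Decompose a 2-D image twice and stack the sub-bands as channels."""
    # Level 1: split into low-frequency (LL1) and high-frequency (LH1, HL1, HH1) parts.
    ll1, (lh1, hl1, hh1) = pywt.dwt2(image, "haar")
    # Level 2: decompose the low-frequency part again.
    ll2, (lh2, hl2, hh2) = pywt.dwt2(ll1, "haar")

    # Upsample every sub-band back to the original size so all of them can be
    # stacked into one multi-channel "multi-dimensional image block".
    def upsample(band):
        reps = (image.shape[0] // band.shape[0], image.shape[1] // band.shape[1])
        return np.kron(band, np.ones(reps))

    bands = [ll2, lh2, hl2, hh2, lh1, hl1, hh1]
    return np.stack([upsample(b) for b in bands], axis=-1)

# Example: a 128*128 single-channel SAR image yields a 128*128*7 block.
blocks = two_level_wavelet_blocks(np.random.rand(128, 128))
print(blocks.shape)   # (128, 128, 7)
```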
Step S202, acquiring a preset atrous convolutional neural network, wherein the atrous convolutional neural network includes a first-layer network and a second-layer network; encoding the multi-dimensional image block based on an encoder in the first-layer network to obtain an encoding result, and decoding the encoding result based on a decoder of the first-layer network to obtain a binary segmentation result map of the target image;
In this embodiment, when the multi-dimensional image blocks are obtained, a preset atrous convolutional neural network is acquired. The atrous convolutional neural network includes a first-layer network and a second-layer network. The first-layer network includes an encoder and a decoder: the encoder includes three first convolutional layers, three first atrous convolutional layers, and two pooling layers, and encodes the multi-dimensional image blocks; the decoder includes two upsampling layers, two second convolutional layers, and two second atrous convolutional layers, and decodes the encoding result output by the encoder to finally obtain the binary segmentation result map. The second-layer network includes a plurality of convolutional layers. The binary segmentation result map corresponding to the target image is obtained by the first-layer network, and multi-layer convolution is then performed on the binary segmentation result map by the second-layer network to obtain the semantic segmentation map corresponding to the target image.
步骤S203,基于所述第二层网络对所述二值分割结果图进行多层卷积计算,得到所述 目标图像的语义分割结果图。Step S203, performing multi-layer convolution calculation on the binary segmentation result graph based on the second-layer network to obtain the semantic segmentation result graph of the target image.
In this embodiment, when the binary segmentation result map is obtained, multi-layer convolution is performed on it by the second-layer network to obtain the semantic segmentation map of the target image. Specifically, the second-layer network includes a third convolutional layer, a third atrous convolutional layer, and a fourth convolutional layer. When the binary segmentation result map is obtained, the first convolution result of the first-layer network is acquired, where the first convolution result is obtained by applying a further convolution to the first sub-atrous-convolution result produced by the first of the first atrous convolutions in the encoder of the first-layer network. The first convolution result and the binary segmentation result map are multiplied to obtain a multiplication result. The multiplication result is input into the third convolutional layer and, in the order of the third convolutional layer, the third atrous convolutional layer, and the fourth convolutional layer, the output of each layer is used as the input of the next layer. The final semantic segmentation result map obtained in this way is the final segmentation result of the target image.
需要强调的是,为进一步保证上述语义分割结果图信息的私密和安全性,上述语义分割结果图信息还可以存储于一区块链的节点中。It should be emphasized that, in order to further ensure the privacy and security of the above-mentioned semantic segmentation result graph information, the above-mentioned semantic segmentation result graph information may also be stored in a node of a blockchain.
The blockchain referred to in the present application is a new application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database, a chain of data blocks generated in association with one another by cryptographic methods; each data block contains the information of a batch of network transactions and is used to verify the validity of that information (anti-counterfeiting) and to generate the next block. The blockchain may include an underlying blockchain platform, a platform product service layer, an application service layer, and the like.
This embodiment makes it possible to obtain more image information during image segmentation, improves the accuracy of the image signal description when local feature information is extracted, and greatly enlarges the receptive field within a controllable range of network parameters, increasing the amount of information carried by each feature, so that the segmentation of image information is more precise and the obtained image information is more complete.
In some embodiments of the present application, the encoder includes a first convolutional layer, a first atrous convolutional layer, and a pooling layer, and the step of encoding the multi-dimensional image block based on the encoder in the first-layer network to obtain the encoding result includes:
将所述多维图像块依次经过所述第一卷积层、所述第一空洞卷积层和所述池化层,得到池化结果;Passing the multi-dimensional image block through the first convolution layer, the first hole convolution layer and the pooling layer in sequence to obtain a pooling result;
通过预设降拟合层对所述池化结果进行降拟合得到所述多维图像块对应的编码结果。The encoding result corresponding to the multi-dimensional image block is obtained by performing down-fitting on the pooling result through a preset down-fitting layer.
In this embodiment, the encoder in the first-layer network includes a first convolutional layer, a first atrous convolutional layer, and a pooling layer. When the multi-dimensional image block is obtained, it is convolved and activated by the first convolutional layer to obtain a first sub-convolution result; the first sub-convolution result is then subjected to atrous convolution and activation by the first atrous convolutional layer to obtain a first sub-atrous-convolution result; finally, the first sub-atrous-convolution result is processed by the pooling layer to obtain a sub-pooling result. The first convolutional layer, the first atrous convolutional layer, and the pooling layer are all multi-dimensional operators, for example three-dimensional convolution (conv 3*3*3), three-dimensional atrous convolution (3-dilated conv 3*3*3), and three-dimensional pooling (max pool 2*2*1). Before the first sub-convolution result and the first sub-atrous-convolution result are obtained, the raw outputs of the first convolutional layer and the first atrous convolutional layer are each passed through a ReLU activation function, yielding the first sub-convolution result and the first sub-atrous-convolution result respectively. When the sub-pooling result is obtained, it is used as the input of the second first convolutional layer of the encoder and, again in the order of first convolutional layer, first atrous convolutional layer, and pooling layer, the output of each layer is used as the input of the next layer to compute the final pooling result. When the pooling result is obtained, it is processed by a preset down-fitting layer (a layer for reducing over-fitting) to obtain the encoding result corresponding to the multi-dimensional image block. The down-fitting layer includes a first convolutional layer, a first atrous convolutional layer, and a dropout sub-layer (dropout 0.5); when the pooling result is obtained, it is used as the input of the first convolutional layer in the down-fitting layer and, in the order of first convolutional layer, first atrous convolutional layer, and dropout sub-layer, the output of each layer is used as the input of the next layer to compute the encoding result corresponding to the multi-dimensional image block.
本实施例通过编码器对多维图像块进行编码处理,进一步提高了图片处理的精度,并且通过空洞卷积增加了感受野,提高了输出图像包括的信息量。In this embodiment, the multi-dimensional image block is encoded by the encoder, which further improves the accuracy of the image processing, and the receptive field is increased through the hole convolution, thereby increasing the amount of information included in the output image.
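For illustration, the following is a minimal PyTorch sketch of an encoder with the structure described above: three first convolutional layers (conv 3*3*3 with ReLU), three first atrous convolutional layers (3-dilated conv 3*3*3 with ReLU), two pooling layers (max pool 2*2*1), and a down-fitting stage ending in dropout 0.5. The channel widths and the number of input channels are assumptions made for the sketch and are not specified in the present application.

```python
import torch.nn as nn

class ConvDilatedBlock(nn.Module):
    """conv 3*3*3 + ReLU followed by 3-dilated conv 3*3*3 + ReLU."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv3d(in_ch, out_ch, kernel_size=3, padding=1)
        self.dilated = nn.Conv3d(out_ch, out_ch, kernel_size=3,
                                 padding=3, dilation=3)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        x = self.relu(self.conv(x))           # first convolution + activation
        return self.relu(self.dilated(x))     # first atrous convolution + activation

class Encoder(nn.Module):
    """Three conv + three atrous conv layers and two pooling layers, then dropout 0.5."""
    def __init__(self, in_ch=1):
        super().__init__()
        self.block1 = ConvDilatedBlock(in_ch, 32)
        self.block2 = ConvDilatedBlock(32, 64)
        self.pool = nn.MaxPool3d(kernel_size=(2, 2, 1))   # max pool 2*2*1
        # Down-fitting stage: conv + atrous conv + dropout 0.5.
        self.block3 = ConvDilatedBlock(64, 128)
        self.dropout = nn.Dropout3d(p=0.5)

    def forward(self, x):
        f1 = self.block1(x)                    # kept as a skip connection for the decoder
        f2 = self.block2(self.pool(f1))        # second block after the first pooling
        encoded = self.dropout(self.block3(self.pool(f2)))
        return encoded, f1, f2
```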
In some embodiments of the present application, the decoder includes an upsampling layer, a second convolutional layer, and a second atrous convolutional layer, and the step of decoding the encoding result based on the decoder of the first-layer network to obtain the binary segmentation result map of the target image includes:
在得到所述编码结果时,根据所述上采样层、所述第二卷积层和所述第二空洞卷积成层对所述编码结果进行计算,得到空洞卷积结果;When the encoding result is obtained, the encoding result is calculated according to the upsampling layer, the second convolution layer and the second hole convolution layer to obtain a hole convolution result;
通过预设激活函数对所述空洞卷积结果进行计算,得到所述目标图像的二值分割结果图。The hole convolution result is calculated by a preset activation function, and a binary segmentation result map of the target image is obtained.
In this embodiment, the decoder includes an upsampling layer, a second convolutional layer, and a second atrous convolutional layer. When the encoding result is obtained, it is computed by the first upsampling layer of the decoder to obtain a first upsampling result; the first upsampling result is concatenated with the result of the second of the first atrous convolutions in the encoder to obtain a first concatenation result; the first concatenation result is used as the input of the second convolutional layer and, in the order of the second convolutional layer and the second atrous convolutional layer, the output of each layer is used as the input of the next layer to compute a second sub-atrous-convolution result.
Afterwards, the second sub-atrous-convolution result is processed by the second upsampling layer to obtain a second upsampling result, and the second upsampling result is concatenated with the result of the first of the first atrous convolutions in the encoder to obtain a second concatenation result; the second concatenation result is passed through the second of the second convolutional layers in the decoder and, again in the order of the second convolutional layer and the second atrous convolutional layer, the output of each layer is used as the input of the next layer to compute the final atrous convolution result. Finally, when the atrous convolution result is obtained, one more convolution operation (for example conv 1*1*9) is applied to it before the preset activation function; the convolution output is then computed with a preset activation function (for example a sigmoid function) to obtain the binary segmentation result map of the target image.
In particular, in this embodiment the upsampling layer, the second convolutional layer, and the second atrous convolutional layer are also multi-dimensional operators: the upsampling layer can be computed with up-conv 2*2*1, and the second convolutional layer and the second atrous convolutional layer use the same convolutions as the first convolutional layer and the first atrous convolutional layer. Before the second sub-convolution result and the second sub-atrous-convolution result are obtained, the raw outputs of the second convolutional layer and the second atrous convolutional layer are likewise passed through a ReLU activation function, yielding the final second sub-convolution result and second sub-atrous-convolution result respectively.
本实施例通过解码器对编码结果进行处理得到二值分割结果图,实现了对图片的高效分割,并且提高了二值分割结果图所包括的信息量,以及图片分割的精确度。In this embodiment, the decoder processes the coding result to obtain a binary segmentation result map, which realizes efficient segmentation of pictures, and improves the amount of information included in the binary segmentation result map and the accuracy of picture segmentation.
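For illustration, the following is a minimal PyTorch sketch of a decoder matching the description above: two upsampling stages (up-conv 2*2*1), concatenation with the encoder's first and second atrous-convolution outputs, second convolutional and second atrous convolutional layers, and a final conv 1*1*9 followed by a sigmoid that yields the binary segmentation result map. It reuses the ConvDilatedBlock helper from the encoder sketch above; the channel widths are assumptions.

```python
import torch
import torch.nn as nn

class Decoder(nn.Module):
    """Two up-conv 2*2*1 stages with skip concatenation, then conv 1*1*9 + sigmoid."""
    def __init__(self):
        super().__init__()
        self.up1 = nn.ConvTranspose3d(128, 64, kernel_size=(2, 2, 1),
                                      stride=(2, 2, 1))        # up-conv 2*2*1
        self.block1 = ConvDilatedBlock(64 + 64, 64)             # second conv + atrous conv
        self.up2 = nn.ConvTranspose3d(64, 32, kernel_size=(2, 2, 1),
                                      stride=(2, 2, 1))
        self.block2 = ConvDilatedBlock(32 + 32, 32)
        # Final conv 1*1*9, then sigmoid -> binary segmentation result map.
        self.head = nn.Conv3d(32, 1, kernel_size=(1, 1, 9), padding=(0, 0, 4))
        self.sigmoid = nn.Sigmoid()

    def forward(self, encoded, f1, f2):
        x = self.block1(torch.cat([self.up1(encoded), f2], dim=1))   # concat with 2nd skip
        x = self.block2(torch.cat([self.up2(x), f1], dim=1))         # concat with 1st skip
        return self.sigmoid(self.head(x))
```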
在本申请的一些实施例中,上述基于所述第二层网络对所述二值分割结果图进行多层卷积计算,得到所述目标图像的语义分割结果图包括:In some embodiments of the present application, performing multi-layer convolution calculation on the binary segmentation result graph based on the second-layer network to obtain the semantic segmentation result graph of the target image includes:
获取所述第一层网络的第一卷积结果,根据所述第一卷积结果对所述二值分割结果图进行掩膜约束,得到掩膜结果;Obtain the first convolution result of the first layer of network, and perform mask constraint on the binary segmentation result graph according to the first convolution result to obtain a mask result;
基于所述第二层网络对所述掩膜结果进行多层卷积计算,得到所述目标图像的语义分割结果图。Multi-layer convolution calculation is performed on the mask result based on the second-layer network to obtain a semantic segmentation result map of the target image.
In this embodiment, the first convolution result is obtained by taking the first sub-atrous-convolution result, produced by the first of the first atrous convolutions in the encoder of the first-layer network, and passing it through one further convolution (conv 1*1*9) and a ReLU activation function. The binary segmentation map is then mask-constrained according to this first convolution result. Specifically, the mask constraint multiplies the first convolution result with the obtained binary segmentation result map to obtain a region-of-interest image, and this region-of-interest image is the mask result. When the mask result is obtained, it is computed in the order of the third convolutional layer, the third atrous convolutional layer, and the fourth convolutional layer to obtain the semantic segmentation result map of the target image. The third convolutional layer and the third atrous convolutional layer use the same convolution and activation computations as the first convolutional layer and the first atrous convolutional layer, while the fourth convolutional layer uses conv 1*1*1 followed by a ReLU activation function.
本实施例通过掩膜约束,使得得到的语义分割结果图的信息更完全,进一步提高了图片分割的精确度。In this embodiment, the information of the obtained semantic segmentation result map is more complete through mask constraints, and the accuracy of image segmentation is further improved.
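For illustration, the following is a minimal PyTorch sketch of the mask constraint and the second-layer network described above: the first convolution result is multiplied element-wise with the binary segmentation result map to obtain the region-of-interest image, which is then passed through a third convolutional layer, a third atrous convolutional layer, and a fourth convolutional layer (conv 1*1*1 with ReLU). The channel widths and the number of output channels are assumptions; ConvDilatedBlock is the helper defined in the encoder sketch.

```python
import torch.nn as nn

class SecondLayerNetwork(nn.Module):
    """Mask constraint followed by third conv, third atrous conv, and conv 1*1*1."""
    def __init__(self, in_ch=32, out_ch=1):
        super().__init__()
        self.block = ConvDilatedBlock(in_ch, 32)            # third conv + third atrous conv
        self.head = nn.Sequential(nn.Conv3d(32, out_ch, kernel_size=1),   # conv 1*1*1
                                  nn.ReLU(inplace=True))

    def forward(self, first_conv_result, binary_map):
        # Mask constraint: the element-wise product keeps only the region of interest.
        masked = first_conv_result * binary_map
        return self.head(self.block(masked))
```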
在本申请的一些实施例中,在上述获取预设的空洞卷积神经网络之前还包括:In some embodiments of the present application, before obtaining the preset atrous convolutional neural network, the method further includes:
选取预设图像库中预设个数的图像为训练图像,将所述预设图像库中剩余的图像作为测试图像;Selecting a preset number of images in the preset image library as training images, and using the remaining images in the preset image library as test images;
获取基础训练网络,根据所述训练图像对所述基础训练网络进行训练,得到训练后的基础训练网络;Obtaining a basic training network, training the basic training network according to the training image, and obtaining a trained basic training network;
testing the trained basic training network according to the test images, and determining the trained basic training network to be the atrous convolutional neural network when the recognition success rate of the trained basic training network on the test images is greater than or equal to a preset success rate.
In this embodiment, before the multi-dimensional image blocks are processed by the preset atrous convolutional neural network, a basic training network needs to be trained to obtain the atrous convolutional neural network. Specifically, the basic training network is a model that has the same structure as the atrous convolutional neural network but different parameters. A preset number of images in a preset image library are selected in advance as training images, and the remaining images in the preset image library are used as test images. The basic training network is acquired, the training images are input into it, and its parameters are adjusted according to the training images and the standard segmentation maps corresponding to the training images, yielding the trained basic training network. Afterwards, the trained basic training network is tested with the test images: when the similarity between the recognition result of the trained basic training network for a test image and the standard segmentation map corresponding to that test image is greater than or equal to a preset threshold, the trained basic training network is determined to have recognized that test image successfully; when the recognition success rate of the trained basic training network over the test images is greater than or equal to a preset success rate, the trained basic training network is determined to be the preset atrous convolutional neural network.
本实施例通过预先对基础训练网络进行训练,使得在得到目标图像时,能够快速根据训练后的网络进行图像分割,提高了图像分割的效率及准确率。In this embodiment, the basic training network is trained in advance, so that when the target image is obtained, the image segmentation can be quickly performed according to the trained network, which improves the efficiency and accuracy of image segmentation.
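For illustration, the following is a minimal sketch of the testing criterion described above: a test image counts as successfully recognized when the similarity between the network's segmentation and the standard segmentation map reaches a preset threshold, and the trained network is accepted as the preset atrous convolutional neural network when the success rate over all test images reaches a preset success rate. The similarity measure (a Dice-style overlap), the threshold values, and the data types are assumptions made for the sketch.

```python
import numpy as np

def passes_test(predict_fn, test_pairs,
                similarity_threshold=0.9, success_rate_threshold=0.95):
    """predict_fn maps an image block to a predicted probability map (numpy array)."""
    successes = 0
    for blocks, standard_map in test_pairs:
        predicted = predict_fn(blocks) > 0.5               # binarize the prediction
        standard = standard_map.astype(bool)
        overlap = 2.0 * np.logical_and(predicted, standard).sum()
        similarity = overlap / (predicted.sum() + standard.sum() + 1e-8)
        if similarity >= similarity_threshold:             # one image recognized successfully
            successes += 1
    # Accept the trained network only if enough test images were recognized.
    return successes / len(test_pairs) >= success_rate_threshold
```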
在本申请的一些实施例中,上述根据所述训练图像对所述基础训练网络进行训练,得到训练后的基础训练网络包括:In some embodiments of the present application, the above-mentioned training of the basic training network according to the training image, the obtained basic training network after training includes:
分解所述训练图像为训练图像块,输入所述训练图像块至所述基础训练网络中得到训练分割图像;Decomposing the training image into training image blocks, inputting the training image blocks into the basic training network to obtain training segmentation images;
获取所述训练图像的标准分割图像,根据所述训练分割图像和所述标准分割图像对所述基础训练网络进行训练,得到训练后的基础训练网络。A standard segmentation image of the training image is acquired, and the basic training network is trained according to the training segmentation image and the standard segmentation image to obtain a trained basic training network.
In this embodiment, when the training images are obtained, two-layer wavelet decomposition is performed on each training image to obtain corresponding training image blocks, the training image blocks are input into the basic training network, and the training segmentation image corresponding to the training image is output. The standard segmentation image of the training image is acquired, the standard segmentation image being a preset segmentation image associated with the training image. The loss function of the basic training network can be computed from the standard segmentation image and the training segmentation image, and when the loss function converges, the basic training network is the trained basic training network.
本实施例通过训练图像块对基础训练网络进行训练,使得训练后的网络能够准确地对图像进行分割,避免了图像分割的误差,进一步提高了图像分割的精确度。In this embodiment, the basic training network is trained by training image blocks, so that the trained network can accurately segment the image, avoid the error of image segmentation, and further improve the accuracy of image segmentation.
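For illustration, the following is a minimal sketch of the training procedure described above: each batch of training image blocks is passed through the basic training network and the parameters are updated until the loss between the training segmentation image and the standard segmentation image converges. The optimizer, the number of epochs, and the data format are assumptions; dice_loss refers to the loss sketch given after the formula below.

```python
def train(model, optimizer, training_pairs, epochs=50):
    """training_pairs yields (training_image_blocks, standard_segmentation) tensors."""
    for _ in range(epochs):
        for blocks, standard_segmentation in training_pairs:
            training_segmentation = model(blocks)                 # forward pass
            # Dice-style loss between the training and standard segmentation images
            # (see the dice_loss sketch given after the loss formula below).
            loss = dice_loss(training_segmentation, standard_segmentation)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```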
在本申请的一些实施例中,上述根据所述训练分割图像和所述标准分割图像对所述基础训练网络进行训练,得到训练后的基础训练网络包括:In some embodiments of the present application, the basic training network is trained according to the training segmentation image and the standard segmentation image, and the trained basic training network includes:
获取所述训练分割图像的第一像素个数,以及所述标准分割图像的第二像素个数;Obtain the first pixel number of the training segmented image, and the second pixel number of the standard segmented image;
根据所述第一像素个数和所述第二像素个数计算所述基础训练网络的损失函数,在所述损失函数收敛时,确定所述基础训练网络为训练后的基础训练网络。The loss function of the basic training network is calculated according to the first pixel number and the second pixel number, and when the loss function converges, the basic training network is determined to be the trained basic training network.
在本实施例中,根据训练分割图像的第一像素个数和标准分割图像的第二像素个数,可以计算得到基础训练网络的损失函数。该损失函数的具体计算公式如下所示:In this embodiment, the loss function of the basic training network can be calculated according to the first pixel number of the training segmented image and the second pixel number of the standard segmented image. The specific calculation formula of the loss function is as follows:
loss = 1 - 2|L1 ∩ L2| / (|L1| + |L2|)
where L1 denotes the second pixel number, i.e. the pixels of the standard segmentation image, and L2 denotes the first pixel number, i.e. the pixels of the training segmentation image. When this loss function converges, the resulting basic training network is the trained basic training network.
本实施例通过损失函数对训练后的基础训练网络进行约束,减少了训练时长,提高了模型训练的效率。In this embodiment, the trained basic training network is constrained by the loss function, which reduces the training time and improves the efficiency of model training.
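For illustration, the following is a minimal, differentiable implementation of the loss formula above for PyTorch tensors, usable in the training sketch earlier. The soft (probability-weighted) form of the overlap term is an assumption made so that the loss can be back-propagated; the source only gives the set-based formula.

```python
import torch

def dice_loss(prediction: torch.Tensor, standard: torch.Tensor,
              eps: float = 1e-8) -> torch.Tensor:
    """prediction: probabilities in [0, 1]; standard: 0/1 ground-truth mask."""
    overlap = (prediction * standard).sum()        # soft counterpart of |L1 ∩ L2|
    total = prediction.sum() + standard.sum()      # |L1| + |L2|
    return 1.0 - 2.0 * overlap / (total + eps)
```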
本领域普通技术人员可以理解实现上述实施例方法中的全部或部分流程,是可以通过计算机可读指令来指令相关的硬件来完成,该计算机可读指令可存储于一计算机可读取存储介质中,该计算机可读指令在执行时,可包括如上述各方法的实施例的流程。其中,前述的存储介质可为磁碟、光盘、只读存储记忆体(Read-Only Memory,ROM)等非易失性存储介质,或随机存储记忆体(Random Access Memory,RAM)等。Those of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be implemented by instructing relevant hardware through computer-readable instructions, and the computer-readable instructions can be stored in a computer-readable storage medium. , when the computer-readable instructions are executed, the processes of the above-mentioned method embodiments may be included. Wherein, the aforementioned storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disk, a read-only memory (Read-Only Memory, ROM), or a random access memory (Random Access Memory, RAM) or the like.
It should be understood that, although the steps in the flowchart of the accompanying drawings are shown in sequence as indicated by the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict restriction on the order in which these steps are executed, and they may be executed in other orders. Moreover, at least some of the steps in the flowchart of the accompanying drawings may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different moments; their execution order is not necessarily sequential, and they may be executed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
With further reference to FIG. 3, as an implementation of the method shown in FIG. 2 above, the present application provides an embodiment of an image segmentation apparatus. The apparatus embodiment corresponds to the method embodiment shown in FIG. 2, and the apparatus can be specifically applied to various electronic devices.
As shown in FIG. 3, the image segmentation apparatus 300 in this embodiment includes a decomposition module 301, a processing module 302, and a calculation module 303, wherein:
分解模块301,用于获取目标图像,并对所述目标图像进行二层小波分解,得到多维图像块;A decomposition module 301 is used to acquire a target image, and perform two-layer wavelet decomposition on the target image to obtain a multi-dimensional image block;
In this embodiment, a target image is acquired, the target image being an image that includes information to be segmented. When the target image is obtained, two-layer wavelet decomposition is performed on it. Specifically, a wavelet is generally a signal whose local features take non-zero values only within a finite interval. The first level of the wavelet decomposition divides the image into low-frequency information and high-frequency information: the high-frequency information corresponds to parts of the image where the intensity changes sharply, such as image contours, while the low-frequency information corresponds to parts where the intensity changes gently, such as large uniform color regions. On the basis of the first level, the low-frequency information is further decomposed into low-frequency and high-frequency components, which constitutes the second level of the wavelet decomposition. The two-layer wavelet decomposition of the target image can be carried out, for example, in MATLAB, thereby obtaining multi-dimensional image blocks.
处理模块302,用于获取预设的空洞卷积神经网络,其中,所述空洞卷积神经网络包括第一层网络和第二层网络,基于所述第一层网络中的编码器对所述多维图像块进行编码处理得到编码结果,基于所述第一层网络的解码器对所述编码结果进行解码处理得到所述目标图像的二值分割结果图;The processing module 302 is configured to obtain a preset atrous convolutional neural network, wherein the atrous convolutional neural network includes a first-layer network and a second-layer network, and the encoder in the first-layer network determines the The multi-dimensional image block is subjected to encoding processing to obtain an encoding result, and a decoder based on the first layer network performs decoding processing on the encoding result to obtain a binary segmentation result map of the target image;
其中,所述处理模块302包括:Wherein, the processing module 302 includes:
第一处理单元,用于将所述多维图像块依次经过所述第一卷积层、所述第一空洞卷积层和所述池化层,得到池化结果;a first processing unit, configured to sequentially pass the multi-dimensional image block through the first convolutional layer, the first atrous convolutional layer and the pooling layer to obtain a pooling result;
降拟合单元,用于通过预设降拟合层对所述池化结果进行降拟合得到所述多维图像块对应的编码结果。A down-fitting unit, configured to perform down-fitting on the pooling result through a preset down-fitting layer to obtain an encoding result corresponding to the multi-dimensional image block.
第二处理单元,用于在得到所述编码结果时,根据所述上采样层、所述第二卷积层和所述第二空洞卷积成层对所述编码结果进行计算,得到空洞卷积结果;The second processing unit is configured to, when the encoding result is obtained, calculate the encoding result according to the upsampling layer, the second convolution layer and the second hole convolution layer to obtain the hole volume product result;
第三处理单元,用于通过预设激活函数对所述空洞卷积结果进行计算,得到所述目标图像的二值分割结果图。The third processing unit is configured to calculate the hole convolution result by using a preset activation function to obtain a binary segmentation result map of the target image.
In this embodiment, when the multi-dimensional image blocks are obtained, a preset atrous convolutional neural network is acquired. The atrous convolutional neural network includes a first-layer network and a second-layer network. The first-layer network includes an encoder and a decoder: the encoder includes three first convolutional layers, three first atrous convolutional layers, and two pooling layers, and encodes the multi-dimensional image blocks; the decoder includes two upsampling layers, two second convolutional layers, and two second atrous convolutional layers, and decodes the encoding result output by the encoder to finally obtain the binary segmentation result map. The second-layer network includes a plurality of convolutional layers. The binary segmentation result map corresponding to the target image is obtained by the first-layer network, and multi-layer convolution is then performed on the binary segmentation result map by the second-layer network to obtain the semantic segmentation map corresponding to the target image.
计算模块303,用于基于所述第二层网络对所述二值分割结果图进行多层卷积计算,得到所述目标图像的语义分割结果图。The calculation module 303 is configured to perform multi-layer convolution calculation on the binary segmentation result graph based on the second-layer network to obtain a semantic segmentation result graph of the target image.
其中,所述计算模块303包括:Wherein, the computing module 303 includes:
第一约束单元,用于获取所述第一层网络的第一卷积结果,根据所述第一卷积结果对所述二值分割结果图进行掩膜约束,得到掩膜结果;a first constraining unit, configured to obtain a first convolution result of the first layer network, and perform mask constraint on the binary segmentation result graph according to the first convolution result to obtain a mask result;
第二约束单元,用于基于所述第二层网络对所述掩膜结果进行多层卷积计算,得到所述目标图像的语义分割结果图。The second constraint unit is configured to perform multi-layer convolution calculation on the mask result based on the second layer network to obtain a semantic segmentation result map of the target image.
In this embodiment, when the binary segmentation result map is obtained, multi-layer convolution is performed on it by the second-layer network to obtain the semantic segmentation map of the target image. Specifically, the second-layer network includes a third convolutional layer, a third atrous convolutional layer, and a fourth convolutional layer. When the binary segmentation result map is obtained, the first convolution result of the first-layer network is acquired, where the first convolution result is obtained by applying a further convolution to the first sub-atrous-convolution result produced by the first of the first atrous convolutions in the encoder of the first-layer network. The first convolution result and the binary segmentation result map are multiplied to obtain a multiplication result. The multiplication result is input into the third convolutional layer and, in the order of the third convolutional layer, the third atrous convolutional layer, and the fourth convolutional layer, the output of each layer is used as the input of the next layer. The final semantic segmentation result map obtained in this way is the final segmentation result of the target image.
需要强调的是,为进一步保证上述语义分割结果图信息的私密和安全性,上述语义分割结果图信息还可以存储于一区块链的节点中。It should be emphasized that, in order to further ensure the privacy and security of the above-mentioned semantic segmentation result graph information, the above-mentioned semantic segmentation result graph information may also be stored in a node of a blockchain.
The blockchain referred to in the present application is a new application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database, a chain of data blocks generated in association with one another by cryptographic methods; each data block contains the information of a batch of network transactions and is used to verify the validity of that information (anti-counterfeiting) and to generate the next block. The blockchain may include an underlying blockchain platform, a platform product service layer, an application service layer, and the like.
本实施例中提出的图像分割装置还包括:The image segmentation device proposed in this embodiment further includes:
获取模块,用于选取预设图像库中预设个数的图像为训练图像,将所述预设图像库中剩余的图像作为测试图像;an acquisition module, used for selecting a preset number of images in the preset image library as training images, and using the remaining images in the preset image library as test images;
训练模块,用于获取基础训练网络,根据所述训练图像对所述基础训练网络进行训练,得到训练后的基础训练网络;a training module for acquiring a basic training network, and training the basic training network according to the training image to obtain a trained basic training network;
a test module, configured to test the trained basic training network according to the test images, and to determine the trained basic training network to be the atrous convolutional neural network when the recognition success rate of the trained basic training network on the test images is greater than or equal to a preset success rate.
其中,所述训练模块包括:Wherein, the training module includes:
分解单元,用于分解所述训练图像为训练图像块,输入所述训练图像块至所述基础训练网络中得到训练分割图像;a decomposition unit, configured to decompose the training image into training image blocks, and input the training image blocks into the basic training network to obtain training segmentation images;
训练单元,用于获取所述训练图像的标准分割图像,根据所述训练分割图像和所述标准分割图像对所述基础训练网络进行训练,得到训练后的基础训练网络。A training unit, configured to acquire a standard segmented image of the training image, and train the basic training network according to the training segmented image and the standard segmented image to obtain a trained basic training network.
其中,所述训练单元还包括:Wherein, the training unit further includes:
获取子单元,用于获取所述训练分割图像的第一像素个数,以及所述标准分割图像的第二像素个数;an acquisition subunit for acquiring the first pixel number of the training segmentation image and the second pixel number of the standard segmentation image;
a confirmation subunit, configured to calculate the loss function of the basic training network according to the first pixel number and the second pixel number, and to determine, when the loss function converges, the basic training network to be the trained basic training network.
In this embodiment, before the multi-dimensional image blocks are processed by the preset atrous convolutional neural network, a basic training network needs to be trained to obtain the atrous convolutional neural network. Specifically, the basic training network is a model that has the same structure as the atrous convolutional neural network but different parameters. A preset number of images in a preset image library are selected in advance as training images, and the remaining images in the preset image library are used as test images. The basic training network is acquired, the training images are input into it, and its parameters are adjusted according to the training images and the standard segmentation maps corresponding to the training images, yielding the trained basic training network. Afterwards, the trained basic training network is tested with the test images: when the similarity between the recognition result of the trained basic training network for a test image and the standard segmentation map corresponding to that test image is greater than or equal to a preset threshold, the trained basic training network is determined to have recognized that test image successfully; when the recognition success rate of the trained basic training network over the test images is greater than or equal to a preset success rate, the trained basic training network is determined to be the preset atrous convolutional neural network.
The image segmentation apparatus proposed in this embodiment makes it possible to obtain more image information during image segmentation, improves the accuracy of the image signal description when local feature information is extracted, and greatly enlarges the receptive field within a controllable range of network parameters, increasing the amount of information carried by each feature, so that the segmentation of image information is more precise and the obtained image information is more complete.
为解决上述技术问题,本申请实施例还提供计算机设备。具体请参阅图4,图4为本实施例计算机设备基本结构框图。To solve the above technical problems, the embodiments of the present application also provide computer equipment. For details, please refer to FIG. 4 , which is a block diagram of a basic structure of a computer device according to this embodiment.
所述计算机设备6包括通过系统总线相互通信连接存储器61、处理器62、网络接口63。需要指出的是,图中仅示出了具有组件61-63的计算机设备6,但是应理解的是,并不要求实施所有示出的组件,可以替代的实施更多或者更少的组件。其中,本技术领域技术人员可以理解,这里的计算机设备是一种能够按照事先设定或存储的指令,自动进行数值计算和/或信息处理的设备,其硬件包括但不限于微处理器、专用集成电路(Application Specific Integrated Circuit,ASIC)、可编程门阵列(Field-Programmable Gate Array,FPGA)、数字处理器(Digital Signal Processor,DSP)、嵌入式设备等。The computer device 6 includes a memory 61 , a processor 62 , and a network interface 63 that communicate with each other through a system bus. It should be pointed out that only the computer device 6 with components 61-63 is shown in the figure, but it should be understood that it is not required to implement all of the shown components, and more or less components may be implemented instead. Among them, those skilled in the art can understand that the computer device here is a device that can automatically perform numerical calculation and/or information processing according to pre-set or stored instructions, and its hardware includes but is not limited to microprocessors, special-purpose Integrated circuit (Application Specific Integrated Circuit, ASIC), programmable gate array (Field-Programmable Gate Array, FPGA), digital processor (Digital Signal Processor, DSP), embedded equipment, etc.
所述计算机设备可以是桌上型计算机、笔记本、掌上电脑及云端服务器等计算设备。所述计算机设备可以与用户通过键盘、鼠标、遥控器、触摸板或声控设备等方式进行人机交互。The computer equipment may be a desktop computer, a notebook computer, a palmtop computer, a cloud server and other computing equipment. The computer device can perform human-computer interaction with the user through a keyboard, a mouse, a remote control, a touch pad or a voice control device.
所述存储器61至少包括一种类型的可读存储介质,所述可读存储介质包括闪存、硬盘、多媒体卡、卡型存储器(例如,SD或DX存储器等)、随机访问存储器(RAM)、静态随机访问存储器(SRAM)、只读存储器(ROM)、电可擦除可编程只读存储器(EEPROM)、可编程只读存储器(PROM)、磁性存储器、磁盘、光盘等。所述计算机可读存储介质可以是非易失性,也可以是易失性。在一些实施例中,所述存储器61可以是所述计算机设备6的内部存储单元,例如该计算机设备6的硬盘或内存。在另一些实施例中,所述存储器61也可以是所述计算机设备6的外部存储设备,例如该计算机设备6上配备的插接式硬盘,智能存储卡(Smart Media Card,SMC),安全数字(Secure Digital,SD)卡,闪存卡(Flash Card)等。当然,所述存储器61还可以既包括所述计算机设备6的内部存储单元也包括其外部存储设备。本实施例中,所述存储器61通常用于存储安装于所述计算机设备6的操作系统和各类应用软件,例如图像分割方法的计算机可读指令等。此外,所述存储器61还可以用于暂时地存储已经输出或者将要输出的各类数据。The memory 61 includes at least one type of readable storage medium, and the readable storage medium includes flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory, etc.), random access memory (RAM), static Random Access Memory (SRAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), Programmable Read Only Memory (PROM), Magnetic Memory, Magnetic Disk, Optical Disk, etc. The computer-readable storage medium may be non-volatile or volatile. In some embodiments, the memory 61 may be an internal storage unit of the computer device 6 , such as a hard disk or a memory of the computer device 6 . In other embodiments, the memory 61 may also be an external storage device of the computer device 6, such as a plug-in hard disk, a smart memory card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, flash memory card (Flash Card), etc. Of course, the memory 61 may also include both the internal storage unit of the computer device 6 and its external storage device. In this embodiment, the memory 61 is generally used to store the operating system and various application software installed on the computer device 6, such as computer-readable instructions for an image segmentation method. In addition, the memory 61 can also be used to temporarily store various types of data that have been output or will be output.
所述处理器62在一些实施例中可以是中央处理器(Central Processing Unit,CPU)、控制器、微控制器、微处理器、或其他数据处理芯片。该处理器62通常用于控制所述计算机设备6的总体操作。本实施例中,所述处理器62用于运行所述存储器61中存储的计算机可读指令或者处理数据,例如运行所述图像分割方法的计算机可读指令。In some embodiments, the processor 62 may be a central processing unit (Central Processing Unit, CPU), a controller, a microcontroller, a microprocessor, or other data processing chips. This processor 62 is typically used to control the overall operation of the computer device 6 . In this embodiment, the processor 62 is configured to execute computer-readable instructions stored in the memory 61 or process data, for example, computer-readable instructions for executing the image segmentation method.
所述网络接口63可包括无线网络接口或有线网络接口,该网络接口63通常用于在所述计算机设备6与其他电子设备之间建立通信连接。The network interface 63 may include a wireless network interface or a wired network interface, and the network interface 63 is generally used to establish a communication connection between the computer device 6 and other electronic devices.
The computer device proposed in this embodiment makes it possible to obtain more image information during image segmentation, improves the accuracy of the image signal description when local feature information is extracted, and greatly enlarges the receptive field within a controllable range of network parameters, increasing the amount of information carried by each feature, so that the segmentation of image information is more precise and the obtained image information is more complete.
本申请还提供了另一种实施方式,即提供一种计算机可读存储介质,所述计算机可读存储介质存储有计算机可读指令,所述计算机可读指令可被至少一个处理器执行,以使所述至少一个处理器执行如上述的图像分割方法的步骤。The present application also provides another embodiment, that is, to provide a computer-readable storage medium, where the computer-readable storage medium stores computer-readable instructions, and the computer-readable instructions can be executed by at least one processor to The at least one processor is caused to perform the steps of the image segmentation method as described above.
The computer-readable storage medium proposed in this embodiment makes it possible to obtain more image information during image segmentation, improves the accuracy of the image signal description when local feature information is extracted, and greatly enlarges the receptive field within a controllable range of network parameters, increasing the amount of information carried by each feature, so that the segmentation of image information is more precise and the obtained image information is more complete.
通过以上的实施方式的描述,本领域的技术人员可以清楚地了解到上述实施例方法可借助软件加必需的通用硬件平台的方式来实现,当然也可以通过硬件,但很多情况下前者是更佳的实施方式。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质(如 ROM/RAM、磁碟、光盘)中,包括若干指令用以使得一台终端设备(可以是手机,计算机,服务器,空调器,或者网络设备等)执行本申请各个实施例所述的方法。From the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus a necessary general hardware platform, and of course hardware can also be used, but in many cases the former is better implementation. Based on this understanding, the technical solutions of the present application can be embodied in the form of software products in essence or the parts that make contributions to the prior art, and the computer software products are stored in a storage medium (such as ROM/RAM, magnetic disk, CD-ROM), including several instructions to make a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) execute the methods described in the various embodiments of this application.
显然,以上所描述的实施例仅仅是本申请一部分实施例,而不是全部的实施例,附图中给出了本申请的较佳实施例,但并不限制本申请的专利范围。本申请可以以许多不同的形式来实现,相反地,提供这些实施例的目的是使对本申请的公开内容的理解更加透彻全面。尽管参照前述实施例对本申请进行了详细的说明,对于本领域的技术人员来而言,其依然可以对前述各具体实施方式所记载的技术方案进行修改,或者对其中部分技术特征进行等效替换。凡是利用本申请说明书及附图内容所做的等效结构,直接或间接运用在其他相关的技术领域,均同理在本申请专利保护范围之内。Obviously, the above-described embodiments are only a part of the embodiments of the present application, rather than all of the embodiments. The accompanying drawings show the preferred embodiments of the present application, but do not limit the patent scope of the present application. This application may be embodied in many different forms, rather these embodiments are provided so that a thorough and complete understanding of the disclosure of this application is provided. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art can still modify the technical solutions described in the foregoing specific embodiments, or perform equivalent replacements for some of the technical features. . Any equivalent structures made by using the contents of the description and drawings of this application, which are directly or indirectly used in other related technical fields, are all within the scope of protection of the patent of this application.

Claims (20)

  1. 一种图像分割方法,包括下述步骤:An image segmentation method, comprising the following steps:
    获取目标图像,并对所述目标图像进行二层小波分解,得到多维图像块;acquiring a target image, and performing two-layer wavelet decomposition on the target image to obtain a multi-dimensional image block;
    acquiring a preset atrous convolutional neural network, wherein the atrous convolutional neural network comprises a first-layer network and a second-layer network; encoding the multi-dimensional image block based on an encoder in the first-layer network to obtain an encoding result, and decoding the encoding result based on a decoder of the first-layer network to obtain a binary segmentation result map of the target image;
    基于所述第二层网络对所述二值分割结果图进行多层卷积计算,得到所述目标图像的语义分割结果图。Multi-layer convolution calculation is performed on the binary segmentation result graph based on the second-layer network to obtain a semantic segmentation result graph of the target image.
  2. The image segmentation method according to claim 1, wherein the encoder comprises a first convolutional layer, a first atrous convolutional layer, and a pooling layer, and the step of encoding the multi-dimensional image block based on the encoder in the first-layer network to obtain the encoding result specifically comprises:
    将所述多维图像块依次经过所述第一卷积层、所述第一空洞卷积层和所述池化层,得到池化结果;Passing the multi-dimensional image block through the first convolution layer, the first hole convolution layer and the pooling layer in sequence to obtain a pooling result;
    通过预设降拟合层对所述池化结果进行降拟合得到所述多维图像块对应的编码结果。The encoding result corresponding to the multi-dimensional image block is obtained by performing down-fitting on the pooling result through a preset down-fitting layer.
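One way to read the encoder of claim 2 in PyTorch terms: a plain convolution, an atrous (dilated) convolution, a pooling layer, and a fitting-reduction ("down-fitting") layer. The channel counts, kernel sizes, dilation rate, and the use of dropout as the down-fitting layer are assumptions; the claim does not fix any of them.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, in_channels: int = 4):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, 32, kernel_size=3, padding=1)       # first convolutional layer
        self.atrous1 = nn.Conv2d(32, 64, kernel_size=3, padding=2, dilation=2)  # first atrous convolutional layer
        self.pool = nn.MaxPool2d(2)                                              # pooling layer
        self.down_fit = nn.Dropout2d(p=0.5)  # assumed realisation of the "down-fitting" layer

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = torch.relu(self.conv1(x))
        x = torch.relu(self.atrous1(x))
        pooled = self.pool(x)           # pooling result
        return self.down_fit(pooled)    # encoding result

encoding = Encoder()(torch.randn(1, 4, 64, 64))
print(encoding.shape)  # torch.Size([1, 64, 32, 32])
```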
  3. The image segmentation method according to claim 1, wherein the decoder comprises an upsampling layer, a second convolutional layer and a second atrous convolutional layer, and the step of decoding the encoding result based on the decoder of the first-layer network to obtain the binary segmentation result map of the target image specifically comprises:
    when the encoding result is obtained, calculating the encoding result according to the upsampling layer, the second convolutional layer and the second atrous convolutional layer to obtain an atrous convolution result;
    calculating the atrous convolution result through a preset activation function to obtain the binary segmentation result map of the target image.
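A corresponding decoder sketch for claim 3: upsampling, a second convolution, a second atrous convolution, and a preset activation function that turns the atrous convolution result into a binary segmentation map. Bilinear upsampling, the sigmoid activation, and the 0.5 threshold are assumptions made here for illustration.

```python
import torch
import torch.nn as nn

class Decoder(nn.Module):
    def __init__(self, in_channels: int = 64):
        super().__init__()
        self.upsample = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)  # upsampling layer
        self.conv2 = nn.Conv2d(in_channels, 32, kernel_size=3, padding=1)                  # second convolutional layer
        self.atrous2 = nn.Conv2d(32, 1, kernel_size=3, padding=2, dilation=2)              # second atrous convolutional layer

    def forward(self, encoding: torch.Tensor) -> torch.Tensor:
        x = self.upsample(encoding)
        x = torch.relu(self.conv2(x))
        atrous_result = self.atrous2(x)      # atrous convolution result
        prob = torch.sigmoid(atrous_result)  # preset activation function (assumed sigmoid)
        return (prob > 0.5).float()          # binary segmentation result map

binary_map = Decoder()(torch.randn(1, 64, 32, 32))
print(binary_map.shape)  # torch.Size([1, 1, 64, 64])
```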
  4. The image segmentation method according to claim 1, wherein the step of performing multi-layer convolution calculation on the binary segmentation result map based on the second-layer network to obtain the semantic segmentation result map of the target image specifically comprises:
    obtaining a first convolution result of the first-layer network, and performing mask constraint on the binary segmentation result map according to the first convolution result to obtain a mask result;
    performing multi-layer convolution calculation on the mask result based on the second-layer network to obtain the semantic segmentation result map of the target image.
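Claim 4 constrains the binary segmentation map with the first convolution result of the first-layer network before the second-layer network runs its multi-layer convolutions. The element-wise gating used as the "mask constraint" below, the number of semantic classes, and the small three-convolution second-layer network are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def mask_constraint(first_conv_result: torch.Tensor, binary_map: torch.Tensor) -> torch.Tensor:
    """Assumed mask constraint: resize the binary map to the feature map's size
    and use it to gate the first convolution result element-wise."""
    mask = F.interpolate(binary_map, size=first_conv_result.shape[-2:], mode="nearest")
    return first_conv_result * mask

second_layer_network = nn.Sequential(            # multi-layer convolution of the second-layer network
    nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 5, kernel_size=1),              # 5 semantic classes assumed
)

first_conv_result = torch.randn(1, 32, 64, 64)               # from the first convolutional layer
binary_map = torch.randint(0, 2, (1, 1, 64, 64)).float()     # binary segmentation result map
semantic_map = second_layer_network(mask_constraint(first_conv_result, binary_map))
print(semantic_map.shape)  # torch.Size([1, 5, 64, 64])
```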
  5. The image segmentation method according to claim 1, wherein before the step of obtaining the preset atrous convolutional neural network, the method further comprises:
    selecting a preset number of images from a preset image library as training images, and using the remaining images in the preset image library as test images;
    obtaining a basic training network, and training the basic training network according to the training images to obtain a trained basic training network;
    testing the trained basic training network according to the test images, and when a recognition success rate of the trained basic training network on the test images is greater than or equal to a preset success rate, determining the trained basic training network as the atrous convolutional neural network.
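The training and testing procedure in claim 5 amounts to a fixed split of a preset image library followed by an acceptance test on recognition success rate. A hedged sketch under that reading, where the split size, the success-rate threshold, and the `train_one_network` and `segmentation_success` helpers are hypothetical placeholders (the claim does not define them):

```python
import random

def build_segmentation_network(image_library, preset_count=800, preset_success_rate=0.95):
    """Split the library, train a basic network, and accept it only if its
    success rate on the held-out test images meets the preset threshold.
    Assumes the library contains more than `preset_count` images."""
    images = list(image_library)
    random.shuffle(images)
    train_images, test_images = images[:preset_count], images[preset_count:]

    network = train_one_network(train_images)  # hypothetical training helper (see claims 6-7)

    successes = sum(segmentation_success(network, img) for img in test_images)  # hypothetical evaluator
    if successes / len(test_images) >= preset_success_rate:
        return network  # accepted as the preset atrous convolutional neural network
    raise RuntimeError("trained network did not reach the preset success rate")
```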
  6. The image segmentation method according to claim 5, wherein the step of training the basic training network according to the training images to obtain the trained basic training network specifically comprises:
    decomposing the training images into training image blocks, and inputting the training image blocks into the basic training network to obtain training segmentation images;
    obtaining standard segmentation images of the training images, and training the basic training network according to the training segmentation images and the standard segmentation images to obtain the trained basic training network.
  7. The image segmentation method according to claim 6, wherein the step of training the basic training network according to the training segmentation images and the standard segmentation images to obtain the trained basic training network specifically comprises:
    obtaining a first number of pixels of the training segmentation image and a second number of pixels of the standard segmentation image;
    calculating a loss function of the basic training network according to the first number of pixels and the second number of pixels, and when the loss function converges, determining the basic training network as the trained basic training network.
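Claim 7 derives the loss from the number of segmented pixels in the network output versus the standard (ground-truth) segmentation. The exact form of the loss is not specified in the claim; the overlap-based Dice-style loss below, built from those two pixel counts plus their intersection, is one common assumption.

```python
import torch

def pixel_count_loss(train_seg: torch.Tensor, standard_seg: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Dice-style loss from pixel counts (an assumed concrete form).

    train_seg, standard_seg: binary maps of shape (N, 1, H, W).
    """
    first_number_of_pixels = train_seg.sum()       # pixels segmented by the network
    second_number_of_pixels = standard_seg.sum()   # pixels in the standard segmentation
    overlap = (train_seg * standard_seg).sum()
    dice = (2 * overlap + eps) / (first_number_of_pixels + second_number_of_pixels + eps)
    return 1.0 - dice

# Training would stop once this loss stops decreasing, i.e. converges.
```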
  8. An image segmentation apparatus, comprising:
    a decomposition module, configured to acquire a target image and perform two-layer wavelet decomposition on the target image to obtain a multi-dimensional image block;
    a processing module, configured to obtain a preset atrous convolutional neural network, wherein the atrous convolutional neural network comprises a first-layer network and a second-layer network, encode the multi-dimensional image block based on an encoder in the first-layer network to obtain an encoding result, and decode the encoding result based on a decoder of the first-layer network to obtain a binary segmentation result map of the target image;
    a calculation module, configured to perform multi-layer convolution calculation on the binary segmentation result map based on the second-layer network to obtain a semantic segmentation result map of the target image.
  9. A computer device, comprising a memory and a processor, wherein computer-readable instructions are stored in the memory, and the processor, when executing the computer-readable instructions, further implements the following steps:
    acquiring a target image, and performing two-layer wavelet decomposition on the target image to obtain a multi-dimensional image block;
    obtaining a preset atrous convolutional neural network, wherein the atrous convolutional neural network comprises a first-layer network and a second-layer network; encoding the multi-dimensional image block based on an encoder in the first-layer network to obtain an encoding result, and decoding the encoding result based on a decoder of the first-layer network to obtain a binary segmentation result map of the target image;
    performing multi-layer convolution calculation on the binary segmentation result map based on the second-layer network to obtain a semantic segmentation result map of the target image.
  10. The computer device according to claim 9, wherein the encoder comprises a first convolutional layer, a first atrous convolutional layer and a pooling layer, and the step of encoding the multi-dimensional image block based on the encoder in the first-layer network to obtain the encoding result specifically comprises:
    passing the multi-dimensional image block through the first convolutional layer, the first atrous convolutional layer and the pooling layer in sequence to obtain a pooling result;
    performing down-fitting on the pooling result through a preset down-fitting layer to obtain the encoding result corresponding to the multi-dimensional image block.
  11. The computer device according to claim 9, wherein the decoder comprises an upsampling layer, a second convolutional layer and a second atrous convolutional layer, and the step of decoding the encoding result based on the decoder of the first-layer network to obtain the binary segmentation result map of the target image specifically comprises:
    when the encoding result is obtained, calculating the encoding result according to the upsampling layer, the second convolutional layer and the second atrous convolutional layer to obtain an atrous convolution result;
    calculating the atrous convolution result through a preset activation function to obtain the binary segmentation result map of the target image.
  12. The computer device according to claim 9, wherein the step of performing multi-layer convolution calculation on the binary segmentation result map based on the second-layer network to obtain the semantic segmentation result map of the target image specifically comprises:
    obtaining a first convolution result of the first-layer network, and performing mask constraint on the binary segmentation result map according to the first convolution result to obtain a mask result;
    performing multi-layer convolution calculation on the mask result based on the second-layer network to obtain the semantic segmentation result map of the target image.
  13. The computer device according to claim 9, wherein before the step of obtaining the preset atrous convolutional neural network, the following steps are further included:
    selecting a preset number of images from a preset image library as training images, and using the remaining images in the preset image library as test images;
    obtaining a basic training network, and training the basic training network according to the training images to obtain a trained basic training network;
    testing the trained basic training network according to the test images, and when a recognition success rate of the trained basic training network on the test images is greater than or equal to a preset success rate, determining the trained basic training network as the atrous convolutional neural network.
  14. The computer device according to claim 13, wherein the step of training the basic training network according to the training images to obtain the trained basic training network specifically comprises:
    decomposing the training images into training image blocks, and inputting the training image blocks into the basic training network to obtain training segmentation images;
    obtaining standard segmentation images of the training images, and training the basic training network according to the training segmentation images and the standard segmentation images to obtain the trained basic training network.
  15. The computer device according to claim 14, wherein the step of training the basic training network according to the training segmentation images and the standard segmentation images to obtain the trained basic training network specifically comprises:
    obtaining a first number of pixels of the training segmentation image and a second number of pixels of the standard segmentation image;
    calculating a loss function of the basic training network according to the first number of pixels and the second number of pixels, and when the loss function converges, determining the basic training network as the trained basic training network.
  16. A computer-readable storage medium, having computer-readable instructions stored thereon, wherein when the computer-readable instructions are executed by a processor, the processor further performs the following steps:
    acquiring a target image, and performing two-layer wavelet decomposition on the target image to obtain a multi-dimensional image block;
    obtaining a preset atrous convolutional neural network, wherein the atrous convolutional neural network comprises a first-layer network and a second-layer network; encoding the multi-dimensional image block based on an encoder in the first-layer network to obtain an encoding result, and decoding the encoding result based on a decoder of the first-layer network to obtain a binary segmentation result map of the target image;
    performing multi-layer convolution calculation on the binary segmentation result map based on the second-layer network to obtain a semantic segmentation result map of the target image.
  17. The computer-readable storage medium according to claim 16, wherein the encoder comprises a first convolutional layer, a first atrous convolutional layer and a pooling layer, and the step of encoding the multi-dimensional image block based on the encoder in the first-layer network to obtain the encoding result specifically comprises:
    passing the multi-dimensional image block through the first convolutional layer, the first atrous convolutional layer and the pooling layer in sequence to obtain a pooling result;
    performing down-fitting on the pooling result through a preset down-fitting layer to obtain the encoding result corresponding to the multi-dimensional image block.
  18. The computer-readable storage medium according to claim 16, wherein the decoder comprises an upsampling layer, a second convolutional layer and a second atrous convolutional layer, and the step of decoding the encoding result based on the decoder of the first-layer network to obtain the binary segmentation result map of the target image specifically comprises:
    when the encoding result is obtained, calculating the encoding result according to the upsampling layer, the second convolutional layer and the second atrous convolutional layer to obtain an atrous convolution result;
    calculating the atrous convolution result through a preset activation function to obtain the binary segmentation result map of the target image.
  19. The computer-readable storage medium according to claim 16, wherein the step of performing multi-layer convolution calculation on the binary segmentation result map based on the second-layer network to obtain the semantic segmentation result map of the target image specifically comprises:
    obtaining a first convolution result of the first-layer network, and performing mask constraint on the binary segmentation result map according to the first convolution result to obtain a mask result;
    performing multi-layer convolution calculation on the mask result based on the second-layer network to obtain the semantic segmentation result map of the target image.
  20. The computer-readable storage medium according to claim 16, wherein before the step of obtaining the preset atrous convolutional neural network, the following steps are further included:
    selecting a preset number of images from a preset image library as training images, and using the remaining images in the preset image library as test images;
    obtaining a basic training network, and training the basic training network according to the training images to obtain a trained basic training network;
    testing the trained basic training network according to the test images, and when a recognition success rate of the trained basic training network on the test images is greater than or equal to a preset success rate, determining the trained basic training network as the atrous convolutional neural network.
PCT/CN2021/090817 2020-11-17 2021-04-29 Image segmentation method and apparatus, computer device, and storage medium WO2022105125A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011288874.3A CN112396613B (en) 2020-11-17 2020-11-17 Image segmentation method, device, computer equipment and storage medium
CN202011288874.3 2020-11-17

Publications (1)

Publication Number Publication Date
WO2022105125A1 true WO2022105125A1 (en) 2022-05-27

Family

ID=74606047

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/090817 WO2022105125A1 (en) 2020-11-17 2021-04-29 Image segmentation method and apparatus, computer device, and storage medium

Country Status (2)

Country Link
CN (1) CN112396613B (en)
WO (1) WO2022105125A1 (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115082502A (en) * 2022-06-30 2022-09-20 温州医科大学 Image segmentation method based on distance-guided deep learning strategy
CN115205300A (en) * 2022-09-19 2022-10-18 华东交通大学 Fundus blood vessel image segmentation method and system based on cavity convolution and semantic fusion
CN115471765A (en) * 2022-11-02 2022-12-13 广东工业大学 Semantic segmentation method, device and equipment for aerial image and storage medium
CN115546236A (en) * 2022-11-24 2022-12-30 阿里巴巴(中国)有限公司 Image segmentation method and device based on wavelet transformation
CN115641434A (en) * 2022-12-26 2023-01-24 浙江天铂云科光电股份有限公司 Power equipment positioning method, system, terminal and storage medium
CN116824308A (en) * 2023-08-30 2023-09-29 腾讯科技(深圳)有限公司 Image segmentation model training method and related method, device, medium and equipment
CN117007606A (en) * 2023-08-17 2023-11-07 泓浒(苏州)半导体科技有限公司 Wafer grain defect detection method and system based on grain division network
CN117474925A (en) * 2023-12-28 2024-01-30 山东润通齿轮集团有限公司 Gear pitting detection method and system based on machine vision

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112396613B (en) * 2020-11-17 2024-05-10 平安科技(深圳)有限公司 Image segmentation method, device, computer equipment and storage medium
CN113112518B (en) * 2021-04-19 2024-03-26 深圳思谋信息科技有限公司 Feature extractor generation method and device based on spliced image and computer equipment
CN113191367B (en) * 2021-05-25 2022-07-29 华东师范大学 Semantic segmentation method based on dense scale dynamic network

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109859158A (en) * 2018-11-27 2019-06-07 邦鼓思电子科技(上海)有限公司 A kind of detection system, method and the machinery equipment on the working region boundary of view-based access control model
CN110197709A (en) * 2019-05-29 2019-09-03 广州瑞多思医疗科技有限公司 A kind of 3-dimensional dose prediction technique based on deep learning Yu priori plan
WO2019196633A1 (en) * 2018-04-10 2019-10-17 腾讯科技(深圳)有限公司 Training method for image semantic segmentation model and server
CN110415260A (en) * 2019-08-01 2019-11-05 西安科技大学 Smog image segmentation and recognition methods based on dictionary and BP neural network
CN112396613A (en) * 2020-11-17 2021-02-23 平安科技(深圳)有限公司 Image segmentation method and device, computer equipment and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108537292B (en) * 2018-04-10 2020-07-31 上海白泽网络科技有限公司 Semantic segmentation network training method, image semantic segmentation method and device
CN108986124A (en) * 2018-06-20 2018-12-11 天津大学 In conjunction with Analysis On Multi-scale Features convolutional neural networks retinal vascular images dividing method
CN111091576B (en) * 2020-03-19 2020-07-28 腾讯科技(深圳)有限公司 Image segmentation method, device, equipment and storage medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019196633A1 (en) * 2018-04-10 2019-10-17 腾讯科技(深圳)有限公司 Training method for image semantic segmentation model and server
CN109859158A (en) * 2018-11-27 2019-06-07 邦鼓思电子科技(上海)有限公司 A kind of detection system, method and the machinery equipment on the working region boundary of view-based access control model
CN110197709A (en) * 2019-05-29 2019-09-03 广州瑞多思医疗科技有限公司 A kind of 3-dimensional dose prediction technique based on deep learning Yu priori plan
CN110415260A (en) * 2019-08-01 2019-11-05 西安科技大学 Smog image segmentation and recognition methods based on dictionary and BP neural network
CN112396613A (en) * 2020-11-17 2021-02-23 平安科技(深圳)有限公司 Image segmentation method and device, computer equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHANG RONG , ZHAO KUNQI , GU KAI: "Road Image Semantic Segmentation Based on Convolutional Neural Network", COMPUTER & DIGITAL ENGINEERING, vol. 48, no. 7, 20 July 2020 (2020-07-20), pages 1172 - 1775+1803, XP055932177, ISSN: 1672-9722, DOI: 10.3969/j.issn.1672-9722.2020.07.043 *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115082502A (en) * 2022-06-30 2022-09-20 温州医科大学 Image segmentation method based on distance-guided deep learning strategy
CN115082502B (en) * 2022-06-30 2024-05-10 温州医科大学 Image segmentation method based on distance guidance deep learning strategy
CN115205300A (en) * 2022-09-19 2022-10-18 华东交通大学 Fundus blood vessel image segmentation method and system based on cavity convolution and semantic fusion
CN115205300B (en) * 2022-09-19 2022-12-09 华东交通大学 Fundus blood vessel image segmentation method and system based on cavity convolution and semantic fusion
CN115471765A (en) * 2022-11-02 2022-12-13 广东工业大学 Semantic segmentation method, device and equipment for aerial image and storage medium
CN115546236A (en) * 2022-11-24 2022-12-30 阿里巴巴(中国)有限公司 Image segmentation method and device based on wavelet transformation
CN115641434A (en) * 2022-12-26 2023-01-24 浙江天铂云科光电股份有限公司 Power equipment positioning method, system, terminal and storage medium
CN117007606B (en) * 2023-08-17 2024-03-08 泓浒(苏州)半导体科技有限公司 Wafer grain defect detection method and system based on grain division network
CN117007606A (en) * 2023-08-17 2023-11-07 泓浒(苏州)半导体科技有限公司 Wafer grain defect detection method and system based on grain division network
CN116824308B (en) * 2023-08-30 2024-03-22 腾讯科技(深圳)有限公司 Image segmentation model training method and related method, device, medium and equipment
CN116824308A (en) * 2023-08-30 2023-09-29 腾讯科技(深圳)有限公司 Image segmentation model training method and related method, device, medium and equipment
CN117474925A (en) * 2023-12-28 2024-01-30 山东润通齿轮集团有限公司 Gear pitting detection method and system based on machine vision
CN117474925B (en) * 2023-12-28 2024-03-15 山东润通齿轮集团有限公司 Gear pitting detection method and system based on machine vision

Also Published As

Publication number Publication date
CN112396613B (en) 2024-05-10
CN112396613A (en) 2021-02-23

Similar Documents

Publication Publication Date Title
WO2022105125A1 (en) Image segmentation method and apparatus, computer device, and storage medium
US11200424B2 (en) Space-time memory network for locating target object in video content
CN108509915B (en) Method and device for generating face recognition model
US11373390B2 (en) Generating scene graphs from digital images using external knowledge and image reconstruction
US10891465B2 (en) Methods and apparatuses for searching for target person, devices, and media
US11775574B2 (en) Method and apparatus for visual question answering, computer device and medium
CN114066902A (en) Medical image segmentation method, system and device based on convolution and transformer fusion
WO2022001623A1 (en) Image processing method and apparatus based on artificial intelligence, and device and storage medium
EP3859560A2 (en) Method and apparatus for visual question answering, computer device and medium
CN113343982B (en) Entity relation extraction method, device and equipment for multi-modal feature fusion
WO2023273628A1 (en) Video loop recognition method and apparatus, computer device, and storage medium
WO2023035531A1 (en) Super-resolution reconstruction method for text image and related device thereof
WO2023159746A1 (en) Image matting method and apparatus based on image segmentation, computer device, and medium
CN113379627A (en) Training method of image enhancement model and method for enhancing image
CN114445904A (en) Iris segmentation method, apparatus, medium, and device based on full convolution neural network
CN111104941B (en) Image direction correction method and device and electronic equipment
TWI803243B (en) Method for expanding images, computer device and storage medium
CN116796287A (en) Pre-training method, device, equipment and storage medium for graphic understanding model
WO2023173536A1 (en) Chemical formula identification method and apparatus, computer device, and storage medium
US20240161382A1 (en) Texture completion
CN115546554A (en) Sensitive image identification method, device, equipment and computer readable storage medium
CN113610856A (en) Method and device for training image segmentation model and image segmentation
CN114117037A (en) Intention recognition method, device, equipment and storage medium
CN115147434A (en) Image processing method, device, terminal equipment and computer readable storage medium
CN112071331A (en) Voice file repairing method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21893281

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21893281

Country of ref document: EP

Kind code of ref document: A1