WO2021143207A1 - Image processing method, apparatus, computing processing device, and medium - Google Patents

Image processing method, apparatus, computing processing device, and medium

Info

Publication number
WO2021143207A1
WO2021143207A1 (PCT/CN2020/118866, CN2020118866W)
Authority
WO
WIPO (PCT)
Prior art keywords
feature
feature map
processing
image
output
Prior art date
Application number
PCT/CN2020/118866
Other languages
English (en)
French (fr)
Inventor
李彦玮
宋林
黎泽明
Original Assignee
北京迈格威科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京迈格威科技有限公司
Publication of WO2021143207A1 publication Critical patent/WO2021143207A1/zh

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • This application relates to the field of image processing technology. Specifically, this application relates to an image processing method, device, computing processing device, and medium.
  • The existing network structures for image processing all use pre-defined static networks to predict on input images, and mainly fall into two types: manually designed networks and network structure search.
  • A manually designed network generally fuses feature maps from multiple levels to enrich the semantic details of the feature maps and establish the contextual relationship between them.
  • Network structure search mainly uses methods based on reinforcement learning or gradient updates to fit a fixed network structure on a data set.
  • However, the scale distribution of objects in images to be processed often varies greatly.
  • Because these image processing network structures are all fixed, they cannot accurately establish the contextual relationship between feature maps for images with such large differences in scale distribution, and thus cannot obtain accurate processing results.
  • the purpose of this application is to solve at least one of the above-mentioned technical defects.
  • an image processing method which includes:
  • The image processing model includes multiple levels of feature processing sub-networks, and each level includes feature processing nodes of different depths. For each feature processing node of each level except the last, the output feature map of that node is determined based on the gated network included in the node, so as to form a dynamic image processing model, and the processing result of the image to be processed is obtained based on the output feature maps output by the feature processing nodes in the last level of the image processing model.
  • determining the output feature map of the feature processing node based on the gated network included in the feature processing node includes:
  • the output feature map of the feature processing node is determined.
  • The usage probabilities of feature maps of each size include at least two of: the usage probability of performing resolution up-sampling on the initial output feature map, the usage probability of performing resolution-invariant processing, and the usage probability of performing resolution down-sampling.
  • determining the output feature map of the feature processing node based on the initial output feature map and the determined use probability of the feature map corresponding to each size includes:
  • feature extraction of corresponding sizes is performed on the initial output feature map to determine the output feature map of the feature processing node.
  • In this case, the step of determining the initial output feature map based on the input feature maps of the feature processing node is not executed.
  • Each feature processing node further includes a sequentially cascaded convolutional layer and residual layer, which are used to determine the initial output feature map based on the input feature maps of the feature processing node.
  • The gated network includes a neural network and an activation function layer. Inputting the input feature map of the feature processing node into the gated network to determine the usage probability of the output feature map of each size of the feature processing node includes:
  • activating, based on the activation function, the initial usage probability of the output feature map of each size, to obtain the usage probability of the output feature map of each size corresponding to the feature processing node.
  • an image processing device including:
  • the image acquisition module is used to acquire the image to be processed
  • the image processing result determination module is used to input the image to be processed into the image processing model, and obtain the image processing result of the image to be processed based on the output of the image processing model;
  • The image processing model includes multiple levels of feature processing sub-networks, and each level includes feature processing nodes of different depths. For each feature processing node of each level except the last, the output feature map of that node is determined based on the gated network included in the node, so as to form a dynamic image processing model, and the processing result of the image to be processed is obtained based on the output feature maps output by the feature processing nodes in the last level of the image processing model.
  • When the image processing model determines the output feature map of the feature processing node based on the gated network included in the feature processing node, it is specifically used for:
  • the output feature map of the feature processing node is determined.
  • The usage probabilities of feature maps of each size include at least two of: the usage probability of performing resolution up-sampling on the initial output feature map, the usage probability of performing resolution-invariant processing, and the usage probability of performing resolution down-sampling.
  • When determining the output feature map of the feature processing node based on the initial output feature map and the determined usage probabilities of the feature maps of each size, the image processing model is specifically used for:
  • feature extraction of corresponding sizes is performed on the initial output feature map to determine the output feature map of the feature processing node.
  • In this case, the step of determining the initial output feature map based on the input feature maps of the feature processing node is not executed.
  • Each feature processing node further includes a sequentially cascaded convolutional layer and residual layer, which are used to determine the initial output feature map based on the input feature maps of the feature processing node.
  • The gated network includes a neural network and an activation function layer. When the image processing model inputs the input feature map of the feature processing node into the gated network to determine the usage probability of the output feature map of each size of the feature processing node, it is specifically used for:
  • activating, based on the activation function, the initial usage probability of the output feature map of each size, to obtain the usage probability of the output feature map of each size corresponding to the feature processing node.
  • an embodiment of the present application provides a computing processing device, including:
  • a memory in which computer-readable codes are stored
  • one or more processors; when the computer-readable code is executed by the one or more processors, the computing processing device executes the image processing method according to any one of the first aspect.
  • An embodiment of the present application provides a computer program, including computer-readable code, which, when run on a computing processing device, causes the computing processing device to execute the image processing method according to any one of the first aspect.
  • an embodiment of the present application provides a computer-readable storage medium on which the computer program as described in the fourth aspect is stored.
  • The image processing model includes multiple levels of feature processing sub-networks, and each level of feature processing sub-network includes feature processing nodes of different depths; that is, the image processing model contains a large number of network structures, so that, in the process of dynamic selection according to the input image to be processed, multiple known network structures can be selected adaptively, and the model can be applied to images of different scales. Further, since each feature processing node of each level except the last includes a gated network for controlling the output feature map, unimportant feature processing nodes can be closed adaptively; different network structures can thus be simulated and the actual amount of computation controlled, ensuring that an applicable network structure is determined while the amount of computation is reduced.
  • FIG. 1 is a schematic flowchart of an image processing method provided by an embodiment of this application.
  • FIG. 2 is a schematic diagram of a part of the structure of an image processing model provided by an embodiment of the application;
  • FIG. 3 is a schematic diagram of a part of the structure of a deep feature extraction network provided by an embodiment of this application;
  • FIG. 4 is a schematic structural diagram of an image processing device provided by an embodiment of the application.
  • FIG. 5 is a schematic structural diagram of a computing processing device provided by an embodiment of this application.
  • Fig. 6 schematically shows a storage unit for holding or carrying program codes for implementing the method according to the present invention.
  • the embodiment of the present application provides an image processing method. As shown in FIG. 1, the method includes:
  • Step S101 Obtain an image to be processed.
  • the image to be processed refers to an image that needs image processing.
  • the specific type of the image to be processed is not limited in this embodiment. For example, it may be a photo taken by a photographing device or a frame image in a video.
  • Step S102 input the image to be processed into the image processing model, and obtain the image processing result of the image to be processed based on the output of the image processing model;
  • the image processing model includes multiple levels of feature processing sub-networks, each level includes feature processing nodes of different depths, for each feature processing node of each level except the last level, based on the feature processing node
  • the included gate control network determines the output feature map of the feature processing node to form a dynamic image processing model, and obtains the processing result of the image to be processed based on the output feature map output by each feature processing node in the last level of the image processing model .
  • the image processing model may be an image semantic segmentation model
  • the image processing result may be the semantic segmentation result of the image to be processed.
  • The image processing model can include a deep feature extraction network; the deep feature extraction network includes different levels (each level is a feature processing sub-network), and each level includes feature processing nodes of different depths.
  • each feature processing node of each level except the last level in the image processing model may include a gated network, and the gated network can control the output feature map of the feature processing node according to the input feature map.
  • the gated network can control the on and off of the feature processing node.
  • the image processing model is a dynamically adjustable model.
  • the image processing model also includes an initial feature extraction network.
  • The initial feature extraction network can extract the image features of the image to be processed into a high-dimensional feature space to obtain the initial feature map of the image to be processed.
  • the specific network structure of the initial feature extraction network can be pre-configured, which is not limited in the embodiment of the present application.
  • the initial feature extraction network can include a multi-scale feature extraction module and a multi-scale feature fusion module; correspondingly, when the image to be processed is input to the image processing model, the multi-scale feature extraction module in the initial feature extraction network can first extract the Process the feature maps of multiple scales of the image, and then the multi-scale feature fusion module fuses the feature maps of multiple scales to obtain the initial feature map of the image to be processed.
  • the image features of the image to be processed can be extracted into the high-dimensional feature space through the initial feature extraction network, it can be ensured that the subsequent processing of the image to be processed can be more stable.
  • Each feature processing node corresponds to one input feature map size. If the current feature processing node is a first-level node, its input is the initial feature map, whose size equals the input feature map size of the node. If the current feature processing node is at a level other than the first, its inputs are those output feature maps of the previous level's feature processing nodes whose size equals the input feature map size of the node. If the current feature processing node is a last-level node, its output feature map is the feature map obtained by fusing its input feature maps.
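  • As a rough illustration of this routing rule, the sketch below collects, for a node with a given input size, every previous-level output feature map of that size. The node names and the data layout are hypothetical, chosen only for illustration:

```python
def inputs_for_node(node_scale, prev_level_outputs):
    """Collect every previous-level output whose scale matches node_scale.

    prev_level_outputs: list of (node_name, {scale: feature_map}) pairs,
    with scales expressed as fractions of the image size (e.g. 1/8).
    """
    return [maps[node_scale]
            for _, maps in prev_level_outputs
            if node_scale in maps]

# Mirrors the example in the text: two second-level nodes; the current
# third-level node has input feature map size 1/8 of the image.
prev = [
    ("node1", {1: "f1_full", 1/8: "f1_eighth"}),
    ("node2", {1/4: "f2_quarter", 1/8: "f2_eighth", 1/16: "f2_sixteenth"}),
]
print(inputs_for_node(1/8, prev))  # both nodes contribute their 1/8-scale map
```

Under this rule a node simply ignores previous-level outputs whose size does not match its own input size.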
  • the initial feature map of the image to be processed is extracted through the initial feature extraction network
  • The initial feature map can be further processed by the deep feature extraction network in the image processing model to obtain deep features.
  • the input feature map of the feature processing node at each level in the deep feature extraction network corresponds to one size, and the input feature map size corresponding to different feature processing nodes may be the same or different.
  • The input feature maps or output feature maps of feature processing nodes at different levels differ. If the current feature processing node is a first-level feature processing node, its input feature map is the initial feature map, whose size equals the input feature map size corresponding to the node. If the current feature processing node is at a level other than the first, its inputs are those output feature maps of the previous level's feature processing nodes whose size equals the input feature map size corresponding to the node. In addition, if the current feature processing node is a last-level feature processing node, it does not include a gated network, and its output feature map is the feature map obtained by fusing its input feature maps.
  • For example, suppose the current feature processing node belongs to the third-level feature processing sub-network,
  • and its corresponding input feature map size is 1/8 of the image to be processed.
  • The previous level is the second-level feature processing sub-network,
  • which includes feature processing node 1 and feature processing node 2.
  • The sizes of the output feature maps of feature processing node 1 are the size of the image to be processed and 1/8 of the image to be processed,
  • and the sizes of the output feature maps of feature processing node 2 are 1/4, 1/8, and 1/16 of the image to be processed.
  • The input feature maps of the current feature processing node are therefore the output feature maps of size 1/8 output by feature processing node 1 and feature processing node 2.
  • the input of the feature processing node of the first level is the initial feature map
  • The output of each feature processing node of each level except the last consists of feature maps of at least two different sizes.
  • the processing result of the image to be processed is obtained, including:
  • the semantic segmentation result of the image to be processed is obtained and output.
  • The image processing model may also include a processing result output module, which is used to obtain the processing result of the image to be processed based on the output feature maps output by the feature processing nodes of the last level. In other words, after the output feature maps of the last level's feature processing nodes are obtained, the processing result of the image to be processed can be determined.
  • the processing result output module includes a feature fusion module and a semantic segmentation result output module that are sequentially cascaded.
  • the feature fusion module included at this time can fuse the output feature maps of each feature processing node at the last level to obtain a fusion feature map with a size equal to the size of the image to be processed, and then through the semantic segmentation result output module based on the fusion feature map, Obtain the semantic segmentation result of the image to be processed, and output the obtained semantic segmentation result.
  • the specific implementation manner of fusing the output feature maps of the feature processing nodes of the last level is not limited in the embodiment of this application.
  • The output feature maps of each size are fused and up-sampled in resolution until a fused feature map whose size equals the size of the image to be processed is obtained.
  • For example, suppose the sizes of the output feature maps of the last-level feature processing nodes are 1/8, 1/4, and 1/2 of the size of the image to be processed.
  • The output feature map whose size is 1/8 of the image can be up-sampled to obtain a feature map whose size is 1/4 of the image; this map is fused with the original output feature map of size 1/4 to obtain the first fused feature map. The first fused feature map is then up-sampled in resolution to obtain a feature map whose size is 1/2 of the image, which is fused with the original output feature map of size 1/2 to obtain the second fused feature map. Finally, the second fused feature map is up-sampled to obtain the fused feature map whose size equals the size of the image to be processed.
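  • The progressive fusion in this example can be sketched as below. This is a hedged toy version: nearest-neighbour up-sampling and element-wise addition stand in for the up-sampling and fusion operations, which the text does not specify, and tiny nested lists stand in for feature maps (a 1x1 map for the 1/8 scale, so the full image is 8x8):

```python
def upsample2x(fm):
    """Nearest-neighbour 2x up-sampling of a 2-D map (list of lists)."""
    out = []
    for row in fm:
        wide = [v for v in row for _ in range(2)]  # duplicate each column
        out.append(wide)
        out.append(list(wide))                     # duplicate each row
    return out

def fuse(a, b):
    """Element-wise addition of two same-sized maps (assumed fusion op)."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def progressive_fuse(maps_small_to_large):
    """Maps ordered smallest first, each exactly twice the previous size."""
    fused = maps_small_to_large[0]
    for nxt in maps_small_to_large[1:]:
        fused = fuse(upsample2x(fused), nxt)
    return upsample2x(fused)  # final up-sample to the image size (1/2 -> 1)

eighth  = [[1]]                        # stands in for the 1/8-size map
quarter = [[1, 1], [1, 1]]             # stands in for the 1/4-size map
half    = [[0] * 4 for _ in range(4)]  # stands in for the 1/2-size map
print(progressive_fuse([eighth, quarter, half]))  # an 8x8 map filled with 2s
```

Each iteration matches one fusion step of the example: up-sample the current fused map, add the next larger output map, and repeat until the image size is reached.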
  • The image processing model includes multiple levels of feature processing sub-networks, and each level of feature processing sub-network includes feature processing nodes of different depths; that is, the image processing model contains a large number of network structures, so that, in the process of dynamic selection according to the input image to be processed, multiple known network structures can be selected adaptively, and the model can be applied to images of different scales. Further, since each feature processing node of each level except the last includes a gated network for controlling the output feature map, unimportant feature processing nodes can be closed adaptively; different network structures can thus be simulated and the actual amount of computation controlled, ensuring that an applicable network structure is determined while the amount of computation is reduced.
  • the output feature map of the feature processing node is determined based on the gated network included in the feature processing node, including:
  • the output feature map of the feature processing node is determined.
  • each feature processing node of each level except the last level is referred to as a target feature processing node.
  • The feature extraction module included in the target feature processing node can determine the initial output feature map according to the input feature maps, and the included feature selection module (i.e., the gated network) can determine, according to the input feature maps, the usage probability of the output feature map of each size corresponding to the target feature processing node. The included feature output module can then perform feature extraction on the initial output feature map based on the usage probability of the output feature map of each size, obtaining the output feature map of each size corresponding to the target feature processing node.
  • The usage probability refers to the probability that the feature map of the corresponding size is used.
  • The greater the usage probability, the more likely the feature map of the corresponding size is to be used, and vice versa.
  • The usage probabilities of feature maps of each size include at least two of: the usage probability of performing resolution up-sampling on the initial output feature map, the usage probability of performing resolution-invariant processing, and the usage probability of performing resolution down-sampling.
  • The initial output feature map can undergo resolution up-sampling, resolution-invariant processing, or resolution down-sampling, etc.;
  • the gated network is used to determine the usage probability of performing resolution up-sampling, resolution-invariant processing, or resolution down-sampling on the initial output feature map.
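  • For illustration, the three candidate operations might look like the sketch below. Nearest-neighbour up-sampling and 2x2 average pooling are assumptions of mine; the text does not fix the resampling operators:

```python
def upsample2x(fm):
    """Resolution up-sampling: duplicate rows and columns (nearest-neighbour)."""
    out = []
    for row in fm:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def keep_resolution(fm):
    """Resolution-invariant branch: the size is unchanged."""
    return [list(row) for row in fm]

def downsample2x(fm):
    """Resolution down-sampling: 2x2 average pooling."""
    return [[(fm[i][j] + fm[i][j + 1] + fm[i + 1][j] + fm[i + 1][j + 1]) / 4
             for j in range(0, len(fm[0]), 2)]
            for i in range(0, len(fm), 2)]

fm = [[1.0, 3.0], [5.0, 7.0]]
print(len(upsample2x(fm)), len(keep_resolution(fm)), len(downsample2x(fm)))  # 4 2 1
```

A node's gated network then decides which subset of these three branches actually runs on the initial output feature map.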
  • determining the output feature map of the feature processing node includes:
  • feature extraction of corresponding sizes is performed on the initial output feature map to determine the output feature map of the feature processing node.
  • Feature extraction methods whose usage probability is less than the set threshold can be filtered out; that is, a feature extraction method whose usage probability is less than the set threshold is not executed. In other words, in the embodiment of the present application, the methods used to perform feature extraction on the initial output feature map can be determined according to the usage probability of the feature map of each size.
  • the specific value of the threshold can be preset, which is not limited in the embodiment of the present application.
  • The set threshold may be set to 0; that is, if a usage probability is 0, the feature extraction method corresponding to that usage probability is not executed.
  • For example, suppose the target feature processing node determines through the gated network that the usage probability corresponding to up-sampling is 0.5, the usage probability corresponding to resolution-invariant processing is 0.6, and the usage probability corresponding to resolution down-sampling is 0, with the set threshold being 0. Since the usage probabilities corresponding to up-sampling (0.5) and resolution-invariant processing (0.6) are greater than the set threshold, the target feature processing node performs resolution up-sampling and resolution-invariant processing on the initial output feature map, but does not perform resolution down-sampling on it.
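  • The selection rule in this example can be sketched as a one-line filter; the branch names below are descriptive labels of my own, not terms from the claims:

```python
def active_branches(probs, threshold=0.0):
    """Return the resampling branches whose usage probability exceeds the threshold."""
    return [name for name, p in probs.items() if p > threshold]

# The example above: up-sampling 0.5, resolution-invariant 0.6, down-sampling 0,
# with the set threshold 0.
probs = {"upsample": 0.5, "keep_resolution": 0.6, "downsample": 0.0}
print(active_branches(probs))  # the down-sampling branch (probability 0) is skipped
```

Only the surviving branches are executed on the initial output feature map, which is what lets the model skip computation for unneeded output sizes.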
  • If every usage probability is less than the set threshold, the step of determining the initial output feature map based on the input feature maps of the feature processing node is not executed;
  • that is, the target feature processing node may not be executed at all.
  • The usage probabilities of the output feature maps of each target feature processing node can be determined through the gated network in each target feature processing node, so that target feature processing nodes that require a large amount of computation but contribute little to the final result can be dynamically pruned. In this way, when there is a constraint on the amount of computation, the network structure can be selected dynamically to reduce the amount of computation.
  • Each feature processing node further includes a sequentially cascaded convolutional layer and residual layer, which are used to determine the initial output feature map based on the input feature maps of the feature processing node.
  • Each feature processing node in the image processing model also includes a sequentially cascaded convolution (SepConv) layer and residual (Residual) layer, which determine the initial output feature map based on the input feature maps. It should be noted that, for a feature processing node in the last level, the initial output feature map determined by the convolutional layer and residual layer is the final output feature map.
  • The input feature maps can first be fused to obtain a fused feature map, which is then input to the sequentially cascaded convolutional layer and residual layer; alternatively, each input feature map can be input directly to the sequentially cascaded convolutional layer and residual layer, which first fuse the input feature maps to obtain the fused feature map and then determine the initial output feature map based on it.
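  • A toy sketch of the first variant (fuse first, then convolution plus residual connection) is given below. The 0.5-scaling stand-in for the SepConv layer and the element-wise-sum fusion are assumptions for illustration only; the patent names the layer types but not their parameters:

```python
def toy_sepconv(fm):
    """Placeholder for the separable convolution: scale every value by 0.5."""
    return [[0.5 * v for v in row] for row in fm]

def node_initial_output(input_maps):
    """Fuse same-sized input maps, then apply conv with a residual connection."""
    # Fusion first: element-wise sum across the input feature maps.
    fused = [[sum(vals) for vals in zip(*rows)] for rows in zip(*input_maps)]
    # Then conv + residual: output = fused + SepConv(fused).
    conv = toy_sepconv(fused)
    return [[f + c for f, c in zip(fr, cr)] for fr, cr in zip(fused, conv)]

a = [[1.0, 2.0]]
b = [[3.0, 4.0]]
print(node_initial_output([a, b]))  # fused [[4, 6]] plus 0.5 * fused -> [[6.0, 9.0]]
```

The second variant described above differs only in where the fusion happens (inside the cascaded layers rather than before them); the computed initial output feature map is the same.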
  • The gated network includes a neural network and an activation function layer. Inputting the input feature map of the feature processing node into the gated network to determine the usage probability of the output feature map of each size of the feature processing node includes:
  • activating the initial usage probability of the output feature map of each size to obtain the usage probability of the output feature map of each size corresponding to the feature processing node.
  • the gated network can be a lightweight gated network, which can include a convolutional neural network and an activation function layer.
  • The convolutional neural network can map the input feature map to a hidden space and output an activation value corresponding to the output feature map of each size; the activation function layer then activates each activation value to obtain the usage probability of the output feature map of each size.
  • Each usage probability is limited to [0, 1];
  • the activation function layer can be max(0, tanh(x)), where x is the activation value.
  • When training the image processing model, each usage probability output by the gated network in each target feature processing node can be multiplied by the feature values in the corresponding output feature map, so that the feature processing node and the gated network included in it are trained together end to end.
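  • This gating computation can be sketched as follows, combining the max(0, tanh(x)) activation mentioned above with the probability-times-features multiplication used during training. The scalar activation input is a simplification of the gated network's actual per-size outputs:

```python
import math

def gate_probability(x):
    """Activation function from the text: max(0, tanh(x)), giving a value in [0, 1)."""
    return max(0.0, math.tanh(x))

def gated_output(prob, feature_map):
    """Scale every feature value by the gate's usage probability.

    Multiplying (rather than hard-masking) keeps the gate differentiable,
    which is what allows the end-to-end training described above.
    """
    return [[prob * v for v in row] for row in feature_map]

assert gate_probability(-2.0) == 0.0  # negative activations gate the branch off
p = gate_probability(1.0)             # tanh(1) is roughly 0.76
print(gated_output(p, [[1.0, 2.0]]))
```

At inference time, a probability of exactly 0 means the corresponding branch can be skipped entirely, which matches the thresholding behaviour described earlier.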
  • An embodiment of the present application provides a schematic structural diagram of an image processing model, and the following description is made with reference to this structural diagram.
  • The numbers under Scale in Figure 2, i.e., 1, 1/4, ..., 1/64, represent the size of the feature map (i.e., different depths); for example, 1 means the feature map size is the size of the image to be processed, 1/4 means the feature map size is 1/4 of the image to be processed, and so on.
  • the image semantic segmentation model may include an initial feature extraction network, a deep feature extraction network, and a processing result output module.
  • The deep feature extraction network includes L+1 levels of feature processing sub-networks (each l in the figure is one level), and each feature processing sub-network includes several feature processing nodes (shown as the dots in the deep feature extraction network in the figure), each of which corresponds to one input feature map size.
  • The corresponding input feature maps are the initial feature map whose size is 1/4 of the image to be processed and the initial feature map whose size is 1/8 of the image to be processed.
  • the image to be processed can be input to the initial feature extraction network through the Input shown in the figure.
  • The multi-scale feature extraction module included in the initial feature extraction network (that is, the STEM in the figure) extracts feature maps of multiple scales of the image to be processed, and the included multi-scale feature fusion module then fuses the obtained feature maps of multiple scales to obtain the initial feature map.
  • The initial feature map is input to the first-level feature processing nodes in the deep feature extraction network. The first-level feature processing nodes perform resolution-invariant processing (as shown by the horizontal arrow in Figure 2) and resolution down-sampling (as shown by the arrow pointing to the lower right in Figure 2) on the initial feature map to obtain output feature maps of different sizes; each output feature map is then input, according to its size, to the corresponding feature processing node in the second level, and so on, until the feature processing nodes in the last level.
  • one feature processing node is taken as an example to describe the processing procedure of each feature processing node of each level except the last level.
  • the internal structure of the feature processing node is shown in Figure 3, specifically:
  • the previous level contains three feature processing nodes whose output feature maps have sizes equal to this node's input feature map size (shown in area C in the figure); the input feature maps of this node can then be fused to obtain a fused feature map (shown at A in the figure); further, the sequentially cascaded convolutional layer and residual layer (shown as SepConv and Identity in the cell part of the figure) determine the initial output feature map based on the fused feature map, and the gated network (Gate in Figure 2) determines, based on the fused feature map, the use probability of resolution upsampling, the use probability of resolution-preserving processing, and the use probability of resolution downsampling; further, assuming all three use probabilities are greater than the set threshold, the initial output feature map can undergo resolution upsampling (shown by the arrow pointing to the upper right in Figure 3), resolution-preserving processing (shown by the horizontal arrow in Figure 3), and resolution downsampling (shown by the arrow pointing to the lower right in Figure 3), yielding output feature maps of three different sizes.
  • the processing result output module fuses the output feature maps of the last level's feature processing nodes using resolution upsampling (Upsample in Figure 2) to obtain a fused feature map whose size equals that of the image to be processed, and obtains and outputs the semantic segmentation result of the image to be processed based on this fused feature map (Output in Figure 2).
  • the image processing model includes a path selection space over feature processing nodes at multiple scales, so that the designed path selection covers most existing static network structures and can efficiently extract features at multiple scales.
  • the image processing model includes feature processing nodes of multiple scales, which are mainly used to aggregate multi-scale features and perform subsequent propagation path selection.
  • a gated network can be used to control each feature processing node on and off.
  • during training, the loss function is used to constrain the gated networks to dynamically delete feature processing nodes that are computationally expensive but contribute little to the final result; that is, whether to use a node for feature aggregation can be decided dynamically according to the input image, achieving dynamic selection of the network structure under computational constraints.
  • an embodiment of the present application provides an image processing device.
  • the image processing device 60 may include: an image acquisition module 601 and an image processing result determination module 602, wherein:
  • the image acquisition module 601 is used to acquire an image to be processed
  • the image processing result determining module 602 is configured to input the image to be processed into the image processing model, and obtain the image processing result of the image to be processed based on the output of the image processing model;
  • the image processing model includes multiple levels of feature processing sub-networks, and each level includes feature processing nodes of different depths; for each feature processing node of each level except the last level, the gated network included in the feature processing node determines the node's output feature map, so as to form a dynamic image processing model, and the processing result of the image to be processed is obtained based on the output feature maps output by the feature processing nodes at the last level of the image processing model.
  • when determining the output feature map of a feature processing node based on the gated network included in that node, the image processing model is specifically configured to: determine an initial output feature map based on the node's input feature map; input the input feature map to the gated network to determine the use probabilities of the output feature maps of each size; and determine the node's output feature map based on the initial output feature map and the determined use probabilities.
  • the use probabilities of the feature maps of each size include at least two of: the use probability of upsampling the initial feature map, the use probability of resolution-preserving processing, and the use probability of resolution downsampling.
  • when determining the output feature map of the feature processing node based on the initial output feature map and the determined use probabilities of the feature maps corresponding to each size, the image processing model is specifically configured to: for each use probability greater than the set threshold, perform feature extraction at the corresponding size on the initial output feature map to determine the node's output feature map.
  • if none of the use probabilities of the output feature maps of each size corresponding to the feature processing node is greater than the set threshold, the step of determining the initial output feature map based on the node's input feature map is not executed.
  • each feature processing node further includes a sequentially cascaded convolutional layer and residual layer, which are used to determine the initial output feature map based on the node's input feature map.
  • the gated network includes a neural network and an activation function layer; when inputting the node's input feature map to the gated network to determine the use probabilities of the output feature maps of each size, the image processing model is specifically configured to: determine, based on the neural network included in the gated network, the initial use probabilities of the output feature maps of each size; and activate the initial use probabilities based on the activation function to obtain the use probabilities of the output feature maps of each size corresponding to the feature processing node.
  • the device embodiments described above are merely illustrative.
  • the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units.
  • Some or all of the modules can be selected according to actual needs to achieve the objectives of the solutions of the embodiments. Those of ordinary skill in the art can understand and implement it without creative work.
  • the various component embodiments of the present application may be implemented by hardware, or by software modules running on one or more processors, or by a combination of them.
  • a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components in the computing processing device according to the embodiments of the present application.
  • This application can also be implemented as a device or device program (for example, a computer program and a computer program product) for executing part or all of the methods described herein.
  • Such a program for implementing the present application may be stored on a computer-readable medium, or may have the form of one or more signals.
  • Such a signal can be downloaded from an Internet website, or provided on a carrier signal, or provided in any other form.
  • an embodiment of the present application provides a computing processing device.
  • the computing processing device 2000 shown in FIG. 5 includes a processor 2001 and a memory 2003.
  • the processor 2001 and the memory 2003 are connected, such as by a bus 2002.
  • the computing processing device 2000 may further include a transceiver 2004. It should be noted that in actual applications, the transceiver 2004 is not limited to one, and the structure of the computing processing device 2000 does not constitute a limitation to the embodiment of the present application.
  • the processor 2001 is applied in the embodiments of the present application, and is used to implement the functions of the modules shown in FIG. 4.
  • the processor 2001 may be a CPU, a general-purpose processor, DSP, ASIC, FPGA, or other programmable logic devices, transistor logic devices, hardware components, or any combination thereof. It can implement or execute various exemplary logical blocks, modules, and circuits described in conjunction with the disclosure of this application.
  • the processor 2001 may also be a combination implementing computing functions, e.g., a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
  • the bus 2002 may include a path for transferring information between the above-mentioned components.
  • the bus 2002 may be a PCI bus, an EISA bus, or the like.
  • the bus 2002 can be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is shown in FIG. 5, but this does not mean that there is only one bus or one type of bus.
  • the memory 2003 may be a ROM or another type of static storage device capable of storing static information and instructions, a RAM or another type of dynamic storage device capable of storing information and instructions, an EEPROM, a CD-ROM or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but it is not limited thereto.
  • the memory 2003 is used to store application program codes for executing the solutions of the present application, and is controlled by the processor 2001 to execute.
  • the memory 2003 has a storage space 2005 for executing program codes of any method steps in the above-mentioned method.
  • the storage space 2005 for program codes may include various program codes 2006 respectively used to implement various steps in the above method.
  • These program codes can be read from or written into one or more computer program products.
  • These computer program products include program code carriers such as hard disks, compact disks (CDs), memory cards, or floppy disks.
  • Such a computer program product is usually a portable or fixed storage unit as described with reference to FIG. 6.
  • the storage unit may have storage segments, storage spaces, etc. arranged similarly to the storage 2003 in the computing processing device of FIG. 5.
  • the program code may, for example, be compressed in an appropriate form.
  • the storage unit includes computer-readable code 2006', i.e., code that can be read by a processor such as 2001; when run by a computing processing device, this code causes the computing processing device to perform the steps of the method described above.
  • the embodiment of the present application provides a computer-readable storage medium, which is used to store computer instructions.
  • when the computer instructions are executed on a computer, the computer can execute the image processing method.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

An image processing method and apparatus, a computing processing device, and a medium, including: acquiring an image to be processed (S101); inputting the image to be processed into an image processing model, and obtaining an image processing result of the image to be processed based on the output of the image processing model (S102). The image processing model includes multiple levels of feature processing sub-networks, and each level includes feature processing nodes of different depths; for each feature processing node of each level except the last, the output feature map of the node is determined based on the gated network included in that node, so as to form a dynamic image processing model, and the processing result of the image to be processed is obtained based on the output feature maps output by the feature processing nodes at the last level of the image processing model. The above method can adaptively select known network structures, is applicable to images with different scale distributions, and controls the actual computational cost, reducing the amount of computation.

Description

Image processing method and apparatus, computing processing device, and medium
This application claims priority to the Chinese patent application filed with the China National Intellectual Property Administration on January 16, 2020, with application number 202010058004.0 and the title "Image processing method and apparatus, electronic device, and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the technical field of image processing, and in particular to an image processing method and apparatus, a computing processing device, and a medium.
Background
Existing network structures for image processing all use a predefined static network to predict on the input image, and mainly fall into two categories: hand-designed networks and network architecture search. Hand-designed networks generally fuse feature maps at multiple levels to enrich the semantic and detail information of the feature maps and establish contextual relationships between them. Network architecture search mainly uses methods based on reinforcement learning or gradient updates to fit a fixed network structure on a single dataset.
In practical applications, however, the scale distribution of images to be processed often varies greatly; for example, a single picture may contain both foreground objects occupying a very small proportion of the image and background regions occupying most of it. Because the image processing network structures in the prior art are all fixed, they cannot accurately establish contextual relationships between feature maps for images with such widely varying scale distributions, and therefore cannot produce accurate processing results.
Summary
The purpose of this application is to solve at least one of the above technical deficiencies.
In a first aspect, an embodiment of the present application provides an image processing method, the method including:
acquiring an image to be processed;
inputting the image to be processed into an image processing model, and obtaining an image processing result of the image to be processed based on the output of the image processing model;
wherein the image processing model includes multiple levels of feature processing sub-networks, and each level includes feature processing nodes of different depths; for each feature processing node of each level except the last level, the output feature map of the feature processing node is determined based on the gated network included in that node, so as to form a dynamic image processing model, and the processing result of the image to be processed is obtained based on the output feature maps output by the feature processing nodes at the last level of the image processing model.
In an optional embodiment of the first aspect, for each feature processing node of each level except the last level, determining the output feature map of the feature processing node based on the gated network included in that node includes:
determining an initial output feature map based on the input feature map of the feature processing node;
inputting the input feature map of the feature processing node to the gated network to determine the use probabilities of the output feature maps of each size corresponding to the feature processing node;
determining the output feature map of the feature processing node based on the initial output feature map and the determined use probabilities of the feature maps corresponding to each size.
In an optional embodiment of the first aspect, the use probabilities of the feature maps of each size include at least two of: the use probability of upsampling the initial feature map, the use probability of resolution-preserving processing, and the use probability of resolution downsampling.
In an optional embodiment of the first aspect, determining the output feature map of the feature processing node based on the initial output feature map and the determined use probabilities of the feature maps corresponding to each size includes:
for each use probability greater than a set threshold, performing feature extraction at the corresponding size on the initial output feature map to determine the output feature map of the feature processing node.
In an optional embodiment of the first aspect, if none of the use probabilities of the output feature maps of each size corresponding to the feature processing node is greater than the set threshold, the step of determining an initial output feature map based on the input feature map of the feature processing node is not executed.
In an optional embodiment of the first aspect, each feature processing node further includes a sequentially cascaded convolutional layer and residual layer, which are used to determine the initial output feature map based on the input feature map of the feature processing node.
In an optional embodiment of the first aspect, the gated network includes a neural network and an activation function layer, and inputting the input feature map of the feature processing node to the gated network to determine the use probabilities of the output feature maps of each size corresponding to the feature processing node includes:
determining, based on the neural network included in the gated network, initial use probabilities of the output feature maps of each size corresponding to the feature processing node;
activating the initial use probabilities of the output feature maps of each size based on the activation function to obtain the use probabilities of the output feature maps of each size corresponding to the feature processing node.
In a second aspect, an embodiment of the present application provides an image processing apparatus, including:
an image acquisition module, configured to acquire an image to be processed;
an image processing result determination module, configured to input the image to be processed into an image processing model and obtain an image processing result of the image to be processed based on the output of the image processing model;
wherein the image processing model includes multiple levels of feature processing sub-networks, and each level includes feature processing nodes of different depths; for each feature processing node of each level except the last level, the output feature map of the feature processing node is determined based on the gated network included in that node, so as to form a dynamic image processing model, and the processing result of the image to be processed is obtained based on the output feature maps output by the feature processing nodes at the last level of the image processing model.
In an optional embodiment of the second aspect, for each feature processing node of each level except the last level, when determining the output feature map of the feature processing node based on the gated network included in that node, the image processing model is specifically configured to:
determine an initial output feature map based on the input feature map of the feature processing node;
input the input feature map of the feature processing node to the gated network to determine the use probabilities of the output feature maps of each size corresponding to the feature processing node;
determine the output feature map of the feature processing node based on the initial output feature map and the determined use probabilities of the feature maps corresponding to each size.
In an optional embodiment of the second aspect, the use probabilities of the feature maps of each size include at least two of: the use probability of upsampling the initial feature map, the use probability of resolution-preserving processing, and the use probability of resolution downsampling.
In an optional embodiment of the second aspect, when determining the output feature map of the feature processing node based on the initial output feature map and the determined use probabilities of the feature maps corresponding to each size, the image processing model is specifically configured to:
for each use probability greater than a set threshold, perform feature extraction at the corresponding size on the initial output feature map to determine the output feature map of the feature processing node.
In an optional embodiment of the second aspect, if none of the use probabilities of the output feature maps of each size corresponding to the feature processing node is greater than the set threshold, the step of determining an initial output feature map based on the input feature map of the feature processing node is not executed.
In an optional embodiment of the second aspect, each feature processing node further includes a sequentially cascaded convolutional layer and residual layer, which are used to determine the initial output feature map based on the input feature map of the feature processing node.
In an optional embodiment of the second aspect, the gated network includes a neural network and an activation function layer; when inputting the input feature map of the feature processing node to the gated network to determine the use probabilities of the output feature maps of each size corresponding to the feature processing node, the image processing model is specifically configured to:
determine, based on the neural network included in the gated network, initial use probabilities of the output feature maps of each size corresponding to the feature processing node;
activate the initial use probabilities of the output feature maps of each size based on the activation function to obtain the use probabilities of the output feature maps of each size corresponding to the feature processing node.
In a third aspect, an embodiment of the present application provides a computing processing device, including:
a memory in which computer-readable code is stored;
one or more processors; when the computer-readable code is executed by the one or more processors, the computing processing device performs the image processing method according to any one of the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer program, including computer-readable code which, when run on a computing processing device, causes the computing processing device to perform the image processing method according to any one of the first aspect.
In a fifth aspect, an embodiment of the present application provides a computer-readable storage medium on which the computer program according to the fourth aspect is stored.
The beneficial effects of the technical solutions provided by the embodiments of this application are as follows:
In the embodiments of this application, because the image processing model includes multiple levels of feature processing sub-networks, and each level's sub-network further includes feature processing nodes of different depths, the model contains a large number of network structures; in the process of dynamic selection according to the input image to be processed, multiple known network structures can be adaptively selected, so the model is applicable to images with different scale distributions. Further, because each feature processing node of each level except the last includes a gated network for controlling the output feature map, unimportant feature processing nodes can be adaptively switched off, so that different network structures can be fitted and the actual computational cost can be controlled; this both guarantees that a suitable network structure is determined and reduces the amount of computation.
The above description is only an overview of the technical solutions of this application. In order to understand the technical means of this application more clearly so that they can be implemented according to the contents of the specification, and to make the above and other objects, features and advantages of this application more obvious and understandable, specific embodiments of this application are set forth below.
Brief Description of the Drawings
In order to explain the technical solutions in the embodiments of this application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of this application; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative work.
FIG. 1 is a schematic flowchart of an image processing method provided by an embodiment of this application;
FIG. 2 is a schematic diagram of part of the structure of an image processing model provided by an embodiment of this application;
FIG. 3 is a schematic diagram of part of the structure of a deep feature extraction network provided by an embodiment of this application;
FIG. 4 is a schematic structural diagram of an image processing apparatus provided by an embodiment of this application;
FIG. 5 is a schematic structural diagram of a computing processing device provided by an embodiment of this application;
FIG. 6 schematically shows a storage unit for holding or carrying program code implementing the method according to the present invention.
Detailed Description of the Embodiments
Embodiments of this application are described in detail below. Examples of the embodiments are shown in the drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements with the same or similar functions. The embodiments described below with reference to the drawings are exemplary and are only used to explain this application; they should not be construed as limiting the present invention.
Those skilled in the art will understand that, unless specifically stated, the singular forms "a", "an", "the" and "said" used herein may also include plural forms. It should be further understood that the word "comprise" used in the specification of this application refers to the presence of the stated features, integers, steps, operations, elements and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof. It should be understood that when an element is said to be "connected" or "coupled" to another element, it may be directly connected or coupled to the other element, or intervening elements may also be present. In addition, "connected" or "coupled" as used herein may include wireless connection or wireless coupling. The term "and/or" as used herein includes all or any units and all combinations of one or more of the associated listed items.
To make the objects, technical solutions and advantages of this application clearer, the embodiments of this application are described in further detail below with reference to the drawings.
The technical solutions of this application and how they solve the above technical problems are described in detail below with specific embodiments. The following specific embodiments may be combined with each other, and the same or similar concepts or processes may not be repeated in some embodiments. The embodiments of this application are described below with reference to the drawings.
An embodiment of this application provides an image processing method. As shown in FIG. 1, the method includes:
Step S101: acquire an image to be processed.
The image to be processed refers to an image that needs image processing. The specific type of the image to be processed is not limited in the embodiments of this application; for example, it may be a photograph taken by a camera device, or a frame image in a video, etc.
Step S102: input the image to be processed into an image processing model, and obtain the image processing result of the image to be processed based on the output of the image processing model.
The image processing model includes multiple levels of feature processing sub-networks, and each level includes feature processing nodes of different depths; for each feature processing node of each level except the last level, the output feature map of the feature processing node is determined based on the gated network included in that node, so as to form a dynamic image processing model, and the processing result of the image to be processed is obtained based on the output feature maps output by the feature processing nodes at the last level of the image processing model.
In an optional embodiment of this application, the image processing model may be an image semantic segmentation model, and the image processing result may be the semantic segmentation result of the image to be processed.
In practical applications, the image processing model may include a deep feature extraction network, which includes different levels (each level is one feature processing sub-network), and each level includes feature processing nodes of different depths. In addition, each feature processing node of each level except the last may include a gated network, which can control the node's output feature map according to the input feature map. When a feature processing node has no output feature map, the node is in an off state; in other words, the gated network can switch feature processing nodes on and off. Accordingly, when the feature processing nodes of the image processing model include gated networks, the model is dynamically adjustable, because the gated networks can switch the nodes on and off.
In practical applications, the image processing model further includes an initial feature extraction network. When the image to be processed is input into the image processing model, the initial feature extraction network can extract the picture features of the image to be processed into a high-dimensional feature space to obtain the initial feature map of the image to be processed. The specific network structure of the initial feature extraction network may be configured in advance and is not limited in the embodiments of this application. For example, the initial feature extraction network may include a multi-scale feature extraction module and a multi-scale feature fusion module; accordingly, after the image to be processed is input into the image processing model, the multi-scale feature extraction module can first extract feature maps at multiple scales from the image to be processed, and the multi-scale feature fusion module then fuses the feature maps of multiple scales to obtain the initial feature map of the image to be processed.
In the embodiments of this application, because the picture features of the image to be processed can first be extracted into a high-dimensional feature space by the initial feature extraction network, subsequent processing of the image to be processed can be made more stable.
In an optional embodiment of this application, each feature processing node corresponds to one input feature map size. If the current feature processing node is a node at the first level, its input is the initial feature map whose size equals the node's input feature map size. If the current feature processing node is a node at a level other than the first, its input consists of the output feature maps, output by the feature processing nodes at the previous level, whose size equals the node's input feature map size. If the current feature processing node is a node at the last level, its output feature map is the output feature map obtained by fusing its input feature maps.
In practical applications, after the initial feature map of the image to be processed is extracted by the initial feature extraction network, further feature extraction can be performed on the initial feature map by the deep feature extraction network of the image processing model to obtain deep features.
In the deep feature extraction network, the input feature map of each level's feature processing node corresponds to one size; the input feature map sizes corresponding to different feature processing nodes may be the same or different.
In practical applications, the input or output feature maps of feature processing nodes at different levels differ. If the current feature processing node is at the first level, its input feature map is the initial feature map, and the input initial feature map equals the node's corresponding input feature map size. If the current node is at a level other than the first, its input consists of the output feature maps, output by the previous level's nodes, that equal the node's corresponding input feature map size. In addition, if the current node is at the last level, since the last level's nodes contain no gated network, the node's output feature map is the output feature map obtained by fusing its input feature maps.
In one example, suppose the current feature processing node belongs to the feature processing sub-network at the third level, and its corresponding input feature map size is 1/8 of the image to be processed. The previous level is then the feature processing sub-network at the second level, which includes feature processing node 1 and feature processing node 2. The output feature maps of node 1 have sizes equal to the image to be processed and 1/8 of the image to be processed, and the output feature maps of node 2 have sizes equal to 1/4, 1/8 and 1/16 of the image to be processed. The input feature maps of the current node are then the 1/8-sized output feature map output by node 1 and the 1/8-sized output feature map output by node 2. Accordingly, if the third-level feature processing sub-network is the last level of the image processing model, the current node can fuse the 1/8-sized output feature maps of node 1 and node 2 to obtain its output feature map.
That is to say, in the deep feature extraction network, only the input of the first level's feature processing nodes is the initial feature map, and the output of each feature processing node of each level except the last consists of feature maps of at least two different sizes.
In the embodiments of the present disclosure, obtaining the processing result of the image to be processed based on the output feature maps output by the feature processing nodes at the last level includes:
fusing the output feature maps of the feature processing nodes at the last level to obtain a fused feature map whose size equals that of the image to be processed;
obtaining and outputting the semantic segmentation result of the image to be processed based on the fused feature map.
In practical applications, the image processing model may further include a processing result output module, which is used to obtain the processing result of the image to be processed based on the output feature maps output by the feature processing nodes at the last level. In other words, after the output feature maps of the last level's nodes are obtained, the processing result of the image to be processed can be determined.
The processing result output module includes a sequentially cascaded feature fusion module and semantic segmentation result output module. The feature fusion module fuses the output feature maps of the last level's feature processing nodes to obtain a fused feature map whose size equals that of the image to be processed; the semantic segmentation result output module then obtains the semantic segmentation result of the image to be processed based on the fused feature map and outputs it.
The specific implementation of fusing the output feature maps of the last level's feature processing nodes is not limited in the embodiments of this application. For example, the output feature maps of each size may be fused and upsampled until a fused feature map whose size equals that of the image to be processed is obtained.
In one example, suppose the output feature maps of the last level's feature processing nodes have sizes of 1/8, 1/4 and 1/2 of the image to be processed. The 1/8-sized output feature map can be upsampled to obtain a feature map of 1/4 size, which is fused with the original 1/4-sized output feature map to obtain a first fused feature map; the first fused feature map is then upsampled to obtain a feature map of 1/2 size, which is fused with the original 1/2-sized output feature map to obtain a second fused feature map; the second fused feature map is then upsampled to obtain a fused feature map whose size equals that of the image to be processed.
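The progressive fuse-and-upsample procedure in this example can be sketched as follows. Nearest-neighbour upsampling and element-wise addition are assumptions standing in for the model's unspecified upsampling and fusion operators:

```python
import numpy as np

def upsample(x):
    """Nearest-neighbour upsampling that doubles spatial resolution
    (a stand-in for the model's unspecified upsampling operator)."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def fuse_outputs(maps):
    """Fuse final-level outputs from coarsest to finest: repeatedly
    upsample the running fusion and add the next-larger output map."""
    maps = sorted(maps, key=lambda m: m.shape[0])   # coarse -> fine
    fused = maps[0]
    for m in maps[1:]:
        fused = upsample(fused) + m
    return upsample(fused)                          # final step to image size

c = 8
# outputs at 1/8, 1/4 and 1/2 of a 32x32 image to be processed
outs = [np.ones((4, 4, c)), np.ones((8, 8, c)), np.ones((16, 16, c))]
fused = fuse_outputs(outs)
print(fused.shape)  # (32, 32, 8)
```

The running sum picks up one map per step, so with all-ones inputs every element of the final map is 3.0.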
In the embodiments of this application, because the image processing model includes multiple levels of feature processing sub-networks, and each level's sub-network further includes feature processing nodes of different depths, the model contains a large number of network structures; in the process of dynamic selection according to the input image to be processed, multiple known network structures can be adaptively selected, so the model is applicable to images with different scale distributions. Further, because each feature processing node of each level except the last includes a gated network for controlling the output feature map, unimportant feature processing nodes can be adaptively switched off, so that different network structures can be fitted and the actual computational cost can be controlled; this both guarantees that a suitable network structure is determined and reduces the amount of computation.
In the embodiments of this application, for each feature processing node of each level except the last level, determining the output feature map of the node based on the gated network included in that node includes:
determining an initial output feature map based on the input feature map of the feature processing node;
inputting the input feature map of the feature processing node to the gated network to determine the use probabilities of the output feature maps of each size corresponding to the feature processing node;
determining the output feature map of the feature processing node based on the initial output feature map and the determined use probabilities of the feature maps corresponding to each size.
For convenience of description, each feature processing node of each level except the last is referred to below as a target feature processing node. In practical applications, for each target feature processing node, when the output feature maps equal to the node's input feature map size are input to the node, the feature extraction module included in the node can determine an initial output feature map from the input feature maps, and the feature selection module it includes (i.e., the gated network) can determine, from the input feature maps, the use probabilities of the output feature maps of each size corresponding to the node; the feature output module it includes can then perform feature extraction on the initial output feature map based on the use probability of each size's output feature map, thereby obtaining the node's output feature maps at each size.
The use probability refers to the likelihood of using the feature map of the corresponding size: the larger the use probability, the more likely the feature map of that size is to be used, and vice versa.
In an optional embodiment of this application, the use probabilities of the feature maps of each size include at least two of: the use probability of upsampling the initial feature map, the use probability of resolution-preserving processing, and the use probability of resolution downsampling.
That is to say, after the target feature processing node determines the initial output feature map from the input feature maps, the initial output feature map can undergo upsampling, resolution-preserving processing or resolution downsampling, and the gated network is used to determine the use probabilities of upsampling, resolution-preserving processing or downsampling the initial output feature map.
In the embodiments of this application, determining the output feature map of the feature processing node based on the initial output feature map and the determined use probabilities of the feature maps corresponding to each size includes:
for each use probability greater than a set threshold, performing feature extraction at the corresponding size on the initial output feature map to determine the output feature map of the feature processing node.
In practical applications, when determining the output feature map, the target feature processing node can filter out use probabilities not greater than the set threshold, i.e., not execute the feature extraction modes corresponding to those use probabilities. In other words, in the embodiments of this application, the mode of feature extraction performed on the initial output feature map can be determined according to the use probabilities of the feature maps of each size. The specific value of the threshold can be set in advance and is not limited in the embodiments of this application. In an optional embodiment of this application, the set threshold can be 0; that is, if a use probability is 0, the feature extraction mode corresponding to that use probability is not executed.
In one example, suppose the target feature processing node determines through the gated network that the use probability of upsampling is 0.5, the use probability of resolution-preserving processing is 0.6, and the use probability of downsampling is 0, and the set threshold is 0. Since the use probability of upsampling (0.5) and the use probability of resolution-preserving processing (0.6) are greater than the set threshold, the target feature processing node performs resolution upsampling and resolution-preserving processing on the initial output feature map, and does not perform resolution downsampling.
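The filtering in this example reduces to a simple threshold test over the gate's three use probabilities:

```python
# gate outputs from the example above and the set threshold of 0
probs = {"up": 0.5, "same": 0.6, "down": 0.0}
threshold = 0
# keep only the feature extraction modes whose use probability exceeds the threshold
selected = [mode for mode, p in probs.items() if p > threshold]
print(selected)  # ['up', 'same'] -- downsampling is filtered out
```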
It can be understood that, in the embodiments of this application, if none of the use probabilities of the output feature maps of each size corresponding to the feature processing node is greater than the set threshold, the step of determining the initial output feature map based on the node's input feature map is not executed.
In practical applications, if none of the use probabilities of the output feature maps of each size is greater than the set threshold, no feature extraction needs to be performed on the initial output feature map; to reduce the amount of computation, the target feature processing node may skip the step of determining the initial output feature map from its input feature map (i.e., the target feature processing node is in the off state).
In the embodiments of this application, the use probabilities of the output feature maps of each target feature processing node can be determined through the gated network in each node, so that target feature processing nodes that are computationally expensive but contribute little to the final result can be dynamically deleted; thus, when there are computational constraints, the network structure can be dynamically selected, achieving the purpose of reducing computation.
In an optional embodiment of this application, each feature processing node further includes a sequentially cascaded convolutional layer and residual layer, which are used to determine the initial output feature map based on the input feature map of the feature processing node.
In practical applications, each feature processing node of the image processing model further includes a sequentially cascaded convolution (SepConv) layer and residual (Residual) layer; based on this cascade, the initial output feature map can be determined from the input feature map. It should be noted that, for the feature processing nodes at the last level, the initial output feature map determined by the convolutional layer and the residual layer is the final output feature map.
In addition, in practical applications, before the output feature maps equal to the node's input feature map size are input to the feature processing node, the input feature maps may first be fused to obtain a fused feature map, which is then input to the sequentially cascaded convolutional layer and residual layer; alternatively, the input feature maps may be input directly to the cascaded convolutional layer and residual layer, which first fuse them to obtain a fused feature map and then determine the initial output feature map from the fused feature map.
In an optional embodiment of this application, the gated network includes a neural network and an activation function layer, and inputting the input feature map of the feature processing node to the gated network to determine the use probabilities of the output feature maps of each size corresponding to the feature processing node includes:
determining, based on the neural network included in the gated network, initial use probabilities of the output feature maps of each size corresponding to the feature processing node;
activating the initial use probabilities of the output feature maps of each size based on the activation function to obtain the use probabilities of the output feature maps of each size corresponding to the feature processing node.
In practical applications, the gated network may be a lightweight gated network including a convolutional neural network and an activation function layer. The convolutional neural network maps the input feature map to a latent space and outputs an activation value corresponding to each size's output feature map; the activation function layer then activates each activation value to obtain the use probability of each size's output feature map. In addition, because the use probabilities are determined by the activation function layer, each use probability is restricted to [0, 1]; the activation function layer may be max(0, tanh(x)), where x is the activation value.
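A minimal sketch of such a lightweight gated network follows. The global average pooling and the ReLU latent mapping are assumptions standing in for the unspecified convolutional part; the max(0, tanh(x)) activation is the one described above:

```python
import numpy as np

def gate(feat, w_hidden, w_out):
    """Lightweight gated network sketch: pool the input feature map, map it
    into a latent space, project to one logit per candidate output size,
    then activate with max(0, tanh(x)) so each use probability is in [0, 1]."""
    v = feat.mean(axis=(0, 1))              # global average pooling (assumed)
    hidden = np.maximum(0.0, v @ w_hidden)  # latent mapping (ReLU assumed)
    logits = hidden @ w_out                 # one activation value per size
    return np.maximum(0.0, np.tanh(logits))

rng = np.random.default_rng(1)
feat = rng.random((8, 8, 16))               # an input feature map
p = gate(feat, rng.normal(size=(16, 8)), rng.normal(size=(8, 3)))
print(p.shape, bool(p.min() >= 0.0 and p.max() <= 1.0))
```

A probability of exactly 0 (any negative logit) corresponds to a branch that will be filtered out by the threshold test.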
In addition, in practical applications, when training the image processing model, each use probability output by the gated network in each target feature processing node can be multiplied by the feature values in the corresponding output feature map, so that the feature processing node and the gated network it includes are trained end-to-end together.
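This probability-weighting trick can be sketched as follows; multiplying each branch's feature map by its use probability is what lets gradients flow back into the gated network during end-to-end training:

```python
import numpy as np

def gated_outputs(branch_feats, probs):
    """Scale each branch's output feature map by its use probability so the
    gated network receives gradients during end-to-end training."""
    return {k: probs[k] * f for k, f in branch_feats.items()}

# illustrative branch outputs and gate probabilities (assumed values)
feats = {"up": np.ones((4, 4, 2)), "same": np.ones((2, 2, 2))}
probs = {"up": 0.5, "same": 1.0}
out = gated_outputs(feats, probs)
print(out["up"][0, 0, 0], out["same"][0, 0, 0])  # 0.5 1.0
```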
As shown in FIG. 2, an embodiment of this application provides a schematic structural diagram of an image processing model, described below with reference to that diagram. The numbers under "Scale" in FIG. 2 (i.e., 1, 1/4, ..., 1/64) represent the feature map size (i.e., different depths); for example, 1 means the feature map size equals the size of the image to be processed, 1/4 means the feature map size is 1/4 of the image to be processed, and so on.
In this example, suppose the image processing model is an image semantic segmentation model, which may include an initial feature extraction network, a deep feature extraction network, and a processing result output module. The deep feature extraction network includes L+1 levels of feature processing sub-networks (one column in the figure is one level), and each feature processing sub-network includes feature processing nodes (shown as the dots in the deep feature extraction network in the figure), each of which corresponds to one input feature map size. For example, for the two feature processing nodes at the first level, the corresponding input feature maps are the initial feature map whose size is 1/4 of the image to be processed and the initial feature map whose size is 1/8 of the image to be processed.
In this example, after the image to be processed is acquired, it can be input to the initial feature extraction network through the Input shown in the figure. The multi-scale feature extraction module included in the initial feature extraction network (i.e., STEM in the figure) extracts feature maps of multiple scales from the image to be processed, and the multi-scale feature fusion module it includes then fuses the obtained feature maps of multiple scales to obtain the initial feature map. Accordingly, the initial feature map is input to the feature processing nodes at the first level of the deep feature extraction network; the first-level feature processing nodes perform resolution-preserving processing (shown by the horizontal arrows in FIG. 2) and resolution downsampling (shown by the arrows pointing to the lower right in FIG. 3) on the initial feature map to obtain output feature maps of different sizes, and each output feature map is then input, according to its size, to the corresponding feature processing node at the second level, and so on until the feature processing nodes at the last level.
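The first-level routing just described can be sketched as follows. This is a minimal NumPy illustration; `downsample` (2x2 average pooling) and the feature map shapes are assumptions, since the concrete operators are not specified:

```python
import numpy as np

def downsample(x):
    """Halve spatial resolution with 2x2 average pooling (an assumed operator)."""
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def first_level_node(init_feat):
    """A first-level node emits a resolution-preserving copy (horizontal arrow
    in FIG. 2) and a downsampled copy (lower-right arrow) of the initial
    feature map, routed to next-level nodes whose input size matches."""
    return {"same": init_feat, "down": downsample(init_feat)}

feat = np.ones((8, 8, 16))          # initial feature map at the 1/4 scale
outs = first_level_node(feat)
print(outs["same"].shape, outs["down"].shape)  # (8, 8, 16) (4, 4, 16)
```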
For better understanding, one feature processing node is taken as an example in this example to describe the processing procedure of each feature processing node of each level except the last. The internal structure of the feature processing node is shown in FIG. 3, specifically:
At the previous level of this feature processing node, there are three feature processing nodes whose output feature maps have sizes equal to this node's input feature map size (shown in area C in the figure). The input feature maps of this node can then be fused to obtain a fused feature map (shown at A in the figure). Further, the sequentially cascaded convolutional layer and residual layer (shown as SepConv and Identity in the cell part of the figure) determine the initial output feature map based on the fused feature map, and the gated network (Gate in FIG. 2) determines, based on the fused feature map, the use probability of resolution upsampling, the use probability of resolution-preserving processing, and the use probability of resolution downsampling. Further, assuming that all three use probabilities are greater than the set threshold, the initial output feature map can undergo resolution upsampling (shown by the arrow pointing to the upper right in FIG. 3), resolution-preserving processing (shown by the horizontal arrow in FIG. 3), and resolution downsampling (shown by the arrow pointing to the lower right in FIG. 3), yielding output feature maps of three different sizes (shown in the Routing area in the figure). The Routing area in FIG. 3 is a detailed structural diagram of area B in FIG. 3.
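The cell just described can be summarised in a small sketch. The 1x1 channel mixing standing in for SepConv, the nearest-neighbour/average-pooling resamplers, and the fixed probability vector are illustrative assumptions; in the model the probabilities come from the gated network:

```python
import numpy as np

def upsample(x):    # nearest-neighbour, doubles resolution (upper-right arrow)
    return x.repeat(2, axis=0).repeat(2, axis=1)

def downsample(x):  # 2x2 average pooling, halves resolution (lower-right arrow)
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def cell(inputs, w, probs, thresh=0.0):
    """inputs: same-size maps from the previous level (area C);
    probs: gate use probabilities for up/same/down; w: channel-mix weights."""
    if not any(p > thresh for p in probs.values()):
        return {}                       # all branches gated off: skip the cell
    fused = sum(inputs)                 # A: fuse the input feature maps
    init_out = fused @ w + fused        # SepConv stand-in + Identity residual
    routes = {"up": upsample, "same": lambda t: t, "down": downsample}
    return {k: routes[k](init_out) for k, p in probs.items() if p > thresh}

c = 16
inputs = [np.ones((8, 8, c)) for _ in range(3)]   # three same-size inputs
outs = cell(inputs, np.eye(c) * 0.5, {"up": 0.7, "same": 0.9, "down": 0.4})
print({k: v.shape for k, v in outs.items()})
```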
It can be understood that, in practical applications, if none of the use probabilities of the output feature maps of each size determined by the gated network is greater than the set threshold, the cell part in the figure need not be executed; that is, the feature processing node need not be executed, which reduces the amount of computation.
Further, the processing result output module fuses the output feature maps of the last level's feature processing nodes using resolution upsampling (Upsample in FIG. 2) to obtain a fused feature map whose size equals that of the image to be processed, and obtains and outputs the semantic segmentation result of the image to be processed based on this fused feature map (Output in FIG. 2).
In the embodiments of this application, the image processing model includes a path selection space over feature processing nodes at multiple scales, so that the designed path selection covers most existing static network structures and can efficiently extract features at multiple scales.
Further, the feature processing nodes at multiple scales in the image processing model are mainly used to aggregate multi-scale features and to select paths for subsequent propagation, and a gated network can be used to switch each feature processing node on and off. In practical applications, according to the computational requirements of the actual application scenario (e.g., on a terminal device or on a server), when training the image processing model, a loss function can be used to constrain the gated networks to dynamically delete feature processing nodes that are computationally expensive but contribute little to the final result; that is, whether to use a node for feature aggregation can be decided dynamically according to the input image, achieving dynamic selection of the network structure under computational constraints.
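One plausible form of such a computation-constraining loss is sketched below; the formula and the weight `lam` are assumptions for illustration, not the patent's actual loss:

```python
def budget_loss(task_loss, gate_probs, node_costs, lam=0.1):
    """Hedged sketch, not the patent's formula: penalise the expected
    computation, i.e. each node's cost weighted by its gate use probability,
    so that costly nodes contributing little to the task are driven off."""
    expected_cost = sum(p * c for p, c in zip(gate_probs, node_costs))
    return task_loss + lam * expected_cost

# two nodes: one often used and cheap, one rarely used and expensive
total = budget_loss(1.0, [0.9, 0.1], [10.0, 50.0])
print(total)  # ≈ 2.4 = 1.0 + 0.1 * (9.0 + 5.0)
```

Minimising this jointly with the task loss pushes the gate probabilities of expensive, low-contribution nodes toward 0, which is exactly the deletion behaviour described above.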
As shown in FIG. 4, an embodiment of this application provides an image processing apparatus. The image processing apparatus 60 may include an image acquisition module 601 and an image processing result determination module 602, wherein:
the image acquisition module 601 is used to acquire an image to be processed;
the image processing result determination module 602 is used to input the image to be processed into the image processing model and obtain the image processing result of the image to be processed based on the output of the image processing model;
wherein the image processing model includes multiple levels of feature processing sub-networks, and each level includes feature processing nodes of different depths; for each feature processing node of each level except the last level, the output feature map of the feature processing node is determined based on the gated network included in that node, so as to form a dynamic image processing model, and the processing result of the image to be processed is obtained based on the output feature maps output by the feature processing nodes at the last level of the image processing model.
In an optional embodiment of this application, for each feature processing node of each level except the last level, when determining the output feature map of the feature processing node based on the gated network included in that node, the image processing model is specifically configured to:
determine an initial output feature map based on the input feature map of the feature processing node;
input the input feature map of the feature processing node to the gated network to determine the use probabilities of the output feature maps of each size corresponding to the feature processing node;
determine the output feature map of the feature processing node based on the initial output feature map and the determined use probabilities of the feature maps corresponding to each size.
In an optional embodiment of this application, the use probabilities of the feature maps of each size include at least two of: the use probability of upsampling the initial feature map, the use probability of resolution-preserving processing, and the use probability of resolution downsampling.
In an optional embodiment of this application, when determining the output feature map of the feature processing node based on the initial output feature map and the determined use probabilities of the feature maps corresponding to each size, the image processing model is specifically configured to:
for each use probability greater than a set threshold, perform feature extraction at the corresponding size on the initial output feature map to determine the output feature map of the feature processing node.
In an optional embodiment of this application, if none of the use probabilities of the output feature maps of each size corresponding to the feature processing node is greater than the set threshold, the step of determining an initial output feature map based on the input feature map of the feature processing node is not executed.
In an optional embodiment of this application, each feature processing node further includes a sequentially cascaded convolutional layer and residual layer, which are used to determine the initial output feature map based on the input feature map of the feature processing node.
In an optional embodiment of this application, the gated network includes a neural network and an activation function layer; when inputting the input feature map of the feature processing node to the gated network to determine the use probabilities of the output feature maps of each size corresponding to the feature processing node, the image processing model is specifically configured to:
determine, based on the neural network included in the gated network, initial use probabilities of the output feature maps of each size corresponding to the feature processing node;
activate the initial use probabilities of the output feature maps of each size based on the activation function to obtain the use probabilities of the output feature maps of each size corresponding to the feature processing node.
The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without creative work.
The component embodiments of this application may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art should understand that a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the computing processing device according to the embodiments of this application. This application may also be implemented as a device or apparatus program (e.g., a computer program and a computer program product) for performing part or all of the methods described herein. Such a program implementing this application may be stored on a computer-readable medium, or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
For example, an embodiment of this application provides a computing processing device. As shown in FIG. 5, the computing processing device 2000 includes a processor 2001 and a memory 2003, which are connected to each other, e.g., via a bus 2002. Optionally, the computing processing device 2000 may further include a transceiver 2004. It should be noted that in practical applications the transceiver 2004 is not limited to one, and the structure of the computing processing device 2000 does not constitute a limitation on the embodiments of this application.
The processor 2001 is applied in the embodiments of this application to implement the functions of the modules shown in FIG. 4.
The processor 2001 may be a CPU, a general-purpose processor, a DSP, an ASIC, an FPGA, or other programmable logic devices, transistor logic devices, hardware components, or any combination thereof. It can implement or execute the various exemplary logical blocks, modules and circuits described in connection with the disclosure of this application. The processor 2001 may also be a combination implementing computing functions, e.g., a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
The bus 2002 may include a path for transferring information between the above components. The bus 2002 may be a PCI bus, an EISA bus, or the like, and may be divided into an address bus, a data bus, a control bus, etc. For ease of representation, only one thick line is shown in FIG. 5, but this does not mean that there is only one bus or one type of bus.
The memory 2003 may be a ROM or another type of static storage device capable of storing static information and instructions, a RAM or another type of dynamic storage device capable of storing information and instructions, an EEPROM, a CD-ROM or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but it is not limited thereto.
The memory 2003 is used to store application program code for executing the solutions of this application, and execution is controlled by the processor 2001. The memory 2003 has a storage space 2005 for program code for performing any of the method steps of the above method. For example, the storage space 2005 for program code may include individual program codes 2006 respectively used to implement the various steps of the above method. These program codes may be read from or written to one or more computer program products, which include program code carriers such as hard disks, compact discs (CDs), memory cards or floppy disks. Such a computer program product is usually a portable or fixed storage unit as described with reference to FIG. 6. The storage unit may have storage segments, storage spaces, etc. arranged similarly to the memory 2003 in the computing processing device of FIG. 5. The program code may, for example, be compressed in an appropriate form. Typically, the storage unit includes computer-readable code 2006', i.e., code that can be read by a processor such as 2001; when run by a computing processing device, this code causes the computing processing device to perform the steps of the method described above.
An embodiment of this application provides a computer-readable storage medium for storing computer instructions; when the computer instructions are run on a computer, the computer can execute the image processing method.
For the terms and implementation principles involved in the computer-readable storage medium of this application, reference may be made to the image processing method in the embodiments of this application, which will not be repeated here.
It should be understood that although the steps in the flowchart of the drawings are displayed in sequence as indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they may be performed in other orders. Moreover, at least some of the steps in the flowchart of the drawings may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be performed at different moments; their order of execution is not necessarily sequential, and they may be performed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
The above are only some embodiments of the present invention. It should be pointed out that for those of ordinary skill in the art, several improvements and modifications can be made without departing from the principle of the present invention, and these improvements and modifications should also be regarded as falling within the protection scope of the present invention.

Claims (11)

  1. An image processing method, comprising:
    acquiring an image to be processed;
    inputting the image to be processed into an image processing model, and obtaining an image processing result of the image to be processed based on the output of the image processing model;
    wherein the image processing model comprises multiple levels of feature processing sub-networks, each level comprising feature processing nodes of different depths; for each feature processing node of each level except the last level, an output feature map of the feature processing node is determined based on a gated network comprised in the feature processing node, so as to form a dynamic image processing model, and a processing result of the image to be processed is obtained based on the output feature maps output by the feature processing nodes at the last level of the image processing model.
  2. The method according to claim 1, wherein, for each feature processing node of each level except the last level, determining the output feature map of the feature processing node based on the gated network comprised in the feature processing node comprises:
    determining an initial output feature map based on an input feature map of the feature processing node;
    inputting the input feature map of the feature processing node to the gated network to determine use probabilities of output feature maps of each size corresponding to the feature processing node;
    determining the output feature map of the feature processing node based on the initial output feature map and the determined use probabilities of the feature maps corresponding to each size.
  3. The method according to claim 2, wherein the use probabilities of the feature maps of each size comprise at least two of: a use probability of upsampling the initial feature map, a use probability of resolution-preserving processing, and a use probability of resolution downsampling.
  4. The method according to claim 2, wherein determining the output feature map of the feature processing node based on the initial output feature map and the determined use probabilities of the feature maps corresponding to each size comprises:
    for each use probability greater than a set threshold, performing feature extraction at the corresponding size on the initial output feature map to determine the output feature map of the feature processing node.
  5. The method according to claim 4, wherein, if none of the use probabilities of the output feature maps of each size corresponding to the feature processing node is greater than the set threshold, the step of determining the initial output feature map based on the input feature map of the feature processing node is not executed.
  6. The method according to claim 2, wherein each feature processing node further comprises a sequentially cascaded convolutional layer and residual layer, which are used to determine the initial output feature map based on the input feature map of the feature processing node.
  7. The method according to claim 2, wherein the gated network comprises a neural network and an activation function layer, and inputting the input feature map of the feature processing node to the gated network to determine the use probabilities of the output feature maps of each size corresponding to the feature processing node comprises:
    determining, based on the neural network comprised in the gated network, initial use probabilities of the output feature maps of each size corresponding to the feature processing node;
    activating the initial use probabilities of the output feature maps of each size based on the activation function to obtain the use probabilities of the output feature maps of each size corresponding to the feature processing node.
  8. An image processing apparatus, comprising:
    an image acquisition module, configured to acquire an image to be processed;
    an image processing result determination module, configured to input the image to be processed into an image processing model and obtain an image processing result of the image to be processed based on the output of the image processing model;
    wherein the image processing model comprises multiple levels of feature processing sub-networks, each level comprising feature processing nodes of different depths; for each feature processing node of each level except the last level, an output feature map of the feature processing node is determined based on a gated network comprised in the feature processing node, so as to form a dynamic image processing model, and a processing result of the image to be processed is obtained based on the output feature maps output by the feature processing nodes at the last level of the image processing model.
  9. A computing processing device, comprising:
    a memory in which computer-readable code is stored;
    one or more processors, wherein, when the computer-readable code is executed by the one or more processors, the computing processing device performs the image processing method according to any one of claims 1-7.
  10. A computer program, comprising computer-readable code which, when run on a computing processing device, causes the computing processing device to perform the image processing method according to any one of claims 1-7.
  11. A computer-readable medium storing the computer program according to claim 10.
PCT/CN2020/118866 2020-01-16 2020-09-29 图像处理方法、装置、计算处理设备及介质 WO2021143207A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010058004.0A CN111275054B (zh) 2020-01-16 2020-01-16 图像处理方法、装置、电子设备及存储介质
CN202010058004.0 2020-01-16

Publications (1)

Publication Number Publication Date
WO2021143207A1 true WO2021143207A1 (zh) 2021-07-22

Family

ID=71003058

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/118866 WO2021143207A1 (zh) 2020-01-16 2020-09-29 图像处理方法、装置、计算处理设备及介质

Country Status (2)

Country Link
CN (1) CN111275054B (zh)
WO (1) WO2021143207A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116051848A (zh) * 2023-02-10 2023-05-02 阿里巴巴(中国)有限公司 图像特征提取方法、网络模型、装置及设备

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111275054B (zh) * 2020-01-16 2023-10-31 北京迈格威科技有限公司 图像处理方法、装置、电子设备及存储介质
CN112329835A (zh) * 2020-10-30 2021-02-05 天河超级计算淮海分中心 图像处理方法、电子设备和存储介质
CN114612374A (zh) * 2020-12-09 2022-06-10 中国科学院深圳先进技术研究院 基于特征金字塔的图像检测模型的训练方法、介质和设备
CN113361567B (zh) * 2021-05-17 2023-10-31 上海壁仞智能科技有限公司 图像处理方法、装置、电子设备和存储介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108229650A (zh) * 2017-11-15 2018-06-29 北京市商汤科技开发有限公司 卷积处理方法、装置及电子设备
CN108776807A (zh) * 2018-05-18 2018-11-09 复旦大学 一种基于可跳层双支神经网络的图像粗细粒度分类方法
US10241520B2 (en) * 2016-12-22 2019-03-26 TCL Research America Inc. System and method for vision-based flight self-stabilization by deep gated recurrent Q-networks
CN109934153A (zh) * 2019-03-07 2019-06-25 张新长 基于门控深度残差优化网络的建筑物提取方法
CN111275054A (zh) * 2020-01-16 2020-06-12 北京迈格威科技有限公司 图像处理方法、装置、电子设备及存储介质

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101903437B1 (ko) * 2017-06-30 2018-10-04 동국대학교 산학협력단 딥 레지듀얼 러닝 기반 눈 개폐 분류 장치 및 방법
CN108228700B (zh) * 2017-09-30 2021-01-26 北京市商汤科技开发有限公司 图像描述模型的训练方法、装置、电子设备及存储介质
WO2019203921A1 (en) * 2018-04-17 2019-10-24 Hrl Laboratories, Llc System for real-time object detection and recognition using both image and size features
CN109271992A (zh) * 2018-09-26 2019-01-25 上海联影智能医疗科技有限公司 一种医学图像处理方法、系统、装置和计算机可读存储介质
CN109710800B (zh) * 2018-11-08 2021-05-25 北京奇艺世纪科技有限公司 模型生成方法、视频分类方法、装置、终端及存储介质

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116051848A (zh) * 2023-02-10 2023-05-02 Alibaba (China) Co., Ltd. Image feature extraction method, network model, apparatus, and device
CN116051848B (zh) * 2023-02-10 2024-01-09 Alibaba (China) Co., Ltd. Image feature extraction method, network model, apparatus, and device

Also Published As

Publication number Publication date
CN111275054B (zh) 2023-10-31
CN111275054A (zh) 2020-06-12

Similar Documents

Publication Publication Date Title
WO2021143207A1 (zh) Image processing method and apparatus, computing processing device, and medium
JP7059318B2 (ja) Training data generation method and system for learning classifiers with regional features
KR20210124111A (ko) Method, apparatus, device, medium, and program product for training a model
WO2022105608A1 (zh) Fast face density prediction and face detection method and apparatus, electronic device, and storage medium
WO2023138188A1 (zh) Feature fusion model training and sample retrieval method and apparatus, and computer device
CN113570030B (zh) Data processing method, apparatus, device, and storage medium
Li et al. FRD-CNN: Object detection based on small-scale convolutional neural networks and feature reuse
WO2021218037A1 (zh) Object detection method and apparatus, computer device, and storage medium
CN114511576A (zh) Image segmentation method and system using a scale-adaptive feature-enhanced deep neural network
US10217224B2 (en) Method and system for sharing-oriented personalized route planning via a customizable multimedia approach
CN111274981A (zh) Object detection network construction method and apparatus, and object detection method
CN113887615A (zh) Image processing method, apparatus, device, and medium
CN114861842B (zh) Few-shot object detection method and apparatus, and electronic device
CN110633717A (zh) Training method and apparatus for an object detection model
CN110633716A (zh) Method and apparatus for detecting a target object
CN115170815A (zh) Method, apparatus, and medium for visual task processing and model training
CN114998592A (zh) Method, apparatus, device, and storage medium for instance segmentation
CN113313162A (zh) Multi-scale feature fusion object detection method and system
CN116503744B (zh) Height-level-guided building height estimation method and apparatus for single-view remote sensing imagery
Zha et al. ASFNet: Adaptive multiscale segmentation fusion network for real-time semantic segmentation
CN111475736A (zh) Community mining method, apparatus, and server
US20230196093A1 (en) Neural network processing
CN113139463B (zh) Method, apparatus, device, medium, and program product for training a model
CN111914920A (zh) Similarity image retrieval method and system based on sparse coding
CN113343979B (zh) Method, apparatus, device, medium, and program product for training a model

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 20913465

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 20913465

Country of ref document: EP

Kind code of ref document: A1