WO2019020075A1 - Image processing method, apparatus, storage medium, computer program and electronic device - Google Patents


Info

Publication number
WO2019020075A1
Authority
WO
WIPO (PCT)
Prior art keywords
feature map
feature
network
layer
branch
Prior art date
Application number
PCT/CN2018/097227
Other languages
English (en)
French (fr)
Inventor
杨巍
欧阳万里
李爽
李鸿升
王晓刚
Original Assignee
北京市商汤科技开发有限公司
Priority date
Filing date
Publication date
Application filed by 北京市商汤科技开发有限公司
Publication of WO2019020075A1


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/46Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462Salient features, e.g. scale invariant feature transforms [SIFT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection

Definitions

  • the embodiments of the present application relate to the field of computer vision technologies, and in particular, to an image processing method, apparatus, storage medium, computer program, and electronic device.
  • human body pose estimation mainly locates the positions of various parts of the human body in a given image or video. It is an important research topic in the field of computer vision, and is mainly applied in motion recognition, behavior recognition, clothing analysis, person comparison, human-computer interaction, etc.
  • An embodiment of the present application provides an image processing scheme.
  • the method further includes performing key point detection on the target object in the image to be detected according to the first feature map.
  • performing key point detection on the target object according to the first feature map includes: acquiring a score map of at least one key point of the target object according to the first feature map; and determining the position of the corresponding key point of the target object according to the scores of the pixel points included in the score map of the at least one key point.
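As an illustrative sketch (not the patented implementation), the position of a key point can be read off its score map by taking the pixel with the highest score; the function and variable names here are assumptions:

```python
import numpy as np

def keypoint_from_score_map(score_map):
    """Return the (row, col) of the highest-scoring pixel in an H x W score map."""
    flat_index = np.argmax(score_map)                     # index of the maximum score
    return np.unravel_index(flat_index, score_map.shape)  # back to 2-D coordinates

# toy 4x4 score map whose peak marks the key point at (2, 1)
scores = np.zeros((4, 4))
scores[2, 1] = 0.9
position = keypoint_from_score_map(scores)
```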
  • the neural network includes at least one feature pyramid sub-network, where the feature pyramid sub-network includes a first branch network and at least one second branch network connected in parallel with the first branch network; the other feature maps include a second feature map and/or a third feature map; and performing feature extraction on the feature map based on at least two different scales by the neural network to obtain at least two other feature maps includes:
  • performing feature extraction on the feature map based on the original scale of the feature map by using the first branch network to obtain the second feature map; and performing feature extraction on the feature map based on other scales different from the original scale by using at least one of the second branch networks to obtain the third feature map.
  • the first branch network includes a second convolution layer, a third convolution layer, and a fourth convolution layer; performing feature extraction on the feature map based on the original scale of the feature map to obtain the second feature map includes: reducing the dimension of the feature map based on the second convolution layer; performing convolution processing on the reduced-dimension feature map based on the original scale of the feature map by using the third convolution layer; and raising the dimension of the convolution-processed feature map by using the fourth convolution layer to obtain the second feature map.
  • at least one of the second branch networks includes a fifth convolution layer, a downsampling layer, a sixth convolution layer, an upsampling layer, and a seventh convolution layer; performing feature extraction on the feature map based on other scales different from the original scale by using at least one of the second branch networks to obtain the third feature map includes: reducing the dimension of the feature map based on the fifth convolution layer; downsampling the reduced-dimension feature map according to a set downsampling ratio, wherein the scale of the downsampled feature map is smaller than the original scale of the feature map; performing convolution processing on the downsampled feature map based on the sixth convolution layer; upsampling the convolution-processed feature map according to a set upsampling ratio based on the upsampling layer, wherein the scale of the upsampled feature map is equal to the original scale of the feature map; and raising the dimension of the upsampled feature map based on the seventh convolution layer to obtain the third feature map.
  • there are a plurality of second branch networks; at least two of the second branch networks have different set downsampling ratios, and/or at least two of the second branch networks have the same set downsampling ratio.
  • there are a plurality of second branch networks; the sixth convolution layers of at least two of the second branch networks share network parameters.
  • the second branch network includes a fifth convolution layer, a dilated (expanded) convolution layer, and a seventh convolution layer; performing feature extraction on the feature map based on other scales different from the original scale by using at least one of the second branch networks to obtain the third feature map includes: reducing the dimension of the feature map based on the fifth convolution layer; performing dilated convolution processing on the reduced-dimension feature map based on the dilated convolution layer; and raising the dimension of the dilated-convolution feature map based on the seventh convolution layer to obtain the third feature map.
  • there are a plurality of second branch networks; the fifth convolution layers and/or the seventh convolution layers of at least two of the second branch networks share network parameters.
  • the feature pyramid sub-network further includes a first output merge layer; the first output merge layer merges the respective outputs, before the seventh convolution layer, of at least two second branch networks sharing the seventh convolution layer, and outputs the merged result to the shared seventh convolution layer.
  • the neural network includes at least two sequentially connected feature pyramid sub-networks; the second feature pyramid sub-network takes the first feature map output by the first feature pyramid sub-network as input, and extracts the first feature map of the second feature pyramid sub-network based on different scales, where the input end of the second feature pyramid sub-network is connected to the output end of the first feature pyramid sub-network.
  • the neural network is an hourglass (Hourglass) neural network;
  • at least one hourglass module included in the Hourglass neural network includes at least one of the feature pyramid sub-networks.
  • the initialization network parameters of at least one network layer of the neural network are obtained from a network parameter distribution determined according to the mean and variance of the initialization network parameters, and the mean value of the initialization network parameters is zero.
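A minimal sketch of drawing initialization parameters from a zero-mean distribution with a chosen variance; the He-style variance used below is an illustrative assumption, not taken from the application:

```python
import numpy as np

def init_weights(shape, variance, seed=0):
    """Sample initialization parameters from a Gaussian with mean 0 and the given variance."""
    rng = np.random.default_rng(seed)
    return rng.normal(loc=0.0, scale=np.sqrt(variance), size=shape)

# e.g. a 3x3 convolution with 256 input and 256 output channels
var = 2.0 / (3 * 3 * 256)  # He-style choice; an assumption for illustration
w = init_weights((256, 256, 3, 3), variance=var)
```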
  • when the neural network includes at least two identity-mapping additions, an output adjustment module is set in at least one identity-mapping branch to be added, and the first feature map output by the identity-mapping branch is adjusted by the output adjustment module.
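The identity-mapping addition with an output adjustment can be sketched as follows; the scalar `adjust` factor is a hypothetical stand-in for the application's output adjustment module:

```python
import numpy as np

def residual_add(identity, branch_output, adjust=1.0):
    """Identity-mapping addition; the identity branch is adjusted before being added."""
    return adjust * identity + branch_output

x = np.full((4, 4), 2.0)
out = residual_add(x, np.zeros((4, 4)), adjust=0.5)  # halves the identity branch
```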
  • an image processing apparatus, comprising: an acquisition module, configured to acquire a feature map of an image to be detected; an extraction module, configured to perform feature extraction on the feature map based on at least two different scales by using a neural network to obtain at least two other feature maps; and a merging module, configured to merge the feature map and the at least two other feature maps to obtain a first feature map of the image to be detected.
  • the device further includes: a detecting module, configured to perform key point detection on the target object in the image to be detected according to the first feature map.
  • the detecting module includes: a scoring unit, configured to respectively acquire a score map of at least one key point of the target object according to the first feature map; and a determining unit, configured to determine the position of the corresponding key point of the target object according to the scores of the pixel points included in the score map of the at least one key point.
  • the neural network includes at least one feature pyramid sub-network, the feature pyramid sub-network includes a first branch network and at least one respectively connected in parallel with the first branch network a second branch network; the other feature map includes a second feature map and/or a third feature map; the extracting module is configured to use the first branch network to map the feature map based on an original scale of the feature map Feature extraction is performed to obtain the second feature map; and the feature map is extracted by using at least one of the second branch networks based on other scales different from the original scale to obtain the third feature map.
  • the first branch network includes a second convolution layer, a third convolution layer, and a fourth convolution layer; and the extraction module is configured to be based on the a second convolution layer reduces a dimension of the feature map; using the third convolution layer to perform convolution processing on the reduced dimension feature map based on an original scale of the feature map; using the fourth convolution layer The dimension of the convolution processed feature map is raised to obtain the second feature map.
  • at least one of the second branch networks includes a fifth convolution layer, a downsampling layer, a sixth convolution layer, an upsampling layer, and a seventh convolution layer; when performing feature extraction on the feature map based on other scales different from the original scale by using at least one of the second branch networks to obtain the third feature map, the extraction module is configured to: reduce the dimension of the feature map based on the fifth convolution layer; downsample the reduced-dimension feature map according to a set downsampling ratio, wherein the scale of the downsampled feature map is smaller than the original scale of the feature map; perform convolution processing on the downsampled feature map based on the sixth convolution layer; upsample the convolution-processed feature map according to a set upsampling ratio based on the upsampling layer, wherein the scale of the upsampled feature map is equal to the original scale of the feature map; and raise the dimension of the upsampled feature map based on the seventh convolution layer to obtain the third feature map.
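The shape bookkeeping of one such second-branch pipeline (Conv5 → downsample → Conv6 → upsample → Conv7) can be traced in a few lines. This is a sketch under strong assumptions: the 1×1/3×3 convolutions are represented only by their effect on channel count and spatial size, and all names are illustrative:

```python
def branch_shapes(channels, height, width, reduced, ratio):
    """Trace (channels, height, width) through the five stages of one second branch."""
    shapes = []
    shapes.append((reduced, height, width))    # Conv5: 1x1 conv reduces the dimension
    h, w = height // ratio, width // ratio
    shapes.append((reduced, h, w))             # downsampling layer: scale < original scale
    shapes.append((reduced, h, w))             # Conv6: 3x3 conv, scale unchanged
    shapes.append((reduced, height, width))    # upsampling layer: back to the original scale
    shapes.append((channels, height, width))   # Conv7: raises the dimension again
    return shapes

stages = branch_shapes(256, 64, 64, reduced=128, ratio=2)
```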
  • there are a plurality of second branch networks; at least two of the second branch networks have different set downsampling ratios, and/or at least two of the second branch networks have the same set downsampling ratio.
  • there are a plurality of second branch networks; the sixth convolution layers of at least two of the second branch networks share network parameters.
  • the second branch network includes a fifth convolution layer, a dilated convolution layer, and a seventh convolution layer; when performing feature extraction on the feature map based on other scales different from the original scale by using at least one of the second branch networks to obtain the third feature map, the extraction module is configured to: reduce the dimension of the feature map based on the fifth convolution layer; perform dilated convolution processing on the reduced-dimension feature map based on the dilated convolution layer; and raise the dimension of the dilated-convolution feature map based on the seventh convolution layer to obtain the third feature map.
  • there are a plurality of second branch networks; the fifth convolution layers and/or the seventh convolution layers of at least two of the second branch networks share network parameters.
  • the feature pyramid sub-network further includes a first output merge layer; the first output merge layer is configured to merge the respective outputs, before the seventh convolution layer, of at least two second branch networks sharing the seventh convolution layer, and to output the merged result to the shared seventh convolution layer.
  • the neural network includes at least two sequentially connected feature pyramid sub-networks; the second feature pyramid sub-network takes the first feature map output by the first feature pyramid sub-network as input, and extracts the first feature map of the second feature pyramid sub-network based on different scales, where the input end of the second feature pyramid sub-network is connected to the output end of the first feature pyramid sub-network.
  • the neural network is an hourglass (Hourglass) neural network;
  • at least one hourglass module included in the Hourglass neural network includes at least one of the feature pyramid sub-networks.
  • the initialization network parameters of at least one network layer of the neural network are obtained from a network parameter distribution determined according to the mean and variance of the initialization network parameters, and the mean value of the initialization network parameters is zero.
  • the output adjustment module is configured to adjust the first feature map output by the identity-mapping branch.
  • a computer readable storage medium having stored thereon computer program instructions, wherein the program instructions are executed by a processor to implement the steps of any of the foregoing image processing methods.
  • an electronic device, comprising: a processor and the image processing apparatus according to any one of the foregoing; wherein, when the processor runs the image processing apparatus, the modules in the image processing apparatus are operated.
  • an electronic device, comprising: a processor, a memory, a communication component, and a communication bus, wherein the processor, the memory, and the communication component communicate with each other through the communication bus; the memory is configured to store at least one executable instruction that causes the processor to perform operations corresponding to any one of the foregoing image processing methods.
  • a computer program, comprising at least one executable instruction which, when executed by a processor, performs operations corresponding to any one of the foregoing image processing methods.
  • feature extraction is performed on the feature map by the neural network based on a plurality of different scales to obtain a plurality of other feature maps, and the feature map and the plurality of other feature maps are merged to obtain the first feature map of the image to be detected; the neural network thereby learns and extracts features of different scales, which improves the accuracy and robustness of feature extraction of the neural network.
  • FIG. 1 is a schematic flow chart of an embodiment of an image processing method according to an embodiment of the present application.
  • FIG. 2 is a schematic flow chart of another embodiment of an image processing method according to an embodiment of the present application.
  • FIG. 3 is a schematic structural diagram of a feature pyramid sub-network according to another embodiment of an image processing method according to an embodiment of the present application.
  • FIG. 4 is another schematic structural diagram of a feature pyramid sub-network according to another embodiment of an image processing method according to an embodiment of the present application.
  • FIG. 5 is still another schematic structural diagram of a feature pyramid sub-network according to another embodiment of an image processing method according to an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of a neural network for image processing according to another embodiment of an image processing method according to an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of a Hourglass network according to another embodiment of an image processing method according to an embodiment of the present application.
  • FIG. 8 is a score diagram of an output of another embodiment of an image processing method according to an embodiment of the present application.
  • FIG. 9 is a schematic structural diagram of an identity mapping addition according to another embodiment of an image processing method according to an embodiment of the present application.
  • FIG. 10 is a structural block diagram of an embodiment of an image processing apparatus according to an embodiment of the present application.
  • FIG. 11 is a schematic structural diagram of an embodiment of an electronic device according to an embodiment of the present application.
  • Embodiments of the present application can be applied to computer systems/servers that can operate with numerous other general purpose or special purpose computing system environments or configurations.
  • Examples of well-known computing systems, environments, and/or configurations suitable for use with computer systems/servers include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, handheld or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, networked personal computers, small computer systems, mainframe computer systems, and distributed cloud computing environments including any of the above, and the like.
  • the computer system/server can be described in the general context of computer system executable instructions (such as program modules) being executed by a computer system.
  • program modules may include routines, programs, target programs, components, logic, data structures, and the like that perform particular tasks or implement particular abstract data types.
  • the computer system/server can be implemented in a distributed cloud computing environment where tasks are performed by remote processing devices that are linked through a communication network.
  • program modules may be located on a local or remote computing system storage medium including storage devices.
  • FIG. 1 a flow chart of an embodiment of an image processing method according to an embodiment of the present application is shown.
  • Step S102 Acquire a feature map of the image to be detected.
  • an arbitrary image analysis processing method may be used to perform feature extraction processing on the image to be detected to obtain a feature map of the image to be detected.
  • the feature extraction operation is performed on the image to be detected by, for example, a convolutional neural network, and a feature map including feature information of the image to be detected is acquired.
  • the image to be detected may be an independent still image, or may be any frame image in the video sequence.
  • the acquired feature map may be a global feature map of the image to be detected or a non-global feature map, which is not limited in this embodiment.
  • a global feature map of the image to be detected or a local feature map including the target object may be respectively acquired.
  • the step S102 may be performed by a processor invoking a corresponding instruction stored in the memory, or may be performed by the acquisition module 1002 being executed by the processor.
  • Step S104 Perform feature extraction on the feature map based on at least two different scales by using a neural network to obtain at least two other feature maps.
  • the at least two other feature maps are feature maps obtained by the neural network performing further feature extraction operations, based on at least two different scales, on the feature map of the image to be detected, each scale corresponding to one other feature map.
  • the scale by which the neural network performs the feature extraction operation can define the scale of the feature extracted by the feature extraction operation.
  • the neural network extracts features based on different scales of the detected images, and learns and extracts features of different scales through the neural network, so that the features of the image to be detected can be stably and accurately extracted.
  • the embodiments of the present application can effectively cope with changes in the scale of features of the image to be detected caused by, for example, occlusion or perspective, thereby improving the robustness of feature extraction.
  • the different scales on which feature extraction is based may refer to different physical sizes of the image, or to different sizes of the effective portion of the image (for example, the physical size of the image is unchanged, but the pixel values of some pixels have been processed, e.g. set to zero; the remaining unprocessed pixels then constitute the effective portion, whose size is small relative to the physical size of the image), but are not limited thereto.
  • the at least two different scales may include an original scale of the image to be detected and at least one scale different from the original scale, or include at least two different scales different from the original scale.
  • step S104 may be performed by the processor invoking a corresponding instruction stored in the memory, or may be performed by the extraction module 1004 being executed by the processor.
  • Step S106 Combine the feature map and the at least two other feature maps to obtain a first feature map of the image to be detected.
  • the feature map and each of the other feature maps are combined to obtain a first feature map such that the first feature map includes extracted features of different scales.
  • the merging operation may include an adding operation or a series operation.
  • the merged first feature map can be used for subsequent image processing of the image to be detected, such as key point detection, object detection, object recognition, image segmentation, object clustering, etc., which can improve the effect of subsequent image processing.
  • the step S106 may be performed by a processor invoking a corresponding instruction stored in the memory, or may be performed by a merge module 1006 executed by the processor.
  • the feature map is extracted by the neural network based on a plurality of different scales to obtain a plurality of other feature maps, and the feature map and the plurality of other features are obtained.
  • the feature maps are merged to obtain the first feature map of the image to be detected, and the neural network is used to learn and extract features of different scales, which improves the accuracy and robustness of feature extraction of the neural network.
  • any of the image processing methods provided by the embodiments of the present application may be performed by any suitable device having data processing capabilities, including but not limited to: a terminal device, a server, and the like.
  • any image processing method provided by the embodiment of the present application may be executed by a processor, such as the processor, by executing a corresponding instruction stored in the memory to execute any one of the image processing methods mentioned in the embodiments of the present application. This will not be repeated below.
  • FIG. 2 a flow chart of another embodiment of an image processing method according to an embodiment of the present application is shown.
  • Step S202 Acquire a feature map of the image to be detected.
  • the feature extraction operation is performed on the image to be detected by the neural network to acquire the feature map.
  • the neural network includes a convolution layer (Convolution, Conv) for performing feature extraction, which performs preliminary detection and feature extraction operations on the image to be detected input to the neural network, and acquires a feature map containing initial feature information of the image to be detected.
  • the step S202 may be performed by a processor invoking a corresponding instruction stored in the memory, or may be performed by the acquisition module 1002 being executed by the processor.
  • Step S204 Perform feature extraction on the feature map based on at least two different scales by using a neural network to obtain at least two other feature maps.
  • step S204 may be performed by the processor invoking a corresponding instruction stored in the memory or by the extraction module 1004 being executed by the processor.
  • the neural network includes at least one feature pyramid sub-network for performing feature extraction on the feature map based on at least two different scales to obtain at least two other feature maps.
  • the feature pyramid sub-network includes a first branch network and at least one second branch network respectively connected in parallel with the first branch network.
  • the first branch network performs further feature extraction on the feature map input to the feature pyramid based on the original scale of the image to be detected to obtain a second feature map; at least one second branch network performs further feature extraction on the feature map based on other scales different from the original scale to obtain a third feature map. That is, the at least two other feature maps include the second feature map and the third feature map.
  • the first branch network includes a second convolution layer (Convolution 2, Conv2), a third convolution layer (Conv3), and a fourth convolution layer (Conv4).
  • the at least one second branch network includes a fifth convolution layer (Conv5), a downsampling layer, a sixth convolution layer (Conv6), an upsampling layer, and a seventh convolution layer (Conv7).
  • the first branch network is f0 and the second branch networks are f1 to fc, respectively, where f0 retains the original scale of the input features.
  • the feature map input to the feature pyramid sub-network is input to f0 to fc, respectively.
  • the second convolution layer of f0 and the fifth convolution layers of f1 to fc may each employ a convolution with a kernel size of 1×1 for reducing the dimension of the input feature map.
  • the downsampling layers of f1 to fc respectively downsample the reduced-dimension feature maps output by the fifth convolution layers according to the set downsampling ratios Ratio 1 to Ratio c, to obtain feature maps of different resolutions.
  • the scale of the feature map after downsampling is smaller than the original scale of the feature map.
  • the third convolution layer of f0 and the sixth convolution layers of f1 to fc may each adopt a convolution with a kernel size of 3×3, used to perform convolution processing on the reduced-dimension feature map output by the second convolution layer and on the downsampled feature maps output by the corresponding downsampling layers, respectively, to learn and extract features of different scales.
  • the upsampling layers of f1 to fc upsample the convolution-processed feature maps of the sixth convolution layers based on different upsampling ratios, wherein the scale of the upsampled feature map is equal to the original scale of the feature map.
  • the fourth convolution layer of f0 raises the dimension of the convolution-processed feature map output by the third convolution layer to obtain the second feature map.
  • the seventh convolution layers of f1 to fc raise the dimensions of the upsampled feature maps output by the corresponding upsampling layers to obtain the third feature maps, respectively.
  • among the at least two second branch networks f1 to fc, at least two of the second branch networks have different set downsampling ratios, and/or at least two of the second branch networks have the same set downsampling ratio. That is, the downsampling ratios adopted by the at least two second branch networks may be all different, partially the same, or all the same.
  • in conjunction with the first branch network based on the original scale, the feature pyramid sub-network can extract different features based on at least two different scales.
  • since f0 retains the original scale of the input feature, there is no need to change the resolution of the feature; therefore, f0 does not use a downsampling layer or an upsampling layer. In practical applications, f0 can also adopt a downsampling layer and an upsampling layer whose ratios are 1.
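The branches f0 to fc and their merge by addition can be sketched in numpy under strong simplifying assumptions: the convolutions are replaced by the identity, and nearest-neighbour resampling stands in for the downsampling and upsampling layers:

```python
import numpy as np

def downsample(x, ratio):
    """Nearest-neighbour downsampling by an integer ratio (stand-in for a downsampling layer)."""
    return x[:, ::ratio, ::ratio]

def upsample(x, ratio):
    """Nearest-neighbour upsampling by an integer ratio (stand-in for an upsampling layer)."""
    return x.repeat(ratio, axis=1).repeat(ratio, axis=2)

def pyramid_branches(feature_map, ratios):
    """f0 keeps the original scale; each fi downsamples by its ratio,
    'extracts features' (identity here), then upsamples back to the original scale."""
    outputs = [feature_map]                  # f0: original scale
    for r in ratios:
        small = downsample(feature_map, r)   # scale < original scale
        restored = upsample(small, r)        # scale == original scale again
        outputs.append(restored)
    return sum(outputs)                      # merge by point-to-point addition

x = np.ones((256, 64, 64))
y = pyramid_branches(x, ratios=[2, 4])
```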
  • the sixth convolutional layer of the at least two second branch networks share parameters.
  • the sixth convolution layers of at least two second branch networks share a convolution kernel, that is, the convolution kernels of the at least two sixth convolution layers have the same parameters, so as to reduce the number of parameters by adopting an internal parameter-sharing mechanism.
  • optionally, the structural form of the feature pyramid sub-network shown in FIG. 4 may be adopted, in which the at least one second branch network includes a fifth convolution layer, a dilated (expanded) convolution layer, and a seventh convolution layer;
  • the fifth convolution layer reduces the dimension of the feature map;
  • the dilated convolution layer performs dilated convolution processing on the reduced-dimension feature map;
  • the seventh convolution layer raises the dimension of the dilated-convolution feature map to obtain the third feature map. That is, the downsampling layer, the sixth convolution layer, and the upsampling layer of the at least one second branch network are replaced by a dilated convolution (shown as dstride 1 to dstride c in the figure) to simplify the feature pyramid sub-network.
  • the dilated convolution process can also achieve an effect similar to downsampling: for example, by setting the pixel values of some pixels of the feature map to 0, the region with effective pixel values is kept smaller while the physical size of the image remains unchanged, which likewise achieves the effect of downsampling.
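For the dilated-convolution variant, the growth of the receptive field can be checked with the standard formula k_eff = k + (k − 1)(d − 1); this formula is general knowledge about dilated convolutions, not quoted from the application:

```python
def effective_kernel_size(kernel_size, dilation):
    """Effective receptive field of one dilated convolution: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)

# a 3x3 kernel with dilation 2 covers the same span as a dense 5x5 kernel,
# without reducing the feature-map resolution
span = effective_kernel_size(3, 2)
```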
  • the at least two second branch networks share a fifth convolutional layer and/or a seventh convolutional layer, optionally the fifth convolutional layer and/or the seventh convolutional layer share network parameters.
  • the at least two second branch networks may also have respective fifth convolutional layers and/or seventh convolutional layers, and the network parameters of the fifth convolutional layer and/or the seventh convolutional layer are different.
  • the structural form of the feature pyramid sub-network shown in FIG. 5 may be adopted, and at least two second branch networks share the same fifth convolution layer.
  • the fifth convolution layer is a 1×1 convolution network; after performing dimension-reduction processing on the features input to the feature pyramid sub-network, its output is fed to the downsampling layers of the at least two second branch networks sharing the fifth convolution layer.
  • the feature pyramid sub-network of this structure has a small number of parameters and low computational complexity.
  • the feature pyramid sub-network further includes a first output merge layer, and the first output merge layer merges the respective outputs, before the seventh convolution layer, of the at least two second branch networks sharing the seventh convolution layer, and outputs the merged result to the shared seventh convolution layer.
  • the first output merge layer is connected between the upsampling layers of the at least two second branch networks and the shared seventh convolutional layer, and is used for merging the feature maps output by the upsampling layers of the at least two second branch networks and outputting the merged feature map to the seventh convolutional layer.
  • the merging process may include an adding operation or a series operation.
  • the output addition operation indicated in the figure can also be replaced by an output concatenation operation (Concatenation).
  • the adding operation can be represented as a point-to-point addition of a plurality of tensors
  • the concatenation operation can be represented as concatenating a plurality of tensors along one dimension. For example, if the c second branch networks f 1 to f c output c feature maps of size 256×64×64, the addition operation yields a 256×64×64 feature map, whereas the concatenation operation yields a (256×c)×64×64 feature map.
  • the seventh convolutional layer is further configured to linearly transform the features output by the at least two second branch networks so that they can be added to the original-scale features output by the first branch network. If the merging performed by the first output merge layer is a concatenation, the seventh convolutional layer is further configured to map the feature map output by the first output merge layer back to the size it had before concatenation; for example, the above (256×c)×64×64 feature map is transformed into a 256×64×64 feature map.
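The difference between the two merge operations, and the 1×1 mapping back to the pre-concatenation size, can be sketched in NumPy (assuming c = 3 branches and the 256×64×64 shapes from the example; the 1×1 convolution weights are hypothetical stand-ins):

```python
import numpy as np

c = 3  # number of second branch networks f_1..f_c (example value)
branch_outputs = [np.full((256, 64, 64), float(i + 1)) for i in range(c)]

# Addition: point-to-point sum of the c tensors; shape stays (256, 64, 64).
added = np.sum(branch_outputs, axis=0)

# Concatenation: stack along the channel dimension; shape becomes (256*c, 64, 64).
concatenated = np.concatenate(branch_outputs, axis=0)

# A 1x1 convolution (here a channel-mixing matrix multiply) maps the
# concatenated (256*c) channels back to 256 channels.
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.01, size=(256, 256 * c))   # hypothetical 1x1 conv weights
mapped = np.einsum('oc,chw->ohw', w, concatenated)
```

After the mapping, the result again has the same shape as the first branch's original-scale output, so the two can be added.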
  • Step S206: Merge the feature map and the at least two other feature maps to obtain a first feature map of the image to be detected.
  • the step S206 may be performed by a processor invoking a corresponding instruction stored in the memory, or may be performed by a merge module 1006 executed by the processor.
  • the feature pyramid sub-network further includes a second output merge layer, and the outputs of the first branch network and the at least two second branch networks are all connected to the second output merge layer, where the outputs of the second branch networks include the output of the shared seventh convolutional layer and the outputs of the upsampling layers of the at least two second branch networks that do not share the seventh convolutional layer.
  • the second output merge layer is configured to combine the feature map, the second feature map output by the first branch network, and the third feature map output by the at least two second branch networks to obtain the first feature map.
  • the merge processing is an addition operation.
  • the neural network includes at least two feature pyramid sub-networks; among the at least two feature pyramid sub-networks, the current feature pyramid sub-network takes as input the first feature map output by the previous feature pyramid sub-network connected to it, and extracts the first feature map of the current feature pyramid sub-network from the input first feature map based on different scales.
  • the second feature pyramid sub-network takes as input the first feature map output by the first feature pyramid sub-network and extracts the first feature map of the second feature pyramid sub-network based on different scales, and the input end of the second feature pyramid sub-network is connected to the output end of the first feature pyramid sub-network.
  • the input of the first feature pyramid sub-network is the feature map acquired in step S202, and its first feature map is obtained by performing steps S204 to S206; the input of each non-first feature pyramid sub-network is the first feature map output by the previous feature pyramid sub-network.
  • the neural network includes a plurality of feature pyramid sub-networks
  • the output of the previous feature pyramid sub-network may be an input of the adjacent subsequent feature pyramid sub-network.
  • x (l) and W (l) represent the input (feature map) and the parameters of the l-th feature pyramid sub-network
  • the output of the feature pyramid sub-network, that is, the input of the next feature pyramid sub-network, may be expressed as:
  • x (l+1) = x (l) + p(x (l) ; W (l) )    (1)
  • where p(x (l) ; W (l) ) is the feature extraction operation performed by a feature pyramid sub-network and can be further expressed as:
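Equation (1) composes sub-networks residually: each stage adds its extracted features back onto its input. A minimal sketch of this stacking pattern (the stand-in `p` here just scales its input, an assumption for illustration rather than the actual multi-branch extraction):

```python
import numpy as np

def p(x, w):
    # Stand-in for the feature extraction performed by one feature pyramid
    # sub-network; a real p(x; W) would be the multi-branch module.
    return w * x

def stack_pyramids(x0, weights):
    # Equation (1): x_{l+1} = x_l + p(x_l; W_l), applied sub-network by sub-network.
    x = x0
    for w in weights:
        x = x + p(x, w)
    return x

out = stack_pyramids(np.ones((2, 2)), [0.5, 0.5])
# stage 1: 1 + 0.5*1 = 1.5; stage 2: 1.5 + 0.5*1.5 = 2.25
```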
  • by using the feature pyramid sub-network as the basic building module together with the feature pyramid learning mechanism, the neural network can extract features of different scales.
  • the neural network may adopt the HOURGLASS network structure shown in FIG. 6 as an optional basic network structure, but is not limited thereto.
  • the neural network structure includes a plurality of HOURGLASS structures connected end-to-end to form a HOURGLASS network structure, and the HOURGLASS structure includes at least one feature pyramid sub-network.
  • the output of the previous HOURGLASS structure is the input of the adjacent latter HOURGLASS structure.
  • since the HOURGLASS network uses a Residual Unit as its basic building module, the feature pyramid sub-network of the present embodiment may be implemented as a Pyramid Residual Module (PRM) for building the HOURGLASS network structure.
  • the number of HOURGLASS structures and feature pyramid sub-networks can be appropriately set according to actual needs.
  • the HOURGLASS structure may be composed of a plurality of feature pyramid sub-networks to learn and extract features of different scales using the feature pyramid sub-network, and output the first feature map.
  • the feature pyramid sub-network may adopt the structure of any of the feature pyramid sub-networks shown in FIG. 3 to FIG. 5 above.
  • the neural network shown in FIG. 7 further includes a first convolution layer (Conv1), which can be used to perform the foregoing step S202 to acquire a feature map; and a pooling layer (Pooling, Pool), which can continuously reduce the resolution of the feature map.
  • the global feature obtained by globally pooling the feature map is interpolated and merged back into the feature map at the position of the corresponding resolution, thereby obtaining the feature map of the image to be detected.
  • the acquired feature map can be input into the feature pyramid sub-network, so that the feature pyramid sub-network performs deeper learning and extraction on the feature map, and then extracts the first feature map based on different scales.
  • a feature pyramid sub-network or a convolution layer may be disposed between the pooling layer and the feature pyramid sub-network for adjusting attributes such as resolution of the feature map.
  • Step S208: Perform key point detection on the target object in the image to be detected according to the first feature map.
  • step S208 may be performed by the processor invoking a corresponding instruction stored in the memory or by the detection module 1008 being executed by the processor.
  • the score map of the at least one key point of the target object is respectively acquired according to the first feature map; and the position of the corresponding key point of the target object is determined according to the score of the pixel points included in the score map of the at least one key point.
  • the first feature map of the image to be detected, acquired through the feature pyramid sub-network, describes the image to be detected based on different scales, so that key point detection can be performed stably and accurately on features of different scales, effectively improving the accuracy of key point detection.
  • the position with a higher score in the score map represents the detected position of the key point.
  • the output score map corresponds to at least one key point of the target object in the image to be detected.
  • the target object in the image to be detected is a person, including 16 key points, such as a hand, a knee, and the like.
  • the position detection of the 16 key points can be completed by determining, in each of the 16 score maps, the positions with higher scores (for example, the one or more highest-scoring positions) as the positions of the corresponding key points.
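The read-out of key point positions from score maps can be sketched as follows (synthetic 64×64 score maps with hand-placed peaks stand in for real network output; taking the single highest-scoring pixel per map is one simple choice):

```python
import numpy as np

def keypoints_from_score_maps(score_maps):
    # For each of the K score maps, return the (row, col) of its
    # highest-scoring pixel as the detected key point position.
    k, h, w = score_maps.shape
    flat = score_maps.reshape(k, -1).argmax(axis=1)
    return [(int(i) // w, int(i) % w) for i in flat]

score_maps = np.zeros((16, 64, 64))
for j in range(16):
    score_maps[j, j + 3, 2 * j] = 1.0     # synthetic peaks, one per key point

positions = keypoints_from_score_maps(score_maps)
```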
  • the image processing method of the embodiment of the present application can be used for, but not limited to, human body pose estimation, video comprehension analysis, behavior recognition and human-computer interaction, image segmentation, object clustering, and the like.
  • the image to be detected is input into the neural network, features are extracted based on different scales by using the feature pyramid sub-network, key points of the target object are detected according to the extracted features, and the posture of the human body is then estimated based on the positions of the detected key points.
  • the positions (for example, coordinates) of the key points corresponding to the 16 score maps shown in FIG. 8 are acquired, and the human body posture can be accurately estimated based on the positions of the 16 key points. Since the image processing method of the present embodiment utilizes the feature pyramid learning mechanism to extract features, target objects of different scales can be detected, thereby ensuring the robustness of the human body pose estimation.
  • for video frame images, the image processing method of this embodiment may be used: the feature pyramid learning mechanism stably extracts the feature maps of the video frame images, thereby accurately locating the key points of the target object, which is helpful for such analysis tasks.
  • the initialization network parameter of the at least one network layer of the neural network of the embodiment is obtained from a network parameter distribution determined according to the mean and variance of the network parameter.
  • the network parameter distribution may be a set Gaussian distribution or a uniform distribution, the mean and variance of the network parameter distribution are determined by the numbers of inputs and outputs of the layer containing the parameters, and the initial network parameters may be randomly sampled from the network parameter distribution.
  • this parameter initialization method can be used to train a neural network with a multi-branch network structure; it is applicable not only to single-branch networks but also to the training of feature pyramid residual modules with multiple branches, making the training process of the neural network more stable.
  • the mean value of the network parameters is initialized to 0 to ensure that the variance of the input and output of each layer of the neural network is substantially the same.
  • the initial network parameters can be sampled from a Gaussian distribution or a uniform distribution with a mean of 0 and a set variance, as the initialization network parameters for the forward propagation process.
  • the mean value of the network parameters is initialized to 0, so that the mean value of the gradient of the network parameters is 0, thereby ensuring that the variance of the input and output gradients of each layer of the neural network is substantially the same.
  • the initial network parameters can be sampled from a Gaussian distribution or a uniform distribution with a mean of 0 and a set gradient variance, as the initialization network parameters for the backward propagation process.
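A minimal sketch of such zero-mean initialization, with the variance set from the layer's input/output counts (the 2/fan scaling used here is a common variance-preserving choice and is an assumption, not necessarily the patent's specific formula):

```python
import numpy as np

def init_weights(shape, fan_in, fan_out, mode="forward", seed=0):
    # Zero-mean Gaussian whose variance is set from the layer's fan counts,
    # so activation (forward) or gradient (backward) variance is roughly
    # preserved layer to layer.
    fan = fan_in if mode == "forward" else fan_out
    std = np.sqrt(2.0 / fan)
    rng = np.random.default_rng(seed)
    return rng.normal(0.0, std, size=shape)

w_fwd = init_weights((128, 256), fan_in=256, fan_out=128, mode="forward")
w_bwd = init_weights((128, 256), fan_in=256, fan_out=128, mode="backward")
```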
  • an output adjustment module is set in at least one of the identity mapping branches to be added, and the output of that identity mapping branch is adjusted by the output adjustment module.
  • BN-ReLU-Conv: Batch Normalization - Rectified Linear Unit - Convolution
  • for the neural networks mentioned in the foregoing embodiments corresponding to FIG. 3 to FIG. 5, there are also cases in which multiple identity mapping branches are added; a BN-ReLU-Conv layer may be added to at least one of the identity mapping branches (e.g., f 0 , f 1 ... or f c ) to adjust the output of that branch, avoiding the problem that the variances of multiple added identity mapping branches are superimposed.
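The role of the BN-ReLU-Conv adjustment on one branch before the addition can be sketched in NumPy (a toy 1-D "feature map" and hypothetical gamma/beta/weight values; a real implementation would use a framework's BatchNorm and convolution layers):

```python
import numpy as np

def bn_relu_conv(x, gamma=1.0, beta=0.0, weight=0.5):
    # Toy BN -> ReLU -> 1x1-conv-like scaling applied to one identity
    # branch, so that summing several branches does not simply stack
    # their variances.
    xn = (x - x.mean()) / (x.std() + 1e-5)          # batch-norm style normalization
    activated = np.maximum(gamma * xn + beta, 0.0)  # ReLU
    return weight * activated                       # channel-mixing stand-in

rng = np.random.default_rng(0)
identity = rng.normal(0.0, 1.0, size=(256,))
combined = identity + bn_relu_conv(identity)        # adjusted branch + identity
```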
  • feature extraction is performed on the feature map of the image to be detected based on the feature pyramid sub-network of the neural network to obtain a plurality of other feature maps, and the feature map and the other feature maps are merged to obtain the first feature map of the image to be detected; the feature pyramid network is used to learn and extract features of different scales, which ensures the accuracy and robustness of the feature extraction performed by the neural network.
  • performing key point detection according to the acquired first feature map effectively improves the accuracy of key point detection.
  • the foregoing program may be stored in a computer-readable storage medium; when the program is executed, the steps of the foregoing method embodiments are performed; and the foregoing storage medium includes media that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disk.
  • FIG. 10 is a structural block diagram of an embodiment of an image processing apparatus according to an embodiment of the present application.
  • the image processing apparatus of the embodiment includes: an obtaining module 1002, configured to acquire a feature map of an image to be detected; and an extracting module 1004, configured to perform feature extraction on the feature image based on at least two different scales by using a neural network, to obtain at least two Other feature maps; a merging module 1006, configured to merge the feature map and the at least two other feature maps to obtain a first feature map of the image to be detected.
  • the apparatus of this embodiment further includes: a detecting module 1008, configured to perform key point detection on the target object in the image to be detected according to the first feature map.
  • the detecting module 1008 includes: a scoring unit (not shown), configured to respectively acquire a score map of at least one key point of the target object according to the first feature map; and a determining unit (not shown), configured to determine the position of the corresponding key point of the target object according to the scores of the pixels included in the score map of the at least one key point.
  • the neural network comprises at least one feature pyramid subnetwork comprising a first branch network and at least one second branch network respectively connected in parallel with the first branch network;
  • the other feature maps comprise a second feature map and/or a third feature map;
  • the extracting module 1004 is configured to perform feature extraction on the feature map based on the original scale of the feature map by using the first branch network to obtain a second feature map, and to perform feature extraction on the feature map based on other scales different from the original scale by using the at least one second branch network to obtain a third feature map.
  • the first branch network includes a second convolution layer, a third convolution layer, and a fourth convolution layer;
  • the extraction module 1004 is configured to reduce the dimension of the feature map based on the second convolutional layer; perform convolution on the reduced-dimension feature map based on the original scale of the feature map by using the third convolutional layer; and raise the dimension of the convolved feature map by using the fourth convolutional layer to obtain the second feature map.
  • the at least one second branch network includes a fifth convolutional layer, a downsampling layer, a sixth convolutional layer, an upsampling layer, and a seventh convolutional layer; the extraction module 1004 is configured to reduce the dimension of the feature map based on the fifth convolutional layer; downsample the reduced-dimension feature map according to a set downsampling ratio using the downsampling layer, where the scale of the downsampled feature map is smaller than the original scale of the feature map; perform convolution on the downsampled feature map using the sixth convolutional layer; upsample the convolved feature map according to a set upsampling ratio using the upsampling layer, where the scale of the upsampled feature map is equal to the original scale of the feature map; and raise the dimension of the upsampled feature map using the seventh convolutional layer to obtain the third feature map.
  • the sixth convolutional layers of at least two second branch networks share network parameters.
  • the second branch network includes a fifth convolutional layer, a dilated convolution layer, and a seventh convolutional layer; the extraction module 1004 is configured to reduce the dimension of the feature map based on the fifth convolutional layer, perform dilated convolution on the reduced-dimension feature map based on the dilated convolution layer, and raise the dimension of the dilated-convolved feature map based on the seventh convolutional layer to obtain the third feature map.
  • the fifth convolutional layer and/or the seventh convolutional layer of at least two second branch networks share network parameters.
  • the fifth convolutional layer and/or the seventh convolutional layer of the at least two second branch networks may each have different network parameters.
  • the feature pyramid sub-network further includes a first output merge layer; the first output merge layer is configured to merge the respective outputs, before the seventh convolutional layer, of the at least two second branch networks sharing the seventh convolutional layer, and to output the merged result to the shared seventh convolutional layer.
  • the neural network includes at least two feature pyramid sub-networks; the current feature pyramid sub-network is configured to take as input the first feature map output by the previous feature pyramid sub-network connected to it, and to extract the first feature map of the current feature pyramid sub-network from the input first feature map based on different scales.
  • the neural network includes at least two sequentially connected feature pyramid sub-networks
  • the second feature pyramid sub-network takes the first feature map output by the first feature pyramid sub-network as input and extracts the first feature map of the second feature pyramid sub-network based on different scales, and the input end of the second feature pyramid sub-network is connected to the output end of the first feature pyramid sub-network.
  • the neural network is an HOURGLASS (hourglass) neural network
  • at least one hourglass module included in the HOURGLASS neural network includes at least one feature pyramid sub-network.
  • the initialization network parameters of the at least one network layer of the neural network are obtained from a network parameter distribution determined according to the mean and variance of the initialization network parameters, and the mean value of the initialization network parameters is zero.
  • an output adjustment module is set in at least one of the identity mapping branches to be added, and the output adjustment module is configured to adjust the first feature map output by that identity mapping branch.
  • the image processing apparatus of the present embodiment is used to implement the corresponding image processing method in the foregoing method embodiments, and has the beneficial effects of the corresponding method embodiments, and details are not described herein again.
  • the embodiment further provides a computer readable storage medium having stored thereon computer program instructions, wherein the program instructions are executed by the processor to implement the steps of any of the image processing methods provided by the embodiments of the present application.
  • the embodiment further provides a computer program, comprising: at least one executable instruction, when the at least one executable instruction is executed by the processor, is used to implement the steps of any one of the image processing methods provided by the embodiments of the present application.
  • the embodiment further provides an electronic device, comprising: a processor and an image processing device provided by the embodiment of the present application; when the processor runs the image processing device, the module in the image processing device according to any one of the above items is run.
  • the embodiment of the present application provides an electronic device, such as a mobile terminal, a personal computer (PC), a tablet computer, a server, and the like.
  • the electronic device 1100 includes one or more processors and communication components.
  • the one or more processors, for example one or more central processing units (CPUs) 1101 and/or one or more graphics processing units (GPUs) 1113, may perform various appropriate actions and processes according to executable instructions stored in a read-only memory (ROM) 1102 or executable instructions loaded from the storage portion 1108 into a random access memory (RAM) 1103.
  • the communication component includes a communication component 1112 and/or a communication interface 1109.
  • the communication component 1112 may include, but is not limited to, a network card, which may include, but is not limited to, an IB (InfiniBand) network card; the communication interface 1109 includes the communication interface of a network interface card such as a LAN card or a modem, and the communication interface 1109 performs communication processing via a network such as the Internet.
  • the processor can communicate with the read-only memory 1102 and/or the random access memory 1103 to execute executable instructions, connect to the communication component 1112 via the communication bus 1104, and communicate with other target devices via the communication component 1112, thereby completing operations corresponding to any of the image processing methods provided by the embodiments of the present application, for example: acquiring a feature map of the image to be detected; performing feature extraction on the feature map based on at least two different scales by using a neural network to obtain at least two other feature maps; and merging the feature map and the other feature maps to obtain a first feature map of the image to be detected.
  • the RAM 1103 can also store various programs and data required for the operation of the device.
  • the CPU 1101 or the GPU 1113, the ROM 1102, and the RAM 1103 are connected to each other through a communication bus 1104.
  • ROM 1102 is an optional module.
  • the RAM 1103 stores executable instructions, or executable instructions are written into the ROM 1102 at runtime; the executable instructions cause the processor to perform operations corresponding to the above-described communication method.
  • An input/output (I/O) interface 1105 is also coupled to the communication bus 1104.
  • the communication component 1112 may be integrated, or may be configured to have multiple sub-modules (e.g., multiple IB network cards) and be linked on the communication bus.
  • the following components are connected to the I/O interface 1105: an input portion 1106 including a keyboard, a mouse, etc.; an output portion 1107 including a cathode ray tube (CRT), a liquid crystal display (LCD), and the like, and a speaker; a storage portion 1108 including a hard disk or the like And a communication interface 1109 including a network interface card such as a LAN card, modem, or the like.
  • Driver 1110 is also connected to I/O interface 1105 as needed.
  • a removable medium 1111 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory or the like is mounted on the drive 1110 as needed so that a computer program read therefrom is installed into the storage portion 1108 as needed.
  • FIG. 11 is only an optional implementation manner.
  • the number and types of the components shown in FIG. 11 may be selected, reduced, increased, or replaced according to actual needs; different functional components may be implemented in separate or integrated configurations.
  • the GPU 1113 and the CPU 1101 may be configured separately, or the GPU 1113 may be integrated on the CPU 1101; the communication components may be configured separately or integrated on the CPU 1101 or the GPU 1113; and so on. These alternative embodiments all fall within the protection scope of the present application.
  • the above method according to an embodiment of the present application may be implemented in hardware or firmware, or implemented as software or computer code that may be stored in a recording medium such as a CD-ROM, a RAM, a floppy disk, a hard disk, or a magneto-optical disk, or implemented as computer code that is downloaded over a network, originally stored in a remote recording medium or a non-transitory machine-readable medium, and to be stored in a local recording medium, so that the methods described herein can be processed by such software stored on a recording medium using a general-purpose computer, a dedicated processor, or programmable or dedicated hardware such as an ASIC or an FPGA.
  • a computer, processor, microprocessor controller, or programmable hardware includes storage components (e.g., RAM, ROM, flash memory, etc.) that can store or receive software or computer code; the processing methods described herein are implemented when the software or computer code is accessed and executed by the computer, processor, or hardware. Moreover, when a general-purpose computer accesses code for implementing the processing shown herein, execution of the code converts the general-purpose computer into a special-purpose computer for performing the processing shown herein.
  • the methods and apparatus of the present application may be implemented in a number of ways.
  • the methods and apparatus of the present application can be implemented in software, hardware, firmware, or any combination of software, hardware, and firmware.
  • the above-described sequence of steps for the method is for illustrative purposes only, and the steps of the method of the present application are not limited to the order specifically described above unless otherwise specifically stated.
  • the present application can also be implemented as a program recorded in a recording medium, the programs including machine readable instructions for implementing the method according to the present application.
  • the present application also covers a recording medium storing a program for executing the method according to the present application.


Abstract

Embodiments of the present application provide an image processing method, apparatus, storage medium, computer program, and electronic device, wherein the image processing method includes: acquiring a feature map of an image to be detected; performing feature extraction on the feature map based on at least two different scales by using a neural network to obtain at least two other feature maps; and merging the feature map and the at least two other feature maps to obtain a first feature map of the image to be detected. With the technical solutions of the embodiments of the present application, a neural network can be used to learn and extract features of different scales, improving the accuracy and robustness of feature extraction.

Description

Image processing method, apparatus, storage medium, computer program, and electronic device
This application claims priority to Chinese Patent Application No. CN201710632941.0, filed with the Chinese Patent Office on July 28, 2017 and entitled "Image processing method, apparatus, storage medium, computer program, and electronic device", the entire contents of which are incorporated herein by reference.
Technical Field
Embodiments of the present application relate to the field of computer vision technology, and in particular, to an image processing method, apparatus, storage medium, computer program, and electronic device.
Background
Human body pose estimation mainly locates the positions of the parts of the human body in a given image or video. It is an important research topic in the field of computer vision, and is mainly applied to action recognition, behavior recognition, clothing parsing, task comparison, human-computer interaction, and the like.
Summary
Embodiments of the present application provide an image processing solution.
According to a first aspect of the embodiments of the present application, an image processing method is provided, including: acquiring a feature map of an image to be detected; performing feature extraction on the feature map based on at least two different scales by using a neural network to obtain at least two other feature maps; and merging the feature map and the at least two other feature maps to obtain a first feature map of the image to be detected.
Optionally, in combination with any embodiment provided by the present application, the method further includes: performing key point detection on a target object in the image to be detected according to the first feature map.
Optionally, in combination with any embodiment provided by the present application, performing key point detection on the target object according to the first feature map includes: respectively acquiring a score map of at least one key point of the target object according to the first feature map; and determining the position of the corresponding key point of the target object according to the scores of the pixels included in the score map of the at least one key point.
Optionally, in combination with any embodiment provided by the present application, the neural network includes at least one feature pyramid sub-network, the feature pyramid sub-network including a first branch network and at least one second branch network connected in parallel with the first branch network; the other feature maps include a second feature map and/or a third feature map; and performing feature extraction on the feature map based on at least two different scales by using a neural network to obtain at least two other feature maps includes:
performing feature extraction on the feature map based on the original scale of the feature map by using the first branch network to obtain the second feature map; and performing feature extraction on the feature map based on other scales different from the original scale by using at least one second branch network to obtain the third feature map.
Optionally, in combination with any embodiment provided by the present application, the first branch network includes a second convolutional layer, a third convolutional layer, and a fourth convolutional layer; and performing feature extraction on the feature map based on the original scale of the feature map by using the first branch network to obtain the second feature map includes:
reducing the dimension of the feature map based on the second convolutional layer; performing convolution on the reduced-dimension feature map based on the original scale of the feature map by using the third convolutional layer; and raising the dimension of the convolved feature map by using the fourth convolutional layer to obtain the second feature map.
Optionally, in combination with any embodiment provided by the present application, at least one second branch network includes a fifth convolutional layer, a downsampling layer, a sixth convolutional layer, an upsampling layer, and a seventh convolutional layer; and performing feature extraction on the feature map based on other scales different from the original scale by using at least one second branch network to obtain the third feature map includes:
reducing the dimension of the feature map based on the fifth convolutional layer; downsampling the reduced-dimension feature map according to a set downsampling ratio based on the downsampling layer, where the scale of the downsampled feature map is smaller than the original scale of the feature map; performing convolution on the downsampled feature map based on the sixth convolutional layer; upsampling the convolved feature map according to a set upsampling ratio based on the upsampling layer, where the scale of the upsampled feature map is equal to the original scale of the feature map; and raising the dimension of the upsampled feature map based on the seventh convolutional layer to obtain the third feature map.
Optionally, in combination with any embodiment provided by the present application, there are a plurality of second branch networks; the set downsampling ratios of at least two second branch networks are different, and/or the set downsampling ratios of at least two second branch networks are the same.
Optionally, in combination with any embodiment provided by the present application, there are a plurality of second branch networks; the sixth convolutional layers of at least two second branch networks share network parameters.
Optionally, in combination with any embodiment provided by the present application, the second branch network includes a fifth convolutional layer, a dilated convolution layer, and a seventh convolutional layer; and performing feature extraction on the feature map based on other scales different from the original scale by using at least one second branch network to obtain the third feature map includes:
reducing the dimension of the feature map based on the fifth convolutional layer; performing dilated convolution on the reduced-dimension feature map based on the dilated convolution layer; and raising the dimension of the dilated-convolved feature map based on the seventh convolutional layer to obtain the third feature map.
Optionally, in combination with any embodiment provided by the present application, there are a plurality of second branch networks; the fifth convolutional layers and/or the seventh convolutional layers of at least two second branch networks share network parameters.
Optionally, in combination with any embodiment provided by the present application, the feature pyramid sub-network further includes a first output merge layer; the first output merge layer merges the respective outputs, before the seventh convolutional layer, of at least two second branch networks sharing the seventh convolutional layer, and outputs the merged result to the shared seventh convolutional layer.
Optionally, in combination with any embodiment provided by the present application, the neural network includes at least two sequentially connected feature pyramid sub-networks; a second feature pyramid sub-network takes the first feature map output by a first feature pyramid sub-network as input and extracts the first feature map of the second feature pyramid sub-network based on different scales, and the input end of the second feature pyramid sub-network is connected to the output end of the first feature pyramid sub-network.
Optionally, in combination with any embodiment provided by the present application, the neural network is an HOURGLASS (hourglass) neural network, and at least one hourglass module included in the HOURGLASS neural network includes at least one feature pyramid sub-network.
Optionally, in combination with any embodiment provided by the present application, the initialization network parameters of at least one network layer of the neural network are obtained from a network parameter distribution determined according to the mean and variance of the initialization network parameters, and the mean of the initialization network parameters is zero.
Optionally, in combination with any embodiment provided by the present application, when the neural network includes a case where at least two identity mappings are added, an output adjustment module is set in at least one of the identity mapping branches to be added, and the first feature map output by that identity mapping branch is adjusted by the output adjustment module.
According to a second aspect of the embodiments of the present application, an image processing apparatus is provided, including: an acquisition module configured to acquire a feature map of an image to be detected; an extraction module configured to perform feature extraction on the feature map based on at least two different scales by using a neural network to obtain at least two other feature maps; and a merge module configured to merge the feature map and the at least two other feature maps to obtain a first feature map of the image to be detected.
Optionally, in combination with any embodiment provided by the present application, the apparatus further includes: a detection module configured to perform key point detection on a target object in the image to be detected according to the first feature map.
Optionally, in combination with any embodiment provided by the present application, the detection module includes: a scoring unit configured to respectively acquire a score map of at least one key point of the target object according to the first feature map; and a determining unit configured to determine the position of the corresponding key point of the target object according to the scores of the pixels included in the score map of the at least one key point.
Optionally, in combination with any embodiment provided by the present application, the neural network includes at least one feature pyramid sub-network, the feature pyramid sub-network including a first branch network and at least one second branch network respectively connected in parallel with the first branch network; the other feature maps include a second feature map and/or a third feature map; and the extraction module is configured to perform feature extraction on the feature map based on the original scale of the feature map by using the first branch network to obtain the second feature map, and to perform feature extraction on the feature map based on other scales different from the original scale by using at least one second branch network to obtain the third feature map.
Optionally, in combination with any embodiment provided by the present application, the first branch network includes a second convolutional layer, a third convolutional layer, and a fourth convolutional layer; and the extraction module is configured to reduce the dimension of the feature map based on the second convolutional layer, perform convolution on the reduced-dimension feature map based on the original scale of the feature map by using the third convolutional layer, and raise the dimension of the convolved feature map by using the fourth convolutional layer to obtain the second feature map.
Optionally, in combination with any embodiment provided by the present application, at least one second branch network includes a fifth convolutional layer, a downsampling layer, a sixth convolutional layer, an upsampling layer, and a seventh convolutional layer; when the extraction module performs feature extraction on the feature map based on other scales different from the original scale by using at least one second branch network to obtain the third feature map, it is configured to: reduce the dimension of the feature map based on the fifth convolutional layer; downsample the reduced-dimension feature map according to a set downsampling ratio based on the downsampling layer, where the scale of the downsampled feature map is smaller than the original scale of the feature map; perform convolution on the downsampled feature map based on the sixth convolutional layer; upsample the convolved feature map according to a set upsampling ratio based on the upsampling layer, where the scale of the upsampled feature map is equal to the original scale of the feature map; and raise the dimension of the upsampled feature map based on the seventh convolutional layer to obtain the third feature map.
Optionally, in combination with any embodiment provided by the present application, there are a plurality of second branch networks; the set downsampling ratios of at least two second branch networks are different, and/or the set downsampling ratios of at least two second branch networks are the same.
Optionally, in combination with any embodiment provided by the present application, there are a plurality of second branch networks; the sixth convolutional layers of at least two second branch networks share network parameters.
Optionally, in combination with any embodiment provided by the present application, the second branch network includes a fifth convolutional layer, a dilated convolution layer, and a seventh convolutional layer; when the extraction module performs feature extraction on the feature map based on other scales different from the original scale by using at least one second branch network to obtain the third feature map, it is configured to reduce the dimension of the feature map based on the fifth convolutional layer, perform dilated convolution on the reduced-dimension feature map based on the dilated convolution layer, and raise the dimension of the dilated-convolved feature map based on the seventh convolutional layer to obtain the third feature map.
Optionally, in combination with any embodiment provided by the present application, there are a plurality of second branch networks; the fifth convolutional layers and/or the seventh convolutional layers of at least two second branch networks share network parameters.
Optionally, in combination with any embodiment provided by the present application, the feature pyramid sub-network further includes a first output merge layer; the first output merge layer is configured to merge the respective outputs, before the seventh convolutional layer, of at least two second branch networks sharing the seventh convolutional layer, and to output the merged result to the shared seventh convolutional layer.
Optionally, in combination with any embodiment provided by the present application, the neural network includes at least two sequentially connected feature pyramid sub-networks; a second feature pyramid sub-network takes the first feature map output by a first feature pyramid sub-network as input and extracts the first feature map of the second feature pyramid sub-network based on different scales, and the input end of the second feature pyramid sub-network is connected to the output end of the first feature pyramid sub-network.
Optionally, in combination with any embodiment provided by the present application, the neural network is an HOURGLASS (hourglass) neural network, and at least one hourglass module included in the HOURGLASS neural network includes at least one feature pyramid sub-network.
Optionally, in combination with any embodiment provided by the present application, the initialization network parameters of at least one network layer of the neural network are obtained from a network parameter distribution determined according to the mean and variance of the initialization network parameters, and the mean of the initialization network parameters is zero.
Optionally, in combination with any embodiment provided by the present application, when the neural network includes a case where at least two identity mappings are added, an output adjustment module is set in at least one of the identity mapping branches to be added, and the output adjustment module is configured to adjust the first feature map output by that identity mapping branch.
根据本申请实施例的第三方面,提供了一种计算机可读存储介质,其上存储有计算机程序指令,其中,所述程序指令被处理器执行时实现前述任一项图像处理方法的步骤。
根据本申请实施例的第四方面,提供了一种电子设备,包括:处理器和如上任一项所述的图像处理装置;在处理器运行所述图像处理装置时,如上任一项所述的图像处理装置中的模块被运行。
根据本申请实施例的第五方面，提供了一种电子设备，包括：处理器、存储器、通信元件和通信总线，所述处理器、所述存储器和所述通信元件通过所述通信总线完成相互间的通信；所述存储器用于存放至少一可执行指令，所述可执行指令使所述处理器执行前述任一项的图像处理方法对应的操作。
根据本申请实施例的第六方面，提供了一种计算机程序，包括：至少一可执行指令，所述至少一可执行指令被处理器处理时用于实现前述任一项图像处理方法对应的操作。
根据本申请实施例的图像处理方案,在获取待检测图像的特征图之后,通过神经网络基于多种不同尺度对特征图进行特征提取来获得多个其他特征图,并将特征图与多个其他特征图合并来得到待检测图像的第一特征图,利用神经网络学习和提取不同尺度的特征,提高了神经网络进行特征提取的准确性和鲁棒性。
下面通过附图和实施例,对本申请的技术方案做进一步的详细描述。
附图说明
构成说明书的一部分的附图描述了本申请的实施例,并且连同描述一起用于解释本申请的原理。
参照附图,根据下面的详细描述,可以更加清楚地理解本申请,其中:
图1是根据本申请实施例图像处理方法一个实施例的流程示意图。
图2是根据本申请实施例图像处理方法另一个实施例的流程示意图。
图3是根据本申请实施例图像处理方法另一个实施例的特征金字塔子网络的一种结构示意图。
图4是根据本申请实施例图像处理方法另一个实施例的特征金字塔子网络的另一种结构示意图。
图5是根据本申请实施例图像处理方法另一个实施例的特征金字塔子网络的又一种结构示意图。
图6是根据本申请实施例图像处理方法另一个实施例的一种用于图像处理的神经网络的结构示意图。
图7是根据本申请实施例图像处理方法另一个实施例的一种HOURGLASS网络的结构示意图。
图8是根据本申请实施例图像处理方法另一个实施例输出的得分图。
图9是根据本申请实施例图像处理方法另一个实施例的一种恒等映射相加的结构示意图。
图10是根据本申请实施例图像处理装置一个实施例的结构框图。
图11是根据本申请实施例电子设备一个实施例的结构示意图。
具体实施方式
现在将参照附图来详细描述本申请的各种示例性实施例。应注意到:除非另外具体说明,否则在这些实施例中阐述的部件和步骤的相对布置、数字表达式和数值不限制本申请的范围。
同时，应当明白，为了便于描述，附图中所示出的各个部分的尺寸并不是按照实际的比例关系绘制的。
以下对至少一个示例性实施例的描述实际上仅仅是说明性的,决不作为对本申请及其应用或使用的任何限制。
对于相关领域普通技术人员已知的技术、方法和设备可能不作详细讨论,但在适当情况下,所述技术、方法和设备应当被视为说明书的一部分。
应注意到:相似的标号和字母在下面的附图中表示类似项,因此,一旦某一项在一个附图中被定义,则在随后的附图中不需要对其进行进一步讨论。
本申请实施例可以应用于计算机系统/服务器，其可与众多其它通用或专用计算系统环境或配置一起操作。适于与计算机系统/服务器一起使用的众所周知的计算系统、环境和/或配置的例子包括但不限于：个人计算机系统、服务器计算机系统、瘦客户机、厚客户机、手持或膝上设备、基于微处理器的系统、机顶盒、可编程消费电子产品、网络个人电脑、小型计算机系统、大型计算机系统和包括上述任何系统的分布式云计算技术环境，等等。
计算机系统/服务器可以在由计算机系统执行的计算机系统可执行指令(诸如程序模块)的一般语境下描述。通常,程序模块可以包括例程、程序、目标程序、组件、逻辑、数据结构等等,它们执行特定的任务或者实现特定的抽象数据类型。计算机系统/服务器可以在分布式云计算环境中实施,分布式云计算环境中,任务是由通过通信网络链接的远程处理设备执行的。在分布式云计算环境中,程序模块可以位于包括存储设备的本地或远程计算系统存储介质上。
参照图1,示出了根据本申请实施例图像处理方法一个实施例的流程示意图。
本实施例的图像处理方法包括以下步骤:
步骤S102:获取待检测图像的特征图。
本实施例中,可以采用任意的图像分析处理方法来对待检测图像进行特征提取处理,以获取待检测图像的特征图。可选地,通过例如卷积神经网络对待检测图像进行特征提取操作,获取包括待检测图像的特征信息的特征图(Feature Map)。其中,待检测图像可以是独立的静态图像,也可以是视频序列中的任意一帧图像。
在这里说明，获取的特征图可以为待检测图像的全局特征图，也可以是非全局的特征图，本实施例对此不作限定。例如，在实际应用中，根据获取的特征图所用于的图像处理、物体识别等不同的应用场景，可以分别获取待检测图像的全局特征图、或包括目标物体的局部特征图。
在一个可选示例中,该步骤S102可以由处理器调用存储器存储的相应指令执行,也可以由被处理器运行的获取模块1002执行。
步骤S104:通过神经网络基于至少二种不同尺度对特征图进行特征提取,获得至少二个其他特征图。
其中,至少二个其他特征图为神经网络对待检测图像的特征图,基于至少二种不同尺度分别进行进一步的特征提取操作获得的特征图,每一种尺度对应于一个其他特征图。
神经网络进行特征提取操作所基于的尺度，能够限定特征提取操作所提取的特征的尺度。本申请实施例中，神经网络基于不同尺度对待检测图像进行特征提取，通过神经网络学习和提取不同尺度的特征，可以稳定准确地提取到待检测图像的特征。本申请实施例能够有效应对因遮挡、透视等问题造成待检测图像的特征尺度发生变化的情况，从而提高特征提取的鲁棒性。
在实际应用中，特征提取所基于的尺度不同，可以是图像的物理大小尺寸不同，或者图像的有效部分的尺寸不同（例如，虽然图像的物理大小尺寸相同，但该图像的部分像素的像素值已经采用但不限于置零等方式处理，由除这些处理后的像素之外的其他像素组成的部分相当于有效部分，有效部分的尺寸相对图像的物理尺寸较小）等，但不限于此。
可选地,至少二种不同尺度可以包括待检测图像的原始尺度与不同于原始尺度的至少一种尺度,或者,包括不同于原始尺度的至少二种不同尺度。
在一个可选示例中,该步骤S104可以由处理器调用存储器存储的相应指令执行,也可以由被处理器运行的提取模块1004执行。
步骤S106:合并特征图和至少二个其他特征图,得到待检测图像的第一特征图。
将特征图和各其他特征图进行合并得到第一特征图,使得第一特征图包括提取到的不同尺度的特征。可选地,合并操作可以包括相加操作或者串联操作。合并得到的第一特征图可用于对待检测图像进行后续的图像处理,例如关键点检测、物体检测、物体识别、图像分割、物体聚类等,能够提高后续的图像处理的效果。
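作为上述合并操作的一个示意性说明，下面给出一段基于NumPy的示例代码（其中的张量尺寸为便于演示而假设的数值，并非本申请实施例的正式实现），展示相加操作与串联操作对特征图形状的不同影响：

```python
import numpy as np

# 假设特征图与两个其他特征图均为 C×H×W = 4×8×8 的张量（示意用的小尺寸）
feat = np.random.randn(4, 8, 8)
other1 = np.random.randn(4, 8, 8)
other2 = np.random.randn(4, 8, 8)

# 相加操作：多个张量点对点相加，输出形状不变
merged_add = feat + other1 + other2
print(merged_add.shape)  # (4, 8, 8)

# 串联操作（Concatenation）：沿通道维串联，通道数变为原来的 3 倍
merged_concat = np.concatenate([feat, other1, other2], axis=0)
print(merged_concat.shape)  # (12, 8, 8)
```

可见，相加操作要求各特征图形状一致且不改变通道数，而串联操作会增大通道维，后续层需相应调整输入通道数。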
在一个可选示例中,该步骤S106可以由处理器调用存储器存储的相应指令执行,也可以由被处理器运行的合并模块1006执行。
根据本申请实施例的图像处理方法,在获取待检测图像的特征图之后,通过神经网络基于多种不同尺度对特征图进行特征提取来获得多个其他特征图,并将特征图与多个其他特征图合并来得到待检测图像的第一特征图,利用神经网络学习和提取不同尺度的特征,提高了神经网络进行特征提取的准确性和鲁棒性。
本申请实施例提供的任一种图像处理方法可以由任意适当的具有数据处理能力的设备执行,包括但不限于:终端设备和服务器等。或者,本申请实施例提供的任一种图像处理方法可以由处理器执行,如处理器通过调用存储器存储的相应指令来执行本申请实施例提及的任一种图像处理方法。下文不再赘述。
参照图2,示出了根据本申请实施例图像处理方法另一个实施例的流程示意图。
本实施例的图像处理方法包括以下步骤:
步骤S202:获取待检测图像的特征图。
本实施例中，通过神经网络对待检测图像进行特征提取操作来获取特征图。例如，神经网络包括用于进行特征提取的卷积层（Convolution，Conv），对输入神经网络的待检测图像进行初步检测和特征提取操作，获取待检测图像的初始特征图。
在一个可选示例中,该步骤S202可以由处理器调用存储器存储的相应指令执行,也可以由被处理器运行的获取模块1002执行。
步骤S204:通过神经网络基于至少二种不同尺度对特征图进行特征提取,获得至少二个其他特征图。
在一个可选示例中，该步骤S204可以由处理器调用存储器存储的相应指令执行，也可以由被处理器运行的提取模块1004执行。
可选地，神经网络包括至少一个特征金字塔子网络，用于基于至少二种不同尺度对特征图进行特征提取，获得至少二个其他特征图。特征金字塔子网络包括第一分支网络以及分别与第一分支网络并联的至少一个第二分支网络。第一分支网络基于待检测图像的原始尺度，对输入特征金字塔子网络的特征图进行进一步的特征提取，获得第二特征图；至少一个第二分支网络基于不同于该原始尺度的其他尺度对特征图进行进一步的特征提取，获取第三特征图。也即，至少二个其他特征图包括第二特征图和第三特征图。
一种可选的实施方式中，参照图3，第一分支网络包括第二卷积层（Convolution 2，Conv 2）、第三卷积层（Conv 3）和第四卷积层（Conv 4）。至少一第二分支网络包括第五卷积层（Conv 5）、降采样层、第六卷积层（Conv 6）、上采样层和第七卷积层（Conv 7）。
第一分支网络为f 0,至少二个第二分支网络分别为f 1至f c,其中,f 0保留输入特征的原始尺度。输入特征金字塔子网络的特征图分别输入到f 0至f c。f 0的第二卷积层以及f 1至f c的第五卷积层均可以采用卷积核大小为1×1的卷积网络,用于降低输入特征图的维度。f 1至f c的降采样层分别根据设定的降采样比例Ratio 1至Ratio c,分别对第五卷积层输出的降低维度后的特征图进行降采样,得到不同分辨率的特征图。其中,经过降采样后的特征图的尺度小于特征图的原始尺度。f 0的第三卷积层以及f 1至f c的第六卷积层均可以采用卷积核大小为3×3的卷积网络,用于分别对第二卷积层输出的降低维度后的特征图,以及相应的降采样层输出的经过降采样的特征图进行卷积,学习和提取不同尺度的特征。f 1至f c的上采样层分别基于不同的上采样比例,对第六卷积层输出的经过卷积的特征图进行上采样,其中,经过上采样后的特征图的尺度等于特征图的原始尺度。f 0的第四卷积层提升第三卷积层输出的经过卷积处理的特征图的维度,获得第二特征图。f 1至f c的第七卷积层提升对应上采样层输出的经过上采样的特征图的维度,分别获得第三特征图。
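上述第二分支网络"降维—降采样—卷积—上采样—升维"的流程，可以用如下NumPy示意代码粗略刻画（其中conv1x1、downsample、upsample均为为演示尺度变化而假设的简化实现，省略了第六卷积层的3×3卷积，并非本申请实施例的正式实现）：

```python
import numpy as np

def conv1x1(x, out_ch, rng):
    # 1×1 卷积：等价于在通道维上的线性变换（示意实现）
    w = rng.standard_normal((out_ch, x.shape[0])) * 0.1
    return np.einsum('oc,chw->ohw', w, x)

def downsample(x, ratio):
    # 按设定降采样比例做步长采样（示意，实际可用池化等方式）
    return x[:, ::ratio, ::ratio]

def upsample(x, ratio):
    # 最近邻插值上采样，使尺度恢复为原始尺度
    return x.repeat(ratio, axis=1).repeat(ratio, axis=2)

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 16, 16))  # 输入特征图，原始尺度 16×16
x = conv1x1(feat, 4, rng)                # 第五卷积层：降维 8→4
x = downsample(x, 2)                     # 降采样层：尺度 16×16 → 8×8
# （此处省略第六卷积层的 3×3 卷积，仅演示尺度变化）
x = upsample(x, 2)                       # 上采样层：尺度恢复为 16×16
out = conv1x1(x, 8, rng)                 # 第七卷积层：升维 4→8，得到第三特征图
print(out.shape)  # (8, 16, 16)
```

输出形状与输入特征图尺度一致，说明第三特征图可与原始尺度的第二特征图直接合并。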
其中,至少二个第二分支网络f 1至f c中,至少二个第二分支网络的设定降采样比例不同,和/或,至少二个第二分支网络的设定降采样比例相同。也即,至少二个第二分支网络采用的降采样比例可以均不相同,可以部分相同,也可以全都相同。对于这三种情况,与基于原始尺度的第一分支网络相配合,特征金字塔子网络能够基于至少二种不同尺度提取不同的特征。
此外，由于f 0保留输入特征的原始尺度，无需改变特征的分辨率，因此，f 0没有采用降采样层和上采样层。在实际应用中，f 0还可以采用降采样比例和上采样比例均为1的降采样层和上采样层。
可选地,至少二个第二分支网络的第六卷积层共享参数。例如,至少二个第二分支网络的第六卷积层共享卷积核,也即,至少二个第六卷积层的卷积核具有相同的参数,以通过采用内部参数共享机制,来降低参数数量,同时还能够基于通过数据和任务学习得到的参数获得较高的准确率。
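共享第六卷积层的卷积核对参数数量的影响，可以用如下简单的参数量估算示意（其中的通道数与分支数均为假设数值，仅说明共享机制降低参数量的原理）：

```python
# 假设每个第二分支网络的第六卷积层为 3×3 卷积，输入/输出通道均为 4（示意数值）
in_ch, out_ch, k = 4, 4, 3
params_per_conv = in_ch * out_ch * k * k   # 单个第六卷积层的权重参数量：144
branches = 3                               # 第二分支网络个数 c = 3

params_unshared = branches * params_per_conv  # 各分支独立参数：3 × 144 = 432
params_shared = params_per_conv               # 共享卷积核后仅需一份参数：144
print(params_unshared, params_shared)  # 432 144
```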
另一种可选的实施方式中，还可以采用图4示出的特征金字塔子网络的结构形式，至少一第二分支网络包括第五卷积层、膨胀卷积层和第七卷积层；第五卷积层降低特征图的维度；膨胀卷积层对降低维度后的特征图进行膨胀卷积处理；第七卷积层提升经过膨胀卷积后的特征图的维度，获得第三特征图。也即，将至少一第二分支网络的降采样层、第六卷积层和上采样层由膨胀卷积层（dilated convolution，图中表示为dstride 1至dstride c）代替，简化特征金字塔子网络内部的网络结构，并可以增加输入特征的分辨率，利用膨胀卷积层来完成不同分辨率特征的采样操作、不同尺度特征的提取操作以及同样分辨率特征的采样操作等，从而获取不同尺度的特征。其中，膨胀卷积处理也可以实现降采样。例如，采用将特征图的一部分像素的像素值置0的方式，在保持图像的物理尺寸大小一致的情况下，将特征图中具有有效像素值的部分变小，同样也实现了降采样的效果。
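膨胀卷积在不降低特征图分辨率的情况下扩大感受野的效果，可以用如下一维NumPy示意实现粗略说明（dilated_conv1d为演示用的假设函数，并非标准库API，也非本申请实施例的正式实现）：

```python
import numpy as np

def dilated_conv1d(x, w, dilation):
    # 膨胀卷积（示意的一维实现）：卷积核各抽头间隔 dilation 个采样点，
    # 在不降低特征图分辨率的情况下扩大感受野
    k = len(w)
    span = (k - 1) * dilation + 1          # 等效感受野大小
    out = np.array([
        sum(w[j] * x[i + j * dilation] for j in range(k))
        for i in range(len(x) - span + 1)
    ])
    return out, span

x = np.arange(10, dtype=float)
w = np.array([1.0, 1.0, 1.0])
y1, span1 = dilated_conv1d(x, w, dilation=1)  # 普通卷积，感受野为 3
y2, span2 = dilated_conv1d(x, w, dilation=2)  # 膨胀率 2，感受野扩大为 5
print(span1, span2)  # 3 5
```

膨胀率越大，同样大小的卷积核覆盖的输入范围越大，从而在同一分辨率下提取更大尺度的特征。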
可选地,至少二个第二分支网络共享第五卷积层和/或第七卷积层,可选地,第五卷积层和/或第七卷积层共享网络参数。
可选地,至少二个第二分支网络还可以具有各自的第五卷积层和/或第七卷积层,第五卷积层和/或第七卷积层的网络参数不同。
例如,为了简化特征金字塔子网络的结构,可以采用图5示出的特征金字塔子网络的结构形式,将至少二个第二分支网络共享同一个第五卷积层。例如,第五卷积层为1×1的卷积网络,在将输入特征金字塔子网络的特征进行降维处理后,输出至共享该第五卷积层的至少二个第二分支网络的降采样层。该结构的特征金字塔子网络的参数数量较少,计算复杂度较低。
可选地,特征金字塔子网络还包括第一输出合并层,第一输出合并层对共享第七卷积层的至少二个第二分支网络在第七卷积层之前的各自输出进行合并、并将合并结果输出至共享的第七卷积层。
例如，第一输出合并层连接在共享所述第七卷积层的至少二个第二分支网络的上采样层与第七卷积层之间，用于对至少二个第二分支网络的上采样层输出的特征图进行合并处理，并将合并后的特征图输出至第七卷积层。这里，合并处理可以包括相加操作或者串联操作。例如，图中示出的相加符号表示输出相加操作，该相加符号还可以替换为串联符号，表示输出串联操作（Concatenation）。其中，相加操作可以表示为多个张量的点对点相加，串联操作可以表示为多个张量在一个维度上的串联。若c个第二分支网络f 1至f c输出c个256×64×64的特征图，经相加操作后还是256×64×64的特征图，经串联操作后则会变成(256×c)×64×64的特征图。
此外，第七卷积层还用于将至少二个第二分支网络输出的特征进行线性变换，以便与第一分支网络输出的原始尺度的特征相加。如果第一输出合并层进行的合并处理为串联操作，第七卷积层还用于对第一输出合并层输出的特征图进行映射变换处理，以将特征图映射变换为串联前的特征图的大小。例如，将上述(256×c)×64×64的特征图映射变换为256×64×64的特征图。
步骤S206:合并特征图和至少二个其他特征图,得到待检测图像的第一特征图。
在一个可选示例中,该步骤S206可以由处理器调用存储器存储的相应指令执行,也可以由被处理器运行的合并模块1006执行。
可选地,特征金字塔子网络还包括第二输出合并层,第一分支网络和至少二个第二分支网络的输出端均连接至第二输出合并层,这里,第二分支网络的输出端包括共享的第七卷积层的输出端,以及未共享第七卷积层的至少二个第二分支网络的上采样层的输出端。第二输出合并层用于将特征图,第一分支网络输出的第二特征图,以及至少二个第二分支网络输出的第三特征图进行合并处理,获取第一特征图。这里,合并处理为相加操作。
本实施例中，神经网络包括至少二个特征金字塔子网络；其中，当前特征金字塔子网络以与其连接的前一特征金字塔子网络输出的第一特征图为输入，并根据输入的第一特征图，基于不同尺度提取当前特征金字塔子网络的第一特征图。
可选地,第二特征金字塔子网络以第一特征金字塔子网络输出的第一特征图为输入,基于不同尺度提取第二特征金字塔子网络的第一特征图,第二特征金字塔子网络的输入端与第一特征金字塔子网络的输出端相连接。
其中，首个特征金字塔子网络的输入是步骤S202获取的特征图，执行步骤S204至步骤S206获取第一特征图；非首个特征金字塔子网络的输入为前一特征金字塔子网络输出的第一特征图，并执行步骤S204至步骤S206，基于至少二种不同尺度对输入的第一特征图进行特征提取，将获取的其他特征图与输入的第一特征图进行合并，得到当前特征金字塔子网络的第一特征图。
本实施例中，神经网络包括多个特征金字塔子网络，前一个特征金字塔子网络的输出，可以为相邻的后一特征金字塔子网络的输入。例如，若x^(l)和W^(l)表示第l个特征金字塔子网络的输入（特征图）和参数，则该特征金字塔子网络的输出，也即，下一个特征金字塔子网络的输入可以表示为：

x^(l+1) = x^(l) + P(x^(l); W^(l))  (1)

其中，P(x^(l); W^(l))为一个特征金字塔子网络所执行的特征提取操作，并可以进一步表示为：

P(x^(l); W^(l)) = g(Σ_{c=1}^{C} f_c(x^(l); w_fc^(l)); w_g^(l)) + f_0(x^(l); w_f0^(l))  (2)

其中，C为第二分支网络的个数，f_c(x^(l); w_fc^(l))表示至少二个第二分支网络f c所执行的特征提取操作，f_0(x^(l); w_f0^(l))表示第一分支网络f 0所执行的特征提取操作，g(·; w_g^(l))表示第七卷积层所执行的处理。
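上述残差式金字塔更新过程可以用如下NumPy示意代码表达（f0、fc、g均以最简单的变换代替真实卷积层，仅用于展示各项的组合方式，并非本申请实施例的正式实现）：

```python
import numpy as np

rng = np.random.default_rng(0)

def f0(x):
    # 第一分支网络：基于原始尺度的特征提取（此处以恒等变换示意）
    return x

def fc(x, scale):
    # 第二分支网络：示意性地以"降采样-再上采样"模拟其他尺度的特征提取
    down = x[:, ::scale, ::scale]
    return down.repeat(scale, axis=1).repeat(scale, axis=2)

def g(x):
    # 第七卷积层对分支输出之和的线性变换（此处以缩放示意）
    return 0.5 * x

def pyramid(x, scales=(2, 4)):
    # 对应公式 (2)：P(x) = g(Σ_c f_c(x)) + f_0(x)
    return g(sum(fc(x, s) for s in scales)) + f0(x)

x = rng.standard_normal((4, 8, 8))
x_next = x + pyramid(x)   # 对应公式 (1)：x^(l+1) = x^(l) + P(x^(l); W^(l))
print(x_next.shape)  # (4, 8, 8)
```

输出与输入形状一致，因此多个特征金字塔子网络可以顺序堆叠，前一子网络的输出直接作为后一子网络的输入。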
在实际应用中,神经网络可通过以特征金字塔子网络为基本组成模块,利用特征金字塔学习机制,来提取不同尺度的特征。
一种可选的实施方式中,神经网络可采用图6中示出的沙漏(HOURGLASS)网络结构作为一种可选的基本网络结构,但不限于此。神经网络结构包括的多个HOURGLASS结构端对端连接,形成HOURGLASS网络结构,HOURGLASS结构包括至少一个特征金字塔子网络。前一HOURGLASS结构的输出为相邻的后一HOURGLASS结构的输入,通过这种网络结构,使得自底向上、自顶向下地分析和学习贯穿模型始终,从而使得神经网络提取的特征更加有效且准确,保证获取的第一特征图的准确性。其中,由于HOURGLASS网络采用残差模块(Residual Unit)作为基本组成模块,因此,本实施例的特征金字塔子网络可以为用于形成HOURGLASS网络结构的特征金字塔残差模块(Pyramids Residual Module,PRM)。这里,HOURGLASS结构以及特征金字塔子网络的数量可以根据实际需要适当设定。
图7示出的HOURGLASS网络结构中，HOURGLASS结构可以由多个特征金字塔子网络组成，以利用特征金字塔子网络来学习和提取不同尺度的特征，并输出第一特征图。其中，特征金字塔子网络可以采用上述图3至图5示出的任一种特征金字塔子网络的结构。其中，图7示出的神经网络还包括第一卷积层（Conv1），可用于执行前述步骤S202获取特征图；以及池化层（Pooling，Pool），可用于不断减小特征图的分辨率，以得到全局特征，然后将全局特征插值放大，和特征图中对应分辨率的位置结合，也即，通过对特征图进行全局池化，获取待检测图像的特征图。获取的特征图可以输入特征金字塔子网络，使得特征金字塔子网络对特征图进行更深层次的学习和提取，进而基于不同尺度提取第一特征图。可选地，在池化层和特征金字塔子网络之间还可以设置特征金字塔子网络或卷积层，用于调整特征图的分辨率等属性。
步骤S208:根据第一特征图对待检测图像中的目标对象进行关键点检测。
在一个可选示例中,该步骤S208可以由处理器调用存储器存储的相应指令执行,也可以由被处理器运行的检测模块1008执行。
可选地,根据第一特征图分别获取目标对象的至少一关键点的得分图;根据至少一关键点的得分图中所包括的像素点的分数,确定目标对象的相应关键点的位置。通过特征金字塔子网络获取的待检测图像的第一特征图,基于不同尺度来检测提取待检测图像的特征,可以稳定准确地检测到不同尺度的特征,在此基础上,根据第一特征图进行关键点检测,有效地提高了关键点检测的准确性。
一种可选的实施方式中，针对某一个关键点，得分图中分数较高的位置，代表检测到的该关键点位置。如图8所示，与输入神经网络的待检测图像相对应，输出的得分图对应待检测图像中目标对象的至少一个关键点。其中，待检测图像中目标对象为人，包括16个关键点，例如手、膝盖等。将16个得分图中得分较高（例如：得分最高的一个或多个）的位置，确定为对应关键点的位置，即可完成对16个关键点的定位检测。
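根据得分图确定关键点位置的过程，可以用如下NumPy示意代码说明（此处以随机得分图代替网络真实输出，仅演示"取最高分位置"这一步骤）：

```python
import numpy as np

# 假设神经网络针对 16 个关键点输出 16 张 64×64 的得分图（此处用随机数示意）
rng = np.random.default_rng(0)
score_maps = rng.random((16, 64, 64))

# 取每张得分图中分数最高的像素点位置，作为相应关键点的位置
keypoints = []
for m in score_maps:
    idx = np.unravel_index(np.argmax(m), m.shape)  # (行, 列) 坐标
    keypoints.append(idx)

print(len(keypoints))  # 16
```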
在实际应用场景中,本申请实施例的图像处理方法可用于但不限于进行人体姿态估计、视频理解分析、行为识别和人机交互、图像分割、物体聚类等。
例如,在进行人体姿态估计时,将待检测图像输入神经网络,利用特征金字塔子网络基于不同尺度进行特征提取,并根据提取的特征对目标对象进行关键点检测,从而依据检测到的至少一个关键点的位置进行人体姿态估计。例如,获取图8中示出的16个得分图对应的关键点的位置(例如,坐标),根据16个关键点的位置可以准确地估计出人体姿态。由于本实施例的图像处理方法利用特征金字塔学习机制来提取特征,可以检测不同尺度的目标对象,从而保证人体姿态估计的鲁棒性。
再例如,对于包含目标对象的视频序列,可以采用本实施例的图像处理方法,利用特征金字塔学习机制来稳定提取视频帧图像的特征图,进而准确地进行目标对象的关键点定位,有助于实现视频理解分析。
可选地，本实施例的神经网络的至少一网络层的初始化网络参数，从根据网络参数的均值和方差确定的网络参数分布中获取。其中，网络参数分布可以为一个设定的高斯分布或者均匀分布，该网络参数分布的均值和方差由带参数层的输入和输出个数决定，初始化网络参数可以从该网络参数分布中随机采样获得。该参数初始化方法不仅适用于基于单分支网络提出的神经网络的训练，还可适用于具有多分支网络结构的特征金字塔残差模块的训练，使得神经网络的训练过程更加稳定。
例如,在网络参数初始化过程中,对于神经网络前向传播过程,将网络参数的均值初始化为0,以保证神经网络每一层的输入和输出的方差基本一致。在获取网络参数的方差σ之后,就可以从一个均值为0,方差为σ的高斯分布或均匀分布中对初始化网络参数进行采样,作为前向传播过程的初始化网络参数。对于神经网络后向传播过程,将网络参数的均值初始化为0,使得网络参数的梯度的均值为0,从而保证神经网络每一层的输入和输出梯度的方差基本一致。在获取网络参数的梯度的方差σ′之后,就可以从一个均值为0,梯度的方差为σ′的高斯分布或均匀分布中对初始化网络参数进行采样,作为后向传播过程的初始化网络参数。
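上述均值为零、方差由输入输出个数决定的初始化思路，可用如下NumPy示意代码表达（init_weights为演示用的假设函数，方差的计算方式取常见的一种选择，未必与本申请实施例完全一致）：

```python
import numpy as np

def init_weights(fan_in, fan_out, rng):
    # 均值为 0 的初始化：方差由该带参数层的输入/输出个数决定（示意），
    # 以使前向传播中每一层输入与输出的方差基本一致
    sigma = np.sqrt(2.0 / (fan_in + fan_out))
    return rng.normal(loc=0.0, scale=sigma, size=(fan_out, fan_in))

rng = np.random.default_rng(0)
w = init_weights(fan_in=256, fan_out=256, rng=rng)
print(w.shape)               # (256, 256)
print(abs(w.mean()) < 0.01)  # 采样均值接近 0
```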
可选地,若神经网络中存在包括至少二个恒等映射(Identity Mapping)相加的情形,则在需要相加的至少一恒等映射分支中设置输出调整模块,通过输出调整模块调整该恒等映射分支输出的第一特征图。
例如，在神经网络中如果存在至少二个恒等映射相加的情形（不妨以图9示出的两个为例进行说明），则可在其中任一恒等映射分支中设置批量规范化-激活函数-卷积（BN-ReLU-Conv，batch normalization-Rectified Linear Units-Convolution）模块，以调整该恒等映射分支输出的方差范围等参数。如此处理后，在两个恒等映射的输出相加时，可避免这两个恒等映射分支产生的输出响应的方差成倍增加的问题，有利于保持神经网络学习过程的稳定性。
又例如，在上述图3至图5对应的实施例中提及的神经网络，也均存在多个恒等映射分支相加的情形，可在其中至少一个恒等映射分支（如f 0、f 1……或f c）中增加设置BN-ReLU-Conv层，由此调整该分支的输出，避免多个恒等映射分支相加时出现响应方差叠加等问题。
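恒等映射分支相加导致响应方差成倍增加、以及通过输出调整缓解该问题的效果，可以用如下NumPy数值示意说明（此处以简单的缩放代替BN-ReLU-Conv模块的实际作用，仅演示方差变化的趋势）：

```python
import numpy as np

rng = np.random.default_rng(0)
branch = rng.standard_normal(100000)  # 某恒等映射分支的输出，方差约为 1

# 两个完全相同的恒等映射分支直接相加：输出响应方差成倍增加
summed = branch + branch
print(round(summed.var() / branch.var()))  # 4（完全相同的分支相加时方差变为 4 倍）

# 在其中一个分支上设置输出调整模块（此处以缩放示意其调整输出的作用）
adjusted = branch + 0.5 * branch
print(adjusted.var() / branch.var())  # 2.25，方差增长得到抑制
```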
根据本申请实施例的图像处理方法,通过神经网络的特征金字塔子网络,基于多种不同尺度对待检测图像的特征图进行特征提取,并将获得多个其他特征图与特征图合并,来得到待检测图像的第一特征图,利用特征金字塔网络学习和提取不同尺度的特征,保证了神经网络进行特征提取的准确性和鲁棒性;在此基础上,根据获取的第一特征图来进行关键点检测,有效地提高了关键点检测的准确性。
本领域普通技术人员可以理解:实现上述方法实施例的全部或部分步骤可以通过程序指令相关的硬件来完成,前述的程序可以存储于一计算机可读取存储介质中,该程序在执行时,执行包括上述方法实施例的步骤;而前述的存储介质包括:ROM、RAM、磁碟或者光盘等各种可以存储程序代码的介质。
参照图10,示出了根据本申请实施例图像处理装置一个实施例的结构框图。
本实施例的图像处理装置,包括:获取模块1002,用于获取待检测图像的特征图;提取模块1004,用于通过神经网络基于至少二种不同尺度对特征图进行特征提取,获得至少二个其他特征图;合并模块1006,用于合并特征图和至少二个所述其他特征图,得到待检测图像的第一特征图。
可选地,本实施例装置还包括:检测模块1008,用于根据第一特征图对待检测图像中的目标对象进行关键点检测。
可选地,所述检测模块1008包括:得分单元(图中未示出),用于根据第一特征图分别获取所述目标对象的至少一关键点的得分图;确定单元(图中未示出),用于根据至少一关键点的得分图中所包括的像素点的分数,确定目标对象的相应关键点的位置。
可选地,神经网络包括至少一个特征金字塔子网络,特征金字塔子网络包括第一分支网络以及分别与第一分支网络并联的至少一个第二分支网络;其他特征图包括第二特征图和/或第三特征图;
提取模块1004，用于利用第一分支网络基于特征图的原始尺度对特征图进行特征提取，获得第二特征图；利用至少一个第二分支网络分别基于不同于原始尺度的其他尺度对特征图进行特征提取，获得第三特征图。
可选地,第一分支网络包括第二卷积层、第三卷积层和第四卷积层;
提取模块1004，用于基于第二卷积层降低特征图的维度；利用第三卷积层基于特征图的原始尺度对降低维度后的特征图进行卷积处理；利用第四卷积层提升经过卷积处理的特征图的维度，获得第二特征图。
可选地，至少一第二分支网络包括第五卷积层、降采样层、第六卷积层、上采样层和第七卷积层；提取模块1004，用于基于第五卷积层降低特征图的维度；基于降采样层根据设定降采样比例对降低维度后的特征图进行降采样，其中，经过降采样后的特征图的尺度小于特征图的原始尺度；基于第六卷积层对经过降采样的特征图进行卷积处理；基于上采样层根据设定上采样比例，对经过卷积的特征图进行上采样，其中，经过上采样后的特征图的尺度等于特征图的原始尺度；基于第七卷积层提升经过上采样后的特征图的维度，获得第三特征图。
可选地,第二分支网络有多个;至少二个第二分支网络的设定降采样比例不同,和/或,至少二个第二分支网络的设定降采样比例相同。
可选地,第二分支网络有多个;至少二个第二分支网络的第六卷积层共享网络参数。
可选地,第二分支网络包括第五卷积层、膨胀卷积层和第七卷积层;
提取模块1004，用于基于第五卷积层降低特征图的维度；基于膨胀卷积层对降低维度后的特征图进行膨胀卷积处理；基于第七卷积层提升经过膨胀卷积后的特征图的维度，获得第三特征图。
可选地,第二分支网络有多个;至少二个第二分支网络的第五卷积层和/或第七卷积层共享网络参数。
可选地,至少二个第二分支网络的第五卷积层和/或第七卷积层还可以各自具有不同的网络参数。
可选地,特征金字塔子网络还包括第一输出合并层;第一输出合并层用于对共享第七卷积层的至少二个第二分支网络在第七卷积层之前的各自输出进行合并、并将合并结果输出至共享的第七卷积层。
可选地,神经网络包括至少两个特征金字塔子网络;特征金字塔子网络,用于以与当前特征金字塔子网络连接的前一特征金字塔子网络输出的第一特征图为输入,并根据输入的第一特征图,基于不同尺度提取当前特征金字塔子网络的第一特征图。
可选地,神经网络包括至少两个顺序连接的特征金字塔子网络;
第二特征金字塔子网络以第一特征金字塔子网络输出的第一特征图为输入,基于不同尺度提取第二特征金字塔子网络的第一特征图,第二特征金字塔子网络的输入端与第一特征金字塔子网络的输出端相连接。
可选地,神经网络为沙漏HOURGLASS神经网络,沙漏HOURGLASS神经网络包括的至少一沙漏模块包括至少一特征金字塔子网络。
可选地,神经网络的至少一网络层的初始化网络参数,从根据初始化网络参数的均值和方差确定的网络参数分布中获取,且初始化网络参数的均值为零。
可选地,当神经网络中存在包括至少二个恒等映射相加的情形,在需要相加的至少一恒等映射分支中设置输出调整模块,输出调整模块用于调整该恒等映射分支输出的第一特征图。
本实施例的图像处理装置用于实现前述方法实施例中相应的图像处理方法,并具有相应的方法实施例的有益效果,在此不再赘述。
本实施例还提供一种计算机可读存储介质,其上存储有计算机程序指令,其中,该程序指令被处理器执行时实现本申请实施例提供的任一种图像处理方法的步骤。
本实施例还提供一种计算机程序,包括:至少一可执行指令,所述至少一可执行指令被处理器执行时用于实现本申请实施例提供的任一种图像处理方法的步骤。
本实施例还提供一种电子设备,包括:处理器和本申请实施例提供的图像处理装置;在处理器 运行所述图像处理装置时,上述任一项所述的图像处理装置中的模块被运行。
本申请实施例提供了一种电子设备,例如可以是移动终端、个人计算机(PC)、平板电脑、服务器等。下面参考图11,其示出了适于用来实现本申请实施例的终端设备或服务器的电子设备1100的结构示意图:如图11所示,电子设备1100包括一个或多个处理器、通信元件等,所述一个或多个处理器例如:一个或多个中央处理单元(CPU)1101,和/或一个或多个图像处理器(GPU)1113等,处理器可以根据存储在只读存储器(ROM)1102中的可执行指令或者从存储部分1108加载到随机访问存储器(RAM)1103中的可执行指令而执行各种适当的动作和处理。通信元件包括通信组件1112和/或通信接口1109。其中,通信组件1112可包括但不限于网卡,所述网卡可包括但不限于IB(Infiniband)网卡,通信接口1109包括诸如LAN卡、调制解调器等的网络接口卡的通信接口,通信接口1109经由诸如因特网的网络执行通信处理。
处理器可与只读存储器1102和/或随机访问存储器1103中通信以执行可执行指令,通过通信总线1104与通信组件1112相连、并经通信组件1112与其他目标设备通信,从而完成本申请实施例提供的任一项图像处理方法对应的操作,例如,获取待检测图像的特征图;通过神经网络基于至少二种不同尺度对所述特征图进行特征提取,获得至少二个其他特征图;合并所述特征图和各所述其他特征图,得到所述待检测图像的第一特征图。
此外,在RAM 1103中,还可存储有装置操作所需的各种程序和数据。CPU1101或GPU1113、ROM1102以及RAM1103通过通信总线1104彼此相连。在有RAM1103的情况下,ROM1102为可选模块。RAM1103存储可执行指令,或在运行时向ROM1102中写入可执行指令,可执行指令使处理器执行上述通信方法对应的操作。输入/输出(I/O)接口1105也连接至通信总线1104。通信组件1112可以集成设置,也可以设置为具有多个子模块(例如多个IB网卡),并在通信总线链接上。
以下部件连接至I/O接口1105:包括键盘、鼠标等的输入部分1106;包括诸如阴极射线管(CRT)、液晶显示器(LCD)等以及扬声器等的输出部分1107;包括硬盘等的存储部分1108;以及包括诸如LAN卡、调制解调器等的网络接口卡的通信接口1109。驱动器1110也根据需要连接至I/O接口1105。可拆卸介质1111,诸如磁盘、光盘、磁光盘、半导体存储器等等,根据需要安装在驱动器1110上,以便于从其上读出的计算机程序根据需要被安装入存储部分1108。
需要说明的,如图11所示的架构仅为一种可选实现方式,在具体实践过程中,可根据实际需要对上述图11的部件数量和类型进行选择、删减、增加或替换;在不同功能部件设置上,也可采用分离设置或集成设置等实现方式,例如GPU1113和CPU1101可分离设置或者可将GPU1113集成在CPU1101上,通信元件可分离设置,也可集成设置在CPU1101或GPU1113上,等等。这些可替换的实施方式均落入本申请的保护范围。
特别地，根据本申请实施例，上文参考流程图描述的过程可以被实现为计算机软件程序。例如，本申请实施例包括一种计算机程序产品，其包括有形地包含在机器可读介质上的计算机程序，计算机程序包含用于执行流程图所示的方法的程序代码，程序代码可包括对应执行本申请实施例提供的方法步骤对应的指令，例如，获取待检测图像的特征图；通过神经网络基于至少二种不同尺度对所述特征图进行特征提取，获得至少二个其他特征图；合并所述特征图和各所述其他特征图，得到所述待检测图像的第一特征图。在这样的实施例中，该计算机程序可以通过通信元件从网络上被下载和安装，和/或从可拆卸介质1111被安装。在该计算机程序被处理器执行时，执行本申请实施例的方法中限定的上述功能。
需要指出,根据实施的需要,可将本申请实施例中描述的各个部件/步骤拆分为更多部件/步骤,也可将两个或多个部件/步骤或者部件/步骤的部分操作组合成新的部件/步骤,以实现本申请实施例的目的。
上述根据本申请实施例的方法可在硬件、固件中实现,或者被实现为可存储在记录介质(诸如CD ROM、RAM、软盘、硬盘或磁光盘)中的软件或计算机代码,或者被实现通过网络下载的原始存储在远程记录介质或非暂时机器可读介质中并将被存储在本地记录介质中的计算机代码,从而在此描述的方法可被存储在使用通用计算机、专用处理器或者可编程或专用硬件(诸如ASIC或FPGA)的记录介质上的这样的软件处理。可以理解,计算机、处理器、微处理器控制器或可编程硬件包括可存储或接收软件或计算机代码的存储组件(例如,RAM、ROM、闪存等),当所述软件或计算机代码被计算机、处理器或硬件访问且执行时,实现在此描述的处理方法。此外,当通用计算机访问用于实现在此示出的处理的代码时,代码的执行将通用计算机转换为用于执行在此示出的处理的专用计算机。
本说明书中各个实施例均采用递进的方式描述,每个实施例重点说明的都是与其它实施例的不同之处,各个实施例之间相同或相似的部分相互参见即可。对于系统实施例而言,由于其与方法实施例基本对应,所以描述的比较简单,相关之处参见方法实施例的部分说明即可。
可能以许多方式来实现本申请的方法和装置。例如,可通过软件、硬件、固件或者软件、硬件、固件的任何组合来实现本申请的方法和装置。用于所述方法的步骤的上述顺序仅是为了进行说明,本申请的方法的步骤不限于以上具体描述的顺序,除非以其它方式特别说明。此外,在一些实施例中,还可将本申请实施为记录在记录介质中的程序,这些程序包括用于实现根据本申请的方法的机器可读指令。因而,本申请还覆盖存储用于执行根据本申请的方法的程序的记录介质。
本申请的描述是为了示例和描述起见而给出的,而并不是无遗漏的或者将本申请限于所公开的形式。很多修改和变化对于本领域的普通技术人员而言是显然的。选择和描述实施例是为了更好说明本申请的原理和实际应用,并且使本领域的普通技术人员能够理解本申请从而设计适于特定用途的带有各种修改的各种实施例。

Claims (34)

  1. 一种图像处理方法,包括:
    获取待检测图像的特征图;
    通过神经网络基于至少二种不同尺度对所述特征图进行特征提取,获得至少二个其他特征图;
    合并所述特征图和至少二个所述其他特征图,得到所述待检测图像的第一特征图。
  2. 根据权利要求1所述的方法,其特征在于,还包括:
    根据所述第一特征图对所述待检测图像中的目标对象进行关键点检测。
  3. 根据权利要求2所述的方法,其特征在于,所述根据所述第一特征图对所述目标对象进行关键点检测,包括:
    根据所述第一特征图分别获取所述目标对象的至少一关键点的得分图;
    根据所述至少一个关键点的得分图中所包括的像素点的分数,确定所述目标对象的相应关键点的位置。
  4. 根据权利要求1至3中任一所述的方法,其特征在于,所述神经网络包括至少一个特征金字塔子网络,所述特征金字塔子网络包括第一分支网络以及与所述第一分支网络并联的至少一个第二分支网络;所述其他特征图包括第二特征图和/或第三特征图;
    所述通过神经网络基于至少二种不同尺度对所述特征图进行特征提取,获得至少二个其他特征图,包括:
    利用所述第一分支网络基于所述特征图的原始尺度对所述特征图进行特征提取,获得所述第二特征图;
    利用至少一个所述第二分支网络基于不同于所述原始尺度的其他尺度对所述特征图进行特征提取,获得所述第三特征图。
  5. 根据权利要求4所述的方法,其特征在于,所述第一分支网络包括第二卷积层、第三卷积层和第四卷积层;
    所述利用所述第一分支网络基于所述特征图的原始尺度对所述特征图进行特征提取,获得所述第二特征图,包括:
    基于所述第二卷积层降低所述特征图的维度;
    利用所述第三卷积层基于所述特征图的原始尺度对所述降低维度后的特征图进行卷积处理;
    利用所述第四卷积层提升所述经过卷积处理的特征图的维度,获得所述第二特征图。
  6. 根据权利要求4或5所述的方法,其特征在于,所述第二分支网络包括第五卷积层、降采样层、第六卷积层、上采样层和第七卷积层;
    所述利用至少一个所述第二分支网络分别基于不同于所述原始尺度的其他尺度对所述特征图进行特征提取,获得所述第三特征图,包括:
    基于所述第五卷积层降低所述特征图的维度;
    基于所述降采样层根据设定降采样比例对降低维度后的特征图进行降采样,其中,经过降采样后的 特征图的尺度小于所述特征图的原始尺度;
    基于所述第六卷积层对所述经过降采样的特征图进行卷积处理;
    基于所述上采样层根据设定上采样比例,对经过卷积的特征图进行上采样,其中,经过上采样后的特征图的尺度等于所述特征图的原始尺度;
    基于所述第七卷积层提升经过上采样后的特征图的维度,获得所述第三特征图。
  7. 根据权利要求6所述的方法,其特征在于,所述第二分支网络有多个;
    至少二个所述第二分支网络的设定降采样比例不同,和/或,至少二个所述第二分支网络的设定降采样比例相同。
  8. 根据权利要求6或7所述的方法,其特征在于,所述第二分支网络有多个;
    至少二个所述第二分支网络的所述第六卷积层共享网络参数。
  9. 根据权利要求4或5所述的方法,其特征在于,所述第二分支网络包括第五卷积层、膨胀卷积层和第七卷积层;
    所述利用至少一个所述第二分支网络分别基于不同于所述原始尺度的其他尺度对所述特征图进行特征提取,获得所述第三特征图,包括:
    基于所述第五卷积层降低所述特征图的维度;
    基于所述膨胀卷积层对降低维度后的所述特征图进行膨胀卷积处理,
    基于所述第七卷积层提升经过膨胀卷积后的特征图的维度,获得所述第三特征图。
  10. 根据权利要求6至9中任一所述的方法,其特征在于,所述第二分支网络有多个;
    至少二个所述第二分支网络的所述第五卷积层和/或所述第七卷积层共享网络参数。
  11. 根据权利要求10所述的方法,其特征在于,所述特征金字塔子网络还包括第一输出合并层;
    所述第一输出合并层对共享所述第七卷积层的至少二个所述第二分支网络在所述第七卷积层之前的各自输出进行合并、并将合并结果输出至共享的所述第七卷积层。
  12. 根据权利要求1至11中任一项所述的方法,其特征在于,所述神经网络包括至少两个顺序连接的特征金字塔子网络;
    第二特征金字塔子网络以第一特征金字塔子网络输出的第一特征图为输入,基于不同尺度提取所述第二特征金字塔子网络的第一特征图,所述第二特征金字塔子网络的输入端与所述第一特征金字塔子网络的输出端相连接。
  13. 根据权利要求12所述的方法,其特征在于,所述神经网络为沙漏HOURGLASS神经网络,所述沙漏HOURGLASS神经网络包括的至少一沙漏模块包括至少一所述特征金字塔子网络。
  14. 根据权利要求1至13中任一项所述的方法,其特征在于,所述神经网络的至少一网络层的初始化网络参数,从根据所述初始化网络参数的均值和方差确定的网络参数分布中获取,且所述初始化网络参数的均值为零。
  15. 根据权利要求1至14中任一项所述的方法,其特征在于,当所述神经网络中存在包括至少二个恒等映射相加的情形,在需要相加的至少一恒等映射分支中设置输出调整模块,通过输出调整模块调整该恒等映射分支输出的第一特征图。
  16. 一种图像处理装置,其特征在于,包括:
    获取模块,用于获取待检测图像的特征图;
    提取模块,用于通过神经网络基于至少二种不同尺度对所述特征图进行特征提取,获得至少二个其他特征图;
    合并模块,用于合并所述特征图和至少二个所述其他特征图,得到所述待检测图像的第一特征图。
  17. 根据权利要求16所述的装置,其特征在于,所述装置还包括:
    检测模块,用于根据所述第一特征图对所述待检测图像中的目标对象进行关键点检测。
  18. 根据权利要求17所述的装置,其特征在于,所述检测模块包括:
    得分单元,用于根据所述第一特征图分别获取所述目标对象的至少一关键点的得分图;
    确定单元,用于根据所述至少一个关键点的得分图中所包括的像素点的分数,确定所述目标对象的相应关键点的位置。
  19. 根据权利要求16至18中任一所述的装置,其特征在于,所述神经网络包括至少一个特征金字塔子网络,所述特征金字塔子网络包括第一分支网络以及分别与所述第一分支网络并联的至少一个第二分支网络;所述其他特征图包括第二特征图和/或第三特征图;
    所述提取模块,用于利用所述第一分支网络基于所述特征图的原始尺度对所述特征图进行特征提取,获得所述第二特征图;
    利用至少一个所述第二分支网络基于不同于所述原始尺度的其他尺度对所述特征图进行特征提取,获得所述第三特征图。
  20. 根据权利要求19所述的装置,其特征在于,所述第一分支网络包括第二卷积层、第三卷积层和第四卷积层;
    所述提取模块,用于基于所述第二卷积层降低所述特征图的维度;
    利用所述第三卷积层基于所述特征图的原始尺度对所述降低维度后的特征图进行卷积处理;
    利用所述第四卷积层提升所述经过卷积处理的特征图的维度,获得所述第二特征图。
  21. 根据权利要求19或20所述的装置,其特征在于,所述第二分支网络包括第五卷积层、降采样层、第六卷积层、上采样层和第七卷积层;
    所述提取模块利用至少一个所述第二分支网络基于不同于所述原始尺度的其他尺度对所述特征图进行特征提取,获得所述第三特征图时,用于基于所述第五卷积层降低所述特征图的维度;基于所述降采样层根据设定降采样比例对降低维度后的特征图进行降采样,其中,经过降采样后的特征图的尺度小于所述特征图的原始尺度;基于所述第六卷积层对所述经过降采样的特征图进行卷积处理;基于所述上采样层根据设定上采样比例,对经过卷积的特征图进行上采样,其中,经过上采样后的特征图的尺度等于所述特征图的原始尺度;基于所述第七卷积层提升经过上采样后的特征图的维度,获得所述第三特征图。
  22. 根据权利要求21所述的装置,其特征在于,所述第二分支网络有多个;
    至少二个所述第二分支网络的设定降采样比例不同,和/或,至少二个所述第二分支网络的设定降采样比例相同。
  23. 根据权利要求21或22所述的装置,其特征在于,所述第二分支网络有多个;
    至少二个所述第二分支网络的所述第六卷积层共享网络参数。
  24. 根据权利要求19或20所述的装置,其特征在于,所述第二分支网络包括第五卷积层、膨胀卷积层和第七卷积层;
    所述提取模块利用至少一个所述第二分支网络基于不同于所述原始尺度的其他尺度对所述特征图进行特征提取,获得所述第三特征图时,用于基于所述第五卷积层降低所述特征图的维度;基于所述膨胀卷积层对降低维度后的所述特征图进行膨胀卷积处理,基于所述第七卷积层提升经过膨胀卷积后的特征图的维度,获得所述第三特征图。
  25. 根据权利要求21至24中任一所述的装置,其特征在于,所述第二分支网络有多个;
    至少二个所述第二分支网络的所述第五卷积层和/或所述第七卷积层共享网络参数。
  26. 根据权利要求25所述的装置,其特征在于,所述特征金字塔子网络还包括第一输出合并层;
    所述第一输出合并层用于对共享所述第七卷积层的至少二个所述第二分支网络在所述第七卷积层之前的各自输出进行合并、并将合并结果输出至共享的所述第七卷积层。
  27. 根据权利要求16至26中任一项所述的装置,其特征在于,所述神经网络包括至少两个顺序连接的特征金字塔子网络;
    第二特征金字塔子网络以第一特征金字塔子网络输出的第一特征图为输入,基于不同尺度提取所述第二特征金字塔子网络的第一特征图,所述第二特征金字塔子网络的输入端与所述第一特征金字塔子网络的输出端相连接。
  28. 根据权利要求27所述的装置,其特征在于,所述神经网络为沙漏HOURGLASS神经网络,所述沙漏HOURGLASS神经网络包括的至少一沙漏模块包括至少一所述特征金字塔子网络。
  29. 根据权利要求16至18中任一项所述的装置,其特征在于,所述神经网络的至少一网络层的初始化网络参数,从根据所述初始化网络参数的均值和方差确定的网络参数分布中获取,且所述初始化网络参数的均值为零。
  30. 根据权利要求16至29中任一项所述的装置,其特征在于,当所述神经网络中存在包括至少二个恒等映射相加的情形,在需要相加的至少一恒等映射分支中设置输出调整模块,所述输出调整模块用于调整该恒等映射分支输出的第一特征图。
  31. 一种计算机可读存储介质,其特征在于,其上存储有计算机程序指令,其中,所述计算机程序指令被处理器执行时实现权利要求1至15中任一项所述的图像处理方法。
  32. 一种电子设备,其特征在于,包括:
    处理器和权利要求16-30任一项所述的图像处理装置;在所述处理器运行所述图像处理装置时,权利要求16-30任一项所述的图像处理装置中的模块被运行。
  33. 一种电子设备,其特征在于,包括:处理器、存储器、通信元件和通信总线,所述处理器、所述存储器和所述通信元件通过所述通信总线完成相互间的通信;
    所述存储器用于存放至少一可执行指令,所述可执行指令使所述处理器执行如权利要求1至15中任一项所述的图像处理方法对应的操作。
  34. 一种计算机程序,其特征在于,包括:至少一可执行指令,所述至少一可执行指令被处理器执行时用于执行实现如权利要求1至15中任一项所述的图像处理方法对应的操作。
PCT/CN2018/097227 2017-07-28 2018-07-26 图像处理方法、装置、存储介质、计算机程序和电子设备 WO2019020075A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710632941.0 2017-07-28
CN201710632941.0A CN108229497B (zh) 2017-07-28 2017-07-28 图像处理方法、装置、存储介质、计算机程序和电子设备

Publications (1)

Publication Number Publication Date
WO2019020075A1 true WO2019020075A1 (zh) 2019-01-31

Family

ID=62655195

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/097227 WO2019020075A1 (zh) 2017-07-28 2018-07-26 图像处理方法、装置、存储介质、计算机程序和电子设备

Country Status (2)

Country Link
CN (1) CN108229497B (zh)
WO (1) WO2019020075A1 (zh)

Cited By (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110472732A (zh) * 2019-08-19 2019-11-19 杭州凝眸智能科技有限公司 优化特征提取方法及其神经网络结构
CN110807757A (zh) * 2019-08-14 2020-02-18 腾讯科技(深圳)有限公司 基于人工智能的图像质量评估方法、装置及计算机设备
CN111047630A (zh) * 2019-11-13 2020-04-21 芯启源(上海)半导体科技有限公司 神经网络和基于神经网络的目标检测及深度预测方法
CN111190952A (zh) * 2019-12-23 2020-05-22 中电海康集团有限公司 一种基于图像金字塔提取城市画像多尺度特征并持久化的方法
CN111414990A (zh) * 2020-02-20 2020-07-14 北京迈格威科技有限公司 卷积神经网络处理方法、装置、电子设备及存储介质
CN111476740A (zh) * 2020-04-28 2020-07-31 北京大米未来科技有限公司 图像处理方法、装置、存储介质和电子设备
CN111523377A (zh) * 2020-03-10 2020-08-11 浙江工业大学 一种多任务的人体姿态估计和行为识别的方法
CN111739097A (zh) * 2020-06-30 2020-10-02 上海商汤智能科技有限公司 测距方法及装置、电子设备及存储介质
CN111783934A (zh) * 2020-05-15 2020-10-16 北京迈格威科技有限公司 卷积神经网络构建方法、装置、设备及介质
CN111860557A (zh) * 2019-04-30 2020-10-30 北京市商汤科技开发有限公司 图像处理方法及装置、电子设备及计算机存储介质
CN111914997A (zh) * 2020-06-30 2020-11-10 华为技术有限公司 训练神经网络的方法、图像处理方法及装置
CN111932530A (zh) * 2020-09-18 2020-11-13 北京百度网讯科技有限公司 三维对象检测方法、装置、设备和可读存储介质
CN112116060A (zh) * 2019-06-21 2020-12-22 杭州海康威视数字技术股份有限公司 一种网络配置实现方法及装置
CN112149558A (zh) * 2020-09-22 2020-12-29 驭势科技(南京)有限公司 一种用于关键点检测的图像处理方法、网络和电子设备
CN112184687A (zh) * 2020-10-10 2021-01-05 南京信息工程大学 基于胶囊特征金字塔的道路裂缝检测方法和存储介质
CN112528900A (zh) * 2020-12-17 2021-03-19 南开大学 基于极致下采样的图像显著性物体检测方法及系统
CN112633156A (zh) * 2020-12-22 2021-04-09 浙江大华技术股份有限公司 车辆检测方法、图像处理装置以及计算机可读存储介质
CN112836804A (zh) * 2021-02-08 2021-05-25 北京迈格威科技有限公司 图像处理方法、装置、电子设备及存储介质
CN112883981A (zh) * 2019-11-29 2021-06-01 阿里巴巴集团控股有限公司 一种图像处理方法、设备及存储介质
CN113076914A (zh) * 2021-04-16 2021-07-06 咪咕文化科技有限公司 一种图像处理方法、装置、电子设备和存储介质
CN113344862A (zh) * 2021-05-20 2021-09-03 北京百度网讯科技有限公司 缺陷检测方法、装置、电子设备及存储介质
CN113591573A (zh) * 2021-06-28 2021-11-02 北京百度网讯科技有限公司 多任务学习深度网络模型的训练及目标检测方法、装置
TWI749423B (zh) * 2019-07-18 2021-12-11 大陸商北京市商湯科技開發有限公司 圖像處理方法及裝置、電子設備和電腦可讀儲存介質
CN113837104A (zh) * 2021-09-26 2021-12-24 大连智慧渔业科技有限公司 基于卷积神经网络的水下鱼类目标检测方法、装置及存储介质
US20210407041A1 (en) * 2019-05-30 2021-12-30 Boe Technology Group Co., Ltd. Image processing method and device, training method of neural network, and storage medium
CN116091486A (zh) * 2023-03-01 2023-05-09 合肥联宝信息技术有限公司 表面缺陷检测方法、装置、电子设备及存储介质
CN112633156B (zh) * 2020-12-22 2024-05-31 浙江大华技术股份有限公司 车辆检测方法、图像处理装置以及计算机可读存储介质

Families Citing this family (32)

Publication number Priority date Publication date Assignee Title
CN108229497B (zh) * 2017-07-28 2021-01-05 北京市商汤科技开发有限公司 图像处理方法、装置、存储介质、计算机程序和电子设备
CN108921225B (zh) * 2018-07-10 2022-06-24 深圳市商汤科技有限公司 一种图像处理方法及装置、计算机设备和存储介质
CN109325972B (zh) * 2018-07-25 2020-10-27 深圳市商汤科技有限公司 激光雷达稀疏深度图的处理方法、装置、设备及介质
CN109344840B (zh) * 2018-08-07 2022-04-01 深圳市商汤科技有限公司 图像处理方法和装置、电子设备、存储介质、程序产品
CN109117888A (zh) * 2018-08-20 2019-01-01 北京旷视科技有限公司 目标对象识别方法及其神经网络生成方法以及装置
CN110163197B (zh) * 2018-08-24 2023-03-10 腾讯科技(深圳)有限公司 目标检测方法、装置、计算机可读存储介质及计算机设备
CN109360633B (zh) * 2018-09-04 2022-08-30 北京市商汤科技开发有限公司 医疗影像处理方法及装置、处理设备及存储介质
CN110956190A (zh) * 2018-09-27 2020-04-03 深圳云天励飞技术有限公司 图像识别方法及装置、计算机装置和计算机可读存储介质
CN109359676A (zh) 2018-10-08 2019-02-19 百度在线网络技术(北京)有限公司 用于生成车辆损伤信息的方法和装置
CN109410218B (zh) 2018-10-08 2020-08-11 百度在线网络技术(北京)有限公司 用于生成车辆损伤信息的方法和装置
CN109447088A (zh) * 2018-10-16 2019-03-08 杭州依图医疗技术有限公司 一种乳腺影像识别的方法及装置
CN111091593B (zh) * 2018-10-24 2024-03-22 深圳云天励飞技术有限公司 图像处理方法、装置、电子设备及存储介质
CN109670397B (zh) 2018-11-07 2020-10-30 北京达佳互联信息技术有限公司 人体骨骼关键点的检测方法、装置、电子设备及存储介质
CN111191486B (zh) * 2018-11-14 2023-09-05 杭州海康威视数字技术股份有限公司 一种溺水行为识别方法、监控相机及监控系统
CN113591754B (zh) * 2018-11-16 2022-08-02 北京市商汤科技开发有限公司 关键点检测方法及装置、电子设备和存储介质
CN109670516B (zh) * 2018-12-19 2023-05-09 广东工业大学 一种图像特征提取方法、装置、设备及可读存储介质
CN109784194B (zh) * 2018-12-20 2021-11-23 北京图森智途科技有限公司 目标检测网络构建方法和训练方法、目标检测方法
CN109784350A (zh) * 2018-12-29 2019-05-21 天津大学 结合空洞卷积与级联金字塔网络的服饰关键点定位方法
US11048935B2 (en) * 2019-01-28 2021-06-29 Adobe Inc. Generating shift-invariant neural network outputs
CN109815770B (zh) * 2019-01-31 2022-09-27 北京旷视科技有限公司 二维码检测方法、装置及系统
CN109871890A (zh) * 2019-01-31 2019-06-11 北京字节跳动网络技术有限公司 图像处理方法和装置
CN109902738B (zh) * 2019-02-25 2021-07-20 深圳市商汤科技有限公司 网络模块和分配方法及装置、电子设备和存储介质
CN110390394B (zh) * 2019-07-19 2021-11-05 深圳市商汤科技有限公司 批归一化数据的处理方法及装置、电子设备和存储介质
CN110503063B (zh) * 2019-08-28 2021-12-17 东北大学秦皇岛分校 基于沙漏卷积自动编码神经网络的跌倒检测方法
CN110619604B (zh) * 2019-09-17 2022-11-22 中国气象局公共气象服务中心(国家预警信息发布中心) 三维降尺度方法、装置、电子设备及可读存储介质
CN112784629A (zh) * 2019-11-06 2021-05-11 株式会社理光 图像处理方法、装置和计算机可读存储介质
CN111291660B (zh) * 2020-01-21 2022-08-12 天津大学 一种基于空洞卷积的anchor-free交通标志识别方法
CN111582206B (zh) * 2020-05-13 2023-08-22 抖音视界有限公司 用于生成生物体姿态关键点信息的方法和装置
CN111556337B (zh) * 2020-05-15 2021-09-21 腾讯科技(深圳)有限公司 一种媒体内容植入方法、模型训练方法以及相关装置
CN112084849A (zh) * 2020-07-31 2020-12-15 华为技术有限公司 图像识别方法和装置
CN112232361B (zh) * 2020-10-13 2021-09-21 国网电子商务有限公司 图像处理的方法及装置、电子设备及计算机可读存储介质
CN113420641A (zh) * 2021-06-21 2021-09-21 梅卡曼德(北京)机器人科技有限公司 图像数据处理方法、装置、电子设备和存储介质

Citations (4)

Publication number Priority date Publication date Assignee Title
US20160140424A1 (en) * 2014-11-13 2016-05-19 Nec Laboratories America, Inc. Object-centric Fine-grained Image Classification
CN106611169A (zh) * 2016-12-31 2017-05-03 中国科学技术大学 一种基于深度学习的危险驾驶行为实时检测方法
CN106650913A (zh) * 2016-12-31 2017-05-10 中国科学技术大学 一种基于深度卷积神经网络的车流密度估计方法
CN108229497A (zh) * 2017-07-28 2018-06-29 北京市商汤科技开发有限公司 图像处理方法、装置、存储介质、计算机程序和电子设备

Family Cites Families (8)

Publication number Priority date Publication date Assignee Title
CN105981051B (zh) * 2014-10-10 2019-02-19 北京旷视科技有限公司 用于图像解析的分层互连多尺度卷积网络
US9760807B2 (en) * 2016-01-08 2017-09-12 Siemens Healthcare Gmbh Deep image-to-image network learning for medical image analysis
CN105956626A (zh) * 2016-05-12 2016-09-21 成都新舟锐视科技有限公司 基于深度学习的对车牌位置不敏感的车牌识别方法
CN106529447B (zh) * 2016-11-03 2020-01-21 河北工业大学 一种小样本人脸识别方法
CN106650786A (zh) * 2016-11-14 2017-05-10 沈阳工业大学 基于多列卷积神经网络模糊评判的图像识别方法
CN106651877B (zh) * 2016-12-20 2020-06-02 北京旷视科技有限公司 实例分割方法及装置
CN106909905B (zh) * 2017-03-02 2020-02-14 中科视拓(北京)科技有限公司 一种基于深度学习的多模态人脸识别方法
CN106951867B (zh) * 2017-03-22 2019-08-23 成都擎天树科技有限公司 基于卷积神经网络的人脸识别方法、装置、系统及设备

Patent Citations (4)

Publication number Priority date Publication date Assignee Title
US20160140424A1 (en) * 2014-11-13 2016-05-19 Nec Laboratories America, Inc. Object-centric Fine-grained Image Classification
CN106611169A (zh) * 2016-12-31 2017-05-03 中国科学技术大学 一种基于深度学习的危险驾驶行为实时检测方法
CN106650913A (zh) * 2016-12-31 2017-05-10 中国科学技术大学 一种基于深度卷积神经网络的车流密度估计方法
CN108229497A (zh) * 2017-07-28 2018-06-29 北京市商汤科技开发有限公司 图像处理方法、装置、存储介质、计算机程序和电子设备

Cited By (44)

Publication number Priority date Publication date Assignee Title
CN111860557A (zh) * 2019-04-30 2020-10-30 北京市商汤科技开发有限公司 图像处理方法及装置、电子设备及计算机存储介质
CN111860557B (zh) * 2019-04-30 2024-05-24 北京市商汤科技开发有限公司 图像处理方法及装置、电子设备及计算机存储介质
US20210407041A1 (en) * 2019-05-30 2021-12-30 Boe Technology Group Co., Ltd. Image processing method and device, training method of neural network, and storage medium
US11908102B2 (en) * 2019-05-30 2024-02-20 Boe Technology Group Co., Ltd. Image processing method and device, training method of neural network, and storage medium
CN112116060A (zh) * 2019-06-21 2020-12-22 杭州海康威视数字技术股份有限公司 一种网络配置实现方法及装置
CN112116060B (zh) * 2019-06-21 2023-07-25 杭州海康威视数字技术股份有限公司 一种网络配置实现方法及装置
TWI749423B (zh) * 2019-07-18 2021-12-11 大陸商北京市商湯科技開發有限公司 圖像處理方法及裝置、電子設備和電腦可讀儲存介質
CN110807757A (zh) * 2019-08-14 2020-02-18 腾讯科技(深圳)有限公司 基于人工智能的图像质量评估方法、装置及计算机设备
CN110807757B (zh) * 2019-08-14 2023-07-25 腾讯科技(深圳)有限公司 基于人工智能的图像质量评估方法、装置及计算机设备
CN110472732A (zh) * 2019-08-19 2019-11-19 杭州凝眸智能科技有限公司 优化特征提取方法及其神经网络结构
CN111047630A (zh) * 2019-11-13 2020-04-21 芯启源(上海)半导体科技有限公司 神经网络和基于神经网络的目标检测及深度预测方法
CN111047630B (zh) * 2019-11-13 2023-06-13 芯启源(上海)半导体科技有限公司 神经网络和基于神经网络的目标检测及深度预测方法
CN112883981A (zh) * 2019-11-29 2021-06-01 阿里巴巴集团控股有限公司 一种图像处理方法、设备及存储介质
CN111190952B (zh) * 2019-12-23 2023-10-03 中电海康集团有限公司 一种基于图像金字塔提取城市画像多尺度特征并持久化的方法
CN111190952A (zh) * 2019-12-23 2020-05-22 中电海康集团有限公司 一种基于图像金字塔提取城市画像多尺度特征并持久化的方法
CN111414990B (zh) * 2020-02-20 2024-03-19 北京迈格威科技有限公司 卷积神经网络处理方法、装置、电子设备及存储介质
CN111414990A (zh) * 2020-02-20 2020-07-14 北京迈格威科技有限公司 卷积神经网络处理方法、装置、电子设备及存储介质
CN111523377A (zh) * 2020-03-10 2020-08-11 浙江工业大学 一种多任务的人体姿态估计和行为识别的方法
CN111476740B (zh) * 2020-04-28 2023-10-31 北京大米未来科技有限公司 图像处理方法、装置、存储介质和电子设备
CN111476740A (zh) * 2020-04-28 2020-07-31 北京大米未来科技有限公司 图像处理方法、装置、存储介质和电子设备
CN111783934A (zh) * 2020-05-15 2020-10-16 北京迈格威科技有限公司 卷积神经网络构建方法、装置、设备及介质
CN111914997A (zh) * 2020-06-30 2020-11-10 华为技术有限公司 训练神经网络的方法、图像处理方法及装置
CN111914997B (zh) * 2020-06-30 2024-04-02 华为技术有限公司 训练神经网络的方法、图像处理方法及装置
CN111739097A (zh) * 2020-06-30 2020-10-02 上海商汤智能科技有限公司 测距方法及装置、电子设备及存储介质
CN111932530A (zh) * 2020-09-18 2020-11-13 北京百度网讯科技有限公司 三维对象检测方法、装置、设备和可读存储介质
CN111932530B (zh) * 2020-09-18 2024-02-23 北京百度网讯科技有限公司 三维对象检测方法、装置、设备和可读存储介质
CN112149558A (zh) * 2020-09-22 2020-12-29 驭势科技(南京)有限公司 一种用于关键点检测的图像处理方法、网络和电子设备
CN112184687B (zh) * 2020-10-10 2023-09-26 南京信息工程大学 基于胶囊特征金字塔的道路裂缝检测方法和存储介质
CN112184687A (zh) * 2020-10-10 2021-01-05 南京信息工程大学 基于胶囊特征金字塔的道路裂缝检测方法和存储介质
CN112528900A (zh) * 2020-12-17 2021-03-19 南开大学 基于极致下采样的图像显著性物体检测方法及系统
CN112528900B (zh) * 2020-12-17 2022-09-16 南开大学 基于极致下采样的图像显著性物体检测方法及系统
CN112633156B (zh) * 2020-12-22 2024-05-31 浙江大华技术股份有限公司 车辆检测方法、图像处理装置以及计算机可读存储介质
CN112633156A (zh) * 2020-12-22 2021-04-09 浙江大华技术股份有限公司 车辆检测方法、图像处理装置以及计算机可读存储介质
CN112836804B (zh) * 2021-02-08 2024-05-10 北京迈格威科技有限公司 图像处理方法、装置、电子设备及存储介质
CN112836804A (zh) * 2021-02-08 2021-05-25 北京迈格威科技有限公司 图像处理方法、装置、电子设备及存储介质
CN113076914A (zh) * 2021-04-16 2021-07-06 咪咕文化科技有限公司 一种图像处理方法、装置、电子设备和存储介质
CN113076914B (zh) * 2021-04-16 2024-04-12 咪咕文化科技有限公司 一种图像处理方法、装置、电子设备和存储介质
CN113344862A (zh) * 2021-05-20 2021-09-03 北京百度网讯科技有限公司 缺陷检测方法、装置、电子设备及存储介质
CN113344862B (zh) * 2021-05-20 2024-04-12 北京百度网讯科技有限公司 缺陷检测方法、装置、电子设备及存储介质
CN113591573A (zh) * 2021-06-28 2021-11-02 北京百度网讯科技有限公司 多任务学习深度网络模型的训练及目标检测方法、装置
CN113837104B (zh) * 2021-09-26 2024-03-15 大连智慧渔业科技有限公司 基于卷积神经网络的水下鱼类目标检测方法、装置及存储介质
CN113837104A (zh) * 2021-09-26 2021-12-24 大连智慧渔业科技有限公司 基于卷积神经网络的水下鱼类目标检测方法、装置及存储介质
CN116091486B (zh) * 2023-03-01 2024-02-06 合肥联宝信息技术有限公司 表面缺陷检测方法、装置、电子设备及存储介质
CN116091486A (zh) * 2023-03-01 2023-05-09 合肥联宝信息技术有限公司 表面缺陷检测方法、装置、电子设备及存储介质

Also Published As

Publication number Publication date
CN108229497A (zh) 2018-06-29
CN108229497B (zh) 2021-01-05

Similar Documents

Publication Publication Date Title
WO2019020075A1 (zh) 图像处理方法、装置、存储介质、计算机程序和电子设备
US11321593B2 (en) Method and apparatus for detecting object, method and apparatus for training neural network, and electronic device
TWI766175B (zh) 單目圖像深度估計方法、設備及儲存介質
US20200151849A1 (en) Visual style transfer of images
JP7373554B2 (ja) クロスドメイン画像変換
WO2019011249A1 (zh) 一种图像中物体姿态的确定方法、装置、设备及存储介质
WO2018099405A1 (zh) 人脸分辨率重建方法、重建系统和可读介质
US10846870B2 (en) Joint training technique for depth map generation
WO2018166438A1 (zh) 图像处理方法、装置及电子设备
CN108154222B (zh) 深度神经网络训练方法和系统、电子设备
JP2020524861A (ja) セマンティックセグメンテーションモデルの訓練方法および装置、電子機器、ならびに記憶媒体
CN110555795A (zh) 高解析度风格迁移
CN113343982B (zh) 多模态特征融合的实体关系提取方法、装置和设备
CN109118456B (zh) 图像处理方法和装置
WO2023159757A1 (zh) 视差图生成方法和装置、电子设备及存储介质
US11604963B2 (en) Feedback adversarial learning
Bilgazyev et al. Improved face recognition using super-resolution
WO2022143366A1 (zh) 图像处理方法、装置、电子设备、介质及计算机程序产品
JP2020536332A (ja) キーフレームスケジューリング方法及び装置、電子機器、プログラム並びに媒体
CN117099136A (zh) 用于对象检测的动态头
US20220301128A1 (en) Method and device for deep guided filter processing
CN115393423A (zh) 目标检测方法和装置
CN113112398A (zh) 图像处理方法和装置
CN111369425A (zh) 图像处理方法、装置、电子设备和计算机可读介质
CN115147902B (zh) 人脸活体检测模型的训练方法、装置及计算机程序产品

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18837480

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS (EPO FORM 1205A DATED 29.06.2020)

122 Ep: pct application non-entry in european phase

Ref document number: 18837480

Country of ref document: EP

Kind code of ref document: A1