WO2021169160A1 - Image normalization processing method and apparatus, and storage medium (图像归一化处理方法及装置、存储介质)

Info

Publication number
WO2021169160A1
Authority
WO
WIPO (PCT)
Application number
PCT/CN2020/103575
Other languages
English (en)
French (fr)
Inventor
张瑞茂
彭章琳
吴凌云
罗平
Original Assignee
深圳市商汤科技有限公司 (Shenzhen SenseTime Technology Co., Ltd.)
Application filed by 深圳市商汤科技有限公司 (Shenzhen SenseTime Technology Co., Ltd.)
Publication of WO2021169160A1 publication Critical patent/WO2021169160A1/zh
Priority to US17/893,797 priority Critical patent/US20220415007A1/en

Classifications

    • G06V10/454 Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • G06V10/40 Extraction of image or video features
    • G06V10/32 Normalisation of the pattern dimensions
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G06N3/0985 Hyperparameter optimisation; Meta-learning; Learning-to-learn
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06V10/7715 Feature extraction, e.g. by transforming the feature space
    • G06V10/82 Image or video recognition or understanding using neural networks
    • G06T2207/10004 Still image; Photographic image
    • G06T2207/20081 Training; Learning

Definitions

  • the present disclosure relates to the field of deep learning, and in particular, to an image normalization processing method, device, and storage medium.
  • Normalization technology usually calculates statistics in different dimensions of the input tensor, so that different normalization methods are suitable for different visual tasks.
  • the present disclosure provides an image normalization processing method, device, and storage medium.
  • According to a first aspect, an image normalization processing method includes: using K normalization factors to respectively perform normalization processing on a feature map, to obtain a candidate normalized feature map corresponding to each of the K normalization factors, where K is an integer greater than 1; determining a first weight value of each of the K normalization factors; and determining a target normalized feature map corresponding to the feature map according to the candidate normalized feature map corresponding to each of the K normalization factors and the first weight values.
  • According to a second aspect, an image normalization processing device includes: a normalization processing module, configured to use K normalization factors to respectively perform normalization processing on a feature map to obtain candidate normalized feature maps corresponding to each of the K normalization factors, where K is an integer greater than 1; a first determining module, configured to determine a first weight value of each of the K normalization factors; and a second determining module, configured to determine a target normalized feature map corresponding to the feature map according to the candidate normalized feature maps corresponding to the K normalization factors and the first weight values.
  • According to another aspect, a computer-readable storage medium stores a computer program; when a processor invokes the computer program, the processor is configured to execute the image normalization processing method described in the first aspect.
  • An electronic device includes: a processor; and a memory for storing instructions executable by the processor; the processor is configured to call the executable instructions to implement the image normalization processing method described in the first aspect.
  • A computer program product stores computer-readable instructions; when the computer-readable instructions are executed by a processor, the image normalization processing method described in the first aspect is realized.
  • Fig. 1 is a flowchart of an image normalization processing method according to an exemplary embodiment of the present disclosure
  • Fig. 2 is a flowchart of step 120 according to an exemplary embodiment of the present disclosure
  • Fig. 3 is a flowchart of step 121 according to an exemplary embodiment of the present disclosure.
  • Fig. 4 is a flowchart showing step 122 according to an exemplary embodiment of the present disclosure
  • Fig. 5 is a flowchart of step 123 according to an exemplary embodiment of the present disclosure.
  • Fig. 6 is a flowchart of step 130 according to an exemplary embodiment of the present disclosure.
  • Fig. 7 is a block diagram showing an image normalization processing architecture according to an exemplary embodiment of the present disclosure.
  • Fig. 8 is a block diagram showing an image normalization processing device according to an exemplary embodiment of the present disclosure.
  • Fig. 9 is a schematic diagram showing the hardware structure of an electronic device according to an exemplary embodiment of the present disclosure.
  • Although the terms first, second, third, etc. may be used in this disclosure to describe various information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from each other.
  • For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information.
  • The word "if" as used herein may be interpreted as "when", "upon", or "in response to determining".
  • The switchable normalization (SN) method can select, for each convolutional layer, an adaptive linear combination of different normalization operators, so that each layer in a deep neural network can optimize its own independent normalization method suitable for various visual tasks.
  • Although SN can learn different normalization parameters for different network structures and different data sets, it cannot dynamically adjust the normalization parameters according to changes in sample characteristics; the flexibility of normalization is therefore limited, and a better deep neural network cannot be obtained.
  • Therefore, the embodiments of the present disclosure provide an image normalization processing method, which can be applied to different network models and visual tasks, and adaptively determines the first weight values of different normalization factors according to the feature map, which improves the flexibility of normalization.
  • In the field of image processing, image content can be identified to output corresponding results, including but not limited to techniques such as image recognition, target detection, and target segmentation. Recognizing the content of an image usually involves first extracting image features from the image, and then outputting the recognition result based on the extracted features. For example, when performing face recognition, the facial features in the image can be extracted, and the attributes of the face can be recognized based on the extracted facial features. It is understandable that the image normalization method provided by the embodiments of the present disclosure can be applied to the field of image processing.
  • Fig. 1 shows an image normalization processing method according to an exemplary embodiment.
  • the method includes the following steps 110-130:
  • step 110 different normalization factors are used to perform normalization processing on the feature maps respectively, and candidate normalized feature maps corresponding to each normalization factor are obtained.
  • That is, K normalization factors are used to respectively perform normalization processing on the feature map, to obtain candidate normalized feature maps corresponding to the K normalization factors, where K is an integer greater than 1.
  • the feature map corresponding to the image to be processed may be acquired first, where the image to be processed may be any image that needs to be normalized.
  • feature maps corresponding to the image to be processed can be obtained.
  • the number of feature maps can be N, where N is a positive integer.
  • the image features may include color features, texture features, shape features, etc. in the image.
  • A color feature is a global feature that describes the surface color attributes of the object corresponding to the image.
  • A texture feature is also a global feature that describes the surface texture attributes of the object corresponding to the image.
  • Shape features have two types of representation: contour features and regional features.
  • The contour feature of an image mainly concerns the outer boundary of the object, while the regional feature relates to the shape of the image region.
  • the image features of the image to be processed can be extracted through a pre-trained neural network.
  • The neural network may include, but is not limited to, VGGNet (Visual Geometry Group Network), GoogLeNet, etc. Other methods may also be used to extract the image features of the image to be processed, which is not specifically limited here.
  • Different normalization factors refer to different normalization processing methods, including but not limited to the Batch Normalization (BN) method, the Layer Normalization (LN) method, the Instance Normalization (IN) method, and the Group Normalization (GN) method.
  • the statistics ⁇ corresponding to each normalization factor are determined respectively, where the statistics ⁇ may include variance and/or mean.
  • the statistic ⁇ here corresponds to the normalization factor, that is, each normalization factor corresponds to one or a group of statistics ⁇ .
  • the feature maps are respectively normalized to obtain candidate normalized feature maps corresponding to each normalization factor.
  • If the number of feature maps is N and the total number of normalization factors is K, then N sets of candidate normalized feature maps can be obtained, and each set includes K candidate normalized feature maps.
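The differences between the normalization factors can be illustrated with a short sketch. This is a minimal NumPy example (not the patent's implementation), assuming a feature map batch laid out as (N, C, H, W); each factor differs only in the axes over which the statistics Ω (mean and variance) are computed, and the function name and group count are illustrative:

```python
import numpy as np

def candidate_normalized_maps(x, num_groups=2, eps=1e-5):
    """Normalize a feature map batch x of shape (N, C, H, W) with several
    normalization factors (BN, IN, LN, GN) and return the candidates."""
    candidates = {}
    # Batch Normalization: statistics over (N, H, W), one pair per channel.
    mu = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    candidates["BN"] = (x - mu) / np.sqrt(var + eps)
    # Instance Normalization: statistics over (H, W), per sample and channel.
    mu = x.mean(axis=(2, 3), keepdims=True)
    var = x.var(axis=(2, 3), keepdims=True)
    candidates["IN"] = (x - mu) / np.sqrt(var + eps)
    # Layer Normalization: statistics over (C, H, W), per sample.
    mu = x.mean(axis=(1, 2, 3), keepdims=True)
    var = x.var(axis=(1, 2, 3), keepdims=True)
    candidates["LN"] = (x - mu) / np.sqrt(var + eps)
    # Group Normalization: statistics per group of channels, per sample.
    n, c, h, w = x.shape
    g = x.reshape(n, num_groups, c // num_groups, h, w)
    mu = g.mean(axis=(2, 3, 4), keepdims=True)
    var = g.var(axis=(2, 3, 4), keepdims=True)
    candidates["GN"] = ((g - mu) / np.sqrt(var + eps)).reshape(n, c, h, w)
    return candidates
```

Each returned candidate has the same shape as the input, so the later weighted combination is element-wise.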
  • step 120 the first weight value of each normalization factor is determined.
  • the first weight value of each normalization factor corresponding to the feature map can be adaptively determined according to the feature map.
  • The first weight value of a normalization factor indicates the proportion that the candidate normalized feature map, obtained by normalizing the feature map with that factor, accounts for among the K candidate normalized feature maps.
  • Specifically, the K normalization factors may be used to determine K first feature vectors corresponding to the feature map, and the first weight value of each normalization factor can then be obtained according to the correlation between the K first feature vectors.
  • step 130 the target normalized feature map corresponding to the feature map is determined according to the candidate normalized feature map corresponding to each normalized factor and the first weight value of each normalized factor.
  • For each normalization factor, the candidate normalized feature map is multiplied by the first weight value of the corresponding normalization factor to obtain the first normalized feature map; the first normalized feature map is scaled by the second weight value of the corresponding normalization factor to obtain the second normalized feature map; and the second normalized feature map is shifted by the target offset value of the corresponding normalization factor to obtain the third normalized feature map. Finally, the third normalized feature maps are added together to obtain the target normalized feature map corresponding to the feature map.
  • The second weight value is used to adjust the size of the first normalized feature map; by reducing or enlarging the first normalized feature map, the scaled second normalized feature map conforms to the size requirements of the target normalized feature map.
  • The second weight value can be determined during training of the neural network according to the size of the sample image and the size of the normalized feature map that the neural network finally needs to output. Once the neural network training is completed, the second weight value remains the same for the same normalization factor.
  • The target offset value is used to shift the second normalized feature map, so that the third normalized feature maps obtained after shifting are aligned in position, which facilitates the subsequent addition of the third normalized feature maps.
  • The target offset value can also be determined during training of the neural network according to the size of the sample image and the size of the normalized feature map that the neural network needs to output. Once the neural network training is completed, the target offset value remains unchanged for the same normalization factor.
  • the number of target normalized feature maps is the same as the number of feature maps.
  • the number of target normalized feature maps finally obtained is also N.
  • different normalization factors may be used to perform normalization processing on the feature maps respectively, so as to obtain candidate normalized feature maps corresponding to each normalization factor.
  • According to the candidate normalized feature map corresponding to each normalization factor and the first weight value of each normalization factor, the target normalized feature map corresponding to the feature map is determined. This realizes the purpose of adaptively determining the first weight values of different normalization factors according to the feature map, and improves the flexibility of the normalization algorithm.
  • In some embodiments, the following formula (1) may be used to determine the first weight value of each normalization factor, where:
  • X_n represents the n-th feature map;
  • k represents any integer from 1 to K, and K represents the total number of normalization factors;
  • Ω_k represents the statistics computed based on the k-th normalization factor, which include the mean μ_k and/or the variance σ_k;
  • F(·) represents the function used to calculate the first weight value of the k-th normalization factor, and its arguments also include a learnable parameter.
  • the processing method of each feature map is the same.
  • n in Formula 1 can be omitted, and the feature map can be represented by only one of the feature maps X. That is, in the following embodiments of the present disclosure, the first weight value of each normalization factor corresponding to the feature map X needs to be determined.
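The equation itself appears only as an image in the source, so the following is a plausible reconstruction of formula (1) from the symbol definitions above; the symbol Θ for the learnable parameter is chosen here for illustration and does not appear in the text:

```latex
% Hedged reconstruction of formula (1): the first weight value of the k-th
% normalization factor is produced by the function F from the feature map
% X_n, the statistics Omega_1..Omega_K, and a learnable parameter Theta.
\lambda_k^{(n)} \;=\; F\!\left(X_n;\ \Omega_1,\dots,\Omega_K;\ \Theta\right)_k,
\qquad k = 1,\dots,K
```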
  • step 120 may include 121-123:
  • step 121 for each normalization factor, a first feature vector corresponding to the normalization factor is determined.
  • the feature map can be down-sampled to obtain the second feature vector x corresponding to each normalization factor.
  • Each normalization factor is used to determine the statistic Ω corresponding to that factor, and the second feature vector x corresponding to the factor is normalized according to the statistic Ω to obtain the third feature vector corresponding to the factor; the number of third feature vectors is K.
  • After dimensionality reduction, the first feature vectors z are obtained, where the number of first feature vectors is also K.
  • In step 122, a correlation matrix is determined according to the correlation between the first feature vectors corresponding to the normalization factors.
  • The correlation between the multiple first feature vectors can be described by the product of each first feature vector z and the transposed vector z^T corresponding to each first feature vector, from which the correlation matrix v is determined.
  • step 123 the first weight value of each normalization factor is determined according to the correlation matrix.
  • Specifically, the correlation matrix v can be converted into a candidate vector through the first fully connected network, the tanh (hyperbolic tangent) transformation, and the second fully connected network in turn; the candidate vector is then normalized to obtain the target vector λ, and the first weight value of each normalization factor is obtained according to the target vector λ.
  • In the above embodiment, the first feature vector corresponding to each normalization factor is determined first, then the correlation between the first feature vectors is determined, and then the first weight value of each normalization factor is determined; this is easy to implement and highly usable.
  • step 121 may include 1211-1213:
  • step 1211 the feature map is down-sampled to obtain a second feature vector corresponding to the feature map.
  • the feature map may be down-sampled by means of average pooling or maximum pooling, so as to obtain K second feature vectors corresponding to the feature map.
  • X n is used to represent the n-th feature map, and the processing method of each feature map is the same.
  • n is omitted, and the feature map can be represented by X only.
  • K second feature vectors x corresponding to the feature map can be obtained.
  • Here, x is C-dimensional, where C is the number of channels of the feature map.
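The down-sampling step can be sketched as follows; this is a hypothetical helper assuming NumPy, using average pooling over the spatial dimensions (one of the two pooling options the text mentions; max pooling would use `.max` instead):

```python
import numpy as np

def second_feature_vector(x):
    """Down-sample a feature map of shape (C, H, W) into a C-dimensional
    second feature vector by global average pooling over (H, W)."""
    return x.mean(axis=(1, 2))
```

Applied to a (C, H, W) feature map, the result is a vector with one entry per channel.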
  • step 1212 for each normalization factor, using the normalization factor, normalize the second feature vector corresponding to the normalization factor to obtain a third feature vector.
  • the statistic ⁇ corresponding to the normalization factor may be calculated based on each normalization factor, where ⁇ includes the mean value and/or variance. In the embodiment of the present disclosure, ⁇ may include both the variance and the mean value.
  • step 1213 dimensionality reduction processing is performed on the third feature vector to obtain the first feature vector corresponding to the normalization factor.
  • A convolution method can be used when performing the dimensionality reduction processing.
  • For example, a grouped convolution method can be used, with the quotient of the number of channels C of the feature map and a preset hyperparameter r as the number of groups: if the number of channels corresponding to the feature map X is C and the preset hyperparameter is r, the number of groups is C/r. In this way, the parameter amount of the entire dimensionality reduction process remains on the order of C, and K first feature vectors z are obtained, each of which is C/r-dimensional.
  • In the above embodiment, the feature map is down-sampled to obtain the corresponding K second feature vectors; the K second feature vectors are respectively normalized to obtain K third feature vectors, and the K third feature vectors are then subjected to dimensionality reduction processing to obtain K first feature vectors. This facilitates the subsequent determination of the first weight values of the different normalization factors, and the usability is high.
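As an illustration of the grouped dimensionality reduction, the following sketch splits the C channels into C/r groups of r channels and projects each group to a single value. The `weights` parameter stands in for the learned grouped-convolution kernel and is an assumption of this example:

```python
import numpy as np

def grouped_reduction(v, r, weights=None):
    """Reduce a C-dimensional third feature vector to a C//r-dimensional
    first feature vector z with a grouped linear map (a sketch of a
    grouped 1x1 convolution with C//r groups)."""
    c = v.shape[0]
    assert c % r == 0, "channel count must be divisible by the hyperparameter r"
    groups = v.reshape(c // r, r)
    if weights is None:
        # Placeholder kernel: simple averaging within each group.
        weights = np.full((c // r, r), 1.0 / r)
    return (groups * weights).sum(axis=1)
```

With C = 6 and r = 2, a 6-dimensional vector is reduced to 3 dimensions, one value per group.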
  • step 122 may include 1221-1222:
  • step 1221 the transpose vector corresponding to each first feature vector is determined.
  • the corresponding transposed vector z T may be determined for each first feature vector z.
  • step 1222 for each first feature vector, the first feature vector and each transposed vector are multiplied to obtain the correlation matrix.
  • Any first feature vector z is multiplied by any transposed vector z^T, and finally the correlation matrix v is obtained, where v is K×K-dimensional.
  • For example, suppose there are five first feature vectors: z_1 = [a_1, a_2, a_3], z_2 = [b_1, b_2, b_3], z_3 = [c_1, c_2, c_3], z_4 = [d_1, d_2, d_3], and z_5 = [e_1, e_2, e_3]. The transposed vectors z_1^T through z_5^T are determined, and each first feature vector is multiplied by each of the transposed vectors to form the entries of the correlation matrix.
  • The product of each first feature vector and each transposed vector is used to describe the correlation between the multiple first feature vectors, so as to obtain the correlation matrix, which facilitates the subsequent determination of the first weight values of the different normalization factors, with high usability.
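Forming the correlation matrix from the K first feature vectors can be sketched as a single stacked matrix product (a minimal NumPy sketch; the helper name is illustrative):

```python
import numpy as np

def correlation_matrix(z_list):
    """Given the K first feature vectors z (each C/r-dimensional), form the
    K x K correlation matrix v whose (i, j) entry is the inner product of
    the i-th vector with the transpose of the j-th vector."""
    z = np.stack(z_list)   # shape (K, C/r)
    return z @ z.T         # shape (K, K)
```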
  • step 123 may include 1231-1233:
  • step 1231 the correlation matrix is converted into candidate vectors through the first fully connected network, the hyperbolic tangent transformation and the second fully connected network in sequence.
  • the dimension of the correlation matrix v is K ⁇ K
  • The correlation matrix v can first be input into the first fully connected network, where a fully connected network refers to a neural network composed of fully connected layers: each node of each layer is connected to every node of the adjacent layer. Then, the dimension of the correlation matrix v is converted from K×K to λ·K through the tanh (hyperbolic tangent) transformation, where λ is a preset hyperparameter that can take any positive integer value, such as 50.
  • Finally, the dimension can be converted from λ·K to K through the second fully connected network, to obtain a K-dimensional candidate vector.
  • step 1232 normalization is performed on the values in the candidate vector to obtain a normalized target vector.
  • For example, a normalization function, such as the softmax function, can be used to normalize the values in the K-dimensional candidate vector so that they sum to 1; in this way, the normalized K-dimensional target vector λ is obtained.
  • λ_k and Λ_k can be used interchangeably.
  • step 1233 the first weight value of each normalization factor is determined according to the target vector.
  • the dimensions of the correlation matrix can be converted into candidate vectors through the first fully connected network, the hyperbolic tangent transformation, and the second fully connected network in sequence, and then the values in the candidate vectors are normalized After the normalization process, the normalized target vector is obtained, and then according to the target vector, the first weight value of different normalization factors can be determined, and the usability is high.
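The step-123 pipeline (first fully connected network, tanh transformation, second fully connected network, softmax normalization) can be sketched as below. The randomly initialized matrices W1 and W2 stand in for the learned parameters, so the resulting weight values are illustrative only:

```python
import numpy as np

def first_weight_values(v, hidden, seed=0):
    """Map a K x K correlation matrix v to K first weight values:
    flatten -> first FC layer -> tanh -> second FC layer -> softmax."""
    rng = np.random.default_rng(seed)
    k = v.shape[0]
    flat = v.reshape(-1)                       # K*K values
    W1 = rng.standard_normal((hidden, k * k)) * 0.1  # stand-in for learned FC 1
    W2 = rng.standard_normal((k, hidden)) * 0.1      # stand-in for learned FC 2
    cand = W2 @ np.tanh(W1 @ flat)             # K-dimensional candidate vector
    e = np.exp(cand - cand.max())              # softmax normalization
    return e / e.sum()                         # target vector of first weights
```

The softmax guarantees the K first weight values are positive and sum to 1, matching the proportion interpretation given earlier.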
  • the foregoing step 130 may include 131-134:
  • In step 131, for each normalization factor, the candidate normalized feature map corresponding to the normalization factor is multiplied by the first weight value of the normalization factor to obtain the first normalized feature map corresponding to the normalization factor.
  • That is, each normalization factor performs normalization processing on the feature map separately to obtain the candidate normalized feature map corresponding to that factor, and the candidate normalized feature map is multiplied by the first weight value of the corresponding normalization factor to obtain the first normalized feature map.
  • step 132 for each normalization factor, according to the second weight value corresponding to the normalization factor, the size of the first normalization feature map corresponding to the normalization factor is adjusted to obtain the The second normalized feature map corresponding to the normalized factor.
  • the second weight value remains unchanged for the same normalization factor after the neural network training is completed.
  • Specifically, the size of the corresponding first normalized feature map can be adjusted by multiplying it by the second weight value corresponding to the normalization factor, to obtain the second normalized feature map.
  • The size of the second normalized feature map meets the size required by the final target normalized feature map.
  • In step 133, for each normalization factor, the second normalized feature map corresponding to the normalization factor is shifted according to the target offset value corresponding to that factor, to obtain the third normalized feature map corresponding to the normalization factor.
  • the target offset value remains unchanged for the same normalization factor after the neural network training is completed.
  • Specifically, the corresponding second normalized feature map can be shifted by adding the target offset value corresponding to the normalization factor to it, so as to obtain the third normalized feature map.
  • The positions of the third normalized feature maps corresponding to the normalization factors are aligned with each other.
  • step 134 after adding the K third normalized feature maps, a target normalized feature map corresponding to the feature map is obtained.
  • That is, the third normalized feature maps are stacked, the pixel values at the same position in each third normalized feature map are added, and finally the target normalized feature map corresponding to the feature map X is obtained.
  • The above step 130 can be expressed by the following formula (2), where:
  • ⁇ k represents the first weight value of the k-th normalization factor.
  • ⁇ k represents the mean value of the statistic ⁇ k corresponding to the k-th normalization factor.
  • ⁇ k represents the variance in the statistic ⁇ k corresponding to the k-th normalization factor.
  • A preset value is added to prevent the denominator in formula (2) from being zero when the variance is zero.
  • ⁇ k represents the second weight value corresponding to the k-th normalization factor, which is equivalent to a scale parameter and is used to scale the first normalized feature map.
  • ⁇ k represents the target offset value corresponding to the k-th normalization factor, which is equivalent to the offset parameter and is used to move the second normalized feature map.
  • the mean ⁇ k and the variance ⁇ k adopt the same weight value. If the image to be processed is a sample image in the training process, the over-fitting phenomenon caused by using different weight values for the mean and variance can be avoided.
  • In the above embodiment, the candidate normalized feature maps themselves are linearly combined through the weight values corresponding to the different normalization factors, which makes the normalization algorithm more flexible and more usable.
  • a second weight value and a target offset value are introduced for each normalization factor.
  • the second weight value and the target offset value can be obtained in the normalization layer training process of the neural network, and remain unchanged for the same normalization factor after the training is completed.
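The combination in steps 131 to 134 can be sketched as follows; this is a minimal NumPy sketch, not the patent's implementation, with the parameter names `lam`, `gamma`, and `beta` standing for λ_k, γ_k, and β_k:

```python
import numpy as np

def combine_candidates(candidates, lam, gamma, beta):
    """Combine K candidate normalized feature maps into the target
    normalized feature map: weight by lam[k], scale by gamma[k],
    shift by beta[k], then sum element-wise over the K results."""
    target = np.zeros_like(candidates[0])
    for xk, l, g, b in zip(candidates, lam, gamma, beta):
        first = l * xk       # first normalized feature map (step 131)
        second = g * first   # second normalized feature map (step 132)
        third = second + b   # third normalized feature map (step 133)
        target += third      # summation (step 134)
    return target
```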
  • As shown in FIG. 7, a framework diagram of the image normalization process is provided.
  • For each normalization factor k, the corresponding statistics Ω_k are calculated, where the statistics Ω_k include the mean μ_k and the variance σ_k. Based on the statistics Ω_1, Ω_2, ..., Ω_k, ..., Ω_K, the feature map X is normalized respectively to obtain K candidate normalized feature maps.
  • the feature map X is down-sampled through the average pooling or maximum pooling method to obtain K second feature vectors x corresponding to the feature map X.
  • the statistics ⁇ 1 , ⁇ 2 ,... ⁇ k ,... ⁇ K normalize the second eigenvector x to obtain K third eigenvectors
  • K third feature vectors After the dimensionality reduction processing is performed, K first feature vectors z corresponding to the feature map X are obtained.
  • the transposed vector z T corresponding to each first feature vector z can be determined. Any first eigenvector z multiplied by any transposed vector z T can be used to describe the correlation between multiple first eigenvectors, and finally a correlation matrix v is obtained. Among them, v is K ⁇ K dimension.
  • the normalized target vector ⁇ [ ⁇ 1 , ⁇ 2 ,..., ⁇ k ] T is obtained , and the value of each dimension of the target vector ⁇ is used as the first weight value of the corresponding normalization factor.
  • the first weight values of different normalization factors are adaptively determined, which improves the flexibility of the normalization algorithm.
  • K first normalized feature maps are obtained.
  • the K first normalized feature maps are respectively multiplied by the second weight value ⁇ k to obtain K second normalized feature maps.
  • K second normalized feature maps are added to the target offset value ⁇ k to obtain K third normalized feature maps.
  • the K third normalized feature maps are added to obtain the target normalized feature map corresponding to the feature map X Among them, ⁇ k and ⁇ k are not shown in FIG. 7.
  • In this way, the first weight values of different normalization factors can be determined, which expands the scope of what image normalization methods can be used to analyze, makes it possible to analyze data content of different granularities within a single framework, and advances the frontier of normalization techniques in deep learning.
  • By designing the above image normalization processing method, the entire network can be optimized stably while reducing over-fitting.
  • This normalization layer can replace any normalization layer in a network structure. Compared with other normalization methods, it is easy to implement and optimize, and is plug-and-play.
  • The image normalization method can be used to train a neural network, and the trained neural network can serve as a sub-network replacing the normalization layer in neural networks used to perform various tasks.
  • various tasks include but are not limited to semantic understanding, speech recognition, computer vision tasks and so on.
  • During training, the above process can adaptively determine the first weight value corresponding to each normalization factor according to sample images for different tasks, which solves the inflexibility of normalization algorithms that cannot dynamically adjust the weight values of the normalization factors when the sample sets differ.
  • After training on sample images for a certain task, the normalization layer in the neural network corresponding to that task can be directly replaced, achieving plug-and-play. For a neural network corresponding to another task, the layer can be transferred directly by fine-tuning the network parameters, thereby improving the performance of the other task.
  • the present disclosure also provides an embodiment of the device.
  • FIG. 8 is a block diagram of an image normalization processing device according to an exemplary embodiment of the present disclosure.
  • The device includes: a normalization processing module 210, configured to use K normalization factors to respectively perform normalization processing on a feature map, obtaining candidate normalized feature maps corresponding to each of the K normalization factors, where K is an integer greater than 1.
  • The first determining module 220 is configured to determine a first weight value of each of the K normalization factors.
  • The second determining module 230 is configured to determine, according to the candidate normalized feature map corresponding to each of the K normalization factors and the first weight value of each normalization factor, the target normalized feature map corresponding to the feature map.
  • The first determining module includes: a first determining submodule, configured to determine, for each normalization factor, a first feature vector corresponding to the normalization factor; a second determining submodule, configured to determine a correlation matrix according to the correlations between the K first feature vectors; and a third determining submodule, configured to determine the first weight value of each normalization factor according to the correlation matrix.
  • The first determining submodule includes: a down-sampling unit, configured to down-sample the feature map to obtain K second feature vectors corresponding to the feature map; a first normalization processing unit, configured to use each normalization factor to normalize the second feature vector corresponding to that normalization factor among the K second feature vectors, obtaining a third feature vector; and a dimensionality reduction processing unit, configured to perform dimensionality reduction on the third feature vectors to obtain the first feature vectors.
  • The second determining submodule includes: a first determining unit, configured to determine the transposed vector corresponding to each first feature vector; and a second determining unit, configured to multiply each first feature vector with each transposed vector pairwise, obtaining the correlation matrix.
  • The third determining submodule includes: a conversion unit, configured to convert the correlation matrix into a candidate vector sequentially through a first fully connected network, a hyperbolic tangent transformation, and a second fully connected network; a second normalization processing unit, configured to normalize the values in the candidate vector, obtaining a normalized target vector; and a third determining unit, configured to determine the first weight value of each normalization factor according to the target vector, where the target vector includes K elements.
  • The third determining unit is configured to use the k-th element in the target vector as the first weight value of the k-th normalization factor, where k is any integer from 1 to K.
  • The second determining module includes: a fourth determining submodule, configured to, for each normalization factor, multiply the candidate normalized feature map corresponding to the normalization factor by the first weight value of the normalization factor, obtaining the first normalized feature map corresponding to the normalization factor; a fifth determining submodule, configured to, for each normalization factor, adjust the size of the first normalized feature map corresponding to the normalization factor according to the second weight value corresponding to the normalization factor, obtaining the second normalized feature map corresponding to the normalization factor; a sixth determining submodule, configured to, for each normalization factor, shift the second normalized feature map corresponding to the normalization factor according to the target offset value corresponding to the normalization factor, obtaining the third normalized feature map corresponding to the normalization factor; and a seventh determining submodule, configured to add the K third normalized feature maps, obtaining the target normalized feature map corresponding to the feature map.
  • Since the device embodiments substantially correspond to the method embodiments, reference may be made to the description of the method embodiments for relevant details.
  • The device embodiments described above are merely illustrative; units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, they may be located in one place or distributed over multiple network units. Some or all of the modules can be selected according to actual needs to achieve the objectives of the solutions of the present disclosure. Those of ordinary skill in the art can understand and implement this without creative work.
  • The embodiments of the present disclosure also provide a computer-readable storage medium storing a computer program; when a processor invokes the computer program, the processor executes the image normalization processing method described above.
  • the computer-readable storage medium includes a non-transitory computer-readable storage medium.
  • The embodiments of the present disclosure provide a computer program product, which includes computer-readable code. When the computer-readable code runs on a device, the processor in the device executes instructions for implementing the image normalization processing method provided in any of the above embodiments.
  • The embodiments of the present disclosure also provide another computer program product for storing computer-readable instructions. When the instructions are executed, the computer performs the operations of the image normalization processing method provided by any of the above embodiments.
  • the computer program product can be specifically implemented by hardware, software, or a combination thereof.
  • the computer program product may be embodied as a computer storage medium.
  • the computer program product may be embodied as a software product, such as a software development kit (SDK) and so on.
  • An embodiment of the present disclosure also provides an electronic device, including: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to call the executable instructions stored in the memory to implement the image normalization processing method described in any one of the foregoing embodiments.
  • FIG. 9 is a schematic diagram of the hardware structure of an electronic device provided by an embodiment of the disclosure.
  • the electronic device 310 includes a processor 311, and may also include an input device 312, an output device 313, and a memory 314.
  • the input device 312, the output device 313, the memory 314, and the processor 311 are connected to each other through a bus.
  • The memory 314 includes, but is not limited to, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), or compact disc read-only memory (CD-ROM), and is used for related instructions and data.
  • The input device 312 is used to input data and/or signals, and the output device 313 is used to output data and/or signals.
  • the output device 313 and the input device 312 may be independent devices or a whole device.
  • The processor 311 may include one or more processors, for example one or more central processing units (CPUs). Where the processor 311 is a CPU, it may be a single-core or a multi-core CPU.
  • the memory 314 is used to store program codes and data of the network device.
  • the processor 311 is configured to call the program code and data in the memory 314 to execute the steps in the foregoing method embodiment. For details, please refer to the description in the method embodiment, which will not be repeated here.
  • FIG. 9 only shows a simplified design of an image normalization processing device.
  • In practical applications, the image normalization processing device may also include other necessary components, including but not limited to any number of input/output devices, processors, controllers, memories, etc.; all image normalization processing devices that can implement the embodiments of the present disclosure fall within the protection scope of the present disclosure.


Abstract

The present disclosure provides an image normalization processing method, the method including: using K normalization factors to respectively perform normalization processing on a feature map, obtaining candidate normalized feature maps corresponding to each of the K normalization factors, where K is an integer greater than 1; determining a first weight value of each of the K normalization factors; and determining, according to the candidate normalized feature maps corresponding to each of the K normalization factors and the first weight values, a target normalized feature map corresponding to the feature map.

Description

Image normalization processing method and device, and storage medium
Cross-reference to related applications
This patent application claims priority to Chinese patent application No. 202010123511.8, filed on February 27, 2020 and entitled "Image normalization processing method and device, and storage medium", the entire contents of which are incorporated herein by reference.
Technical field
The present disclosure relates to the field of deep learning, and in particular to an image normalization processing method and device, and a storage medium.
Background
In tasks such as natural language processing, speech recognition, and computer vision, normalization techniques have become indispensable modules of deep learning. Normalization techniques typically compute statistics over different dimensions of the input tensor, so that different normalization methods suit different vision tasks.
Summary
The present disclosure provides an image normalization processing method and device, and a storage medium.
According to a first aspect of the embodiments of the present disclosure, an image normalization processing method is provided, the method including: using K normalization factors to respectively perform normalization processing on a feature map, obtaining candidate normalized feature maps corresponding to each of the K normalization factors, where K is an integer greater than 1; determining a first weight value of each of the K normalization factors; and determining, according to the candidate normalized feature maps corresponding to each of the K normalization factors and the first weight values, a target normalized feature map corresponding to the feature map.
According to a second aspect of the embodiments of the present disclosure, an image normalization processing device is provided, the device including: a normalization processing module, configured to use K normalization factors to respectively perform normalization processing on a feature map, obtaining candidate normalized feature maps corresponding to each of the K normalization factors, where K is an integer greater than 1; a first determining module, configured to determine a first weight value of each of the K normalization factors; and a second determining module, configured to determine, according to the candidate normalized feature maps corresponding to each of the K normalization factors and the first weight values, a target normalized feature map corresponding to the feature map.
According to a third aspect of the embodiments of the present disclosure, a computer-readable storage medium is provided, the computer-readable storage medium storing a computer program; when a processor invokes the computer program, the processor executes the image normalization processing method described in the first aspect.
According to a fourth aspect of the embodiments of the present disclosure, an electronic device is provided, including: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to call the executable instructions stored in the memory to implement the image normalization processing method described in the first aspect.
According to a fifth aspect of the embodiments of the present disclosure, a computer program product is provided, the computer program product storing computer-readable instructions that, when executed by a processor, implement the image normalization processing method described in the first aspect.
Brief description of the drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the present disclosure.
FIG. 1 is a flowchart of an image normalization processing method according to an exemplary embodiment of the present disclosure;
FIG. 2 is a flowchart of step 120 according to an exemplary embodiment of the present disclosure;
FIG. 3 is a flowchart of step 121 according to an exemplary embodiment of the present disclosure;
FIG. 4 is a flowchart of step 122 according to an exemplary embodiment of the present disclosure;
FIG. 5 is a flowchart of step 123 according to an exemplary embodiment of the present disclosure;
FIG. 6 is a flowchart of step 130 according to an exemplary embodiment of the present disclosure;
FIG. 7 is a framework diagram of an image normalization process according to an exemplary embodiment of the present disclosure;
FIG. 8 is a block diagram of an image normalization processing device according to an exemplary embodiment of the present disclosure;
FIG. 9 is a schematic diagram of the hardware structure of an electronic device according to an exemplary embodiment of the present disclosure.
Detailed description
Exemplary embodiments will be described in detail here, examples of which are shown in the accompanying drawings. When the following description refers to the drawings, unless otherwise indicated, the same numerals in different drawings denote the same or similar elements. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure; rather, they are merely examples of devices and methods consistent with some aspects of the present disclosure as detailed in the appended claims.
The terms used in the present disclosure are for the purpose of describing particular embodiments only and are not intended to limit the present disclosure. The singular forms "a", "the", and "said" used in the present disclosure and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and includes any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used in the present disclosure to describe various information, such information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of the present disclosure, first information may also be called second information, and similarly, second information may also be called first information. Depending on the context, the word "if" as used herein may be interpreted as "when" or "upon" or "in response to determining".
The switchable normalization (SN) method can, for each convolutional layer, adaptively combine different normalization operators linearly, so that each layer of a deep neural network can optimize its own independent normalization method, suitable for various vision tasks. However, although SN can learn different normalization parameters for different network structures and different data sets, it cannot dynamically adjust the normalization parameters according to changes in the sample features. This limits the flexibility of normalization and prevents obtaining a better deep neural network.
The embodiments of the present disclosure provide an image normalization processing method applicable to different network models and vision tasks, which adaptively determines the first weight values of different normalization factors according to the feature map and improves the flexibility of the normalization algorithm. In the field of image processing, image content can be recognized to output corresponding results, which may be embodied in, but is not limited to, techniques such as image recognition, object detection, and object segmentation. Recognizing image content usually involves first extracting image features from the image and then outputting the recognition result according to the extracted features. For example, when performing face recognition, face features can be extracted from the image, and attributes of the face can be recognized according to the extracted face features. It can be understood that the image normalization method provided by the embodiments of the present disclosure can be applied in the field of image processing.
As shown in FIG. 1, FIG. 1 illustrates an image normalization processing method according to an exemplary embodiment, the method including the following steps 110-130:
In step 110, different normalization factors are used to respectively perform normalization processing on a feature map, obtaining candidate normalized feature maps corresponding to each normalization factor. In some embodiments, K normalization factors are used to respectively normalize the feature map, obtaining candidate normalized feature maps corresponding to each of the K normalization factors, where K is an integer greater than 1.
In the embodiments of the present disclosure, a feature map corresponding to an image to be processed may first be obtained, where the image to be processed may be any image that requires normalization processing. By extracting image features of different dimensions from the image to be processed, the feature map corresponding to the image to be processed can be obtained; the number of feature maps may be N, where N is a positive integer.
The image features may include color features, texture features, shape features, and the like. A color feature is a global feature that describes the surface color attributes of the object to which the image corresponds; a texture feature is also a global feature, describing the surface texture attributes of that object; shape features have two kinds of representation, one being contour features and the other region features, where the contour features of an image mainly concern the outer boundary of the object, while the region features relate to the shape of the image region.
In the embodiments of the present disclosure, the image features of the image to be processed may be extracted through a pre-trained neural network, which may include but is not limited to VGG Net (Visual Geometry Group Network), GoogleNet (Google Network), and the like. The image features of the image to be processed may also be extracted by other methods, which is not specifically limited here.
In the embodiments of the present disclosure, different normalization factors refer to different normalization processing methods, including but not limited to the batch normalization (BN) method, the layer normalization (LN) method, the instance normalization (IN) method, and the group normalization (GN) method.
Before the different normalization factors are used to respectively normalize the feature map, the statistic Ω corresponding to each normalization factor is first determined, where the statistic Ω may include a variance and/or a mean. The statistic Ω here corresponds to a normalization factor, i.e., each normalization factor corresponds to one statistic Ω or one set of statistics Ω.
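The choice of statistic Ω per normalization factor can be sketched as follows. This is a minimal NumPy illustration assuming the usual BN/IN/LN axis conventions for an (N, C, H, W) batch; the function name `statistics` and the dictionary of factors are illustrative, since the disclosure leaves the concrete set of normalization factors open:

```python
import numpy as np

def statistics(X, factor):
    """Mean and variance of an (N, C, H, W) batch for three common
    normalization factors. The axis choices follow the usual BN / IN / LN
    conventions; each factor yields one (mean, variance) statistic pair."""
    axes = {
        "BN": (0, 2, 3),  # batch norm: over batch and spatial dims, per channel
        "IN": (2, 3),     # instance norm: over spatial dims, per sample and channel
        "LN": (1, 2, 3),  # layer norm: over channels and spatial dims, per sample
    }[factor]
    return X.mean(axis=axes, keepdims=True), X.var(axis=axes, keepdims=True)
```

Each factor thus produces statistics with a different broadcast shape, so the same feature map yields K different candidate normalizations.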
Further, the different statistics Ω are used to respectively normalize the feature map, obtaining a candidate normalized feature map corresponding to each normalization factor.
For example, if the number of feature maps is N and the total number of normalization factors is K, N groups of candidate normalized feature maps can be obtained, each group including K candidate normalized feature maps.
In step 120, the first weight value of each normalization factor is determined.
In the embodiments of the present disclosure, the first weight value of each normalization factor corresponding to the feature map may be determined adaptively according to the feature map.
The first weight value of a normalization factor represents the proportion, among the K candidate normalized feature maps, of the candidate normalized feature map obtained by normalizing the feature map with that normalization factor. In the embodiments of the present disclosure, the K normalization factors may be used to determine K first feature vectors corresponding to the feature map, and the first weight value of each normalization factor is obtained according to the correlations between these K first feature vectors.
In step 130, the target normalized feature map corresponding to the feature map is determined according to the candidate normalized feature map corresponding to each normalization factor and the first weight value of each normalization factor.
In the embodiments of the present disclosure, for each candidate normalized feature map, the candidate normalized feature map is multiplied by the first weight value of the normalization factor corresponding to the candidate normalized feature map, obtaining a first normalized feature map corresponding to the candidate normalized feature map; the size of the first normalized feature map is adjusted using the second weight value of the corresponding normalization factor, obtaining a second normalized feature map; and the second normalized feature map is shifted using the target offset value of the corresponding normalization factor, obtaining a third normalized feature map. Finally, the third normalized feature maps are added together to obtain the target normalized feature map corresponding to the feature map.
The second weight value is used to adjust the size of the first normalized feature map; by shrinking or enlarging the first normalized feature map, the scaled second normalized feature map meets the size requirement of the target normalized feature map. The second weight value may be determined during training of the neural network according to the size of the sample images and the size of the normalized feature map the neural network ultimately needs to output; once training is completed, the second weight value remains unchanged for the same normalization factor.
The target offset value is used to shift the second normalized feature map so that the shifted third normalized feature maps overlap in position, facilitating the subsequent addition of the third normalized feature maps. The target offset value may likewise be determined during training of the neural network according to the size of the sample images and the size of the normalized feature map the neural network ultimately needs to output; once training is completed, the target offset value remains unchanged for the same normalization factor.
In addition, in the embodiments of the present disclosure, the number of target normalized feature maps is the same as the number of feature maps.
For example, if the number of feature maps is N, the number of target normalized feature maps finally obtained is also N.
In the above embodiments, different normalization factors can be used to respectively normalize the feature map, thereby obtaining candidate normalized feature maps corresponding to each normalization factor. The target normalized feature map corresponding to the feature map is determined according to the candidate normalized feature maps corresponding to the normalization factors and the first weight values of the normalization factors. This achieves the purpose of adaptively determining the first weight values of different normalization factors according to the feature map and improves the flexibility of the normalization algorithm.
In some embodiments, the first weight value of each normalization factor can be determined by the following formula (1):

λ_k^n = F(X_n, Ω_k; θ)    (1)

where X_n denotes the n-th feature map, λ_k^n denotes the first weight value of the k-th normalization factor corresponding to the n-th feature map, k denotes any integer from 1 to K, K denotes the total number of normalization factors, Ω_k denotes the statistic computed based on the k-th normalization factor, including the mean μ_k and/or the variance σ_k, F(·) denotes the function used to compute the first weight value of the k-th normalization factor, and θ denotes learnable parameters.
In some embodiments, when there are multiple feature maps, each feature map is processed in the same way; for ease of description, the n in formula (1) can be omitted, and the feature maps can be represented by a single feature map X. That is, in the following embodiments of the present disclosure, the first weight value of each normalization factor corresponding to the feature map X is to be determined.
As shown in FIG. 2, step 120 may include steps 121-123:
In step 121, for each normalization factor, a first feature vector corresponding to the normalization factor is determined.
In the embodiments of the present disclosure, the feature map may be down-sampled to obtain a second feature vector x corresponding to each normalization factor. Using the normalization factor, the statistic Ω corresponding to the normalization factor is determined, and the second feature vector x corresponding to the normalization factor is normalized according to the statistic Ω, obtaining a third feature vector x̃ corresponding to the normalization factor, where the number of third feature vectors is K. After dimensionality reduction is performed on the third feature vectors x̃, the first feature vectors z are obtained, where the number of first feature vectors is also K.
In step 122, a correlation matrix is determined according to the correlations between the first feature vectors corresponding to the normalization factors.
In the embodiments of the present disclosure, the correlations between the multiple first feature vectors can be described by the product of each first feature vector z and the transposed vector z^T corresponding to each first feature vector z, thereby determining the correlation matrix v.
In step 123, the first weight value of each normalization factor is determined according to the correlation matrix.
In the embodiments of the present disclosure, the correlation matrix v can be converted into a candidate vector sequentially through a first fully connected network, a tanh (hyperbolic tangent) transformation, and a second fully connected network, and the candidate vector is then normalized to obtain a target vector λ. According to the target vector λ, the first weight value of each normalization factor is obtained.
In the above embodiments, the first feature vector corresponding to each normalization factor is first determined according to the normalization factors, the correlations between the first feature vectors are then determined, and the first weight value of each normalization factor is determined in turn; this is simple to implement and highly usable.
In some embodiments, as shown in FIG. 3, step 121 may include steps 1211-1213:
In step 1211, the feature map is down-sampled to obtain second feature vectors corresponding to the feature map.
In the embodiments of the present disclosure, the feature map may be down-sampled by average pooling or maximum pooling, thereby obtaining K second feature vectors corresponding to the feature map. In the present disclosure, X_n denotes the n-th feature map; each feature map is processed in the same way, so for ease of description n is omitted and the feature map is represented simply by X. After down-sampling, K second feature vectors x corresponding to the feature map can be obtained, where x is C-dimensional and C is the number of channels of the feature map.
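The down-sampling step can be illustrated with global average pooling. This is a minimal sketch assuming a single (C, H, W) feature map; the function name `second_feature_vector` is illustrative, not the disclosure's notation:

```python
import numpy as np

def second_feature_vector(X):
    """Global average pooling: collapse the spatial dimensions of a
    (C, H, W) feature map into a C-dimensional second feature vector x."""
    C = X.shape[0]
    return X.reshape(C, -1).mean(axis=1)  # one averaged value per channel
```

Maximum pooling would be the same construction with `max` in place of `mean`.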
In step 1212, for each normalization factor, the normalization factor is used to normalize the second feature vector corresponding to the normalization factor, obtaining a third feature vector.
In the embodiments of the present disclosure, the statistic Ω corresponding to each normalization factor can be computed based on the normalization factor, where Ω includes a mean and/or a variance; in the embodiments of the present disclosure, Ω may include both the variance and the mean.
The second feature vectors x are respectively normalized according to the statistics Ω, obtaining K third feature vectors x̃, where x̃ is also C-dimensional.
In step 1213, dimensionality reduction is performed on the third feature vector, obtaining the first feature vector corresponding to the normalization factor.
In the embodiments of the present disclosure, convolution may be used for the dimensionality reduction. To reduce the computational overhead of the dimensionality reduction, grouped convolution may be used, with the quotient of the number of channels C of the feature map and a preset hyperparameter r taken as the number of groups; for example, if the number of channels corresponding to the feature map X is C and the preset hyperparameter is r, the number of groups is C/r. This ensures that the number of parameters throughout the dimensionality reduction remains constant at C; K first feature vectors z are obtained, where the first feature vector z is C/r-dimensional.
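The grouped reduction (C channels split into C/r groups, one output per group, C parameters in total) can be sketched as follows; the weight matrix `W` stands in for the learned grouped-convolution kernel, and all names here are illustrative:

```python
import numpy as np

def grouped_reduce(x3, r, W):
    """Grouped linear reduction of a C-dimensional third feature vector
    to a C/r-dimensional first feature vector: the C channels are split
    into C/r groups of size r and one learned weight row is applied per
    group, so the parameter count stays at C. W has shape (C // r, r)."""
    groups = x3.reshape(-1, r)        # (C/r, r): consecutive channels per group
    return (groups * W).sum(axis=1)   # one output value per group
```

With C = 6 and r = 2 this maps a 6-dimensional x̃ to a 3-dimensional z using exactly 6 weights.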
In the above embodiments, after the feature map is down-sampled, the corresponding K second feature vectors are obtained. The K normalization factors are used to respectively normalize these K second feature vectors, obtaining K third feature vectors, and dimensionality reduction is then performed on these K third feature vectors, obtaining K first feature vectors. This facilitates the subsequent determination of the first weight values of the different normalization factors, with high usability.
In some embodiments, as shown in FIG. 4, step 122 may include steps 1221-1222:
In step 1221, the transposed vector corresponding to each first feature vector is determined.
In the embodiments of the present disclosure, a corresponding transposed vector z^T can be determined for each first feature vector z.
In step 1222, for each first feature vector, the first feature vector is multiplied with each transposed vector, obtaining the correlation matrix.
In the embodiments of the present disclosure, multiplying any first feature vector z by any transposed vector z^T finally yields the correlation matrix v, where v is K×K-dimensional. In some embodiments, taking K = 5 and C/r = 3 as an example: the first transposed vector corresponding to first feature vector 1 [a_1, a_2, a_3] is determined, the second transposed vector corresponding to first feature vector 2 [b_1, b_2, b_3], the third transposed vector corresponding to first feature vector 3 [c_1, c_2, c_3], the fourth transposed vector corresponding to first feature vector 4 [d_1, d_2, d_3], and the fifth transposed vector corresponding to first feature vector 5 [e_1, e_2, e_3]. First feature vector 1 is multiplied with the first through fifth transposed vectors respectively, giving the elements of the first row of the correlation matrix; first feature vector 2 with the first through fifth transposed vectors, giving the second row; first feature vector 3, the third row; first feature vector 4, the fourth row; and first feature vector 5, the fifth row. In this way, a K×K-dimensional correlation matrix is obtained.
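The pairwise products described above amount to the Gram matrix of the K first feature vectors; a minimal NumPy sketch, with `correlation_matrix` an illustrative name:

```python
import numpy as np

def correlation_matrix(Z):
    """Gram matrix of the K first feature vectors: v[i, j] = z_i . z_j.

    Z has shape (K, D), one D-dimensional first feature vector per
    normalization factor (D = C/r after the grouped reduction)."""
    return Z @ Z.T  # (K, K) matrix of pairwise inner products
```

Row i of the result collects the products of first feature vector i with every transposed vector, matching the row-by-row construction in the text.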
In the above embodiments, for each first feature vector, the products of that first feature vector with the transposed vectors are used to describe the correlations between the multiple first feature vectors, thereby obtaining the correlation matrix, which facilitates the subsequent determination of the first weight values of the different normalization factors, with high usability.
In some embodiments, as shown in FIG. 5, step 123 may include steps 1231-1233:
In step 1231, the correlation matrix is converted into a candidate vector sequentially through a first fully connected network, a hyperbolic tangent transformation, and a second fully connected network.
In the embodiments of the present disclosure, the dimension of the correlation matrix v is K×K. The correlation matrix v may first be input into the first fully connected network, where a fully connected network is a neural network composed of fully connected layers, in which every node of each layer is connected to every node of the adjacent layers. The dimension of the correlation matrix v is then converted from K×K to πK through the tanh (hyperbolic tangent) transformation, where π is a preset hyperparameter that may take any positive integer value, for example 50.
Further, the dimension may then be converted from πK to K through the second fully connected network, obtaining a K-dimensional candidate vector.
In step 1232, the values in the candidate vector are normalized, obtaining the normalized target vector.
In the embodiments of the present disclosure, the values in the K-dimensional candidate vector can be normalized through a normalization function, for example a softmax function, ensuring that Σ_{k=1}^{K} λ_k = 1, thereby obtaining the normalized K-dimensional target vector λ. In the embodiments of the present disclosure, when determining the target normalized feature map corresponding to one feature map, λ_k and λ_k^n can be used interchangeably.
In step 1233, the first weight value of each normalization factor is determined according to the target vector.
In the embodiments of the present disclosure, the target vector λ = [λ_1, λ_2, …, λ_K]^T is K-dimensional, and the value of the k-th dimension of the target vector can be taken as the first weight value of the k-th normalization factor.
In the above embodiments, the correlation matrix can be converted into a candidate vector sequentially through the first fully connected network, the hyperbolic tangent transformation, and the second fully connected network; the values in the candidate vector are then normalized to obtain the normalized target vector; and according to the target vector, the first weight values of the different normalization factors can be determined, with high usability.
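Steps 1231-1232 can be sketched as follows. The matrices `W1` and `W2` stand in for the two learned fully connected networks (shown here as plain weight matrices, with the hidden width playing the role of πK), and softmax serves as the normalization function; all names are illustrative:

```python
import numpy as np

def softmax(p):
    """Normalize a vector so its entries are non-negative and sum to 1."""
    e = np.exp(p - p.max())  # subtract the max for numerical stability
    return e / e.sum()

def first_weights(v, W1, W2):
    """Map the flattened K x K correlation matrix v through two fully
    connected layers with a tanh in between, then softmax the K-dimensional
    result into the target vector lambda. W1 has shape (hidden, K*K) and
    W2 has shape (K, hidden); both would be learned during training."""
    h = np.tanh(W1 @ v.reshape(-1))  # first FC layer + hyperbolic tangent
    return softmax(W2 @ h)           # second FC layer + normalization
```

The k-th entry of the returned vector is then used directly as the first weight value of the k-th normalization factor.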
In some embodiments, as shown in FIG. 6, step 130 may include steps 131-134:
In step 131, for each normalization factor, the candidate normalized feature map corresponding to the normalization factor is multiplied by the first weight value of the normalization factor, obtaining the first normalized feature map corresponding to the normalization factor.
In the embodiments of the present disclosure, each normalization factor separately normalizes the feature map, obtaining the candidate normalized feature map corresponding to that normalization factor, and the candidate normalized feature map is multiplied by the first weight value of the corresponding normalization factor, obtaining the first normalized feature map.
In step 132, for each normalization factor, the size of the first normalized feature map corresponding to the normalization factor is adjusted according to the second weight value corresponding to the normalization factor, obtaining the second normalized feature map corresponding to the normalization factor.
In the embodiments of the present disclosure, after the neural network is trained, the second weight value remains unchanged for the same normalization factor. The size of the corresponding first normalized feature map can be adjusted by multiplying it by the second weight value corresponding to the normalization factor, thereby obtaining the second normalized feature map, whose size meets the size required by the final target normalized feature map.
In step 133, for each normalization factor, the second normalized feature map corresponding to the normalization factor is shifted according to the target offset value corresponding to the normalization factor, obtaining the third normalized feature map corresponding to the normalization factor.
In the embodiments of the present disclosure, after the neural network is trained, the target offset value remains unchanged for the same normalization factor. The corresponding second normalized feature map can be shifted by adding to it the target offset value corresponding to the normalization factor, thereby obtaining the third normalized feature map. The third normalized feature maps corresponding to the normalization factors overlap in position.
In step 134, the K third normalized feature maps are added, obtaining the target normalized feature map corresponding to the feature map.
In the embodiments of the present disclosure, the third normalized feature maps overlap in position; the pixel values at the same position in the third normalized feature maps are added, finally yielding the target normalized feature map X̂ corresponding to the feature map X.
In the embodiments of the present disclosure, step 130 can be expressed by the following formula (2):

X̂ = Σ_{k=1}^{K} ( λ_k · γ_k · (X − μ_k) / √(σ_k + ε) + β_k )    (2)

where X̂ denotes the target normalized feature map corresponding to the feature map X; λ_k denotes the first weight value of the k-th normalization factor; μ_k denotes the mean in the statistic Ω_k corresponding to the k-th normalization factor; σ_k denotes the variance in the statistic Ω_k corresponding to the k-th normalization factor; ε is a preset value that prevents the denominator in formula (2) from being zero when the variance is zero; γ_k denotes the second weight value corresponding to the k-th normalization factor, equivalent to a scale parameter used to scale the first normalized feature map; and β_k denotes the target offset value corresponding to the k-th normalization factor, equivalent to an offset parameter used to shift the second normalized feature map. Through γ_k and β_k, the target normalized feature map X̂ that finally meets the size requirement can be obtained.
As can be seen from formula (2), the mean μ_k and the variance σ_k use the same weight value. If the image to be processed is a sample image in the training process, the over-fitting caused by using different weight values for the mean and the variance can be avoided. In the embodiments of the present disclosure, the candidate normalized feature maps are linearly combined through the weight values corresponding to the different normalization factors, rather than linearly combining the candidate normalized feature maps with the different normalization factors themselves, making the normalization algorithm more flexible and more usable.
In addition, in the embodiments of the present disclosure, in order to obtain a more optimized target normalized feature map, a second weight value and a target offset value are introduced for each normalization factor. The second weight value and the target offset value can be obtained during training of the normalization layer of the neural network and remain unchanged for the same normalization factor after training is completed.
In the above embodiments, for each normalization factor, the candidate normalized feature map corresponding to the normalization factor is multiplied by the first weight value of the normalization factor, obtaining the corresponding first normalized feature map; the first normalized feature map corresponding to the normalization factor is resized and shifted through the second weight value and the target offset value corresponding to the normalization factor; and the resized and shifted third normalized feature maps are added, obtaining the target normalized feature map corresponding to the feature map. In this way, the target normalized feature map corresponding to the feature map is flexibly determined according to the different normalization factors; in practical applications, this can replace any normalization layer in various neural networks and is easy to implement and optimize.
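The combination in formula (2) can be sketched in NumPy as follows, assuming the per-factor statistics, first weight values, second weight values, and target offset values are already available; the function name and argument layout are illustrative, not the disclosure's notation:

```python
import numpy as np

def combine_normalizations(X, stats, lam, gamma, beta, eps=1e-5):
    """Weighted combination of K candidate normalizations of a feature
    map X, following the structure of formula (2): each candidate map
    (X - mu_k) / sqrt(sigma_k + eps) is weighted by lambda_k, scaled by
    gamma_k, shifted by beta_k, and the K results are summed."""
    out = np.zeros_like(X, dtype=float)
    for (mu, var), l, g, b in zip(stats, lam, gamma, beta):
        candidate = (X - mu) / np.sqrt(var + eps)  # candidate normalized map
        out += g * (l * candidate) + b             # scale, weight, shift, sum
    return out
```

With K = 1, λ_1 = γ_1 = 1, and β_1 = 0, this reduces to a single plain normalization of X by its own statistics.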
In some embodiments, as shown in FIG. 7, a framework diagram of an image normalization process is provided.
For the feature map X, the normalization factor k can be used to compute its corresponding statistic Ω_k, which includes the mean μ_k and the variance σ_k. Based on the statistics Ω_1, Ω_2, …, Ω_k, …, Ω_K, the feature map X is respectively normalized, obtaining K candidate normalized feature maps.
In addition, the feature map X is down-sampled by average pooling or maximum pooling, obtaining K second feature vectors x corresponding to the feature map X. The second feature vectors x are respectively normalized according to the statistics Ω_1, Ω_2, …, Ω_k, …, Ω_K, obtaining K third feature vectors x̃. Through grouped convolution, dimensionality reduction is performed on the K third feature vectors x̃, obtaining K first feature vectors z corresponding to the feature map X.
The transposed vector z^T corresponding to each first feature vector z can be determined. Multiplying any first feature vector z by any transposed vector z^T can be used to describe the correlations between the multiple first feature vectors, finally yielding the correlation matrix v, where v is K×K-dimensional.
The correlation matrix v is input into the first fully connected network, and its dimension is then converted from K×K to πK through the tanh transformation, where π is a preset hyperparameter that may take any positive integer value, for example 50. Further, the dimension may then be converted from πK to K through the second fully connected network, obtaining the candidate vector.
A normalization function, for example a softmax function, is used to normalize the candidate vector so that Σ_{k=1}^{K} λ_k = 1, giving the normalized target vector λ = [λ_1, λ_2, …, λ_K]^T; the value of each dimension of the target vector λ serves as the first weight value of the corresponding normalization factor. In this way, the first weight values of the different normalization factors are adaptively determined according to the feature map, improving the flexibility of the normalization algorithm.
After the K candidate normalized feature maps are respectively multiplied by the first weight values λ_k of the corresponding normalization factors, K first normalized feature maps are obtained. The K first normalized feature maps are respectively multiplied by the second weight values γ_k, obtaining K second normalized feature maps. The target offset values β_k are then respectively added to the K second normalized feature maps, obtaining K third normalized feature maps. Finally, these K third normalized feature maps are added, obtaining the target normalized feature map X̂ corresponding to the feature map X; γ_k and β_k are not shown in FIG. 7.
In the above embodiments, the first weight values of different normalization factors can be determined, which expands the scope of what image normalization methods can be used to analyze, makes it possible to analyze data content of different granularities within a single framework, and advances the frontier of normalization techniques in deep learning. In addition, by designing the above image normalization processing method, the entire network can be optimized stably while reducing over-fitting. This normalization layer can replace any normalization layer in a network structure, and compared with other normalization methods it is easy to implement and optimize, and is plug-and-play.
In some embodiments, when the image to be processed is a sample image, the image normalization method can be used to train a neural network, and the trained neural network can serve as a sub-network replacing the normalization layer in neural networks used to perform various tasks, which include but are not limited to semantic understanding, speech recognition, and computer vision tasks.
During training, the above process can adaptively determine the first weight value corresponding to each normalization factor according to the sample images for different tasks, solving the inflexibility of normalization algorithms that cannot dynamically adjust the weight values of the normalization factors when the sample sets differ.
In the embodiments of the present disclosure, after training the neural network on sample images for a certain task, the normalization layer in the neural network corresponding to that task can be directly replaced, achieving plug-and-play. For neural networks corresponding to other tasks, the layer can be transferred directly by fine-tuning the network parameters, thereby improving the performance of the other tasks.
Corresponding to the foregoing method embodiments, the present disclosure also provides device embodiments.
As shown in FIG. 8, FIG. 8 is a block diagram of an image normalization processing device according to an exemplary embodiment of the present disclosure. The device includes: a normalization processing module 210, configured to use K normalization factors to respectively perform normalization processing on a feature map, obtaining candidate normalized feature maps corresponding to each of the K normalization factors, where K is an integer greater than 1; a first determining module 220, configured to determine the first weight value of each normalization factor; and a second determining module 230, configured to determine, according to the candidate normalized feature maps corresponding to each of the K normalization factors and the first weight values of the normalization factors, the target normalized feature map corresponding to the feature map.
In some embodiments, the first determining module includes: a first determining submodule, configured to determine, for each normalization factor, a first feature vector corresponding to the normalization factor; a second determining submodule, configured to determine a correlation matrix according to the correlations between the K first feature vectors; and a third determining submodule, configured to determine the first weight value of each normalization factor according to the correlation matrix.
In some embodiments, the first determining submodule includes: a down-sampling unit, configured to down-sample the feature map to obtain K second feature vectors corresponding to the feature map; a first normalization processing unit, configured to use each normalization factor to normalize the second feature vector corresponding to that normalization factor among the K second feature vectors, obtaining a third feature vector; and a dimensionality reduction processing unit, configured to perform dimensionality reduction on the third feature vector to obtain the first feature vector.
In some embodiments, the second determining submodule includes: a first determining unit, configured to determine the transposed vector corresponding to each first feature vector; and a second determining unit, configured to multiply each first feature vector with each transposed vector pairwise, obtaining the correlation matrix.
In some embodiments, the third determining submodule includes: a conversion unit, configured to convert the correlation matrix into a candidate vector sequentially through a first fully connected network, a hyperbolic tangent transformation, and a second fully connected network; a second normalization processing unit, configured to normalize the values in the candidate vector, obtaining a normalized target vector; and a third determining unit, configured to determine the first weight value of each normalization factor according to the target vector, where the target vector includes K elements.
In some embodiments, the third determining unit is configured to use the k-th element in the target vector as the first weight value of the k-th normalization factor, where k is any integer from 1 to K.
In some embodiments, the second determining module includes: a fourth determining submodule, configured to, for each normalization factor, multiply the candidate normalized feature map corresponding to the normalization factor by the first weight value of the normalization factor, obtaining the first normalized feature map corresponding to the normalization factor; a fifth determining submodule, configured to, for each normalization factor, adjust the size of the first normalized feature map corresponding to the normalization factor according to the second weight value corresponding to the normalization factor, obtaining the second normalized feature map corresponding to the normalization factor; a sixth determining submodule, configured to, for each normalization factor, shift the second normalized feature map corresponding to the normalization factor according to the target offset value corresponding to the normalization factor, obtaining the third normalized feature map corresponding to the normalization factor; and a seventh determining submodule, configured to add the K third normalized feature maps, obtaining the target normalized feature map corresponding to the feature map.
Since the device embodiments substantially correspond to the method embodiments, reference may be made to the description of the method embodiments for relevant details. The device embodiments described above are merely illustrative; units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, they may be located in one place or distributed over multiple network units. Some or all of the modules can be selected according to actual needs to achieve the objectives of the solutions of the present disclosure. Those of ordinary skill in the art can understand and implement this without creative work.
The embodiments of the present disclosure also provide a computer-readable storage medium storing a computer program; when a processor invokes the computer program, the processor executes the image normalization processing method described in any of the above embodiments. The computer-readable storage medium includes a non-transitory computer-readable storage medium.
In some embodiments, the embodiments of the present disclosure provide a computer program product including computer-readable code; when the computer-readable code runs on a device, the processor in the device executes instructions for implementing the image normalization processing method provided in any of the above embodiments.
In some embodiments, the embodiments of the present disclosure also provide another computer program product for storing computer-readable instructions; when the instructions are executed, the computer performs the operations of the image normalization processing method provided by any of the above embodiments.
The computer program product can be implemented by hardware, software, or a combination thereof. In some embodiments, the computer program product may be embodied as a computer storage medium; in some embodiments, it may be embodied as a software product, such as a software development kit (SDK).
The embodiments of the present disclosure also provide an electronic device, including: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to call the executable instructions stored in the memory to implement the image normalization processing method described in any of the above embodiments.
FIG. 9 is a schematic diagram of the hardware structure of an electronic device provided by an embodiment of the present disclosure. The electronic device 310 includes a processor 311, and may also include an input device 312, an output device 313, and a memory 314; the input device 312, the output device 313, the memory 314, and the processor 311 are connected to one another through a bus.
The memory 314 includes, but is not limited to, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), or compact disc read-only memory (CD-ROM), and is used for related instructions and data.
The input device 312 is used to input data and/or signals, and the output device 313 is used to output data and/or signals. The output device 313 and the input device 312 may be independent devices or one integrated device.
The processor 311 may include one or more processors, for example one or more central processing units (CPUs); where the processor 311 is a CPU, it may be a single-core or a multi-core CPU.
The memory 314 is used to store program code and data of the network device.
The processor 311 is configured to call the program code and data in the memory 314 to execute the steps in the foregoing method embodiments; for details, refer to the description in the method embodiments, which will not be repeated here.
It can be understood that FIG. 9 shows only a simplified design of an image normalization processing device. In practical applications, the image normalization processing device may also include other necessary components, including but not limited to any number of input/output devices, processors, controllers, memories, etc.; all image normalization processing devices that can implement the embodiments of the present disclosure fall within the protection scope of the present disclosure.
Those skilled in the art will readily conceive of other embodiments of the present disclosure after considering the specification and practicing the invention disclosed here. The present disclosure is intended to cover any variations, uses, or adaptations of the present disclosure that follow its general principles and include common knowledge or customary technical means in the art not disclosed herein. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the present disclosure indicated by the following claims.
The above are only some embodiments of the present disclosure and are not intended to limit it; any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present disclosure shall be included within its protection scope.

Claims (11)

  1. An image normalization processing method, comprising:
    using K normalization factors to respectively perform normalization processing on a feature map, obtaining candidate normalized feature maps corresponding to each of the K normalization factors, wherein K is an integer greater than 1;
    determining a first weight value of each of the K normalization factors;
    determining, according to the candidate normalized feature maps corresponding to each of the K normalization factors and the first weight values, a target normalized feature map corresponding to the feature map.
  2. The method according to claim 1, wherein determining the first weight value of each of the K normalization factors comprises:
    for each of the K normalization factors, determining a first feature vector corresponding to the normalization factor;
    determining a correlation matrix according to correlations between the K first feature vectors;
    determining, according to the correlation matrix, the first weight value of each of the K normalization factors.
  3. The method according to claim 2, wherein, for each of the K normalization factors, determining the first feature vector corresponding to the normalization factor comprises:
    down-sampling the feature map to obtain K second feature vectors corresponding to the feature map;
    using the normalization factor to normalize the second feature vector corresponding to the normalization factor among the K second feature vectors, obtaining a third feature vector;
    performing dimensionality reduction on the third feature vector to obtain the first feature vector.
  4. The method according to claim 2 or 3, wherein determining the correlation matrix according to the correlations between the K first feature vectors comprises:
    determining a transposed vector corresponding to each of the first feature vectors;
    for each of the first feature vectors, multiplying the first feature vector with each of the transposed vectors to obtain the correlation matrix.
  5. The method according to any one of claims 2-4, wherein determining, according to the correlation matrix, the first weight value of each of the K normalization factors comprises:
    converting the correlation matrix into a candidate vector sequentially through a first fully connected network, a hyperbolic tangent transformation, and a second fully connected network;
    normalizing the values in the candidate vector to obtain a normalized target vector;
    determining, according to the target vector, the first weight value of each of the K normalization factors, wherein the target vector includes K elements.
  6. The method according to claim 5, wherein determining, according to the target vector, the first weight value of each of the K normalization factors comprises:
    using the k-th element in the target vector as the first weight value of the k-th normalization factor, wherein k is any integer from 1 to K.
  7. The method according to any one of claims 1-6, wherein determining, according to the candidate normalized feature maps corresponding to each of the K normalization factors and the first weight values, the target normalized feature map corresponding to the feature map comprises:
    for each of the K normalization factors,
    multiplying the candidate normalized feature map corresponding to the normalization factor by the first weight value of the normalization factor to obtain a first normalized feature map corresponding to the normalization factor;
    adjusting, according to a second weight value corresponding to the normalization factor, the size of the first normalized feature map corresponding to the normalization factor to obtain a second normalized feature map corresponding to the normalization factor;
    shifting, according to a target offset value corresponding to the normalization factor, the second normalized feature map corresponding to the normalization factor to obtain a third normalized feature map corresponding to the normalization factor;
    adding the K third normalized feature maps to obtain the target normalized feature map corresponding to the feature map.
  8. An image normalization processing device, comprising:
    a normalization processing module, configured to use K normalization factors to respectively perform normalization processing on a feature map, obtaining candidate normalized feature maps corresponding to each of the K normalization factors, wherein K is an integer greater than 1;
    a first determining module, configured to determine a first weight value of each of the K normalization factors;
    a second determining module, configured to determine, according to the candidate normalized feature maps corresponding to each of the K normalization factors and the first weight values, a target normalized feature map corresponding to the feature map.
  9. A computer-readable storage medium storing a computer program, wherein, when a processor invokes the computer program, the processor executes the image normalization processing method according to any one of claims 1-7.
  10. An electronic device, comprising:
    a processor;
    a memory for storing instructions executable by the processor;
    wherein the processor is configured to call the processor-executable instructions stored in the memory to implement the image normalization processing method according to any one of claims 1-7.
  11. A computer program product storing computer-readable instructions that, when executed by a processor, implement the image normalization processing method according to any one of claims 1-7.
PCT/CN2020/103575 2020-02-27 2020-07-22 Image normalization processing method and device, and storage medium WO2021169160A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/893,797 US20220415007A1 (en) 2020-02-27 2022-08-23 Image normalization processing

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010123511.8A Image normalization processing method and device, and storage medium CN111325222A (zh)
CN202010123511.8 2020-02-27

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/893,797 Continuation US20220415007A1 (en) 2020-02-27 2022-08-23 Image normalization processing

Publications (1)

Publication Number Publication Date
WO2021169160A1 true WO2021169160A1 (zh) 2021-09-02

Family

ID=71172932

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/103575 WO2021169160A1 (zh) Image normalization processing method and device, and storage medium

Country Status (4)

Country Link
US (1) US20220415007A1 (zh)
CN (1) CN111325222A (zh)
TW (1) TWI751668B (zh)
WO (1) WO2021169160A1 (zh)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111325222A (zh) * 2020-02-27 2020-06-23 深圳市商汤科技有限公司 图像归一化处理方法及装置、存储介质
JP2023524038A (ja) 2020-05-01 2023-06-08 マジック リープ, インコーポレイテッド 階層正規化がかけられる画像記述子ネットワーク
US20230274132A1 (en) * 2020-08-26 2023-08-31 Intel Corporation Methods and apparatus to dynamically normalize data in neural networks
CN112201272B (zh) * 2020-09-29 2024-07-23 腾讯音乐娱乐科技(深圳)有限公司 音频数据降噪的方法、装置、设备及存储介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108921283A (zh) * 2018-06-13 2018-11-30 Shenzhen SenseTime Technology Co., Ltd. Normalization method and device for deep neural networks, apparatus, and storage medium
CN108960053A (zh) * 2018-05-28 2018-12-07 Beijing Moshanghua Technology Co., Ltd. Normalization processing method and device, and client
CN109784420A (zh) * 2019-01-29 2019-05-21 Shenzhen SenseTime Technology Co., Ltd. Image processing method and device, computer equipment and storage medium
US20190228298A1 (en) * 2018-01-24 2019-07-25 International Business Machines Corporation Adaptation of a trained neural network
CN111325222A (zh) * 2020-02-27 2020-06-23 Shenzhen SenseTime Technology Co., Ltd. Image normalization processing method and device, and storage medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6582416B2 (ja) 2014-05-15 2019-10-02 Ricoh Co., Ltd. Image processing device, image processing method, and program
US10289825B2 (en) * 2016-07-22 2019-05-14 Nec Corporation Login access control for secure/private data
CN109255382B (zh) * 2018-09-07 2020-07-17 Alibaba Group Holding Limited Neural network system, method and device for image matching and localization
CN109544560B (zh) * 2018-10-31 2021-04-27 Shanghai SenseTime Intelligent Technology Co., Ltd. Image processing method and device, electronic device and storage medium
CN109886392B (zh) * 2019-02-25 2021-04-27 Shenzhen SenseTime Technology Co., Ltd. Data processing method and device, electronic device and storage medium
CN109902763B (zh) * 2019-03-19 2020-05-15 Beijing ByteDance Network Technology Co., Ltd. Method and device for generating feature maps

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190228298A1 (en) * 2018-01-24 2019-07-25 International Business Machines Corporation Adaptation of a trained neural network
CN108960053A (zh) * 2018-05-28 2018-12-07 Beijing Moshanghua Technology Co., Ltd. Normalization processing method and device, and client
CN108921283A (zh) * 2018-06-13 2018-11-30 Shenzhen SenseTime Technology Co., Ltd. Normalization method and device for deep neural networks, apparatus, and storage medium
CN109784420A (zh) * 2019-01-29 2019-05-21 Shenzhen SenseTime Technology Co., Ltd. Image processing method and device, computer equipment and storage medium
CN111325222A (zh) * 2020-02-27 2020-06-23 Shenzhen SenseTime Technology Co., Ltd. Image normalization processing method and device, and storage medium

Also Published As

Publication number Publication date
TWI751668B (zh) 2022-01-01
CN111325222A (zh) 2020-06-23
TW202133032A (zh) 2021-09-01
US20220415007A1 (en) 2022-12-29

Similar Documents

Publication Publication Date Title
WO2021169160A1 (zh) Image normalization processing method and device, and storage medium
JP7273157B2 (ja) Model training method, apparatus, terminal, and program
CN112288011B (zh) Image matching method based on self-attention deep neural network
JP7512262B2 (ja) Face keypoint detection method, apparatus, computer device, and computer program
WO2022068623A1 (zh) Model training method and related device
CN107292352B (zh) Image classification method and apparatus based on convolutional neural network
CN109949255A (zh) Image reconstruction method and device
CN112348036A (zh) Adaptive object detection method based on lightweight residual learning and deconvolution cascade
WO2022228425A1 (zh) Model training method and apparatus
WO2022179586A1 (zh) Model training method and associated device
CN111709268B (zh) Hand pose estimation method and apparatus based on hand structure guidance in depth images
WO2020062299A1 (zh) Neural network processor, data processing method, and related device
WO2022206729A1 (zh) Video cover selection method and apparatus, computer device, and storage medium
CN113011532B (zh) Classification model training method, apparatus, computing device, and storage medium
CN113498521A (zh) Text detection method and apparatus, and storage medium
CN109508640A (zh) Crowd emotion analysis method, apparatus, and storage medium
JP7150651B2 (ja) Neural network model reduction apparatus
CN115063847A (zh) Training method and apparatus for facial image acquisition model
WO2020252746A1 (zh) Image classification method using co-basis capsule projection
Zhang et al. A simple and effective static gesture recognition method based on attention mechanism
WO2022227024A1 (zh) Operation method, training method, and apparatus for neural network model
WO2022142084A1 (zh) Matching screening method and apparatus, electronic device, storage medium, and computer program
WO2020187029A1 (zh) Image processing method and apparatus, neural network training method, and storage medium
WO2024061123A1 (zh) Image processing method and related device
CN111860601B (zh) Method and apparatus for predicting macrofungus species

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20921291

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 14.12.2022)

122 Ep: pct application non-entry in european phase

Ref document number: 20921291

Country of ref document: EP

Kind code of ref document: A1