TWI721603B - Data processing method, data processing device, electronic equipment and computer readable storage medium - Google Patents
- Publication number
- TWI721603B (application number TW108137214A)
- Authority
- TW
- Taiwan
- Prior art keywords
- data
- transformation parameter
- standardized
- parameter
- transformation
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/045—Combinations of networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- General Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Computational Linguistics (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- Neurology (AREA)
- Image Analysis (AREA)
Abstract
The present invention relates to a data processing method, a data processing device, electronic equipment, and a computer-readable storage medium. The data processing method includes: inputting input data into a neural network model and obtaining the feature data currently output by a network layer of the neural network model; determining, according to transformation parameters of the neural network model, a standardization mode that matches the feature data, where the transformation parameters are used to adjust the statistical range of the statistics of the feature data, and the statistical range is used to characterize the standardization mode; and standardizing the feature data according to the determined standardization mode to obtain standardized feature data. Embodiments of the present invention make it possible to autonomously learn a matching standardization mode for each standardization layer of the neural network model without human intervention.
Description
The present invention relates to the field of computer vision technology, and in particular to a data processing method and device, electronic equipment, and a storage medium.
In challenging tasks such as natural language processing, speech recognition, and computer vision, various standardization techniques have become indispensable building blocks of deep learning. Here, standardization refers to processing the input data of a neural network so that the data follows a distribution with a mean of 0 and a standard deviation of 1, or a distribution ranging over 0 to 1, which makes the neural network easier to converge.
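The two target distributions mentioned above (mean 0 with standard deviation 1, and a 0 to 1 range) can be sketched in a few lines of NumPy; the function names and the small epsilon guard are illustrative, not taken from the patent:

```python
import numpy as np

def zero_mean_unit_std(x, eps=1e-5):
    """Shift and scale x so it has (approximately) mean 0 and std 1."""
    return (x - x.mean()) / (x.std() + eps)

def min_max_scale(x, eps=1e-5):
    """Rescale x into the range [0, 1]."""
    return (x - x.min()) / (x.max() - x.min() + eps)

x = np.array([2.0, 4.0, 6.0, 8.0])
z = zero_mean_unit_std(x)
m = min_max_scale(x)
```

Both variants are affine rescalings, which is why they preserve the shape of the data distribution while moving it into a numerically convenient range.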
According to one aspect of the present invention, a data processing method is provided, including:
inputting input data into a neural network model and obtaining the feature data currently output by a network layer of the neural network model;
determining, according to transformation parameters of the neural network model, a standardization mode that matches the feature data, where the transformation parameters are used to adjust the statistical range of the statistics of the feature data, and the statistical range is used to characterize the standardization mode;
standardizing the feature data according to the determined standardization mode to obtain standardized feature data.
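The three claimed steps can be sketched as follows. The network layer, the mode-selection rule, and all names here are illustrative placeholders: the real method learns the transformation parameters, whereas this toy dispatch just maps a label to NumPy reduction axes.

```python
import numpy as np

np.random.seed(0)

def network_layer(x):
    # stand-in for a real network layer's feature extraction
    return x * 2.0

def choose_axes(transform_param):
    # toy stand-in for the learned transformation parameters: they decide
    # the statistical range, encoded here as NumPy reduction axes
    ranges = {"batch": (0, 2, 3), "instance": (2, 3), "layer": (1, 2, 3)}
    return ranges[transform_param]

def standardize(f, axes, eps=1e-5):
    mu = f.mean(axis=axes, keepdims=True)
    sigma = f.std(axis=axes, keepdims=True)
    return (f - mu) / (sigma + eps)

x = np.random.randn(4, 3, 8, 8)          # a batch of input data (N, C, H, W)
features = network_layer(x)              # step 1: obtain feature data
axes = choose_axes("batch")              # step 2: determine the matching mode
out = standardize(features, axes)        # step 3: standardize
```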
According to one aspect of the present invention, a data processing device is also provided, including:
a data input module, configured to input input data into a neural network model and obtain the feature data currently output by a network layer of the neural network model;
a mode determination module, configured to determine, according to transformation parameters of the neural network model, a standardization mode that matches the feature data, where the transformation parameters are used to adjust the statistical range of the statistics of the feature data, and the statistical range is used to characterize the standardization mode;
a standardization processing module, configured to standardize the feature data according to the determined standardization mode to obtain standardized feature data.
According to one aspect of the present invention, an electronic device is also provided, including: a processor; and a memory for storing processor-executable instructions; where the processor is configured to execute any of the foregoing methods.
According to one aspect of the present invention, a computer-readable storage medium is also provided, on which computer program instructions are stored; when executed by a processor, the computer program instructions implement any of the foregoing methods.
In the embodiments of the present invention, after the feature data is obtained, the standardization mode that matches the feature data is determined according to the transformation parameters in the neural network model, and the feature data is then standardized according to the determined mode. This achieves the purpose of autonomously learning a matching standardization mode for each standardization layer of the neural network model without human intervention, which gives greater flexibility when standardizing feature data and thus effectively improves the adaptability of data standardization.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the present invention.
Various exemplary embodiments, features, and aspects of the present invention will be described in detail below with reference to the drawings. The same reference signs in the drawings indicate elements with the same or similar functions. Although various aspects of the embodiments are shown in the drawings, the drawings need not be drawn to scale unless otherwise noted.
Here the word "exemplary" means "serving as an example, embodiment, or illustration". Any embodiment described herein as "exemplary" need not be construed as superior or preferable to other embodiments.
The term "and/or" herein merely describes an association between related objects and indicates that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist at the same time, or B exists alone. In addition, the term "at least one" herein means any one of multiple items, or any combination of at least two of them; for example, "including at least one of A, B, and C" may mean including any one or more elements selected from the set consisting of A, B, and C.
In addition, numerous specific details are given in the following embodiments to better illustrate the present invention. Those skilled in the art should understand that the present invention can also be practised without certain specific details. In some instances, methods, means, elements, and circuits well known to those skilled in the art are not described in detail, so as to highlight the gist of the present invention.
First, it should be noted that the data processing method of the present invention is a technical solution for standardizing feature data (for example, feature maps) in a neural network model. In the standardization layers of a neural network model, different modes of standardizing the feature data can be characterized by different statistical ranges of the statistics (which may be the mean and the variance).
For example, Figures 1 to 3 show schematic diagrams in which different statistical ranges of the statistics characterize different standardization modes. Referring to Figures 1 to 3, the feature data may be a 4-dimensional hidden-layer feature map F ∈ ℝ^(N×C×H×W) in the neural network model, where F is the feature data, N is the number of samples in the data batch, C is the number of channels of the feature data, and H and W are the height and width of a single channel of the feature data, respectively.
When standardizing this feature data, the statistics, namely the mean μ and the variance σ², are first computed over F, and feature data F̂ of the same dimensions is output after the standardization operation. In the related art, this can be expressed by the following formula (1):

F̂_ncij = (F_ncij − μ) / √(σ² + ϵ)   (1)

where ϵ is a very small constant that prevents the denominator from being 0, and F_ncij is the pixel at position (i, j) of the c-th channel of the n-th sample feature data.
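A minimal sketch of formula (1) in NumPy. The statistics are computed here over a single global range purely for illustration; the statistical range over which μ and σ² are computed is exactly what varies between standardization modes.

```python
import numpy as np

eps = 1e-5
F = np.arange(24, dtype=float).reshape(1, 2, 3, 4)   # (N, C, H, W)

mu = F.mean()            # statistics over one global range, for illustration
var = F.var()
F_hat = (F - mu) / np.sqrt(var + eps)                # formula (1)
```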
Referring to Figure 1, when the statistical range of the statistics covers the N sample feature data of the batch on the same channel, that is, when the mean and variance are computed over all N samples for each channel, the standardization mode characterized is Batch Normalization (BN).
Referring to Figure 2, when the statistical range of the statistics is each channel of each sample feature data, that is, when the mean and variance are computed on each channel of each sample separately, the standardization mode characterized is Instance Normalization (IN).
Referring to Figure 3, when the statistical range of the statistics is all channels of each sample feature data, that is, when the mean and variance are computed over all channels of each sample, the standardization mode characterized is Layer Normalization (LN).
In addition, when the statistical range of the statistics is such that the mean and variance are computed with the channels of each sample feature data divided into groups, the standardization mode characterized is Group Normalization (GN). The group standardization mode is a general form of IN and LN: a group may contain a single channel or all channels, and the number of channels per group divides C evenly.
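The four statistical ranges described above (BN, IN, LN, GN) map directly onto NumPy reduction axes. A sketch, with an illustrative group count g = 2 that divides C; the shapes of the resulting mean tensors show "where" the statistics live:

```python
import numpy as np

np.random.seed(0)
N, C, H, W, g = 4, 6, 5, 5, 2            # g: illustrative group count, divides C
F = np.random.randn(N, C, H, W)

mu_bn = F.mean(axis=(0, 2, 3))           # BN: per channel            -> (C,)
mu_in = F.mean(axis=(2, 3))              # IN: per sample and channel -> (N, C)
mu_ln = F.mean(axis=(1, 2, 3))           # LN: per sample             -> (N,)
mu_gn = F.reshape(N, g, C // g, H, W).mean(axis=(2, 3, 4))   # GN      -> (N, g)
```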
Figure 4 shows a flowchart of a data processing method according to an embodiment of the present invention. Referring to Figure 4, the data processing method of the present invention may include:
Step S100: input the input data into the neural network model, and obtain the feature data currently output by a network layer of the neural network model. It should be pointed out that the neural network model may be a convolutional neural network (CNN), a recurrent neural network (RNN), or a long short-term memory network (LSTM), or a neural network for various vision tasks such as image classification (ImageNet), object detection and segmentation (COCO), video recognition (Kinetics), image stylization, and note generation.
Meanwhile, those skilled in the art can understand that the input data may include at least one sample; for example, the input data may contain multiple pictures or a single picture. When the input data is input into the neural network model, the neural network model processes each sample in the input data accordingly. Moreover, the network layer in the neural network model may be a convolutional layer, which performs feature extraction on the input data to obtain the corresponding feature data. When the input data includes multiple samples, the corresponding feature data accordingly includes multiple sample feature data.
After the feature data currently output by the network layer in the neural network model is obtained, step S200 may be executed: determine, according to the transformation parameters of the neural network model, the standardization mode that matches the feature data. The transformation parameters are used to adjust the statistical range of the statistics of the feature data, and the statistical range of the statistics characterizes the standardization mode. It should be noted here that the transformation parameters are learnable parameters in the neural network model; that is, during training of the neural network model, transformation parameters with different values can be learned from different input data. Thus, through the different values that the transformation parameters learn, different adjustments of the statistical range of the statistics are realized, achieving the purpose of applying different standardization modes to different input data.
After the matching standardization mode is determined, step S300 can be executed: standardize the feature data according to the determined standardization mode to obtain standardized feature data.
Thus, after obtaining the feature data, the data processing method of the present invention determines the standardization mode that matches the feature data according to the transformation parameters in the neural network model, and then standardizes the feature data according to the determined mode. This achieves the purpose of autonomously learning a matching standardization mode for each standardization layer of the neural network model without human intervention, which gives greater flexibility when standardizing feature data and effectively improves the adaptability of data standardization.
In one possible implementation, the transformation parameters may include a first transformation parameter, a second transformation parameter, a third transformation parameter, and a fourth transformation parameter. The first and second transformation parameters are used to adjust the statistical range of the mean in the statistics, and the third and fourth transformation parameters are used to adjust the statistical range of the standard deviation. The dimensions of the first and third transformation parameters are both based on the batch-size dimension of the feature data, while the dimensions of the second and fourth transformation parameters are both based on the channel dimension of the feature data. Here, those skilled in the art can understand that the batch-size dimension is the number of data items N in the data batch in which the feature data resides (that is, the number of sample feature data), and the channel dimension is the number of channels C of the feature data.
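A binarized block-diagonal matrix of the kind these transformation parameters take (N×N along the batch dimension, C×C along the channel dimension, as described later in the text) can be built from a list of group sizes; the helper name is illustrative, not from the patent:

```python
import numpy as np

def block_diag_binary(group_sizes):
    """Binary block-diagonal matrix with one all-ones block per group."""
    n = sum(group_sizes)
    m = np.zeros((n, n))
    start = 0
    for s in group_sizes:
        m[start:start + s, start:start + s] = 1.0
        start += s
    return m

N, C = 4, 4
U = block_diag_binary([1] * N)   # N x N, identity: each sample on its own
V = block_diag_binary([2, 2])    # C x C, two groups of two channels
```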
Correspondingly, when the transformation parameters include the first, second, third, and fourth transformation parameters, in one possible implementation, determining the standardization mode that matches the feature data according to the transformation parameters in the neural network can be realized through the following steps:
First, determine the statistical range of the statistics of the feature data as a first range. It should be noted here that, in one possible implementation, the first range may be the range of each channel of each sample feature data (that is, the statistical range of the statistics in the instance normalization IN described above), or it may be the statistical range of the statistics in another standardization mode.
Then, according to the first and second transformation parameters, adjust the statistical range of the mean from the first range to a second range. It should be pointed out here that the second range is determined by the values of the first and second transformation parameters; different values characterize different statistical ranges. Likewise, according to the third and fourth transformation parameters, adjust the statistical range of the standard deviation from the first range to a third range. In the same way, the third range is determined by the values of the third and fourth transformation parameters, and different values characterize different statistical ranges.
Then, based on the second range and the third range, determine the standardization mode.
For example, on the basis of the above, the standardization processing in the data processing method of the present invention can be defined as:

F̂ = (F − UμV) / (U′σV′)   (2)

where F denotes the feature data before standardization, F̂ denotes the standardized feature data, U is the first transformation parameter, V is the second transformation parameter, U′ is the third transformation parameter, and V′ is the fourth transformation parameter.
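A sketch of formula (2) under stated assumptions: the statistics are the IN statistics of shape N×C, the products UμV and U′σV′ are broadcast over H×W, and U, V, U′, V′ are used in row-normalized form so that the binary matrix products average (rather than sum) the statistics; the exact averaging convention is not spelled out in the text.

```python
import numpy as np

def formula2(F, U, V, Up, Vp, eps=1e-5):
    mu = F.mean(axis=(2, 3))                          # IN mean,  shape (N, C)
    sigma = np.sqrt(F.var(axis=(2, 3)) + eps)         # IN std,   shape (N, C)
    norm = lambda M: M / M.sum(axis=1, keepdims=True) # row-normalize (assumed)
    mu_hat = norm(U) @ mu @ norm(V)                   # U mu V
    sigma_hat = norm(Up) @ sigma @ norm(Vp)           # U' sigma V'
    # broadcast the adjusted N x C statistics over H x W, then standardize
    return (F - mu_hat[:, :, None, None]) / sigma_hat[:, :, None, None]

np.random.seed(0)
N, C = 2, 3
F = np.random.randn(N, C, 4, 4)
I_N, I_C = np.eye(N), np.eye(C)
out_in = formula2(F, I_N, I_C, I_N, I_C)   # U = V = identity recovers plain IN
```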
In one possible implementation, the statistical range of the statistics (the mean μ and the standard deviation σ) may adopt the statistical range of instance normalization IN; that is, the statistics are computed separately on each channel of each sample feature data, so that μ and σ each have dimensions N×C. It should be noted that, in line with the foregoing, the statistical range of the statistics may also adopt the statistical range of any of the other standardization modes described above; no specific limitation is made here.
Thus, the adjustment of the statistical range of the mean in the statistics is realized by taking the product of the first transformation parameter, the second transformation parameter, and the mean, and the adjustment of the statistical range of the standard deviation is realized by taking the product of the third transformation parameter, the fourth transformation parameter, and the standard deviation, thereby achieving self-adjustment of the standardization mode. The adjustment is simple and easy to implement.
In one possible implementation, the first transformation parameter U, the second transformation parameter V, the third transformation parameter U′, and the fourth transformation parameter V′ may be binarized matrices, in which every element takes the value 0 or 1. That is, U and U′, as well as V and V′, are four learnable binarized matrices whose elements are each 0 or 1. The products UμV and U′σV′ are then the standardization parameters of the data processing method of the present invention; a broadcast operation replicates them along the H×W dimensions to the same size as F, which facilitates the matrix operations.
Based on the dimensions of the first, second, third, and fourth transformation parameters described above, U and U′ represent the statistical mode learned along the batch-size dimension N, and V and V′ represent the statistical mode learned along the channel dimension C. U = U′ and V = V′ means that the mean μ and the standard deviation σ learn the same statistical mode, while U ≠ U′ or V ≠ V′ means that μ and σ learn different statistical modes. Different U, U′ and V, V′ therefore represent different standardization methods.
For example, referring to Figures 5 to 7, in the case where U = U′ and V = V′:
When both U and V are the identity matrix I shown in Figure 5, the standardization mode of the data processing method of the present invention represents IN, which computes the statistics separately at each position of the N dimension and each position of the C dimension; in this case UμV = μ and U′σV′ = σ.
When U is the all-ones matrix 𝟏 and V is the identity matrix I, the standardization mode of the data processing method of the present invention represents BN, in which the statistics of each position of the C dimension are averaged along the N dimension.
When U is the identity matrix I and V is the all-ones matrix 𝟏, the standardization mode of the data processing method of the present invention represents LN, in which the statistics of each position of the N dimension are averaged along the C dimension.
When U is the identity matrix I and V is a block-diagonal matrix similar to those of Figure 6 or Figure 7, the standardization mode of the data processing method of the present invention represents GN, which computes the statistics separately along the N dimension and in groups along the C dimension. For example, when V is the block-diagonal matrix shown in Figure 6, the number of groups is 4; when V is the block-diagonal matrix shown in Figure 7, the number of groups is 2. Unlike GN with its fixed number of groups, in the data processing method of the present invention the number of groups of the standardization mode can be learned arbitrarily.
When U is the all-ones matrix 𝟏 and V is the all-ones matrix 𝟏, the standardization mode of the data processing method of the present invention represents a "BLN" that averages the statistics over the N and C dimensions simultaneously; that is, the mean and the variance each have only one unique value over (N, H, W, C).
When U and V are both arbitrary block-diagonal matrices, the standardization mode of the data processing method of the present invention computes the statistics in groups along the C dimension and, at the same time, in groups along the N dimension. In other words, in the data processing method of the present invention, the standardization mode can learn a suitable batch size within a batch of samples over which to evaluate the statistics.
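The special cases listed above can be checked numerically. Assumption (made here so the arithmetic works out, not quoted from the patent): the all-ones matrix is divided by its row sum, so the matrix product averages the IN statistics. With that convention, the identity/all-ones choices reproduce the IN, BN, LN, and "BLN" statistics exactly:

```python
import numpy as np

np.random.seed(1)
N, C = 4, 3
F = np.random.randn(N, C, 5, 5)
mu = F.mean(axis=(2, 3))                 # IN statistics, shape (N, C)

one_N = np.ones((N, N)) / N              # "all-ones matrix", scaled to average
one_C = np.ones((C, C)) / C
I_N, I_C = np.eye(N), np.eye(C)

mu_in  = I_N @ mu @ I_C                  # IN:   statistics unchanged
mu_bn  = one_N @ mu @ I_C                # BN:   averaged along N
mu_ln  = I_N @ mu @ one_C                # LN:   averaged along C
mu_bln = one_N @ mu @ one_C              # BLN:  a single global value
```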
It should be noted that in the above embodiments, since U = U′ and V = V′, the second range, determined by adjusting the statistical range of the mean through the first transformation parameter U and the second transformation parameter V, is the same as the third range, determined by adjusting the statistical range of the standard deviation through the third transformation parameter U′ and the fourth transformation parameter V′. Those skilled in the art can understand that when U ≠ U′ or V ≠ V′, the resulting second and third ranges differ, which realizes an extension to even more diverse standardization modes. Moreover, cases such as U = U′ with V ≠ V′, and U ≠ U′ with V = V′, are also included; they are not enumerated one by one here.
It can thus be seen that the way the data processing method of the present invention standardizes the feature data differs from the standardization techniques of the related art, in which the statistical range is designed by hand; the data processing method of the present invention can autonomously learn the standardization mode suited to the current data.
That is, in the data processing method of the present invention, the different values of the transformation parameters are represented by different matrices (the transformation parameters are expressed as different matrices), so that the statistics of the feature data are migrated from an initial range (the first range, such as the statistical range of IN) to different statistical ranges, thereby autonomously learning a data-dependent Data Element Standardization operation. This enables the data processing method of the present invention not only to express all the standardization techniques of the related art, but also to extend to a wider range of standardization methods, giving it richer expressive power than previous standardization techniques.
According to formula (2) defined above, in one possible implementation, standardizing the feature data according to the determined standardization mode to obtain the standardized feature data may include:
First, obtain the statistics of the feature data according to the first range. That is, when the first range is the statistical range defined by instance normalization, the mean of the feature data is computed over that range according to the following formula (3), and the standard deviation is then computed from the computed mean according to the following formula (4), yielding the statistics:

μ_nc = (1 / (H·W)) Σ_{i=1..H} Σ_{j=1..W} F_ncij   (3)

σ_nc = √( (1 / (H·W)) Σ_{i=1..H} Σ_{j=1..W} (F_ncij − μ_nc)² + ϵ )   (4)
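Formulas (3) and (4) can be written out as explicit loops and compared against the vectorized NumPy equivalents (ϵ here is the same small illustrative constant as in formula (1)):

```python
import numpy as np

np.random.seed(2)
N, C, H, W, eps = 2, 3, 4, 4, 1e-5
F = np.random.randn(N, C, H, W)

mu = np.zeros((N, C))
sigma = np.zeros((N, C))
for n in range(N):
    for c in range(C):
        # formula (3): mean over the H x W positions of one sample's channel
        mu[n, c] = sum(F[n, c, i, j] for i in range(H) for j in range(W)) / (H * W)
        # formula (4): standard deviation, with eps inside the square root
        sq = sum((F[n, c, i, j] - mu[n, c]) ** 2 for i in range(H) for j in range(W))
        sigma[n, c] = (sq / (H * W) + eps) ** 0.5
```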
Then, based on the statistics, the first transformation parameter, the second transformation parameter, the third transformation parameter, and the fourth transformation parameter, standardize the feature data to obtain the standardized feature data.
其中,在一種可能的實現方式中,基於統計量、第一變換參數和第二變換參數,對特徵資料進行標準化處理,得到標準化後的特徵資料時,可以通過以下步驟來實現:Among them, in a possible implementation manner, the feature data is standardized based on the statistics, the first transformation parameter, and the second transformation parameter. When the standardized feature data is obtained, the following steps can be implemented:
First, a first standardization parameter is obtained based on the mean, the first transformation parameter and the second transformation parameter. That is, the mean μ, the first transformation parameter U and the second transformation parameter V are multiplied together (the product U μ V) to obtain the first standardization parameter. At the same time, a second standardization parameter is obtained based on the standard deviation, the third transformation parameter and the fourth transformation parameter. That is, the standard deviation σ, the third transformation parameter U′ and the fourth transformation parameter V′ are multiplied together (the product U′ σ V′) to obtain the second standardization parameter.

Finally, the feature data is standardized according to the feature data itself, the first standardization parameter and the second standardization parameter, to obtain the standardized feature data. That is, the computation is performed according to formula (2) to obtain the standardized feature data.
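The two-step computation above can be sketched as follows. The exact form of formula (2) is not reproduced in this excerpt, so the normalization below (in particular the averaging denominators and the ε term) is an assumed reading of the description, not the patent's verbatim formula:

```python
import numpy as np

def transform_normalize(x, U, V, Up, Vp, gamma, beta, eps=1e-5):
    """x: (N, C, H, W). U, Up: (N, N) and V, Vp: (C, C) binarized
    block-diagonal transformation parameters. The products U @ mu @ V and
    Up @ sigma @ Vp pool the instance statistics across the samples and
    channels selected by the 1-entries, extending the statistical range."""
    mu = x.mean(axis=(2, 3))                    # (N, C) instance means
    sigma = np.sqrt(x.var(axis=(2, 3)) + eps)   # (N, C) instance stds
    # Assumed averaging denominators: number of pooled entries per cell.
    d1 = U.sum(axis=1, keepdims=True) * V.sum(axis=0, keepdims=True)
    d2 = Up.sum(axis=1, keepdims=True) * Vp.sum(axis=0, keepdims=True)
    mu_hat = (U @ mu @ V) / d1           # first standardization parameter
    sigma_hat = (Up @ sigma @ Vp) / d2   # second standardization parameter
    x_hat = (x - mu_hat[..., None, None]) / sigma_hat[..., None, None]
    return gamma * x_hat + beta

N, C = 2, 4
x = np.random.randn(N, C, 8, 8)
I_N, I_C = np.eye(N), np.eye(C)
# With identity transforms the operation reduces to Instance Normalization:
y = transform_normalize(x, I_N, I_C, I_N, I_C, gamma=1.0, beta=0.0)
print(np.abs(y.mean(axis=(2, 3))).max())  # near 0: per-(n, c) means vanish
```

Replacing the identity matrices with all-ones matrices would instead pool the statistics over the whole batch and all channels, illustrating how different matrices express different standardization ranges.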
In addition, it should also be pointed out that, in the data processing method of the present invention, when the feature data is standardized according to formula (2), applying the standardization of formula (2) after each convolutional layer of the neural network model allows an independent standardization operation to be learned autonomously for the feature data of each layer. When the feature data is standardized according to formula (2), the standardization operation of each layer contains four binarized block-diagonal matrices to be learned: the first transformation parameter U, the second transformation parameter V, the third transformation parameter U′ and the fourth transformation parameter V′. In order to further reduce the amount of computation and the number of parameters in the data processing method of the present invention, and to turn the parameter optimization process into a differentiable end-to-end procedure, each binarized block-diagonal matrix can be generated by an inner product operation over multiple sub-matrices.
That is to say, in one possible implementation, a transformation parameter can be synthesized from multiple sub-matrices. The multiple sub-matrices can in turn be realized by setting learnable gating parameters in the neural network model. Thus, the data processing method of the present invention may further include: obtaining the corresponding multiple sub-matrices based on the learnable gating parameters set in the neural network model, and then performing an inner product operation on the multiple sub-matrices to obtain the transformation parameter.

Here, it should be noted that the inner product operation may be the Kronecker product. Using the Kronecker product, a matrix decomposition scheme is designed that decomposes the N×N-dimensional matrix U and the C×C-dimensional matrix V into parameters whose computational cost is acceptably small during network optimization.
Taking the second transformation parameter V as an example, the Kronecker product operation is described in detail. The second transformation parameter V can be expressed by a series of sub-matrices V_1, …, V_K, as in formula (5) below:

V = F(V_1) ⊗ F(V_2) ⊗ … ⊗ F(V_K)  (5)

where each sub-matrix V_k has dimensions C_k × C_k with C_1 · C_2 · … · C_K = C, and ⊗ denotes the Kronecker product, an operation between two matrices of arbitrary sizes, defined as A ⊗ B = [a_{ij} · B], i.e. the block matrix whose (i, j)-th block is the matrix B scaled by the element a_{ij} of A.

Thus, after the multiple sub-matrices V_1, …, V_K are obtained through the above steps, the corresponding second transformation parameter can be computed according to formula (5).
Obtaining the second transformation parameter V by the inner product operation over multiple sub-matrices V_k means that V is decomposed into a series of sub-matrices with continuous values, and these sub-matrices can be learned with common optimizers without regard to the binary constraint. In other words, the learning of the large C×C-dimensional matrix V is converted into the learning of a series of sub-matrices V_k, and the parameter count drops from C² to Σ_k C_k². For example, when V is the matrix shown in Fig. 6, it can be decomposed into the Kronecker product of three 2×2 sub-matrices, i.e. V = F(V_1) ⊗ F(V_2) ⊗ F(V_3).

In this case, the parameter count drops from 8 × 8 = 64 to 3 × 4 = 12.
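The Kronecker composition of formula (5) can be sketched with NumPy's `np.kron`. The sub-matrix values below are illustrative (the element-wise transform F is assumed to have been applied already):

```python
import numpy as np
from functools import reduce

def compose_transform(submats):
    """V = V_1 ⊗ V_2 ⊗ ... ⊗ V_K (formula (5)), composed left to right."""
    return reduce(np.kron, submats)

ones2 = np.ones((2, 2))   # 2x2 all-ones block
eye2 = np.eye(2)          # 2x2 identity
# I ⊗ 1 ⊗ 1: an 8x8 block-diagonal matrix with two 4x4 all-ones blocks,
# expressed via 3 small sub-matrices instead of one 64-entry matrix.
V = compose_transform([eye2, ones2, ones2])
print(V.shape, int(V.sum()))  # (8, 8) 32
```

Because an identity factor on the left splits the result into diagonal blocks, this composition naturally produces block-diagonal binarized matrices of the kind the method requires.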
In this way, by synthesizing the large-matrix transformation parameters from multiple sub-matrices, the learning of the C×C-dimensional second transformation parameter V is converted into the learning of a series of sub-matrices, and the parameter count drops from C² to Σ_k C_k². Those skilled in the art will understand that the first transformation parameter U, the third transformation parameter U′ and the fourth transformation parameter V′ can likewise be obtained in the above manner, which will not be repeated here.

It can thus be seen that synthesizing the transformation parameters, such as the first transformation parameter U and the second transformation parameter V, from multiple sub-matrices effectively reduces the parameter count, making the data processing method of the present invention easier to implement.
It should be noted that, in formula (5), F(·) represents an element-level transformation applied to each sub-matrix V_k. Thus, in one possible implementation, F can be set to the sign function, i.e. F(a) = sign(a), with sign(a) = 1 for a ≥ 0 and sign(a) = 0 for a < 0. In this case the binarized matrix V can be decomposed into a series of sub-matrices with continuous values, and these sub-matrices can be learned with common optimizers without regard to the binary constraint, so that the learning of the large C×C-dimensional matrix V is converted into the learning of a series of sub-matrices V_k. However, with this strategy, transforming the matrix elements by the sign function does not guarantee that the constructed transformation parameter has a block-diagonal structure, which may prevent the statistical range of the statistics from being adjusted smoothly.
Therefore, in one possible implementation, obtaining the corresponding multiple sub-matrices based on the learnable gating parameters set in the neural network model can be realized through the following steps.

First, the sign function is applied to the gating parameters to obtain a binarized vector.

Then, a permutation matrix is used to permute the elements of the binarized vector to produce a binarized gating vector.

Finally, multiple sub-matrices are obtained based on the binarized gating vector, a first basis matrix and a second basis matrix. Here, it should be pointed out that both the first basis matrix and the second basis matrix are constant matrices. The first basis matrix may be an all-ones matrix, e.g. a 2×2 all-ones matrix; the second basis matrix may be an identity matrix, e.g. a 2×2 identity matrix or a 2×3 identity matrix.
For example, as described above, the transformation parameters may include a first transformation parameter U, a second transformation parameter V, a third transformation parameter U′ and a fourth transformation parameter V′. The first transformation parameter U, the second transformation parameter V, the third transformation parameter U′ and the fourth transformation parameter V′ are obtained on the same or similar principles; therefore, for ease of description, the second transformation parameter V is taken as an example below to describe in more detail the process of synthesizing a transformation parameter from multiple sub-matrices.
It should be pointed out that the learnable gating parameters set in the neural network model can be represented by a vector g. In one possible implementation, the gating parameter g may be a vector with continuous values, where the number of values in the vector is consistent with the number of sub-matrices to be obtained:

F(V_k) = ĝ_k · 1 + (1 − ĝ_k) · I  (6)

ĝ = P · sign(g)  (7)
Referring to formulas (6) and (7), F is the binarized gating function used to re-parameterize the sub-matrices V_k. In formula (6), 1 is the all-ones matrix, I is the identity matrix, each ĝ_k is a binarized gate whose value is 0 or 1, and ĝ is the vector containing the multiple gates ĝ_k.

In the process of obtaining the transformation parameters in the above manner, first, referring to formula (7), the gating parameter g is passed through sign to produce sign(g). Here, sign(a) is the sign function: sign(a) = 1 when a ≥ 0, and sign(a) = 0 when a < 0. Thus, after the gating parameters are processed with the sign function, the resulting binarized vector sign(g) contains only the two values 0 and 1.

Then, still referring to formula (7), the permutation matrix P is used to permute the elements of the binarized vector to generate the binarized gating vector, i.e. ĝ = P · sign(g), where P is a constant permutation matrix whose permutation of the elements of sign(g) produces the binarized gates in ĝ. It should be explained that the role of P is to control the order of the 0s and 1s in the binarized gating vector ĝ, ensuring that the 0s always come before the 1s, i.e. that the identity-matrix factors always come before the all-ones-matrix factors, so that the composed product of the sub-matrices F(V_k) is a block-diagonal matrix. With such an ordered ĝ, the block-diagonal matrix shown in Fig. 7 can be expressed.
After the permutation matrix has been used to permute the elements of the binarized vector into the corresponding binarized gating vector ĝ, the multiple sub-matrices F(V_k) can be obtained according to formula (6) from the binarized gating vector, the first basis matrix 1 and the second basis matrix I. Once the multiple sub-matrices are obtained, their inner product can be computed according to formula (5) to obtain the corresponding second transformation parameter V.
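The full gating pipeline of formulas (6) and (7) can be sketched as below. The concrete permutation P, the gate values and the 2×2 basis matrices are illustrative assumptions; what the sketch shows is that ordering the identity factors before the all-ones factors yields a block-diagonal transform:

```python
import numpy as np
from functools import reduce

def binarize_gates(g, P):
    """Formula (7): g_hat = P @ sign(g), with sign(a) = 1 for a >= 0
    and sign(a) = 0 for a < 0."""
    return P @ (g >= 0).astype(float)

def gated_submatrix(g_k, ones_mat, eye_mat):
    """Formula (6) (assumed form): F(V_k) = g_k * 1 + (1 - g_k) * I."""
    return g_k * ones_mat + (1.0 - g_k) * eye_mat

g = np.array([0.3, -0.7, 1.2])    # learnable continuous gating parameters
P = np.eye(3)[[1, 0, 2]]          # constant permutation: move zeros first
g_hat = binarize_gates(g, P)      # -> [0., 1., 1.]
subs = [gated_submatrix(gk, np.ones((2, 2)), np.eye(2)) for gk in g_hat]
V = reduce(np.kron, subs)         # 8x8 block-diagonal transform
print(g_hat, V.shape)
```

Only the three scalars in g are learned; the sign binarization and the Kronecker composition are deterministic, which is how the method keeps the parameter count so small.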
Here, it should also be pointed out that the dimensions of the first basis matrix and the second basis matrix are not limited to those set in the above embodiment. In other words, the dimensions of the first basis matrix and the second basis matrix can be chosen arbitrarily according to the actual situation. For example, the first basis matrix may be a 2×2 all-ones matrix and the second basis matrix A may be a 2×3 identity matrix. In this way, the block-diagonal matrix with mutually overlapping blocks shown in Fig. 8 can be expressed.

Thus, by using constant matrices with different dimensions (that is, the first basis matrix and the second basis matrix), different sub-matrices can be generated, which enables the standardization method in the data processing method of the present invention to adapt to standardization layers with different numbers of channels, further improving the scalability of the standardization method of the present invention.
At the same time, by setting the learnable gating parameters g in the neural network model, the learning of the multiple sub-matrices is converted into the learning of the gating parameters g. Thus, in the data processing method of the present invention, when the feature data is standardized, the parameter count of the standardization is reduced from C² to only log₂C parameters (for example, if the channel number C of a hidden layer of the neural network model is 1024, the parameter count of the C×C-dimensional second transformation parameter V can be reduced to 10 parameters). This further reduces the parameter count of the standardization, making the data processing method of the present invention easier to implement and apply.
In order to explain more clearly the specific operations for standardizing the feature data in the data processing method of the present invention, an embodiment is used below to describe the specific computations of the standardization.
It should be noted that, in this embodiment, the first transformation parameter U is the same as the third transformation parameter U′, and the second transformation parameter V is the same as the fourth transformation parameter V′. Therefore, when obtaining the third transformation parameter U′ and the fourth transformation parameter V′, the first gating parameter corresponding to the first transformation parameter U and the second gating parameter corresponding to the second transformation parameter V can be used directly.

Accordingly, a first gating parameter and a second gating parameter are set in a given standardization layer of the neural network model. The first gating parameter corresponds to the first transformation parameter U, and the second gating parameter corresponds to the second transformation parameter V. In addition, a scaling parameter γ and a shift parameter β are also set in the standardization layer; both are used in the standardization formula (i.e., formula (2)).
In this embodiment, the input (Input) includes: the feature data; the learnable first gating parameter and second gating parameter; the scaling parameter γ; and the shift parameter β.

Output: the standardized feature data.
The operations in the standardization process include: computing the statistics μ and σ of the feature data according to formulas (3) and (4), and computing the binarized gating vectors from the first and second gating parameters according to formula (7);

calculating the first transformation parameter U and the second transformation parameter V according to formulas (5), (6) and (7);
In this embodiment, the following formula (8) is finally used to standardize the feature data:

x̂ = γ · (x − U μ V) / (U σ V + ε) + β  (8)

where ε is a small constant for numerical stability. Those skilled in the art will understand that when the first transformation parameter U differs from the third transformation parameter U′, and the second transformation parameter V differs from the fourth transformation parameter V′, the gating parameters set in the neural network model should include a first gating parameter, a second gating parameter, a third gating parameter and a fourth gating parameter.
Thus, by using the gating parameters g to obtain the transformation parameters in the neural network model, the learning of the transformation parameters is converted into the learning of the gating parameters g. According to formulas (6) and (7), the sub-matrices V_k are expressed by a series of all-ones matrices and identity matrices, so that the learning of the sub-matrices in formula (5) is re-parameterized into the learning of a continuous-valued vector g, while the parameter count of a large-matrix transformation parameter, such as the second transformation parameter V, is reduced from C² to only log₂C parameters, thereby achieving parameter decomposition and re-parameterization via the Kronecker product. This reduces the N×N-dimensional large-matrix first transformation parameter U and the C×C-dimensional large-matrix second transformation parameter V to only log₂N and log₂C parameters respectively, in a differentiable end-to-end training manner, so that the data processing method of the present invention involves little computation and few parameters, and is easier to implement and apply.
In addition, it should also be noted that the data processing method of the present invention may further include a training process for the neural network model. That is, before the input data is input into the neural network model and the feature data currently output by a network layer of the neural network model is obtained, the method may further include:

training the neural network model based on a sample data set to obtain a trained neural network model, where each piece of input data in the sample data set carries annotation information.

In one possible implementation, the neural network model includes at least one network layer and at least one standardization layer. When the neural network model is trained based on the sample data set, first, feature extraction is performed on each piece of input data in the sample data set through the network layer to obtain the corresponding predicted feature data. Then, each piece of predicted feature data is standardized through the standardization layer to obtain the standardized predicted feature data. Furthermore, the network loss is obtained according to each piece of predicted feature data and the annotation information, and the transformation parameters in the standardization layer are adjusted based on the network loss.
For example, when training the neural network model, the input (Input) includes: the training data set; a series of network parameters Θ in the network layers (e.g., weight values); a series of gating parameters Φ in the standardization layers (e.g., the first gating parameter and the second gating parameter); and the scaling parameter γ and shift parameter β. Output: the trained neural network model (including the network layers and the standardization layers, etc.).

Here, it should be pointed out that, in this embodiment, the first transformation parameter U is the same as the third transformation parameter U′, and the second transformation parameter V is the same as the fourth transformation parameter V′; therefore, only the first gating parameter and the second gating parameter need to be set in the series of gating parameters Φ of the standardization layers.

The number of training iterations is denoted T. In each training pass, based on the parameters in the above input, the standardization layers are trained in a forward-propagation manner according to the standardization computations described above to obtain the predicted feature data. Then, based on the obtained predicted feature data and the annotation information, the corresponding network loss is obtained by back-propagation, and the parameters in the input are updated according to the obtained network loss: Θ, Φ, γ and β.
After multiple training passes, the testing process of the neural network model can be carried out. In the data processing method of the present invention, the testing is mainly aimed at the standardization layers. Before testing, the averages of the statistics of each standardization layer over the multiple training batches need to be computed, and the corresponding standardization layer is then tested using the computed average statistics. That is, the averages (μ̄, σ̄) of the statistics (mean μ and standard deviation σ) obtained by each standardization layer during multi-batch training are computed: for each standardization layer l and over training batches t = 1, …, T, μ̄⁽ˡ⁾ = (1/T) Σ_t μ⁽ˡ⁾_t and σ̄⁽ˡ⁾ = (1/T) Σ_t σ⁽ˡ⁾_t.
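The pre-test averaging of per-layer statistics can be sketched as a running mean over batches. The class name and interface below are assumptions for illustration:

```python
import numpy as np

class StatAverager:
    """Keeps a running average of a standardization layer's statistics
    (mu, sigma) over training batches; the averages replace the batch
    statistics at test time."""
    def __init__(self):
        self.count = 0
        self.mu_avg = None
        self.sigma_avg = None

    def update(self, mu, sigma):
        self.count += 1
        if self.mu_avg is None:
            self.mu_avg = mu.astype(float)
            self.sigma_avg = sigma.astype(float)
        else:
            # Incremental mean: avg += (new - avg) / t
            self.mu_avg += (mu - self.mu_avg) / self.count
            self.sigma_avg += (sigma - self.sigma_avg) / self.count

avg = StatAverager()
avg.update(np.array([1.0, 3.0]), np.array([1.0, 1.0]))
avg.update(np.array([3.0, 5.0]), np.array([2.0, 2.0]))
print(avg.mu_avg, avg.sigma_avg)  # [2. 4.] [1.5 1.5]
```

One instance per standardization layer would be kept during training, and its `mu_avg`/`sigma_avg` plugged into the test-time formula in place of μ and σ.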
After the average statistics of each standardization layer are computed, each standardization layer can be tested. During testing, each standardization layer can be computed according to the following formula (9):

x̂⁽ˡ⁾ = γ⁽ˡ⁾ · (x⁽ˡ⁾ − U μ̄⁽ˡ⁾ V) / (U σ̄⁽ˡ⁾ V + ε) + β⁽ˡ⁾  (9)

where l indexes the standardization layers.
Thus, after the neural network model is trained through the above process, the parameters in the standardization layers of the finally trained neural network model are the first gating parameter, the second gating parameter, the scaling parameter and the shift parameter. Neural network models trained on different training data sets have different values of the first and second gating parameters in their standardization layers. As a result, once the standardization method of the data processing method of the present invention is embedded into a neural network model, the model can be applied to a variety of vision tasks. That is, by training the neural network model with the data processing method of the present invention embedded in it, models with excellent performance can be obtained for vision tasks such as classification, detection, recognition and segmentation, and used to predict the results of the corresponding tasks; alternatively, a neural network model not yet trained for a target task (a pre-trained model) can be transferred to other vision tasks, and the performance of those tasks can be further improved by fine-tuning parameters (e.g., the gating parameters in the standardization layers).
It can be understood that the method embodiments mentioned above in the present invention can be combined with one another to form combined embodiments without departing from the underlying principles and logic; due to space limitations, details are not repeated here.

At the same time, those skilled in the art can understand that, in the above methods of the specific implementations, the order in which the steps are written does not imply a strict execution order or impose any limitation on the implementation process; the specific execution order of the steps should be determined by their functions and possible internal logic.

In addition, the present invention also provides a data processing device, an electronic apparatus, a computer-readable storage medium and a program, all of which can be used to implement any of the data processing methods provided by the present invention. For the corresponding technical solutions and descriptions, refer to the corresponding records in the method section, which will not be repeated.
FIG. 9 shows a block diagram of a data processing device 100 according to an embodiment of the present invention. As shown in FIG. 9, the data processing device 100 includes:

a data input module 110, configured to input the input data into a neural network model and obtain the feature data currently output by a network layer of the neural network model;

a method determination module 120, configured to determine, according to the transformation parameters of the neural network model, a standardization method matching the feature data, where the transformation parameters are used to adjust the statistical range of the statistics of the feature data, and the statistical range is used to characterize the standardization method;

a standardization processing module 130, configured to standardize the feature data according to the determined standardization method to obtain the standardized feature data.
In one possible implementation, the device further includes: a sub-matrix acquisition module, configured to obtain corresponding multiple sub-matrices based on the learnable gating parameters set in the neural network model; and a transformation parameter acquisition module, configured to perform an inner product operation on the multiple sub-matrices to obtain the transformation parameters.

In one possible implementation, the sub-matrix acquisition module includes: a parameter processing sub-module, configured to process the gating parameters with a sign function to obtain a binarized vector; an element permutation sub-module, configured to permute the elements of the binarized vector with a permutation matrix to produce a binarized gating vector; and a sub-matrix acquisition sub-module, configured to obtain the multiple sub-matrices based on the binarized gating vector, a first basis matrix and a second basis matrix.

In one possible implementation, the transformation parameters include a first transformation parameter, a second transformation parameter, a third transformation parameter and a fourth transformation parameter; the dimensions of the first transformation parameter and of the third transformation parameter are based on the batch-size dimension of the feature data, and the dimensions of the second transformation parameter and of the fourth transformation parameter are based on the channel dimension of the feature data; where the batch-size dimension is the number of pieces of data in the data batch to which the feature data belongs, and the channel dimension is the number of channels of the feature data.
In one possible implementation, the method determination module 120 includes:

a first determination sub-module, configured to determine that the statistical range of the statistics of the feature data is a first range, where the statistics include a mean and a standard deviation; a first adjustment sub-module, configured to adjust the statistical range of the mean from the first range to a second range according to the first transformation parameter and the second transformation parameter; a second adjustment sub-module, configured to adjust the statistical range of the standard deviation from the first range to a third range according to the third transformation parameter and the fourth transformation parameter; and a method determination sub-module, configured to determine the standardization method based on the second range and the third range.

In one possible implementation, the first range is the range of each channel of each sample of the feature data.

In one possible implementation, the standardization processing module 130 includes: a statistics acquisition sub-module, configured to obtain the statistics of the feature data according to the first range; and a standardization processing sub-module, configured to standardize the feature data based on the statistics, the first transformation parameter, the second transformation parameter, the third transformation parameter and the fourth transformation parameter, to obtain the standardized feature data.

In one possible implementation, the standardization processing sub-module includes: a first parameter acquisition unit, configured to obtain a first standardization parameter based on the mean, the first transformation parameter and the second transformation parameter; a second parameter acquisition unit, configured to obtain a second standardization parameter based on the standard deviation, the third transformation parameter and the fourth transformation parameter; and a data processing unit, configured to standardize the feature data according to the feature data, the first standardization parameter and the second standardization parameter, to obtain the standardized feature data.
在一種可能的實現方式中,所述變換參數包括二值化矩陣,所述二值化矩陣內的每個元素的取值為0或1。In a possible implementation manner, the transformation parameter includes a binarization matrix, and the value of each element in the binarization matrix is 0 or 1.
在一種可能的實現方式中,所述門控參數為具有連續數值的向量;其中,所述門控參數中的數值的個數與所述子矩陣的數量相一致。In a possible implementation, the gating parameter is a vector with continuous values; wherein the number of values in the gating parameter is consistent with the number of sub-matrices.
在一種可能的實現方式中，所述第一基礎矩陣為全1矩陣，第二基礎矩陣為單位矩陣。In a possible implementation manner, the first basis matrix is an all-ones matrix, and the second basis matrix is an identity matrix.
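One way to read this: the all-ones basis pools statistics over an entire axis, while the identity basis keeps them per-element, and a gating value selects between the two. A toy sketch (the soft combination and a single scalar gate per matrix are simplifying assumptions, not the patent's exact construction):

```python
import numpy as np

def transform_from_basis(gate, n):
    """Blend the two basis matrices into one transformation parameter.
    gate -> 1: all-ones basis (normalized so U @ v averages over the axis);
    gate -> 0: identity basis (statistics stay per-element)."""
    ones = np.ones((n, n)) / n   # first basis: all-ones, scaled to average
    eye = np.eye(n)              # second basis: identity
    return gate * ones + (1.0 - gate) * eye
```

Rounding such a blended matrix element-wise would recover a binarization matrix of 0s and 1s as described above.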
在一種可能的實現方式中，還包括：模型訓練模組，用於在所述資料輸入模組將輸入資料輸入至神經網路模型中，獲取所述神經網路模型中網路層當前輸出的特徵資料之前，基於樣本資料集對所述神經網路模型進行訓練，得到訓練後的神經網路模型，其中，所述樣本資料集中各輸入資料具有標注資訊。In a possible implementation, the device further includes: a model training module for training the neural network model based on a sample data set to obtain a trained neural network model, before the data input module inputs the input data into the neural network model and acquires the feature data currently output by a network layer of the neural network model, wherein each input data in the sample data set has label information.
在一種可能的實現方式中，所述神經網路模型包括至少一個網路層和至少一個標準化層；其中，所述模型訓練模組包括：特徵提取子模組，用於將所述樣本資料集中的各輸入資料通過所述網路層進行特徵提取，得到各預測特徵資料；預測特徵資料獲取子模組，用於將各所述預測特徵資料通過所述標準化層進行標準化處理，得到標準化後的預測特徵資料；網路損失獲取子模組，用於根據各所述預測特徵資料和標注資訊，獲得網路損失；變換參數調整子模組，用於基於所述網路損失，對所述標準化層中的所述變換參數進行調整。In a possible implementation, the neural network model includes at least one network layer and at least one standardization layer; wherein the model training module includes: a feature extraction sub-module for performing feature extraction on each input data in the sample data set through the network layer to obtain predicted feature data; a predicted-feature-data acquisition sub-module for standardizing each predicted feature data through the standardization layer to obtain standardized predicted feature data; a network loss acquisition sub-module for obtaining a network loss according to each predicted feature data and the label information; and a transformation parameter adjustment sub-module for adjusting the transformation parameters in the standardization layer based on the network loss.
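To make the training loop concrete, here is a toy stand-in: a single gating value controlling the pooling transform is adjusted to reduce a squared-error "network loss", with a numerical gradient in place of backpropagation. Everything here (the loss, shapes, hyperparameters) is an illustrative assumption, not the patented procedure:

```python
import numpy as np

def toy_loss(gate, x, target):
    """Squared error after subtracting a mean whose pooling range is
    controlled by the gate (0 = per-sample mean, 1 = batch-pooled mean)."""
    n = x.shape[0]
    U = gate * np.ones((n, n)) / n + (1.0 - gate) * np.eye(n)
    mu = U @ x.mean(axis=1, keepdims=True)   # re-pooled means, (n, 1)
    return float(((x - mu - target) ** 2).mean())

def train_gate(x, target, gate=0.5, lr=0.1, steps=100, h=1e-4):
    """Adjust the gating parameter by gradient descent on the loss, as the
    transformation parameter adjustment sub-module would via backprop
    (central-difference gradient used here for self-containment)."""
    for _ in range(steps):
        grad = (toy_loss(gate + h, x, target) - toy_loss(gate - h, x, target)) / (2 * h)
        gate = min(1.0, max(0.0, gate - lr * grad))  # keep gate in [0, 1]
    return gate
```

In a real model the gating/transformation parameters would be ordinary learnable tensors updated by the same optimizer as the network weights.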
在一些實施例中，本發明實施例提供的裝置具有的功能或包含的模組可以用於執行上文方法實施例描述的方法，其具體實現可以參照上文方法實施例的描述，為了簡潔，這裡不再贅述。In some embodiments, the functions or modules of the device provided by the embodiments of the present invention can be used to execute the methods described in the above method embodiments. For specific implementation, refer to the description of the above method embodiments; for brevity, details are not repeated here.
本發明實施例還提出一種電腦可讀儲存媒體,其上存儲有電腦程式指令,所述電腦程式指令被處理器執行時實現上述方法。電腦可讀儲存媒體可以是非揮發性(Non-Volatile)電腦可讀儲存媒體。An embodiment of the present invention also provides a computer-readable storage medium on which computer program instructions are stored, and the computer program instructions implement the above-mentioned method when executed by a processor. The computer-readable storage medium may be a non-volatile (Non-Volatile) computer-readable storage medium.
本發明實施例還提出一種電子設備,包括:處理器;用於儲存處理器可執行指令的記憶體;其中,所述處理器被配置為執行上述方法。An embodiment of the present invention also provides an electronic device, including: a processor; a memory for storing executable instructions of the processor; wherein the processor is configured to execute the above method.
電子設備可以被提供為終端、伺服器或其它形態的設備。The electronic device can be provided as a terminal, a server, or other forms of equipment.
圖10是根據一示例性實施例示出的一種電子設備800的框圖。例如，電子設備800可以是行動電話，電腦，數位廣播終端，訊息收發設備，遊戲控制台，平板設備，醫療設備，健身設備，個人數位助理等終端。Fig. 10 is a block diagram of an electronic device 800 according to an exemplary embodiment. For example, the electronic device 800 may be a terminal such as a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, or a personal digital assistant.
參照圖10，電子設備800可以包括以下一個或多個組件：處理組件802，記憶體804，電源組件806，多媒體組件808，音頻組件810，輸入/輸出(I/O)埠812，感測器組件814，以及通信組件816。Referring to Fig. 10, the electronic device 800 may include one or more of the following components: a processing component 802, a memory 804, a power component 806, a multimedia component 808, an audio component 810, an input/output (I/O) port 812, a sensor component 814, and a communication component 816.
處理組件802通常控制電子設備800的整體操作，諸如與顯示，電話呼叫，資料通信，相機操作和記錄操作相關聯的操作。處理組件802可以包括一個或多個處理器820來執行指令，以完成上述的方法的全部或部分步驟。此外，處理組件802可以包括一個或多個模組，便於處理組件802和其他組件之間的交互。例如，處理組件802可以包括多媒體模組，以方便多媒體組件808和處理組件802之間的交互連動。The processing component 802 generally controls the overall operations of the electronic device 800, such as operations associated with display, telephone calls, data communication, camera operations, and recording operations. The processing component 802 may include one or more processors 820 to execute instructions to complete all or part of the steps of the above method. In addition, the processing component 802 may include one or more modules to facilitate interaction between the processing component 802 and other components. For example, the processing component 802 may include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
記憶體804被配置為儲存各種類型的資料以支援在電子設備800的操作。這些資料的示例包括用於在電子設備800上操作的任何應用程式或方法的指令，連絡人資料，電話簿資料，訊息，圖片，影片等。記憶體804可以由任何類型的揮發性或非揮發性儲存裝置或者它們的組合實現，如靜態隨機存取記憶體(SRAM)，電子抹除式可複寫唯讀記憶體(EEPROM)，可擦除可規劃式唯讀記憶體(EPROM)，可程式化唯讀記憶體(PROM)，唯讀記憶體(ROM)，磁記憶體，快閃記憶體，磁片或光碟。The memory 804 is configured to store various types of data to support operations on the electronic device 800. Examples of such data include instructions for any application or method operating on the electronic device 800, contact data, phonebook data, messages, pictures, videos, and so on. The memory 804 may be implemented by any type of volatile or non-volatile storage device, or a combination thereof, such as static random-access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disk.
電源組件806為電子設備800的各種組件提供電力。電源組件806可以包括電源管理系統，一個或多個電源，及其他與為電子設備800生成、管理和分配電力相關聯的組件。The power component 806 provides power to the various components of the electronic device 800. The power component 806 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the electronic device 800.
多媒體組件808包括在所述電子設備800和使用者之間提供一個輸出介面的螢幕。在一些實施例中，螢幕可以包括液晶顯示器(LCD)和觸控面板(TP)。如果螢幕包括觸控面板，螢幕可以被實現為觸控式螢幕，以接收來自使用者的輸入信號。觸控面板包括一個或多個觸控感測器以感測觸碰、滑動和觸控面板上的手勢。所述觸控感測器可以不僅感測觸碰或滑動動作的邊界，而且還檢測與所述觸控或滑動操作相關的持續時間和壓力。在一些實施例中，多媒體組件808包括一個前置攝像頭和/或後置攝像頭。當電子設備800處於操作模式，如拍攝模式或視訊模式時，前置攝像頭和/或後置攝像頭可以接收外部的多媒體資料。每個前置攝像頭和後置攝像頭可以是一個固定的光學透鏡系統或具有焦距和光學變焦能力。The multimedia component 808 includes a screen providing an output interface between the electronic device 800 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or swipe action, but also detect the duration and pressure associated with the touch or swipe operation. In some embodiments, the multimedia component 808 includes a front camera and/or a rear camera. When the electronic device 800 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or have focal length and optical zoom capability.
音頻組件810被配置為輸出和/或輸入音頻信號。例如，音頻組件810包括一個麥克風(MIC)，當電子設備800處於操作模式，如呼叫模式、記錄模式和語音辨識模式時，麥克風被配置為接收外部音訊信號。所接收的音訊信號可以被進一步存儲在記憶體804或經由通信組件816發送。在一些實施例中，音頻組件810還包括一個揚聲器，用於輸出音訊信號。The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a microphone (MIC) configured to receive external audio signals when the electronic device 800 is in an operation mode, such as a call mode, a recording mode, or a speech recognition mode. The received audio signal may be further stored in the memory 804 or transmitted via the communication component 816. In some embodiments, the audio component 810 further includes a speaker for outputting audio signals.
輸入/輸出埠812為處理組件802和週邊介面模組之間提供介面，上述週邊介面模組可以是鍵盤，滑鼠，按鈕等。這些按鈕可包括但不限於：主頁按鈕、音量按鈕、啟動按鈕和鎖定按鈕。The input/output port 812 provides an interface between the processing component 802 and peripheral interface modules, which may be a keyboard, a mouse, buttons, and the like. These buttons may include, but are not limited to: a home button, volume buttons, a start button, and a lock button.
感測器組件814包括一個或多個感測器，用於為電子設備800提供各個方面的狀態評估。例如，感測器組件814可以檢測到電子設備800的打開/關閉狀態，組件的相對定位，例如所述組件為電子設備800的顯示器和小鍵盤，感測器組件814還可以檢測電子設備800或電子設備800一個組件的位置改變，使用者與電子設備800接觸的存在或不存在，電子設備800方位或加速/減速和電子設備800的溫度變化。感測器組件814可以包括近接感測器，被配置用來在沒有任何的物理接觸時檢測附近物體的存在。感測器組件814還可以包括光感測器，如CMOS或CCD圖像感測器，用於在成像應用中使用。在一些實施例中，該感測器組件814還可以包括加速度感測器，陀螺儀感測器，磁感測器，壓力感測器或溫度感測器。The sensor component 814 includes one or more sensors for providing the electronic device 800 with status assessments in various aspects. For example, the sensor component 814 may detect the on/off state of the electronic device 800 and the relative positioning of components, such as the display and keypad of the electronic device 800; the sensor component 814 may also detect a change in position of the electronic device 800 or of one of its components, the presence or absence of user contact with the electronic device 800, the orientation or acceleration/deceleration of the electronic device 800, and temperature changes of the electronic device 800. The sensor component 814 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor component 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
通信組件816被配置為便於電子設備800和其他設備之間有線或無線方式的通信。電子設備800可以接入基於通信標準的無線網路，如WiFi，2G或3G，或它們的組合。在一個示例性實施例中，通信組件816經由廣播通道接收來自外部廣播管理系統的廣播信號或廣播相關資訊。在一個示例性實施例中，所述通信組件816還包括近場通信(NFC)模組，以促進短程通信。例如，NFC模組可基於射頻識別(RFID)技術，紅外資料協會(IrDA)技術，超寬頻(UWB)技術，藍牙(BT)技術和其他技術來實現。The communication component 816 is configured to facilitate wired or wireless communication between the electronic device 800 and other devices. The electronic device 800 may access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In an exemplary embodiment, the communication component 816 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 816 further includes a near-field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
在示例性實施例中，電子設備800可以被一個或多個應用專用積體電路(ASIC)、數位訊號處理器(DSP)、數位信號處理設備(DSPD)、可程式設計邏輯裝置(PLD)、現場可程式設計閘陣列(FPGA)、控制器、微控制器、微處理器或其他電子元件實現，用於執行上述方法。In an exemplary embodiment, the electronic device 800 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components to perform the above method.
在示例性實施例中，還提供了一種非揮發性電腦可讀儲存媒體，例如包括電腦程式指令的記憶體804，上述電腦程式指令可由電子設備800的處理器820執行以完成上述方法。In an exemplary embodiment, a non-volatile computer-readable storage medium is also provided, such as the memory 804 including computer program instructions, which can be executed by the processor 820 of the electronic device 800 to complete the above method.
圖11是根據一示例性實施例示出的一種電子設備1900的方塊圖。例如，電子設備1900可以被提供為一伺服器。參照圖11，電子設備1900包括處理組件1922，其進一步包括一個或多個處理器，以及由記憶體1932所代表的記憶體資源，用於存儲可由處理組件1922執行的指令，例如應用程式。記憶體1932中存儲的應用程式可以包括一個或一個以上的每一個對應於一組指令的模組。此外，處理組件1922被配置為執行指令，以執行上述方法。Fig. 11 is a block diagram of an electronic device 1900 according to an exemplary embodiment. For example, the electronic device 1900 may be provided as a server. Referring to Fig. 11, the electronic device 1900 includes a processing component 1922, which further includes one or more processors, and memory resources represented by a memory 1932 for storing instructions executable by the processing component 1922, such as applications. An application stored in the memory 1932 may include one or more modules, each corresponding to a set of instructions. In addition, the processing component 1922 is configured to execute instructions to perform the above method.
電子設備1900還可以包括一個電源組件1926，被配置為執行電子設備1900的電源管理，一個有線或無線網路埠1950被配置為將電子設備1900連接到網路，和一個輸入輸出(I/O)埠1958。電子設備1900可以操作基於儲存在記憶體1932的作業系統，例如Windows Server™、Mac OS X™、Unix™、Linux™、FreeBSD™或類似之系統。The electronic device 1900 may also include a power component 1926 configured to perform power management of the electronic device 1900, a wired or wireless network port 1950 configured to connect the electronic device 1900 to a network, and an input/output (I/O) port 1958. The electronic device 1900 can operate based on an operating system stored in the memory 1932, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
在示例性實施例中，還提供了一種非揮發性電腦可讀儲存媒體，例如包括電腦程式指令的記憶體1932，上述電腦程式指令可由電子設備1900的處理組件1922執行以完成上述方法。In an exemplary embodiment, a non-volatile computer-readable storage medium is also provided, such as the memory 1932 including computer program instructions, which can be executed by the processing component 1922 of the electronic device 1900 to complete the above method.
本發明可以是系統、方法和/或電腦程式產品。電腦程式產品可以包括電腦可讀儲存媒體，其上載有用於使處理器實現本發明的各個方面的電腦可讀程式指令。The present invention may be a system, a method, and/or a computer program product. The computer program product may include a computer-readable storage medium carrying computer-readable program instructions for causing a processor to implement various aspects of the present invention.
電腦可讀儲存媒體可以是可以保持和儲存由指令執行設備使用的指令的有形設備。電腦可讀儲存媒體例如可以是（但不限於）電存放裝置、磁存放裝置、光存放裝置、電磁存放裝置、半導體存放裝置或者上述的任意合適的組合。電腦可讀儲存媒體的更具體的例子（非窮舉的列表）包括：可擕式電腦盤、硬碟、隨機存取記憶體(RAM)、唯讀記憶體(ROM)、可擦除可規劃式唯讀記憶體(EPROM或快閃記憶體)、靜態隨機存取記憶體(SRAM)、可擕式壓縮磁碟唯讀記憶體(CD-ROM)、數位多功能盤(DVD)、記憶棒、軟碟、機械編碼設備、例如其上存儲有指令的打孔卡或凹槽內凸起結構、以及上述的任意合適的組合。這裡所使用的電腦可讀儲存媒體不被解釋為暫態信號本身，諸如無線電波或者其他自由傳播的電磁波、通過波導或其他傳輸媒介傳播的電磁波（例如，通過光纖電纜的光脈衝）、或者通過電線傳輸的電信號。The computer-readable storage medium may be a tangible device that can hold and store instructions used by an instruction execution device. The computer-readable storage medium may be, for example, but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of computer-readable storage media include: a portable computer disk, a hard disk, random-access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random-access memory (SRAM), portable compact disc read-only memory (CD-ROM), digital versatile disc (DVD), memory stick, floppy disk, and mechanical encoding devices such as punch cards or raised structures in grooves on which instructions are stored, and any suitable combination of the foregoing. The computer-readable storage medium used here is not to be interpreted as transient signals themselves, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (for example, light pulses through fiber-optic cables), or electrical signals transmitted through wires.
這裡所描述的電腦可讀程式指令可以從電腦可讀儲存媒體下載到各個計算/處理設備，或者通過網路、例如網際網路、區域網路、廣域網路和/或無線網路下載到外部電腦或外部存放裝置。網路可以包括銅傳輸電纜、光纖傳輸、無線傳輸、路由器、防火牆、交換機、閘道電腦和/或邊緣伺服器。每個計算/處理設備中的網路介面卡或者網路介面從網路接收電腦可讀程式指令，並轉發該電腦可讀程式指令，以供儲存在各個計算/處理設備中的電腦可讀儲存媒體中。The computer-readable program instructions described here can be downloaded from a computer-readable storage medium to respective computing/processing devices, or downloaded to an external computer or external storage device via a network, such as the Internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. The network interface card or network interface in each computing/processing device receives the computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium in the respective computing/processing device.
用於執行本發明操作的電腦程式指令可以是彙編指令、指令集架構(ISA)指令、機器指令、機器相關指令、微代碼、固件指令、狀態設置資料、或者以一種或多種程式設計語言的任意組合編寫的原始程式碼或目標代碼，所述程式設計語言包括物件導向的程式設計語言—諸如Smalltalk、C++等，以及常規的過程式程式設計語言—諸如“C”語言或類似的程式設計語言。電腦可讀程式指令可以完全地在使用者電腦上執行、部分地在使用者電腦上執行、作為一個獨立的套裝軟體執行、部分在使用者電腦上部分在遠端電腦上執行、或者完全在遠端電腦或伺服器上執行。在涉及遠端電腦的情形中，遠端電腦可以通過任意種類的網路—包括區域網路(LAN)或廣域網路(WAN)—連接到使用者電腦，或者，可以連接到外部電腦（例如利用網際網路服務提供者來通過網際網路連接）。在一些實施例中，通過利用電腦可讀程式指令的狀態資訊來個性化定制電子電路，例如可程式設計邏輯電路、現場可程式設計閘陣列(FPGA)或可程式設計邏輯陣列(PLA)，該電子電路可以執行電腦可讀程式指令，從而實現本發明的各個方面。The computer program instructions used to perform the operations of the present invention may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-related instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk or C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions can be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case involving a remote computer, the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, an electronic circuit, such as a programmable logic circuit, a field-programmable gate array (FPGA), or a programmable logic array (PLA), is personalized by using the state information of the computer-readable program instructions, and the electronic circuit can execute the computer-readable program instructions to realize various aspects of the present invention.
這裡參照根據本發明實施例的方法、裝置(系統)和電腦程式產品的流程圖和/或方塊圖描述了本發明的各個方面。應當理解,流程圖和/或方塊圖的每個方框以及流程圖和/或方塊圖中各方框的組合,都可以由電腦可讀程式指令實現。Herein, various aspects of the present invention are described with reference to flowcharts and/or block diagrams of methods, devices (systems) and computer program products according to embodiments of the present invention. It should be understood that each block of the flowchart and/or block diagram and the combination of each block in the flowchart and/or block diagram can be implemented by computer-readable program instructions.
這些電腦可讀程式指令可以提供給通用電腦、專用電腦或其它可程式設計資料處理裝置的處理器，從而生產出一種機器，使得這些指令在通過電腦或其它可程式設計資料處理裝置的處理器執行時，產生了實現流程圖和/或方塊圖中的一個或多個方框中規定的功能/動作的裝置。也可以把這些電腦可讀程式指令存儲在電腦可讀儲存媒體中，這些指令使得電腦、可程式設計資料處理裝置和/或其他設備以特定方式工作，從而，儲存有指令的電腦可讀介質則包括一個製造品，其包括實現流程圖和/或框圖中的一個或多個方框中規定的功能/動作的各個方面的指令。These computer-readable program instructions can be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing device, thereby producing a machine, such that when these instructions are executed by the processor of the computer or other programmable data processing device, a device that implements the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams is produced. These computer-readable program instructions can also be stored in a computer-readable storage medium; these instructions make the computer, the programmable data processing device, and/or other equipment work in a specific manner, so that the computer-readable medium storing the instructions includes an article of manufacture that includes instructions implementing various aspects of the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
也可以把電腦可讀程式指令載入到電腦、其它可程式設計資料處理裝置、或其它設備上，使得在電腦、其它可程式設計資料處理裝置或其它設備上執行一系列操作步驟，以產生電腦實現的過程，從而使得在電腦、其它可程式設計資料處理裝置、或其它設備上執行的指令實現流程圖和/或方塊圖中的一個或多個方框中規定的功能/動作。The computer-readable program instructions can also be loaded onto a computer, another programmable data processing device, or other equipment, so that a series of operation steps are executed on the computer, the other programmable data processing device, or the other equipment to produce a computer-implemented process, such that the instructions executed on the computer, the other programmable data processing device, or the other equipment implement the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
圖式中的流程圖和方塊圖顯示了根據本發明的多個實施例的系統、方法和電腦程式產品的可能實現的體系架構、功能和操作。在這點上，流程圖或框圖中的每個方框可以代表一個模組、程式段或指令的一部分，所述模組、程式段或指令的一部分包含一個或多個用於實現規定的邏輯功能的可執行指令。在有些作為替換的實現中，方框中所標注的功能也可以以不同於圖式中所標注的順序發生。例如，兩個連續的方框實際上可以基本並行地執行，它們有時也可以按相反的順序執行，這依所涉及的功能而定。也要注意的是，方塊圖和/或流程圖中的每個方框、以及方塊圖和/或流程圖中的方框的組合，可以用執行規定的功能或動作的專用的基於硬體的系統來實現，或者可以用專用硬體與電腦指令的組合來實現。The flowcharts and block diagrams in the drawings show the possible implementation architectures, functions, and operations of systems, methods, and computer program products according to multiple embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or part of an instruction, which contains one or more executable instructions for implementing the specified logical functions. In some alternative implementations, the functions marked in the blocks may also occur in an order different from that marked in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
以上已經描述了本發明的各實施例,上述說明是示例性的,並非窮盡性的,並且也不限於所揭露的各實施例。在不偏離所說明的各實施例的範圍和精神的情況下,對於本技術領域的普通技術人員來說許多修改和變更都是顯而易見的。本文中所用術語的選擇,旨在最好地解釋各實施例的原理、實際應用或對市場中的技術的改進,或者使本技術領域的其它普通技術人員能理解本文揭露的各實施例。The embodiments of the present invention have been described above, and the above description is exemplary, not exhaustive, and is not limited to the disclosed embodiments. Without departing from the scope and spirit of the described embodiments, many modifications and changes are obvious to those of ordinary skill in the art. The choice of terms used herein is intended to best explain the principles of the embodiments, practical applications or improvements to the technology in the market, or to enable those of ordinary skill in the art to understand the embodiments disclosed herein.
100:資料處理裝置 110:資料輸入模組 120:方式確定模組 130:標準化處理模組 800:電子設備 802:處理組件 804:記憶體 806:電源組件 808:多媒體組件 810:音頻組件 812:輸入/輸出埠 814:感測器組件 816:通信組件 820:處理器 1900:電子設備 1922:處理組件 1926:電源組件 1932:記憶體 1950:網路埠 1958:輸入輸出埠 S100~S300:流程步驟100: data processing device 110: Data input module 120: Method determination module 130: Standardized processing module 800: electronic equipment 802: Processing component 804: memory 806: Power Components 808: Multimedia components 810: Audio component 812: input/output port 814: Sensor component 816: Communication Components 820: processor 1900: electronic equipment 1922: processing components 1926: power supply components 1932: memory 1950: network port 1958: Input and output ports S100~S300: process steps
本發明之其他的特徵及功效,將於參照圖式的實施方式中清楚地呈現,其中: 圖1至圖3示出根據本發明實施例的資料處理方法中通過統計量的統計範圍表徵標準化方式的示意圖; 圖4示出根據本發明實施例的資料處理方法的流程圖; 圖5至圖8示出根據本發明實施例的資料處理方法中變換參數的不同表示方式示意圖; 圖9示出根據本發明實施例的資料處理裝置的方塊圖; 圖10示出根據本發明實施例的電子設備的方塊圖; 圖11示出根據本發明實施例的電子設備的方塊圖。Other features and effects of the present invention will be clearly presented in the embodiments with reference to the drawings, in which: Figures 1 to 3 show schematic diagrams of characterizing a standardized manner by a statistical range of statistics in a data processing method according to an embodiment of the present invention; Figure 4 shows a flowchart of a data processing method according to an embodiment of the present invention; 5 to 8 show schematic diagrams of different representations of transformation parameters in a data processing method according to an embodiment of the present invention; Figure 9 shows a block diagram of a data processing device according to an embodiment of the present invention; Figure 10 shows a block diagram of an electronic device according to an embodiment of the present invention; Fig. 11 shows a block diagram of an electronic device according to an embodiment of the present invention.
S100~S300:流程步驟 S100~S300: process steps
Claims (28)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910139050.0A CN109886392B (en) | 2019-02-25 | 2019-02-25 | Data processing method and device, electronic equipment and storage medium |
CN201910139050.0 | 2019-02-25 |
Publications (2)
Publication Number | Publication Date |
---|---|
TW202032416A TW202032416A (en) | 2020-09-01 |
TWI721603B true TWI721603B (en) | 2021-03-11 |
Family
ID=66929254
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
TW108137214A TWI721603B (en) | 2019-02-25 | 2019-10-16 | Data processing method, data processing device, electronic equipment and computer readable storage medium |
Country Status (7)
Country | Link |
---|---|
US (1) | US20210312289A1 (en) |
JP (1) | JP2022516452A (en) |
KR (1) | KR20210090691A (en) |
CN (1) | CN109886392B (en) |
SG (1) | SG11202106254TA (en) |
TW (1) | TWI721603B (en) |
WO (1) | WO2020172979A1 (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11870804B2 (en) * | 2019-08-01 | 2024-01-09 | Akamai Technologies, Inc. | Automated learning and detection of web bot transactions using deep learning |
CN111325222A (en) * | 2020-02-27 | 2020-06-23 | 深圳市商汤科技有限公司 | Image normalization processing method and device and storage medium |
US20220108163A1 (en) * | 2020-10-02 | 2022-04-07 | Element Ai Inc. | Continuous training methods for systems identifying anomalies in an image of an object |
CN112561047B (en) * | 2020-12-22 | 2023-04-28 | 上海壁仞智能科技有限公司 | Apparatus, method and computer readable storage medium for processing data |
CN112951218B (en) * | 2021-03-22 | 2024-03-29 | 百果园技术(新加坡)有限公司 | Voice processing method and device based on neural network model and electronic equipment |
KR20240050709A (en) | 2022-10-12 | 2024-04-19 | 성균관대학교산학협력단 | Method and apparatus for self-knowledge distillation using cross entropy |
CN115936094B (en) * | 2022-12-27 | 2024-07-02 | 北京百度网讯科技有限公司 | Training method and device for text processing model, electronic equipment and storage medium |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180204111A1 (en) * | 2013-02-28 | 2018-07-19 | Z Advanced Computing, Inc. | System and Method for Extremely Efficient Image and Pattern Recognition and Artificial Intelligence Platform |
Family Cites Families (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103971163B (en) * | 2014-05-09 | 2017-02-15 | 哈尔滨工程大学 | Adaptive learning rate wavelet neural network control method based on normalization lowest mean square adaptive filtering |
WO2016123409A1 (en) * | 2015-01-28 | 2016-08-04 | Google Inc. | Batch normalization layers |
CN109074517B (en) * | 2016-03-18 | 2021-11-30 | 谷歌有限责任公司 | Global normalized neural network |
US10204621B2 (en) * | 2016-09-07 | 2019-02-12 | International Business Machines Corporation | Adjusting a deep neural network acoustic model |
CN106650930A (en) * | 2016-12-09 | 2017-05-10 | 温州大学 | Model parameter optimizing method and device |
CN107680077A (en) * | 2017-08-29 | 2018-02-09 | 南京航空航天大学 | A kind of non-reference picture quality appraisement method based on multistage Gradient Features |
CN107622307A (en) * | 2017-09-11 | 2018-01-23 | 浙江工业大学 | A kind of Undirected networks based on deep learning connect side right weight Forecasting Methodology |
CN108875787B (en) * | 2018-05-23 | 2020-07-14 | 北京市商汤科技开发有限公司 | Image recognition method and device, computer equipment and storage medium |
CN108921283A (en) * | 2018-06-13 | 2018-11-30 | 深圳市商汤科技有限公司 | Method for normalizing and device, equipment, the storage medium of deep neural network |
CN108875074B (en) * | 2018-07-09 | 2021-08-10 | 北京慧闻科技发展有限公司 | Answer selection method and device based on cross attention neural network and electronic equipment |
CN109272061B (en) * | 2018-09-27 | 2021-05-04 | 安徽理工大学 | Construction method of deep learning model containing two CNNs |
-
2019
- 2019-02-25 CN CN201910139050.0A patent/CN109886392B/en active Active
- 2019-04-22 KR KR1020217018179A patent/KR20210090691A/en not_active Application Discontinuation
- 2019-04-22 SG SG11202106254TA patent/SG11202106254TA/en unknown
- 2019-04-22 JP JP2021537055A patent/JP2022516452A/en active Pending
- 2019-04-22 WO PCT/CN2019/083642 patent/WO2020172979A1/en active Application Filing
- 2019-10-16 TW TW108137214A patent/TWI721603B/en active
-
2021
- 2021-06-18 US US17/352,219 patent/US20210312289A1/en not_active Abandoned
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180204111A1 (en) * | 2013-02-28 | 2018-07-19 | Z Advanced Computing, Inc. | System and Method for Extremely Efficient Image and Pattern Recognition and Artificial Intelligence Platform |
Also Published As
Publication number | Publication date |
---|---|
KR20210090691A (en) | 2021-07-20 |
CN109886392B (en) | 2021-04-27 |
TW202032416A (en) | 2020-09-01 |
CN109886392A (en) | 2019-06-14 |
US20210312289A1 (en) | 2021-10-07 |
WO2020172979A1 (en) | 2020-09-03 |
SG11202106254TA (en) | 2021-07-29 |
JP2022516452A (en) | 2022-02-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
TWI721603B (en) | Data processing method, data processing device, electronic equipment and computer readable storage medium | |
TWI759722B (en) | Neural network training method and device, image processing method and device, electronic device and computer-readable storage medium | |
WO2020224457A1 (en) | Image processing method and apparatus, electronic device and storage medium | |
TWI717923B (en) | Method, apparatus and electronic device for face recognition and storage medium thereof | |
WO2021196401A1 (en) | Image reconstruction method and apparatus, electronic device and storage medium | |
WO2020135529A1 (en) | Pose estimation method and apparatus, and electronic device and storage medium | |
TWI782480B (en) | Image processing method, electronic device and computer readable storage medium | |
TW202131281A (en) | Image processing method and apparatus, and electronic device and storage medium | |
RU2628494C1 (en) | Method and device for generating image filter | |
WO2021169132A1 (en) | Imaging processing method and apparatus, electronic device, and storage medium | |
TWI735112B (en) | Method, apparatus and electronic device for image generating and storage medium thereof | |
TWI778313B (en) | Method and electronic equipment for image processing and storage medium thereof | |
TWI757668B (en) | Network optimization method and device, image processing method and device, storage medium | |
TWI738144B (en) | Information processing method and device, electronic equipment and storage medium | |
TW202127369A (en) | Network training method and device, and image generation method and device | |
CN112148980B (en) | Article recommending method, device, equipment and storage medium based on user click | |
TWI738349B (en) | Image processing method, image processing device, electronic equipment and computer readable storage medium | |
CN111259967A (en) | Image classification and neural network training method, device, equipment and storage medium | |
CN113259583A (en) | Image processing method, device, terminal and storage medium | |
CN115512116B (en) | Image segmentation model optimization method and device, electronic equipment and readable storage medium | |
CN109460458B (en) | Prediction method and device for query rewriting intention | |
CN110929771B (en) | Image sample classification method and device, electronic equipment and readable storage medium | |
CN114385838A (en) | Information classification method, device, equipment and storage medium | |
CN115393651A (en) | Model distillation method, device, electronic equipment and readable storage medium |