CN114970654A - Data processing method and device and terminal

Data processing method and device and terminal

Info

Publication number: CN114970654A (granted as CN114970654B)
Authority: CN (China)
Prior art keywords: neural network, network, data, scene, super
Legal status: Granted, active
Application number: CN202110558861.1A
Other languages: Chinese (zh)
Other versions: CN114970654B (en)
Inventors: 刘艳琳, 王永忠
Current assignee: Huawei Technologies Co Ltd
Original assignee: Huawei Technologies Co Ltd
Application filed by Huawei Technologies Co Ltd
Priority applications: CN202110558861.1A; PCT/CN2021/141388 (published as WO2022242175A1)
Publication of application: CN114970654A
Publication of grant: CN114970654B

Classifications

    • G: PHYSICS
      • G06: COMPUTING; CALCULATING OR COUNTING
        • G06F: ELECTRIC DIGITAL DATA PROCESSING
          • G06F 18/00: Pattern recognition
            • G06F 18/20: Analysing
              • G06F 18/285: Selection of pattern recognition techniques, e.g. of classifiers in a multi-classifier system
        • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
          • G06N 3/00: Computing arrangements based on biological models
            • G06N 3/02: Neural networks
              • G06N 3/04: Architecture, e.g. interconnection topology
                • G06N 3/045: Combinations of networks
              • G06N 3/08: Learning methods
                • G06N 3/084: Backpropagation, e.g. using gradient descent
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
      • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
        • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
          • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application discloses a data processing method, a data processing device and a terminal, and relates to the field of artificial intelligence. The method comprises the following steps: after the terminal acquires application data to be processed and scene awareness data, it selects, according to the scene awareness data that influences the processing of the application data and a preset condition, a first neural network suitable for processing the application data, and obtains a processing result of the application data by using the first neural network. In this way, the scene-specific requirements on the speed and accuracy of data processing are met, and user experience is improved. The scene awareness data is used to indicate the influencing factors when the terminal processes the application data. The preset condition is used to indicate the correspondence between scene data that influences the operation of a neural network and the neural network.

Description

Data processing method and device and terminal
Technical Field
The present application relates to the field of artificial intelligence, and in particular, to a data processing method, apparatus, and terminal.
Background
Artificial intelligence (AI) is a theory, method, technique and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. Research in the field of artificial intelligence includes machine learning (ML), natural language processing, computer vision, decision and reasoning, human-computer interaction, recommendation and search, AI basic theory, and the like. Neural networks are one machine learning method.
At present, a terminal (such as a smartphone or an autonomous vehicle) can process acquired data (such as images or speech) by using a configured neural network to realize application functions such as face recognition and speech recognition. However, the terminal generally processes data acquired in different scenes with a single neural network. Since data in different scenes have different characteristics, using one neural network to process data in all scenes cannot guarantee the scene-specific requirements on the speed and accuracy of data processing. Therefore, how to provide a data processing method that ensures the speed and accuracy of data processing is an urgent problem to be solved.
Disclosure of Invention
The application provides a data processing method, a data processing device and a terminal, so that the scene requirements of speed and precision of data processing are ensured.
In a first aspect, the present application provides a data processing method, which may be executed by a terminal, and specifically includes the following steps: after the terminal acquires application data to be processed and scene perception data, a first neural network for processing the application data is determined according to the scene perception data and preset conditions, and a processing result of the application data is obtained by using the first neural network. The scene awareness data is used to indicate the influencing factors of the terminal processing the application data. The preset condition is used for indicating the corresponding relation between the scene data influencing the operation of the neural network and the neural network.
Therefore, the terminal considers the scene perception data influencing the application data processing when selecting the first neural network for processing the application data, and selects the first neural network suitable for processing the application data according to the scene perception data and the preset conditions, so that the scene requirements of the data processing speed and precision are ensured, and the user experience is improved.
Wherein the application data includes data in the form of at least one of images, voice, and text. The terminal may obtain the application data through the sensor. The sensor comprises at least one of a camera, an audio sensor and a laser radar.
Understandably, the scene awareness data is used to indicate factors that affect speed and accuracy when the terminal processes the application data. Wherein the scene awareness data includes at least one of an external influence factor and an internal influence factor; the external influence factors are used for describing application scene characteristics of the application data acquired by the terminal, and the internal influence factors are used for describing operation scene characteristics of hardware resources of the application data operated by the terminal. Illustratively, the external influencing factors include at least one of temperature data, humidity data, light data, and time data. Illustratively, the internal influencing factors include at least one of a computing power of the processor, an available storage capacity, and an available remaining power.
The terminal selects the first neural network suitable for processing the application data according to the scene awareness data and the preset condition, so that, on the premise that the speed requirement is met, the first neural network is a larger-scale network that meets the accuracy requirement; or, on the premise that the accuracy requirement is met, the first neural network is a smaller-scale network that meets the speed requirement. In this way, the accuracy and speed of processing the application data are balanced, and user experience is improved.
In one possible implementation, determining a first neural network for processing application data according to the scene awareness data and a preset condition includes: the terminal determines parameters of a first neural network corresponding to scene data including scene perception data in preset conditions, and determines the first neural network from the super network according to the parameters of the first neural network. Wherein the parameters of the first neural network include the number of channels and the number of network layers. The hyper-network is used for determining a first neural network corresponding to the scene data. The first neural network is a sub-network in the super network. The sub-networks comprise a number of network layers that is smaller than the number of network layers comprised by the super-network, or the sub-networks comprise a number of channels that is smaller than the number of channels comprised by the super-network, each network layer comprising at least one neuron.
In another possible implementation, the terminal stores parameters of the super network and of at least one sub-network comprised by the super network; determining the first neural network from the super network based on the parameters of the first neural network comprises: the weights of the first neural network are determined from the weights of the super network according to the parameters of the first neural network. Since the sub-networks share the parameters of the super-network, the storage space of the terminal is effectively reduced compared with the storage of a plurality of sub-networks by the terminal.
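As a purely illustrative sketch of the selection and weight-sharing behaviour described above (not the patent's actual implementation), the following Python example looks up sub-network parameters from a preset condition keyed by scene data and slices the sub-network weights out of stored super-network weights. All names (PRESET_CONDITIONS, SubnetSpec, select_first_network) and the specific layer/channel values are assumptions made for the example.

```python
# Illustrative sketch only; names and values are assumptions, not the patent's API.
from dataclasses import dataclass
import numpy as np

@dataclass
class SubnetSpec:
    num_layers: int     # number of network layers of the selected sub-network
    num_channels: int   # number of channels kept in each layer

# Preset condition: correspondence between scene data and sub-network parameters.
PRESET_CONDITIONS = {
    ("sunny", "high_battery"): SubnetSpec(num_layers=2, num_channels=16),
    ("rainy", "high_battery"): SubnetSpec(num_layers=3, num_channels=32),
    ("sunny", "low_battery"):  SubnetSpec(num_layers=1, num_channels=8),
}

def select_first_network(scene_awareness, supernet_weights):
    """Look up the sub-network parameters for the observed scene and slice the
    sub-network weights out of the stored super-network weights (weight sharing)."""
    spec = PRESET_CONDITIONS.get(scene_awareness, SubnetSpec(num_layers=2, num_channels=16))
    subnet_weights = [W[:spec.num_channels, :spec.num_channels]
                      for W in supernet_weights[:spec.num_layers]]
    return spec, subnet_weights

# Toy super-network weights: 3 layers, each a 32x32 weight matrix.
supernet_weights = [np.zeros((32, 32)) for _ in range(3)]
spec, weights = select_first_network(("sunny", "low_battery"), supernet_weights)
print(spec, [w.shape for w in weights])   # 1 layer, an 8x8 slice of the shared weights
```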
In another possible implementation manner, after determining the first neural network for processing the application data according to the scene awareness data and the preset condition, the method further includes: and the terminal determines a second neural network from the hyper-network and obtains the processing result of the application data by utilizing the second neural network. The second neural network is a sub-network in the super-network, and the number of network layers contained in the second neural network is greater than or less than that of the network layers contained in the first neural network, or the number of channels contained in the second neural network is greater than or less than that of the channels contained in the first neural network. If the processing result of the first neural network does not meet the speed requirement and the precision requirement of the user, the terminal can also adjust the neural network, so that the terminal can meet the speed requirement and the precision requirement of the user by utilizing the processing result of the application data obtained by the second neural network.
In another possible implementation, the method further includes: and if the speed and the precision of the first neural network do not meet the requirements of the user, adjusting the scene data corresponding to the first neural network, so that the terminal utilizes the first neural network to process the application data again to obtain a processing result which meets the requirements of the speed and the precision of the user.
Optionally, the method further comprises: the terminal also displays the corresponding relation between the scene data influencing the speed and the precision of the first neural network and the first neural network, and a processing result. The user can visually see the processing result, so that the user can conveniently judge whether the running time meets the speed requirement of the user and judge whether the precision of the processing result meets the precision requirement of the user.
In a second aspect, the present application provides a data processing apparatus comprising means for performing the data processing method of the first aspect or any one of the possible designs of the first aspect.
In a third aspect, the present application provides a terminal comprising at least one processor and a memory, the memory being configured to store a set of computer instructions; when the processor executes the set of computer instructions, the terminal performs the operational steps of the data processing method of the first aspect or any possible implementation of the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium comprising: computer software instructions; the computer software instructions, when executed in the terminal, cause the terminal to perform the operational steps of the method as described in the first aspect or any one of the possible implementations of the first aspect.
In a fifth aspect, the present application provides a computer program product for causing a terminal to perform the operational steps of the method as described in the first aspect or any one of the possible implementations of the first aspect, when the computer program product runs on a computer.
The implementations provided by the above aspects can be further combined to provide additional implementations.
Drawings
Fig. 1 is a schematic structural diagram of a neural network provided in the present application;
FIG. 2 is a schematic diagram of a convolutional neural network according to the present application;
FIG. 3 is a block diagram of a data processing system according to the present application;
FIG. 4 is a flowchart of a method for generating preset conditions according to the present disclosure;
FIG. 5 is a schematic diagram of a super network and sub-networks according to the present application;
FIG. 6 is a schematic diagram of a pareto boundary provided herein;
FIG. 7 is a flow chart of a data processing method provided herein;
FIG. 8 is a flow chart of another data processing method provided herein;
FIG. 9 is a flow chart of another data processing method provided herein;
FIG. 10 is a schematic diagram of an interface for adjusting scene data according to the present application;
FIG. 11 is a schematic block diagram of a system provided herein;
FIG. 12 is a schematic diagram of a data processing apparatus according to the present application;
fig. 13 is a schematic structural diagram of a terminal provided in the present application.
Detailed Description
For the convenience of understanding, the related terms and related concepts such as neural networks referred to in the embodiments of the present application will be described below.
(1) Neural network
The neural network may be composed of neurons. A neuron may be an arithmetic unit that takes x_s and an intercept of 1 as inputs. The output of the arithmetic unit satisfies the following formula (1):

$$h_{W,b}(x) = f(W^{\top}x) = f\Big(\sum_{s=1}^{n} W_s x_s + b\Big) \qquad (1)$$

where s = 1, 2, ..., n, n is a natural number greater than 1, $W_s$ is the weight of $x_s$, and b is the bias of the neuron. f is the activation function of the neuron, which introduces a nonlinear characteristic into the neural network to convert the input signal of the neuron into an output signal. The output signal of the activation function may be used as the input of the next layer, and the activation function may be, for example, a sigmoid function. A neural network is a network formed by joining many such single neurons together, i.e. the output of one neuron may be the input of another neuron. The input of each neuron may be connected to the local receptive field of the previous layer to extract the features of that local receptive field, which may be a region composed of several neurons. The weights characterize the strength of the connections between different neurons; a weight determines the influence of an input on the output. A weight close to 0 means that changing the input does not change the output, and a negative weight means that increasing the input decreases the output.
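As a small numerical illustration of formula (1) with a sigmoid activation (the example values are arbitrary and not taken from the patent):

```python
import math

def neuron_output(x, w, b):
    """Single neuron of formula (1): f(sum_s W_s * x_s + b), with f a sigmoid."""
    z = sum(w_s * x_s for w_s, x_s in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))   # sigmoid activation

# Example: three inputs, three weights, one bias.
print(neuron_output(x=[0.5, -1.0, 2.0], w=[0.3, 0.8, -0.2], b=0.1))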
Fig. 1 is a schematic structural diagram of a neural network according to an embodiment of the present disclosure. The neural network 100 includes N processing layers, N being an integer greater than or equal to 3. The first layer of the neural network 100 is an input layer 110, which is responsible for receiving input signals, and the last layer of the neural network 100 is an output layer 130, which is responsible for outputting the processing results of the neural network. The other layers except the first and last layers are intermediate layers 140, and these intermediate layers 140 collectively constitute the hidden layers 120, and each intermediate layer 140 in the hidden layers 120 can receive either an input signal or an output signal. The hidden layer 120 is responsible for processing the input signal. Each layer represents a logic level of signal processing, and through multiple layers, data signals may be processed through multiple levels of logic.
The input signal to the neural network may be various forms of signals such as a video signal, a voice signal, a text signal, an image signal, a temperature signal, and the like in some possible embodiments. The image signal may be a landscape signal captured by a camera (image sensor), an image signal showing a community environment captured by a monitoring device, a face signal of a human face acquired by an access control system, or other sensor signals. The input signals of the neural network also include various other computer-processable engineering signals, which are not listed here. If the neural network is used for deep learning of the image signal, the image quality can be improved.
(2) Deep neural network
Deep neural networks (DNNs), also known as multi-layer neural networks, can be understood as neural networks with multiple hidden layers. According to the positions of the layers, the layers of a deep neural network can be divided into three types: the input layer, the hidden layers and the output layer. Generally, the first layer is the input layer, the last layer is the output layer, and all layers in between are hidden layers. The layers are fully connected, that is, any neuron of the i-th layer is connected with any neuron of the (i+1)-th layer.
Although deep neural networks seem complex, the work of each layer is actually not complex; each layer simply computes the following linear relation:

$$\vec{y} = \alpha(W\vec{x} + \vec{b})$$

where $\vec{x}$ is the input vector, $\vec{y}$ is the output vector, $\vec{b}$ is the offset (bias) vector, $W$ is the weight matrix (also called the coefficients), and $\alpha(\cdot)$ is the activation function. Each layer simply performs this operation on the input vector $\vec{x}$ to obtain the output vector $\vec{y}$. Because a deep neural network has many layers, the number of coefficients $W$ and offset vectors $\vec{b}$ is also large. These parameters are defined in the deep neural network as follows, taking the coefficient $W$ as an example: suppose that in a three-layer deep neural network, the linear coefficient from the 4th neuron of the second layer to the 2nd neuron of the third layer is defined as $W^{3}_{24}$. The superscript 3 represents the layer in which the coefficient $W$ is located, and the subscripts correspond to the output index 2 of the third layer and the input index 4 of the second layer.
In summary, the coefficient from the k-th neuron of layer L-1 to the j-th neuron of layer L is defined as $W^{L}_{jk}$. Note that the input layer has no $W$ parameters. In deep neural networks, more hidden layers make the network better able to depict complex situations in the real world. Theoretically, a model with more parameters has higher complexity and larger "capacity", which means that it can accomplish more complex learning tasks. The final purpose of training a deep neural network, that is, of learning the weight matrices, is to obtain the weight matrices of all layers of the trained deep neural network (the weight matrices formed by the vectors W of many layers).
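For illustration only, the following toy NumPy sketch performs the per-layer operation $\vec{y} = \alpha(W\vec{x} + \vec{b})$ for a small fully connected network; the layer sizes and random weights are assumptions made for the example.

```python
import numpy as np

def dnn_forward(x, weights, biases):
    """Forward pass of a fully connected DNN: each layer computes y = alpha(W x + b).
    weights[l] has shape (n_out, n_in); alpha is a sigmoid here."""
    y = x
    for W, b in zip(weights, biases):
        y = 1.0 / (1.0 + np.exp(-(W @ y + b)))   # alpha(W y + b)
    return y

# Toy network: 4 inputs -> 5 hidden units -> 2 outputs.
rng = np.random.default_rng(0)
Ws = [rng.standard_normal((5, 4)), rng.standard_normal((2, 5))]
bs = [np.zeros(5), np.zeros(2)]
print(dnn_forward(rng.standard_normal(4), Ws, bs))
```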
(3) Convolutional neural network
A Convolutional Neural Network (CNN) is a deep neural Network with a Convolutional structure. The convolutional neural network includes a feature extractor consisting of convolutional layers and sub-sampling layers. The feature extractor may be considered a filter and the convolution process may be considered as convolving an input image or feature map (feature map) with a trainable filter. The convolutional layer is a neuron layer for performing convolutional processing on an input signal in a convolutional neural network. In convolutional layers of convolutional neural networks, one neuron may be connected to only a portion of the neighbor neurons. One convolutional layer can output a plurality of feature maps, and the feature maps can refer to intermediate results in the operation process of the convolutional neural network. Neurons of the same signature share weights, where the shared weights are convolution kernels. Sharing weights may be understood as the way image information is extracted is location independent. That is, the statistics of a certain portion of the image are the same as other portions. Meaning that image information learned in one part can also be used in another part. The same learned image information can be used for all positions on the image. In the same convolution layer, a plurality of convolution kernels can be used to extract different image information, and generally, the greater the number of convolution kernels, the more abundant the image information reflected by the convolution operation.
The convolution kernel can be initialized in the form of a matrix of random size, and can be learned to obtain reasonable weights in the training process of the convolutional neural network. In addition, sharing weights brings the direct benefit of reducing connections between layers of the convolutional neural network, while reducing the risk of overfitting.
Exemplarily, as shown in fig. 2, a schematic structural diagram of a convolutional neural network provided in an embodiment of the present application is shown. Convolutional neural network 200 may include an input layer 210, a convolutional/pooling layer 220 (where pooling is optional), and a neural network layer 230.
Convolutional/pooling layer 220 may comprise, for example, layers 221 through 226. In one example, layer 221 may be, for example, a convolutional layer, layer 222 may be, for example, a pooling layer, layer 223 may be, for example, a convolutional layer, layer 224 may be, for example, a pooling layer, layer 225 may be, for example, a convolutional layer, and layer 226 may be, for example, a pooling layer. In another example, layers 221 and 222 may be, for example, convolutional layers, layer 223 may be, for example, a pooling layer, layers 224 and 225 may be, for example, convolutional layers, and layer 226 may be, for example, a pooling layer. The output of a convolutional layer may be used as input to a subsequent pooling layer, or may be used as input to another convolutional layer to continue the convolution operation.
The inner working principle of a convolutional layer will be described by taking convolutional layer 221 as an example.
Convolutional layer 221 may include a number of convolution operators, which may also be referred to as kernels. The role of the convolution operator in image processing is to act as a filter that extracts specific information from the input image matrix. The convolution operator may be essentially a weight matrix, which is usually predefined. In the process of performing convolution operation on an image, the weight matrix is usually processed on the input image pixel by pixel (or two pixels by two pixels, depending on the value of the step size (stride)) in the horizontal direction, so as to complete the work of extracting a specific feature from the image. The size of the weight matrix is related to the size of the image. It is to be noted that the depth dimension (depth dimension) of the weight matrix and the depth dimension of the input image are the same. In the process of performing the convolution operation, the weight matrix may extend to the entire depth of the input image. Thus, convolving with a single weight matrix will produce a single depth dimension of the convolved output, but in most cases not a single weight matrix is used, but a plurality of weight matrices of the same size (row by column), i.e. a plurality of matrices of the same type, are applied. The outputs of each weight matrix are stacked to form the depth dimension of the convolved image. Different weight matrices may be used to extract different features in the image, e.g., one weight matrix to extract image edge information, another weight matrix to extract a particular color of the image, yet another weight matrix to blur unwanted noise in the image, etc. The plurality of weight matrices have the same size (row × column), the feature maps extracted by the plurality of weight matrices having the same size also have the same size, and the extracted feature maps having the same size are combined to form the output of the convolution operation.
The weight values in these weight matrices need to be obtained through a large amount of training in practical application, and each weight matrix formed by the trained weight values can be used to extract information from the input image, so that the convolutional neural network 200 can make correct prediction.
When convolutional neural network 200 has multiple convolutional layers, the initial convolutional layer (e.g., layer 221) tends to extract more general features, which may also be referred to as low-level features. As the depth of convolutional neural network 200 increases, the features extracted by the convolutional layers (e.g., layer 226) further back become more complex, such as features with high levels of semantics, and the higher levels of semantics are more suitable for the problem to be solved.
Since it is often desirable to reduce the number of training parameters, it is often desirable to periodically introduce pooling layers after the convolutional layer. The layers 221 through 226, as exemplified by convolutional layer/pooling layer 220 in FIG. 2, may be one convolutional layer followed by one pooling layer, or multiple convolutional layers followed by one or more pooling layers. During image processing, the only purpose of the pooling layer is to reduce the spatial size of the image. The pooling layer may include an average pooling operator and/or a maximum pooling operator for sampling the input image to smaller sized images. The average pooling operator may calculate pixel values in the image over a certain range to produce an average as a result of the average pooling. The max pooling operator may take the pixel with the largest value in a particular range as a result of the max pooling. In addition, just as the size of the weighting matrix used in the convolutional layer should be related to the image size, the operators in the pooling layer should also be related to the image size. The size of the image output after the image processing by the pooling layer may be smaller than the size of the image input to the pooling layer, and each pixel point in the image output by the pooling layer represents an average value or a maximum value of a corresponding sub-region of the image input to the pooling layer.
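The convolution and pooling behaviour described above can be illustrated with a short PyTorch sketch (illustrative only; the channel counts, kernel sizes and image size are assumptions, not values from the patent):

```python
import torch
import torch.nn as nn

# A convolutional layer with several kernels: each kernel spans the full input depth,
# and the per-kernel outputs are stacked to form the depth dimension of the output.
conv = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3, stride=1, padding=1)
pool = nn.MaxPool2d(kernel_size=2)            # pooling only reduces the spatial size

image = torch.randn(1, 3, 32, 32)             # batch of one 3-channel 32x32 image
features = conv(image)                        # -> (1, 8, 32, 32): 8 feature maps
pooled = pool(features)                       # -> (1, 8, 16, 16)
print(features.shape, pooled.shape)
```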
After processing by convolutional layer/pooling layer 220, convolutional neural network 200 is not sufficient to output the required output information. Because, as previously described, the convolutional/pooling layer 220 extracts features and reduces parameters brought about by the input image. However, in order to generate the final output information (required class information or other relevant information), the convolutional neural network 200 needs to generate one or a set of the required number of classes of outputs using the neural network layer 230. Therefore, a plurality of hidden layers (such as the layer 231, the layer 232 to the layer 23n shown in fig. 2) and an output layer 240 may be included in the neural network layer 230, and parameters included in the hidden layers may be obtained by pre-training according to related training data of a specific task type, for example, the task type may include image recognition, image classification, image super-resolution reconstruction, and the like.
After the hidden layers in the neural network layer 230, the last layer of the whole convolutional neural network 200 is the output layer 240. The output layer 240 has a loss function similar to the categorical cross entropy, which is specifically used to calculate the prediction error. Once the forward propagation of the whole convolutional neural network 200 (i.e., the propagation from layer 210 to layer 240 in fig. 2) is completed, backward propagation (i.e., the propagation from layer 240 to layer 210 in fig. 2) starts to update the weights and biases of the aforementioned layers, so as to reduce the loss of the convolutional neural network 200, i.e., the error between the result output by the convolutional neural network 200 through the output layer and the ideal result.
It should be noted that the convolutional neural network 200 shown in fig. 2 is only an example of a convolutional neural network, and in a specific application, the convolutional neural network may also exist in the form of other network models.
(4) Loss function
In the process of training a deep neural network, because the output of the deep neural network is expected to be as close as possible to the value that is really desired, the weight vector of each layer of the neural network can be updated according to the difference between the predicted value of the network and the really desired target value (of course, an initialization process is usually carried out before the first update, i.e., parameters are preset for each layer of the deep neural network). For example, if the predicted value of the network is too high, the weight vectors are adjusted so that the prediction becomes lower, and the adjustment continues until the deep neural network can predict the really desired target value or a value very close to it. Therefore, it is necessary to define in advance "how to compare the difference between the predicted value and the target value". This is the role of loss functions (also called objective functions), which are important equations for measuring the difference between the predicted value and the target value. Taking the loss function as an example, a higher output value (loss) of the loss function indicates a greater difference, so training the deep neural network becomes a process of reducing this loss as much as possible.
(5) Back propagation algorithm
The convolutional neural network can adopt a Back Propagation (BP) algorithm to correct the size of parameters in the initial super-resolution model in the training process, so that the reconstruction error loss of the super-resolution model is smaller and smaller. Specifically, error loss is generated when the input signal is transmitted forward until the input signal is output, and parameters in the initial super-resolution model are updated by reversely propagating error loss information, so that the error loss is converged. The back propagation algorithm is a back propagation motion with error loss as a dominant factor, aiming at obtaining the optimal parameters of the super-resolution model, such as a weight matrix.
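As an illustration of the loss function and back propagation described above (and not of the patent's specific model), a minimal PyTorch training step might look like the following; the model shape, batch size and learning rate are assumptions.

```python
import torch
import torch.nn as nn

# Minimal training step: forward pass, loss, back propagation, weight update.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 10))
loss_fn = nn.CrossEntropyLoss()                       # measures the prediction error
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

inputs = torch.randn(8, 16)                           # a batch of 8 samples
targets = torch.randint(0, 10, (8,))                  # their class labels

logits = model(inputs)                                # forward propagation
loss = loss_fn(logits, targets)                       # difference to the target values
optimizer.zero_grad()
loss.backward()                                       # back propagation of the error loss
optimizer.step()                                      # adjust weights to reduce the loss
```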
Embodiments of the present application will be described in detail below with reference to the accompanying drawings.
Fig. 3 is a schematic architecture diagram of a data processing system according to an embodiment of the present application. As shown in fig. 3, the system 300 includes an execution device 310, a training device 320, a database 330, a terminal device 340, a data storage system 350, and a data collection device 360.
The execution device 310 may be a terminal, such as a mobile phone terminal, a tablet computer, a notebook computer, a Virtual Reality (VR)/Augmented Reality (AR) device, a vehicle-mounted terminal, or a peripheral device (e.g., a box carrying a chip with processing capability), or the like.
The training device 320 may be a server or a cloud device, etc. The training device 320 has strong computing power, and can operate a neural network, train the neural network, and perform other calculations.
As one possible embodiment, the execution device 310 and the training device 320 are different processors deployed on different physical devices (e.g., servers or servers in a cluster). For example, the execution device 310 may be a Central Processing Unit (CPU), other general purpose processor, a Digital Signal Processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, and so forth. A general purpose processor may be a microprocessor or any conventional processor or the like. The training device 320 may be a Graphics Processing Unit (GPU), a neural Network Processing Unit (NPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling the execution of programs according to the present disclosure.
The data collection device 360 is used to collect training data and test data and store the training data and test data in the database 330. The training data and the test data may each be in the form of at least one of images, speech and text. For example, the training data includes training images and objects in the training images. The test data includes a test image and an object in the test image. The features of the training image and the features of the test image are both features similar to features of the same application scene. The application scene characteristics at least comprise environmental characteristics (such as temperature, humidity, illumination and the like) and time characteristics (such as daytime period, night period, traffic peak period, traffic low peak period and the like), and the like.
The training device 320 is configured to implement a function of searching N sub networks (sub networks) from the super network 301 (super network) based on training data and test data maintained in the database 330, and generating a preset condition 302. The preset condition 302 is used for indicating the corresponding relation between the scene data influencing the operation of the neural network and the neural network. Wherein, the neural network refers to a sub-network searched from a super network. N is an integer greater than or equal to 2.
Illustratively, as shown in fig. 4, a description is made about a specific process of generating the preset condition.
Step 410, training device 320 trains the super-network and the M sub-networks sampled in the super-network based on the training data maintained in database 330.
The training device 320 may sample M sub-networks from the super-network according to preset rules. The predetermined rule is, for example, to select M sub-networks of different sizes from the first neuron element of the first network layer of the super-network. Optionally, training device 320 randomly samples M sub-networks from the super-network. M is an integer of 2 or more.
The training device 320 trains the super-network using the training data until the loss function in the super-network converges and the loss function value is less than a specific threshold, and the super-network training is completed, so that the super-network achieves a certain precision. Alternatively, if all of the training data in the training set is used for training, then the training of the super-network is completed.
The training device 320 trains the sub-network with the training data until the loss function in the sub-network converges and the value of the loss function is less than a certain threshold, and the sub-network training is completed, so that the sub-network achieves a certain accuracy. Alternatively, all of the training data in the training set is used for training, and the sub-network training is complete.
Understandably, a super network can serve as the underlying network for subnet searches. A super network is a very large neural network comprising a plurality of network layers, each network layer comprising a large number of neurons. Exemplarily, (a) in fig. 5 illustrates a structure of a super network. A sub-network is a part of a network in a super network. A subnetwork is a small neural network. As shown in fig. 5 (b), the sub-network includes three network layers, and the number of network layers included in the sub-network is smaller than the number of network layers included in the super-network shown in fig. 5 (a). Each layer of the super-network and the sub-network includes at least one neuron. As shown in fig. 5 (c), the sub-network includes a smaller number of neurons than the super-network shown in fig. 5 (a). The number of channels included in a sub-network may also be smaller than the number of channels included in the super-network shown in fig. 5 (a). The weights of the sub-networks share the weight of the super-network. For example, the sub-network includes a first network layer and a second network layer in the super-network, and a first neuron in the first network layer is connected to a first neuron in the second network layer. The weights of the first neuron in the first network layer connecting with the first neuron in the second network layer multiplex the weights of the first neuron in the first network layer connecting with the first neuron in the second network layer in the hyper-network. The neural network described in this embodiment may be a deep neural network. In particular, the neural network may be a convolutional neural network.
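To illustrate the weight sharing between a super network and its sub-networks, the hypothetical layer below (ElasticLinear is not a real PyTorch module, and the sizes are arbitrary) lets a sub-network reuse a slice of the super network's weight matrix instead of storing its own copy.

```python
import torch
import torch.nn as nn

class ElasticLinear(nn.Module):
    """One super-network layer whose sub-networks reuse (share) a slice of its weights."""
    def __init__(self, max_in, max_out):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(max_out, max_in) * 0.01)
        self.bias = nn.Parameter(torch.zeros(max_out))

    def forward(self, x, out_channels=None):
        out_channels = out_channels or self.weight.shape[0]
        in_channels = x.shape[-1]
        # The sub-network's weights are a slice of the super-network's weights.
        W = self.weight[:out_channels, :in_channels]
        b = self.bias[:out_channels]
        return torch.relu(x @ W.t() + b)

layer = ElasticLinear(max_in=64, max_out=128)
x = torch.randn(4, 32)             # sub-network input uses 32 of the 64 input channels
y = layer(x, out_channels=48)      # sub-network keeps 48 of the 128 output channels
print(y.shape)                     # torch.Size([4, 48])
```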
It should be noted that the training device 320 may perform multiple rounds of training on the super-network and the sub-network. For example, the training device 320 may re-randomly sample subnetworks from the super network at each round, training the super network and the randomly sampled subnetworks. After all the rounds of training device 320 are performed, the M subnetworks are sampled randomly, and training is performed on the M subnetworks. Training device 320 may randomly sample a different number of subnets per round; alternatively, the training device 320 may randomly sample different network-sized subnets at each round.
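A rough sketch of this round-based training of the super network and randomly sampled sub-networks is shown below; sample_subnet and train_step are hypothetical callables standing in for the patent's unspecified sampling and training procedures, and the depth/width ranges are assumptions.

```python
import random

def train_supernet_with_sampled_subnets(supernet, sample_subnet, train_step,
                                        num_rounds=100, subnets_per_round=4):
    """Each round: train the super-network, then train a few randomly sampled
    sub-networks; the sampled sub-networks share the super-network's weights."""
    for _ in range(num_rounds):
        train_step(supernet)                          # one training step on the super-network
        for _ in range(subnets_per_round):
            depth = random.randint(2, 8)              # random number of network layers
            width = random.choice([32, 64, 128])      # random number of channels
            subnet = sample_subnet(supernet, depth, width)
            train_step(subnet)                        # sub-network gradients update shared weights
```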
After the training device 320 has completed training the super-network and the subnetworks in the super-network, the training device 320 may search for subnetworks from the super-network, for example, the training device 320 may also perform steps 420 through 460 to obtain a plurality of subnetworks.
In step 420, training device 320 searches for Q subnetworks from the super network.
Step 430, training device 320 tests the accuracy of the Q subnetworks based on the test data maintained in database 330.
The test data for testing the accuracy of the sub-network comprises a large number of images. For each of the Q subnetworks, the training device 320 inputs a large number of images into the subnetwork to obtain the prediction. And comparing the prediction result with the original image to obtain the accuracy of the sub-network.
The training device 320 has strong computing power and can rapidly run the neural network, thereby effectively reducing the time for training and testing the neural network.
In step 440, the execution device 310 tests the operation duration of the Q subnets based on the test data maintained in the database 330.
Since the computing power of the execution device 310 is less than the computing power of the training device 320, the amount of test data used for the duration of the test execution device 310 while operating the sub-network is much less than the amount of test data used for testing the accuracy of the training device 320 while operating the sub-network. For example, the test data for the duration of the operation of the test executive device 310 on the sub-network may be an image.
For each of the Q subnetworks, the executive device 310 inputs test data into the subnetwork to obtain the prediction result and the operation duration of the subnetwork. The operation time of the sub-network may refer to a time period from when the execution device 310 operates the sub-network to process the test data to when the test result is obtained. Understandably, the longer the operation time of the sub-network is, the slower the data processing speed is; the shorter the operation time of the sub-network, the faster the data processing speed.
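Purely as an illustration of this run-time test, the sketch below times a sub-network on a single test sample. It assumes the sub-network is a callable PyTorch module; the helper name measure_runtime and the repeat count are assumptions.

```python
import time
import torch

def measure_runtime(subnet, test_image, repeats=10):
    """Run the sub-network on one test sample and report the average run time."""
    with torch.no_grad():
        subnet(test_image)                            # warm-up run
        start = time.perf_counter()
        for _ in range(repeats):
            prediction = subnet(test_image)
        elapsed = (time.perf_counter() - start) / repeats
    return prediction, elapsed
```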
It should be noted that, since different hardware resources (such as a processor, a memory, and a battery) are configured for different models of execution devices, the speed of the neural network operation of the execution devices configured with different hardware resources is also different. Therefore, the speed of the Q sub-networks is tested on the execution devices configured with different hardware resources, so that the training device 320 selects a neural network that runs as fast as possible for the execution devices configured with different hardware resources.
Step 450, the execution device 310 transmits the operation duration of the Q sub-networks to the training device 320.
It should be noted that, since the training device 320 has trained the super network and the randomly sampled large number of subnetworks according to the above step 410, Q subnetworks searched by the training device 320 from the super network are subnetworks with a certain accuracy, and the training device 320 performs step 460. The Q subnetworks may be part of the M subnetworks trained by the training device 320 according to step 410 described above. Alternatively, the Q subnetworks may be randomly sampled subnetworks of training device 320.
Optionally, the training device 320 may randomly search Q subnetworks from the super network for the first time, update the Q subnetworks according to the precision of the Q subnetworks and the operating duration of the Q subnetworks, perform step 430 to step 450 in a loop, obtain Q subnetworks with higher precision after updating the Q subnetworks through multiple iterations, and perform step 460 by the training device 320.
Step 460, the training device 320 searches N sub-networks from the Q sub-networks according to the precision of the Q sub-networks and the operating time of the Q sub-networks.
In one example, training device 320 may search N sub-networks from among Q sub-networks according to an evolutionary algorithm, reinforcement learning, or greedy algorithm. Q and N are integers which are greater than or equal to 2, and Q is greater than N.
In another example, the training device 320 may also count the accuracy of the Q sub-networks and the operating time of the Q sub-networks to generate the pareto boundary as shown in fig. 6. The horizontal axis represents error rate and the vertical axis represents network size. The network size may also be referred to as the size of the neural network. The larger the scale of the neural network is, the higher the precision of the neural network is, the lower the error rate is, and the longer the running time is; conversely, the smaller the scale of the neural network, the lower the precision of the neural network, the higher the error rate and the shorter the operation time. Therefore, the accuracy and speed of the neural network are inversely proportional.
From the perspective of the accuracy of the neural network, each point on the pareto boundary represents the fastest neural network at one accuracy. From the perspective of the velocity of the neural network, each point on the pareto boundary represents the most accurate neural network at one velocity.
The training device 320 may select N sub-networks from the Q sub-networks according to the pareto boundary trade-off between accuracy and speed of the neural network, and the training device 320 selects a neural network that meets the accuracy requirement and the speed requirement for execution devices of different hardware resource configurations.
In another example, the training device 320 may further search for Q sub-networks from the super-network according to an evolutionary algorithm, a reinforcement learning algorithm, or a greedy algorithm, the training device 320 generates the pareto boundary as shown in fig. 6 according to the accuracies of the Q sub-networks and the operation durations of the Q sub-networks, and the training device 320 may select N sub-networks from the Q sub-networks according to the pareto boundary to balance the accuracy and the speed of the neural network.
In another example, training device 320 may also search for N sub-networks from the super-network according to an evolutionary algorithm, a reinforcement learning, or a greedy algorithm.
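For illustration only, one simple way to obtain a pareto-optimal set from measured (error rate, run time) pairs is sketched below; the candidate values and the function name pareto_front are assumptions, not taken from the patent.

```python
def pareto_front(candidates):
    """candidates: list of (error_rate, run_time, subnet_id).
    Keep the sub-networks that are not dominated in both error rate and run time."""
    front = []
    for err, t, sid in candidates:
        dominated = any(e2 <= err and t2 <= t and (e2 < err or t2 < t)
                        for e2, t2, _ in candidates)
        if not dominated:
            front.append((err, t, sid))
    return sorted(front)

cands = [(0.10, 5.0, "A"), (0.08, 9.0, "B"), (0.12, 4.0, "C"), (0.11, 8.0, "D")]
print(pareto_front(cands))   # D is dominated by A; A, B and C remain on the boundary
```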
Optionally, the training device 320 may also train the N subnetworks by using the training data, adjust weights of the N subnetworks, and improve the accuracy of the N subnetworks.
Step 470, the training device 320 generates the preset condition according to the N subnetworks and the scene data associated with each of the N subnetworks.
Because the application data processed by one neural network can be acquired under different application scenes, and the different application scenes have different application scene characteristics, the accuracy of the neural network for processing the application data under different application scenes is different.
For example, for an automatic driving scene, when an automatic driving automobile passes through a barrier gate in a clear sky environment, because the sky is clear, the license plate image is clear enough, the accuracy of the license plate recognition by the execution equipment through the neural network with a smaller scale is higher, and the precision requirement of the license plate recognition can be met. The application scene characteristics include sufficient light and sunny days. For another example, when an automatic driving automobile passes through a barrier in a rainy day, because the sight line in the rainy day is unclear, the license plate image may be blurred, and the accuracy of recognizing the license plate by the execution equipment by using the neural network with a small scale is low, so that the precision requirement of license plate recognition cannot be met. The application scene features include insufficient light and rainy days.
For another example, for a gate opening scene, the pedestrian flow rate is large during peak hours, and the speed of identifying the pedestrian flow by the execution device by using the neural network with a large scale is slow, so that the speed requirement of identifying the pedestrian flow cannot be met. The application scenario features may include a commute peak period and a traffic volume.
Accordingly, the training device 320 may generate the preset conditions according to a large number of application scenario features and N subnetworks. The preset condition may include an association relationship between a large number of application scenario features and the N subnetworks. For example, the application scene characteristics include sunny days, cloudy days, and rainy days. The preset conditions comprise an association relationship between sunny days and a first sub-network, an association relationship between cloudy days and a second sub-network, and an association relationship between rainy days and a third sub-network. The size of the first sub-network is smaller than the size of the second sub-network. The size of the second sub-network is smaller than the size of the third sub-network.
The application scenario features may be obtained by the training device 320 analyzing the test data and the training data. Alternatively, the database 330 maintains the application scene characteristics in the application scene for collecting the test data and the application scene characteristics in the application scene for collecting the training data.
In addition, because the hardware resources of the execution device are dynamically changed, the duration of the execution device running the neural network may also be different under different hardware resources. For example, taking a mobile phone as an example, different users use the same brand of mobile phone, and due to different usage habits of different users, the available storage capacity and the available remaining power are different. Even with the same mobile phone, the applications, available storage capacity, and available remaining power that the mobile phone runs at different time periods are different. The time for operating the neural network may be longer when the mobile phone is in a low power state, so that the neural network with a smaller scale can be selected to process the application data when the mobile phone is in a low power state.
Therefore, the training device 320 may also receive the operation scenario features of the hardware resources when the N subnetworks are tested, which are transmitted by the execution device 310. The training device 320 generates preset conditions according to a large number of application scene features, operation scene features, and N sub-networks.
The operating scenario characteristic may include at least one of a computing power of the processor, an available storage capacity, and an available remaining power. The computational power of a processor may also be understood as the occupancy of the computational resources of the processor. For example, the operation scene characteristics include a low power, a medium power, and a high power. The preset conditions include the association of a low battery with a fourth sub-network, the association of a medium battery with a fifth sub-network, and the association of a high battery with a sixth sub-network. The size of the fourth sub-network is smaller than the size of the fifth sub-network. The size of the fifth sub-network is smaller than the size of the sixth sub-network.
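A hypothetical sketch of how step 470 might assemble such a preset condition from application scene features, operation scene features and the N searched sub-networks is given below; the feature names, orderings and sub-network identifiers are assumptions for illustration only.

```python
# Build a correspondence between scene features and the N sub-networks (smallest to largest).
subnetworks = ["subnet_small", "subnet_medium", "subnet_large"]          # N = 3, ordered by size

weather_features = ["sunny", "cloudy", "rainy"]                          # easier -> harder scenes
battery_features = ["high_battery", "medium_battery", "low_battery"]     # loose -> tight budget

preset_condition = {}
for i, weather in enumerate(weather_features):
    # Harder application scenes are associated with larger (more accurate) sub-networks.
    preset_condition[("weather", weather)] = subnetworks[i]
for i, battery in enumerate(battery_features):
    # Tighter operating budgets are associated with smaller (faster) sub-networks.
    preset_condition[("battery", battery)] = subnetworks[len(subnetworks) - 1 - i]

print(preset_condition)
```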
After the training device 320 generates the pre-condition 302 according to the above steps 410 to 470, the super network 301 and the pre-condition 302 may be configured to the execution device 310.
The execution device 310 is configured to implement a function of determining a first neural network for processing application data from the super network 301 according to the scene awareness data and the preset condition 302.
Since the execution device 310 selects the first neural network suitable for processing the application data from the super network 301 according to the scene awareness data and the preset condition 302, on the premise that the speed requirement is met, the first neural network is a larger-scale network that meets the accuracy requirement; or, on the premise that the accuracy requirement is met, the first neural network is a smaller-scale network that meets the speed requirement. In this way, the accuracy and speed of processing the application data are balanced, and user experience is improved. The first neural network is a sub-network in the super network 301.
A specific method for the execution device 310 to determine the first neural network to process the application data may be described with reference to fig. 7 below.
It should be noted that, in practical applications, the training data and the test data maintained in the database 330 are not necessarily both from the data acquisition device 360, and may be received from other devices. In addition, the training device 320 may not necessarily train the subnetwork based entirely on the training data maintained by the database 330, but may also obtain training data from the cloud or elsewhere to train the subnetwork. The performing device 310 may not necessarily test the sub-network based entirely on the test data maintained by the database 330, but may also obtain the test data from the cloud or elsewhere in order to test the speed at which the sub-network processes data based on the test data. The above description should not be taken as limiting the embodiments of the present application.
Further, the execution device 310 may be further subdivided into an architecture as shown in fig. 3 according to the functions performed by the execution device 310, and as shown, the execution device 310 is configured with a calculation module 311, an I/O interface 312, and a preprocessing module 313.
The I/O interface 312 is used for data interaction with external devices. A user may input data to the I/O interface 312 via the terminal device 340. The input data may comprise images or video. In addition, the input data may also come from database 330.
The preprocessing module 313 is configured to perform preprocessing according to input data received by the I/O interface 312. In an embodiment of the present application, the preprocessing module 313 can be used to identify application scenario features of application data received from the I/O interface 312.
During the process of preprocessing the input data by the execution device 310 or performing the calculation and other related processes by the calculation module 311 of the execution device 310, the execution device 310 may call the data, the code and the like in the data storage system 350 for corresponding processes, and may store the data, the instruction and the like obtained by corresponding processes in the data storage system 350.
For example, the hyper network 301 and preset conditions 302 stored by the execution device 310 may be applied to the execution device 310. After the execution device 310 obtains the application data, the calculation module 311 searches for a first neural network from the super network 301 according to the scene awareness data and the preset condition 302, and processes the application data by using the first neural network. Since the first neural network is determined by the execution device 310 based on the scene awareness data, the speed requirement and the precision requirement of the user on data processing can be met by processing the application data by using the first neural network.
Finally, the I/O interface 312 returns the processing result to the terminal device 340, thereby providing it to the user so that the user can view the processing result.
In the case shown in fig. 3, the user may manually give the input data, which may be operated through an interface provided by the I/O interface 312. Alternatively, the terminal device 340 may automatically send the input data to the I/O interface 312, and if the terminal device 340 is required to automatically send the input data to obtain the user's authorization, the user may set the corresponding authority in the terminal device 340. The user can view the processing result output by the execution device 310 at the terminal device 340, and the specific presentation form may be a specific manner such as display, sound, and action. The terminal device 340 may also be used as a data acquisition terminal, and acquires the input data of the input I/O interface 312 and the processing result of the output I/O interface 312 as new sample data, and stores the new sample data in the database 330. Of course, the input data input to the I/O interface 312 and the processing result output to the I/O interface 312 as shown in the figure may be stored in the database 330 as new sample data by the I/O interface 312 without being collected by the terminal device 340.
Fig. 3 is a schematic diagram of a system architecture provided by an embodiment of the present application, and the positional relationship between the devices, modules, and the like shown in fig. 3 does not constitute any limitation. For example, in fig. 3 the data storage system 350 is an external memory relative to the execution device 310, while in other cases the data storage system 350 may be disposed in the execution device 310.
Next, a data processing method provided in the embodiment of the present application is described in detail with reference to fig. 7 to 11. Fig. 7 is a schematic flowchart of a data processing method according to an embodiment of the present application. This is illustrated here by way of example by the execution device 310 in fig. 3. As shown in fig. 7, the method includes the following steps.
Step 710, the execution device 310 obtains application data to be processed.
The application data includes data in the form of at least one of images, voice, and text. The execution device 310 may acquire the application data to be processed through a sensor. The sensor includes at least one of a camera, an audio sensor, and a laser radar. Different applications may use different sensors to collect their application data.
For example, if the execution device 310 is an intelligent terminal, the application scenario is that the intelligent terminal recognizes a face of a user by using a face recognition function, so as to realize a function of unlocking the intelligent terminal. The intelligent terminal can shoot the face by utilizing the camera to obtain a face image, and the application data to be processed is the face image. In another example, the application scenario is that the intelligent terminal identifies the voice of the user by using the voice assistant function, so as to realize the function of unlocking the intelligent terminal. The intelligent terminal can obtain voice of a user by using the audio sensor to obtain audio data, and the application data to be processed is the audio data.
As another example, the execution device 310 may be a barrier gate monitor. When the application scene is an autonomous vehicle passing through the barrier gate, the barrier gate monitor identifies the license plate. The barrier gate monitor can photograph the license plate of the autonomous vehicle with its camera to obtain a license plate image, and the application data to be processed is the license plate image.
Step 720, the execution device 310 acquires scene awareness data.
As can be seen from the above explanation of step 460, the factors that influence the speed and accuracy with which a neural network processes the application data include the application scene characteristics of the application scene in which the application data is acquired and the operation scene characteristics of the hardware resources that run the neural network on the application data. Therefore, the execution device 310 acquires the scene awareness data after acquiring the application data to be processed. The scene awareness data is used to indicate influencing factors when the execution device 310 processes the application data.
Understandably, the scene awareness data is used to indicate factors that affect speed and accuracy when the execution device 310 processes the application data using a neural network. The scene awareness data includes at least one of an external influencing factor and an internal influencing factor. The external influencing factors describe the application scene characteristics when the execution device 310 acquires the application data, and include at least one of temperature data, humidity data, illumination data, and time data. The internal influencing factors describe the operation scene characteristics of the hardware resources with which the execution device 310 runs on the application data, and include at least one of the computing power of the processor, the available storage capacity, and the available remaining power.
If the scene awareness data includes external influencing factors, in one possible implementation the execution device 310 may obtain the scene awareness data from other devices. For example, if the scene awareness data is temperature data, the execution device 310 may obtain the temperature data from a temperature sensor; if the scene awareness data is humidity data, the execution device 310 may obtain the humidity data from a humidity sensor. In another possible implementation, the execution device 310 may derive the scene awareness data from the application data to be processed. For example, if the application data is an image and the scene awareness data is brightness or illumination intensity, the execution device 310 may analyze the image to obtain the brightness or illumination intensity.
If the scene awareness data includes internal influencing factors, a controller in the execution device 310 may monitor the processor, the memory, and the battery of the execution device 310 to obtain at least one of the computing power of the processor, the available storage capacity, and the available remaining power. The controller may be a central processing unit in the execution device 310.
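As a minimal sketch, the scene awareness data could be organized as a simple record; the field names below are invented for illustration and are not taken from the embodiment.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SceneAwarenessData:
    """Hypothetical container for the influencing factors described above."""
    # External influencing factors (application scene characteristics)
    temperature: Optional[float] = None        # from a temperature sensor
    humidity: Optional[float] = None           # from a humidity sensor
    illumination: Optional[float] = None       # estimated from the captured image, in lux
    timestamp: Optional[float] = None          # time data
    # Internal influencing factors (operation scene characteristics)
    compute_capability: Optional[float] = None # computing power of the processor
    available_storage: Optional[int] = None    # available storage capacity in bytes
    remaining_power: Optional[float] = None    # available remaining power, 0.0 to 1.0
```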
Step 730, the execution device 310 determines a first neural network for processing the application data according to the scene awareness data and the preset condition.
The preset condition is used for indicating the corresponding relation between the scene data influencing the operation of the neural network and the neural network.
In a first possible scenario, the scene data includes an application scene identifier and an application scene feature. The application scene identifier is used to indicate an application scene. An application scene may have one or more application scene features; that is, one application scene identifier may be associated with one application scene feature or with a plurality of application scene features. For example, the application scene is a barrier gate scene of a residential community garage, and the application scene features include sunny days, cloudy days, and rainy days.
The preset condition is used for indicating the corresponding relation between the application scene characteristics influencing the operation of the neural network and the neural network. Each application scenario feature is associated with a neural network. The preset condition may include a correspondence relationship between a plurality of application scenario features and the neural network. In one example, the correspondence of the application scenario features to the neural network may be presented in a tabular form, as shown in table 1.
TABLE 1
Application scene identifier | Application scene feature                    | Neural network
Identifier 1                 | Sunny day                                    | Neural network 1
Identifier 1                 | Cloudy day                                   | Neural network 2
Identifier 1                 | Rainy day                                    | Neural network 3
Identifier 2                 | Strong illumination                          | Neural network 4
Identifier 2                 | Medium illumination                          | Neural network 5
Identifier 2                 | Weak illumination                            | Neural network 6
Identifier 3                 | Usage interval duration of the application  | Neural network 7
As can be seen from table 1, the application scene features associated with the application scene indicated by the identifier 1 include sunny days, cloudy days, and rainy days. A sunny day has a binding relationship with the neural network 1, a cloudy day has a binding relationship with the neural network 2, and a rainy day has a binding relationship with the neural network 3. The application scene features associated with the application scene indicated by the identifier 2 include strong illumination, medium illumination, and weak illumination. Strong illumination has a binding relationship with the neural network 4, medium illumination has a binding relationship with the neural network 5, and weak illumination has a binding relationship with the neural network 6. The application scene feature associated with the application scene indicated by the identifier 3 is the usage interval duration of the application, which has a binding relationship with the neural network 7.
Assume that the application scene indicated by the identifier 1 is the barrier gate scene of a residential community garage. When a vehicle passes through the barrier gate in a sunny environment, the scene awareness data obtained by the execution device 310 include that the vehicle is passing the barrier gate and that the day is sunny. The execution device 310 looks up table 1 according to the scene awareness data, determines that the application scene is the scene indicated by the identifier 1, determines that the relevant application scene feature of that scene is a sunny day, and determines the neural network 1 corresponding to the sunny day as the first neural network.
Since the sky is clear, the license plate image captured by the execution device 310 is sufficiently clear, so the first neural network determined by the execution device 310 according to the application scene feature and the preset condition may be a smaller-scale neural network. Even with this smaller-scale neural network, the execution device 310 recognizes the license plate with high accuracy and meets the precision requirement of license plate recognition; moreover, its running time is short, which meets the speed requirement of license plate recognition.
As another example, when a vehicle passes through the barrier gate in a heavy-rain environment, the scene awareness data obtained by the execution device 310 include that the vehicle is passing the barrier gate and that the day is rainy. The execution device 310 looks up table 1 according to the scene awareness data, determines that the application scene is the scene indicated by the identifier 1, determines that the relevant application scene feature is a rainy day, and determines the neural network 3 corresponding to the rainy day as the first neural network. The scale of the neural network 3 is larger than that of the neural network 1.
Since visibility is poor in rainy weather, the license plate image captured by the execution device 310 may be blurred. If the execution device 310 continued to use the smaller-scale neural network, the license plate recognition accuracy would be low and the precision requirement of license plate recognition could not be met. The first neural network determined by the execution device 310 according to the application scene feature and the preset condition may therefore be a larger-scale neural network, with which the execution device 310 recognizes the license plate with high accuracy, meeting the precision requirement of license plate recognition while still ensuring the speed requirement.
Therefore, the execution device 310 dynamically selects the first neural network for processing the application data according to the application scene characteristics, so that the speed is as fast as possible on the premise of meeting the accuracy requirement, or the accuracy is as high as possible on the premise of meeting the speed requirement, the accuracy requirement and the speed requirement during processing the application data are balanced, and the user experience is improved.
It should be noted that table 1 merely illustrates, in tabular form, one way the correspondence may be stored in the storage device; it does not limit the storage form, and the correspondence may of course be stored in other forms, which is not limited in this embodiment.
In addition, an application scene feature may be expressed as a specific numerical value or a value range. For example, if the application scene feature is the illumination intensity, its value ranges may include strong illumination, medium illumination, and weak illumination. As another example, if the application scene feature is the usage interval duration of the application, it may be a value such as 3 seconds. The more value ranges an application scene feature is divided into, the more accurately the execution device 310 can determine the first neural network for processing the application data.
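For example, the value-range idea for the illumination feature could be sketched as follows; the lux thresholds and the network identifiers are assumptions for illustration only.

```python
# Map a measured illumination value to a value range, then to a network identifier.
ILLUMINATION_BUCKETS = [            # (lower bound in lux, bucket name); thresholds are assumed
    (10000.0, "strong illumination"),
    (1000.0, "medium illumination"),
    (0.0, "weak illumination"),
]
PRESET_CONDITION_ID2 = {            # mirrors the rows for identifier 2 in table 1
    "strong illumination": "neural network 4",
    "medium illumination": "neural network 5",
    "weak illumination": "neural network 6",
}

def select_network_by_illumination(lux: float) -> str:
    for lower_bound, bucket in ILLUMINATION_BUCKETS:
        if lux >= lower_bound:
            return PRESET_CONDITION_ID2[bucket]
    return PRESET_CONDITION_ID2["weak illumination"]
```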
In a second possible scenario, the scene data includes an operation scene identifier and an operation scene feature. The operation scene identifier is used to indicate an operation scene. An operation scene may have one or more operation scene features; that is, one operation scene identifier may be associated with one operation scene feature or with a plurality of operation scene features. For example, the operation scene is the available remaining power, and the operation scene features include high power, medium power, and low power.
The preset condition is used for indicating the corresponding relation between the operation scene characteristics influencing the operation of the neural network and the neural network. Each operational scenario feature is associated with a neural network. The preset condition may include a correspondence relationship between a plurality of operation scene features and the neural network. In one example, the correspondence of the operational scenario features to the neural network may be presented in a tabular form, as shown in table 2.
TABLE 2
Operation scene identifier | Operation scene feature            | Neural network
Identifier 4               | Available remaining power high     | Neural network 8
Identifier 4               | Available remaining power medium   | Neural network 9
Identifier 4               | Available remaining power low      | Neural network 10
As can be seen from table 2, the operation scene indicated by the identifier 4 is an available-remaining-power scene of the execution device 310. The operation scene features associated with this scene include high available remaining power, medium available remaining power, and low available remaining power. High available remaining power has a binding relationship with the neural network 8, medium available remaining power has a binding relationship with the neural network 9, and low available remaining power has a binding relationship with the neural network 10.
An operation scene feature may also be expressed as a specific numerical value or a value range. For example, for the available remaining power, high available remaining power may correspond to the range from 100% to 60%, medium available remaining power to the range from 60% to 30%, and low available remaining power to the range from 30% to 5%. The more value ranges an operation scene feature is divided into, the more accurately the execution device 310 can determine the first neural network for processing the application data.
Assume that when the execution device 310 needs to start the voice assistant function, it determines that the remaining power is less than 20%, so the scene awareness data indicate that the power is below 20%. The execution device 310 looks up table 2 according to the scene awareness data, determines that the operation scene is the scene indicated by the identifier 4, determines that the relevant operation scene feature is low available remaining power, and determines the neural network 10 corresponding to low available remaining power as the first neural network.
Because the available remaining power of the execution device 310 is low, if the execution device 310 continued to use a large-scale neural network for speech recognition, the speed requirement of speech recognition could not be met. In this case, the first neural network determined by the execution device 310 according to the operation scene feature and the preset condition may be a smaller-scale neural network. With this smaller-scale neural network the execution device 310 can recognize the user's voice, meet the speed requirement of speech recognition while ensuring the accuracy requirement, and save power.
As another example, if the execution device 310 determines that the remaining power is above 80%, the scene awareness data indicate that the power is above 80%. The execution device 310 looks up table 2 according to the scene awareness data, determines that the operation scene is the scene indicated by the identifier 4, determines that the relevant operation scene feature is high available remaining power, and determines the neural network 8 corresponding to high available remaining power as the first neural network.
Since the available remaining power of the execution device 310 is high, the first neural network determined by the execution device 310 according to the operation scene feature and the preset condition may be a larger-scale neural network. With this larger-scale neural network the execution device 310 can recognize the user's voice and meet the accuracy requirement of speech recognition while ensuring the speed requirement.
Therefore, the execution device 310 dynamically selects the first neural network for processing the application data according to the running scene characteristics, so that the speed is as fast as possible on the premise of meeting the precision requirement, or the precision is as high as possible on the premise of meeting the speed requirement, the precision requirement and the speed requirement during processing the application data are balanced, and the user experience is improved.
It should be noted that table 2 merely illustrates, in tabular form, one way the correspondence may be stored in the storage device; it does not limit the storage form, and the correspondence may of course be stored in other forms, which is not limited in this embodiment.
In a third possible scenario, the scene data includes both application scene features and operation scene features. For explanations of the application scene features and the operation scene features, refer to the first and second possible scenarios described above. One application scene feature may correspond to one or more operation scene features, and one operation scene feature may correspond to one or more application scene features. The embodiment of the present application does not limit the association between application scene features and operation scene features; the specific association can be set according to the application scene.
In one example, the correspondence of the scene data to the neural network may be presented in a tabular form, as shown in table 3.
TABLE 3
Application scene identifier | Application scene feature | Operation scene feature           | Neural network
Identifier 7                 | Sunny day                 | Available remaining power high    | Neural network 17
Identifier 7                 | Sunny day                 | Available remaining power low     | Neural network 18
Identifier 8                 | Rainy day                 | Available remaining power high    | Neural network 19
Identifier 8                 | Rainy day                 | Available remaining power low     | Neural network 20
As can be seen from table 3, the application scene feature associated with the application scene indicated by the identifier 7 is a sunny day, and the associated operation scene features include high available remaining power and low available remaining power. The combination of a sunny day and high available remaining power has a binding relationship with the neural network 17, and the combination of a sunny day and low available remaining power has a binding relationship with the neural network 18.
Assume that the application scene indicated by the identifier 7 is an autonomous vehicle passing through a barrier gate. When the autonomous vehicle passes through the barrier gate in a sunny environment, the scene awareness data obtained by the execution device 310 include that the vehicle is passing the barrier gate, that the day is sunny, and that the available remaining power is high. The execution device 310 looks up table 3 according to the scene awareness data, determines that the application scene is the scene indicated by the identifier 7, determines that the application scene feature is a sunny day and the operation scene feature is high available remaining power, and determines the neural network 17 corresponding to this combination as the first neural network.
Since the sky is clear, the license plate image captured by the execution device 310 is sufficiently clear, and the execution device 310 could already recognize the license plate with a smaller-scale neural network, meeting the speed requirement of license plate recognition while ensuring the precision requirement. Because the available remaining power of the execution device 310 is high, however, the first neural network determined according to the application scene feature, the operation scene feature, and the preset condition can further improve the accuracy of license plate recognition while still meeting the speed requirement, thereby better satisfying the precision requirement of license plate recognition.
If the available remaining power of the execution device 310 is low, the low power may limit how long the execution device 310 can run the neural network, so the execution device 310 may select the neural network 18, which corresponds to a sunny day with low available remaining power, as the first neural network. The neural network 18, whose scale is smaller than that of the neural network 17, recognizes the license plate and meets the speed requirement of license plate recognition while ensuring the precision requirement.
As another example, when the autonomous vehicle passes through the barrier gate in a rainy environment, the scene awareness data acquired by the execution device 310 include that the vehicle is passing the barrier gate, that the day is rainy, and that the available remaining power is high. The execution device 310 looks up table 3 according to the scene awareness data, determines that the application scene is the scene indicated by the identifier 8, determines that the application scene feature is a rainy day and the operation scene feature is high available remaining power, and determines the neural network 19 corresponding to this combination as the first neural network.
Since visibility is poor in rainy weather, the license plate image captured by the execution device 310 may be blurred, and only a larger-scale neural network can recognize the license plate accurately enough to meet the precision requirement while ensuring the speed requirement. Because the available remaining power of the execution device 310 is high, the first neural network determined according to the application scene feature, the operation scene feature, and the preset condition may be a neural network that further improves the accuracy of license plate recognition while still meeting the speed requirement, thereby satisfying the precision requirement of license plate recognition.
If the available remaining power of the execution device 310 were low, using a large-scale neural network to recognize the license plate would take a long time and might not meet the speed requirement of license plate recognition. Accordingly, the execution device 310 may determine the neural network 20, which corresponds to a rainy day with low available remaining power, as the first neural network. The neural network 20, whose scale is smaller than that of the neural network 19, recognizes the license plate and meets the speed requirement of license plate recognition while ensuring the precision requirement.
Therefore, the execution device 310 dynamically selects the first neural network for processing the application data according to the application scene characteristics and the operation scene characteristics, so that the speed is as fast as possible on the premise of meeting the precision requirement, or the precision is as high as possible on the premise of meeting the speed requirement, the precision requirement and the speed requirement during processing the application data are balanced, and the user experience is improved.
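A minimal sketch of a joint lookup over an application scene feature and an operation scene feature, mirroring the entries of table 3; the 60% power threshold and the key names are assumed values for illustration.

```python
# Joint (application scene feature, operation scene feature) lookup, as in table 3.
PRESET_CONDITION_TABLE3 = {
    ("sunny", "power_high"): "neural network 17",
    ("sunny", "power_low"): "neural network 18",
    ("rainy", "power_high"): "neural network 19",
    ("rainy", "power_low"): "neural network 20",
}

def select_network(weather: str, remaining_power: float) -> str:
    power_bucket = "power_high" if remaining_power >= 0.6 else "power_low"  # assumed threshold
    return PRESET_CONDITION_TABLE3[(weather, power_bucket)]

# e.g. select_network("rainy", 0.85) returns "neural network 19"
```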
The method flow illustrated in fig. 8 is illustrative of the specific operational procedure involved in step 730 of fig. 7.
Step 7301, the execution device 310 determines, from the preset condition, the parameters of the first neural network corresponding to the scene data that includes the scene awareness data.
It is understood that the scene data is data predetermined for generating the preset condition. The scene awareness data is data acquired by the execution device 310 in real time according to the application data.
In some embodiments, the execution device 310 may find in the preset condition the application scene feature indicated by the scene awareness data (as in the first possible scenario above), the operation scene feature indicated by the scene awareness data (as in the second possible scenario above), or both the application scene feature and the operation scene feature indicated by the scene awareness data (as in the third possible scenario above). In each case, the execution device 310 processes the application data with the first neural network associated with that scene data, which balances the precision and speed of processing the application data, satisfies the user requirement, and improves the user experience.
Step 7302, the execution device 310 determines the first neural network from the super network based on the parameters of the first neural network.
The preset condition may further include an identifier of the first neural network and the parameters of the first neural network. The parameters of the first neural network include the number of channels and the number of network layers. The first neural network may be a sub-network in a super network, and the super network is used to determine the first neural network corresponding to the scene data. A channel may refer to the output of a convolutional layer in a convolutional neural network, so a channel may also be referred to as a feature map. A network layer may be a convolutional layer in a convolutional neural network.
The execution device 310 may determine the first neural network from the super network based on the parameters of the first neural network. Specifically, the execution device 310 determines the weights of the first neural network from the weights of the super network according to the parameters of the first neural network. Since the sub-networks share the parameters of the super network, the storage space of the execution device 310 is effectively reduced compared with storing multiple sub-networks separately.
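The weight-sharing idea can be illustrated with the following sketch, assuming PyTorch. It only shows how a sub-network can reuse a slice of the super network's convolution weights selected by channel count; it is not the embodiment's actual implementation.

```python
import torch
import torch.nn as nn

class SuperNetConv(nn.Module):
    """Hypothetical super-network convolution layer whose weights are shared:
    a sub-network simply uses the first `out_channels` filters."""
    def __init__(self, max_in: int = 64, max_out: int = 128):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(max_out, max_in, 3, 3))
        self.bias = nn.Parameter(torch.zeros(max_out))

    def forward(self, x: torch.Tensor, out_channels: int) -> torch.Tensor:
        in_channels = x.shape[1]
        w = self.weight[:out_channels, :in_channels]   # slice of the shared weights
        b = self.bias[:out_channels]
        return nn.functional.conv2d(x, w, b, padding=1)

# The first neural network is obtained by choosing channel (and, for the whole
# network, layer) numbers rather than by storing a separate set of weights.
layer = SuperNetConv()
x = torch.randn(1, 16, 32, 32)
y = layer(x, out_channels=32)    # a sub-network that uses 32 of the 128 filters
```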
Step 740, the execution device 310 obtains the processing result of the application data by using the first neural network.
The execution device 310 inputs the application data into the first neural network, and obtains a processing result of the application data by using the first neural network. For example, in an application scenario, the intelligent terminal utilizes a face recognition function to recognize a face of a user, so as to realize a function of unlocking the intelligent terminal. The application data is a face image, and the execution device 310 inputs the face image into the first neural network to obtain a processing result. If the processing result is that the face recognition is successful, the intelligent terminal is unlocked successfully; and if the processing result is that the face recognition fails, unlocking the intelligent terminal fails.
Further, the execution device 310 may also make adjustments after determining the first neural network. The method flow depicted in fig. 9 is a complementary illustration of the method depicted in fig. 7.
Step 750, the execution device 310 displays the correspondence between the scene data affecting the running speed and accuracy of the first neural network and the first neural network, together with the processing result.
The processing result may include the running duration and the accuracy. The running duration may refer to the time from when the execution device 310 starts running the first neural network on the application data until the processing result is obtained: the execution device 310 starts timing when it inputs the application data into the first neural network, stops timing when the first neural network outputs the processing result, and thereby obtains the running duration. The accuracy of the processing result indicates the degree to which the result of processing the application data with the first neural network meets the user's accuracy requirement; the execution device 310 can obtain it through user feedback information. By displaying the correspondence between the scene data affecting the running speed and accuracy of the first neural network and the first neural network, together with the processing result, the execution device 310 lets the user see the processing result intuitively and judge whether the running duration meets the user's speed requirement and whether the accuracy of the processing result meets the user's accuracy requirement.
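A minimal sketch of how the running duration could be measured, assuming the first neural network is exposed as a Python callable; the names are illustrative.

```python
import time

def run_with_timing(network, app_data):
    """Return the processing result and the running duration in seconds."""
    start = time.perf_counter()
    result = network(app_data)                 # e.g. the first neural network's inference
    run_duration = time.perf_counter() - start
    return result, run_duration
```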
If the processing result of the first neural network does not meet the user's speed requirement or accuracy requirement, the execution device 310 may further perform step 760 and step 770, or may further perform step 780.
Step 760, the execution device 310 determines a second neural network from the super network.
The second neural network may be a user-specified neural network, e.g., the second neural network may be a sub-network in a super-network. The rules for the user to specify the neural network are as follows: if the user needs to improve the accuracy of the execution device 310 in processing the application data, the second neural network includes a greater number of network layers than the first neural network includes, or the second neural network includes a greater number of channels than the first neural network includes. If the user needs to increase the speed of the execution device 310 processing the application data, i.e. reduce the running time of the execution device 310 processing the application data, the number of network layers included in the second neural network is less than the number of network layers included in the first neural network, or the number of channels included in the second neural network is less than the number of channels included in the first neural network.
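The rule for specifying the second neural network might be sketched as follows; the parameter names and the step sizes (two layers, a factor of two in channels) are arbitrary illustrative choices, not values taken from the embodiment.

```python
# Illustrative rule for deriving the second neural network's parameters.
def choose_second_network_params(first_params: dict, need_higher_accuracy: bool) -> dict:
    layers = first_params["num_layers"]
    channels = first_params["num_channels"]
    if need_higher_accuracy:
        # more network layers or more channels than the first neural network
        return {"num_layers": layers + 2, "num_channels": channels * 2}
    # otherwise shrink the network to reduce the running duration
    return {"num_layers": max(1, layers - 2), "num_channels": max(8, channels // 2)}
```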
Step 770, the execution device 310 obtains the processing result of the application data by using the second neural network.
Step 780, the execution device 310 adjusts the scene data corresponding to the first neural network.
If the precision of the processing result obtained by the execution device 310 using the first neural network cannot meet the user's precision requirement, or the running duration cannot meet the user's speed requirement, one possible cause is that too many application scene features and operation scene features are associated with the first neural network. In that case, the execution device 310 may modify at least one of the application scene features and the operation scene features associated with the first neural network.
For example, suppose the application scene is identifying pedestrian flow during the commuting rush hour. Because the pedestrian flow during rush hour is very large, if the execution device 310 kept using a smaller-scale neural network, the accuracy of pedestrian flow identification would be low and could not meet the precision requirement. The execution device 310 may therefore divide the rush hour into a plurality of time periods according to the pedestrian flow, with each period associated with its own neural network, so that a different neural network is used in each period. For example, in a period with heavy pedestrian flow, the execution device 310 identifies the flow with a smaller-scale neural network, meeting the speed requirement of pedestrian flow identification while ensuring the accuracy requirement; in a period with light pedestrian flow, the execution device 310 identifies the flow with a larger-scale neural network, meeting the accuracy requirement while ensuring the speed requirement.
As another example, the application scene is that the intelligent terminal uses the face recognition function to recognize the user's face in order to unlock the terminal, and the user performs face recognition twice within a short time. If the intelligent terminal kept using a larger-scale neural network, the accuracy would be high and the precision requirement of face recognition would be met, but the running time of face recognition would be long and could not meet the user's speed requirement. The intelligent terminal can therefore adjust the correspondence between a preset face-recognition interval and the neural network: if the intelligent terminal detects that the time interval between two face recognitions falls within the preset interval, it determines that the first neural network associated with the preset interval can be a smaller-scale neural network. Using this smaller-scale neural network for face recognition meets the speed requirement while ensuring the precision requirement.
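Adjusting the scene data in the rush-hour example above could look like the following sketch, where the single rush-hour entry of the preset condition is split into finer time slots, each bound to its own network; the time slots and network names are invented for illustration.

```python
# Replace one coarse scene-data entry with several finer ones bound to different networks.
preset_condition = {("rush_hour", "17:00-20:00"): "large network"}

def split_rush_hour(preset_condition: dict) -> dict:
    preset_condition.pop(("rush_hour", "17:00-20:00"), None)
    preset_condition[("rush_hour_peak", "17:30-18:30")] = "small, fast network"
    preset_condition[("rush_hour_shoulder", "17:00-17:30")] = "large, accurate network"
    preset_condition[("rush_hour_shoulder", "18:30-20:00")] = "large, accurate network"
    return preset_condition
```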
Fig. 10 is a schematic interface diagram for adjusting scene data according to an embodiment of the present application. Assume the execution device 310 is an intelligent terminal. As shown in (a) of fig. 10, the intelligent terminal displays the result of face recognition: face recognition was performed twice within ten minutes, and the duration of face recognition was 4 minutes. Optionally, the intelligent terminal may also display a "whether to update" button 1010. If the user clicks the "whether to update" button 1010, the intelligent terminal may display the interface shown in (b) of fig. 10, which asks whether, for two face recognitions within two minutes, the duration of the first face recognition should be 4 minutes and the duration of the second face recognition 2 minutes. The interface may display a "yes" button 1020 and a "no" button 1030. If the user clicks the "yes" button 1020, then when the intelligent terminal performs face recognition twice within two minutes, the duration of the first face recognition is 4 minutes and the duration of the second face recognition is 2 minutes.
It should be noted that the application scenarios described in the embodiment of the present application may include a target detection scenario, a monitoring scenario, a voice recognition scenario, a commodity recommendation scenario, and the like.
Object detection is an important component of computer vision. Computer vision, which is an integral part of various intelligent/autonomous systems in various application fields such as manufacturing, inspection, document analysis, and medical diagnosis, is a study on how to use cameras/camcorders and computers to acquire data and information of a subject to be photographed, which are required by a user. In a descriptive sense, a computer is provided with eyes (camera/camcorder) and a brain (algorithm) to recognize and measure an object, etc. instead of human eyes, thereby enabling the computer to perceive the environment. Because perception can be viewed as extracting information from sensory signals, computer vision can also be viewed as the science of how to make an artificial system "perceive" from images or multidimensional data. Generally, computer vision is to use various imaging systems to obtain input information instead of visual organs, and then use computer to process and interpret the input information instead of brain. The ultimate research goal of computer vision is to make a computer have the ability to adapt to the environment autonomously by visually observing and understanding the world like a human.
The target detection method can be applied to the fields of face detection, vehicle detection, pedestrian counting, automatic driving, safety systems, medical treatment and the like. For example, in an autonomous driving scenario, an autonomous vehicle recognizes objects in the surrounding environment during driving to adjust the speed and direction of the autonomous vehicle, so that the autonomous vehicle can safely drive and avoid traffic accidents. The object may be another vehicle, a traffic control device, or another type of object. As another example, in a security system, a large number of users are identified to assist a worker in identifying a target person as quickly as possible. Generally, input data (such as an image or a video) is input to a neural network having a target detection function, the neural network performs feature extraction on the input data, and target detection is performed based on the extracted features to obtain a detection result.
In addition, the execution device 310 may already have stored the super network and the preset condition before performing step 730, in which the execution device 310 determines the first neural network for processing the application data according to the scene awareness data and the preset condition. In that case, the execution device 310 may read the super network and the preset condition from memory and then determine the first neural network for processing the application data according to the scene awareness data and the preset condition.
Alternatively, the execution device 310 does not store the super network and the preset conditions, and the super network and the preset conditions need to be downloaded from the server. The server may refer to a cloud server.
For example, fig. 11 is a schematic structural diagram of a system 1100 provided herein. As shown in fig. 11, the system 1100 may be an entity that provides a cloud service to users by using basic resources. The system 1100 includes a cloud data center 1110. The cloud data center 1110 includes a pool of device resources (including computing resources 1111, storage resources 1112, and network resources 1113) and a cloud service platform 1120. The computing resources 1111 included in the cloud data center 1110 may be computing devices (e.g., servers).
An interaction means 1210 may be deployed on the execution device 1200. The interaction means 1210 may be a browser or an application capable of message interaction with the cloud service platform 1120. The user may access the cloud service platform 1120 through the interaction means 1210 and upload a request to the cloud data center 1110 to obtain the super network 301 and the preset condition 302. After receiving the request uploaded by the execution device 1200, the cloud data center 1110 feeds the super network 301 and the preset condition 302 back to the execution device 1200. The execution device 1200 may be an intelligent terminal or an edge station. The edge station can process the application data of an autonomous vehicle and transmit the processing result to the vehicle; the processing result is used to guide the operation of the autonomous vehicle.
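A hedged sketch of such a download, assuming a plain HTTPS request to a hypothetical endpoint; the URL path, packaging format, and transport are not specified by the embodiment.

```python
import requests  # assumed transport library; any download mechanism would do

def download_super_network(server_url: str, save_path: str) -> None:
    """Fetch the super network and preset conditions from a hypothetical cloud endpoint."""
    response = requests.get(f"{server_url}/super_network_package", timeout=60)
    response.raise_for_status()
    with open(save_path, "wb") as f:
        f.write(response.content)
```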
It is to be understood that, in order to implement the functions in the above-described embodiments, the terminal includes a corresponding hardware structure and/or software module for performing each function. Those of skill in the art will readily appreciate that the various illustrative elements and method steps described in connection with the embodiments disclosed herein may be implemented as hardware or combinations of hardware and computer software. Whether a function is performed in hardware or computer software driven hardware depends on the specific application scenario and design constraints of the solution.
The data processing method provided according to the present embodiment is described in detail above with reference to fig. 1 to 11, and the data processing apparatus provided according to the present embodiment will be described below with reference to fig. 12.
Fig. 12 is a schematic structural diagram of a possible data processing apparatus provided in this embodiment. The data processing device can be used for realizing the functions of the execution device in the method embodiment, so that the beneficial effects of the method embodiment can be realized. In this embodiment, the data processing apparatus may be the execution device 310 shown in fig. 3, or may be a module (e.g., a chip) applied to a server.
As shown in fig. 12, the data processing apparatus 1200 includes a communication module 1210, a selection module 1220, a detection module 1230, and a storage module 1240. The data processing apparatus 1200 is used to implement the functions of the execution device 310 in the method embodiments shown in fig. 7, fig. 8, or fig. 9 described above.
The communication module 1210 is configured to obtain the application data to be processed and the scene awareness data. The scene awareness data is used to indicate the influencing factors when the execution device 310 processes the application data, that is, the factors that affect speed and accuracy when the execution device 310 processes the application data. The scene awareness data includes at least one of an external influencing factor and an internal influencing factor. The external influencing factors describe the application scene characteristics when the terminal acquires the application data; the internal influencing factors describe the operation scene characteristics of the hardware resources with which the terminal runs on the application data. Illustratively, the external influencing factors include at least one of temperature data, humidity data, illumination data, and time data, and the internal influencing factors include at least one of the computing power of the processor, the available storage capacity, and the available remaining power. For example, the communication module 1210 is configured to perform step 710 and step 720 in fig. 7, 8, and 9.
The selecting module 1220 is configured to determine a first neural network for processing the application data according to the scene awareness data and a preset condition, where the preset condition is used to indicate a corresponding relationship between the scene data affecting the operation of the neural network and the neural network. For example, the selecting module 1220 is used for executing step 730 in fig. 7, 8 and 9.
The detecting module 1230 is configured to obtain a processing result of the application data by using the first neural network. For example, the detecting module 1230 is used for executing the step 740 in fig. 7, 8 and 9.
When determining the first neural network for processing the application data according to the scene awareness data and the preset condition, the selection module 1220 is specifically configured to: determine, from the preset condition, the parameters of the first neural network corresponding to the scene data that includes the scene awareness data, and determine the first neural network from a super network according to the parameters of the first neural network. The parameters of the first neural network include the number of channels and the number of network layers. The first neural network is a sub-network in the super network, and the super network is used to determine the first neural network corresponding to the scene data. The sub-network includes fewer network layers than the super network, or fewer channels than the super network, and each network layer includes at least one neuron.
The storage module 1240 is used for storing the preset conditions, the parameters of the super network and at least one sub-network included in the super network. When the selecting module 1220 determines the first neural network from the super network according to the parameter of the first neural network, the selecting module is specifically configured to: determining weights for the first neural network from the weights for the super network according to the parameters for the first neural network.
Optionally, the selecting module 1220 is further configured to determine a second neural network from the super network, where the second neural network is a sub-network in the super network, and the second neural network includes a number of network layers greater than or less than a number of network layers included in the first neural network, or includes a number of channels greater than or less than a number of channels included in the first neural network.
Optionally, the detecting module 1230 is further configured to obtain a processing result of the application data by using the second neural network. For example, the detection module 1230 is configured to perform step 770 in fig. 9.
The data processing apparatus 1200 also includes an update module 1250 and a display module 1260.
The updating module 1250 is configured to, if the speed and the accuracy of the first neural network do not meet the user requirement, adjust the scene data corresponding to the first neural network to obtain adjusted scene data, and store the adjusted scene data in the storage module 1240. For example, the update module 1250 is configured to perform steps 760 and 780 of FIG. 9.
The display module 1260 is configured to display a corresponding relationship between the scene data that affects the speed and the accuracy of the first neural network and the first neural network, and the processing result. For example, the display module 1260 is used to execute step 750 in fig. 9.
It should be understood that the data processing apparatus 1200 according to the embodiments of the present application may be implemented by an ASIC or a programmable logic device (PLD), where the PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof. When the data processing methods shown in fig. 7, 8, and 9 are implemented by software, the data processing apparatus 1200 and its modules may also be software modules.
The data processing apparatus 1200 according to the embodiment of the present application may correspondingly perform the methods described in the embodiments of the present application. For brevity, the above and other operations and/or functions of the units of the data processing apparatus 1200, which implement the corresponding flows of the methods in fig. 7, fig. 8, and fig. 9, are not described here again.
Fig. 13 is a schematic structural diagram of a terminal 1300 according to this embodiment. As shown, the terminal 1300 includes a processor 1310, a bus 1320, a memory 1330, and a communication interface 1340.
It should be understood that in the present embodiment, the processor 1310 may be a CPU, and the processor 1310 may also be other general purpose processors, DSPs, ASICs, FPGAs, or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, or the like. A general purpose processor may be a microprocessor or any conventional processor or the like.
The processor may also be a GPU, NPU, microprocessor, ASIC, or one or more integrated circuits for controlling the execution of programs in accordance with the present aspects.
Communication interface 1340 is used to enable terminal 1300 to communicate with external devices or appliances. In the present embodiment, the communication interface 1340 is used to receive application data and scene awareness data to be processed.
The bus 1320 may include a path for communicating information between the above components, such as the processor 1310 and the memory 1330. Besides a data bus, the bus 1320 may also include a power bus, a control bus, a status signal bus, and the like; for clarity, however, the various buses are all labeled as the bus 1320 in the figure.
As one example, terminal 1300 can include multiple processors. The processor may be a multi-core (multi-CPU) processor. A processor herein may refer to one or more devices, circuits, and/or computational units for processing data (e.g., computer program instructions). The processor 1310 may call the super network and the preset condition stored in the memory 1330 to determine a first neural network for processing the application data according to the scene awareness data and the preset condition, and obtain a processing result of the application data by using the first neural network.
It should be noted that fig. 13 shows the terminal 1300 with only one processor 1310 and one memory 1330 as an example. Here, the processor 1310 and the memory 1330 each represent a type of device or apparatus, and in a specific embodiment the number of each type of device or apparatus may be determined according to service requirements.
The memory 1330 may correspond to a storage medium, for example a mechanical hard disk or a solid state drive, used to store information such as the super network and the preset condition in the above method embodiments.
The terminal 1300 may be a general-purpose device or a special-purpose device. For example, the terminal 1300 may be a mobile phone terminal, a tablet computer, a notebook computer, a VR device, an AR device, a mixed reality (MR) device, an extended reality (ER) device, a vehicle-mounted terminal, or the like, or an edge device (e.g., a box carrying a chip with processing capability). Alternatively, the terminal 1300 may be a server or another device with computing capabilities.
It should be understood that the terminal 1300 according to this embodiment may correspond to the data processing apparatus 1200 in this embodiment and to the respective body that performs any of the methods in fig. 7, fig. 8, and fig. 9. For brevity, the above and other operations and/or functions of the modules of the data processing apparatus 1200, which implement the corresponding flows of the methods in fig. 7, fig. 8, and fig. 9, are not described here again.
The method steps in this embodiment may be implemented by hardware, or may be implemented by software instructions executed by a processor. The software instructions may consist of corresponding software modules that may be stored in random access memory (RAM), flash memory, read-only memory (ROM), programmable ROM (PROM), erasable PROM (EPROM), electrically EPROM (EEPROM), registers, a hard disk, a removable hard disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. Of course, the storage medium may also be integral to the processor. The processor and the storage medium may reside in an ASIC, and the ASIC may reside in a network device or a terminal device. Of course, the processor and the storage medium may also reside as discrete components in a network device or a terminal device.
In the above embodiments, all or part of the implementation may be realized by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer programs or instructions. When the computer program or instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present application are performed in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, a network appliance, a user device, or other programmable apparatus. The computer program or instructions may be stored in a computer readable storage medium or transmitted from one computer readable storage medium to another computer readable storage medium, for example, the computer program or instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire or wirelessly. The computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device such as a server, data center, etc. that integrates one or more available media. The usable medium may be a magnetic medium, such as a floppy disk, a hard disk, a magnetic tape; or an optical medium, such as a Digital Video Disc (DVD); it may also be a semiconductor medium, such as a Solid State Drive (SSD).
While the invention has been described with reference to specific embodiments, the scope of the invention is not limited thereto, and those skilled in the art can easily conceive various equivalent modifications or substitutions within the technical scope of the invention. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (21)

1. A data processing method, characterized in that the method is performed by a terminal, the method comprising:
acquiring application data to be processed;
acquiring scene awareness data, wherein the scene awareness data is used for indicating influencing factors of the terminal processing the application data;
determining a first neural network for processing the application data according to the scene perception data and a preset condition, wherein the preset condition is used for indicating a corresponding relation between the scene data influencing the operation of the neural network and the neural network;
and obtaining a processing result of the application data by utilizing the first neural network.
2. The method according to claim 1, wherein the scene awareness data is used to indicate factors affecting speed and accuracy when the terminal processes the application data.
3. The method of claim 2, wherein the scene awareness data includes at least one of an external influencing factor and an internal influencing factor; the external influence factor is used for describing application scene characteristics of the application data acquired by the terminal, and the internal influence factor is used for describing operation scene characteristics of hardware resources of the application data operated by the terminal.
4. The method of claim 3, wherein the external influencing factors comprise at least one of temperature data, humidity data, light data, and time data.
5. The method of claim 3, wherein the internal influencing factors include at least one of a computing power of a processor, an available storage capacity, and an available remaining power.
6. The method according to any one of claims 1 to 5, wherein the determining of the first neural network for processing the application data according to the scene awareness data and the preset condition comprises:
determining, from the preset condition, parameters of the first neural network corresponding to scene data that includes the scene awareness data, wherein the parameters of the first neural network include a number of channels and a number of network layers;
determining the first neural network from a super network according to the parameters of the first neural network, wherein the first neural network is a sub-network of the super network, and the super network is used for determining the first neural network corresponding to the scene data;
wherein the sub-network comprises fewer network layers than the super network, or comprises fewer channels than the super network, and each network layer comprises at least one neuron.
7. The method according to claim 6, wherein the terminal stores parameters of the super network and of at least one sub-network comprised in the super network, and the determining of the first neural network from the super network according to the parameters of the first neural network comprises:
determining weights of the first neural network from the weights of the super network according to the parameters of the first neural network.
8. The method according to any one of claims 1 to 7, wherein after the first neural network for processing the application data is determined according to the scene awareness data and the preset condition, the method further comprises:
determining a second neural network from a super network, wherein the second neural network is a sub-network of the super network, and the second neural network comprises more or fewer network layers than the first neural network, or comprises more or fewer channels than the first neural network;
and obtaining a processing result of the application data by utilizing the second neural network.
9. The method of claim 1, further comprising:
if the speed and accuracy of the first neural network do not meet the requirements of the user, adjusting the scene data corresponding to the first neural network.
10. The method according to any one of claims 1 to 9, further comprising:
displaying a correspondence between the scene data affecting the speed and accuracy of the first neural network and the first neural network, and displaying the processing result.
11. A data processing apparatus, characterized in that the apparatus comprises:
a communication module, configured to acquire application data to be processed,
wherein the communication module is further configured to acquire scene awareness data, and the scene awareness data indicates influencing factors of the processing of the application data by the terminal;
a selection module, configured to determine, according to the scene awareness data and a preset condition, a first neural network for processing the application data, wherein the preset condition indicates a correspondence between scene data that affects operation of a neural network and the neural network; and
a detection module, configured to obtain a processing result of the application data by using the first neural network.
12. The apparatus of claim 11, wherein the scene awareness data indicates factors affecting speed and accuracy when the terminal processes the application data.
13. The apparatus of claim 12, wherein the scene awareness data includes at least one of an external influencing factor and an internal influencing factor; the external influencing factor describes application-scene characteristics under which the terminal acquires the application data, and the internal influencing factor describes operating-scene characteristics of the hardware resources with which the terminal runs the application data.
14. The apparatus of claim 13, wherein the external influencing factor comprises at least one of temperature data, humidity data, light data, and time data.
15. The apparatus of claim 13, wherein the internal influencing factor comprises at least one of computing power of a processor, available storage capacity, and available remaining power.
16. The apparatus according to any one of claims 11 to 15, wherein the selection module, when determining the first neural network for processing the application data according to the scene awareness data and the preset condition, is specifically configured to:
determine, from the preset condition, parameters of the first neural network corresponding to scene data that includes the scene awareness data, wherein the parameters of the first neural network include a number of channels and a number of network layers;
determine the first neural network from a super network according to the parameters of the first neural network, wherein the first neural network is a sub-network of the super network, and the super network is used for determining the first neural network corresponding to the scene data;
wherein the sub-network comprises fewer network layers than the super network, or comprises fewer channels than the super network, and each network layer comprises at least one neuron.
17. The apparatus according to claim 16, wherein the terminal stores parameters of the super network and of at least one sub-network comprised in the super network; and the selection module, when determining the first neural network from the super network according to the parameters of the first neural network, is specifically configured to:
determine weights of the first neural network from the weights of the super network according to the parameters of the first neural network.
18. The apparatus according to any one of claims 11 to 17, wherein the selection module is further configured to:
determine a second neural network from a super network, wherein the second neural network is a sub-network of the super network, and the second neural network comprises more or fewer network layers than the first neural network, or comprises more or fewer channels than the first neural network;
the detection module is further configured to obtain a processing result of the application data by using the second neural network.
19. The apparatus of claim 11, further comprising an update module,
wherein the update module is configured to adjust the scene data corresponding to the first neural network if the speed and accuracy of the first neural network do not meet the requirements of the user.
20. The apparatus of any one of claims 11 to 19, further comprising a display module,
wherein the display module is configured to display a correspondence between the scene data affecting the speed and accuracy of the first neural network and the first neural network, and to display the processing result.
21. A terminal, comprising a memory and at least one processor, wherein the memory is configured to store a set of computer instructions, and when the set of computer instructions is executed by the at least one processor, the operational steps of the method according to any one of claims 1 to 10 are performed.
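
As a concrete illustration of claims 1 to 6, the preset condition can be read as a table that maps scene awareness data (external factors such as light and time, internal factors such as remaining power and free memory) to the parameters of a sub-network, namely its number of layers and channels. The following minimal Python sketch shows one possible realization of such a lookup; the class and function names (ScenePerception, PRESET_CONDITIONS, select_subnet_params), the chosen factors, the thresholds, and the parameter values are all assumptions made for illustration and are not taken from the patent.

from dataclasses import dataclass

@dataclass
class ScenePerception:
    # External influencing factors (cf. claim 4): light and time data.
    light_lux: float
    hour_of_day: int
    # Internal influencing factors (cf. claim 5): remaining power and storage.
    remaining_battery: float      # fraction in [0.0, 1.0]
    free_memory_mb: int

# "Preset condition": an ordered table mapping scene data that affects the
# network's speed/accuracy to sub-network parameters (layers, channels).
PRESET_CONDITIONS = [
    (lambda s: s.remaining_battery < 0.2 or s.free_memory_mb < 256,
     {"layers": 8,  "channels": 32}),    # constrained device -> small, fast sub-network
    (lambda s: s.light_lux < 10.0,
     {"layers": 16, "channels": 96}),    # low light -> larger, more accurate sub-network
    (lambda s: True,
     {"layers": 12, "channels": 64}),    # default case
]

def select_subnet_params(scene):
    """Return the parameters of the 'first neural network' for this scene."""
    for predicate, params in PRESET_CONDITIONS:
        if predicate(scene):
            return params

if __name__ == "__main__":
    scene = ScenePerception(light_lux=5.0, hour_of_day=22,
                            remaining_battery=0.8, free_memory_mb=2048)
    print(select_subnet_params(scene))   # {'layers': 16, 'channels': 96}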
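Claims 7 and 17 state that the terminal stores the super-network parameters and derives the sub-network's weights from the super-network's weights according to the sub-network parameters. A common weight-sharing scheme, sketched below, keeps the first d layers and the first c channels of each weight matrix; the patent does not commit to this particular slicing rule, so the code is an assumption-laden sketch (the use of numpy, fully connected layers, and ReLU activations are choices made here, not details from the patent).

import numpy as np

# Super network: MAX_LAYERS fully connected layers, each MAX_CHANNELS x MAX_CHANNELS.
MAX_LAYERS, MAX_CHANNELS = 16, 96
rng = np.random.default_rng(0)
super_weights = [rng.standard_normal((MAX_CHANNELS, MAX_CHANNELS))
                 for _ in range(MAX_LAYERS)]

def subnet_weights(layers, channels):
    """Take the first `layers` layers and first `channels` channels of each
    super-network weight matrix as the weights of the selected sub-network."""
    assert layers <= MAX_LAYERS and channels <= MAX_CHANNELS
    return [w[:channels, :channels] for w in super_weights[:layers]]

def run_subnet(weights, x):
    """Forward pass of the sliced sub-network (plain ReLU MLP, for illustration)."""
    for w in weights:
        x = np.maximum(w @ x, 0.0)
    return x

if __name__ == "__main__":
    params = {"layers": 8, "channels": 32}       # e.g. the output of the preset-condition lookup
    w_small = subnet_weights(**params)
    y = run_subnet(w_small, rng.standard_normal(params["channels"]))
    print(len(w_small), y.shape)                 # 8 (32,)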
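Claims 8 and 18 describe switching to a second, larger or smaller sub-network of the super network, and claims 9 and 19 describe adjusting the scene data corresponding to the first neural network when its speed or accuracy does not meet the user's requirement. The short sketch below shows one plausible policy for picking such a second sub-network; the latency and accuracy thresholds and the step sizes are hypothetical, and a full implementation would also update the preset-condition table as in claims 9 and 19.

def adjust_subnet(params, latency_ms, accuracy,
                  max_latency_ms=50.0, min_accuracy=0.90):
    """Return parameters of a 'second neural network': shrink the sub-network if
    it is too slow, grow it if it is not accurate enough (step sizes assumed)."""
    if latency_ms > max_latency_ms:
        return {"layers": max(1, params["layers"] - 4),
                "channels": max(8, params["channels"] // 2)}
    if accuracy < min_accuracy:
        return {"layers": params["layers"] + 4,
                "channels": params["channels"] * 2}
    return params  # the first neural network already meets the requirement

print(adjust_subnet({"layers": 12, "channels": 64}, latency_ms=80.0, accuracy=0.95))
# {'layers': 8, 'channels': 32}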
CN202110558861.1A 2021-05-21 2021-05-21 Data processing method and device and terminal Active CN114970654B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110558861.1A CN114970654B (en) 2021-05-21 2021-05-21 Data processing method and device and terminal
PCT/CN2021/141388 WO2022242175A1 (en) 2021-05-21 2021-12-24 Data processing method and apparatus, and terminal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110558861.1A CN114970654B (en) 2021-05-21 2021-05-21 Data processing method and device and terminal

Publications (2)

Publication Number Publication Date
CN114970654A (en) 2022-08-30
CN114970654B CN114970654B (en) 2023-04-18

Family

ID=82972903

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110558861.1A Active CN114970654B (en) 2021-05-21 2021-05-21 Data processing method and device and terminal

Country Status (2)

Country Link
CN (1) CN114970654B (en)
WO (1) WO2022242175A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105678278A (en) * 2016-02-01 2016-06-15 国家电网公司 Scene recognition method based on single-hidden-layer neural network
WO2019114147A1 (en) * 2017-12-15 2019-06-20 华为技术有限公司 Image aesthetic quality processing method and electronic device
CN110956262A (en) * 2019-11-12 2020-04-03 北京小米智能科技有限公司 Hyper network training method and device, electronic equipment and storage medium
CN111459022A (en) * 2020-04-21 2020-07-28 深圳市英维克信息技术有限公司 Device parameter adjustment method, device control apparatus, and computer-readable storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115690558A (en) * 2014-09-16 2023-02-03 华为技术有限公司 Data processing method and device
US20200027009A1 (en) * 2018-07-23 2020-01-23 Kabushiki Kaisha Toshiba Device and method for optimising model performance
CN110569984B (en) * 2019-09-10 2023-04-14 Oppo广东移动通信有限公司 Configuration information generation method, device, equipment and storage medium
CN111338669B (en) * 2020-02-17 2023-10-24 深圳英飞拓仁用信息有限公司 Method and device for updating intelligent function in intelligent analysis box

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105678278A (en) * 2016-02-01 2016-06-15 国家电网公司 Scene recognition method based on single-hidden-layer neural network
WO2019114147A1 (en) * 2017-12-15 2019-06-20 华为技术有限公司 Image aesthetic quality processing method and electronic device
CN111095293A (en) * 2017-12-15 2020-05-01 华为技术有限公司 Image aesthetic processing method and electronic equipment
CN110956262A (en) * 2019-11-12 2020-04-03 北京小米智能科技有限公司 Hyper network training method and device, electronic equipment and storage medium
CN111459022A (en) * 2020-04-21 2020-07-28 深圳市英维克信息技术有限公司 Device parameter adjustment method, device control apparatus, and computer-readable storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Yang Peng et al., "Indoor Scene Recognition Based on Convolutional Neural Networks", Journal of Zhengzhou University (Natural Science Edition) *

Also Published As

Publication number Publication date
CN114970654B (en) 2023-04-18
WO2022242175A1 (en) 2022-11-24

Similar Documents

Publication Publication Date Title
CN112445823A (en) Searching method of neural network structure, image processing method and device
WO2021238366A1 (en) Neural network construction method and apparatus
WO2020192736A1 (en) Object recognition method and device
CN111507378A (en) Method and apparatus for training image processing model
CN110678873A (en) Attention detection method based on cascade neural network, computer device and computer readable storage medium
CN111291809A (en) Processing device, method and storage medium
CN112990211A (en) Neural network training method, image processing method and device
CN112529146B (en) Neural network model training method and device
CN111401517A (en) Method and device for searching perception network structure
CN113570029A (en) Method for obtaining neural network model, image processing method and device
CN110222718A (en) The method and device of image procossing
CN112489072B (en) Vehicle-mounted video perception information transmission load optimization method and device
CN112464930A (en) Target detection network construction method, target detection method, device and storage medium
CN113011562A (en) Model training method and device
CN111709471A (en) Object detection model training method and object detection method and device
CN110705564B (en) Image recognition method and device
CN110909656B (en) Pedestrian detection method and system integrating radar and camera
CN116258940A (en) Small target detection method for multi-scale features and self-adaptive weights
CN115018039A (en) Neural network distillation method, target detection method and device
CN117217280A (en) Neural network model optimization method and device and computing equipment
CN116432736A (en) Neural network model optimization method and device and computing equipment
CN115131503A (en) Health monitoring method and system for iris three-dimensional recognition
CN113379045A (en) Data enhancement method and device
CN117157679A (en) Perception network, training method of perception network, object recognition method and device
CN114970654B (en) Data processing method and device and terminal

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant