WO2023273629A1 - System and apparatus for configuring a neural network model in an edge server - Google Patents

System and apparatus for configuring a neural network model in an edge server

Info

Publication number
WO2023273629A1
Authority
WO
WIPO (PCT)
Prior art keywords
neural network
network model
layer
cloud server
edge server
Prior art date
Application number
PCT/CN2022/092414
Other languages
English (en)
French (fr)
Inventor
张玉楼
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Publication of WO2023273629A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/08: Learning methods
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G06V 20/00: Scenes; Scene-specific elements
    • G06V 20/40: Scenes; Scene-specific elements in video content
    • G06V 40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands

Definitions

  • the present application relates to the field of computer technology, and in particular to a system and device for configuring a neural network model in an edge server.
  • a business system includes a cloud server and multiple edge servers, and each edge server can receive sampled data sent by collection devices, such as video data, Internet of things (IOT) data (such as temperature, gas concentration, etc.).
  • a neural network model is deployed on the edge server, and the neural network model may be preset on the edge server, or may be sent to the edge server by the cloud server.
  • the edge server can detect and identify the received sampled data based on the neural network model. For example, based on video data, it can detect whether employees are wearing masks or helmets.
  • due to the large number of edge servers, and for cost and time reasons, edge servers usually do not have the ability to train neural network models. To ensure the accuracy of the neural network model on the edge server, at present the cloud server is responsible for training the neural network model and sends the trained neural network model to the edge server. However, because there are many edge servers, this implementation places high demands on network bandwidth.
  • the present application provides a system and device for configuring a neural network model in an edge server, which is used to reduce bandwidth requirements on the basis of ensuring the accuracy of the neural network model in the edge server.
  • the embodiment of the present application provides a system for configuring a neural network model in an edge server, where the system includes a cloud server and an edge server.
  • a neural network model runs in the cloud server, for example, it is called the first neural network model.
  • the edge server runs a neural network model, such as a second neural network model, and the first neural network model and the second neural network model have the same structure.
  • the first neural network model and the second neural network model are the same model.
  • the cloud server can be used to train the first neural network model, and send configuration parameters of N layers in the first neural network model that meet preset conditions to the edge server.
  • the edge server is configured to receive the configuration parameters of the N layers, and update the parameters of the N layers corresponding to the configuration parameters in the second neural network model according to the configuration parameters of the N layers.
  • the edge server can update the second neural network model in time to ensure the accuracy of the second neural network model on the edge server, while reducing the network bandwidth requirements.
  • the N layers are preset layers in the first neural network model.
  • any number of layers in the first neural network model can be designated as preset layers.
  • after the cloud server trains the first neural network model, it sends the configuration parameters of the preset N layers to the edge server. Since the N layers are preset, the cloud server does not need to make a selection, which can save CPU overhead.
  • each layer in the first neural network model has a weight, and the weight is used to indicate the degree of influence of the layer on the accuracy of the first neural network model; assuming the first neural network model includes K layers, the weight of any one of the N layers is greater than the weight of any one of the remaining K-N layers.
  • specifically, the cloud server can sort the K layers of the first neural network model in descending order of weight, and send the configuration parameters of the top N layers to the edge server; that is, the weight of any one of the N layers is greater than the weight of any one of the remaining K-N layers.
  • since the N layers have a greater influence on the accuracy of the model, the cloud server can send only the configuration parameters of these N layers to the edge server, which ensures the accuracy of the second neural network model on the edge server while reducing the amount of data transmission.
  • N is a preset value (such as a first preset value), or N is the number of layers whose weight exceeds a preset threshold.
  • the configuration mode of the N value is flexible and can be applied in various scenarios.
  • the configuration parameters include parameters and layer identifiers of each layer in the N layers.
  • the edge server can determine the corresponding layer in the second neural network model according to the layer identifier, and update the parameter of the corresponding layer in the second neural network model according to the parameter of the layer in the configuration parameters.
  • the edge server is further configured to send the first information to the cloud server when the performance parameter of the second neural network model is lower than a first preset value, instructing the cloud server to train the first neural network model;
  • the cloud server is configured to receive the first information sent by the edge server, and train the first neural network model according to the first information.
  • the edge server is further configured to receive the sampled data sent by the collection device, determine, from the sampled data, the training data used to train the first neural network model, and send the first information to the cloud server when determining that the amount of training data exceeds a second preset value, instructing the cloud server to train the first neural network model; the cloud server is configured to receive the first information sent by the edge server, and train the first neural network model according to the first information.
  • the first information includes one or more of the following: the training data, the performance parameters of the second neural network model, and an instruction for model training; wherein the performance parameters include one or more of accuracy, confidence, and precision.
  • the cloud server is further configured to receive the sampled data sent by the collection device, determine, from the sampled data, the training data used to train the first neural network model, and train the first neural network model when determining that the amount of training data exceeds a third preset value.
  • the training data includes sampling data whose similarity with historical data is smaller than a fourth preset value.
  • when the cloud server detects that the first training condition is met, it trains the first neural network model based on the training data.
  • optionally, the cloud server receives the training data sent by the edge server; or the cloud server receives the sampled data sent by the collection device, and determines the training data according to the sampled data.
  • the training data is sampling data whose similarity with historical data is lower than a preset value (preset value A).
  • the first training conditions include but are not limited to:
  • the cloud server receives the first information sent by the edge server, where the first information is used to instruct the cloud server to train the first neural network model; or
  • the cloud server receives the first information sent by the edge server and determines that the performance parameter of the second neural network model in the edge server is lower than a preset value (such as a preset value B); the first information includes the performance parameter of the second neural network model, the performance parameter is used to indicate the usage of the second neural network model, and the performance parameters include but are not limited to the accuracy, confidence, and precision of the second neural network model; or
  • the cloud server determines that the amount of training data reaches a preset value (such as a preset value C); the training data is determined by the cloud server based on the sampled data from the collection device, or is received from the edge server, and is the sampled data whose similarity with the historical data is lower than a preset value (such as a preset value D).
  • when the edge server detects that the second training condition is satisfied, it sends the first information to the cloud server, where the first information is used to instruct the cloud server to train the first neural network model;
  • the second training conditions include but are not limited to:
  • the edge server determines that the amount of training data reaches a preset value (such as a preset value E, which may be the same as or different from the preset value C); the training data is determined based on the sampled data received from the collection device, and is the sampled data whose similarity with the historical data is lower than a preset value (such as a preset value F, which may be the same as or different from the preset value D); or
  • the edge server determines that the performance parameter of the second neural network model is lower than a preset value (such as a preset value G, which may be the same as or different from the preset value B); the performance parameter is used to indicate the usage of the second neural network model, and the performance parameters include but are not limited to the accuracy, confidence, and precision of the second neural network model (a sketch of this trigger logic is given below).
  • the edge server sends first information to the cloud server, where the first information includes performance parameters of the second neural network model and/or training data determined by the edge server.
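  • For illustration, a minimal sketch of this edge-side trigger check, assuming the thresholds correspond to the preset values E and G above; send_first_information is a caller-supplied placeholder, since the transport toward the cloud server is not specified here:

```python
# Hedged sketch of the second training condition: request training when enough
# new-scene training data has accumulated or when performance degrades.
# preset_e / preset_g mirror the preset values E and G above; the send callable
# is an assumption standing in for the edge-to-cloud transport.
def maybe_request_training(training_data, performance, preset_e, preset_g,
                           send_first_information):
    if len(training_data) >= preset_e or performance < preset_g:
        # The first information may carry the performance parameter and/or
        # the training data determined by the edge server.
        send_first_information({"performance": performance,
                                "training_data": training_data})
        return True
    return False
```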
  • the embodiment of the present application also provides a method for configuring the neural network model in the edge server, which can be applied to the system described in the first aspect. By executing the method, the cloud server in the method implements the functions of the cloud server's behavior in the system shown in the first aspect; for the beneficial effects, refer to the description of the first aspect, which will not be repeated here.
  • by executing the method, the edge server in this method implements the functions of the edge server's behavior in the system shown in the first aspect; for the beneficial effects, refer to the description of the first aspect, which will not be repeated here.
  • the embodiment of the present application also provides a configuration device, which has the function of implementing the cloud server's behavior in the system shown in the first aspect above; for the beneficial effects, refer to the description of the first aspect, which will not be repeated here.
  • the functions described above may be implemented by hardware, or may be implemented by executing corresponding software on the hardware.
  • the hardware or software includes one or more modules corresponding to the above functions.
  • the structure of the device includes a training module, a sending module, and optionally, a receiving module and a processing module. These modules can perform the corresponding functions in the method example of the first aspect above. For details, refer to the detailed description in the method example, and details are not repeated here.
  • the present application also provides a configuration device, where the configuration device includes a processor and a memory, and may also include a communication interface; the processor executes the program instructions in the memory to perform the operations performed by the cloud server provided in the above first aspect or any possible implementation of the first aspect.
  • the configuration device may be a server, or a computing device.
  • the memory is coupled with the processor, and stores necessary program instructions and data in the process of configuring the neural network model in the edge server.
  • the communication interface is used for communicating with other devices.
  • the present application provides a computer-readable storage medium in which a program is stored; when a computing device executes the program, the computing device performs the operations performed by the cloud server in the aforementioned first aspect or any possible implementation of the first aspect.
  • the storage medium includes but is not limited to volatile memory, such as random access memory, and non-volatile memory, such as flash memory, hard disk drive (HDD), and solid state drive (SSD).
  • the present application provides a program product for a computing device
  • the program product for a computing device includes computer instructions; when the computer instructions are executed by a computing device, the computing device performs the operations of the aforementioned first aspect or any possible implementation of the first aspect.
  • the computer program product may be a software installation package; when the function of the cloud server provided in the aforementioned first aspect or any possible implementation of the first aspect needs to be used, the computer program product may be downloaded to and executed on the computing device.
  • the present application also provides a computer chip, where the chip is connected to a memory, and the chip is used to read and execute the software program stored in the memory to implement the functions of the cloud server in the above first aspect and each possible implementation of the first aspect.
  • FIG. 1A is a schematic diagram of a possible system architecture provided by an embodiment of the present application.
  • FIG. 1B is a schematic structural diagram of a cloud server provided by an embodiment of the present application.
  • FIG. 1C is a schematic structural diagram of another cloud server provided by the embodiment of the present application.
  • Fig. 2A is a structural schematic diagram of a neural network model.
  • Fig. 2B is a schematic diagram of layer structure naming of a neural network model.
  • FIG. 2C is a schematic diagram of a neural network model.
  • FIG. 3 is a schematic flowchart corresponding to a method for configuring a neural network model provided in an embodiment of the present application.
  • FIG. 4 is a schematic diagram of a frame structure corresponding to a configuration parameter provided by an embodiment of the present application.
  • FIG. 5A is a schematic diagram of a configuration scenario of a neural network model provided in an embodiment of the present application.
  • FIG. 5B is a schematic diagram of another neural network model configuration scenario provided by the embodiment of the present application.
  • FIG. 5C is a schematic flowchart corresponding to another method for configuring a neural network model provided in an embodiment of the present application.
  • FIG. 6 is a schematic diagram of a configuration scenario of a neural network model in a system provided by an embodiment of the present application.
  • FIG. 7 is a schematic flowchart corresponding to a data processing method provided in an embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of a configuration device provided by the present application.
  • FIG. 1A is a schematic structural diagram of a system provided by an embodiment of the present application.
  • the system includes a cloud server 10, edge servers 20, and collection devices 30 (only three edge servers 20 are shown in FIG. 1A, but this application does not limit the number).
  • the cloud server 10 is used for managing the edge server 20 and providing services for the edge server 20 .
  • the cloud server 10 may be a physical machine or a virtual machine.
  • the cloud server 10 has the function of training neural network models; one or more neural network models identical to the neural network models running on the edge servers 20 can be stored on the cloud server, and the cloud server 10 can provide the edge servers 20 with a service of training these neural network models, thereby saving the local computing resource overhead of the edge servers 20.
  • the cloud server 10 has a processor 112 , a memory 113 and a communication interface 114 .
  • the processor 112, the memory 113, and the communication interface 114 are connected through a bus.
  • the processor 112 is a central processing unit (central processing unit, CPU), which is used to execute a software program in memory to realize one or more functions such as training a neural network model and the like.
  • the processor 112 may also be used for computing and processing data, such as metadata management, deduplication, data compression, data verification, virtualized storage space, and address translation.
  • the processor 112 may also be an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA), an artificial intelligence (AI) chip, a system on chip (SoC), a complex programmable logic device (CPLD), a graphics processing unit (GPU), etc.
  • in practical applications, there may be multiple processors 112. The multiple processors 112 may include multiple processors of the same type, or may include multiple processors of different types; for example, the multiple processors 112 include multiple CPUs.
  • the plurality of processors 112 includes one or more CPUs and one or more GPUs.
  • the plurality of processors 112 includes one or more CPUs, one or more GPUs, and one or more FPGAs, and so on.
  • the CPU may have one or more CPU cores. This embodiment does not limit the number of CPUs and the number of CPU cores.
  • the storage 113 refers to a device for storing data, which may be a memory or a hard disk.
  • the memory refers to the internal storage that exchanges data directly with the processor 112; it can be read from and written to at any time, is very fast, and serves as temporary data storage for the operating system or other programs running on the processor 112.
  • memory includes volatile memory, such as RAM and DRAM, and can also include non-volatile memory, such as storage class memory (SCM), or a combination of volatile memory and non-volatile memory. In practical applications, multiple memories and different types of memory can be configured in the cloud server 10.
  • optionally, the memory can be configured to have a power-failure protection function.
  • the power-failure protection function means that when the system is powered off and then powered on again, the data stored in the memory will not be lost.
  • memory with a power-failure protection function is called non-volatile memory.
  • the hard disk is used to provide storage resources, for example, to store one or more neural network models.
  • hard disks include but are not limited to non-volatile memory, such as read-only memory (ROM), hard disk drives (HDD), or solid state drives (SSD), etc. Unlike memory, hard disks have slower read and write speeds and are usually used to store data persistently.
  • the data and program instructions in the hard disk need to be loaded into the memory first, and then the processor obtains the data and/or program instructions from the memory.
  • the communication interface 114 is used for communicating with other devices such as edge servers or other cloud servers.
  • the edge server 20 is configured to receive the sampled data sent by the collection device 30, and process the sampled data using a corresponding neural network model locally on the edge server 20, and obtain an output result of the neural network model.
  • the edge server 20 has hardware components similar to those of the cloud server 10 , for example, the edge server 20 has a processor, a memory, a network card, and the like. Wherein, the network card is used for communicating with the collection device 30 and the cloud server 10 .
  • the memory can be used to store the neural network model.
  • the processor is used to run neural network models, etc.
  • the functions and types of these hardware components are similar to those in the cloud server 10; for details, please refer to the introduction of the hardware components of the cloud server 10 above, which will not be repeated here.
  • the collection device 30 is used to collect data (for example, sampled data), and send the sampled data to other devices, such as the edge server 20, the cloud server 10, and the like.
  • the collection device 30 includes but is not limited to: cameras, mobile phones, smart watches, vehicle-mounted terminal devices, and sensors, where the sensors include heart rate sensors, distance sensors, temperature sensors, pressure sensors, smoke detectors, gas concentration detectors, etc. This embodiment of the application does not limit this; any device with a data collection function and a communication function is applicable to this embodiment of the application.
  • FIG. 1A only shows a small number of devices to keep it simple.
  • the embodiment of the present application does not limit the number and type of each device under the system.
  • the collection device 30 in FIG. 1A is illustrated by taking a camera, a mobile phone, a smart watch, and a vehicle-mounted terminal as examples, but the embodiment of the present application does not limit the type and quantity of the collection device 30.
  • various types of sensors, and indeed any device with information collection and transmission functions, are applicable to this embodiment of the application.
  • the cloud server 10 in FIG. 1A above can be an edge cloud server or a public cloud server.
  • the edge cloud can be understood as a private cloud, and the edge cloud server can also interact with the public cloud server.
  • the structure of the cloud server 10 shown in FIG. 1B is only an example. In an actual product, the cloud server 10 may have more or fewer components than that in FIG. 1B , which is not limited in this embodiment of the present application.
  • FIG. 1C is another system architecture diagram provided by the embodiment of the present application.
  • FIG. 1C adds a cloud server 11 on the basis of FIG. 1A.
  • the hardware components and software components of the cloud server 11 can be the same as those of the above-mentioned cloud server 10, which will not be repeated here.
  • any two cloud servers can communicate with each other.
  • the foregoing cloud server may be a hardware device or a virtual machine, which is not limited in this embodiment of the present application.
  • the above system architecture can be deployed in a variety of business scenarios, such as indoor and outdoor scenarios such as industry, construction, production lines, buildings, and autonomous driving.
  • collection devices can be deployed in industrial parks, enterprise parks, campuses, shopping malls, residences, public areas, etc.
  • the neural network models deployed on the edge server 20 may be different.
  • for example, the edge server 20 can receive images captured by cameras deployed in industrial parks, and use a neural network model to analyze whether there are warning behaviors in the images, such as whether employees are wearing safety helmets, whether employees have left their posts, or whether there is an open flame.
  • the edge server 20 may receive images captured by cameras deployed in hospitals, and use a neural network model to analyze whether there is someone not wearing a mask in the images, and so on.
  • for another example, the edge server 20 may receive images captured by cameras deployed in the public area, and use the neural network model with the corresponding function to analyze whether there are warning behaviors in the images, such as littering or going the wrong way.
  • the neural network model can also output the confidence of the detection results.
  • the higher the confidence level, the stronger the reliability of the detection result; the lower the confidence level, the lower the reliability of the detection result. This is because the neural network model compares the detected features with the learned features to obtain the detection result, and determines the confidence of the detection result according to the matching degree between the two.
  • for example, the neural network model can detect people and pets. If the features of the target instance match (or have a high similarity to) the features of a person learned by the neural network model, the output detection result is a person, and the confidence of the detection result may be the matching degree between the features of the target instance and the learned features of a person; the higher the matching degree, the more likely it is that the target instance is a person.
  • the embodiment of the present application does not limit the structure and function of the neural network model applicable to the edge server 20.
  • the neural network model includes but is not limited to a multi-layer perceptron neural network model, a convolutional neural network model, a recurrent neural network model, a residual shrinkage network, and so on.
  • the residual shrinkage network is an improvement of the convolutional neural network model, which is not described in detail here.
  • the neural network model may have classification functions, object detection functions, object recognition functions, prediction functions, reasoning functions, etc.
  • the embodiment of the present application does not limit the structure and functions of the neural network model.
  • the neural network model in this application can also include neural network models based on machine learning algorithms, deep learning algorithms, etc.; a neural network model obtained based on a machine learning algorithm can also be called a machine learning model.
  • a neural network model obtained based on a deep learning algorithm may also be called a deep learning model, and any neural network model is applicable to the embodiment of this application.
  • the structure of the neural network model is introduced below, taking the multi-layer perceptron neural network model as an example.
  • FIG. 2A is a schematic structural diagram of a neural network model.
  • the neural network model includes an input layer, hidden layers (only two hidden layers are shown in FIG. 2A, but the application does not limit this), and an output layer.
  • the input layer includes one or more neurons, and each neuron is an input parameter.
  • the hidden layers are located between the input layer and the output layer; the number of hidden layers is variable, and the neural network model can include one or more hidden layers.
  • the hidden layers arranged in order are respectively recorded as the first layer, the second layer, ..., the mth layer, and m is a positive integer.
  • the first layer and the second layer here refer to the first layer and the second layer in the hidden layer, not the first layer and the second layer of the entire neural network model.
  • the output layer includes one or more neurons, and each neuron is an output. In practical applications, the number of neurons in the output layer depends on the problem to be solved.
  • the function of the hidden layer is to map the input to the output.
  • specifically, the output of the previous layer is used as the input of the current hidden layer, which obtains an output value based on a preset function (including weights and biases) and inputs the obtained output value to the next layer.
  • the relationship between each neuron in the hidden layer and the previous layer is as follows:
  • h11 = x1 × w11 + x2 × w21 + b11;
  • h12 = x1 × w12 + x2 × w22 + b12;
  • y1 = h11 × w110 + h12 × w210 + b110;
  • y2 = h11 × w111 + h12 × w211 + b120;
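  • For illustration, a minimal runnable sketch of the forward pass expressed by these formulas, assuming two inputs, two hidden neurons, and two outputs; the concrete weight and bias values are placeholders, and any activation function is omitted, as in the formulas above:

```python
# Placeholder parameters named after the formulas above (values are assumptions).
w = {"w11": 0.1, "w21": 0.2, "w12": 0.3, "w22": 0.4,
     "w110": 0.5, "w210": 0.6, "w111": 0.7, "w211": 0.8}
b = {"b11": 0.01, "b12": 0.02, "b110": 0.03, "b120": 0.04}

def forward(x1, x2):
    # Hidden layer: each neuron combines the inputs with its weights and bias.
    h11 = x1 * w["w11"] + x2 * w["w21"] + b["b11"]
    h12 = x1 * w["w12"] + x2 * w["w22"] + b["b12"]
    # Output layer: each output neuron combines the hidden-layer outputs.
    y1 = h11 * w["w110"] + h12 * w["w210"] + b["b110"]
    y2 = h11 * w["w111"] + h12 * w["w211"] + b["b120"]
    return y1, y2

print(forward(1.0, 2.0))
```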
  • a convolutional neural network model includes an input layer, one or more convolutional layers , one or more pooling layers, and an output layer.
  • in the following, the layers between the input layer and the output layer of each neural network model are called connection layers. The parameters of each connection layer include the weight values and bias values included in the layer, and may also include other parameters, such as the coefficient values of functions such as activation functions and feedback functions.
  • if a connection layer does not have an activation function or a feedback function, the parameters of the connection layer do not include the coefficient values of activation functions and feedback functions. In short, any parameter that can be adjusted or changed during training can be included in the parameters of the layer.
  • during training, the parameters of these connection layers in the neural network model are constantly adjusted so that the output of the output layer approaches the desired output value. Once it does, the training can be stopped, and the trained neural network model can be deployed to the edge server for application.
  • the training of the neural network model is actually a continuous learning process, and the accuracy of the trained neural network model is not static: if the neural network model encounters a scene it has not learned, its accuracy may decrease over time. However, there are a large number of edge servers and their computing power is limited, so edge servers generally do not have the neural network model training function.
  • in view of this, the embodiment of the present application provides a method for configuring neural network models, which reduces the amount of data transmission and the network bandwidth requirements while ensuring the accuracy of the neural network models on edge servers.
  • the method for configuring the neural network model in the edge server provided by the embodiment of the present application applies to the system architectures shown in FIG. 1A and FIG. 1C; the cloud server in this method may be the cloud server 10 (or the processor of the cloud server 10) shown in FIG. 1A, and the edge server may be the edge server 20 (or the processor of the edge server 20).
  • FIG. 3 is a schematic flowchart of a method for configuring a neural network model in an edge server provided by an embodiment of the present application. As shown in Figure 3, the method includes the following steps:
  • step 301 the edge server receives the sampling data sent by the collection device.
  • it should be noted that the cloud server and the edge server have the same neural network model; "the same" here refers to the same structure and function, but at different times the parameters of the two may be completely the same, completely different, or partially the same.
  • the neural network model on the cloud server is referred to as the first neural network model
  • the neural network model on the edge server is referred to as the second neural network model.
  • in a possible implementation, the sampled data can be duplicated to obtain two identical copies; in one path, the sampled data is input into the second neural network model for calculation to obtain the output of the second neural network model.
  • the edge server can determine performance parameters of the second neural network model, such as accuracy and precision, according to the output result. When the accuracy rate of the second neural network model is low, the edge server may notify the cloud server to train the first neural network model that is the same as the second neural network model.
  • the second neural network model on the edge server 20 may be sent by the cloud server to the edge server, or may be preconfigured in the edge server before leaving the factory or by the user.
  • the edge server may filter the sampled data (see step 302).
  • step 302 the edge server filters the sampled data to obtain filtered sampled data.
  • the edge server may determine the similarity between the sampled data and the target historical data, and then filter the sampled data based on the similarity to obtain sampled data whose similarity with the target historical data is lower than a first preset value.
  • the sampled data with low similarity to the target historical data can represent data of new scenes collected by the collection device, which may lead to low accuracy of the second neural network model. Therefore, the edge server can filter out the sampled data of these new scenes and use it as the training data with which the cloud server subsequently (re)trains the first neural network model.
  • the historical data includes data sampled by the edge server before receiving the sampled data, and/or sample data for training the first neural network model.
  • the target historical data is a subset of the historical data.
  • the target historical data may be sampling data within a preset period of time before the sampling data to be filtered, or all historical data.
  • Determination method 1: the edge server can calculate the distance between the two according to the features of the sampled data and the features of the target historical data, such as the Euclidean distance, Manhattan distance, or Hamming distance, and use the distance as the similarity between the two.
  • specifically, the edge server may use a feature algorithm to calculate one or more features of each piece of target historical data; if there are multiple pieces of target historical data, the mean of each feature may be calculated over the values of that feature across the multiple pieces of target historical data, obtaining the mean of each feature in the above way.
  • then the edge server uses the same feature algorithm to calculate one or more features of the sampled data to be filtered, and calculates the distance between the sampled data and the target historical data according to the one or more features of the sampled data and the feature means determined above.
  • for example, if the features of the sampled data are (a1, b1, c1) and the feature means of the target historical data are (a2, b2, c2), the distance between the sampled data and the target historical data is d = √((a1-a2)² + (b1-b2)² + (c1-c2)²).
  • the distance d can be used as the similarity between the sampling data and the target historical data. It should be noted that, the foregoing manner of calculating the distance is only an example, which is not limited in this embodiment of the present application.
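  • As a concrete illustration, a minimal sketch of determination method 1, assuming each piece of data has already been reduced to a fixed-length feature vector such as (a1, b1, c1); the feature algorithm and the threshold are placeholders, and a larger distance is treated here as lower similarity:

```python
import math

def feature_mean(history_features):
    """Per-feature mean over the target historical data."""
    n = len(history_features)
    dims = len(history_features[0])
    return [sum(f[i] for f in history_features) / n for i in range(dims)]

def filter_new_scene_samples(sample_features, history_features, threshold):
    """Keep samples far from the historical feature mean, i.e. samples whose
    similarity with the target historical data is low (new-scene data)."""
    mean = feature_mean(history_features)
    kept = []
    for feats in sample_features:
        # Euclidean distance between the sample's features and the feature mean.
        d = math.sqrt(sum((x - m) ** 2 for x, m in zip(feats, mean)))
        if d > threshold:  # large distance = low similarity (assumption)
            kept.append(feats)
    return kept

history = [(1.0, 2.0, 3.0), (1.2, 2.1, 2.9)]
samples = [(1.1, 2.0, 3.0), (9.0, 8.0, 7.0)]
print(filter_new_scene_samples(samples, history, threshold=2.0))
```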
  • Determination method 2: the edge server may use the number of identical features contained in both the sampled data and the target historical data as the similarity between the two.
  • the same feature here refers to the same type of feature and the same value of the feature.
  • for example, if the features of the sampled data are (a1, b1, c1) and the feature means of the target historical data are (a2, b1, c1), then the features shared by the sampled data and the target historical data are b1 and c1, and the similarity between the two is 2.
  • the feature algorithm mentioned above can be preset, and the feature algorithm of different sampled data can be different, for example, the sampled data is an image, and the feature can be R value, G value, B value, etc.
  • alternatively, the feature may be the value of the data itself; in that case, the feature mean of multiple pieces of target historical data is simply the average of the multiple pieces of target historical data.
  • Step 303 the edge server sends the first information to the cloud server.
  • the cloud server receives the first information sent by the edge server.
  • the first information may be used to indicate whether the cloud server needs to train the first neural network model. For example, if the indication information includes 1 bit, a bit value of 0 is used to instruct the cloud server to train the first neural network model, and a bit value of 1 is used to indicate that the cloud server does not need to retrain the first neural network model. In this way, the edge server can send the first information periodically.
  • alternatively to the above explicit indication, when the edge server determines that the first neural network model needs to be retrained, it may send indication information to the cloud server, where the indication information is used to instruct the cloud server to train the first neural network model; if the edge server does not send the indication information to the cloud server, the cloud server does not need to retrain the first neural network model. The edge server may send the first information to the cloud server when it determines that a performance parameter of the second neural network model is low, for example lower than a second preset value; the performance parameters of the second neural network model include but are not limited to accuracy, precision, and confidence. Alternatively, the edge server sends the first information to the cloud server when it determines that the amount of filtered sampled data reaches a certain threshold, for example not lower than a third preset value.
  • optionally, the edge server may also send the filtered sampled data together with the first information to the cloud server, and the cloud server may use the filtered sampled data to train the first neural network model.
  • the first information is used to indicate the usage of the second neural network model on the edge server, for example, the first information includes performance parameters of the second neural network model on the edge server and the like.
  • the cloud server can judge, according to the first information, whether it is necessary to train the first neural network model in the cloud server; for example, when the accuracy rate of the second neural network model in the edge server is low, such as lower than the fourth preset value, the cloud server triggers the training of the first neural network model. For another example, when the confidence level of the second neural network model is lower than the fifth preset value, the cloud server triggers the training of the first neural network model.
  • the first information includes the filtered sampling data obtained in step 301 .
  • the cloud server can judge, according to the first information, whether it is necessary to train the first neural network model. For example, when the amount of filtered sampled data received by the cloud server reaches a certain threshold, such as not lower than the third preset value, the cloud server triggers the training of the first neural network model.
  • in addition, the above first information may also include the identifier of the first neural network model (or of the second neural network model), which is used to uniquely identify the neural network model to be trained. It should be noted that the identifiers of the first neural network model and the second neural network model are the same.
  • optionally, the cloud server can determine the first neural network model to be trained based on the information it records about the neural network models running on the edge server; in this case, the first information does not need to carry the identifier of the first neural network model.
  • steps 301 to 303 are not steps that must be executed to trigger the cloud server to train the first neural network model, therefore, they are shown in dotted boxes in FIG. 3 .
  • Step 304 the cloud server trains the first neural network model.
  • the cloud server sends the configuration parameters of the trained N layers of the first neural network model to the edge server.
  • the edge server receives the configuration parameters of the N layers sent by the cloud server.
  • the number of connection layers included in the first neural network model is greater than N.
  • depending on the implementation of step 304, the selected N layers may also be different; therefore, a detailed description is given below in conjunction with step 304 and step 305.
  • in step 304, after the cloud server determines, based on the first information, to train the first neural network model, the filtered sampled data may be used to train the first neural network model.
  • the cloud server determines the weight value of each (connection) layer when training the first neural network model.
  • the weight value of each layer is used to indicate the influence degree of the layer on the accuracy rate of the neural network model. It can be understood that the higher the weight value, the greater the impact on the accuracy of the neural network model. On the contrary, the lower the weight value, the smaller the impact on the accuracy of the neural network model.
  • the cloud server can use the training method provided by this application to train the neural network model to determine the weight of each layer.
  • the training method is introduced as follows:
  • the cloud server trains one layer of the neural network model each time, and obtains the weight of the layer according to the accuracy rate of the neural network model before training and the accuracy rate of the neural network model after training. For example, the difference between the accuracy rate of the neural network model after training and the accuracy rate of the neural network model before training is used as the weight of the layer.
  • for example, the first layer in the connection layers is trained first; that is, the parameters of the second layer and subsequent layers remain unchanged, and only the parameters of the first layer are trained.
  • when the end condition is met (for example, the neural network model converges, or a preset number of training iterations is reached), the training ends, that is, the training of this layer is completed, and the accuracy of the current neural network model is obtained.
  • for example, if the accuracy of the neural network model is 80% before training and 82% after this layer is trained, the weight of the first layer is 0.02 (82%-80%).
  • then the second layer in the connection layers is trained.
  • it should be noted that the neural network model used when training each layer should be the same, that is, the reference object of each layer is the same, so that each layer's impact on the accuracy of the neural network model can be determined on a common basis. That is, the parameters of each layer of the neural network model used when training the second layer are the same as the parameters of each layer of the neural network model used when training the first layer.
  • in this case, the parameters of the other layers except the second layer remain unchanged, and only the parameters of the second layer are trained. Assuming that after the training of the second layer is completed the accuracy of the neural network model is 88%, the weight of the second layer is then 0.08 (88%-80%).
  • the cloud server may select N layers based on the weight of each layer.
  • in one implementation, the cloud server sorts the multiple connection layers in descending order of weight, selects the top N layers, and sends the parameters of these N layers to the edge server. For example, continuing the example above, assume that the cloud server determines that the weight of the first layer is 0.02, the weight of the second layer is 0.08, the weight of the third layer is 0.12, the weight of the fourth layer is 0.15 (95%-80%), and the weight of the fifth layer is 0.16 (96%-80%); sorted in descending order of weight: the fifth layer > the fourth layer > the third layer > the second layer > the first layer. Assuming that N is 2, the cloud server can send the parameters of the top two layers (the fifth layer and the fourth layer) to the edge server.
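  • A minimal sketch of this per-layer weight estimation and top-N selection. The three callables are assumptions, not part of the embodiment: clone_model() returns a fresh copy of the same reference model, train_layer(model, layer) trains only the given layer while all other parameters stay fixed, and accuracy(model) evaluates the model on validation data:

```python
def select_top_n_layers(clone_model, train_layer, accuracy, layers, n):
    base = accuracy(clone_model())      # accuracy before training, e.g. 80%
    weights = {}
    for layer in layers:
        model = clone_model()           # same reference model for every layer
        train_layer(model, layer)       # only this layer's parameters change
        weights[layer] = accuracy(model) - base
    # Sort the layers in descending order of weight and keep the top N.
    ranked = sorted(layers, key=weights.get, reverse=True)
    return ranked[:n], weights
```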
  • in another implementation, the cloud server sends the parameters of the N layers whose performance parameters exceed a fourth preset value to the edge server.
  • for example, if the fourth preset value is 90%, and the layers after whose individual training the accuracy of the neural network model exceeds 90% are the third layer, the fourth layer, and the fifth layer, then the cloud server can send the parameters of the third layer, the fourth layer, and the fifth layer to the edge server.
  • the cloud server can also perform joint training on the N layers again.
  • the cloud server uses the filtered sampling data to train the first neural network model without determining the weight value of each layer.
  • the training method can be based on an existing training mechanism, such as performing joint training on all layers of the first neural network model.
  • Joint training refers to training the parameters of all layers of the first neural network model at the same time; after one round of joint training, the parameters of each layer may be changed.
  • the N layers may be several layers specified in the first neural network model, i.e., preset layers.
  • the preset layers include the last three connected layers in the first neural network model.
  • the specified layers may be continuous or discontinuous, which is not limited in this application. In practical applications, those skilled in the art can determine several layers that have a greater impact on the accuracy of the first neural network model as preset layers through a large amount of calculation data.
  • it should be noted that the configuration parameters of the N layers sent to the edge server in the embodiment of the present application all refer to the parameters of the N layers after training.
  • in step 305, the cloud server may send the configuration parameters of the N layers to the edge server, where the configuration parameters include the layer identifier and the parameters of each target layer. Optionally, when there are multiple neural network models on the edge server, for ease of confirmation, the cloud server can also send the identifier of the first neural network model to the edge server; that is, the configuration parameters can also include the identifier of the first neural network model.
  • FIG. 4 is a schematic diagram of a frame structure of a configuration parameter provided by an embodiment of the present application.
  • the configuration parameters include a data header and a data part, where the data header can record relevant information of the first neural network model, for example including but not limited to one or more of the following: the identifier of the first neural network model, the number of objects contained in the data region, the length of each object, and the length of the layer identifier.
  • the layer identifier and the parameters of each target layer are placed in the data part at the granularity of objects. Since the parameters contained in each layer may be different, the length of each object can be different; of course, the length of each object may also be the same, which is not limited in the embodiment of the present application.
  • it should be noted that the target layers (layer 2, layer 3, ..., layer n) shown in FIG. 4 are only an example, and the configuration parameters are not limited to layer 2, layer 3 to layer n.
  • the embodiment of the present application also does not limit the ordering of the objects; for example, they may be arranged in layer order as shown in FIG. 4, or in other ways.
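  • One possible byte layout for the frame of FIG. 4, sketched under assumptions: the field widths below (32-bit identifiers and lengths) are illustrative choices, not fixed by the embodiment:

```python
import struct

def pack_config(model_id: int, layer_params: dict) -> bytes:
    # Data header: model identifier and the number of objects in the data part.
    frame = struct.pack("<II", model_id, len(layer_params))
    for layer_id, params in layer_params.items():
        # One object per target layer: layer identifier, object length, parameters.
        frame += struct.pack("<II", layer_id, len(params)) + params
    return frame

def unpack_config(frame: bytes):
    model_id, count = struct.unpack_from("<II", frame, 0)
    offset, layers = 8, {}
    for _ in range(count):
        layer_id, length = struct.unpack_from("<II", frame, offset)
        offset += 8
        layers[layer_id] = frame[offset:offset + length]
        offset += length
    return model_id, layers

frame = pack_config(7, {2: b"\x01\x02", 3: b"\x03\x04\x05"})
print(unpack_config(frame))  # (7, {2: b'\x01\x02', 3: b'\x03\x04\x05'})
```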
  • Step 306 based on the configuration parameters of the N layers sent by the cloud server, the edge server updates the parameters of the N layers corresponding to the configuration parameters in the second neural network model running on the edge server.
  • specifically, the edge server may determine, according to the identifier of the first neural network model, the second neural network model indicated by the identifier among multiple local neural network models, and replace the parameters of the corresponding layers in the second neural network model according to the layer identifiers carried in the configuration parameters.
  • FIG. 5A is a schematic diagram of a scenario corresponding to a neural network model configuration method.
  • the cloud server can send the parameters of the (m-1)-th layer and the m-th layer of the first neural network model to the edge server, and the edge server replaces the original parameters of the (m-1)-th layer and the m-th layer in the local second neural network model.
  • in another example, the configuration parameters include the parameters of the fourth layer (denoted as parameter A) and the parameters of the fifth layer (denoted as parameter B). The edge server can update the fourth layer of its local second neural network model according to parameter A, that is, modify the parameters of the fourth layer to parameter A; similarly, it modifies the parameters of the fifth layer of the local second neural network model to parameter B. At this point, the edge server has completed the configuration of the local second neural network model.
  • the cloud server may also send the parameters of one layer of the neural network model, and the edge server replaces the parameters of the local layer based on the parameters of the layer sent by the cloud server.
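  • A minimal sketch of the edge-side update in step 306, assuming the edge server keeps its local models in a dictionary keyed by model identifier and each model maps layer identifiers to parameters; both structures are illustrative, not mandated by the embodiment:

```python
def apply_config(local_models, model_id, layer_updates):
    model = local_models[model_id]        # locate the second neural network model
    for layer_id, params in layer_updates.items():
        model[layer_id] = params          # replace only the layers being updated
    return model

models = {"model-A": {1: "old", 2: "old", 3: "old"}}
apply_config(models, "model-A", {2: "parameter A", 3: "parameter B"})
print(models["model-A"])  # layers 2 and 3 updated, layer 1 untouched
```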
  • in the above method, the cloud server trains the first neural network model and sends the configuration parameters of N layers of the first neural network model to the edge server; the edge server receives the configuration parameters of the N layers sent by the cloud server and updates the parameters of the corresponding layers of the local second neural network model, thereby ensuring the accuracy of the edge server's local second neural network model while reducing the amount of transmitted data.
  • FIG. 5C is a schematic flowchart corresponding to another method for configuring a neural network model in an edge server according to an embodiment of the present application. As shown in Figure 5C, the method includes the following steps:
  • step 501 the collection device sends sampling data to the edge server and the cloud server respectively.
  • the edge server and the cloud server respectively receive the sampling data sent by the collection device.
  • step 502a the cloud server filters the sampled data to obtain filtered sampled data.
  • for details, please refer to the process of filtering the sampled data by the edge server in step 302, which will not be repeated here.
  • step 502b the edge server sends the first information to the cloud server, and correspondingly, the cloud server receives the first information sent by the edge server.
  • this step is an optional step; for the first information, refer to the related introduction in step 303 above, which will not be repeated here.
  • steps 501 to 502b are not steps that must be executed to trigger the cloud server to train the first neural network model, therefore, they are shown in dotted boxes in FIG. 5C .
  • the cloud server filters the sampled data, which can reduce the CPU overhead of filtering the sampled data on the edge server.
  • FIG. 6 is a schematic diagram of interaction between a cloud server and different edge servers.
  • the cloud server can interact with any other edge server based on the method shown in FIG. 3 .
  • any other edge server here is an edge server that has established a connection with the cloud server.
  • the cloud server 10 can interact with any edge server 20.
  • if the neural network models deployed on multiple edge servers under the cloud server are the same, the cloud server can, after retraining the neural network model, synchronously send the configuration parameters of the neural network model to those edge servers.
  • the configuration parameters of the neural network models can also be exchanged between the cloud servers.
  • for example, the cloud server 10 sends the configuration parameters of the neural network model to the cloud server 11, and then the cloud server 10 and the cloud server 11 send the configuration parameters to their respective edge servers in the manner shown in FIG. 5A or FIG. 5B, to update the parameters of some layers of the neural network models on the edge servers.
  • optionally, when the cloud server manages edge servers, it can record which neural network models are used on each edge server and which edge servers use the same neural network model. Cloud servers can also exchange this recorded information (hereinafter referred to as model distribution information), so that a cloud server can determine to which cloud servers to send the configuration parameters of a trained neural network model.
  • the model distribution information includes: a cloud server identifier, an edge server identifier, and an identifier of a neural network model used by the edge server.
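  • A minimal sketch of such a model distribution record and of the fan-out lookup it enables; the record layout and helper are assumptions for illustration:

```python
from dataclasses import dataclass

@dataclass
class ModelDistribution:
    cloud_server_id: str
    edge_server_id: str
    model_id: str

def edges_running(records, model_id):
    """Edge servers to which updated configuration parameters of the given
    neural network model should be sent."""
    return [r.edge_server_id for r in records if r.model_id == model_id]

records = [ModelDistribution("cloud-10", "edge-20a", "model-A"),
           ModelDistribution("cloud-10", "edge-20b", "model-A"),
           ModelDistribution("cloud-11", "edge-20c", "model-B")]
print(edges_running(records, "model-A"))  # ['edge-20a', 'edge-20b']
```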
  • for example, each edge cloud server can send its recorded model distribution information to a central cloud server (such as a public cloud server). In addition to sending the configuration parameters of the trained neural network model A to the edge servers corresponding to the edge cloud server A, the edge cloud server A may also send the configuration parameters of the neural network model A to the central cloud server.
  • the central cloud server is responsible for management and coordination.
  • the central cloud server can send the configuration parameters of neural network model A to edge cloud server B and edge cloud server C.
  • optionally, the other cloud servers that have the same neural network model may also be determined based on information configured by the user on the cloud server. This method is simple and convenient, and has strong practicability.
  • the above method can improve the parameter configuration efficiency of the neural network model on multiple edge servers, reduce the configuration delay, reduce the amount of network data transmission, and save bandwidth.
  • the parameter update process of the neural network model of the edge server is introduced above, and the application method of the neural network model of the edge server in the embodiment of the present application is introduced as follows.
  • FIG. 7 is a data processing method provided by an embodiment of the present application.
  • the data processing method provided by the embodiment of the present application applies to the edge server in the frameworks shown in FIG. 1A and FIG. 1C, and the method can be executed by the edge server or by a component of the edge server, such as a processor. Taking execution by the edge server as an example, the data processing method is described below. As shown in FIG. 7, the method includes:
  • Step 701: the edge server receives sampled data sent by multiple collection devices.
  • For example, the edge server receives video data sent by a camera, IOT data sent by sensors, and so on. The video data consists of frame-by-frame images, and the IOT data may include gas concentration information detected by a gas detector, the temperature signal detected by a temperature sensor, the heart rate value detected by a heart rate sensor, the snoring value detected by a snoring sensor, and so on.
  • Step 702: the edge server inputs the sampled data into the corresponding neural network model to obtain the detection result of the neural network model.
  • The neural network model used by the edge server may differ for different sampled data. A neural network model may have functions such as target detection, target recognition, and prediction, where target detection and target recognition are non-predictive functions. Examples of neural network models with non-predictive and predictive functions are given below:
  • 1) Neural network models with non-predictive functions.
  • For example, neural network model 1 has a target detection function, such as detecting whether a target instance in an image is wearing a safety helmet. It should be understood that there may be multiple target instances in the same frame of image. A target instance here refers to an object extracted from the image that needs to be detected, and the target instance can differ between algorithms: in a helmet detection algorithm, the target instance is the user's head, while in a work-clothes detection algorithm, the target instance is the user's torso.
  • The output of neural network model 1 may include a detection result indicating whether the target instance is wearing a safety helmet, and may also include a confidence level of the detection result. The confidence level is used to characterize the accuracy of the detection result; in other words, it represents the possibility, or probability, that the target instance determined by the neural network model is or is not wearing a safety helmet. For example, one set of detection results output by neural network model 1 indicates that target instance 1 is not wearing a helmet, with a confidence level of 99%; that is, there is a 99% probability that target instance 1 is not wearing a hard hat. Another set of detection results indicates that target instance 2 is wearing a hard hat, with a confidence level of 40%; that is, there is a 40% probability that target instance 2 is wearing a hard hat. This can occur because the user may be wearing an accessory similar to a hard hat, such as a motorcycle helmet, and the model cannot determine whether the accessory is a hard hat.
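  • Purely as an illustration of how such (detection result, confidence) pairs could be consumed downstream, here is a minimal sketch; the threshold value and field names are assumptions, not from the patent:

```python
# Hypothetical sketch: acting only on sufficiently confident "no helmet" detections.
detections = [
    {"instance": 1, "wearing_helmet": False, "confidence": 0.99},
    {"instance": 2, "wearing_helmet": True,  "confidence": 0.40},
]

CONFIDENCE_THRESHOLD = 0.90  # assumed value for illustration

for det in detections:
    if not det["wearing_helmet"] and det["confidence"] >= CONFIDENCE_THRESHOLD:
        # Only instance 1 triggers an alarm; instance 2 is too uncertain either way.
        print(f"alarm: instance {det['instance']} is not wearing a helmet")
```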
  • Helmet detection is used as an example above. The target detection function can also include, but is not limited to: detecting whether the target instance is wearing work clothes, detecting whether the target instance is smoking, detecting the type of the target instance (such as distinguishing cats from dogs), detecting whether a user has fallen, and so on. These are not enumerated further here.
  • 2) Neural network models with predictive functions.
  • For example, neural network model 2 has a prediction function. A prediction function typically predicts the future value of a feature based on how that feature of certain data has changed over a historical period. For example, the task volume in a future period can be predicted based on the historical number of tasks, so that supply can be adjusted in advance according to the predicted task volume. For another example, whether an area such as a mine will collapse in the future can be predicted based on geological cracks in that area.
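  • As a toy illustration of single-dimensional prediction (a deliberately simple stand-in for neural network model 2, not the method claimed here), task volume could be forecast from a sliding window of historical counts:

```python
# Hypothetical sketch: forecasting the next period's task volume from history.
# A real system would use a trained model; a window average keeps the idea visible.

def forecast_next(history, window=3):
    """Predict the next value as the mean of the last `window` observations."""
    recent = history[-window:]
    return sum(recent) / len(recent)

task_counts = [120, 130, 125, 140, 150]  # tasks observed per period (made-up data)
predicted = forecast_next(task_counts)
print(f"predicted task volume next period: {predicted:.1f}")  # 138.3
# Supply can then be adjusted in advance according to `predicted`.
```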
  • The above examples involve prediction from single-dimensional data. The embodiments of the present application can also combine multi-dimensional data for prediction. For example, the weather can be predicted by combining temperature, wind force, humidity, and other values; for another example, potential diseases that a user may have can be predicted based on the user's behavior, face, actions, and physical sign data, and so on.
  • The edge server can distinguish the sampled data according to parameters such as the identifier of the collection device and the port number on which the sampled data is received, and call the neural network model corresponding to the sampled data for processing. The identifier of a collection device uniquely identifies that collection device and can be, for example, the IP address or device ID of the collection device. The port number can refer to a physical port or a logical port, which is not limited in the embodiments of the present application. Of course, the sampled data can also be distinguished in other ways, which is likewise not limited in this application.
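  • A minimal sketch of the dispatch just described, with hypothetical identifiers (the patent leaves the exact mapping open):

```python
# Hypothetical sketch: routing sampled data to the model registered for its source.
# Keys could equally be (device IP, port) pairs; the mapping itself is the point.

model_for_source = {
    ("camera-01", 5000): "helmet_detection_model",
    ("gas-sensor-07", 6000): "gas_level_model",
}

def handle_sample(device_id, port, sample):
    model_name = model_for_source.get((device_id, port))
    if model_name is None:
        return None  # unknown source; real code might log and drop it
    # In a real edge server this would run inference with the named model.
    return f"dispatched sample to {model_name}"

print(handle_sample("camera-01", 5000, b"\x00\x01"))
```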
  • Step 703: the edge server determines a decision based on multiple detection results, and performs corresponding operations based on the decision.
  • The multiple detection results include results output by neural network models, and may also include other detection results, which is not limited in the embodiments of the present application. The following description takes as an example the case where the multiple detection results are detection results output by multiple neural network models, or multiple detection results output by one or more neural network models.
  • The edge server in the embodiments of this application can make decisions by fusing data of various types, such as structured data and unstructured data, Internet of Things data and video data, or data output by neural network models with predictive functions and data output by neural network models with non-predictive functions.
  • For example, the edge server receives video data of a user captured by a camera, and inputs the images in the video into neural network model a to detect whether the target instance has fallen. It should be understood that if an elderly person falls, complications can easily arise; therefore, the embodiments of the present application take detecting whether the user has fallen as an example.
  • The edge server can also input the video data into neural network model b, which can predict potential diseases the user may have based on the video data. For example, a user with a toothache or chest pain may exhibit behaviors such as cupping the chin or clutching the chest, which may be signs preceding a heart attack. Neural network model b and neural network model a may be the same model, which is not limited in this application.
  • The edge server can also receive the heart rate, snoring, body temperature, and other values sent by a heart rate sensor, a snoring sensor, a body temperature sensor, and so on, and input one or more of these parameters into neural network model c, which detects whether the user's physical signs are normal. It should be understood that the above sampled data should belong to the same user.
  • The edge server can determine a decision based on the detection results of neural network models a to c. For example, if the detection result of neural network model a indicates that the user has fallen, the detection result of neural network model b indicates that the user has no potential disease, and the detection result of neural network model c indicates that the user's current physical signs are normal, the decision made by the edge server may be simply to generate a log record, noting the time at which the user fell, and so on.
  • For another example, if neural network model a indicates that the user has fallen, neural network model b indicates no potential disease, but neural network model c indicates that the user's current physical signs are abnormal, the decision made by the edge server may be to remind the user to go to the hospital for an examination, or to notify users associated with the user, for example by sending information to the user's family and friends reminding them to pay attention to the user's health.
  • For another example, if neural network model a indicates that the user has fallen, neural network model b indicates a potential disease, and neural network model c indicates that the user's physical signs are abnormal, the determined decision may be to call the 120 emergency number, and so on.
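  • The three-way decision just described can be summarized in a small rule table. The following sketch mirrors the examples above; the action strings and function structure are illustrative, not the patent's implementation:

```python
# Hypothetical sketch: fusing the outputs of models a, b, and c into one decision.

def decide(fell: bool, potential_disease: bool, signs_abnormal: bool) -> str:
    if not fell:
        return "no action"
    if potential_disease and signs_abnormal:
        return "call 120 emergency services"
    if signs_abnormal:
        return "remind user to visit hospital / notify family and friends"
    return "log only: record the time of the fall"

# The three scenarios from the text:
print(decide(fell=True, potential_disease=False, signs_abnormal=False))  # log only
print(decide(fell=True, potential_disease=False, signs_abnormal=True))   # remind/notify
print(decide(fell=True, potential_disease=True,  signs_abnormal=True))   # call 120
```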
  • Fusing multiple detection results to determine a decision enables more comprehensive and accurate processing, helps users discover hidden dangers in a timely manner, supports effective response guidance, and improves the user experience.
  • It should be noted that this application uses an edge server and a cloud server as examples. In practice, the technical solution provided by this application can be applied to any two devices that have communication and computing functions; alternatively, the edge server and the cloud server may have different names in different scenarios, none of which is limited in the embodiments of the present application.
  • Based on the same idea as the method embodiments, an embodiment of the present application further provides a configuration device, which is configured to execute the method executed by the cloud server in the above method embodiments. As shown in FIG. 8, the configuration device 800 includes a training module 801 and a sending module 802. Optionally, it further includes a receiving module 803 and a processing module 804 (since these modules are optional, they are shown with dashed boxes in FIG. 8). The modules are connected to one another through communication channels. In one embodiment, the configuration device is used to implement the method executed by the cloud server in the above method embodiments.
  • The training module 801 is used to train the first neural network model, where the first neural network model includes K layers. For the specific implementation, refer to the description of step 304 in FIG. 3 or step 503 in FIG. 5C, which is not repeated here.
  • The sending module 802 is configured to send, to the edge server, the configuration parameters of the N layers of the first neural network model that meet preset conditions, where N is a positive integer smaller than K. For the specific implementation, refer to the description of step 305 in FIG. 3 or step 504 in FIG. 5C, which is not repeated here.
  • In one possible design, the N layers are preset layers in the first neural network model.
  • In one possible design, each layer in the first neural network model has a weight, and the weight indicates the degree of influence of that layer on the accuracy of the first neural network model; the weight of any one of the N layers is greater than the weight of any one of the remaining K-N layers.
  • In one possible implementation, N is a first preset value, or N is the number of layers whose weight exceeds a preset threshold.
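  • To make the selection rule concrete, here is a sketch, with illustrative weight values, that sorts layers by weight and takes either a fixed N or all layers above a threshold; the exact bookkeeping is left open by the patent:

```python
# Hypothetical sketch: choosing the N layers whose configuration parameters are sent.
# Each weight indicates how strongly that layer influences model accuracy,
# e.g. the accuracy gain observed when only that layer was retrained.

layer_weights = {"layer1": 0.02, "layer2": 0.08, "layer3": 0.12,
                 "layer4": 0.15, "layer5": 0.16}

def select_layers(weights, n=None, threshold=None):
    """Pick layers by fixed count N, or all layers whose weight exceeds a threshold."""
    ranked = sorted(weights, key=weights.get, reverse=True)
    if n is not None:
        return ranked[:n]
    return [layer for layer in ranked if weights[layer] > threshold]

print(select_layers(layer_weights, n=2))            # ['layer5', 'layer4']
print(select_layers(layer_weights, threshold=0.10)) # ['layer5', 'layer4', 'layer3']
```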
  • In one possible implementation, the configuration parameters include the trained parameters and the layer identifier of each of the N layers.
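  • The payload can be pictured as a header plus one (layer identifier, parameters) entry per selected layer. The following serialization is only a sketch under assumed field names, not the patent's wire format:

```python
# Hypothetical sketch: building and applying an N-layer configuration payload.

import json

def build_config(model_id, trained_layers):
    """trained_layers: {layer_id: parameter_list} for the N selected layers."""
    return json.dumps({
        "model_id": model_id,                      # identifies the first/second model
        "num_layers": len(trained_layers),         # header-style bookkeeping
        "layers": [{"layer_id": lid, "params": p}  # one entry per selected layer
                   for lid, p in trained_layers.items()],
    })

def apply_config(edge_models, payload):
    """Edge-server side: overwrite only the layers named in the payload."""
    cfg = json.loads(payload)
    model = edge_models[cfg["model_id"]]           # the local second neural network model
    for entry in cfg["layers"]:
        model[entry["layer_id"]] = entry["params"]

edge_models = {"model-A": {"layer4": [0.0], "layer5": [0.0]}}
payload = build_config("model-A", {"layer4": [0.11, 0.52], "layer5": [0.93]})
apply_config(edge_models, payload)
print(edge_models)  # layers 4 and 5 now carry the retrained parameters
```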
  • In one possible implementation, the receiving module 803 is configured to receive the first information sent by the edge server, and the first neural network model is trained according to the first information. For the first information sent by the edge server, refer to the description of step 303 in FIG. 3 or step 502b in FIG. 5C, which is not repeated here.
  • In one possible implementation, the first information includes one or more of the following: the training data, the performance parameters of the second neural network model, and indication information used to trigger training of the first neural network model; the performance parameters include one or more of accuracy, confidence, and precision.
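  • A compact sketch of what such a first-information message might carry; all field names and values here are hypothetical:

```python
# Hypothetical sketch: the first information an edge server reports to the cloud server.
first_information = {
    "model_id": "model-A",
    "performance": {"accuracy": 0.78, "confidence": 0.64, "precision": 0.80},
    "trigger_training": True,  # indication information: performance fell below a preset value
    "training_data": ["sample-103", "sample-117"],  # optional references to filtered samples
}
```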
  • In one possible implementation, the receiving module 803 is configured to receive the sampled data sent by the collection device; for the specific implementation, refer to the description of step 502a in FIG. 5C, which is not repeated here.
  • The processing module 804 is used to determine, from the sampled data, the training data used to train the first neural network model, and is also used to train the first neural network model when the amount of the determined training data exceeds a second preset value.
  • In one possible implementation, the training data includes sampled data whose similarity to historical data is smaller than a third preset value.
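  • One way to read this criterion: a sample qualifies as training data only if it looks unlike historical data, so retraining focuses on new scenes. A sketch under an assumed feature-match similarity measure (one of several possibilities the document allows; the threshold is illustrative):

```python
# Hypothetical sketch: keeping only samples dissimilar to historical data.
# Similarity here is the fraction of matching features between a sample and
# the historical feature means; other measures (e.g. distances) also fit.

def similarity(sample, history_mean):
    matches = sum(1 for a, b in zip(sample, history_mean) if a == b)
    return matches / len(sample)

THIRD_PRESET_VALUE = 0.5  # assumed threshold for illustration
history_mean = (10, 20, 30)

samples = [(10, 20, 31), (90, 80, 70)]
training_data = [s for s in samples if similarity(s, history_mean) < THIRD_PRESET_VALUE]
print(training_data)  # [(90, 80, 70)]; (10, 20, 31) is too similar to history
```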
  • An embodiment of the present application also provides a computer storage medium that stores computer instructions. When the computer instructions are run on a storage device, the storage device executes the above-mentioned method steps to implement the method executed by the cloud server in the above embodiments; refer to the description of steps 303 to 305 in FIG. 3, or to the description of steps 501 to 504 in FIG. 5C, the details of which are not repeated here.
  • An embodiment of the present application also provides a computer program product. When the computer program product is run on a computer, it causes the computer to perform the above-mentioned related steps, so as to implement the method performed by the cloud server in the above embodiments; refer to the description of steps 303 to 305 in FIG. 3, or to the description of steps 501 to 504 in FIG. 5C, the details of which are not repeated here.
  • In addition, an embodiment of the present application also provides an apparatus, which may specifically be a chip, a component, or a module, and which may include a processor and a memory connected to each other. The memory is used to store computer-executable instructions, and when the apparatus is running, the processor can execute the computer-executable instructions stored in the memory so that the chip executes the methods performed by the cloud server in the above method embodiments; refer to the description of steps 303 to 305 in FIG. 3, or to the description of steps 501 to 504 in FIG. 5C, the details of which are not repeated here.
  • The storage devices, computer storage media, computer program products, and chips provided in the embodiments of the present application are all used to execute the method corresponding to the cloud server provided above; therefore, for the beneficial effects they can achieve, refer to the beneficial effects of the corresponding method provided above, which are not repeated here.
  • In the several embodiments provided in this application, it should be understood that the disclosed devices and methods may be implemented in other ways. For example, the device embodiments described above are only illustrative: the division into modules or units is only a division by logical function, and there may be other ways of dividing them in actual implementation. For example, multiple units or components may be combined or integrated into another device, or some features may be omitted or not implemented. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be implemented through some interfaces, and the indirect coupling or communication connection between devices or units may be electrical, mechanical, or in other forms.
  • A unit described as a separate component may or may not be physically separate, and a component shown as a unit may be one physical unit or multiple physical units, which may be located in one place or distributed across multiple different places. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • In addition, the functional units (or modules) in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The above integrated units can be implemented in the form of hardware or in the form of software functional units.
  • If an integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a readable storage medium. Based on this understanding, the technical solution of the embodiments of the present application, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for causing a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods in the various embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
  • Optionally, the computer-executable instructions in the embodiments of the present application may also be referred to as application program code, which is not specifically limited in the embodiments of the present application.
  • In the above embodiments, the implementation may be wholly or partly by software, hardware, firmware, or any combination thereof. When software is used, the implementation may be wholly or partly in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions according to the embodiments of the present application are generated in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (such as infrared, radio, or microwave).
  • The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (such as a floppy disk, a hard disk, or a magnetic tape), an optical medium (such as a DVD), or a semiconductor medium (such as a solid state disk (SSD)), and so on.
  • The various illustrative logic units and circuits described in the embodiments of the present application can implement or operate the described functions through a general-purpose processor, a digital signal processor, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination of the above. The general-purpose processor may be a microprocessor; optionally, the general-purpose processor may also be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented by a combination of computing devices, such as a digital signal processor and a microprocessor, multiple microprocessors, one or more microprocessors combined with a digital signal processor core, or any other similar configuration.
  • The steps of the methods or algorithms described in the embodiments of the present application may be directly embedded in hardware, in a software unit executed by a processor, or in a combination of the two. The software unit may be stored in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, a register, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium in the art. For example, the storage medium can be connected to the processor so that the processor can read information from the storage medium and write information to the storage medium; optionally, the storage medium can also be integrated into the processor. The processor and the storage medium can be provided in an ASIC.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The present application provides a system and apparatus for configuring a neural network model in an edge server. The system includes a cloud server and an edge server. The cloud server is used to train a first neural network model and to send, to the edge server, the configuration parameters of N layers of the first neural network model that meet preset conditions; the edge server updates, based on the configuration parameters of the N layers, the parameters of the N layers corresponding to the configuration parameters in a second neural network model on the edge server. In this way, the edge server can update the second neural network model in a timely manner, ensuring the accuracy of the second neural network model on the edge server; at the same time, because only the configuration parameters of some layers are transmitted, the amount of data transmitted is reduced and the bandwidth requirement is lowered.


Claims (17)

  1. A system for configuring a neural network model in an edge server, characterized by comprising:
    a cloud server, configured to train a first neural network model, the first neural network model comprising K layers;
    wherein the cloud server is further configured to send, to an edge server, configuration parameters of N layers of the first neural network model that meet preset conditions, N being a positive integer smaller than K;
    and the edge server is configured to update, according to the configuration parameters of the N layers, the parameters of the N layers corresponding to the configuration parameters in a second neural network model running on the edge server, the first neural network model and the second neural network model having the same structure.
  2. The system according to claim 1, characterized in that the N layers are preset layers in the first neural network model.
  3. The system according to claim 1, characterized in that each layer in the first neural network model has a weight, the weight being used to indicate the degree of influence of that layer on the accuracy of the first neural network model;
    and the weight of any one of the N layers is greater than the weight of any one of the K-N layers.
  4. The system according to any one of claims 1 to 3, characterized in that
    the edge server is further configured to send the first information to the cloud server when a performance parameter of the second neural network model is lower than a first preset value, instructing the cloud server to train the first neural network model;
    and the cloud server is configured to receive the first information sent by the edge server and train the first neural network model according to the first information.
  5. The system according to any one of claims 1 to 4, characterized in that the edge server is further configured to receive sampled data sent by a collection device, determine, from the sampled data, training data for training the first neural network model, and, when the amount of the determined training data exceeds a second preset value, send the first information to the cloud server, instructing the cloud server to train the first neural network model;
    and the cloud server is configured to receive the first information sent by the edge server and train the first neural network model according to the first information.
  6. The system according to claim 4 or 5, characterized in that the first information includes one or more of the following:
    the training data, a performance parameter of the second neural network model, and indication information used to trigger training of the first neural network model; wherein the performance parameter includes one or more of accuracy, confidence, and precision.
  7. The system according to any one of claims 1 to 6, characterized in that the cloud server is further configured to receive sampled data sent by a collection device, determine, from the sampled data, training data for training the first neural network model, and train the first neural network model when the amount of the determined training data exceeds a third preset value.
  8. The system according to any one of claims 5 to 7, characterized in that the training data includes sampled data whose similarity to historical data is smaller than a fourth preset value.
  9. A configuration device, characterized in that the device comprises:
    a training module, configured to train a first neural network model, the first neural network model comprising K layers;
    and a sending module, configured to send, to an edge server, configuration parameters of N layers of the first neural network model that meet preset conditions, N being a positive integer smaller than K.
  10. The device according to claim 9, characterized in that the N layers are preset layers in the first neural network model.
  11. The device according to claim 9, characterized in that each layer in the first neural network model has a weight, the weight being used to indicate the degree of influence of that layer on the accuracy of the first neural network model;
    and the weight of any one of the N layers is greater than the weight of any one of the K-N layers.
  12. The device according to any one of claims 9 to 11, characterized in that the device further comprises:
    a receiving module, configured to receive first information sent by the edge server, the first neural network model being trained according to the first information.
  13. The device according to claim 12, characterized in that the first information includes one or more of the following:
    the training data, a performance parameter of the second neural network model, and indication information used to trigger training of the first neural network model; wherein the performance parameter includes one or more of accuracy, confidence, and precision.
  14. The device according to any one of claims 9 to 11, characterized in that the device further comprises:
    a receiving module, configured to receive sampled data sent by a collection device;
    and a processing module, configured to determine, from the sampled data, training data for training the first neural network model, and further configured to train the first neural network model when the amount of the determined training data exceeds a second preset value.
  15. The device according to claim 13 or 14, characterized in that the training data includes sampled data whose similarity to historical data is smaller than a third preset value.
  16. A storage device, characterized in that the storage device comprises a processor and a memory;
    the memory is configured to store computer program instructions;
    and the processor executes the computer program instructions in the memory to perform the functions of the configuration device according to any one of claims 9 to 15.
  17. A computer-readable storage medium, characterized in that, when the computer-readable storage medium is executed by a computing device, the computing device performs the functions performed by the configuration device according to any one of claims 9 to 15.
PCT/CN2022/092414 2021-06-30 2022-05-12 System and apparatus for configuring neural network model in edge server WO2023273629A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110738374.3 2021-06-30
CN202110738374.3A CN115546821A (zh) 2021-06-30 2021-06-30 System and apparatus for configuring neural network model in edge server

Publications (1)

Publication Number Publication Date
WO2023273629A1 true WO2023273629A1 (zh) 2023-01-05

Family

ID=84692481

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/092414 WO2023273629A1 (zh) 2021-06-30 2022-05-12 一种配置边缘服务器中神经网络模型的系统及装置

Country Status (2)

Country Link
CN (1) CN115546821A (zh)
WO (1) WO2023273629A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200074282A1 (en) * 2018-08-31 2020-03-05 Element Ai Inc. Data point suitability determination from edge device neural networks
CN111445026A (zh) * 2020-03-16 2020-07-24 Southeast University Multi-path inference acceleration method for deep neural networks for edge intelligence applications
US20200272899A1 (en) * 2019-02-22 2020-08-27 Ubotica Technologies Limited Systems and Methods for Deploying and Updating Neural Networks at the Edge of a Network
CN111625361A (zh) * 2020-05-26 2020-09-04 East China Normal University Joint learning framework based on collaboration between a cloud server and IoT devices

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116095089A (zh) * 2023-04-11 2023-05-09 Yunnan Yuanxin Technology Co., Ltd. Remote sensing satellite data processing method and system
CN116629386A (zh) * 2023-07-21 2023-08-22 Alipay (Hangzhou) Information Technology Co., Ltd. Model training method and apparatus
CN116629386B (zh) * 2023-07-21 2023-09-19 Alipay (Hangzhou) Information Technology Co., Ltd. Model training method and apparatus
CN117689041A (zh) * 2024-01-26 2024-03-12 Xidian University Cloud-device integrated embedded large language model training method and language question answering method
CN117689041B (zh) * 2024-01-26 2024-04-19 Xidian University Cloud-device integrated embedded large language model training method and language question answering method

Also Published As

Publication number Publication date
CN115546821A (zh) 2022-12-30

Similar Documents

Publication Publication Date Title
WO2023273629A1 (zh) 一种配置边缘服务器中神经网络模型的系统及装置
CN111723786B (zh) Safety helmet wearing detection method and apparatus based on single-model prediction
US9781575B1 (en) Autonomous semantic labeling of physical locations
US20170046574A1 (en) Systems and Methods for Categorizing Motion Events
US10043078B2 (en) Virtual turnstile system and method
US11232327B2 (en) Smart video surveillance system using a neural network engine
JPWO2018180588A1 (ja) Face image matching system and face image retrieval system
CN108363997A (zh) A real-time tracking method for a specific person in video
WO2021063056A1 (zh) Facial attribute recognition method and apparatus, electronic device, and storage medium
US20220189001A1 (en) Rail feature identification system
CN115775085B (zh) A smart city management method and system based on digital twins
CN107657232A (zh) An intelligent pedestrian recognition method and system
WO2023216609A1 (zh) Target behavior recognition method and apparatus based on audio-visual feature fusion, and application thereof
Das et al. Heterosense: An occupancy sensing framework for multi-class classification for activity recognition and trajectory detection
Zhang A cloud-based platform for big data-driven cps modeling of robots
JP7457436B2 (ja) 少数ショット時間的行動局所化を容易化するシステム、方法、プログラム
EP4287145A1 (en) Statistical model-based false detection removal algorithm from images
De et al. Fall detection approach based on combined displacement of spatial features for intelligent indoor surveillance
WO2022095807A1 (zh) Task learning system and method, and related device
US11823452B2 (en) Video analytics evaluation
Xia et al. SCSS: An intelligent security system to guard city public safe
Dai et al. Trajectory outlier detection based on dbscan and velocity entropy
Choujaa et al. Activity recognition from mobile phone data: State of the art, prospects and open problems
Shirsat et al. Optimization-enabled deep stacked autoencoder for occupancy detection
US20230011337A1 (en) Progressive deep metric learning

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22831456

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE