CN114237861A - Data processing method and equipment thereof - Google Patents

Data processing method and equipment thereof

Info

Publication number
CN114237861A
Authority
CN
China
Prior art keywords
target task
equipment
task
target
arbitration result
Prior art date
Legal status
Pending
Application number
CN202010943745.7A
Other languages
Chinese (zh)
Inventor
朱志峰
李阜
吴义镇
Current Assignee
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN202010943745.7A priority Critical patent/CN114237861A/en
Publication of CN114237861A publication Critical patent/CN114237861A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • G06F9/4843Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The embodiments of this application disclose a data processing method and a device, for use in a distributed task scheduling system. The method includes the following steps: a first device receives a target task sent by a second device, where the target task includes task description information and is generated by the second device; the first device obtains an arbitration result according to the target task, where the arbitration result indicates a score for the first device to execute the target task; and the first device sends the arbitration result to the second device, so that the second device can determine, according to the arbitration result, that the first device is the device to execute the target task. Because the first device derives the arbitration result from the target task sent by the second device and returns it to the second device, the second device can select the executing device directly from the arbitration results, which reduces cross-device interaction and improves interaction efficiency.

Description

Data processing method and equipment thereof
Technical Field
The embodiments of this application relate to the field of computer technology, and in particular to a data processing method and a data processing device.
Background
With the development of artificial intelligence (AI) technology and rising living standards in recent years, people pay increasing attention to the comfort, safety and convenience of their living environment. To meet this demand, smart home systems have developed rapidly. A smart home system uses Internet-of-Things technology to make the various devices in a user's home serve the user's way of life, creating a more comfortable, environmentally friendly, convenient and intelligent living environment.
Among the devices covered by a smart home system, some have limited hardware resources. For example, the chip of a household cat-eye (peephole) camera provides only 0.1 T of computing power, so local stranger-recognition AI takes more than a hundred milliseconds to compute and its accuracy is not high. Such a local stranger-recognition task can therefore be executed with the help of other, more capable devices in the smart home system's home network.
In existing solutions, the smart home system has a central manager that is responsible for the unified management and scheduling of cluster resources and for handling requests from the devices covered by the smart home system. In this arrangement, after a device sends a task request to the central manager, the central manager must schedule the task and forward the request to another device capable of executing it, which adds a round of cross-device interaction and reduces interaction efficiency.
Disclosure of Invention
The embodiments of this application provide a data processing method in which a first device receives a target task sent by a second device, obtains an arbitration result according to the target task, and sends the arbitration result to the second device, so that the second device can determine the device to execute the target task according to the arbitration result of the first device. In this process no intermediate device such as a central manager is needed for task scheduling, which reduces cross-device interaction and improves interaction efficiency.
The first aspect of the embodiments of the present application provides a data processing method.
The method includes: a first device receives a target task sent by a second device, where the target task includes task description information and is generated by the second device; the first device obtains an arbitration result according to the target task, where the arbitration result indicates a score for the first device to execute the target task; and the first device sends the arbitration result to the second device, so that the second device determines, according to the arbitration result, that the first device is the device to execute the target task.
In this embodiment, the first device receives the target task sent by the second device and obtains the arbitration result according to the target task, so the second device can determine the executing device according to the arbitration result of the first device. No intermediate device such as a central manager is needed for task scheduling, which reduces cross-device interaction and improves interaction efficiency.
In a possible implementation of the method of the first aspect, before the first device obtains the arbitration result according to the target task, the first device obtains dynamic resource information, which represents the real-time hardware resource usage of the first device, and computes a virtualization computing power from the dynamic resource information, where the virtualization computing power represents the computing power the first device can support in its current state. Obtaining the arbitration result according to the target task then includes: the first device obtains the arbitration result according to the virtualization computing power and the target task.
In this embodiment, the first device derives the virtualization computing power from the dynamic resource information and obtains the arbitration result from the virtualization computing power and the target task, which improves the accuracy of the arbitration result.
In a possible implementation of the method of the first aspect, obtaining the arbitration result according to the virtualization computing power and the target task includes: the first device obtains task resource information according to the target task, where the task resource information represents the hardware resources required to execute the target task; and the first device obtains the arbitration result according to the task resource information and the virtualization computing power.
In this embodiment, the first device obtains the arbitration result from the task resource information and the virtualization computing power, which improves the accuracy of the arbitration result.
A second aspect of the embodiments of the present application provides a data processing method.
The second device sends a target task to a first device, where the target task includes task description information and is generated by the second device; the second device receives an arbitration result sent by the first device, where the arbitration result indicates a score for the first device to execute the target task; and the second device determines, according to the arbitration result, that the first device is the device to execute the target task.
In this embodiment, the first device receives the target task sent by the second device and obtains the arbitration result according to the target task, so the second device can determine the executing device according to the arbitration result of the first device. No intermediate device such as a central manager is needed for task scheduling, which reduces cross-device interaction and improves interaction efficiency.
In a possible implementation of the method of the second aspect, sending the target task to the first device includes: the second device sends the target task to a plurality of first devices. Receiving the arbitration result sent by the first device includes: the second device receives a plurality of arbitration results sent by the plurality of first devices, the arbitration results corresponding one-to-one to the first devices. Determining that the first device is the device to execute the target task includes: the second device obtains a target arbitration result from the plurality of arbitration results, where the target arbitration result is the optimal one among them, and the second device determines, according to the target arbitration result, that the corresponding target device is the device to execute the target task, where the target device is one of the plurality of first devices.
In this embodiment, the second device picks the optimal arbitration result among the plurality of arbitration results and, according to it, determines the corresponding target device as the device to execute the target task, which improves the efficiency of executing the target task.
In a possible implementation of the method of the second aspect, before the second device sends the target task to the first device, the method further includes: the second device determines whether it can execute the target task itself; if it cannot, the second device is triggered to send the target task to the first device.
In this embodiment, before the target task is sent to the first device, the second device checks whether it can execute the target task itself; if it can, the task is not sent, and if it cannot, the task is sent, which improves the flexibility of the scheme.
In a possible implementation of the method of the second aspect, after the second device determines, according to the arbitration result, that the first device is the device to execute the target task, the second device sends target task data to the first device, where the target task data is used by the first device to execute the target task.
In this embodiment, after the executing device has been determined, the target task data is sent to that device, which improves the completeness of the scheme.
A third aspect of embodiments of the present application provides an apparatus.
An apparatus, comprising:
a receiving unit, configured to receive a target task sent by a second device, where the target task includes task description information and is generated by the second device;
a processing unit, configured to obtain an arbitration result according to the target task, where the arbitration result indicates a score for the first device to execute the target task;
and a sending unit, configured to send the arbitration result to the second device, so that the second device determines, according to the arbitration result, that the first device is the device to execute the target task.
Optionally, the apparatus further comprises:
an obtaining unit, configured to obtain dynamic resource information, where the dynamic resource information represents the real-time hardware resource usage of the first device;
the processing unit is further configured to compute a virtualization computing power from the dynamic resource information, where the virtualization computing power represents the computing power supported by the first device in its current state;
the processing unit is specifically configured to obtain the arbitration result according to the virtualization computing power and the target task.
Optionally, the processing unit is specifically configured to obtain task resource information according to the target task, where the task resource information indicates the hardware resources required to execute the target task;
the processing unit is specifically configured to obtain the arbitration result according to the task resource information and the virtualization computing power.
A fourth aspect of embodiments of the present application provides an apparatus.
An apparatus, comprising:
a sending unit, configured to send a target task to a first device, where the target task includes task description information and is generated by the second device;
a receiving unit, configured to receive an arbitration result sent by the first device, where the arbitration result indicates a score for the first device to execute the target task;
and a determining unit, configured to determine, according to the arbitration result, that the first device is the device to execute the target task.
Optionally, the sending unit is specifically configured to send the target task to a plurality of first devices;
the receiving unit is specifically configured to receive a plurality of arbitration results sent by the plurality of first devices, the arbitration results corresponding one-to-one to the first devices;
the determining unit is specifically configured to obtain a target arbitration result from the plurality of arbitration results, where the target arbitration result is the optimal one among them;
the determining unit is specifically configured to determine, according to the target arbitration result, that the corresponding target device is the device to execute the target task, where the target device is one of the plurality of first devices.
Optionally, the apparatus further comprises:
a judging unit, configured to judge whether the second device can execute the target task,
and, if it cannot, to trigger the second device to send the target task to the first device.
Optionally, the sending unit is further configured to send target task data to the first device, where the target task data is used by the first device to execute the target task.
A fifth aspect of this application provides a computer storage medium having stored thereon instructions that, when run on a computer, cause the computer to perform the method of the first aspect and/or the second aspect.
A sixth aspect of this application provides a computer program product that, when run on a computer, causes the computer to perform the method of the first aspect and/or the second aspect.
According to the technical scheme, the embodiment of the application has the following advantages:
in the embodiment of the application, the first device receives the target task sent by the second device and obtains the arbitration result according to the target task, so that the second device can determine the device for executing the target task according to the arbitration result of the first device, and in the process, intermediate devices such as a central manager and the like are not needed for task scheduling, thereby reducing the process of cross-device interaction and improving the interaction efficiency.
Drawings
FIG. 1 is a schematic diagram of a neural network framework provided by an embodiment of the present application;
FIG. 2 is a schematic diagram of another neural network framework provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of another neural network framework provided by an embodiment of the present application;
FIG. 4 is a schematic diagram of another neural network framework provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of a chip structure provided by an embodiment of the present application;
FIG. 6 is a framework diagram of a distributed task scheduling system provided by an embodiment of the present application;
FIG. 7 is a schematic flowchart of a data processing method provided by an embodiment of the present application;
FIG. 8 is a schematic diagram of a scenario of a data processing method provided by an embodiment of the present application;
FIG. 9 is a schematic structural diagram of a device provided by an embodiment of the present application;
FIG. 10 is a schematic structural diagram of another device provided by an embodiment of the present application;
FIG. 11 is a schematic structural diagram of a device provided by an embodiment of the present application;
FIG. 12 is a schematic structural diagram of another device provided by an embodiment of the present application;
FIG. 13 is a schematic structural diagram of another device provided by an embodiment of the present application;
FIG. 14 is a schematic structural diagram of another device provided by an embodiment of the present application.
Detailed Description
The embodiments of this application provide a data processing method for use in a distributed task scheduling system: a first device obtains an arbitration result from a target task sent by a second device and sends the arbitration result back to the second device, so that the second device can determine the device to execute the target task according to the arbitration result, which reduces cross-device interaction and improves interaction efficiency.
FIG. 1 shows a schematic diagram of an artificial intelligence body framework that describes the overall workflow of an artificial intelligence system and is applicable to general requirements of the artificial intelligence field.
This artificial intelligence framework is described below along two dimensions: the "intelligent information chain" (horizontal axis) and the "IT value chain" (vertical axis).
The "intelligent information chain" reflects the sequence of processes from data acquisition onwards, for example the general stages of intelligent information perception, intelligent information representation and formation, intelligent reasoning, intelligent decision-making, and intelligent execution and output. Along this chain, the data goes through a "data, information, knowledge, wisdom" refinement process.
The "IT value chain" reflects the value that artificial intelligence brings to the information technology industry, from the underlying infrastructure and information (technologies for providing and processing data) up to the industrial ecology of the system.
(1) Infrastructure:
the infrastructure provides computing power support for the artificial intelligent system, realizes communication with the outside world, and realizes support through a foundation platform. Communicating with the outside through a sensor; the computing power is provided by hardware acceleration chips such as a Central Processing Unit (CPU), an embedded Neural Network Processor (NPU), a Graphic Processing Unit (GPU), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA), and the like; the basic platform comprises distributed computing framework, network and other related platform guarantees and supports, and can comprise cloud storage and computing, interconnection and intercommunication networks and the like. For example, sensors and external communications acquire data that is provided to intelligent chips in a distributed computing system provided by the base platform for computation.
(2) Data
Data at the upper level of the infrastructure is used to represent the data source for the field of artificial intelligence. The data relates to graphics, images, voice, video and text, and also relates to internet of things data of traditional equipment, including service data of an existing system and sensing data of force, displacement, liquid level, temperature, humidity and the like.
(3) Data processing
Data processing typically includes data training, machine learning, deep learning, searching, reasoning, decision making, and the like.
Machine learning and deep learning can perform symbolic and formalized intelligent information modeling, extraction, preprocessing, training and the like on the data.
Inference refers to the process of simulating human intelligent reasoning in a computer or intelligent system, using formalized information to think about and solve problems according to an inference control strategy; typical functions are searching and matching.
Decision-making refers to the process of making decisions after reasoning over intelligent information, and generally provides functions such as classification, ranking and prediction.
(4) General capabilities
After the data processing described above, some general-purpose capabilities can further be formed on the basis of the processing results, for example an algorithm or a general-purpose system, such as translation, text analysis, computer vision processing (e.g., image recognition and object detection), speech recognition, and the like.
(5) Intelligent product and industrial application
Intelligent products and industry applications are the products and applications of the artificial intelligence system in various fields; they encapsulate the overall artificial intelligence solution, turning intelligent information decisions into products and putting them into practical use. The main application fields include intelligent manufacturing, intelligent transportation, smart home, intelligent healthcare, intelligent security, autonomous driving, safe city, intelligent terminals, and the like.
Referring to fig. 2, a system architecture 200 is provided in an embodiment of the present application. The system architecture includes a database 230 and a client device 240. The data collection device 260 is used to collect data and store it in the database 230, and the training module 220 generates the target model/rule 201 based on the data maintained in the database 230.
The operation of each layer in a deep neural network can be described by the mathematical expression y = a(W·x + b). At the physical level, the work of each layer can be understood as completing a transformation from the input space (the set of input vectors) to the output space (that is, from the row space to the column space of the matrix) through five operations on the input space: 1. raising/lowering the dimension; 2. scaling up/down; 3. rotation; 4. translation; 5. "bending". Operations 1, 2 and 3 are performed by W·x, operation 4 by +b, and operation 5 by a(). The word "space" is used here because the object being classified is not a single thing but a class of things, and the space is the collection of all individuals of that class. W is a weight vector, and each value in the vector represents the weight value of a neuron in this layer of the neural network. This vector determines the spatial transformation from input space to output space described above, that is, the weights of each layer control how the space is transformed. The purpose of training a deep neural network is ultimately to obtain the weight matrices of all layers of the trained network.
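For illustration only, the following minimal NumPy sketch computes y = a(Wx + b) for one layer; the shapes and the choice of ReLU as the nonlinearity a() are assumptions and not taken from the application.

```python
import numpy as np

def dense_layer(x, W, b):
    """One layer of a deep neural network: y = a(Wx + b).

    W scales/rotates the input space, b translates it, and the
    nonlinearity a() (here ReLU, an assumed choice) "bends" it.
    """
    z = W @ x + b            # linear transform: dimension change, scaling, rotation, plus translation
    return np.maximum(z, 0)  # a(): element-wise nonlinearity

# Example: map a 3-dimensional input to a 2-dimensional output.
x = np.array([1.0, -2.0, 0.5])
W = np.random.randn(2, 3)    # weight matrix of this layer
b = np.zeros(2)              # bias vector
y = dense_layer(x, W, b)
print(y.shape)               # (2,)
```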
Because the output of the deep neural network should be as close as possible to the desired target value, the weight vector of each layer can be updated by comparing the current network's predicted value with the target value and adjusting according to the difference between them (of course, there is usually an initialization step before the first update, that is, parameters are pre-configured for every layer of the deep neural network). For example, if the network's predicted value is too high, the weight values in the weight matrix are adjusted so that the prediction becomes lower, and the adjustment continues until the value output by the neural network approaches or equals the target value. It is therefore necessary to define in advance how to measure the difference between the predicted value and the target value; this is done with a loss function or an objective function, important equations for measuring that difference. Taking the loss function as an example, a higher output value (loss) of the loss function indicates a larger difference, so training the deep neural network becomes the process of reducing this loss as much as possible.
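The training loop just described can be sketched as follows; the squared-error loss and plain gradient descent on a one-parameter "network" are assumptions chosen purely for illustration, not the method of the application.

```python
import numpy as np

def mse_loss(pred, target):
    """Loss function measuring the gap between prediction and target."""
    return np.mean((pred - target) ** 2)

# A single linear "network" y = w * x, trained so its output approaches the target.
w = 5.0                      # initial (pre-configured) parameter
x, target = 2.0, 6.0         # the ideal output for x = 2.0 would need w = 3.0
lr = 0.01                    # learning rate

for step in range(200):
    pred = w * x
    loss = mse_loss(np.array([pred]), np.array([target]))
    grad = 2 * (pred - target) * x   # d(loss)/d(w)
    w -= lr * grad                   # adjust the weight to reduce the prediction error

print(round(w, 3))           # close to 3.0, i.e. the output approaches the target value
```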
The calculation module may include a training module 220, and the target model/rule obtained by the training module 220 may be applied to different systems or devices. In fig. 2, the execution device 210 configures a transceiver 212, the transceiver 212 may be a wireless transceiver, an optical transceiver, a wired interface (such as an I/O interface), or the like, and performs data interaction with an external device, and a "user" may input data to the transceiver 212 through the client device 240, for example, in the following embodiments of the present application, the client device 240 may send a target task to the execution device 210, request the execution device to construct a neural network, and send a database for training to the execution device 210.
The execution device 210 may call data, code, etc. from the data storage system 250 and may store data, instructions, etc. in the data storage system 250.
The calculation module 211 processes the input data using the target model/rule 201.
Finally, the transceiver 212 returns the constructed neural network to the client device 240 for deployment in the client device 240 or other device.
Further, the training module 220 may derive corresponding target models/rules 201 based on different data for different target tasks to provide better results to the user.
In the case shown in FIG. 2, the user can manually specify the data to be input into the execution device 210, for example by operating in an interface provided by the transceiver 212. Alternatively, the client device 240 can automatically input data to the transceiver 212 and obtain the results; if such automatic input requires the user's authorization, the user can set the corresponding permissions in the client device 240. The user can view the result output by the execution device 210 on the client device 240, and the specific presentation form can be display, sound, action and the like. The client device 240 can also act as a data collector and store the collected data associated with the target task in the database 230.
It should be noted that fig. 2 is only a schematic diagram of a system architecture provided in an embodiment of the present application, and a positional relationship between devices, modules, and the like shown in the diagram does not constitute any limitation. For example, in FIG. 2, the data storage system 250 is an external memory with respect to the execution device 210, and in other scenarios, the data storage system 250 may be disposed in the execution device 210.
Illustratively, a Convolutional Neural Network (CNN) is taken as an example below.
A CNN is a deep neural network with a convolutional structure and is a deep learning architecture; deep learning refers to learning at multiple levels of abstraction with machine learning algorithms. As a deep learning architecture, a CNN is a feed-forward artificial neural network in which individual neurons respond to overlapping regions of the image fed into it.
As shown in FIG. 3, a convolutional neural network (CNN) 100 may include an input layer 110, a convolutional layer/pooling layer 120 (the pooling layer is optional), and a neural network layer 130.
As shown in FIG. 3, the convolutional layer/pooling layer 120 may include, for example, layers 121 to 126. In one implementation, layer 121 is a convolutional layer, layer 122 is a pooling layer, layer 123 is a convolutional layer, layer 124 is a pooling layer, layer 125 is a convolutional layer and layer 126 is a pooling layer; in another implementation, layers 121 and 122 are convolutional layers, layer 123 is a pooling layer, layers 124 and 125 are convolutional layers and layer 126 is a pooling layer. That is, the output of a convolutional layer may be used as the input of a subsequent pooling layer, or as the input of another convolutional layer to continue the convolution operation.
Taking convolutional layer 121 as an example, it may include a number of convolution operators, also called kernels, whose role in image processing is that of a filter extracting specific information from the input image matrix. A convolution operator is essentially a weight matrix, which is usually predefined. During a convolution operation on an image, the weight matrix is usually moved over the input image pixel by pixel in the horizontal direction (or two pixels at a time, and so on, depending on the value of the stride), thereby extracting a specific feature from the image. The size of the weight matrix should be related to the size of the image. Note that the depth dimension of the weight matrix is the same as the depth dimension of the input image, and during the convolution operation the weight matrix extends through the entire depth of the input image. Convolving with a single weight matrix therefore produces a convolutional output with a single depth dimension, but in most cases a plurality of weight matrices of the same dimensions are applied rather than a single one, and the output of each weight matrix is stacked to form the depth dimension of the convolved image. Different weight matrices can be used to extract different features of the image: for example, one weight matrix extracts image edge information, another extracts a particular color of the image, and yet another blurs unwanted noise in the image. The plurality of weight matrices have the same dimensions, so the feature maps they extract also have the same dimensions, and these feature maps of the same dimensions are combined to form the output of the convolution operation.
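A minimal sketch of the sliding-window convolution described above is given below; the single-channel input, stride of 1, lack of padding and the particular kernel values are all assumptions for illustration.

```python
import numpy as np

def conv2d(image, kernel, stride=1):
    """Slide a weight matrix (kernel) over the image and extract one feature map."""
    kh, kw = kernel.shape
    oh = (image.shape[0] - kh) // stride + 1
    ow = (image.shape[1] - kw) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = image[i * stride:i * stride + kh, j * stride:j * stride + kw]
            out[i, j] = np.sum(patch * kernel)   # weighted sum = one output pixel
    return out

image = np.arange(25, dtype=float).reshape(5, 5)
edge_kernel = np.array([[1.0, 0.0, -1.0],
                        [1.0, 0.0, -1.0],
                        [1.0, 0.0, -1.0]])       # one weight matrix, e.g. for vertical edges
print(conv2d(image, edge_kernel).shape)          # (3, 3) feature map
```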
The weight values in these weight matrices need to be obtained through a large amount of training in practical application, and each weight matrix formed by the trained weight values can extract information from the input image, thereby helping the convolutional neural network 100 to make correct prediction.
When the convolutional neural network 100 has a plurality of convolutional layers, the initial convolutional layer (for example 121) tends to extract more general features, which can also be called low-level features. As the depth of the convolutional neural network 100 increases, the later convolutional layers (for example 126) extract increasingly complex features, such as features with high-level semantics, and features with higher semantics are more suitable for the problem to be solved.
A pooling layer:
Since it is often necessary to reduce the number of training parameters, a pooling layer often needs to be introduced periodically after a convolutional layer. That is, in the layers 121 to 126 illustrated as 120 in FIG. 3, one convolutional layer may be followed by one pooling layer, or several convolutional layers may be followed by one or more pooling layers. In image processing, the sole purpose of the pooling layer is to reduce the spatial size of the image. The pooling layer may include an average pooling operator and/or a max pooling operator for sampling the input image down to a smaller size. The average pooling operator computes the average of the pixel values within a particular range, and the max pooling operator takes the pixel with the largest value within a particular range as the result. Also, just as the size of the weight matrix in the convolutional layer should be related to the image size, the operators in the pooling layer should be related to the image size as well. The image output after processing by the pooling layer can be smaller than the image input to the pooling layer, and each pixel of the output image represents the average or maximum value of the corresponding sub-region of the input image.
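The max and average pooling operators can be sketched as follows; the 2x2 window and stride of 2 are assumed values for illustration only.

```python
import numpy as np

def pool2d(feature_map, size=2, stride=2, mode="max"):
    """Reduce the spatial size of a feature map with max or average pooling."""
    oh = (feature_map.shape[0] - size) // stride + 1
    ow = (feature_map.shape[1] - size) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            window = feature_map[i * stride:i * stride + size,
                                 j * stride:j * stride + size]
            out[i, j] = window.max() if mode == "max" else window.mean()
    return out

fm = np.array([[1., 2., 3., 4.],
               [5., 6., 7., 8.],
               [9., 10., 11., 12.],
               [13., 14., 15., 16.]])
print(pool2d(fm, mode="max"))   # 2x2 output; each pixel is the max of one 2x2 sub-region
```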
The neural network layer 130:
after processing by convolutional layer/pooling layer 120, convolutional neural network 100 is not sufficient to output the required output information. Because, as previously described, the convolutional layer/pooling layer 120 only extracts features and reduces the parameters brought by the input image. However, to generate the final output information (class information or other relevant information as needed), the convolutional neural network 100 needs to generate one or a set of outputs of the number of classes as needed using the neural network layer 130. Accordingly, a plurality of hidden layers (131, 132 to 13n as shown in fig. 3) and an output layer 140 may be included in the neural network layer 130. In this application, the convolutional neural network is: and searching the super unit by taking the output of the delay prediction model as a constraint condition to obtain at least one first construction unit, and stacking the at least one first construction unit. The convolutional neural network can be used for image recognition, image classification, image super-resolution reconstruction and the like.
After the hidden layers in the neural network layer 130, i.e. the last layer of the whole convolutional neural network 100 is the output layer 140, the output layer 140 has a loss function similar to the class cross entropy, and is specifically used for calculating the prediction error, once the forward propagation (i.e. the propagation from 110 to 140 in fig. 3 is the forward propagation) of the whole convolutional neural network 100 is completed, the backward propagation (i.e. the propagation from 140 to 110 in fig. 3 is the backward propagation) starts to update the weight values and the bias of the aforementioned layers, so as to reduce the loss of the convolutional neural network 100 and the error between the result output by the convolutional neural network 100 through the output layer and the ideal result.
It should be noted that the convolutional neural network 100 shown in fig. 3 is only an example of a convolutional neural network, and in a specific application, the convolutional neural network may also exist in the form of other network models, for example, as shown in fig. 4, a plurality of convolutional layers/pooling layers are parallel, and the features extracted respectively are all input to the overall neural network layer 130 for processing.
FIG. 5 is a diagram of a chip hardware structure according to an embodiment of this application.
The neural network processing unit NPU 50 is mounted on a host CPU as a coprocessor, and the host CPU allocates tasks to it. The core part of the NPU is the arithmetic circuit 503; the controller 504 controls the arithmetic circuit 503 to extract matrix data from memory and perform multiplication.
In some implementations, the arithmetic circuit 503 internally includes a plurality of processing units (PEs). In some implementations, the operational circuitry 503 is a two-dimensional systolic array. The arithmetic circuit 503 may also be a one-dimensional systolic array or other electronic circuit capable of performing mathematical operations such as multiplication and addition. In some implementations, the arithmetic circuitry 503 is a general-purpose matrix processor.
For example, assume that there are an input matrix A, a weight matrix B and an output matrix C. The arithmetic circuit fetches the data corresponding to matrix B from the weight memory 502 and buffers it in each PE of the arithmetic circuit. The arithmetic circuit takes the matrix A data from the input memory 501, performs the matrix operation with matrix B, and stores partial or final results of the resulting matrix in the accumulator 508.
The unified memory 506 is used to store input data as well as output data. The weight data is transferred directly into the weight memory 502 through the direct memory access controller (DMAC) 505, and the input data is also carried into the unified memory 506 through the DMAC.
The bus interface unit (BIU) 510 is used for the interaction between the AXI bus and the DMAC and the instruction fetch buffer 509.
The bus interface unit 510 is configured to let the instruction fetch buffer 509 obtain instructions from the external memory, and to let the storage unit access controller 505 obtain the original data of the input matrix A or the weight matrix B from the external memory.
The DMAC is mainly used to transfer input data in the external memory DDR to the unified memory 506 or to transfer weight data into the weight memory 502 or to transfer input data into the input memory 501.
The vector calculation unit 507 includes a plurality of arithmetic processing units and, when necessary, performs further processing on the output of the arithmetic circuit, such as vector multiplication, vector addition, exponential operations, logarithmic operations and magnitude comparison. It is mainly used for non-convolutional/fully-connected layer computation in the neural network, such as pooling, batch normalization and local response normalization.
In some implementations, the vector calculation unit 507 can store the processed output vector to the unified buffer 506. For example, the vector calculation unit 507 may apply a non-linear function to the output of the arithmetic circuit 503, such as a vector of accumulated values, to generate the activation value. In some implementations, the vector calculation unit 507 generates normalized values, combined values, or both. In some implementations, the vector of processed outputs can be used as activation inputs to the arithmetic circuitry 503, for example for use in subsequent layers in a neural network.
An instruction fetch buffer 509 connected to the controller 504 for storing instructions used by the controller 504;
the unified memory 506, the input memory 501, the weight memory 502, and the instruction fetch memory 509 are all On-Chip memories. The external memory is private to the NPU hardware architecture.
Among them, the operations of the layers in the convolutional neural networks shown in fig. 3 and 4 may be performed by the matrix calculation unit or the vector calculation unit 507.
Referring to fig. 6, a framework diagram of a distributed task scheduling system according to an embodiment of the present application is shown.
The distributed task scheduling system framework comprises at least two devices for performing distributed task scheduling. In fig. 6, three devices, i.e., a device a, a device B, and a device C, are used as an example for explanation, and the number and connection mode of the devices are not limited in the actual application process.
Device A, device B and device C may be connected in a wired or wireless manner, so that data can be transmitted between any two of them. For example, device A and device B are connected through a fiber-optic network, or device A and device C are connected through a wireless network connection (Wi-Fi).
In the embodiments of this application, a device may be a computer device, a large-screen television, a sound system, smart glasses, a watch, a vehicle-mounted device, or the like, and may also be a headset, a smart household device, an electronic door lock, a home appliance, or the like. It can be understood that the device may also be another intelligent device, such as a household cat-eye camera, which is not specifically limited here.
Device A includes a device-side application and plug-in service module, a task scheduling module, a task arbitration module and a soft bus module; device B and device C include the corresponding applications and modules. Device A, device B and device C transmit data over a distributed soft bus. Specifically, the plug-in service module is used to execute the target task; the task scheduling module is used to receive and send target tasks; and the task arbitration module is used to arbitrate a target task and obtain an arbitration result. After receiving the arbitration results sent by the other devices, the task scheduling module selects the device that will execute the target task according to those arbitration results and sends the related target task data to that device for execution. The soft bus module on each device keeps the device state information of all devices connected to the distributed soft bus up to date, so that each device can learn the state of the other devices through its soft bus module and better determine the device that should execute the target task.
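The division of labour among these modules can be illustrated with the following sketch; the class names, method names and scoring rule are assumptions for illustration and do not correspond to any actual interface of the system.

```python
class TaskArbitrationModule:
    """Arbitrates a target task: scores it against the device's own capability."""
    def arbitrate(self, task_description, virtual_compute):
        required = task_description.get("required_compute", 0.0)
        if virtual_compute <= 0 or required > virtual_compute:
            return 0.0                               # cannot execute the task at all
        return round(100.0 * (1.0 - required / virtual_compute), 1)

class SoftBusModule:
    """Keeps the list of devices currently reachable over the distributed soft bus."""
    def __init__(self):
        self.peers = set()
    def broadcast(self, message):
        for peer in sorted(self.peers):
            print(f"send to {peer}: {message}")      # placeholder for the real transport

class TaskSchedulingModule:
    """Sends target tasks, collects arbitration results and picks the executing device."""
    def __init__(self, soft_bus):
        self.soft_bus = soft_bus
    def dispatch(self, task_description):
        self.soft_bus.broadcast({"type": "arbitration_request", "task": task_description})

bus = SoftBusModule()
bus.peers.update({"device B", "device C"})
TaskSchedulingModule(bus).dispatch({"task_type": "face_recognition", "required_compute": 1.0})
```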
The following describes a data processing method in the embodiment of the present application with reference to the descriptions and architectures of fig. 1 to 6.
Please refer to fig. 7, which is a flowchart illustrating a data processing method according to an embodiment of the present application.
In step 701, the second device sends a target task to the first device.
In a distributed task system, a second device generates a target task and sends the target task to one or more first devices.
Specifically, the second device generates the target task according to the requirements of the current scenario; for example, the target task is a face recognition AI task. In a possible implementation, before sending the target task to the first device, the second device may first determine whether it can execute the target task itself. For example, the second device obtains its current dynamic resource information, that is, its real-time hardware resource usage, such as CPU usage, GPU usage, NPU usage, memory usage, network bandwidth and current device power consumption. The second device then calculates its virtualization computing power from this dynamic resource information; the virtualization computing power represents the computing power the second device can support in its current state. The second device further determines the set of task resources required by the target task, such as which CPU is needed, how much computing power, how much read-only memory (ROM) and how much power the task will consume. Based on this required resource set and its own virtualization computing power, the second device judges whether it can execute the target task. If it can, the second device executes the target task itself; if it cannot, it sends the target task to the first device.
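A minimal sketch of this self-check follows; the resource fields, the computing-power formula and the thresholds are all assumptions chosen for illustration.

```python
def get_dynamic_resources():
    """Real-time hardware resource usage of the local (second) device; values are illustrative."""
    return {"cpu_usage": 0.6, "npu_usage": 0.7, "mem_usage": 0.5, "peak_tops": 1.0}

def virtual_compute_power(res):
    """Computing power still available in the current state (assumed formula)."""
    return res["peak_tops"] * (1.0 - res["npu_usage"])

def can_execute_locally(task_resources, res):
    """Compare the task's required resource set with what the device can currently offer."""
    return (virtual_compute_power(res) >= task_resources["required_tops"]
            and (1.0 - res["mem_usage"]) >= task_resources["required_mem_ratio"])

task = {"required_tops": 0.5, "required_mem_ratio": 0.2}   # e.g. a face-recognition AI task
if can_execute_locally(task, get_dynamic_resources()):
    print("execute locally")
else:
    print("send the target task to the first device(s) for arbitration")
```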
In one possible implementation, after generating the target task, the second device sends the target task directly to one or more first devices to determine a device best suited to perform the target task.
In a possible implementation, the second device sends the target task to all devices in the network recorded in its soft bus module, which stores information about all devices in the network. For example, as shown in FIG. 8, the soft bus module records device A, device B, device C and device D. Under normal conditions all four devices are running and exchanging heartbeat messages, so when device A sends a target task, that is, an arbitration request, it sends it to the other three devices. When device D goes offline, for example because of a fault, devices A, B and C stop receiving heartbeat messages from it, update their respective soft bus modules and delete device D from the network of the soft bus modules; the next time device A sends an arbitration request, it sends it only to device B and device C. Because there is no central node in this process, when a device in the network goes offline there is no need to update any resource information held for it; the offline device only has to be removed from the network, which saves the resource consumption of such updates.
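The heartbeat-based maintenance of the networking list can be sketched as follows; the timeout value and data structures are assumptions for illustration.

```python
import time

HEARTBEAT_TIMEOUT = 3.0   # seconds without a heartbeat before a peer is considered offline

class SoftBus:
    def __init__(self):
        self.last_heartbeat = {}              # device id -> time of last heartbeat

    def on_heartbeat(self, device_id):
        self.last_heartbeat[device_id] = time.monotonic()

    def online_devices(self):
        """Drop peers whose heartbeat has expired; no central node needs to be informed."""
        now = time.monotonic()
        self.last_heartbeat = {d: t for d, t in self.last_heartbeat.items()
                               if now - t <= HEARTBEAT_TIMEOUT}
        return list(self.last_heartbeat)

bus = SoftBus()
for dev in ("B", "C", "D"):
    bus.on_heartbeat(dev)
# If device D stops sending heartbeats, it simply disappears from the list,
# and the next arbitration request is sent only to B and C.
print(bus.online_devices())
```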
In a possible implementation, the target task carries only task description information, that is, it describes the task type and does not carry the actual task data. If the target task also carried the data needed to execute it, the large amount of data could cause network congestion, and any device that receives the task data but cannot execute the target task would have wasted the resources spent on receiving that data.
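A sketch of such a task message, which carries only description information and no task data, is shown below; all field names and values are assumptions.

```python
import json

# Arbitration request: it only describes the task, so broadcasting it stays cheap.
arbitration_request = {
    "task_id": "task-001",
    "task_type": "face_recognition",
    "description": {"required_tops": 1.0, "required_rom_mb": 30, "power_budget_mw": 100},
}

# The (much larger) task data, e.g. the image to be recognised, is sent later
# and only to the single device selected to execute the task.
print(json.dumps(arbitration_request))
```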
In a possible implementation, the second device sends the target task into the network corresponding to the soft bus module by broadcasting, that is, on a specific frequency channel.
In step 702, the first device obtains task resource information according to the target task.
After the first device receives the target task, the first device obtains task resource information according to the target task.
Specifically, in a possible implementation, the first device obtains task resource information from the target task, where the task resource information indicates the hardware resources required to execute the target task. For example, when the target task is a face recognition task, the first device obtains information such as the type of CPU required for face recognition, how much computing power is needed to execute it, how much ROM is required and how much power it consumes. It can be understood that this task resource set may be determined from the specific requirements of the face recognition task, or looked up in a corresponding task resource table, which is not limited here.
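One way to derive the task resource information from the task description is an assumed lookup table keyed by task type, sketched below; every entry in the table is an illustrative value, not data from the application.

```python
# Assumed lookup table: task type -> hardware resources needed to execute it.
TASK_RESOURCE_TABLE = {
    "face_recognition": {"cpu": "arm64, 4 cores", "required_tops": 1.0,
                         "required_rom_mb": 30, "power_budget_mw": 100},
    "speech_recognition": {"cpu": "arm64, 2 cores", "required_tops": 0.3,
                           "required_rom_mb": 10, "power_budget_mw": 50},
}

def task_resource_info(task_description):
    """Return the hardware resources required to execute the target task."""
    return TASK_RESOURCE_TABLE[task_description["task_type"]]

print(task_resource_info({"task_type": "face_recognition"}))
```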
In step 703, the first device obtains dynamic resource information.
After the first device receives the target task, the first device obtains dynamic resource information, wherein the dynamic resource information represents real-time use information of hardware resources of the first device.
In a possible implementation manner, the first device acquires dynamic resource information of the current device through the device feature set acquisition module. The dynamic resource information may include at least one of: CPU utilization, GPU utilization, NPU utilization, memory utilization, network bandwidth, power consumption, and the like. It is understood that, in the actual application process, because of the difference of the hardware resource information needed to be used by the target task, more dynamic resource information may also be included, which is not limited herein.
For example, when the first device is in a working state, it obtains a CPU usage of 20%, a GPU usage of 5%, an NPU usage of 15%, a memory usage of 30%, a network bandwidth of 2 Mbps and a device power consumption of 100 mW.
In step 704, the first device performs calculation according to the dynamic resource information to obtain the virtualization calculation power.
After the first device acquires its dynamic resource information, it performs a calculation on that information to obtain the virtualization computing power, which represents the computing power supported by the first device in its current state.
Specifically, the first device performs trend calculation and analysis on the dynamic resource information to obtain the virtualization computing power in the current device state. For example, when the dynamic resource information indicates a CPU usage of 20%, a GPU usage of 5%, an NPU usage of 15%, a memory usage of 30%, a network bandwidth of 2 Mbps and a power consumption of 100 mW, the first device performs trend calculation and analysis on this information and obtains a virtualization computing power of 0.8 T.
It can be understood that, in practice, the first device may also compute the virtualization computing power from the dynamic resource information in other ways, for example by means of a lookup table mapping dynamic resource information to virtualization computing power, which is not limited here.
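One possible way to reduce the dynamic resource information to a single virtualization computing-power figure is sketched below; the weighting is an assumption and only roughly matches the 0.8 T figure in the example above, it is not the formula used by the application.

```python
def virtualization_compute_power(dyn, peak_tops=1.0):
    """Estimate the computing power (in T) the device can still offer right now.

    dyn: real-time usage ratios in [0, 1]; peak_tops: nominal chip computing power.
    The weights below are illustrative assumptions.
    """
    headroom = (0.6 * (1 - dyn["npu_usage"])     # AI tasks mostly load the NPU
                + 0.3 * (1 - dyn["cpu_usage"])
                + 0.1 * (1 - dyn["mem_usage"]))
    return round(peak_tops * headroom, 2)

dyn = {"cpu_usage": 0.20, "npu_usage": 0.15, "mem_usage": 0.30}
print(virtualization_compute_power(dyn))   # about 0.82 for the example figures above
```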
In step 705, the first device obtains an arbitration result according to the task resource information and the virtualization computing power.
After the task resource information and the virtualization computing power are obtained, the first device obtains an arbitration result according to the task resource information and the virtualization computing power, and the arbitration result represents the score of the first device for executing the target task.
Specifically, in a possible implementation, the first device compares the task resource information and the task type of the target task with its virtualization computing power to obtain a task execution score, and this score is the arbitration result. For example, the task type of the target task is face recognition, the task resource information indicates that executing it requires 1 T of computing power, 30 MB of ROM and 100 mW of power, and the virtualization computing power of the first device is 2 T, so the task execution score is 70 points.
It can be understood that, in practice, the task execution score may also be computed from the virtualization computing power and the task resource information in other ways, which is not limited here.
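A sketch of turning the task resource information and the virtualization computing power into an execution score is given below; the scoring rule is an assumption and does not reproduce the exact 70-point figure of the example above.

```python
def arbitration_score(task_res, virtual_tops, free_rom_mb, spare_power_mw):
    """Score (0-100) for how well this device could execute the target task.

    The weighting is an illustrative assumption, not the rule used in the text.
    """
    if (virtual_tops < task_res["required_tops"]
            or free_rom_mb < task_res["required_rom_mb"]
            or spare_power_mw < task_res["power_budget_mw"]):
        return 0                                                        # cannot execute the task
    compute_margin = 1 - task_res["required_tops"] / virtual_tops       # spare computing power
    rom_margin = 1 - task_res["required_rom_mb"] / free_rom_mb          # spare storage
    return round(100 * (0.7 * compute_margin + 0.3 * rom_margin))

task_res = {"required_tops": 1.0, "required_rom_mb": 30, "power_budget_mw": 100}
print(arbitration_score(task_res, virtual_tops=2.0, free_rom_mb=128, spare_power_mw=500))
```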
In step 706, the first device sends the arbitration result to the second device.
After the first device obtains the arbitration result, the first device sends the arbitration result to the second device.
In particular, in one possible implementation, the first device sends the arbitration result to the second device via a distributed soft bus.
In step 707, the second device determines that the target device is a device for executing the target task according to the arbitration result.
After the second device receives the arbitration results sent by the one or more first devices, the second device determines, according to the one or more arbitration results, that the target device is a device for executing the target task, the target device belongs to one of the one or more first devices, and the one or more arbitration results are in one-to-one correspondence with the one or more first devices.
Specifically, after the second device receives the one or more arbitration results, the second device determines, according to the one or more arbitration results, an arbitration result with the highest score, that is, an optimal arbitration result in the one or more arbitration results, and determines that the target device corresponding to the optimal arbitration result is a device that executes the target task.
It can be understood that, in actual application, if several arbitration results have the same highest score, the second device may randomly select one of the devices corresponding to those arbitration results to execute the target task, or may perform a further round of comparison among the tied devices to determine a better device to execute the target task, which is not limited herein.
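As a non-limiting illustration of the selection step, the sketch below keeps the arbitration result with the highest score and, when several results tie, falls back to a random choice among the tied devices, which is one of the two options mentioned above; the device identifiers and scores are assumptions.

```python
import random


def choose_target_device(results: dict) -> str:
    """Pick the target device from the received arbitration results.

    `results` maps each first device's identifier to the score carried by
    its arbitration result.  The highest score wins; ties are broken by a
    random choice, one of the strategies described above.
    """
    if not results:
        raise ValueError("no arbitration results received")
    best_score = max(results.values())
    tied = [device for device, score in results.items() if score == best_score]
    return random.choice(tied)


# Device names and scores are hypothetical.
print(choose_target_device({"phone": 70.0, "tablet": 85.0, "smart_tv": 85.0}))
```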
In step 708, the second device sends the target task data to the first device.
After the second device determines a target device for executing the target task, the second device sends target task data required for executing the target task to the target device.
Specifically, after receiving the target task data, the first device serving as the target device executes the target task using its own hardware resources.
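As a non-limiting illustration of this last step, the sketch below dispatches the received target task data to a local handler chosen by task type; the handler and the task-type names are hypothetical.

```python
from typing import Any, Callable, Dict


def run_face_recognition(task_data: bytes) -> Any:
    # Hypothetical handler; a real device would call into its NPU/GPU runtime.
    return {"faces": []}  # placeholder result for the sketch


TASK_HANDLERS: Dict[str, Callable[[bytes], Any]] = {
    "face_recognition": run_face_recognition,
}


def execute_target_task(task_type: str, task_data: bytes) -> Any:
    """Run the target task on the local hardware resources of the target device."""
    handler = TASK_HANDLERS.get(task_type)
    if handler is None:
        raise ValueError(f"no local handler for task type {task_type!r}")
    return handler(task_data)


# result = execute_target_task("face_recognition", b"raw image bytes")
```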
In this embodiment of the application, the first device obtains the arbitration result according to the target task sent by the second device and returns the arbitration result to the second device, so that the second device can determine the device for executing the target task according to the arbitration result. This reduces the cross-device interaction that a distributed system with a central node would otherwise require, and improves interaction efficiency.
The data processing method in the embodiment of the present application is described above; the following describes the device in the embodiment of the present application. Please refer to fig. 9, which is a schematic structural diagram of the device provided in the present application.
An apparatus, comprising:
a receiving unit 901, configured to receive a target task sent by a second device, where the target task includes task description information and is generated for the second device;
the processing unit 902 is configured to obtain an arbitration result according to the target task, where the arbitration result is used to indicate a score for the first device to execute the target task;
a sending unit 903, configured to send the arbitration result to the second device, so that the second device determines, according to the arbitration result, that the first device is a device that executes the target task.
In this embodiment, operations performed by each unit of the device are similar to the steps performed by the first device in the embodiment shown in fig. 7, and detailed description thereof is omitted here.
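As a non-limiting organizational analogy (not an implementation of the device), the three units of fig. 9 can be pictured as methods on a single object; the transport object standing in for the distributed soft bus and the scoring rule used inside the processing method are assumptions made for this sketch.

```python
class FirstDeviceApparatus:
    """Analogy for the receiving unit 901, processing unit 902 and sending
    unit 903 of fig. 9; not an implementation of the actual device."""

    def __init__(self, device_id: str, transport):
        self.device_id = device_id
        self.transport = transport  # stand-in for the distributed soft bus

    def receive_target_task(self) -> dict:
        # Receiving unit 901: obtain the target task sent by the second device.
        return self.transport.receive()

    def process(self, target_task: dict) -> float:
        # Processing unit 902: derive the arbitration result (a score).
        required = float(target_task.get("required_tops", 0.0))
        available = self.current_virtualization_computing_power()
        if available <= 0.0 or required > available:
            return 0.0
        return 100.0 * (1.0 - required / available)

    def send_arbitration_result(self, score: float) -> None:
        # Sending unit 903: return the arbitration result to the second device.
        self.transport.send({"device_id": self.device_id, "score": score})

    def current_virtualization_computing_power(self) -> float:
        return 2.0  # placeholder value for this sketch
```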
Please refer to fig. 10, which is another structural diagram of the apparatus provided in the present application.
An apparatus, comprising:
a receiving unit 1001, configured to receive a target task sent by a second device, where the target task includes task description information and is generated for the second device;
the processing unit 1002 is configured to obtain an arbitration result according to the target task, where the arbitration result is used to indicate a score for the first device to execute the target task;
a sending unit 1003, configured to send the arbitration result to the second device, so that the second device determines, according to the arbitration result, that the first device is a device that executes the target task.
Optionally, the apparatus further comprises:
an obtaining unit 1004, configured to obtain dynamic resource information, where the dynamic resource information represents real-time hardware resource usage information of the first device;
the processing unit 1002 is further configured to perform calculation according to the dynamic resource information to obtain a virtualization calculation force, where the virtualization calculation force represents a calculation force supported by the first device in the current state;
the processing unit 1002 is specifically configured to obtain an arbitration result according to the virtualization computing power and the target task.
Optionally, the processing unit 1002 is specifically configured to obtain task resource information according to the target task, where the task resource information indicates hardware resources required for executing the target task;
the processing unit 1002 is specifically configured to obtain an arbitration result according to the task resource information and the virtualization computation power.
In this embodiment, operations performed by each unit of the device are similar to the steps performed by the first device in the embodiment shown in fig. 7, and detailed description thereof is omitted here.
Please refer to fig. 11, which is a schematic structural diagram of the apparatus provided in the present application.
An apparatus, comprising:
a sending unit 1101, configured to send a target task to a first device, where the target task includes task description information and is generated for a second device;
a receiving unit 1102, configured to receive an arbitration result sent by a first device, where the arbitration result is used to indicate a score for the first device to execute a target task;
a determining unit 1103, configured to determine, according to the arbitration result, that the first device is a device that executes the target task.
In this embodiment, operations performed by each unit of the device are similar to the steps performed by the second device in the embodiment shown in fig. 7, and detailed description thereof is omitted here.
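As a non-limiting organizational analogy for fig. 11, the sketch below strings the three units together into one offload call: the sending unit broadcasts the target task, the receiving unit collects one arbitration result per first device, and the determining unit picks the best score; the transport object and message fields are assumptions made for this sketch.

```python
class SecondDeviceApparatus:
    """Analogy for the sending unit 1101, receiving unit 1102 and determining
    unit 1103 of fig. 11; not an implementation of the actual device."""

    def __init__(self, transport):
        self.transport = transport  # stand-in for the distributed soft bus

    def offload(self, target_task: dict, first_devices: list) -> str:
        # Sending unit 1101: send the target task to each first device.
        for device in first_devices:
            self.transport.send(device, {"type": "target_task", **target_task})

        # Receiving unit 1102: collect one arbitration result per first device.
        scores = {device: self.transport.receive(device)["score"]
                  for device in first_devices}

        # Determining unit 1103: the best score identifies the target device.
        return max(scores, key=scores.get)


# Usage with a hypothetical transport object named `bus`:
# target = SecondDeviceApparatus(bus).offload(
#     {"task_type": "face_recognition", "required_tops": 1.0},
#     ["phone", "tablet", "smart_tv"])
```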
Please refer to fig. 12, which is another structural diagram of the apparatus provided in the present application.
An apparatus, comprising:
a sending unit 1201, configured to send a target task to a first device, where the target task includes task description information and is generated for a second device;
a receiving unit 1202, configured to receive an arbitration result sent by a first device, where the arbitration result is used to indicate a score for the first device to execute a target task;
a determining unit 1203 is configured to determine, according to the arbitration result, that the first device is a device that executes the target task.
Optionally, the sending unit 1201 is specifically configured to send the target task to a plurality of first devices;
the receiving unit 1202 is specifically configured to receive multiple arbitration results sent by multiple first devices, where the multiple arbitration results correspond to the multiple first devices one to one;
the determining unit 1203 is specifically configured to obtain a target arbitration result according to the multiple arbitration results, where the target arbitration result is an optimal arbitration result in the multiple arbitration results;
the determining unit 1203 is specifically configured to determine, according to the target arbitration result, that the corresponding target device is a device that executes the target task, where the target device belongs to one of the multiple first devices.
Optionally, the apparatus further comprises:
a judging unit 1204, configured to judge whether the second device can execute the target task;
and if not, triggering the second device to send the target task to the first device.
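As a non-limiting illustration of the judging unit's decision, the sketch below treats the target task as locally executable only if the second device's own resources cover its requirements; the resource fields and example values are assumptions made for this sketch.

```python
def should_offload(required_tops: float, required_rom_mb: float,
                   local_tops: float, local_free_rom_mb: float) -> bool:
    """Return True when the second device cannot execute the target task
    itself and should therefore send it to the first device(s)."""
    return required_tops > local_tops or required_rom_mb > local_free_rom_mb


# A second device with only 0.5T of computing power available cannot run a
# task requiring 1T locally, so the target task is sent out for arbitration.
print(should_offload(required_tops=1.0, required_rom_mb=30.0,
                     local_tops=0.5, local_free_rom_mb=256.0))  # True
```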
Optionally, the sending unit 1201 is further configured to send target task data to the first device, where the target task data is used for the first device to execute a target task.
In this embodiment, operations performed by each unit of the device are similar to the steps performed by the second device in the embodiment shown in fig. 7, and detailed description thereof is omitted here.
Please refer to fig. 13, which is a schematic structural diagram of an apparatus according to an embodiment of the present application.
The device includes a processor 1301, a memory 1302, a bus 1305, and an interface 1304. The processor 1301 is connected to the memory 1302 and the interface 1304, and the bus 1305 is connected to the processor 1301, the memory 1302, and the interface 1304, respectively. The interface 1304 is used for receiving or transmitting data, and the processor 1301 is a single-core or multi-core central processing unit, an application-specific integrated circuit, or one or more integrated circuits configured to implement the embodiments of the present application. The memory 1302 may be a random access memory (RAM) or a non-volatile memory, such as at least one hard disk memory. The memory 1302 is used to store computer-executable instructions. Specifically, the computer-executable instructions may include a program 1303.
In this embodiment, when the processor 1301 invokes the program 1303, the device in fig. 13 may perform the operations performed by the first device in the embodiment shown in fig. 7, which are not described herein again.
Please refer to fig. 14, which is a schematic structural diagram of an apparatus according to an embodiment of the present application.
The device includes a processor 1401, a memory 1402, a bus 1405, and an interface 1404. The processor 1401 is connected to the memory 1402 and the interface 1404, and the bus 1405 is connected to the processor 1401, the memory 1402, and the interface 1404, respectively. The interface 1404 is used for receiving or transmitting data, and the processor 1401 is a single-core or multi-core central processing unit, an application-specific integrated circuit, or one or more integrated circuits configured to implement the embodiments of the present application. The memory 1402 may be a random access memory (RAM) or a non-volatile memory, such as at least one hard disk memory. The memory 1402 is used to store computer-executable instructions. Specifically, the computer-executable instructions may include a program 1403.
In this embodiment, when the processor 1401 invokes the program 1403, the device in fig. 14 may perform the operations performed by the second device in the embodiment shown in fig. 7, which are not described herein again.
It should be understood that the processor mentioned in or provided by the above embodiments of the present application may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
It should also be understood that the number of processors in the device in the above embodiments in the present application may be one or more, and may be adjusted according to the actual application scenario, and this is merely an exemplary illustration and is not limited. The number of the memories in the embodiment of the present application may be one or multiple, and may be adjusted according to an actual application scenario, and this is merely an exemplary illustration and is not limited.
It should be further noted that, when the device includes a processor (or a processing unit) and a memory, the processor in this application may be integrated with the memory, or the processor and the memory are connected through an interface, and may be adjusted according to an actual application scenario, and is not limited.
The present application provides a chip system comprising a processor for enabling a device to implement the functionality of the controller involved in the above method, e.g. to process data and/or information involved in the above method. In one possible design, the system-on-chip further includes a memory for storing necessary program instructions and data. The chip system may be formed by a chip, or may include a chip and other discrete devices.
In another possible design, when the chip system is a chip in a user equipment or an access network, the chip includes: a processing unit, which may be for example a processor, and a communication unit, which may be for example an input/output interface, a pin or a circuit, etc. The processing unit may execute computer-executable instructions stored by the storage unit to cause a chip within the device or the like to perform the steps performed by the device in the embodiment of fig. 7 described above. Optionally, the storage unit is a storage unit in the chip, such as a register, a cache, and the like, and the storage unit may also be a storage unit located outside the chip in the device and the like, such as a read-only memory (ROM) or another type of static storage device that can store static information and instructions, a Random Access Memory (RAM), and the like.
The embodiment of the present application further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a computer, implements the method flows executed by the apparatus in any of the method embodiments described above. Correspondingly, the computer can be the equipment.
It should be understood that the controller or processor mentioned in the above embodiments of the present application may be a Central Processing Unit (CPU), and may also be one or a combination of various other general purpose processors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, and the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
It should also be understood that the number of processors or controllers in the device or chip system and the like in the above embodiments in the present application may be one or more, and may be adjusted according to practical application scenarios, and this is merely an exemplary illustration and is not limited. The number of the memories in the embodiment of the present application may be one or multiple, and may be adjusted according to an actual application scenario, and this is merely an exemplary illustration and is not limited.
It should also be understood that the memory or the readable storage medium mentioned in the devices in the above embodiments of the present application may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memory. The non-volatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), which is used as an external cache. By way of example but not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), and direct Rambus RAM (DR RAM).
Those of ordinary skill in the art will appreciate that the steps performed by the device or processor in whole or in part to implement the embodiments described above may be performed by hardware or a program instructing associated hardware. The program may be stored in a computer-readable storage medium, which may be read only memory, random access memory, or the like. Specifically, for example: the processing unit or processor may be a central processing unit, a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
When implemented in software, the method steps described in the above embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. The procedures or functions according to the embodiments of the present application are all or partially generated when the computer program instructions are loaded and executed on a computer. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored in a computer readable storage medium or transmitted from one computer readable storage medium to another, for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, fiber optic, Digital Subscriber Line (DSL)) or wirelessly (e.g., infrared, wireless, microwave, etc.). The computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device, such as a server, a data center, etc., that incorporates one or more of the available media. The available media may be magnetic media (e.g., floppy disks, hard disks, tapes), optical media (e.g., DVDs), or semiconductor media, among others.
The terms "first," "second," and the like in the description and in the claims of the present application and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the terms so used are interchangeable under appropriate circumstances and are merely descriptive of the various embodiments of the application and how objects of the same nature can be distinguished. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of elements is not necessarily limited to those elements, but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
The terminology used in the embodiments of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the embodiments of the present application, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that in the description of the present application, unless otherwise indicated, "/" indicates a relationship where the objects associated before and after are an "or", e.g., a/B may indicate a or B; in the present application, "and/or" is only an association relationship describing an associated object, and means that there may be three relationships, for example, a and/or B, and may mean: a exists alone, A and B exist simultaneously, and B exists alone, wherein A and B can be singular or plural.
The word "if" or "if" as used herein may be interpreted as "at … …" or "when … …" or "in response to a determination" or "in response to a detection", depending on the context. Similarly, the phrases "if determined" or "if detected (a stated condition or event)" may be interpreted as "when determined" or "in response to a determination" or "when detected (a stated condition or event)" or "in response to a detection (a stated condition or event)", depending on the context.
The above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present application.

Claims (15)

1. A data processing method, comprising:
the method comprises the steps that a first device receives a target task sent by a second device, wherein the target task comprises task description information, and the target task is generated for the second device;
the first equipment obtains an arbitration result according to the target task, and the arbitration result is used for expressing the score of the first equipment for executing the target task;
and the first equipment sends the arbitration result to the second equipment, so that the second equipment determines that the first equipment is the equipment for executing the target task according to the arbitration result.
2. The method of claim 1, wherein before the first device obtains the arbitration result according to the target task, the method further comprises:
the first equipment acquires dynamic resource information, wherein the dynamic resource information represents real-time hardware resource use information of the first equipment;
the first equipment calculates according to the dynamic resource information to obtain a virtualization computing power, wherein the virtualization computing power represents the computing power supported by the first equipment in the current state;
the first device obtaining an arbitration result according to the target task comprises:
and the first equipment obtains the arbitration result according to the virtualization computing power and the target task.
3. The method of claim 2, wherein obtaining the arbitration result by the first device based on the virtualization computing power and the target task comprises:
the first equipment obtains task resource information according to the target task, wherein the task resource information represents hardware resources required by executing the target task;
and the first equipment obtains an arbitration result according to the task resource information and the virtualization computing power.
4. A data processing method, comprising:
the method comprises the steps that a second device sends a target task to a first device, wherein the target task comprises task description information, and the target task is generated for the second device;
the second equipment receives an arbitration result sent by the first equipment, wherein the arbitration result is used for expressing the score of the first equipment for executing the target task;
and the second equipment determines that the first equipment is the equipment for executing the target task according to the arbitration result.
5. The method of claim 4, wherein the second device sending the target task to the first device comprises:
the second device sends target tasks to a plurality of first devices;
the second device receiving the arbitration result sent by the first device comprises:
the second device receives a plurality of arbitration results sent by the plurality of first devices, and the plurality of arbitration results are in one-to-one correspondence with the plurality of first devices;
the second device determining that the first device is the device executing the target task according to the arbitration result comprises:
the second equipment obtains a target arbitration result according to the plurality of arbitration results, wherein the target arbitration result is the optimal arbitration result in the plurality of arbitration results;
and the second equipment determines that the corresponding target equipment is the equipment for executing the target task according to the target arbitration result, wherein the target equipment belongs to one of the plurality of first equipment.
6. The method of claim 4 or 5, wherein before the second device sends the target task to the first device, the method further comprises:
the second device judges whether the second device can execute the target task;
and if not, triggering the second device to send the target task to the first device.
7. The method of any of claims 4 to 6, wherein after the second device determines that the first device is the device performing the target task according to the arbitration result, the method further comprises:
and the second device sends target task data to the first device, wherein the target task data is used for the first device to execute the target task.
8. An apparatus, comprising:
the device comprises a receiving unit, a processing unit and a processing unit, wherein the receiving unit is used for receiving a target task sent by second equipment, the target task comprises task description information, and the target task is generated for the second equipment;
the processing unit is used for obtaining an arbitration result according to the target task, and the arbitration result is used for expressing the score of the first equipment for executing the target task;
and the sending unit is used for sending the arbitration result to the second equipment so that the second equipment determines that the first equipment is the equipment for executing the target task according to the arbitration result.
9. The apparatus of claim 8, further comprising:
the acquisition unit is used for acquiring dynamic resource information, and the dynamic resource information represents real-time hardware resource use information of the first equipment;
the processing unit is further configured to perform calculation according to the dynamic resource information to obtain a virtualization calculation force, where the virtualization calculation force represents a calculation force supported by the first device in the current state;
the processing unit is specifically configured to obtain the arbitration result according to the virtualization computing power and the target task.
10. The device according to claim 9, wherein the processing unit is specifically configured to obtain task resource information according to the target task, where the task resource information indicates hardware resources required for executing the target task;
the processing unit is specifically configured to obtain an arbitration result according to the task resource information and the virtualization calculation power.
11. An apparatus, comprising:
a sending unit, configured to send a target task to a first device, where the target task includes task description information, and the target task is generated for a second device;
a receiving unit, configured to receive an arbitration result sent by the first device, where the arbitration result is used to indicate a score for the first device to execute the target task;
and the determining unit is used for determining, according to the arbitration result, that the first device is the device for executing the target task.
12. The device according to claim 11, wherein the sending unit is specifically configured to send the target task to a plurality of first devices;
the receiving unit is specifically configured to receive multiple arbitration results sent by the multiple first devices, where the multiple arbitration results correspond to the multiple first devices one to one;
the determining unit is specifically configured to obtain a target arbitration result according to the multiple arbitration results, where the target arbitration result is an optimal arbitration result among the multiple arbitration results;
the determining unit is specifically configured to determine, according to the target arbitration result, that a corresponding target device is a device that executes the target task, where the target device belongs to one of the plurality of first devices.
13. The apparatus according to claim 11 or 12, characterized in that it further comprises:
a judging unit configured to judge whether the second device can execute the target task;
and if not, triggering the second equipment to send the target task to the first equipment.
14. The device according to any one of claims 11 to 13, wherein the sending unit is further configured to send target task data to the first device, the target task data being used for the first device to perform the target task.
15. A readable storage medium storing instructions that, when executed, cause the method of any of claims 1-7 to be implemented.
CN202010943745.7A 2020-09-09 2020-09-09 Data processing method and equipment thereof Pending CN114237861A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010943745.7A CN114237861A (en) 2020-09-09 2020-09-09 Data processing method and equipment thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010943745.7A CN114237861A (en) 2020-09-09 2020-09-09 Data processing method and equipment thereof

Publications (1)

Publication Number Publication Date
CN114237861A true CN114237861A (en) 2022-03-25

Family

ID=80742871

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010943745.7A Pending CN114237861A (en) 2020-09-09 2020-09-09 Data processing method and equipment thereof

Country Status (1)

Country Link
CN (1) CN114237861A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114936223A (en) * 2022-05-27 2022-08-23 阿里云计算有限公司 Data processing method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
WO2021120719A1 (en) Neural network model update method, and image processing method and device
WO2022083536A1 (en) Neural network construction method and apparatus
WO2020221200A1 (en) Neural network construction method, image processing method and devices
WO2021238366A1 (en) Neural network construction method and apparatus
CN112651511B (en) Model training method, data processing method and device
WO2019228358A1 (en) Deep neural network training method and apparatus
EP3933693B1 (en) Object recognition method and device
CN112183718B (en) Deep learning training method and device for computing equipment
CN112990211B (en) Training method, image processing method and device for neural network
WO2022052601A1 (en) Neural network model training method, and image processing method and device
WO2022001805A1 (en) Neural network distillation method and device
US12026938B2 (en) Neural architecture search method and image processing method and apparatus
CN110222718B (en) Image processing method and device
WO2021018245A1 (en) Image classification method and apparatus
CN111783937A (en) Neural network construction method and system
WO2021018251A1 (en) Image classification method and device
WO2023093724A1 (en) Neural network model processing method and device
CN111428854A (en) Structure searching method and structure searching device
CN111931901A (en) Neural network construction method and device
US20220327835A1 (en) Video processing method and apparatus
CN111797992A (en) Machine learning optimization method and device
WO2022156475A1 (en) Neural network model training method and apparatus, and data processing method and apparatus
WO2022179606A1 (en) Image processing method and related apparatus
CN113191479A (en) Method, system, node and storage medium for joint learning
WO2021036397A1 (en) Method and apparatus for generating target neural network model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination