WO2023093053A1 - Inference implementation method, network, electronic device and storage medium - Google Patents

Inference implementation method, network, electronic device and storage medium

Info

Publication number
WO2023093053A1
WO2023093053A1 (application PCT/CN2022/103001)
Authority
WO
WIPO (PCT)
Prior art keywords
inference
reasoning
model
data
edge node
Prior art date
Application number
PCT/CN2022/103001
Other languages
English (en)
French (fr)
Inventor
魏旭宾
朱磊
Original Assignee
达闼科技(北京)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 达闼科技(北京)有限公司
Publication of WO2023093053A1 publication Critical patent/WO2023093053A1/zh

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46: Multiprogramming arrangements
    • G06F 9/50: Allocation of resources, e.g. of the central processing unit [CPU]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 5/00: Computing arrangements using knowledge-based models
    • G06N 5/04: Inference or reasoning models
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/94: Hardware or software architectures specially adapted for image or video understanding
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/96: Management of image or video recognition tasks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00: Scenes; Scene-specific elements
    • G06V 20/10: Terrestrial scenes

Definitions

  • The present application relates to the technical field of artificial intelligence, and in particular to an inference implementation method, a network, an electronic device, and a storage medium.
  • After a terminal device performs inference, the result obtained may not be ideal; for example, when the confidence is low, the terminal will judge the inference to have failed. Once a terminal device fails at inference, it abandons the related task, resulting in a poor user experience.
  • The purpose of the embodiments of the present invention is to provide an inference implementation method, a network, an electronic device, and a storage medium, which can greatly improve the probability and accuracy of successful inference by terminal devices and improve user experience.
  • An embodiment of the present invention provides an inference implementation method applied to an edge node, including: receiving a first inference request sent by a lower-level edge node and/or a terminal device upon inference failure, the first inference request carrying data to be inferred and an inference type; invoking the corresponding target inference model according to the inference type to perform inference on the data to be inferred; if the inference succeeds, returning the inference result output by the target inference model to the corresponding lower-level edge node and/or terminal device; if the inference fails, generating a second inference request from the data to be inferred and the inference type, sending the second inference request to an upper-level edge node and/or a cloud node, and delivering the inference result returned by the upper-level edge node and/or cloud node to the corresponding lower-level edge node and/or terminal device.
  • An embodiment of the present invention also provides an inference implementation method applied to a terminal device, including: collecting data according to a received inference task to obtain data to be inferred; invoking the corresponding target inference model according to the inference type of the task to perform inference on the data to be inferred; and, if the inference fails, generating an inference request from the data to be inferred and the inference type and sending it to an edge node and/or a cloud node, for the edge node and/or cloud node to return an inference result according to the request.
  • An embodiment of the present invention also provides an inference implementation method applied to a cloud node, including: receiving an inference request sent by an edge node and/or a terminal device upon inference failure, the request carrying data to be inferred and an inference type; invoking the corresponding target inference model according to the inference type to perform inference on the data to be inferred; if the inference succeeds, returning the inference result output by the target inference model to the edge node, for the edge node to send the received result to the terminal device, and/or returning the inference result directly to the terminal device; if the inference fails, returning an inference-failure response to the edge node, for the edge node to send the received failure response to the terminal device, and/or returning the failure response directly to the terminal device.
  • An embodiment of the present invention also provides an intelligent distribution network including a cloud node, several terminal devices, and several edge nodes, where the edge nodes, terminal devices, and cloud node are respectively configured to execute the inference implementation methods described above.
  • An embodiment of the present invention also provides an electronic device, including: at least one processor; and a memory communicatively connected to the at least one processor, where the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to execute any of the inference implementation methods described above.
  • An embodiment of the present invention further provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements any of the inference implementation methods described above.
  • An embodiment of the present invention also provides a computer program product, including a computer program/instructions that, when executed by a processor, cause the processor to implement any of the inference implementation methods described above.
  • With the inference implementation method provided by the embodiments of the present invention, a terminal device whose inference fails can send an inference request to an edge node and/or a cloud node, so that the edge node and/or cloud node performs inference according to the request and returns the result.
  • If the edge node also fails, it can in turn send an inference request to its upper-level edge node and/or the cloud central node, which performs inference according to the request and returns the result. In other words, inference is divided into multiple levels, from low to high: the terminal device, at least one edge node, and the cloud node.
  • On-device inference by the terminal is used first, supplemented by edge-node inference and cloud-node inference.
  • This achieves multi-level intelligence: when the current level fails, it can ask the level above to continue the inference, repeating until the inference succeeds and an executable result is obtained, which greatly improves the probability and accuracy of successful inference by terminal devices and improves user experience.
  • Requests also propagate upward one level at a time, which avoids overloading any single node, reduces uplink bandwidth, and speeds up processing and the downstream response.
  • Fig. 1 is a flow chart of an inference implementation method provided in an embodiment of the present invention;
  • Fig. 2 is a flow chart of an inference implementation method provided in another embodiment of the present invention;
  • Fig. 3 is a flow chart of an inference implementation method provided in another embodiment of the present invention;
  • Fig. 4 is an interaction flow chart of an inference implementation method provided in another embodiment of the present invention;
  • Fig. 5 is an interaction flow chart of an inference implementation method provided in another embodiment of the present invention;
  • Fig. 6 is an interaction flow chart of an inference implementation method provided in another embodiment of the present invention;
  • Fig. 7 is an interaction flow chart of an inference implementation method provided in another embodiment of the present invention;
  • Fig. 8 is a schematic structural diagram of an intelligent distribution network provided in another embodiment of the present invention;
  • Fig. 9 is a schematic structural diagram of an electronic device provided in another embodiment of the present invention.
  • As noted in the background, a current terminal device relies only on itself when performing inference, and once inference fails it directly gives up executing the task, resulting in a poor user experience.
  • To solve this problem, an embodiment of the present invention provides an inference implementation method applied to an edge node, including: receiving a first inference request sent by a lower-level edge node and/or a terminal device upon inference failure, the first inference request carrying data to be inferred and an inference type; invoking the corresponding target inference model according to the inference type to perform inference on the data to be inferred; if the inference succeeds, returning the inference result output by the target inference model to the corresponding lower-level edge node and/or terminal device; if the inference fails, generating a second inference request from the data to be inferred and the inference type, sending the second inference request to an upper-level edge node and/or a cloud node, and delivering the inference result returned by the upper-level edge node and/or cloud node to the corresponding lower-level edge node and/or terminal device.
  • With the inference implementation method provided by the embodiments of the present invention, a terminal device whose inference fails can send an inference request to an edge node and/or a cloud node, so that the edge node and/or cloud node performs inference according to the request and returns the result.
  • If the edge node also fails, it can in turn send an inference request to its upper-level edge node and/or the cloud central node, which performs inference according to the request and returns the result. In other words, inference is divided into multiple levels, from low to high: the terminal device, at least one edge node, and the cloud node.
  • On-device inference by the terminal is used first, supplemented by edge-node inference and cloud-node inference.
  • This achieves multi-level intelligence: when the current level fails, it can ask the level above to continue the inference, repeating until the inference succeeds and an executable result is obtained, which greatly improves the probability and accuracy of successful inference by terminal devices and improves user experience.
  • Requests also propagate upward one level at a time, which avoids overloading any single node, reduces uplink bandwidth, and speeds up processing and the downstream response.
  • One aspect of the embodiments of the present invention provides an inference implementation method applied to an edge node, whose flow is shown in FIG. 1.
  • Step 101: receive a first inference request sent by a lower-level edge node and/or a terminal device upon inference failure.
  • The first inference request carries data to be inferred and an inference type.
  • This embodiment does not limit the data to be inferred or the inference type. The data to be inferred can be one of, or a combination of, the following: audio data, video data, text data, image data, and so on.
  • The inference type can be recognition, localization, path planning, and so on.
  • This embodiment also does not limit the format of the data to be inferred; for example, image data can be a grayscale image, a depth image, or a color image.
  • Audio can be in AIFF (Audio Interchange File Format) format, MIDI (Musical Instrument Digital Interface) format, and so on, which will not be enumerated here.
  • Inference failure in this embodiment refers to any of the following situations in the output of the inference model: the inference result cannot be executed, the confidence of the inference result is below a preset confidence threshold, or the task success rate estimated from the inference result is below a preset success rate.
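  • As an illustration only (the patent does not specify data structures or threshold values), this failure test can be sketched as a simple predicate. All names and numbers below are assumptions:

```python
from dataclasses import dataclass

@dataclass
class InferenceOutput:
    result: object                 # e.g. a recognized name or a planned path; None if not executable
    confidence: float              # model confidence in [0, 1]
    estimated_success_rate: float  # task success rate estimated from the result, in [0, 1]

# Illustrative preset thresholds; the actual values are left open by the text.
CONFIDENCE_THRESHOLD = 0.8
SUCCESS_RATE_THRESHOLD = 0.9

def inference_succeeded(output: InferenceOutput) -> bool:
    """Inference fails if the result cannot be executed, the confidence is
    below the preset confidence threshold, or the estimated task success
    rate is below the preset success rate."""
    return (
        output.result is not None
        and output.confidence >= CONFIDENCE_THRESHOLD
        and output.estimated_success_rate >= SUCCESS_RATE_THRESHOLD
    )
```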
  • The inference type can be further refined to a model type.
  • For example, inference models used for recognition can include a face recognition model, an object recognition model, and so on.
  • A face recognition model can be further divided into a FaceNet network model, a Multi-Task Convolutional Neural Network (MTCNN), and so on.
  • Lower-level edge nodes, terminal devices, and the like can establish a communication connection with this edge node.
  • When a lower-level edge node or terminal device fails at inference, it does not directly give up executing the related task; instead, it sends a first inference request to the node above it, i.e., this edge node, to ask this edge node to perform the inference.
  • After receiving the first inference request, the edge node parses it to obtain the data to be inferred and the inference type.
  • Note that edge nodes in this embodiment are all nodes other than terminal devices and the cloud node, not merely the nodes that have a direct communication link with a terminal device.
  • Note also that this embodiment does not limit the number of lower-level edge nodes and terminal devices; there may be one or more of each. That is, the edge node may receive first inference requests sent by one or more lower-level edge nodes, by one or more terminal devices, or by at least one lower-level edge node together with at least one terminal device.
  • In particular, a single lower-level edge node or terminal device may also send multiple first inference requests at the same time, and the data to be inferred and the inference types of these requests may be the same or different. These variations will not be enumerated here.
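  • For concreteness, a first inference request as described in step 101 might be modeled as follows; the field names are hypothetical, since the text only states that a request carries the data to be inferred and the inference type:

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class InferenceRequest:
    data: Union[bytes, str]  # the data to be inferred: audio, video, text, or image payload
    inference_type: str      # e.g. "recognition", "localization", "path_planning"
    model_type: str = ""     # optional refinement, e.g. "face_recognition/MTCNN"

def parse_first_request(request: InferenceRequest):
    """An edge node parses a received first inference request into the
    data to be inferred and the inference type (step 101)."""
    return request.data, request.inference_type
```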
  • Step 102: invoke the corresponding target inference model according to the inference type to perform inference on the data to be inferred.
  • In this embodiment, the edge node already stores several trained inference models, covering at least one model for each of several inference types; for example, an edge node may store two inference models for recognition and three inference models for path planning.
  • In some examples, invoking the corresponding target inference model according to the inference type can be implemented as follows: match against the stored inference models according to the inference type; if the matching succeeds, take the matched inference model as the target inference model and feed the data to be inferred into it.
  • In particular, at least two inference models may match the inference type. In that case each inference model also carries a priority label, and taking the matched model as the target inference model means taking the model whose priority label indicates the highest priority.
  • If the matching fails, the edge node requests an inference model of that inference type from the upper-level edge node and/or cloud node. After receiving the request, the upper-level edge node and/or cloud node looks up the corresponding inference model among its currently stored models according to the inference type and delivers it; the edge node then takes the returned inference model as the target inference model and feeds the data to be inferred into it.
  • It should be understood that the number of inference models returned by the upper-level edge node and/or cloud node is one: when the upper level finds exactly one matching model, it delivers that model; when it finds several, it selects and delivers only the one with the highest priority rather than all of the matches. This reduces the traffic consumed when delivering inference models and prevents edge nodes from storing so many models that memory usage degrades system performance.
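  • A minimal sketch of this matching logic follows, assuming the `StoredModel` fields and the `request_from_upper` callback, none of which are specified in the patent text:

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass
class StoredModel:
    name: str
    inference_type: str
    priority: int  # priority label; a higher value means a higher priority

def resolve_target_model(
    stored: List[StoredModel],
    inference_type: str,
    request_from_upper: Callable[[str], Optional[StoredModel]],
) -> Optional[StoredModel]:
    """Match stored models by inference type; if several match, take the one
    with the highest priority label; if none match, request a single model
    of that type from the upper-level edge node or cloud node."""
    matched = [m for m in stored if m.inference_type == inference_type]
    if matched:
        return max(matched, key=lambda m: m.priority)
    delivered = request_from_upper(inference_type)  # upper level returns at most one model
    if delivered is not None:
        stored.append(delivered)                    # keep it for future requests
    return delivered
```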
  • Note that the priority label in the above example is set according to how long the inference model has been stored and/or the accuracy of the inference model, and that it ranks models belonging to the same inference type against each other rather than ranking all stored models globally. For example, the node evaluates factors such as the accuracy of the stored inference models on a preset cycle and lowers the priority of any model whose accuracy falls below a preset threshold.
  • The node can then request from the upper-level edge node and/or cloud node a new inference model of the type whose priority was lowered, and the newly returned model will be assigned a higher priority.
  • When a model's priority falls past a preset level, the model is judged invalid and deleted.
  • Alternatively, the node can check in real time whether a stored model has been held longer than a preset duration threshold and lower the priority of any model that exceeds it.
  • Understandably, to avoid degrading system performance by storing too many inference models, the node can also detect whether a model will no longer be invoked or produces unreliable output, i.e., whether it has become invalid, so that invalid models can be deleted. The inference implementation method therefore also includes: detecting whether stored inference models are invalid according to preset failure conditions, the failure conditions including the storage time exceeding a preset effective duration and/or the accuracy falling below a preset threshold; and, when at least one stored inference model is found to be invalid, deleting it.
  • For example, the node can check whether the Time To Live (TTL) value of a stored inference model is 0 and judge any model with a TTL of 0 to be invalid; or check whether a stored model has gone unused for a long period and judge long-unused models invalid; or check the accuracy of models and judge low-accuracy models invalid.
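  • A sketch of such an invalidation sweep is shown below; the TTL and accuracy fields and the threshold value are illustrative assumptions:

```python
import time
from dataclasses import dataclass
from typing import List

@dataclass
class ManagedModel:
    name: str
    stored_at: float  # wall-clock time when the model was stored
    ttl: float        # preset effective duration, in seconds
    accuracy: float   # most recently evaluated accuracy, in [0, 1]

def purge_invalid_models(models: List[ManagedModel],
                         min_accuracy: float = 0.7) -> List[ManagedModel]:
    """Keep only models whose effective duration has not elapsed and whose
    accuracy meets the preset threshold; everything else is judged invalid
    and dropped, freeing storage on the node."""
    now = time.time()
    return [
        m for m in models
        if (now - m.stored_at) < m.ttl and m.accuracy >= min_accuracy
    ]
```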
  • Note also that a stored inference model may have been requested by the edge node from the upper-level edge node and/or cloud node, or may have been proactively delivered by the upper level. The inference implementation method therefore also includes: receiving and storing inference models proactively delivered by the upper-level edge node and/or the cloud node. Proactive delivery can be periodic, or trigger conditions can be set, such as at least two of the newest models of some inference type never having been delivered, or a delivery instruction being received from an administrator. Similarly, the edge node also proactively delivers the highest-priority model among its stored inference models to its lower-level edge nodes and/or terminal devices, so the inference implementation method also includes: proactively delivering inference models downstream.
  • Step 103: check whether the inference succeeded; if so, execute step 104, otherwise execute step 105.
  • As explained for inference failure under step 101, whether the inference succeeded can be determined by examining the output of the inference model: whether the inference result is executable, whether the confidence exceeds the preset confidence threshold, and whether the task success rate estimated from the result exceeds the preset success rate; in short, whether the inference result can be executed with a sufficiently high success rate.
  • Step 104: return the inference result output by the target inference model to the corresponding lower-level edge node and/or terminal device.
  • In some examples, after the target inference model has been invoked to perform inference on the data to be inferred, the inference implementation method further includes: when the inference succeeds, reporting the feature information of the data to be inferred to the upper-level edge node and/or the cloud node, so that the cloud node can train inference models offline from that feature information.
  • In other examples, the inference implementation method also includes: receiving the feature information of the data to be inferred that a lower-level edge node and/or terminal device sends when its inference succeeds, and forwarding that feature information to the upper-level edge node and/or the cloud node.
  • The feature information of the data to be inferred can be obtained through a dedicated feature extraction model, or extracted by the inference model itself during inference; this embodiment does not limit how the feature information is acquired.
  • Step 105: generate a second inference request from the data to be inferred and the inference type, send the second inference request to the upper-level edge node and/or cloud node, and deliver the inference result returned by the upper-level edge node and/or cloud node to the corresponding lower-level edge node and/or terminal device.
  • Note that "first inference request" and "second inference request" in this embodiment merely distinguish the inference request an edge node receives from the one it sends out: the first inference request is the one received by the edge node and the second is the one it sends. They do not restrict whether a request originates from a terminal device or an edge node.
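  • Putting steps 101-105 together, the edge-node flow can be sketched as below, reusing the hypothetical `InferenceRequest` and `inference_succeeded` helpers from the earlier sketches; `local_infer` and `send_upstream` stand in for transport and model-invocation details the text leaves open:

```python
def handle_first_request(request, local_infer, send_upstream):
    """Edge-node flow: try local inference on the received first inference
    request; on success return the result downstream (step 104), otherwise
    repackage the same data and type as a second inference request and
    escalate it to the upper-level edge node or cloud node (step 105)."""
    output = local_infer(request.data, request.inference_type)
    if output is not None and inference_succeeded(output):
        return output.result
    second_request = InferenceRequest(request.data, request.inference_type)
    return send_upstream(second_request)  # the result (or failure) propagates back down
```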
  • Another aspect of the embodiments of the present invention provides an inference implementation method applied to a terminal device, where the terminal device may be a user-side electronic device such as a robot, a mobile phone, or a computer.
  • Its flow is shown in FIG. 2.
  • Step 201: collect data according to the received inference task to obtain the data to be inferred.
  • The inference implementation method provided in this embodiment is broadly the same as that of the first method embodiment; the description below focuses on the differences.
  • In this embodiment, the terminal device receives control instructions from the user.
  • For example, a robot receives instructions from the user's control terminal or from the robot supplier's cloud platform directing it to perform tasks such as object recognition or path planning.
  • To complete the task indicated by the instruction, the robot goes through three stages: perception, inference, and execution.
  • Data collection is the perception stage: the device invokes the relevant sensors to perceive the environment so that inference can be performed on the perceived information; the collected data is therefore the data to be inferred, which is also the input of the inference model. Likewise, when a user starts a program with the mouse to run object recognition on an uploaded picture, the computer loads the picture to perceive it and runs the relevant program to recognize the objects in it; further examples are omitted here.
  • Step 202: invoke the corresponding target inference model according to the inference type of the inference task to perform inference on the data to be inferred.
  • In some examples, this can be implemented as follows: match against the stored inference models according to the inference type; if the matching succeeds, take the matched model as the target inference model and feed the data to be inferred into it; if the matching fails, request an inference model of that type from the edge node and/or cloud node, take the returned model as the target inference model, and feed the data to be inferred into it.
  • Several inference models may match; specifically, the matched models each carry a priority label.
  • In that case, taking the matched model as the target inference model means taking the model whose priority label indicates the highest priority.
  • The priority label is set according to how long the inference model has been stored and/or the accuracy of the inference model.
  • Note that the stored inference models include models requested from upper-level edge nodes as well as models proactively delivered by them; in other examples, the inference implementation method therefore also includes: receiving and storing inference models proactively delivered by the edge node and/or cloud node.
  • Note also that, to prevent an excess of stored models from degrading the operation of the terminal device, the stored models are checked for invalidation so that invalid models can be deleted. In other examples, the method further includes: detecting whether stored inference models are invalid according to preset failure conditions, the conditions including the storage time exceeding a preset effective duration and/or the accuracy falling below a preset threshold; and deleting any stored model found to be invalid.
  • Step 203: check whether the inference succeeded; if so, execute step 204, otherwise execute step 205.
  • Step 204: report the feature information of the data to be inferred to the cloud node via the edge node, or send it to the cloud node directly, so that the cloud node can train inference models offline from the received feature information.
  • Understandably, the terminal node may be connected to the cloud node directly, or through at least one edge node (for example, the terminal node connects to an edge node and the edge node connects to the cloud node).
  • Accordingly, in this embodiment the terminal node may first send the feature information of the data to be inferred to the edge node, which upon receipt forwards it straight to its upper-level edge node and/or the cloud node until the cloud node receives it; alternatively, the terminal node may send the feature information to the cloud node directly.
  • Step 205: generate an inference request from the data to be inferred and the inference type, and send the request to the edge node and/or cloud node, for the edge node and/or cloud node to return an inference result according to the request.
  • Note that the inference result returned by an edge node is not necessarily one computed by the edge node that received the request; it may also be a result computed by that edge node's upper-level edge node or by the cloud node, which is likewise fed back through the edge node.
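  • As a rough sketch of steps 201-205 (again with hypothetical names: `collect_data`, `resolve_model`, and `send_to_edge` stand in for details the text leaves open, and the earlier helper sketches are reused):

```python
def run_inference_task(task, collect_data, resolve_model, send_to_edge):
    """Terminal-device flow: perceive, infer locally, and fall back to the
    edge/cloud on failure instead of abandoning the task."""
    data = collect_data(task)                    # step 201: perception
    model = resolve_model(task.inference_type)   # step 202: local or requested model
    output = model(data) if model is not None else None
    if output is not None and inference_succeeded(output):
        return output.result                     # local success (step 204 path)
    # Step 205: let the edge node (or its ancestors) answer on our behalf.
    return send_to_edge(InferenceRequest(data, task.inference_type))
```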
  • Another aspect of the embodiments of the present invention provides an inference implementation method applied to a cloud node, whose flow is shown in FIG. 3.
  • Step 301: receive an inference request sent by an edge node and/or a terminal node upon inference failure.
  • The inference request carries data to be inferred and an inference type.
  • The inference implementation method provided in this embodiment is broadly the same as that of the first method embodiment; the description below focuses on the differences.
  • Step 302: invoke the corresponding target inference model according to the inference type to perform inference on the data to be inferred.
  • In some examples, this can be implemented as follows: match against the stored inference models according to the inference type, take the matched inference model as the target inference model, and feed the data to be inferred into it.
  • In particular, several models may match; that is, the matched models each carry a priority label.
  • In that case, taking the matched model as the target inference model means taking the model whose priority label indicates the highest priority.
  • Here the priority label is set according to the generation time of the inference model and/or its accuracy.
  • In other examples, the inference implementation method further includes: detecting whether stored inference models are invalid according to preset failure conditions, the conditions including the storage time exceeding a preset effective duration and/or the accuracy falling below a preset threshold; and deleting any stored model found to be invalid.
  • Step 303: check whether the inference succeeded; if so, execute step 304, otherwise execute step 305.
  • Step 304: return the inference result output by the target inference model to the edge node, for the edge node to forward the received result to the terminal device, and/or return the inference result directly to the terminal device.
  • Step 305: return an inference-failure response to the edge node, for the edge node to forward the received failure response to the terminal device, and/or return the failure response directly to the terminal device.
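  • Since the cloud node is the top of the hierarchy, its handler either answers or reports failure. A minimal sketch follows (the response format is assumed, not specified by the text, and the earlier helpers are reused):

```python
def handle_cloud_request(request, cloud_infer):
    """Cloud-node flow for steps 301-305: the cloud node does not escalate
    further, so on failure it returns an explicit inference-failure
    response that is propagated back down to the requester."""
    output = cloud_infer(request.data, request.inference_type)
    if output is not None and inference_succeeded(output):
        return {"status": "ok", "result": output.result}  # step 304
    return {"status": "inference_failed"}                 # step 305
```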
  • In some examples, the inference implementation method further includes: receiving the feature information of the data to be inferred that an edge node and/or terminal device sends when its inference succeeds.
  • Further, the cloud node also trains inference models from the received inference data and its feature information. In other examples, the method therefore also includes: training inference models offline from the inference data and/or feature information of different inference categories received within a preset period, to obtain inference models for the different categories. For example, it may be agreed that inference model training takes place on the first day of each month.
  • The training process includes partitioning the data to be inferred and its features received over the previous month by inference type, yielding training data for each inference type.
  • The data to be inferred may require further feature extraction to obtain feature information, which is then used to train predefined initial models.
  • In particular, each inference type can use multiple training methods and multiple initial models to obtain multiple inference models, which will not be detailed one by one here.
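  • The partitioning step can be sketched as follows; the sample representation is an assumption:

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

def partition_training_data(
    samples: Iterable[Tuple[str, list]],
) -> Dict[str, List[list]]:
    """Group (inference_type, feature_vector) samples received over the
    preset period (e.g. the previous month) by inference type, yielding
    one training set per type for the offline training runs."""
    by_type: Dict[str, List[list]] = defaultdict(list)
    for inference_type, features in samples:
        by_type[inference_type].append(features)
    return dict(by_type)
```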
  • In other examples, the inference implementation method also includes: proactively delivering inference models to edge nodes and/or terminal devices.
  • The following takes a recognition scene with a robot as the terminal device as an example, and describes the execution flow when inference succeeds at the different levels (terminal device, edge node, and cloud node).
  • When the robot's own inference succeeds, the inference implementation method includes:
  • Step 401: the robot photographs the target area according to the recognition instruction issued by the user, obtaining a target image.
  • Step 402: the robot matches the inference models used for object recognition among its stored models and takes the highest-priority matched model as the target inference model.
  • Step 403: the robot invokes the target inference model to process the target image, obtaining the name and confidence of the target object in the image.
  • Step 404: having judged from the confidence that recognition succeeded, the robot sends the feature information of the target image to the edge node.
  • Step 405: the edge node forwards the received feature information of the target image to the cloud node.
  • Step 406: the cloud node stores the feature information of the target image.
  • When the robot's inference fails but the edge node's inference succeeds, the inference implementation method includes:
  • Step 501: the robot photographs the target area according to the recognition instruction issued by the user, obtaining a target image.
  • Step 502: the robot matches an inference model for object recognition among its stored models and takes the matched model as the first target inference model.
  • Step 503: the robot invokes the first target inference model to process the target image, obtaining a first name and a first confidence for the target object in the image.
  • Step 504: having judged from the first confidence that recognition failed, the robot sends a first inference request to the edge node, carrying the target image and information indicating that the inference type is object recognition.
  • Step 505: the edge node receives the first inference request and parses out the target image and the information indicating that the inference type is object recognition.
  • Step 506: the edge node matches the inference models used for object recognition among its stored models and takes the highest-priority matched model as the second target inference model.
  • Step 507: the edge node invokes the second target inference model to process the target image, obtaining a second name and a second confidence for the target object in the image.
  • Step 508: having determined from the second confidence that recognition succeeded, the edge node sends the feature information of the target image to its upper-level edge node.
  • Step 509: the upper-level edge node forwards the received feature information of the target image to the cloud node.
  • Step 510: the cloud node receives and stores the feature information of the target image.
  • Step 511: the edge node returns the second name of the target object to the robot.
  • When the robot's inference fails and the edge node must first request a model from the cloud node, the inference implementation method includes:
  • Step 601: the robot photographs the target area according to the recognition instruction issued by the user, obtaining a target image.
  • Step 602: the robot matches an inference model for object recognition among its stored models and takes the matched model as the first target inference model.
  • Step 603: the robot invokes the first target inference model to process the target image, obtaining a first name and a first confidence for the target object in the image.
  • Step 604: having judged from the first confidence that recognition failed, the robot sends a first inference request to the edge node, carrying the target image and information indicating that the inference type is object recognition.
  • Step 605: the edge node receives the first inference request and parses out the target image and the information indicating that the inference type is object recognition.
  • Step 606: the edge node attempts to match an inference model for object recognition among its stored models.
  • Step 607: the matching fails, so the edge node requests an inference model for object recognition from the cloud node.
  • Step 608: upon receiving the request, the cloud node delivers its highest-priority inference model for object recognition to the edge node.
  • Step 609: the edge node invokes the inference model returned by the cloud node to process the target image, obtaining a second name and a second confidence for the target object in the image.
  • Step 610: having determined from the second confidence that recognition succeeded, the edge node sends the feature information of the target image to the cloud node.
  • Step 611: the cloud node receives and saves the feature information of the target image.
  • Step 612: the edge node sends the second name to the robot.
  • When neither the robot nor the edge node holds a suitable model and the cloud node delivers one, the inference implementation method includes:
  • Step 701: the robot photographs the target area according to the recognition instruction issued by the user, obtaining a target image.
  • Step 702: the robot attempts to match an inference model for object recognition among its stored models.
  • Step 703: the matching fails, so the robot requests an inference model for object recognition from the edge node.
  • Step 704: upon receiving the robot's request, the edge node attempts to match an inference model for object recognition among its stored models.
  • Step 705: the matching fails, so the edge node requests an inference model for object recognition from the cloud node.
  • Step 706: upon receiving the edge node's request, the cloud node matches the highest-priority inference model for object recognition among its stored models.
  • Step 707: the cloud node delivers the matched inference model to the edge node.
  • Step 708: the edge node forwards the received inference model to the robot.
  • Step 709: the robot invokes the received inference model to process the target image, obtaining the name and confidence of the target object in the image.
  • Step 710: having judged from the confidence that recognition succeeded, the robot sends the feature information of the target image to the edge node.
  • Step 711: the edge node forwards the received feature information of the target image to the cloud node.
  • Step 712: the cloud node receives and saves the feature information of the target image.
  • Note that, depending on the topology, the robot may also send feature information to the cloud node through a chain of edge nodes (the edge node, its upper-level edge node, and so on).
  • The above examples do not cover scenarios such as proactive delivery of inference models, inference model training, the cloud node receiving the raw data collected by the robot instead of feature information, or the robot communicating with the cloud directly to upload feature information of the data to be inferred or to request inference models from the cloud node.
  • Nor do they cover cases where the robot and the edge nodes also receive inference models from their superior nodes; where the robot and the edge nodes at every level all fail at inference, so the target image is passed up to the cloud node for inference, with the cloud node feeding the result back down through the edge nodes to the robot on success, or feeding an inference-failure response back down on failure; or where the cloud trains an inference model from the target image. These will not be elaborated one by one here.
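  • Viewed end to end, the scenarios above all follow one cascade: each level only escalates after its own inference fails. A compact sketch, reusing the earlier hypothetical helpers:

```python
def cascade(levels, data, inference_type):
    """Walk the hierarchy (terminal device, then each edge node, then the
    cloud node) and return the first executable inference result; None
    models the inference-failure response propagated back down when every
    level fails."""
    request = InferenceRequest(data, inference_type)
    for infer_at_level in levels:  # ordered from terminal to cloud
        output = infer_at_level(request)
        if output is not None and inference_succeeded(output):
            return output.result
    return None
```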
  • The step divisions in the methods above are made only for clarity of description; in implementation, steps may be merged into one or split into several, and as long as the same logical relationship is preserved they fall within the protection scope of this patent. Likewise, adding insignificant modifications to an algorithm or flow, or introducing insignificant designs, without changing the core design of the algorithm and flow, falls within the protection scope of this patent.
  • Another aspect of the embodiments of the present invention provides an intelligent distribution network, as shown in FIG. 8, including a cloud node, several terminal devices, and several edge nodes.
  • The cloud node is configured to: receive an inference request sent by an edge node and/or terminal device upon inference failure, the request carrying data to be inferred and an inference type; invoke the corresponding target inference model according to the inference type to perform inference on the data; if the inference succeeds, return the inference result to the edge node for forwarding to the terminal device and/or return it directly to the terminal device; and, if the inference fails, return an inference-failure response to the edge node, for the edge node to send the received failure response to the terminal device, and/or return the failure response directly to the terminal device.
  • The terminal device is configured to: collect data according to a received inference task, obtaining the data to be inferred; invoke the corresponding target inference model according to the task's inference type to perform inference on the data; and, if the inference fails, generate an inference request from the data to be inferred and the inference type and send it to an edge node and/or the cloud node, for the edge node and/or cloud node to return an inference result according to the request.
  • The edge node is configured to: receive a first inference request sent by a lower-level edge node and/or terminal device upon inference failure, the request carrying data to be inferred and an inference type; invoke the corresponding target inference model according to the inference type to perform inference on the data; if the inference succeeds, return the inference result output by the target inference model to the corresponding lower-level edge node and/or terminal device; and, if the inference fails, generate a second inference request from the data to be inferred and the inference type, send it to the upper-level edge node and/or cloud node, and deliver the returned inference result to the corresponding lower-level edge node and/or terminal device.
  • It should be noted that this intelligent distribution network is in fact a network of centralized training and distributed inference.
  • The cloud node, as the central node, conducts the training of the inference models: the data to be inferred, and the feature information of that data, uploaded by edge nodes or terminal devices at every level are concentrated at the cloud node to train the various inference models there, while the inference models stored by edge nodes and terminal devices at every level come from their superior nodes, either delivered proactively or obtained by actively requesting the superior node.
  • When a node performs inference and its inference fails, it sends an inference request to its superior node, and so on until the request reaches the cloud node. That is, on-device recognition by the terminal is used first, supplemented by edge recognition and cloud recognition, achieving multi-level intelligence; this improves recognition accuracy and response speed, while the level-by-level processing reduces uplink bandwidth, ultimately improving user experience.
  • In other words, the intelligent distribution network realizes intelligent management of inference models, and what its data transfers distribute is intelligence itself: application images, application runtime configuration, and application-related data, which is more complex, more flexible, and more changeable than ordinary content distribution. Through the intelligent distribution network, cloud intelligence can quickly sink to the edge and edge intelligence can quickly move up to the cloud, so that a given piece of intelligence can run freely and efficiently throughout the whole network; this breaks the application gap between cloud and edge, sinking cloud intelligence to the edge in exchange for bandwidth savings and lower latency, and moving edge intelligence to the cloud in exchange for stronger processing performance and larger data.
  • Moreover, the intelligent distribution network takes the terminal devices as its source points, so bottom-up network traffic occupies the dominant position.
  • In general, the average uplink network traffic of smart machines is 10 to 100 times the average downlink network traffic; the network can therefore make full use of uplink traffic to improve inference efficiency, effectively reduce the uplink bandwidth of the backbone network, and achieve the purpose of improving backbone bandwidth utilization and saving costs.
  • It is easy to see that this embodiment corresponds to the method embodiments and can be implemented in cooperation with them.
  • The relevant technical details mentioned in the method embodiments remain valid in this embodiment and, to reduce repetition, are not repeated here.
  • Correspondingly, the technical details mentioned in this embodiment can also be applied in the method embodiments.
  • By using the edge nodes in the intelligent distribution network to process the data to be inferred, the upstream backbone traffic uploaded to the cloud is reduced and bandwidth is saved.
  • The data is processed level by level at the edge nodes of the intelligent distribution network, and the edge nodes do not need to train inference models, which reduces computing-power consumption; meanwhile, the cloud node, through centralized training, obtains a more sufficient sample space and, using the stronger computing power of the cloud, trains more accurate models; and the edge nodes dynamically obtain the feature information library from the cloud, ensuring that the latest feature information library is used.
  • It is worth mentioning that the modules involved in this embodiment are logical modules.
  • In practical applications, a logical unit can be one physical unit, part of a physical unit, or a combination of multiple physical units.
  • In addition, to highlight the innovative part of the present invention, units not closely related to solving the technical problem raised by the present invention are not introduced in this embodiment, but this does not mean that no other units exist in this embodiment.
  • Another aspect of the embodiments of the present invention provides an electronic device, as shown in FIG. 9, including: at least one processor 901; and a memory 902 communicatively connected to the at least one processor 901, where the memory 902 stores instructions executable by the at least one processor 901, and the instructions are executed by the at least one processor 901 to enable it to execute the inference implementation method of any of the foregoing method embodiments.
  • The memory 902 and the processor 901 are connected by a bus, which may comprise any number of interconnected buses and bridges linking together the one or more processors 901 and the various circuits of the memory 902.
  • The bus may also link together various other circuits such as peripherals, voltage regulators, and power management circuits, all of which are well known in the art and therefore not described further here.
  • A bus interface provides an interface between the bus and the transceiver.
  • The transceiver may be a single element or multiple elements, such as multiple receivers and transmitters, providing means for communicating with various other apparatus over a transmission medium.
  • Data processed by the processor 901 is transmitted over the wireless medium through an antenna; further, the antenna also receives data and passes it to the processor 901.
  • The processor 901 is responsible for managing the bus and general processing, and may also provide various functions including timing, peripheral interfaces, voltage regulation, power management, and other control functions, while the memory 902 may be used to store data used by the processor 901 when performing operations.
  • Another aspect of the embodiments of the present invention provides a computer-readable storage medium storing a computer program.
  • When the computer program is executed by a processor, the inference implementation method of any of the above method embodiments is implemented.
  • Those skilled in the art will understand that all or part of the steps of the methods of the above embodiments can be completed by instructing the relevant hardware through a program stored in a storage medium, the program including several instructions for causing a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of the present invention.
  • The aforementioned storage media include media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
  • Another aspect of the embodiments of the present invention provides a computer program product, including a computer program/instructions which, when executed by a processor, cause the processor to implement the inference implementation method of any of the above method embodiments.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the present invention relate to the technical field of artificial intelligence, and disclose an inference implementation method, a network, an electronic device, and a storage medium. Applied to an edge node, the method includes: receiving a first inference request sent by a lower-level edge node and/or a terminal device upon inference failure, the first inference request carrying data to be inferred and an inference type; invoking a target inference model according to the inference type to perform inference on the data to be inferred; if the inference succeeds, returning the inference result output by the target inference model to the lower-level edge node and/or terminal device; if the inference fails, generating a second inference request from the data to be inferred and the inference type, sending the second inference request to an upper-level edge node and/or a cloud node, and delivering the inference result returned by the upper-level edge node and/or cloud node to the lower-level edge node and/or terminal device. This greatly improves the probability and accuracy of successful inference by terminal devices and improves user experience.

Description

Inference implementation method, network, electronic device and storage medium
Cross-reference
This application claims priority to Chinese Patent Application No. 202111413826.7, entitled "Inference implementation method, network, electronic device and storage medium", filed on November 25, 2021, which is incorporated herein by reference in its entirety.
Technical field
The present application relates to the technical field of artificial intelligence, and in particular to an inference implementation method, a network, an electronic device, and a storage medium.
Background
For a terminal device to realize artificial intelligence, it usually goes through processes such as perception and inference. For example, when a robot needs to grasp an object in a room, it usually invokes some of its own or peripheral sensors to capture images of the room so as to perceive the current environment, then performs inference based on the perceived distribution of obstacles in the environment and the position of the target object to obtain a motion-planning result, and then executes according to that result until the target object is grasped and the grasping task is complete.
However, the result a terminal device obtains after inference is not necessarily ideal; for example, when the confidence is low, the terminal will judge the inference to have failed. And once a terminal device fails at inference, it abandons the related task, resulting in a very poor user experience.
Summary of the invention
The purpose of the embodiments of the present invention is to provide an inference implementation method, a network, an electronic device, and a storage medium that can greatly improve the probability and accuracy of successful inference by terminal devices, so that user experience is improved.
To achieve the above purpose, an embodiment of the present invention provides an inference implementation method applied to an edge node, including: receiving a first inference request sent by a lower-level edge node and/or a terminal device upon inference failure, the first inference request carrying data to be inferred and an inference type; invoking the corresponding target inference model according to the inference type to perform inference on the data to be inferred; if the inference succeeds, returning the inference result output by the target inference model to the corresponding lower-level edge node and/or terminal device; if the inference fails, generating a second inference request from the data to be inferred and the inference type, sending the second inference request to an upper-level edge node and/or a cloud node, and delivering the inference result returned by the upper-level edge node and/or cloud node to the corresponding lower-level edge node and/or terminal device.
To achieve the above purpose, an embodiment of the present invention also provides an inference implementation method applied to a terminal device, including: collecting data according to a received inference task to obtain data to be inferred; invoking the corresponding target inference model according to the inference type of the task to perform inference on the data to be inferred; and, if the inference fails, generating an inference request from the data to be inferred and the inference type and sending it to an edge node and/or a cloud node, for the edge node and/or cloud node to return an inference result according to the request.
To achieve the above purpose, an embodiment of the present invention also provides an inference implementation method applied to a cloud node, including: receiving an inference request sent by an edge node and/or a terminal device upon inference failure, the request carrying data to be inferred and an inference type; invoking the corresponding target inference model according to the inference type to perform inference on the data to be inferred; if the inference succeeds, returning the inference result output by the target inference model to the edge node, for the edge node to send the received result to the terminal device, and/or returning the inference result output by the target inference model directly to the terminal device; if the inference fails, returning an inference-failure response to the edge node, for the edge node to send the received failure response to the terminal device, and/or returning the failure response directly to the terminal device.
To achieve the above purpose, an embodiment of the present invention also provides an intelligent distribution network including a cloud node, several terminal devices, and several edge nodes, where the edge nodes, terminal devices, and cloud node are respectively configured to execute the inference implementation methods described above.
To achieve the above purpose, an embodiment of the present invention also provides an electronic device, including: at least one processor; and a memory communicatively connected to the at least one processor, where the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to execute any of the inference implementation methods described above.
To achieve the above purpose, an embodiment of the present invention also provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements any of the inference implementation methods described above.
To achieve the above purpose, an embodiment of the present invention also provides a computer program product, including a computer program/instructions that, when executed by a processor, cause the processor to implement any of the inference implementation methods described above.
With the inference implementation method provided by the embodiments of the present invention, a terminal device whose inference fails can send an inference request to an edge node and/or a cloud node, so that the edge node and/or cloud node performs inference according to the request and returns the result; in particular, if the edge node also fails, it can in turn send an inference request to its upper-level edge node and/or the cloud central node, which performs inference according to the request and returns the result. In other words, inference is divided into multiple levels, from low to high: the terminal device, at least one edge node, and the cloud node. On-device inference by the terminal is used first, supplemented by edge-node and cloud-node inference, achieving multi-level intelligence; when the current level fails, it can ask the level above to continue the inference, repeating until the inference succeeds and an executable result is obtained, which greatly improves the probability and accuracy of successful inference by terminal devices and improves user experience. Requests also propagate upward one level at a time, which avoids overloading any single node, reduces uplink bandwidth, and speeds up processing and the downstream response. In addition, compared with relying solely on each terminal device itself, this avoids the situation where a terminal device abandons a task as soon as its own inference fails; and since tasks can be executed using inference results produced by edge nodes or the cloud node, whose computing power is better than a terminal device's, the task completion rate and accuracy are both higher.
Brief description of the drawings
One or more embodiments are illustrated by the figures in the corresponding drawings. These illustrations do not constitute a limitation on the embodiments; elements with the same reference numerals in the drawings denote similar elements, and unless otherwise stated the figures are not drawn to scale.
Fig. 1 is a flow chart of an inference implementation method provided in an embodiment of the present invention;
Fig. 2 is a flow chart of an inference implementation method provided in another embodiment of the present invention;
Fig. 3 is a flow chart of an inference implementation method provided in another embodiment of the present invention;
Fig. 4 is an interaction flow chart of an inference implementation method provided in another embodiment of the present invention;
Fig. 5 is an interaction flow chart of an inference implementation method provided in another embodiment of the present invention;
Fig. 6 is an interaction flow chart of an inference implementation method provided in another embodiment of the present invention;
Fig. 7 is an interaction flow chart of an inference implementation method provided in another embodiment of the present invention;
Fig. 8 is a schematic structural diagram of an intelligent distribution network provided in another embodiment of the present invention;
Fig. 9 is a schematic structural diagram of an electronic device provided in another embodiment of the present invention.
Detailed description
As can be seen from the background, current terminal devices rely only on themselves when performing inference, and once inference fails they directly give up executing the task, giving a very poor user experience.
To solve this problem, an embodiment of the present invention provides an inference implementation method applied to an edge node, including: receiving a first inference request sent by a lower-level edge node and/or a terminal device upon inference failure, the first inference request carrying data to be inferred and an inference type; invoking the corresponding target inference model according to the inference type to perform inference on the data to be inferred; if the inference succeeds, returning the inference result output by the target inference model to the corresponding lower-level edge node and/or terminal device; if the inference fails, generating a second inference request from the data to be inferred and the inference type, sending the second inference request to an upper-level edge node and/or a cloud node, and delivering the inference result returned by the upper-level edge node and/or cloud node to the corresponding lower-level edge node and/or terminal device.
With the inference implementation method provided by the embodiments of the present invention, a terminal device whose inference fails can send an inference request to an edge node and/or a cloud node, so that the edge node and/or cloud node performs inference according to the request and returns the result; in particular, if the edge node also fails, it can in turn send an inference request to its upper-level edge node and/or the cloud central node, which performs inference according to the request and returns the result. In other words, inference is divided into multiple levels, from low to high: the terminal device, at least one edge node, and the cloud node. On-device inference by the terminal is used first, supplemented by edge-node and cloud-node inference, achieving multi-level intelligence; when the current level fails, it can ask the level above to continue the inference, repeating until the inference succeeds and an executable result is obtained, which greatly improves the probability and accuracy of successful inference by terminal devices and improves user experience. Requests also propagate upward one level at a time, which avoids overloading any single node, reduces uplink bandwidth, and speeds up processing and the downstream response. In addition, compared with relying solely on each terminal device itself, this avoids the situation where a terminal device abandons a task as soon as its own inference fails; and since tasks can be executed using inference results produced by edge nodes or the cloud node, whose computing power is better than a terminal device's, the task completion rate and accuracy are both higher.
To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the embodiments of the present invention are described in detail below with reference to the drawings. However, those of ordinary skill in the art will appreciate that many technical details are given in the embodiments to help the reader better understand the present invention; the technical solutions claimed by the present invention can nevertheless be realized even without these details and with various changes and modifications based on the following embodiments.
The division into the following embodiments is for convenience of description and should not constitute any limitation on the specific implementation of the present invention; the embodiments can be combined with and refer to one another where they are not contradictory.
One aspect of the embodiments of the present invention provides an inference implementation method applied to an edge node, whose flow is shown in Fig. 1.
Step 101: receive a first inference request sent by a lower-level edge node and/or a terminal device upon inference failure, the first inference request carrying data to be inferred and an inference type.
This embodiment does not limit the data to be inferred or the inference type. The data to be inferred can be one of, or a combination of, the following: audio data, video data, text data, image data, and so on; the inference type can be recognition, localization, path planning, and so on. In particular, this embodiment also does not limit the format of the data to be inferred; for example, image data can be a grayscale image, a depth image, or a color image, and audio can be in AIFF (Audio Interchange File Format) or MIDI (Musical Instrument Digital Interface) format, which will not be enumerated here.
Moreover, inference failure in this embodiment refers to any of the following situations in the output of the inference model: the inference result cannot be executed, the confidence of the inference result is below a preset confidence threshold, or the task success rate estimated from the inference result is below a preset success rate.
In particular, the inference type can be further refined to a model type; for example, inference models used for recognition can include a face recognition model, an object recognition model, and so on, and a face recognition model can be further divided into a FaceNet network model, a Multi-Task Convolutional Neural Network (MTCNN), and so on.
It should be understood that lower-level edge nodes, terminal devices, and the like can establish communication connections with this edge node. When a lower-level edge node or terminal device fails at inference, it does not directly give up executing the related task; instead, it sends a first inference request to the node above it, i.e., this edge node, to ask this edge node to perform the inference. After receiving the first inference request, the edge node parses it to obtain the data to be inferred and the inference type.
It should be noted that edge nodes in this embodiment are all nodes other than terminal devices and the cloud node, not merely the nodes that have a direct communication link with a terminal device.
It should also be noted that this embodiment does not limit the number of lower-level edge nodes and terminal devices; there may be one or more of each. That is, the edge node may receive first inference requests sent by one or more lower-level edge nodes, by one or more terminal devices, or by at least one lower-level edge node together with at least one terminal device. In particular, a single lower-level edge node or terminal device may also send multiple first inference requests at the same time, and the data to be inferred and the inference types of these requests may be the same or different. These variations will not be enumerated here.
步骤102,根据推理类型调用相应的目标推理模型对待推理数据进行推理。
本实施例中,边缘节点中已存储有若干训练好的推理模型,这些推理模型包括若干推理类型的至少一个推理推理模型,如边缘节点中存储有2个用于识别的推理模型、3个用于路径规划的推理模型。
在一些例子中,根据推理类型调用相应的目标推理模型对待推理数据进行推理可以通过如下方式实现:根据推理类型在已存储的推理模型中进行匹配;在匹配成功的情况下,将匹配到的推理模型作为目标推理模型并将待推理数据输入到目标推理模型中,特别地,可能根据推理模型匹配到至少两个推理模型,此时,各个推理模型还携带优先级标签,将匹配到的推理模型作为所述目标推理模型,包括:将优先级标签指示的优先级最高的推理模型作为目标推理模型;在匹配失败的情况下,根据推理类型向上级边缘节点和/或云端节点请求推理模型,上级边缘节点和/或云端节点在接收到请求后,根据推理类型从自身当前已存储的推理模型中查找相应的推理模型并下发,而边缘节点在接收到上级边缘节点和/或云端节点返回的推理模型后,将上级边缘 节点和/或云端节点返回的推理模型作为目标推理模型并将待推理数据输入到目标推理模型中,可以理解的是,上级边缘节点和/或云端节点返回的推理模型的数量为1,即当上级边缘节点和/或云端节点查找出1个推理模型时,下发该推理模型;当上级边缘节点和/或云端节点查找出若干推理模型时,选择优先级最高的推理模型下发,而不是查找到的所有推理模型,减少下发推理模型时使用的流量,避免边缘节点存储过多推理模型占用内存而影响系统性能。
需要说明的是,上述例子中的优先级标签是根据推理模型被存储的时长和/或推理模型的准确率设置的,其中,优先级标签指示的是属于同一推理类型的推理模型之间的优先级,而不是所有已存储的推理模型的优先级。例如,按照预设的周期对已存储的推理模型的准确率等因素进行评估,将某个或某些准确率低于预设阈值的推理模型的优先级降低,此时,还可以继续向上级边缘节点和/或云端节点请求优先级被降低的推理模型所属推理类型的新的推理模型,而上级边缘节点和/或云端节点返回的新的推理模型将会被设置有更高的优先级,而当推理模型的优先级超过预设优先级时,则判定该推理模型失效并删除该推理模型;或者,实时检测已存储的推理模型的已被存储的时长是否超过预设时长阈值,将超过预设时长阈值的推理模型的优先级降低等。
可以理解的是,为了避免存储过多的推理模型而影响系统性能,还可以检测推理模型是否不会被调用或调用输出的结果不可靠,即检测是否失效,以便删除失效的推理模型。因此,推理实现方法还包括:根据预设的失效条件检测已存储的推理模型是否失效;失效条件包括已存储时间超过预设的有效时长和/或准确率低于预设阈值;在检测到至少一个已存储的推理模型失效的情况下,删除失效的推理模型。例如可以检测已存储的推理模型的生存时间值(Time To Live,TTL)是否为0,认定TTL为0的推理模型失效;或者,检测已存储的推理模型的是否较长时间内未被使用,认定较长时间内未被使用的推理模型失效;或者,检测推理模型的准确率较低,认定准确率较低的推理模型失效。当然,以上仅为具体的距离说明,实际还可以认定优先级低 于某一级别的推理模型失效、按照优先级由高到低进行排序的情况下认定位于前N(N为正整数)个之外的推理模型失效等,此处就不再一一赘述了。
还需要说明的是,已存储的推理模型可以是边缘节点向上级边缘节点和/或云端节点请求的,还可以是上级边缘节点和/或云端节点主动下发的,因此,推理实现方法还包括:接收上级边缘节点和/或所述云端节点主动下发的推理模型并存储,其中,主动下发可以是周期性主动下发,也可以设置一些触发条件,如某一推理类型的推理模型中至少两个最新的推理模型均未下发过或者接收到管理者的下发指令等。类似地,边缘节点还会主动向下级边缘节点和/或终端设备下发已存储的推理模型中优先级最高的推理模型,因此,推理实现方法还包括:向下级边缘节点和/或终端设备主动下发推理模型。
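As a small illustration of the trigger-condition example above, the following sketch (with assumed data shapes) checks whether at least two of the newest models of an inference type have never been delivered downstream:

    def should_push(newest_first: list[str], delivered: set[str]) -> bool:
        # Trigger condition from the example above: push when at least two of
        # the newest models of this inference type were never delivered.
        undelivered = [name for name in newest_first[:2] if name not in delivered]
        return len(undelivered) >= 2

    # E.g. two fresh recognition models, neither delivered yet -> push.
    print(should_push(["recog-v5", "recog-v4", "recog-v3"], {"recog-v3"}))  # True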
Step 103: detect whether the inference is successful; if so, perform step 104; if not, perform step 105.
As with the explanation of inference failure given for step 101, detecting whether the inference is successful may be done by examining the result output by the inference model, i.e., whether the inference result is executable, whether its confidence is above the preset confidence threshold, and whether the task success rate estimated from the inference result is above the preset success rate; in other words, detecting whether the inference result can be executed with a sufficiently high success rate.
Step 104: return the inference result output by the target inference model to the corresponding lower-level edge node and/or terminal device.
In some examples, after invoking the corresponding target inference model according to the inference type to perform inference on the data to be inferred, the inference implementation method further includes: in the case of successful inference, reporting the feature information of the data to be inferred to the upper-level edge node and/or the cloud node, so that inference models can be trained offline at the cloud node according to the feature information of the data to be inferred.
It can be understood that, when inference models are trained offline at the cloud node, the cloud node expects to receive as much feature data for training as possible. Therefore, in other examples, the inference implementation method further includes: receiving the feature information of the data to be inferred sent by the lower-level edge node and/or the terminal device in the case of successful inference, and sending that feature information to the upper-level edge node and/or the cloud node. In this way, a training set with as large a capacity as possible can be built for inference model training at the cloud node, improving the accuracy of the inference models. The feature information of the data to be inferred may be obtained by a dedicated feature extraction model, or may be extracted by the inference model in the course of inference; this embodiment does not limit how the feature information is obtained.
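One way to picture this upward relay of feature information is the following Python sketch; the FeatureRecord shape and the parent/child wiring are assumptions made only for illustration:

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class FeatureRecord:
        inference_type: str
        features: list[float]   # feature vector extracted from the inferred data

    class Node:
        def __init__(self, name: str, parent: Optional["Node"] = None):
            self.name = name
            self.parent = parent
            self.training_pool: list[FeatureRecord] = []

        def report_features(self, record: FeatureRecord) -> None:
            # Edge nodes relay feature information upward; the cloud node
            # (which has no parent) collects it into the offline-training pool.
            if self.parent is not None:
                self.parent.report_features(record)
            else:
                self.training_pool.append(record)

    # Usage: terminal -> edge -> cloud chain.
    cloud = Node("cloud")
    edge = Node("edge", parent=cloud)
    edge.report_features(FeatureRecord("recognition", [0.12, 0.98]))
    assert len(cloud.training_pool) == 1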
Step 105: generate a second inference request according to the data to be inferred and the inference type, send the second inference request to the upper-level edge node and/or the cloud node, and deliver the inference result returned by the upper-level edge node and/or the cloud node to the corresponding lower-level edge node and/or terminal device.
It should be noted that the terms "first inference request" and "second inference request" in this embodiment only distinguish the inference requests received by the edge node from those sent out by it: the first inference request denotes an inference request received by the edge node, and the second inference request denotes an inference request sent out by the edge node; the terms do not restrict whether an inference request comes from a terminal device or an edge node, etc.
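Putting steps 101 to 105 together, the edge node's handling of a first inference request can be pictured with the minimal Python sketch below; get_target_model, is_inference_successful and send_upstream are hypothetical stubs standing in for the model store, the success check and the network call to the upper-level edge node and/or cloud node:

    from typing import Callable, Optional

    def get_target_model(inference_type: str) -> Optional[Callable[[object], dict]]:
        return None   # stub: look up / request the target inference model

    def is_inference_successful(result: dict) -> bool:
        return result.get("confidence", 0.0) >= 0.8   # stub success check

    def send_upstream(request: dict) -> dict:
        return {"status": "failed", "reason": "no upstream in this sketch"}

    def handle_first_inference_request(request: dict) -> dict:
        # Step 101: parse out the data to be inferred and the inference type.
        data, inference_type = request["data"], request["type"]
        # Step 102: invoke the target inference model.
        model = get_target_model(inference_type)
        result = model(data) if model is not None else None
        # Steps 103-104: on success, return the model's output downstream.
        if result is not None and is_inference_successful(result):
            return {"status": "ok", "result": result}
        # Step 105: on failure, escalate a second inference request upstream
        # and relay whatever the upper-level node / cloud node answers.
        return send_upstream({"data": data, "type": inference_type})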
Another aspect of the embodiments of the present invention provides an inference implementation method applied to a terminal device, where the terminal device may be a user-side electronic device such as a robot, a mobile phone or a computer; the flow is shown in Fig. 2.
Step 201: perform data collection according to a received inference task to obtain data to be inferred.
The inference implementation method provided in this embodiment is substantially the same as the inference implementation method provided in the first method embodiment; the differences are mainly described below.
In this embodiment, the terminal device receives control instructions from a user. For example, a robot receives instructions from devices such as a user control terminal or the robot vendor's cloud platform, instructing the robot to perform a task such as object recognition or path planning; the robot then completes the instructed task through three processes: perception, inference and execution. Data collection is the perception of the environment and the like by invoking the relevant sensors, so that inference can be performed on the perceived information; the collected data is therefore the data to be inferred, and the data to be inferred is also the input of the inference model. As another example, when a user starts a program by operating a mouse to perform object recognition on an uploaded picture, the computer loads the picture to perceive it and runs the relevant program to recognize the objects in the picture, etc.; these cases will not be enumerated one by one here.
Step 202: invoke the corresponding target inference model according to the inference type of the inference task to perform inference on the data to be inferred.
In some examples, invoking the corresponding target inference model according to the inference type of the inference task to perform inference on the data to be inferred may be implemented as follows: matching among the stored inference models according to the inference type; in the case of successful matching, using the matched inference model as the target inference model and inputting the data to be inferred into the target inference model; in the case of failed matching, requesting an inference model from the edge node and/or the cloud node according to the inference type, using the inference model returned by the edge node and/or the cloud node as the target inference model, and inputting the data to be inferred into the target inference model.
There may be multiple matched inference models. Specifically, the matched inference models include multiple inference models carrying priority labels; in this case, using the matched inference model as the target inference model includes: using the inference model with the highest priority indicated by the priority labels as the target inference model. Here, the priority labels are set according to the duration for which an inference model has been stored and/or the accuracy of the inference model.
It should be noted that the stored inference models include inference models requested from the upper-level edge node as well as inference models proactively delivered by the upper-level edge node. Therefore, in other examples, the inference implementation method further includes: receiving and storing inference models proactively delivered by the edge node and/or the cloud node.
It should also be noted that, to prevent too many stored models from affecting the operation of the terminal device, the stored inference models are also checked to determine whether any of them have become invalid, so that invalid inference models can be deleted. Therefore, in other examples, the inference implementation method further includes: detecting, according to preset invalidation conditions, whether the stored inference models are invalid, the invalidation conditions including the stored duration exceeding a preset validity period and/or the accuracy being below a preset threshold; and, in the case that at least one stored inference model is detected to be invalid, deleting the invalid inference model.
Step 203: detect whether the inference is successful; if so, perform step 204; if not, perform step 205.
Step 204: report the feature information of the data to be inferred to the cloud node via the edge node, or send the feature information of the data to be inferred to the cloud node directly, for the cloud node to train inference models offline according to the received feature information of the data to be inferred.
It can be understood that the terminal node may be in direct communication connection with the cloud node, or may be connected to the cloud node through at least one edge node; for example, the terminal node communicates with an edge node, and the edge node communicates with the cloud node. Accordingly, in this embodiment, the terminal node may first send the feature information of the data to be inferred to an edge node, which, after receiving it, directly forwards the received feature information to its upper-level edge node and/or the cloud node until the cloud node receives the feature information; the terminal node may also send the feature information of the data to be inferred directly to the cloud node.
Step 205: generate an inference request according to the data to be inferred and the inference type and send the inference request to the edge node and/or the cloud node, for the edge node and/or the cloud node to return an inference result according to the inference request.
It should be noted that the inference result returned by an edge node is not necessarily a result inferred by the edge node that received the inference request; it may also be a result inferred by that edge node's upper-level edge node or by the cloud node, which is likewise fed back via that edge node.
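For the terminal side, steps 201 to 205 can likewise be condensed into a short sketch; collect_data, local_infer and send_request_to_edge below are hypothetical stand-ins for the sensors, the on-device model and the uplink to the edge node and/or cloud node:

    from typing import Optional

    def collect_data(task: dict) -> bytes:
        # Step 201: perception, e.g. grabbing a camera frame (stubbed here).
        return b"sensor-bytes"

    def local_infer(data: bytes, inference_type: str) -> Optional[dict]:
        # Step 202: on-device inference; None models a failed or
        # low-confidence result.
        return None

    def send_request_to_edge(data: bytes, inference_type: str) -> dict:
        # Step 205: delegate to the edge node and/or cloud node (stubbed here).
        return {"status": "ok", "result": "edge-provided"}

    def run_task(task: dict) -> dict:
        data = collect_data(task)
        result = local_infer(data, task["type"])
        if result is not None:            # steps 203-204: success on device
            return {"status": "ok", "result": result}
        # Inference failed: instead of abandoning the task, escalate (step 205).
        return send_request_to_edge(data, task["type"])

    print(run_task({"type": "object_recognition"}))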
Another aspect of the embodiments of the present invention provides an inference implementation method applied to a cloud node; the flow is shown in Fig. 3.
Step 301: receive an inference request sent by an edge node and/or a terminal node in the case of an inference failure, the inference request carrying data to be inferred and an inference type.
The inference implementation method provided in this embodiment is substantially the same as the inference implementation method provided in the first method embodiment; the differences are mainly described below.
Step 302: invoke the corresponding target inference model according to the inference type to perform inference on the data to be inferred.
In some examples, invoking the corresponding target inference model according to the inference type to perform inference on the data to be inferred may be implemented as follows: matching among the stored inference models according to the inference type, using the matched inference model as the target inference model, and inputting the data to be inferred into the target inference model. In particular, there may be multiple matched inference models, i.e., the matched inference models include multiple inference models carrying priority labels; in this case, using the matched inference model as the target inference model includes: using the inference model with the highest priority indicated by the priority labels as the target inference model. Here, the priority labels are set according to the generation time of the inference model and/or the accuracy of the inference model.
In addition, to avoid storing too many inference models that will never be invoked or whose output is unsatisfactory when invoked, it is also necessary to detect whether the stored inference models are invalid, so that invalid inference models can be deleted to free storage space and avoid affecting system performance. Therefore, in other examples, the inference implementation method further includes: detecting, according to preset invalidation conditions, whether the stored inference models are invalid, the invalidation conditions including the stored duration exceeding a preset validity period and/or the accuracy being below a preset threshold; and, in the case that at least one stored inference model is detected to be invalid, deleting the invalid inference model.
Step 303: detect whether the inference is successful; if so, perform step 304; if not, perform step 305.
Step 304: return the inference result output by the target inference model to the edge node, for the edge node to send the received inference result to the terminal device, and/or return the inference result output by the target inference model to the terminal device.
Step 305: return an inference failure response to the edge node, for the edge node to send the received inference failure response to the terminal device, and/or return the inference failure response to the terminal device.
It can be understood that, to enable the cloud node to build as large a training set as possible for training inference models, both the edge nodes and the terminal devices transmit upward the feature information of data whose inference succeeded, until the cloud node receives the feature information. Therefore, in some examples, the inference implementation method further includes: receiving the feature information of the data to be inferred sent by the edge node and/or the terminal device in the case of successful inference.
Furthermore, the cloud node also trains inference models according to the received inference data and its feature information. Therefore, in other examples, the inference implementation method further includes: training inference models offline according to the inference data belonging to different inference categories and/or the feature information of the inference data received within a preset duration, to obtain inference models of the different inference categories. For example, it may be agreed that inference model training is performed on the first day of each month. The training process includes: dividing the data to be inferred and its features received in the previous month by inference type to obtain the training data corresponding to each inference type, where feature extraction is further performed on the data to be inferred to obtain the feature information; and then training according to predefined initial models using the feature information. In particular, each inference type may be trained with multiple training methods and multiple initial models to obtain multiple inference models; this will not be elaborated further here.
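A deliberately simplified sketch of such a periodic training job is shown below: it groups the feature records received over the period by inference type and derives one model per type. The centroid "model" used here is a stand-in assumption, since the embodiment allows multiple training methods and initial models per type:

    from collections import defaultdict
    from statistics import fmean

    def train_offline(records: list[dict]) -> dict[str, list[float]]:
        # Group the feature records received over the preset period by
        # inference type, then derive one model per type. The model here is
        # just the centroid of the feature vectors; real training with
        # multiple initial models would replace this step.
        by_type: dict[str, list[list[float]]] = defaultdict(list)
        for rec in records:
            by_type[rec["type"]].append(rec["features"])

        models: dict[str, list[float]] = {}
        for inference_type, vectors in by_type.items():
            # Column-wise mean over all feature vectors of this type.
            models[inference_type] = [fmean(col) for col in zip(*vectors)]
        return models

    # Usage: one model per inference category from last month's uploads.
    last_month = [
        {"type": "recognition", "features": [0.1, 0.9]},
        {"type": "recognition", "features": [0.3, 0.7]},
        {"type": "path_planning", "features": [1.0, 2.0]},
    ]
    print(train_offline(last_month))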
It can be understood that, to help the edge nodes and terminal devices perform inference better, updated inference models need to be provided to them, so as to avoid the problem that an inference model that is not new enough or was trained on lagging data cannot adapt to the current inference tasks, causing inference to keep failing and most of the inference processing pressure to concentrate on the cloud node. Therefore, in other examples, the inference implementation method further includes: proactively delivering inference models to the edge nodes and/or terminal devices.
To help those skilled in the art better understand the inference implementation methods provided by the above method embodiments, the execution processes when inference succeeds at the different levels of the terminal device, the edge node and the cloud node are described below, taking a recognition scenario in which the terminal device is a robot as an example.
In the scenario where inference succeeds on the robot itself, as shown in Fig. 4, the inference implementation method includes:
Step 401: the robot photographs a target area according to a recognition instruction issued by the user, obtaining a target image.
Step 402: the robot matches the inference models for object recognition from the stored inference models and uses the highest-priority one among them as the target inference model.
Step 403: the robot invokes the target inference model to process the target image, obtaining the name of the target object in the target image and a confidence.
Step 404: when recognition is judged successful according to the confidence, the robot sends the feature information of the target image to the edge node.
Step 405: the edge node forwards the received feature information of the target image to the cloud node.
Step 406: the cloud node stores the feature information of the target image.
In the scenario where inference fails on the robot itself but succeeds at the edge node, as shown in Fig. 5, the inference implementation method includes:
Step 501: the robot photographs a target area according to a recognition instruction issued by the user, obtaining a target image.
Step 502: the robot matches one inference model for object recognition from the stored inference models and uses the matched inference model as the first target inference model.
Step 503: the robot invokes the first target inference model to process the target image, obtaining a first name and a first confidence of the target object in the target image.
Step 504: when recognition is judged to have failed according to the first confidence, the robot sends a first inference request to the edge node, the first inference request carrying the target image and information indicating that the inference type is object recognition.
Step 505: the edge node receives the first inference request and parses out the target image and the information indicating that the inference type is object recognition.
Step 506: the edge node matches the inference models for object recognition from the stored inference models and uses the highest-priority one among them as the second target inference model.
Step 507: the edge node invokes the second target inference model to process the target image, obtaining a second name and a second confidence of the target object in the target image.
Step 508: when recognition is judged successful according to the second confidence, the edge node sends the feature information of the target image to the upper-level edge node.
Step 509: the upper-level edge node forwards the received feature information of the target image to the cloud node.
Step 510: the cloud node receives and stores the feature information of the target image.
Step 511: the edge node forwards the second name of the target object to the robot.
In the scenario where inference fails on the robot itself and the edge node fails to match a target inference model, as shown in Fig. 6, the inference implementation method includes:
Step 601: the robot photographs a target area according to a recognition instruction issued by the user, obtaining a target image.
Step 602: the robot matches one inference model for object recognition from the stored inference models and uses the matched inference model as the first target inference model.
Step 603: the robot invokes the first target inference model to process the target image, obtaining a first name and a first confidence of the target object in the target image.
Step 604: when recognition is judged to have failed according to the first confidence, the robot sends a first inference request to the edge node, the first inference request carrying the target image and information indicating that the inference type is object recognition.
Step 605: the edge node receives the first inference request and parses out the target image and the information indicating that the inference type is object recognition.
Step 606: the edge node attempts to match an inference model for object recognition from the stored inference models.
Step 607: when the matching fails, the edge node requests an inference model for object recognition from the cloud node.
Step 608: upon receiving the model request, the cloud node delivers the highest-priority inference model for object recognition to the edge node.
Step 609: the edge node invokes the inference model returned by the cloud node to process the target image, obtaining a second name and a second confidence of the target object in the target image.
Step 610: when recognition is judged successful according to the second confidence, the edge node sends the feature information of the target image to the cloud node.
Step 611: the cloud node receives and stores the feature information of the target image.
Step 612: the edge node sends the second name to the robot.
In the scenario where both the robot itself and the edge node fail to match a target inference model, as shown in Fig. 7, the inference implementation method includes:
Step 701: the robot photographs a target area according to a recognition instruction issued by the user, obtaining a target image.
Step 702: the robot attempts to match an inference model for object recognition from the stored inference models.
Step 703: when the matching fails, the robot requests an inference model for object recognition from the edge node.
Step 704: upon receiving the request sent by the robot, the edge node attempts to match an inference model for object recognition from its stored inference models.
Step 705: when the matching fails, the edge node requests an inference model for object recognition from the cloud node.
Step 706: upon receiving the request sent by the edge node, the cloud node matches the highest-priority inference model for object recognition from its stored inference models.
Step 707: the cloud node delivers the matched inference model to the edge node.
Step 708: the edge node forwards the received inference model to the robot.
Step 709: the robot invokes the received inference model to process the target image, obtaining the name and confidence of the target object in the target image.
Step 710: when recognition is judged successful according to the confidence, the robot sends the feature information of the target image to the edge node.
Step 711: the edge node forwards the received feature information of the target image to the cloud node.
Step 712: the cloud node receives and stores the feature information of the target image.
It should be noted that the above are merely specific examples. In other examples, the robot may also send the feature information to the cloud node via an edge node, its upper-level edge node and a further upper-level edge node. Moreover, the above examples do not cover scenarios such as proactive delivery of inference models, inference model training, the cloud node receiving the data collected by the robot rather than feature information, or the robot being in direct communication connection with the cloud so as to upload the feature information of the data to be inferred directly to the cloud node or to request inference models directly from the cloud node. In other examples, the robot, the edge nodes, etc. also receive inference models delivered by their upper-level nodes; when the robot and the edge nodes at all levels all fail at inference, the target image is transmitted to the cloud node for inference, and the cloud node, if its inference succeeds, feeds the inference result back to the edge nodes at all levels and the robot, and, if its inference fails, feeds an inference failure response back to the edge nodes at all levels and the robot; the cloud node also trains inference models according to the various target images it receives, and so on; these cases will not be enumerated one by one here.
The division of the steps of the above methods is only for clarity of description; in implementation, steps may be merged into one step or a step may be split into multiple steps, and as long as the same logical relationship is included, such variants are within the protection scope of this patent; adding insignificant modifications to an algorithm or flow, or introducing insignificant designs, without changing the core design of the algorithm and flow, is also within the protection scope of this patent.
Another aspect of the embodiments of the present invention provides an intelligent distribution network, as shown in Fig. 8, including: one cloud node, several terminal devices and several edge nodes.
The cloud node is configured to: receive an inference request sent by an edge node and/or a terminal device in the case of an inference failure, the inference request carrying data to be inferred and an inference type; invoke the corresponding target inference model according to the inference type to perform inference on the data to be inferred; in the case of successful inference, return the inference result output by the target inference model to the edge node, for the edge node to send the received inference result to the terminal device, and/or return the inference result output by the target inference model to the terminal device; and, in the case of inference failure, return an inference failure response to the edge node, for the edge node to send the received inference failure response to the terminal device, and/or return the inference failure response to the terminal device. The terminal device is configured to: perform data collection according to a received inference task to obtain data to be inferred; invoke the corresponding target inference model according to the inference type of the inference task to perform inference on the data to be inferred; and, in the case of inference failure, generate an inference request according to the data to be inferred and the inference type and send the inference request to an edge node and/or the cloud node, for the edge node and/or the cloud node to return an inference result according to the inference request. The edge node is configured to: receive a first inference request sent by a lower-level edge node and/or a terminal device in the case of an inference failure, the first inference request carrying data to be inferred and an inference type; invoke the corresponding target inference model according to the inference type to perform inference on the data to be inferred; in the case of successful inference, return the inference result output by the target inference model to the corresponding lower-level edge node and/or terminal device; and, in the case of inference failure, generate a second inference request according to the data to be inferred and the inference type, send the second inference request to the upper-level edge node and/or the cloud node, and deliver the inference result returned by the upper-level edge node and/or the cloud node to the corresponding lower-level edge node and/or terminal device.
It is thus easy to see that the intelligent distribution network is in fact a network with centralized training and distributed inference: the training of inference models is performed by the cloud center node, and the data to be inferred, together with the feature information of the data to be inferred uploaded by the edge nodes at all levels and the terminal devices, is gathered at the cloud node so that all kinds of inference models can be trained there, while the inference models stored by the edge nodes at all levels and the terminal devices come either from delivery by their upper-level nodes or from proactive requests to their upper-level nodes. During inference, once the current node fails at inference, it sends an inference request to its upper-level node, until the request reaches the cloud node; that is, recognition on the terminal device itself is used first, supplemented by edge recognition and cloud recognition, to realize multi-level intelligence. This improves recognition accuracy and response speed, and level-by-level processing reduces uplink bandwidth, ultimately improving the user experience.
In addition, compared with a Content Delivery Network (CDN), the intelligent distribution network manages intelligent inference models, and what is distributed during data transmission is intelligent application images, application runtime configuration management and the transfer of application-associated data, which is more complex and more flexible. Through the intelligent distribution network, cloud intelligence can be quickly sunk to the edge, and edge intelligence can be quickly transferred to the cloud, so that specific intelligence can run more freely and more efficiently throughout the intelligent distribution network, bridging the application gap between cloud, edge and terminal; sinking cloud intelligence to the edge trades for bandwidth savings and lower latency, while transferring edge intelligence to the cloud trades for stronger processing performance, larger data sets and data sharing, achieving more efficient intelligence. Furthermore, the intelligent distribution network takes the terminal devices as its source points, so bottom-up network traffic dominates; considering that the average uplink network traffic of intelligent machines is 10 to 100 times their average downlink network traffic, the uplink traffic can be fully utilized to improve inference efficiency, which can effectively reduce the uplink bandwidth of the backbone network, improving backbone bandwidth utilization and saving cost.
It is easy to see that this embodiment corresponds to the method embodiments and can be implemented in cooperation with them. The relevant technical details mentioned in the method embodiments remain valid in this embodiment and, to reduce repetition, are not repeated here; correspondingly, the relevant technical details mentioned in this embodiment can also be applied in the method embodiments.
It is worth mentioning that, by processing the data to be inferred at the edge nodes of the intelligent distribution network, the uplink backbone network traffic uploaded to the cloud of the intelligent distribution network is reduced and bandwidth is saved; when the processing capability of a terminal device in the intelligent distribution network is insufficient, the data is processed at the edge nodes at the various levels of the network, and since the edge nodes in the network do not need to train inference models, the consumption of computing power is reduced; moreover, through centralized training, the cloud node of the intelligent distribution network obtains a more ample sample space and, using the stronger computing power of the cloud, trains models with higher accuracy; and the edge nodes of the intelligent distribution network dynamically obtain the feature information library from the cloud to ensure that the latest feature information library is used.
It is also worth mentioning that the modules involved in this embodiment are all logical modules; in practical applications, a logical unit may be one physical unit, part of a physical unit, or a combination of multiple physical units. In addition, in order to highlight the innovative part of the present invention, units that are less closely related to solving the technical problem raised by the present invention are not introduced in this embodiment, but this does not mean that no other units exist in this embodiment.
Another aspect of the embodiments of the present invention provides an electronic device, as shown in Fig. 9, including: at least one processor 901; and a memory 902 communicatively connected to the at least one processor 901, where the memory 902 stores instructions executable by the at least one processor 901, the instructions being executed by the at least one processor 901 to enable the at least one processor 901 to perform the inference implementation method described in any one of the above method embodiments.
The memory 902 and the processor 901 are connected by a bus. The bus may include any number of interconnected buses and bridges and links together the various circuits of the one or more processors 901 and the memory 902. The bus may also link together various other circuits such as peripherals, voltage regulators and power management circuits, all of which are well known in the art and therefore not further described herein. A bus interface provides an interface between the bus and a transceiver. The transceiver may be a single element or multiple elements, such as multiple receivers and transmitters, providing a unit for communicating with various other apparatuses over a transmission medium. Data processed by the processor 901 is transmitted over a wireless medium through an antenna; further, the antenna also receives data and passes the data to the processor 901.
The processor 901 is responsible for managing the bus and general processing, and may also provide various functions including timing, peripheral interfaces, voltage regulation, power management and other control functions, while the memory 902 may be used to store data used by the processor 901 when performing operations.
Another aspect of the embodiments of the present invention provides a computer-readable storage medium storing a computer program, where the computer program, when executed by a processor, implements the inference implementation method described in any one of the above method embodiments.
That is, those skilled in the art can understand that all or some of the steps of the methods in the above embodiments can be completed by instructing the relevant hardware through a program that is stored in a storage medium and includes several instructions for causing a device (which may be a microcontroller, a chip, etc.) or a processor to perform all or some of the steps of the methods described in the embodiments of the present invention; the aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disc.
Another aspect of the embodiments of the present invention provides a computer program product including a computer program/instructions, where the computer program, when executed by a processor, implements the inference implementation method described in any one of the above method embodiments.
Those of ordinary skill in the art can understand that the above embodiments are specific embodiments for realizing the present invention, and that in practical applications various changes in form and detail may be made to them without departing from the spirit and scope of the present invention.

Claims (27)

  1. An inference implementation method, applied to an edge node, comprising:
    receiving a first inference request sent by a lower-level edge node and/or a terminal device in the case of an inference failure, the first inference request carrying data to be inferred and an inference type;
    invoking a corresponding target inference model according to the inference type to perform inference on the data to be inferred;
    in the case of successful inference, returning the inference result output by the target inference model to the corresponding lower-level edge node and/or terminal device;
    in the case of inference failure, generating a second inference request according to the data to be inferred and the inference type, sending the second inference request to an upper-level edge node and/or a cloud node, and delivering the inference result returned by the upper-level edge node and/or the cloud node to the corresponding lower-level edge node and/or terminal device.
  2. The inference implementation method according to claim 1, wherein the invoking a corresponding target inference model according to the inference type to perform inference on the data to be inferred comprises:
    matching among stored inference models according to the inference type;
    in the case of successful matching, using the matched inference model as the target inference model and inputting the data to be inferred into the target inference model;
    in the case of failed matching, requesting an inference model from the upper-level edge node and/or the cloud node according to the inference type, using the inference model returned by the upper-level edge node and/or the cloud node as the target inference model, and inputting the data to be inferred into the target inference model.
  3. The inference implementation method according to claim 2, wherein the matched inference models comprise multiple inference models carrying priority labels, and the using the matched inference model as the target inference model comprises:
    using the inference model with the highest priority indicated by the priority labels as the target inference model.
  4. The inference implementation method according to claim 3, wherein the priority labels are set according to the duration for which an inference model has been stored and/or the accuracy of the inference model.
  5. The inference implementation method according to any one of claims 1 to 4, wherein the method further comprises:
    receiving and storing inference models proactively delivered by the upper-level edge node and/or the cloud node; and/or,
    proactively delivering inference models to the lower-level edge node and/or the terminal device.
  6. The inference implementation method according to any one of claims 1 to 4, wherein the method further comprises:
    detecting, according to preset invalidation conditions, whether stored inference models are invalid, the invalidation conditions comprising the stored duration exceeding a preset validity period and/or the accuracy being below a preset threshold;
    in the case that at least one stored inference model is detected to be invalid, deleting the invalid inference model.
  7. The inference implementation method according to any one of claims 1 to 4, wherein, after the invoking a corresponding target inference model according to the inference type to perform inference on the data to be inferred, the method further comprises:
    in the case of successful inference, reporting feature information of the data to be inferred to the upper-level edge node and/or the cloud node, so that inference models are trained offline at the cloud node according to the feature information of the data to be inferred.
  8. The inference implementation method according to any one of claims 1 to 4, wherein the method further comprises:
    receiving feature information of the data to be inferred sent by the lower-level edge node and/or the terminal device in the case of successful inference;
    sending the feature information of the data to be inferred to the upper-level edge node and/or the cloud node.
  9. An inference implementation method, applied to a terminal device, comprising:
    performing data collection according to a received inference task to obtain data to be inferred;
    invoking a corresponding target inference model according to an inference type of the inference task to perform inference on the data to be inferred;
    in the case of inference failure, generating an inference request according to the data to be inferred and the inference type and sending the inference request to an edge node and/or a cloud node, for the edge node and/or the cloud node to return an inference result according to the inference request.
  10. The inference implementation method according to claim 9, wherein the invoking a corresponding target inference model according to the inference type of the inference task to perform inference on the data to be inferred comprises:
    matching among stored inference models according to the inference type;
    in the case of successful matching, using the matched inference model as the target inference model and inputting the data to be inferred into the target inference model;
    in the case of failed matching, requesting an inference model from the edge node and/or the cloud node according to the inference type, using the inference model returned by the edge node and/or the cloud node as the target inference model, and inputting the data to be inferred into the target inference model.
  11. The inference implementation method according to claim 10, wherein the matched inference models comprise multiple inference models carrying priority labels, and the using the matched inference model as the target inference model comprises:
    using the inference model with the highest priority indicated by the priority labels as the target inference model.
  12. The inference implementation method according to claim 11, wherein the priority labels are set according to the duration for which an inference model has been stored and/or the accuracy of the inference model.
  13. The inference implementation method according to any one of claims 9 to 12, wherein the method further comprises:
    receiving and storing inference models proactively delivered by the edge node and/or the cloud node.
  14. The inference implementation method according to any one of claims 9 to 12, wherein the method further comprises:
    detecting, according to preset invalidation conditions, whether stored inference models are invalid, the invalidation conditions comprising the stored duration exceeding a preset validity period and/or the accuracy being below a preset threshold;
    in the case that at least one stored inference model is detected to be invalid, deleting the invalid inference model.
  15. The inference implementation method according to any one of claims 9 to 12, wherein, after the invoking a corresponding target inference model according to the inference type of the inference task to perform inference on the data to be inferred, the method further comprises:
    in the case of successful inference, reporting the feature information of the data to be inferred to a cloud node via the edge node, or sending the feature information of the data to be inferred to the cloud node, for the cloud node to train inference models offline according to the received feature information of the data to be inferred.
  16. An inference implementation method, applied to a cloud node, comprising:
    receiving an inference request sent by an edge node and/or a terminal device in the case of an inference failure, the inference request carrying data to be inferred and an inference type;
    invoking a corresponding target inference model according to the inference type to perform inference on the data to be inferred;
    in the case of successful inference, returning the inference result output by the target inference model to the edge node, for the edge node to send the received inference result to a terminal device, and/or returning the inference result output by the target inference model to the terminal device;
    in the case of inference failure, returning an inference failure response to the edge node, for the edge node to send the received inference failure response to the terminal device, and/or returning the inference failure response to the terminal device.
  17. The inference implementation method according to claim 16, wherein the invoking a corresponding target inference model according to the inference type to perform inference on the data to be inferred comprises:
    matching among stored inference models according to the inference type;
    using the matched inference model as the target inference model and inputting the data to be inferred into the target inference model.
  18. The inference implementation method according to claim 17, wherein the matched inference models comprise multiple inference models carrying priority labels, and the using the matched inference model as the target inference model comprises:
    using the inference model with the highest priority indicated by the priority labels as the target inference model.
  19. The inference implementation method according to claim 18, wherein the priority labels are set according to the generation time of an inference model and/or the accuracy of the inference model.
  20. The inference implementation method according to any one of claims 16 to 19, wherein the method further comprises:
    proactively delivering inference models to the edge node and/or the terminal device.
  21. The inference implementation method according to any one of claims 16 to 19, wherein the method further comprises:
    detecting, according to preset invalidation conditions, whether stored inference models are invalid, the invalidation conditions comprising the stored duration exceeding a preset validity period and/or the accuracy being below a preset threshold;
    in the case that at least one stored inference model is detected to be invalid, deleting the invalid inference model.
  22. The inference implementation method according to any one of claims 16 to 19, wherein the method further comprises:
    receiving feature information of the data to be inferred sent by the edge node and/or the terminal device in the case of successful inference.
  23. The inference implementation method according to claim 22, wherein the method further comprises:
    training inference models offline according to the inference data belonging to different inference categories and/or the feature information of the inference data received within a preset duration, to obtain inference models of the different inference categories.
  24. An intelligent distribution network, wherein the intelligent distribution network comprises one cloud node, several terminal devices and several edge nodes, the edge node being configured to perform the inference implementation method according to any one of claims 1 to 8, the terminal device being configured to perform the inference implementation method according to any one of claims 9 to 15, and the cloud node being configured to perform the inference implementation method according to any one of claims 16 to 23.
  25. An electronic device, comprising:
    at least one processor; and,
    a memory communicatively connected to the at least one processor; wherein,
    the memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor to enable the at least one processor to perform the inference implementation method according to any one of claims 1 to 8, or the inference implementation method according to any one of claims 9 to 15, or the inference implementation method according to any one of claims 16 to 23.
  26. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the inference implementation method according to any one of claims 1 to 8, or the inference implementation method according to any one of claims 9 to 15, or the inference implementation method according to any one of claims 16 to 23.
  27. A computer program product, comprising a computer program/instructions, wherein the computer program, when executed by a processor, causes the processor to implement the inference implementation method according to any one of claims 1 to 8, or the inference implementation method according to any one of claims 9 to 15, or the inference implementation method according to any one of claims 16 to 23.
PCT/CN2022/103001 2021-11-25 2022-06-30 Inference implementation method, network, electronic device and storage medium WO2023093053A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111413826.7 2021-11-25
CN202111413826.7A CN114330722B (zh) 2021-11-25 2021-11-25 Inference implementation method, network, electronic device and storage medium

Publications (1)

Publication Number Publication Date
WO2023093053A1 true WO2023093053A1 (zh) 2023-06-01

Family

ID=81047601

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/103001 WO2023093053A1 (zh) 2021-11-25 2022-06-30 Inference implementation method, network, electronic device and storage medium

Country Status (2)

Country Link
CN (1) CN114330722B (zh)
WO (1) WO2023093053A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114330722B (zh) * 2021-11-25 2023-07-11 达闼科技(北京)有限公司 Inference implementation method, network, electronic device and storage medium
CN114885028B (zh) * 2022-05-25 2024-01-23 国网北京市电力公司 Service scheduling method and apparatus, and computer-readable storage medium
CN115049061A (zh) * 2022-07-13 2022-09-13 卡奥斯工业智能研究院(青岛)有限公司 Blockchain-based artificial intelligence inference system

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109543829A * 2018-10-15 2019-03-29 华东计算技术研究所(中国电子科技集团公司第三十二研究所) Method and system for hybrid deployment of deep learning neural networks on terminal and cloud
CN111104954A * 2018-10-26 2020-05-05 华为技术有限公司 Object classification method and apparatus
CN112241719A * 2020-10-23 2021-01-19 杭州卷积云科技有限公司 Real-time query method for surveillance video targets based on an edge-cloud cascaded convolutional neural network
CN112419401A * 2020-11-23 2021-02-26 上海交通大学 Aircraft surface defect detection system based on cloud-edge collaboration and deep learning
CN112732450A * 2021-01-22 2021-04-30 清华大学 Robot knowledge graph generation system and method under a terminal-edge-cloud collaborative framework
US11017050B1 * 2020-04-01 2021-05-25 Vmware Inc. Hybrid quantized decision model framework
CN113139660A * 2021-05-08 2021-07-20 北京首都在线科技股份有限公司 Model inference method and apparatus, electronic device and storage medium
CN114330722A * 2021-11-25 2022-04-12 达闼科技(北京)有限公司 Inference implementation method, network, electronic device and storage medium

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102124033B1 * 2017-04-26 2020-06-17 에스케이텔레콤 주식회사 Distributed cloud-based application execution system, apparatus applied thereto, and operating method of the apparatus
CN109086789A * 2018-06-08 2018-12-25 四川斐讯信息技术有限公司 Image recognition method and system
CN110390246A * 2019-04-16 2019-10-29 江苏慧中数据科技有限公司 Video analysis method in an edge-cloud environment
CN110096996B * 2019-04-28 2021-10-22 达闼机器人有限公司 Biological information recognition method, apparatus, terminal, system and storage medium
CN112348172B * 2020-11-13 2022-05-06 之江实验室 Deep neural network collaborative inference method based on a terminal-edge-cloud architecture
CN113056035B * 2021-03-11 2023-02-21 深圳华声医疗技术股份有限公司 Cloud-computing-based ultrasound image processing method and ultrasound system
CN113259359B * 2021-05-21 2022-08-02 重庆紫光华山智安科技有限公司 Edge node capability supplementation method, system, medium and electronic terminal
CN113344208B * 2021-06-25 2023-04-07 中国电信股份有限公司 Data inference method, apparatus and system


Also Published As

Publication number Publication date
CN114330722B (zh) 2023-07-11
CN114330722A (zh) 2022-04-12

Similar Documents

Publication Publication Date Title
WO2023093053A1 (zh) Inference implementation method, network, electronic device and storage medium
US10163420B2 (en) System, apparatus and methods for adaptive data transport and optimization of application execution
KR20220079958A (ko) 블록체인 메시지 처리 방법 및 장치, 컴퓨터 및 판독 가능한 저장 매체
KR102544531B1 (ko) 연합 학습 시스템 및 방법
US20210042578A1 (en) Feature engineering orchestration method and apparatus
CN111770157B (zh) 一种业务处理方法、装置及电子设备和存储介质
CN104243611A (zh) 一种基于分发思想的消息服务中间件系统
US20210337007A1 (en) Method and server for http protocol-based data request
CN113132490A (zh) 一种基于强化学习的MQTT协议QoS机制选择方案
JP2022546108A (ja) 情報処理方法、装置、設備及びコンピュータ読み取り可能な記憶媒体
CN114189818B (zh) 消息发送方法、装置、服务器及存储介质
WO2017114180A1 (zh) 调整组件逻辑线程数量的方法及装置
CN114091572A (zh) 模型训练的方法、装置、数据处理系统及服务器
CN108989465B (zh) 共识方法、服务器、存储介质及分布式系统
WO2022237484A1 (zh) 一种推理系统、方法、装置及相关设备
US20210329436A1 (en) Bluetooth device networking system and method based on ble
CN114968617A (zh) Api转换系统及其访问请求处理方法、电子设备及介质
CN107707383B (zh) 放通处理方法、装置、第一网元及第二网元
CN106487694A (zh) 一种数据流处理方法和装置
CN113485747B (zh) 一种数据处理方法、数据处理器、目标源组件和系统
CN115942392B (zh) 基于时间间隙的服务质量QoS动态配置方法及系统
WO2023213270A1 (zh) 模型训练处理方法、装置、终端及网络侧设备
CN111901253B (zh) 用于存储系统的流量控制方法、装置、介质及电子设备
US20230049198A1 (en) Internet of things system
WO2023169402A1 (zh) 模型的准确度确定方法、装置及网络侧设备

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22897149

Country of ref document: EP

Kind code of ref document: A1