WO2023035147A1 - Data processing method of industry edge product and distributed computing protocol engine thereof - Google Patents

Data processing method of industry edge product and distributed computing protocol engine thereof

Info

Publication number
WO2023035147A1
WO2023035147A1 (PCT/CN2021/117198)
Authority
WO
WIPO (PCT)
Prior art keywords
data, distributed computing, configuration, processing, data source
Application number
PCT/CN2021/117198
Other languages
French (fr)
Inventor
Jing Wang
Maximilian Hoch
Yuxuan XING
Ning Liu
Lihui XIE
Ming Zhong
Original Assignee
Siemens Aktiengesellschaft
Siemens Ltd., China
Application filed by Siemens Aktiengesellschaft and Siemens Ltd., China
Priority to PCT/CN2021/117198
Publication of WO2023035147A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 Allocation of resources to service a request
    • G06F 9/5027 Allocation of resources to service a request, the resource being a machine, e.g. CPUs, servers, terminals
    • G06F 9/5044 Allocation of resources to service a request, considering hardware capabilities
    • G06F 9/54 Interprogram communication
    • G06F 9/541 Interprogram communication via adapters, e.g. between incompatible applications
    • G06F 9/542 Event management; Broadcasting; Multicasting; Notifications
    • G06F 2209/00 Indexing scheme relating to G06F 9/00
    • G06F 2209/54 Indexing scheme relating to G06F 9/54
    • G06F 2209/548 Queue

Definitions

  • A common way is to upload the data to the cloud for processing, and many public cloud suppliers have developed and deployed distributed computing orchestrators for this purpose.
  • A client may send local data to the cloud, in which the intensive data analysis and model training are performed. The user can then download the obtained model training result or data processing result to the local edge apparatus.
  • However, communication between the service cloud and the local apparatus is necessary, which may incur a high processing cost for large datasets. Moreover, because the processing is performed in a remote cloud, latency performance cannot be guaranteed.
  • Industry edge computing is gradually becoming a hot spot of future development for industrial big data analysis and processing. By using industry edge products to establish a good industry edge ecosystem, it is possible to provide powerful edge data processing capability for industry customers and application developers, to handle typical AI and data analysis cases.
  • However, a single industry edge product has limited device performance, and thus it is difficult for it to support edge big data analysis and processing on its own.
  • An embodiment of the present invention provides a method 100 for processing data of an industry edge product, applied to a distributed computing protocol engine in the industry edge product. As shown in Figure 1, the method may comprise the following steps:
  • Step 101: obtaining data to be processed and running parameter information sent from a user terminal, wherein the running parameter information comprises environment configuration information and service logic information;
  • Step 102: performing data source configuration for the data to be processed so as to obtain a data source;
  • Step 103: configuring a running parameter for processing of the data to be processed according to the running parameter information so as to obtain a configuration job;
  • Step 104: submitting the data source and the configuration job to at least one distributed computing orchestrator in the industry edge product, to use the distributed computing orchestrator to process the data source according to the configuration job.
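A minimal sketch of how steps 101-104 could fit together; the class and method names below are illustrative assumptions, not an API defined by this patent:

```python
from dataclasses import dataclass


@dataclass
class RunningParameterInfo:
    environment_config: dict  # e.g. cluster selection, input data type, output path
    service_logic: str        # identifier of the user-supplied processing logic


@dataclass
class ConfigurationJob:
    environment: dict
    logic: str


class DistributedComputingProtocolEngine:
    """Hypothetical sketch of steps 101-104."""

    def obtain(self, raw_data, params: RunningParameterInfo):
        # Step 101: receive the data and running parameter information from a user terminal.
        self.raw_data, self.params = raw_data, params

    def configure_data_source(self) -> dict:
        # Step 102: parse and model the raw data into a unified data source.
        self.data_source = {"format": "unified", "payload": self.raw_data}
        return self.data_source

    def configure_job(self) -> ConfigurationJob:
        # Step 103: build a configuration job from the running parameter information.
        self.job = ConfigurationJob(self.params.environment_config, self.params.service_logic)
        return self.job

    def submit(self, orchestrators):
        # Step 104: hand the data source and the job to at least one orchestrator.
        for orchestrator in orchestrators:
            orchestrator.process(self.data_source, self.job)
```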
  • As can be seen, when the distributed computing protocol engine processes the data of the industry edge product, it first obtains the data to be processed and the running parameter information sent from a user terminal. It then performs data source configuration for the data to be processed so as to obtain a data source, and configures a running parameter for processing of the data to be processed according to the running parameter information so as to obtain a configuration job.
  • By submitting the obtained data source and configuration job to at least one distributed computing orchestrator, it is possible to process the data source in the distributed computing orchestrator according to the configuration job.
  • In this way, the data to be processed is processed directly by the industry edge product within the factory, without the need to upload the data to servers (such as the cloud) for processing. This not only improves the real-time performance of data processing, but also improves data processing efficiency by distributing the processing across multiple industry edge products.
  • The industry edge product can include industrial edge management (IEM) and industrial edge devices (IED).
  • The distributed computing protocol engine can be installed on the IEM or IED as an individual internal service, and thus can be used by an application to access the computing engine. It can be compatible with different industry protocols and, for the bottom-layer distributed cluster, it can adapt to different distributed computing orchestrators.
  • A single edge product has relatively low performance. Its computing capacity can be extended by plug-and-play peripherals, such as a USB accelerator (Coral Edge TPU), an external dock or a module system (System on Module, SoM), and by an edge distributed computing cluster such as Hadoop, Spark, Flink, Storm, etc.
  • The overall performance of the platform is improved by horizontally scaling computing capacity, thus laying a foundation for edge big data analysis, machine learning training, real-time testing, etc.
  • Step 102 (performing data source configuration for the data to be processed so as to obtain a data source), in a possible implementation scheme, may be achieved in the following manner:
  • The edge distributed computing protocol engine provides a general mechanism for processing data sources in various industry protocols, modeling them into an internal data format, and allowing users or application developers to write their own algorithms or analysis logic based on the unified data model.
  • The job of the user is submitted to the bottom-layer distributed computing cluster and runs with a predefined cluster configuration. That is, in the present solution, it is possible to transform a specified industry protocol (such as OPC UA, etc.) into a suitable distributed computing cluster protocol according to the requirement of the user. In this way, the application can access the protocol and be distributed to an edge product with sufficient capacity, thereby reducing data transmission between IEDs/IEMs and thus optimizing data accessibility.
  • The present embodiment provides a unified data access model and data layer for applications, such that the applications need not consider data parsing or the connection relations between different industry protocols. The applications can be installed on each industry edge system without regard to which factory unit protocols are available and without handling the respective protocol transformations themselves, as in the parsing sketch below.
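A hedged Python sketch of this normalization, with OPC UA and MQTT payloads faked as plain dictionaries and all field names assumed:

```python
# Normalize payloads arriving in different industry protocol formats into
# one unified record shape, so applications never parse protocol-specific
# formats themselves. Field names are illustrative assumptions.

def parse_opcua(payload: dict) -> dict:
    return {"tag": payload["NodeId"], "value": payload["Value"], "ts": payload["SourceTimestamp"]}

def parse_mqtt(payload: dict) -> dict:
    return {"tag": payload["topic"], "value": payload["message"], "ts": payload["time"]}

PARSERS = {"opcua": parse_opcua, "mqtt": parse_mqtt}

def to_unified(protocol: str, payload: dict) -> dict:
    """Transform a protocol-specific payload into the unified data model."""
    return PARSERS[protocol](payload)

print(to_unified("mqtt", {"topic": "line1/temp", "message": 73.2, "time": 1694000000}))
```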
  • In Step 104, when submitting the data source to at least one distributed computing orchestrator in the industry edge product, it is necessary to decouple the data in different formats with respect to the data management component, so as to store different types of data from different data sources. That is, the user can predefine data sources in multiple formats, such as database data, file data, data streams, etc., for distributed computing.
  • The data management can uniformly use various data store formats to operate on the data input, as sketched below.
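One way such a data store interface might decouple storage formats, assuming three user-selectable formats (file, database, stream); all function names are illustrative:

```python
import json
import queue
import sqlite3

def store_as_file(records, path="datasource.json"):
    # File data: persist the records as a JSON document.
    with open(path, "w") as f:
        json.dump(records, f)

def store_as_database(records, db=":memory:"):
    # Database data: insert the records into a relational table.
    conn = sqlite3.connect(db)
    conn.execute("CREATE TABLE IF NOT EXISTS source (tag TEXT, value REAL)")
    conn.executemany("INSERT INTO source VALUES (?, ?)",
                     [(r["tag"], r["value"]) for r in records])
    conn.commit()
    return conn

def store_as_stream(records, q=None):
    # Data stream: push the records onto a queue for streaming consumers.
    q = q or queue.Queue()
    for r in records:
        q.put(r)
    return q

# The user predefines which store format a data source uses.
STORES = {"file": store_as_file, "database": store_as_database, "stream": store_as_stream}
```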
  • Step 103 (configuring a running parameter for processing of the data to be processed according to the running parameter information so as to obtain a configuration job), as shown in Figure 2, may be achieved in the following manner:
  • Step 201: using the environment configuration information to configure a processing environment of the data to be processed;
  • Step 202: using the service logic information to configure a processing logic of the data to be processed.
  • The running parameter information obtained from the user terminal may comprise environment configuration information and service logic information, which are used to configure the processing environment and the processing logic of the data to be processed, respectively.
  • In Step 201, the environment configuration information is used to configure a data type of the data to be processed and to configure an output path for the result which the distributed computing orchestrator obtains by processing the data source according to the configuration job. That is, for the running environment of the configuration job, it is necessary to select a cluster environment for running the job, determine the type of the input data source, determine the target device on the user side to which the result will be output, etc. Then, the processing logic by which the data to be processed should be run is determined, so as to use the respective configuration scheme.
  • With the configured processing environment and service logic, the data to be processed can undergo data processing, model learning and training, and other operations in the distributed computing orchestrator, and thus the accuracy of the data processing result and the effectiveness of the trained models can be guaranteed. An illustrative configuration is sketched below.
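For illustration, a configuration job combining Step 201 (environment configuration) and Step 202 (service logic) might look as follows; the keys and values are assumptions, not a format prescribed by the patent:

```python
# Hypothetical configuration job: environment configuration selects the
# cluster, the input data type and the output path; service logic names
# the user-supplied processing entry point.
configuration_job = {
    "environment": {
        "cluster": "spark-edge-cluster-1",   # cluster environment that runs the job
        "input_type": "stream",              # data type of the data to be processed
        "output_path": "ied-07:/results/",   # target device/path on the user side
    },
    "logic": {
        "entrypoint": "analytics.detect_anomalies",  # user-supplied processing logic
        "args": {"window_seconds": 60},
    },
}
```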
  • In Step 104, the configuration job obtained after environment configuration and logic configuration can be submitted, via a first application programming interface provided in advance, to the distributed computing orchestrator.
  • By scheduling, the configuration job is submitted to a suitable distributed computing orchestrator. For example, it is possible to first determine which distributed computing orchestrators on the bottom layer have resources available for data processing, and then, by scheduling, submit the configuration job to one or more of these distributed computing orchestrators.
  • Each distributed computing orchestrator in the industry edge product may contain multiple distributed nodes. When submitting the configuration job to the distributed computing orchestrator, it is also possible to select a suitable node to run the task according to the task condition submitted by the user terminal.
  • For example, if the submitted task is AI computing, which involves a large amount of computation, not every node may be able to support it. Therefore, it is conceivable to evaluate the resource condition of each node in the distributed computing orchestrator and select suitable resources of the distributed computing orchestrator for running the configuration job and processing the data, as in the sketch below.
  • For job scheduling, it is also possible to use various existing scheduling algorithms and systems, such as an adaptive frame-rate video inference service for the edge, a DAG-based task scheduler for heterogeneous computing, a distributed data processing system for edge devices, etc. In this way, the job scheduling capability of the distributed computing protocol engine can be greatly improved.
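A hedged sketch of such node evaluation, assuming each node reports its idle CPU, free memory and whether an accelerator is attached:

```python
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    free_cpu: int          # idle cores
    free_mem_gb: int       # free memory
    has_accelerator: bool  # e.g. a USB accelerator plugged into the edge device

def eligible_nodes(nodes, needs_accelerator=False, min_cpu=2, min_mem_gb=4):
    """Keep only nodes whose resource condition can support the task."""
    return [n for n in nodes
            if n.free_cpu >= min_cpu and n.free_mem_gb >= min_mem_gb
            and (n.has_accelerator or not needs_accelerator)]

nodes = [Node("ied-01", 1, 2, False), Node("ied-02", 4, 8, True)]
print([n.name for n in eligible_nodes(nodes, needs_accelerator=True)])  # ['ied-02']
```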
  • In a possible implementation scheme, Step 103 (configuring a running parameter for processing of the data to be processed according to the running parameter information) may be achieved by the following steps: first, determining the computing resource components which can perform task processing, according to the running parameter information; then, allocating the task(s) corresponding to the running parameter to the computing resource component(s), so as to use the computing resource component(s) to configure the running parameter.
  • A single industry edge product (such as an industry edge management or an industry edge apparatus) has limited device performance, and thus it is difficult for it to support edge big data analysis, machine learning model training and real-time testing. Moreover, the deployed individual industry edge managements or industry edge products cannot be sufficiently utilized.
  • Therefore, when configuring the running parameter for processing of the data to be processed, it is considered to first manage and evaluate the available computing resources in the edge product system of the entire factory, i.e. to determine which industry edge managements or industry edge apparatuses can be used for parameter configuration for the data to be processed. In this way, the industry edge products having idle resources are sufficiently utilized, which can improve the data processing performance of the entire data processing platform.
  • Specifically, the IEM and IED resources in the entire factory are managed and evaluated. If the IEM and IED resources are sufficient for parameter configuration to obtain the configuration job, it is possible to select suitable IEM and IED resources for running parameter configuration. If the IEM and IED resources in the factory are not sufficient for running parameter configuration, it is possible to dynamically arrange additional IEM and IED resources according to the configuration requirement of the running parameter.
  • That is, in the present solution, it is possible not only to allocate a configuration task to the current computing resources, but also to allocate it to dynamically arranged additional computing resources, so that the available computing resources can be adjusted automatically. The allocate-or-expand decision is sketched below.
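The allocate-or-expand decision can be sketched as follows; provision_extra stands in for whatever mechanism brings an additional IEM/IED online and is purely an assumption:

```python
def provision_extra(cores: int = 4) -> int:
    # Assumed hook: dynamically arrange an additional edge resource.
    print("arranging an additional edge device with", cores, "cores")
    return cores

def allocate(task_demand_cores: int, idle_pool: list[int]) -> list[int]:
    """Cover the task demand from idle resources, expanding if insufficient."""
    chosen, covered = [], 0
    for cores in sorted(idle_pool, reverse=True):
        if covered >= task_demand_cores:
            break
        chosen.append(cores)
        covered += cores
    while covered < task_demand_cores:   # idle pool insufficient:
        extra = provision_extra()        # dynamically arrange more resources
        chosen.append(extra)
        covered += extra
    return chosen

print(allocate(10, [4, 2]))  # uses 4 and 2, then provisions one extra device
```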
  • In a possible implementation scheme, after Step 104, the method for processing data of the industry edge product may further comprise the following.
  • After the data source and the configuration job are submitted to the distributed computing orchestrator, the distributed computing orchestrator performs running processing on the data source according to the configuration job so as to obtain the processing result. Therefore, in the present embodiment, it is conceivable to monitor each distributed computing orchestrator in real time, collect the processing results obtained by the individual distributed computing orchestrators, and send the collected processing results back to the user terminal. In this way, automatic monitoring of the distributed computing orchestrator(s) and automatic collection of the processing result(s) are achieved, so the processing result(s) and notification(s) can be provided to users or applications automatically, and application developers need not pay special attention to the collection and return of the processing results.
  • Moreover, both data source configuration and running parameter configuration use an asynchronous processing mechanism. Therefore, it is also conceivable for the notification of the processing result(s) to be sent back asynchronously to the user terminal. That is, when monitoring the distributed computing orchestrator in real time, the distributed computing protocol engine sends a processing result back to the user terminal as soon as it is received, rather than waiting until all the processing results corresponding to the current configuration job(s) are obtained. This avoids a moment at which a large amount of resources must be occupied for sending back the processing results, which would decrease execution efficiency. A minimal monitor-and-collect loop is sketched below.
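A minimal asynchronous monitor-and-collect loop under these assumptions; next_result is an assumed coroutine on the orchestrator handle, and send_back stands in for the return path to the user terminal:

```python
import asyncio

async def monitor(orchestrator, send_back):
    # Poll one orchestrator and forward each result the moment it arrives,
    # instead of waiting for the whole batch of results.
    while True:
        result = await orchestrator.next_result()  # assumed coroutine
        if result is None:                         # orchestrator has drained
            return
        send_back(result)                          # asynchronous, per-result return

async def collect_all(orchestrators, send_back):
    # Monitor every orchestrator concurrently.
    await asyncio.gather(*(monitor(o, send_back) for o in orchestrators))
```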
  • An embodiment of the present invention provides a method 300 for processing data of an industry edge product, applied to a user terminal. As shown in Figure 3, the method may comprise the following steps:
  • Step 301: sending data to be processed of the industry edge product and running parameter information for processing of the data to be processed, via a second application programming interface provided by a distributed computing protocol engine, into the distributed computing protocol engine;
  • Step 302: receiving a processing result for the data to be processed sent back from the distributed computing protocol engine, via the second application programming interface.
  • The user terminal may be an application on a computer, a mobile phone or a tablet, or may be accessed via a webpage, etc.
  • The distributed computing protocol engine provides a unified API for the user terminal. The user terminal uses this unified API to send the data to be processed of the industry edge product, together with the running parameter information for processing it, into the distributed computing protocol engine.
  • After the distributed computing orchestrator performs data processing according to the data source and the configuration job uploaded by the distributed computing protocol engine, the distributed computing protocol engine collects the processing result. The user terminal then uses the same unified API to receive the processing result of the data to be processed sent back (returned) by the distributed computing protocol engine.
  • In this way, the bottom-layer details of the data source and the distributed computing orchestrator are shielded from the user, who can develop applications with uniform operation habits, improving application development efficiency. A hypothetical client sketch follows.
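A hypothetical user-terminal client, assuming the unified second application programming interface is exposed over HTTP; the host, endpoint paths and field names are inventions for illustration only:

```python
import json
import urllib.request

ENGINE = "http://edge-engine.local:8080"  # assumed address of the protocol engine

def submit(data, running_params) -> str:
    """Step 301: send the data and running parameter information to the engine."""
    body = json.dumps({"data": data, "params": running_params}).encode()
    req = urllib.request.Request(f"{ENGINE}/api/v1/jobs", body,
                                 {"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["job_id"]

def fetch_result(job_id: str) -> dict:
    """Step 302: receive the processing result sent back by the engine."""
    with urllib.request.urlopen(f"{ENGINE}/api/v1/jobs/{job_id}/result") as resp:
        return json.load(resp)
```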
  • An embodiment of the present invention also provides a method 400 for processing data of an industry edge product, applied to a distributed computing orchestrator in the industry edge product. As shown in Figure 4, the method may comprise:
  • Step 401: receiving a data source and a configuration job sent from a distributed computing protocol engine;
  • Step 402: using the configuration job to perform data processing on the data source so as to obtain a processing result;
  • Step 403: sending the data processing result back to the distributed computing protocol engine.
  • After performing data source configuration on the data to be processed to obtain the data source, and performing job configuration according to the running parameter information to obtain the configuration job, the distributed computing protocol engine submits the data source and the configuration job to the distributed computing orchestrator. The distributed computing orchestrator then performs data processing on the data source according to the received configuration job, and the obtained data processing result is sent back (returned) to the distributed computing protocol engine. An orchestrator-side sketch follows.
  • The present solution arranges the distributed computing cluster in the IEM or IED, and thus can improve the computing capacity of the IEM/IED both vertically and horizontally, optimizing the utilization of idle resources.
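A sketch of the orchestrator side (steps 401-403); the logic registry and the send_back callback are assumptions:

```python
def mean_value(records, **_):
    # Example user logic: average the 'value' field of the records.
    values = [r["value"] for r in records]
    return sum(values) / len(values)

# Assumed registry resolving the entry point named in the configuration job.
LOGIC_REGISTRY = {"analytics.mean_value": mean_value}

def run_job(data_source: list, job: dict, send_back):
    # Step 401: data_source and job arrive from the protocol engine.
    logic = LOGIC_REGISTRY[job["logic"]["entrypoint"]]
    # Step 402: process the data source according to the configuration job.
    result = logic(data_source, **job["logic"].get("args", {}))
    # Step 403: send the processing result back to the protocol engine.
    send_back(result)

run_job([{"value": 70.0}, {"value": 75.0}],
        {"logic": {"entrypoint": "analytics.mean_value"}},
        print)  # prints 72.5
```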
  • As shown in Figure 5, a data processing system 500 for an industry edge product comprises a user terminal 501, a distributed computing protocol engine 502 and a distributed computing orchestrator 503.
  • The distributed computing protocol engine 502 in such a system may comprise: a standard interface layer 5021, a data parsing component 5022, a data modeling component 5023, a common store layer 5024, a job configuration component 5025, a job schedule component 5026, a result collection component 5027, a result return component 5028 and a common schedule layer 5029, etc.
  • The data processing of the present embodiment may be divided into three groups of asynchronous processing procedures.
  • The standard interface layer 5021 provides a unified second application programming interface for the user terminal 501, through which the user terminal 501 transmits the data to be processed of the edge product into the data parsing component 5022.
  • The data input by the user terminal 501 may have different industry protocol formats, such as OPC UA, MQTT, etc. Therefore, the data parsing component 5022 transforms the data in the different industry protocol formats into a unified industry protocol format, and the data modeling component 5023 performs modeling on the data in the unified format. That is, the various data formats of the data sources of the running distributed computing jobs are configured in a unified data format, and the user or application developer can develop their own algorithms based on this unified data model, in preparation for the subsequent job(s) running on the distributed computing orchestrator.
  • The common store layer 5024 provides a data store interface. The data modeling component 5023 uses the data store interface to store the modeled data in the data store format defined by the user, such as database data, file data, data stream, etc. Such data may then be transmitted into the distributed computing orchestrator 503.
  • The user terminal 501 transmits the running parameter information (environment configuration information, service logic information, etc.) into the job configuration component 5025. The job configuration component 5025 uses the environment configuration information to configure a processing environment of the data to be processed, and uses the service logic information to configure a processing logic of the data to be processed.
  • The job schedule component 5026 and the common schedule layer 5029 are used to determine the available resources of the bottom-layer distributed computing orchestrators 503, i.e. to determine which distributed computing orchestrator(s) 503 is/are available. The job schedule component 5026 can then submit the configuration job to a distributed computing orchestrator 503 with available resources.
  • The result collection component 5027 uses the API provided by the common schedule layer 5029 to monitor each distributed computing orchestrator 503 in real time. After a distributed computing orchestrator 503 has processed the data, the result collection component 5027 collects the processing results obtained by the individual distributed computing orchestrators 503 and returns them to the user terminal 501 through the result return component 5028. The wiring of these components is sketched below.
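Reusing the parsing (to_unified), storage (STORES) and node-evaluation (Node, eligible_nodes) sketches above, the three asynchronous groups of Figure 5 could be wired together roughly as follows; the composition itself is an assumption, with the component reference numerals noted in comments:

```python
def pipeline(raw, protocol, running_params, nodes, send_back):
    record = to_unified(protocol, raw)                    # 5022: data parsing component
    modeled = {"schema": "unified-v1", "rows": [record]}  # 5023: data modeling component
    stream = STORES["stream"](modeled["rows"])            # 5024: common store layer
    job = {"environment": running_params["environment"],  # 5025: job configuration component
           "logic": running_params["logic"]}
    chosen = eligible_nodes(nodes)                        # 5026/5029: job schedule component
                                                          # and common schedule layer
    # 5027/5028: in a real engine, results are collected asynchronously and
    # returned to the user terminal 501; here a summary is sent back directly.
    send_back({"job": job, "nodes": [n.name for n in chosen], "queued": stream.qsize()})
```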
  • As shown in Figure 6, the present invention provides a distributed computing protocol engine 600, comprising: an obtaining module 601, a data source configuration module 602, a job configuration module 603 and a submitting module 604.
  • The obtaining module 601 is used for obtaining data to be processed and running parameter information sent from a user terminal, wherein the running parameter information comprises environment configuration information and service logic information.
  • The data source configuration module 602 is used for performing data source configuration for the data to be processed as obtained by the obtaining module 601 so as to obtain a data source.
  • The job configuration module 603 is used for configuring a running parameter for processing of the data to be processed according to the running parameter information as obtained by the obtaining module 601 so as to obtain a configuration job.
  • The submitting module 604 is used for submitting the data source as obtained by the data source configuration module 602 and the configuration job as obtained by the job configuration module 603 to at least one distributed computing orchestrator in the industry edge product, to use the distributed computing orchestrator to process the data source according to the configuration job.
  • In possible implementation schemes, the distributed computing protocol engine 600 may further comprise a data source store module and a processing result feedback module. The modules 601-604, the data source store module and the processing result feedback module, as well as the corresponding user terminal and distributed computing orchestrator, are configured to perform the operations described in the fourth, fifth and sixth aspects of the Description below.
  • As shown in Figure 7, an embodiment of the present invention also provides a computing apparatus 700, comprising: at least one storage 701 and at least one processor 702. The at least one storage 701 is used for storing a machine-readable program, and the at least one processor 702, coupled with the at least one storage 701, is used for invoking the machine-readable program to perform the method 100 for processing data of an industry edge product as provided in any one of the above embodiments.
  • The present invention also provides a computer-readable medium storing computer instructions which, when executed by a processor, cause the processor to perform the method for processing data of an industry edge product as provided in any one of the above embodiments.
  • Specifically, a storage medium storing software program codes that implement the functions of any one of the above embodiments may be provided, such that a computer (or a CPU or MPU) of a system or device can read and execute the program codes stored in the storage medium. Since the program codes read from the storage medium can themselves achieve the functions of any one of the above embodiments, the program codes and the storage medium storing them constitute a portion of the present invention.
  • Embodiments of storage media for providing the program codes comprise: a floppy disk, a hard disk, a magneto-optical disk, an optical disk (such as CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM, DVD-RW, DVD+RW), a magnetic tape, a non-volatile memory card and a ROM.
  • Alternatively, the program codes read from the storage medium are written into a storage provided on an extension board inserted into a computer, or into a storage provided in an extension module connected to a computer. Instructions based on the program codes then make a CPU on the extension board or extension module perform some or all of the practical operations, thus achieving the functions of any one of the above embodiments.
  • A hardware module may be implemented in a mechanical manner or an electrical manner. For example, a hardware module may comprise a permanently specialized circuit or logic (such as a specialized processor, an FPGA or an ASIC) to complete the respective operations. A hardware module may also comprise programmable logic or circuitry (such as a general-purpose processor or another programmable processor) that can be temporarily configured by software to complete the respective operations. The specific implementation manner may be the mechanical manner, the specialized permanent circuit, or the temporarily provided circuit.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer And Data Communications (AREA)

Abstract

The present invention provides a method for processing data of an industry edge product and a distributed computing protocol engine. The method is applied to a distributed computing protocol engine in the industry edge product, comprising: obtaining data to be processed and running parameter information sent from a user terminal, wherein the running parameter information comprises environment configuration information and service logic information; performing data source configuration for the data to be processed so as to obtain a data source; configuring a running parameter for processing of the data to be processed according to the running parameter information so as to obtain a configuration job; and submitting the data source and the configuration job to at least one distributed computing orchestrator in the industry edge product, to use the distributed computing orchestrator to process the data source according to the configuration job. The solution can improve data processing capability of factory edge analysis.

Description

Data Processing Method of Industry Edge Product and Distributed Computing Protocol Engine Thereof

TECHNICAL FIELD
The present invention relates to the technical field of computers, and especially to a data processing method of an industry edge product and a distributed computing protocol engine thereof.
BACKGROUND OF THE INVENTION
The industry edge product is a future development trend in the edge computing area. By establishing an excellent industrial edge ecosystem, it is possible to provide powerful edge data processing capabilities for industry customers and application developers, to solve typical AI and data analysis cases.
However, a single industry edge product has limited device performance, and thus it is difficult for it to support edge big data processing. Moreover, many industry edge products are insufficiently utilized. As a result, the data processing capability of edge analysis within a factory is poor.
SUMMARY OF THE INVENTION
The present invention provides a method for processing data of an industry edge product and a distributed computing protocol engine, which can improve data processing capability of edge analysis of factories.
In a first aspect, an embodiment of the present invention provides a method for processing data of an industry edge product, applied to a distributed computing protocol engine in the industry edge product, comprising:
obtaining data to be processed and running parameter information sent from a user terminal, wherein the running parameter information comprises environment configuration information and service logic information;
performing data source configuration for the data to be processed so as to obtain a data source;
configuring a running parameter for processing of the data to be processed according to the running parameter information so as to obtain a configuration job; and
submitting the data source and the configuration job to at least one distributed computing orchestrator in the industry edge product, to use the distributed computing orchestrator to process the data source according to the configuration job.
In a possible implementation scheme, the step of performing data source configuration for the data to be processed so as to obtain a data source comprises:
transforming a protocol format of the data to be processed into a predefined unified protocol format so as to obtain a parsing data source; and
performing data modeling on the parsing data source according to a predefined data configuration template so as to obtain the data source.
In a possible implementation scheme, after the step of performing data source configuration for the data to be processed and before the step of submitting the data source to at least one distributed computing orchestrator in the industry edge product, the method further comprises:
using a data store interface to store the data source in a data store format defined by the user; and
the step of submitting the data source to at least one distributed computing orchestrator in the industry edge product comprises:
transmitting the data source stored in the data store format defined by the user into the distributed computing orchestrator.
In a possible implementation scheme, the step of configuring a running parameter for processing of the data to be processed according to the running parameter information so as to obtain a configuration job comprises:
using the environment configuration information to configure a processing environment of the data to be processed; and
using the service logic information to configure a processing logic of the data to be processed.
In a possible implementation scheme, the step of using the environment configuration information to configure a processing environment of the data to be processed comprises:
using the environment configuration information to configure a data type of the data to be processed; and
using the environment configuration information to configure an output path for a result which is obtained by processing the data source according to the configuration job by the distributed computing orchestrator.
In a possible implementation scheme, the step of submitting the configuration job to at least one distributed computing orchestrator in the industry edge product comprises:
submitting the configuration job obtained after environment configuration and logic configuration, via a first application programming interface provided in advance, to the distributed computing orchestrator.
In a possible implementation scheme, the step of configuring a running parameter for processing of the data to be processed according to the running parameter information comprises:
determining a computing resource component which can be used for task processing, according to the running parameter information; and
allocating a task corresponding to the running parameter to the computing resource component so as to use the computing resource component to configure the running parameter.
In a possible implementation scheme, after the step of submitting the data source and the configuration job to at least one distributed computing orchestrator in the industry edge product, the method further comprises:
monitoring the distributed computing orchestrator in each industry edge product in real time, and collecting the processing result obtained by the distributed computing orchestrator; and
sending the processing result back to the user terminal.
In a possible implementation scheme, the step of sending the processing result back to the user terminal comprises:
sending predefined callback interface information back to the user terminal;
and/or
sending the processing result back to the user terminal in a way of message queue.
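As a hedged illustration of these two return paths (callback invocation and message queue), with all names assumed:

```python
import queue

def return_via_callback(result, callback):
    # Predefined callback interface: the user terminal registered 'callback'.
    callback(result)

def return_via_queue(result, mq: queue.Queue):
    # Message queue: the user terminal consumes results from 'mq'.
    mq.put({"type": "processing_result", "payload": result})

mq = queue.Queue()
return_via_queue({"mean": 72.5}, mq)
print(mq.get())  # {'type': 'processing_result', 'payload': {'mean': 72.5}}
```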
In a second aspect, an embodiment of the present invention provides a method for processing data of an industry edge product, applied to a user terminal, comprising:
sending data to be processed of the industry edge product and running parameter information for processing of the data to be processed, via a second application programming interface provided by a distributed computing protocol engine, into the distributed computing protocol engine; and
receiving a processing result for the data to be processed sent back from the distributed computing protocol engine, via the second application programming interface.
In a third aspect, an embodiment of the present invention provides a method for processing data of an industry edge product, applied to a distributed computing orchestrator in the industry edge product, comprising:
receiving a data source and a configuration job sent from a distributed computing protocol engine;
using the configuration job to perform data processing on the data source so as to obtain a processing result; and
sending the processing result back to the distributed computing protocol engine.
In a fourth aspect, an embodiment of the present invention provides a distributed computing protocol engine, comprising: an obtaining module, a data source configuration module, a job configuration module and a submitting module,
the obtaining module is used for obtaining data to be processed and running parameter information sent from a user terminal, wherein the running parameter information comprises environment configuration information and service logic information;
the data source configuration module is used for performing data source configuration for the data to be processed as obtained by the obtaining module so as to obtain a data source;
the job configuration module is used for configuring a running parameter for processing of the data to be processed according to the running parameter information as obtained by the obtaining module so as to obtain a configuration job; and
the submitting module is used for submitting the data source as obtained by the data source configuration module and the configuration job as obtained by the job configuration module to at least one distributed computing orchestrator in the industry edge product, to use the distributed computing orchestrator to process the data source according to the configuration job.
In a possible implementation scheme, the data source configuration module, when performing data source configuration for the data to be processed so as to obtain the data source, is configured to perform the following operations:
transforming a protocol format of the data to be processed into a predefined unified protocol format so as to obtain a parsing data source; and
performing data modeling on the parsing data source according to a predefined data configuration template so as to obtain the data source.
In a possible implementation scheme, it further comprises: a data source store module,
the data source store module is used for using a data store interface to store the data source in a data store format defined by the user; and
the submitting module, when submitting the data source to at least one distributed computing orchestrator in the industry edge product, is configured to perform the following operation:
transmitting the data source stored in the data store format defined by the user into the distributed computing orchestrator.
In a possible implementation scheme, the job configuration module, when configuring the running parameter for processing of the data to be processed according to the running parameter information so as to obtain the configuration job, is configured to perform the following operations:
using the environment configuration information to configure a processing environment of the data to be processed; and
using the service logic information to configure a processing logic of the data to be processed.
In a possible implementation scheme, the job configuration module, when using the environment configuration information to configure the processing environment of the data to be processed, is configured to perform the following operations:
using the environment configuration information to configure a data type of the data to  be processed; and
using the environment configuration information to configure an output path for a result which is obtained by processing the data source according to the configuration job by the distributed computing orchestrator.
In a possible implementation scheme, the submitting module, when submitting the configuration job to at least one distributed computing orchestrator in the industry edge product, is configured to perform the following operation:
submitting the configuration job obtained after environment configuration and logic configuration, via a first application programming interface provided in advance, to the distributed computing orchestrator.
In a possible implementation scheme, the job configuration module, when configuring the running parameter for processing of the data to be processed according to the running parameter information, is configured to perform the following operations:
determining a computing resource component which can be used for task processing, according to the running parameter information; and
allocating a task corresponding to the running parameter to the computing resource component so as to use the computing resource component to configure the running parameter.
In a possible implementation scheme, it further comprises: a processing result feedback module which is configured to perform the following operations:
monitoring each of the distributed computing orchestrators in real time, and collecting the processing result (s) obtained by the distributed computing orchestrator (s) ; and
sending the processing result (s) back to the user terminal.
In a possible implementation scheme, the processing result feedback module, when sending the processing result (s) back to the user terminal, is configured to perform the following operations:
sending predefined callback interface information back to the user terminal;
and/or
sending the processing result back to the user terminal in a way of message queue.
In a fifth aspect, an embodiment of the present invention provides a user terminal, which is configured to perform the following operations:
sending data to be processed of the industry edge product and running parameter information for processing of the data to be processed, via a second application programming interface provided by a distributed computing protocol engine, into the distributed computing protocol engine; and
receiving a processing result for the data to be processed sent back from the distributed computing protocol engine, via the second application programming interface.
In a sixth aspect, an embodiment of the present invention provides a distributed computing orchestrator, which is configured to perform the following operations:
receiving a data source and a configuration job sent from a distributed computing protocol engine;
using the configuration job to perform data processing on the data source so as to obtain a processing result; and
sending the processing result back to the distributed computing protocol engine.
In a seventh aspect, an embodiment of the present invention provides a computing apparatus, comprising: at least one storage and at least one processor,
the at least one storage is used for storing a machine-readable program; and
the at least one processor is used for invoking the machine-readable program to perform the method according to any one of the first, second and third aspects.
In an eighth aspect, an embodiment of the present invention provides a computer-readable medium, stored thereon with computer instructions which, when executed by a processor, cause the processor to perform the method according to any one of the first, second and third aspects.
As can be seen from the above technical solutions, when the distributed computing protocol engine is processing the data of the industry edge product, it will first obtain the data to be processed and the running parameter information sent from a user terminal. Further, it will perform data source configuration for the data to be processed so as to obtain a data source, and then configure a running parameter for the data to be processed according to the running parameter information so as to obtain a configuration job. It will then submit the obtained data source and configuration job to at least one distributed computing orchestrator, which can process the data source according to the configuration job. As can be seen, in the present solution, the data to be processed is processed directly by the industry edge product within the factory, without the need to upload the data to servers (such as the cloud) for processing. This not only improves the real-time performance of data processing, but also improves data processing efficiency by distributing the processing across multiple industry edge products.
DESCRIPTION OF THE DRAWINGS
Figure 1 is a flowchart of a method for processing data of an industry edge product, applied to a distributed computing protocol engine, as provided in an embodiment of the present invention;
Figure 2 is a flowchart of a method for running parameter configuration as provided in an embodiment of the present invention;
Figure 3 is a flowchart of a method for processing data of an industry edge product, applied to a user terminal, as provided in an embodiment of the present invention;
Figure 4 is a flowchart of a method for processing data of an industry edge product, applied to a distributed computing orchestrator, as provided in an embodiment of the present invention;
Figure 5 is a diagram of a system for processing data of an industry edge product, as provided in an embodiment of the present invention;
Figure 6 is a diagram of a distributed computing protocol engine, as provided in an embodiment of the present invention; and
Figure 7 is a diagram of a computing apparatus, as provided in an embodiment of the present invention.
List of reference numerals:
101: obtaining data to be processed and running parameter information sent from a user terminal
102: performing data source configuration for the data to be processed so as to obtain a data source
103: configuring a running parameter for processing of the data to be processed according to the running parameter information so as to obtain a configuration job
104: submitting the data source and the configuration job to at least one distributed computing orchestrator in the industry edge product, to use the distributed computing  orchestrator to process the data source according to the configuration job
201: using the environment configuration information to configure a processing environment of the data to be processed
202: using the service logic information to configure a processing logic of the data to be processed
301: sending data to be processed of the industry edge product and running parameter information for processing of the data to be processed, via a second application programming interface provided by a distributed computing protocol engine, into the distributed computing protocol engine
302: receiving a processing result for the data to be processed sent back from the distributed computing protocol engine, via the second application programming interface
401: receiving a data source and a configuration job sent from a distributed computing protocol engine
402: using the configuration job to perform data processing on the data source so as to obtain a processing result
403: sending the data processing result back to the distributed computing protocol engine 
501: user terminal
502: distributed computing protocol engine
503: distributed computing orchestrator
5021: standard interface layer
5022: data parsing component
5023: data modeling component
5024: common store layer
5025: job configuration component
5026: job schedule component
5027: result collection component
5028: result return component
5029: common schedule layer
601: obtaining module
602: data source configuration module
603: job configuration module
604: submitting module
701: storage
702: processor
500: system for processing data of industry edge product
600: distributed computing protocol engine
700: computing apparatus
100/300/400: method for processing data of industry edge product
DESCRIPTION OF EXEMPLARY EMBODIMENTS
In the aspects of industrial big data analysis and processing, etc., a common approach is to upload the data to the cloud for processing. For example, currently, many public cloud suppliers have developed and deployed distributed computing orchestrators. Thus, when there is a need for data processing or artificial intelligence (AI) , a client may send local data to the cloud, in which the intensive data analysis and model training are performed. Then, the user can download the obtained model training result or data processing result to the local edge apparatus. However, in such a manner, communication between the service cloud and the local apparatus is necessary, which may incur a high processing cost for large datasets. Moreover, the processing is performed in the cloud, and a remote cloud cannot guarantee delay performance. In addition, in such a manner of data processing in the cloud, it is necessary to send internal data to the cloud, raising problems of privacy and security.
In order to solve the above problem (s) , industrial edge computing is gradually becoming a hot spot for future development in the aspects of industrial big data analysis and processing. By using industry edge products to establish a good industrial edge ecosystem, it is possible to provide a powerful edge data processing capability for industrial customers and application developers, to handle typical cases of AI and data analysis. However, a single industry edge product has limited device performance, so it is difficult for it to support edge big data analysis and processing. Moreover, many industry edge products are insufficiently utilized.
Based on this, it is conceivable in the present solution to use the distributed computing protocol engine installed in the industry edge product to process the data of the industry edge product. Thus, it is not necessary to transmit the data of the industry edge product to an apparatus outside the factory for processing. Rather, the processing can be performed by means of the edge product within the factory. In addition, the data processing operation is distributed to multiple edge products, which can effectively improve data processing efficiency.
Hereinafter, the method (s) for processing data of an industry edge product and the distributed computing protocol engine (s) as provided in the embodiments of the present invention will be explained in detail in combination with the figures.
As shown in figure 1, an embodiment of the present invention provides a method 100 for processing data of an industry edge product, applied to a distributed computing protocol engine in the industry edge product. The method may comprise the following steps:
Step 101: obtaining data to be processed and running parameter information sent from a user terminal, wherein the running parameter information comprises environment configuration information and service logic information;
Step 102: performing data source configuration for the data to be processed so as to obtain a data source;
Step 103: configuring a running parameter for processing of the data to be processed according to the running parameter information so as to obtain a configuration job; and
Step 104: submitting the data source and the configuration job to at least one distributed computing orchestrator in the industry edge product, to use the distributed computing orchestrator to process the data source according to the configuration job.
In the present embodiment, when the distributed computing protocol engine is processing the data of the industry edge product, it first obtains data to be processed and running parameter information sent from a user terminal. Further, it performs data source configuration for the data to be processed so as to obtain a data source, and configures a running parameter for processing of the data to be processed according to the running parameter information so as to obtain a configuration job. Then, by submitting the obtained data source and configuration job to at least one distributed computing orchestrator, it is possible to process the data source in the distributed computing orchestrator according to the configuration job. As can be seen from the above, in the present solution, the data to be processed is processed directly by the industry edge product within the factory, without the need to upload the data to servers (such as the cloud) for processing. This not only improves the real-time performance of data processing, but also improves data processing efficiency by distributing the data processing to multiple industry edge products.
The industry edge product can include industrial edge management (IEM) and industrial edge device (IED) . In the present solution, the distributed computing protocol engine can be installed on the IEM or IED as an individual internal service, and thus can be used by an application to access the computing engine. For an upper application, it can be compatible with different industry protocols; for a bottom layer distributed cluster, it can adapt to different distributed computing orchestrators.
A single edge product has relatively low performance. Thus, in addition to the existing industry edge product (s) within the factory, it is possible during data processing to use plug-and-play peripherals, such as a USB accelerator (Coral Edge TPU) , an external dock or a system on module (SoM) , etc. These peripherals are used to improve the performance of the processor, disk, cache, network, etc.
In addition, it is also possible during data processing to integrate all idle IEM or IED resources on the factory field to establish an edge distributed computing cluster, such as Hadoop, Spark, Flink, Storm, etc. The overall performance of the platform is improved by scaling computing capacity horizontally, thus laying a foundation for edge big data analysis, machine learning training, real-time testing, etc.
Step 102 (performing data source configuration for the data to be processed so as to obtain a data source) , in a possible implementation scheme, may be achieved in the following manner:
transforming a protocol format of the data to be processed into a predefined unified protocol format so as to obtain a parsing data source; and
performing data modeling on the parsing data source according to a predefined data configuration template so as to obtain the data source.
Currently, industrial factories and manufacturing units often use various industry protocols during data transmission and computation, such as OPC UA, Profinet, Modbus, MQTT, etc., while the bottom layer distributed computing orchestrator does not support transformation between data in different protocols. Moreover, application developers would carry a heavier burden, because the high-level application would have to transform between the different protocols. Therefore, in the present solution, it is conceivable to transform the various protocols into a unified protocol format and then perform data modeling according to a predefined data configuration template so as to obtain the data source. In this way, data from various industries and areas can be used as the data source input.
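Purely for illustration, the following Python sketch shows one possible shape of this transformation and modeling step. The unified model fields (source_id, timestamp, tags) , the assumption that MQTT payloads carry JSON, and the register naming for Modbus are assumptions of this sketch, not definitions taken from the present solution:

    import json
    import time

    # Predefined data configuration template (assumed shape for this sketch).
    UNIFIED_TEMPLATE = {"source_id": None, "timestamp": None, "tags": {}}

    def parse_to_unified(raw_payload: bytes, protocol: str) -> dict:
        """Transform a protocol-specific payload into the unified protocol format."""
        record = dict(UNIFIED_TEMPLATE)
        if protocol == "mqtt":            # MQTT payloads are assumed to be JSON here
            body = json.loads(raw_payload)
            record["source_id"] = body.get("device")
            record["tags"] = body.get("values", {})
        elif protocol == "modbus":        # Modbus delivers raw registers; map them by position
            record["source_id"] = "modbus-unit"
            record["tags"] = {f"reg_{i}": v for i, v in enumerate(raw_payload)}
        else:
            raise ValueError(f"unsupported protocol: {protocol}")
        record["timestamp"] = time.time()
        return record

    def model_data_source(records: list) -> list:
        """Apply the data configuration template: keep only the modeled fields."""
        return [{key: r.get(key) for key in UNIFIED_TEMPLATE} for r in records]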
As can be seen from the above, the edge distributed computing protocol engine is a general mechanism for processing data sources in various industry protocols, modeling them into an internal data format, and allowing users or application developers to write their own algorithms or analysis logic based on the unified data model. Finally, the job of the user is submitted to the bottom layer distributed computing cluster and runs with a predefined cluster configuration. That is, in the present solution, it is possible to transform a specified industry protocol (such as OPC UA) into a suitable distributed computing cluster protocol according to the requirement of the user. In this way, the application can access the protocol and can be distributed to an edge product with sufficient capacity, thereby reducing data transmission between IEDs/IEMs and thus optimizing data accessibility.
That is, the present embodiment provides a unified data access model and data layer for the applications, such that the applications need not consider data parsing and the connection relations between different industry protocols. Thus, the applications can be installed on each industry edge system regardless of which factory unit protocols are available, and without considering the respective protocol transformations.
After Step 102 (performing data source configuration for the data to be processed) and before Step 104 (submitting the data source to at least one distributed computing orchestrator in the industry edge product) , in a possible implementation scheme, it is also conceivable in the present solution to use a data store interface to store the data source in a data store format defined by the user. Thus, in Step 104 (submitting the data source to at least one distributed computing orchestrator) , it is conceivable to transmit the data source stored in the data store format defined by the user into the distributed computing orchestrator.
In the present embodiment, the data management component needs to be decoupled from the different data formats so that it can store different types of data from different data sources. That is, the user can predefine data sources in multiple formats, such as database data, file data, data streams, etc., for distributed computing. Thus, by providing an abstract data store interface, the data management component can uniformly operate on data input in the various data store formats.
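As a rough Python sketch of such an abstract data store interface (the class and table names are illustrative assumptions, and only two of the possible store formats are shown) :

    import abc
    import json
    import sqlite3

    class DataStore(abc.ABC):
        """Abstract data store interface; concrete formats are defined by the user."""
        @abc.abstractmethod
        def write(self, records: list) -> None:
            ...

    class FileStore(DataStore):
        """File data: append one JSON record per line."""
        def __init__(self, path: str):
            self.path = path
        def write(self, records: list) -> None:
            with open(self.path, "a", encoding="utf-8") as f:
                for record in records:
                    f.write(json.dumps(record) + "\n")

    class DatabaseStore(DataStore):
        """Database data: keep each record as a JSON payload in one table."""
        def __init__(self, db_path: str):
            self.conn = sqlite3.connect(db_path)
            self.conn.execute("CREATE TABLE IF NOT EXISTS records (payload TEXT)")
        def write(self, records: list) -> None:
            self.conn.executemany("INSERT INTO records (payload) VALUES (?)",
                                  [(json.dumps(r),) for r in records])
            self.conn.commit()

Because the rest of the engine depends only on the DataStore interface, a further store format such as a data stream could be added without changing the submitting logic.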
In a possible implementation scheme, Step 103 (configuring a running parameter for processing of the data to be processed according to the running parameter information so as to obtain a configuration job) , as shown in figure 2, may be achieved in the following manner:
Step 201: using the environment configuration information to configure a processing  environment of the data to be processed; and
Step 202: using the service logic information to configure a processing logic of the data to be processed.
During data processing, it is necessary to know the environment in which the data is processed as well as the logic by which the data is processed. Therefore, in the present embodiment, the running parameter information obtained from the user terminal may comprise environment configuration information and service logic information. Thus, the accuracy of data processing can be guaranteed by using the environment configuration information to configure a processing environment of the data to be processed and using the service logic information to configure a processing logic of the data to be processed.
For example, in a possible implementation scheme, in Step 201 (using the environment configuration information to configure a processing environment of the data to be processed) , it is conceivable to use the environment configuration information to configure a data type of the data to be processed and to configure an output path for the result which is obtained by the distributed computing orchestrator processing the data source according to the configuration job. That is, in the running environment of the configuration job, it is necessary to select a cluster environment for running the job, to determine the type of the input data source, the target device on the user side to which the result will be output, etc. Then, the processing logic with which the data to be processed should run is determined, so that the respective configuration scheme can be used. Thus, data processing, model learning and training, and other operations can be performed on the data to be processed in the distributed computing orchestrator according to the configured processing environment and service logic, and the accuracy of the data processing result and the effectiveness of the trained models can be guaranteed.
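A minimal Python sketch of such a configuration job is given below; the field names, the example service logic and its reliance on the unified model sketched earlier are assumptions for illustration only:

    from dataclasses import dataclass, field
    from typing import Callable

    @dataclass
    class ConfigurationJob:
        cluster: str                      # cluster environment selected for running the job
        input_type: str                   # type of the input data source
        output_path: str                  # output path for the result on the user side
        logic: Callable[[list], object]   # processing logic from the service logic information
        env: dict = field(default_factory=dict)

    def average_temperature(records: list) -> float:
        """Example service logic written against the unified data model."""
        values = [r["tags"]["temperature"] for r in records if "temperature" in r["tags"]]
        return sum(values) / len(values) if values else 0.0

    job = ConfigurationJob(cluster="spark-edge", input_type="stream",
                           output_path="/results/avg_temp.json",
                           logic=average_temperature)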
In Step 104 (submitting the obtained configuration job to at least one distributed computing orchestrator in the industry edge product) , after environment configuration and logic configuration, it is possible to submit the resulting configuration job, via a first application programming interface provided in advance, to the distributed computing orchestrator. As can be seen from the above, in the present embodiment, the various distributed computing orchestrators can be decoupled from the job management component, which provides a unified application programming interface for submission of the configuration job. Such an interface can be used to schedule the environment, input and output pre-configured by the job configuration component, together with the data source configured by the data management component, onto different distributed computing orchestrators for running.
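One conceivable form of such a first application programming interface is sketched below in Python, reusing the ConfigurationJob from the previous sketch; the backend classes are placeholders standing in for wrappers around real Spark or Flink client libraries:

    class SparkBackend:
        """Placeholder adapter for a Spark-style orchestrator."""
        def run(self, data_source, job):
            print(f"submitting job to Spark cluster, output to {job.output_path}")

    class FlinkBackend:
        """Placeholder adapter for a Flink-style orchestrator."""
        def run(self, data_source, job):
            print(f"submitting job to Flink cluster, output to {job.output_path}")

    BACKENDS = {"spark-edge": SparkBackend(), "flink-edge": FlinkBackend()}

    def submit_job(data_source, job):
        """Unified submission entry point: route the job to a suitable orchestrator."""
        backend = BACKENDS.get(job.cluster)
        if backend is None:
            raise KeyError(f"no orchestrator registered for cluster {job.cluster!r}")
        backend.run(data_source, job)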
Certainly, after configuring the environment and service logic of the data to be processed, it is conceivable, according to the upper-level service condition and the lower-level resource condition, to submit the obtained configuration job to a suitable distributed computing orchestrator. For example, it is possible to first determine which distributed computing orchestrators on the bottom layer have available resources for data processing. Then, by scheduling, the configuration job is submitted to one or more of these distributed computing orchestrators.
Further, each distributed computing orchestrator in the industry edge product may contain multiple distributed nodes. When submitting the configuration job to the distributed computing orchestrator, it is also possible to select a suitable node to run the task according to the task condition as submitted by the user terminal. For example, when the submitted task is AI computing, it involves a large amount of computation, and not every node may be able to support such AI computing. Therefore, it is conceivable to evaluate the resource condition of each node in the distributed computing orchestrator and thus to select suitable resources of the distributed computing orchestrator for running the configuration job and processing the data.
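For example, node selection could look like the following Python sketch, assuming each node reports its idle CPU cores and memory (the thresholds and field names are illustrative) :

    def select_nodes(nodes: list, min_cores: int, min_mem_gb: float) -> list:
        """Keep only the nodes whose idle resources can support the submitted task."""
        return [n for n in nodes
                if n["free_cores"] >= min_cores and n["free_mem_gb"] >= min_mem_gb]

    cluster_nodes = [
        {"name": "ied-1", "free_cores": 2, "free_mem_gb": 1.5},
        {"name": "ied-2", "free_cores": 8, "free_mem_gb": 12.0},  # fit for AI computing
    ]
    ai_capable = select_nodes(cluster_nodes, min_cores=4, min_mem_gb=8.0)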
In job scheduling, it is also possible to use various existing job scheduling algorithms and systems, such as adaptive frame rate video inference services for the edge, DAG-based task scheduler systems for heterogeneous computing, distributed data processing systems for edge devices, etc. With these advanced techniques, the job scheduling capability of the distributed computing protocol engine can be greatly improved.
In addition, when the present solution is applied to fast testing, in order to guarantee inference delay performance, it is also conceivable to distinguish between different application scenarios according to the job configuration of the user. The user terminal is allowed to send the input data source directly to the distributed data cluster, and skipping the intermediate data modeling transformation in this manner can improve job execution performance. Certainly, it should be noted that this requires the user or the application developers to provide their own algorithm. That is, according to the user requirement, the different code logics implemented in the different distributed clusters should still follow the above data model.
In a possible implementation scheme, Step 103 (configuring a running parameter for processing of the data to be processed according to the running parameter information) may be achieved by the following steps:
determining a computing resource component which can be used for task processing, according to the running parameter information; and
allocating a task corresponding to the running parameter to the computing resource component so as to use the computing resource component to configure the running parameter.
In the present embodiment, when configuring the running parameter for processing of the data to be processed according to the running parameter information, first, it is possible to determine the computing resource components, which can perform task processing, according to the running parameter information, and then allocate the task (s) corresponding to the running parameter to the computing resource component (s) so as to use the computing resource component (s) to configure the running parameter.
Currently, a single industry edge product (such as an industry edge management system or an industry edge device) has limited device performance, so it is difficult for it to support edge big data analysis, machine learning model training and real-time testing. Moreover, the deployed individual industry edge management systems or industry edge products cannot be sufficiently utilized. In the present solution, when configuring the running parameter for processing of the data to be processed, it is considered to first perform management and evaluation on the available computing resources in the edge product system of the entire factory, i.e. determining which industry edge management systems or industry edge devices can be used for parameter configuration for the data to be processed. By allocating the configuration task of configuring the running parameter to these industry edge management systems and industry edge devices, the industry edge products having idle resources are sufficiently utilized. Moreover, this manner can improve the data processing performance of the entire data processing platform.
Certainly, it should be noted that managing and evaluating the available computing resources in the edge product system of the entire factory means managing and evaluating the IEM and IED resources in the entire factory. When the IEM and IED resources are sufficient for the parameter configuration needed to obtain the configuration job, it is possible to select suitable IEM and IED resources for running parameter configuration. When the IEM and IED resources in the factory are not sufficient for running parameter configuration, it is possible to dynamically arrange additional IEM and IED resources according to the configuration requirement of the running parameter. That is, in the present solution, a configuration task can be allocated not only to the current computing resources but also to dynamically arranged additional computing resources. Thus, the available computing resources can be automatically adjusted according to the resource requirement of the configuration job.
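The following Python sketch illustrates this evaluation and dynamic arrangement under the simplifying assumption that resources are measured in CPU cores only; the pool entries and the fixed size of an additionally arranged IED are assumptions of the sketch:

    def evaluate_pool(pool: list, required_cores: int) -> list:
        """Return idle IEM/IED resources, arranging additional ones when insufficient."""
        idle = [r for r in pool if r["idle"]]
        total = sum(r["cores"] for r in idle)
        while total < required_cores:   # not sufficient: dynamically arrange more resources
            extra = {"name": f"ied-extra-{len(pool)}", "cores": 4, "idle": True}
            pool.append(extra)
            idle.append(extra)
            total += extra["cores"]
        return idle

    pool = [{"name": "iem-1", "cores": 4, "idle": True},
            {"name": "ied-1", "cores": 2, "idle": False}]
    print(evaluate_pool(pool, required_cores=8))    # arranges one additional IED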
In a possible implementation scheme, after the step of submitting the data source and the configuration job to at least one distributed computing orchestrator in the industry edge product, the method for processing data of the industry edge product may further comprise:
monitoring each distributed computing orchestrator in real time, and collecting the processing result obtained by the distributed computing orchestrator; and
sending the processing result back to the user terminal.
After the data source and the configuration job are submitted to the distributed computing orchestrator, the distributed computing orchestrator will process the data source according to the configuration job so as to obtain the processing result. Therefore, in the present embodiment, it is conceivable to monitor each distributed computing orchestrator in real time, collect the processing results obtained by the individual distributed computing orchestrators, and send the collected processing results back to the user terminal. As can be seen from the above, in the present embodiment, automatic monitoring of the distributed computing orchestrator (s) and automatic collection of the processing result (s) are achieved, so the processing result (s) of the job (s) running on the distributed computing orchestrator (s) and the corresponding notification (s) can be provided to the user (s) or application (s) automatically. Therefore, the application developers do not need to pay special attention to the collection and return of the processing result (s) .
In the data processing platform of the entire industry edge product, both data source configuration and running parameter configuration use an asynchronous processing mechanism. Therefore, it is also conceivable for the notification of the processing result (s) to be sent back asynchronously to the user terminal. That is, the distributed computing protocol engine, while monitoring the distributed computing orchestrator in real time, sends the processing result (s) back to the user terminal as soon as the processing result (s) is/are received. Thus, it is not necessary to wait until all the processing results corresponding to the current configuration job (s) are obtained before sending them back. In this way, it avoids a moment at which a large amount of resources must be occupied to send all the processing results back at once, which would decrease execution efficiency.
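A small Python sketch of this asynchronous monitoring and immediate return is given below; the poll_results method on the orchestrator is an assumed interface, and FakeOrchestrator only simulates a finished job:

    import queue
    import threading
    import time

    class FakeOrchestrator:
        """Simulates an orchestrator that has one finished result to report."""
        def __init__(self):
            self._done = ["result-1"]
        def poll_results(self):
            done, self._done = self._done, []
            return done

    def monitor(orchestrators: list, outbox: queue.Queue, stop: threading.Event):
        """Forward each result to the user terminal as soon as it is received."""
        while not stop.is_set():
            for orch in orchestrators:
                for result in orch.poll_results():
                    outbox.put(result)  # sent back immediately, no waiting for other jobs
            time.sleep(0.5)

    stop, outbox = threading.Event(), queue.Queue()
    threading.Thread(target=monitor, args=([FakeOrchestrator()], outbox, stop),
                     daemon=True).start()
    print(outbox.get())                 # "result-1" arrives without delay
    stop.set()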
When sending the processing result back to the user terminal, in a possible implementation scheme, it is conceivable to send predefined callback interface information back to the user terminal. Certainly, in another possible implementation scheme, it is possible to send the processing result back to the user terminal in a way of message queue. That is, there may be mainly two result return mechanisms. One mechanism is an asynchronously invoked interface, wherein a callback interface is predefined by the application; when a notification indicating that the distributed computing orchestrator has completed the data processing is received, the information of the callback interface is sent back (returned) to the user application such that the user can obtain the processing result. The other mechanism uses a message queue to send the result to the application which has subscribed to the message, so that the application can automatically receive the data processing result.
In addition, if large data or file information needs to be returned to the user, it is also possible to put the large data or file into an output position predefined by the user and then return its address to the user.
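Both return mechanisms and the large-data path can be sketched in a few lines of Python; the callback and publish callables, the size threshold and the output position are all assumptions of the sketch:

    import json

    def return_result(job_id: str, result, callback=None, publish=None,
                      large_threshold=1 << 20):
        """callback: asynchronously invoked interface predefined by the application;
        publish: message-queue send for applications subscribed to the topic."""
        payload = json.dumps(result)
        if len(payload) > large_threshold:    # large data: store it, return only the address
            path = f"/outputs/{job_id}.json"  # output position predefined by the user (assumed)
            with open(path, "w", encoding="utf-8") as f:
                f.write(payload)
            payload = json.dumps({"address": path})
        if callback is not None:              # mechanism 1: callback interface information
            callback(job_id, payload)
        if publish is not None:               # mechanism 2: message queue
            publish("results/" + job_id, payload)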
As shown in figure 3, an embodiment of the present invention provides a method 300 for processing data of an industry edge product, applied to a user terminal. The method may comprise the following steps:
Step 301: sending data to be processed of the industry edge product and running  parameter information for processing of the data to be processed, via a second application programming interface provided by a distributed computing protocol engine, into the distributed computing protocol engine; and
Step 302: receiving a processing result for the data to be processed sent back from the distributed computing protocol engine, via the second application programming interface.
In an embodiment of the present invention, the user terminal may be an application on a computer, a mobile phone or a tablet (PAD) , or may be accessed via a webpage, etc. The distributed computing protocol engine provides a unified API interface for the user terminal. Thus, the user terminal uses the unified API interface to send the data to be processed of the industry edge product and the running parameter information for processing of the data to be processed into the distributed computing protocol engine. After the distributed computing orchestrator performs data processing according to the data source and the configuration job uploaded by the distributed computing protocol engine, the distributed computing protocol engine will collect the processing result. At this time, the user terminal can use the same unified API interface to receive the processing result of the data to be processed sent back (returned) by the distributed computing protocol engine. By transmitting data in and sending results back via the unified API interface, the bottom layer details of the data source and the distributed computing orchestrator are shielded. Thus, users can develop applications with uniform operation conventions, improving application development efficiency.
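From the user-terminal side, usage of the second application programming interface might look like the following Python sketch; EngineClient and its method names are hypothetical stand-ins, not the actual product API:

    class EngineClient:
        """Hypothetical client wrapping the engine's unified API interface."""
        def __init__(self, endpoint: str):
            self.endpoint = endpoint
        def send(self, data: bytes, running_params: dict) -> str:
            # A real client would POST to the engine; this stub just returns a job id.
            return "job-42"
        def receive(self, job_id: str) -> dict:
            # A real client would poll or register a callback for the returned result.
            return {"job": job_id, "status": "done"}

    client = EngineClient("http://edge-engine.local:8080")
    job_id = client.send(b'{"device": "pump-7", "values": {"temperature": 71.3}}',
                         {"environment": {"cluster": "spark-edge"},
                          "service_logic": "average_temperature"})
    result = client.receive(job_id)   # processing result sent back via the same interface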
As shown in figure 4, an embodiment of the present invention also provides a method 400 for processing data of an industry edge product, applied to a distributed computing orchestrator in the industry edge product. The method may comprise:
Step 401: receiving a data source and a configuration job sent from a distributed computing protocol engine;
Step 402: using the configuration job to perform data processing on the data source so as to obtain a processing result; and
Step 403: sending the data processing result back to the distributed computing protocol engine.
In the present embodiment, the distributed computing protocol engine, after performing data source configuration on the data to be processed so as to obtain the data source and performing job configuration according to the running parameter information so as to obtain the configuration job, will submit the data source and the configuration job to the distributed computing orchestrator. The distributed computing orchestrator will perform data processing on the data source according to the received configuration job. Further, the obtained data processing result will be sent back (returned) to the distributed computing protocol engine. As can be seen from the above, the present solution arranges the distributed computing cluster in the IEM or IED, and thus can improve the computing capacity of the IEM/IED both vertically and horizontally, optimizing the utilization of idle resources.
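On the orchestrator side, the three steps of method 400 reduce to a very small loop; the sketch below assumes the data source arrives as unified-model records and the configuration job carries a callable logic, as in the earlier sketches:

    def run_configuration_job(data_source: list, job) -> dict:
        """Steps 401-403: receive, process according to the job, and send back."""
        result = job.logic(data_source)          # Step 402: data processing on the data source
        return {"output_path": job.output_path,  # Step 403: returned to the protocol engine
                "value": result}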
Hereinafter, the method for processing data of the industry edge product provided in the present invention will be further explained in combination with a distributed computing protocol engine, a user terminal and a distributed computing orchestrator. As shown in figure 5, a data processing system 500 for an industry edge product comprises a user terminal 501, a distributed computing protocol engine 502 and a distributed computing orchestrator 503. In a possible implementation scheme, the distributed computing protocol engine 502 in such a system may comprise: a standard interface layer 5021, a data parsing component 5022, a data modeling component 5023, a common store layer 5024, a job configuration component 5025, a job schedule component 5026, a result collection component 5027, a result return component 5028 and a common schedule layer 5029, etc.
The present embodiment may be divided into three groups of asynchronous processing procedures. In the first asynchronous processing procedure, the standard interface layer 5021 provides a unified second application programming interface for the user terminal 501. Thus, the user terminal 501 transmits the data to be processed of the edge product into the data parsing component 5022. The data input by the user terminal 501 may have different industry protocol formats, such as OPC UA, MQTT, etc. Therefore, the data parsing component 5022 transforms the data in the different industry protocol formats into a unified industry protocol format. Then the data modeling component 5023 performs modeling on the data in the unified industry protocol format. That is, the various data formats of the data sources used by the running distributed computing jobs will be configured into the unified data format. Thus, the user or application developer can develop their own algorithms based on such a unified data model, in preparation for the subsequent job (s) running on the distributed computing orchestrator. Further, the common store layer 5024 provides a data store interface. Thus, the data modeling component 5023 uses the data store interface to store the modeled data in the data store format defined by the user, such as database data, file data, data stream, etc. Further, such data may be transmitted into the distributed computing orchestrator 503.
In the second asynchronous processing procedure, the user terminal 501 transmits the running parameter information (environment configuration information, service logic information, etc. ) into the job configuration component 5025. The job configuration component 5025 uses the environment configuration information to configure a processing environment of the data to be processed, and uses the service logic information to configure a processing logic of the data to be processed. Further, the job schedule component 5026 and the common schedule layer 5029 are used to determine the available resources of the bottom layer distributed computing orchestrators 503, i.e. determining which distributed computing orchestrator (s) 503 is/are available. Thus, by transforming the format of the configuration job into the API interface suitable for the available distributed computing orchestrator (s) 503, the job schedule component 5026 can submit the configuration job to a distributed computing orchestrator 503 with available resources.
In the third asynchronous processing procedure, the result collection component 5027 uses the API interface provided by the common schedule layer 5029 to monitor each distributed computing orchestrator 503 in real time. After a distributed computing orchestrator 503 has processed the data, the result collection component 5027 collects the processing results obtained by the individual distributed computing orchestrators 503 and returns them to the user terminal 501 through the result return component 5028.
As shown in figure 6, the present invention provides a distributed computing protocol engine 600, comprising: an obtaining module 601, a data source configuration module 602, a job configuration module 603 and a submitting module 604,
the obtaining module 601 is used for obtaining data to be processed and running parameter information sent from a user terminal, wherein the running parameter information comprises environment configuration information and service logic information;
the data source configuration module 602 is used for performing data source  configuration for the data to be processed as obtained by the obtaining module 601 so as to obtain a data source;
the job configuration module 603 is used for configuring a running parameter for processing of the data to be processed according to the running parameter information as obtained by the obtaining module 601 so as to obtain a configuration job; and
the submitting module 604 is used for submitting the data source as obtained by the data source configuration module 602 and the configuration job as obtained by the job configuration module 603 to at least one distributed computing orchestrator in the industry edge product, to use the distributed computing orchestrator to process the data source according to the configuration job.
In a possible implementation scheme, the data source configuration module 602, when performing data source configuration for the data to be processed so as to obtain the data source, is configured to perform the following operations:
transforming a protocol format of the data to be processed into a predefined unified protocol format so as to obtain a parsing data source; and
performing data modeling on the parsing data source according to a predefined data configuration template so as to obtain the data source.
In a possible implementation scheme, the distributed computing protocol engine 600 further comprises: a data source store module,
the data source store module is used for using a data store interface to store the data source in a data store format defined by the user; and
the submitting module 604, when submitting the data source to at least one distributed computing orchestrator in the industry edge product, is configured to perform the following operation:
transmitting the data source stored in the data store format defined by the user into the distributed computing orchestrator.
In a possible implementation scheme, the job configuration module 603, when configuring the running parameter for processing of the data to be processed according to the running parameter information so as to obtain a configuration job, is configured to perform the following operations:
using the environment configuration information to configure a processing environment  of the data to be processed; and
using the service logic information to configure a processing logic of the data to be processed.
In a possible implementation scheme, the job configuration module 603, when using the environment configuration information to configure the processing environment of the data to be processed, is configured to perform the following operation:
using the environment configuration information to configure a data type of the data to be processed; and
using the environment configuration information to configure an output path for a result which is obtained by processing the data source according to the configuration job by the distributed computing orchestrator.
In a possible implementation scheme, the submitting module 604, when submitting the configuration job to at least one distributed computing orchestrator in the industry edge product, is configured to perform the following operation:
submitting the configuration job obtained after environment configuration and logic configuration, via a first application programming interface provided in advance, to the distributed computing orchestrator.
In a possible implementation scheme, the job configuration module 603, when configuring the running parameter for processing of the data to be processed according to the running parameter information, is configured to perform the following operations:
determining a computing resource component which can be used for task processing, according to the running parameter information; and
allocating a task corresponding to the running parameter to the computing resource component so as to use the computing resource component to configure the running parameter.
In a possible implementation scheme, the distributed computing protocol engine 600 further comprises: a processing result feedback module which is configured to perform the following operations:
monitoring the distributed computing orchestrator in each industry edge product in real time, and collecting the processing result obtained by the distributed computing orchestrator; and
sending the processing result back to the user terminal.
In a possible implementation scheme, the processing result feedback module, when sending the processing result back to the user terminal, is configured to perform the following operations:
sending predefined callback interface information back to the user terminal;
and/or
sending the processing result back to the user terminal in a way of message queue.
In a possible implementation scheme, a user terminal is configured to perform the following operations:
sending data to be processed of the industry edge product and running parameter information for processing of the data to be processed, via a second application programming interface provided by a distributed computing protocol engine 600, into the distributed computing protocol engine 600; and
receiving a processing result for the data to be processed sent back from the distributed computing protocol engine 600, via the second application programming interface.
In a possible implementation scheme, a distributed computing orchestrator is configured to perform the following operations:
receiving a data source and a configuration job sent from a distributed computing protocol engine 600;
using the configuration job to perform data processing on the data source so as to obtain a processing result; and
sending the processing result back to the distributed computing protocol engine 600.
As shown in figure 7, an embodiment of the present invention also provides a computing apparatus 700, and it comprises: at least one storage 701 and at least one processor 702,
the at least one storage 701 is used for storing a machine-readable program; and
the at least one processor 702, coupled with the at least one storage 701, is used for invoking the machine-readable program to perform the method 100 for processing data of an industry edge product as provided in any one of the above embodiments.
The present invention also provides a computer-readable medium, stored thereon with computer instructions which, when executed by a processor, make the processor perform the method for processing data of an industry edge product as provided in any one of the above embodiments. Specifically, it is possible to provide a system or a device equipped with a storage medium on which software program codes implementing the function (s) of any one of the above embodiments are stored, such that a computer (or a CPU or an MPU) of the system or device can read and execute the program codes stored in the storage medium.
In this case, the program codes read from the storage medium can, by themselves, achieve the function (s) of any one of the above embodiments. Therefore, the program codes and the storage medium storing the program codes constitute a portion of the present invention.
The embodiments of the storage media for providing program codes comprise: floppy disk, hard disk, magneto-optical disk, optical disk (such as CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM, DVD-RW, DVD+RW) , magnetic tape, non-volatile memory card and ROM. Optionally, it is possible to use a communication network to download program codes from a server computer.
In addition, it should be clear that not only the program codes read by a computer, but also an operating system running on the computer can, through instructions based on the program codes, complete a portion of or all the practical operations, thus achieving the function (s) of any one of the above embodiments.
In addition, it is understandable that the program codes read from the storage medium may be written into a storage provided in an extension board inserted in a computer, or into a storage provided in an extension module connected to a computer. The instructions based on the program codes then make a CPU on the extension board or extension module perform a portion of or all the practical operations, thus achieving the function (s) of any one of the above embodiments.
It should be noted that in the above flowcharts and structural diagrams of the devices, not all the steps and modules are necessary, and some step (s) or module (s) may be omitted according to practical requirement (s) . The order in which the steps are performed is not fixed and may be adjusted according to requirement (s) . The system structures described in the above embodiments may be physical structures or logical structures. That is, some modules may be implemented by one and the same physical entity, while some modules may be implemented by multiple physical entities or jointly by parts of multiple individual apparatuses. In the embodiments, the above-described distributed computing protocol engine, user terminal and distributed computing orchestrator are based on the same inventive concept as the method for processing data of an industry edge product.
In the above embodiments, a hardware module may be implemented mechanically or electrically. For example, a hardware module may comprise a permanent specialized circuit or logic (such as a specialized processor, an FPGA or an ASIC) to complete the respective operation (s) . A hardware module may also comprise programmable logic or circuitry (such as a general processor or another programmable processor) , which can be temporarily configured by software to complete the respective operation (s) . The specific implementation manner (mechanical manner, specialized permanent circuit, or temporarily configured circuit) can be determined based on cost and time considerations.
Hereinbefore, the present invention is demonstrated and explained in detail by means of the accompanying drawings and preferred embodiments. However, the present invention is not limited to these disclosed embodiments. Based on the above various embodiments, those skilled in the art will appreciate that further embodiments of the present invention can be obtained by combining the technical means of the above various embodiments, and these embodiments also fall within the protection scope of the present invention.

Claims (24)

  1. A method for processing data of an industry edge product, characterized in that it is applied to a distributed computing protocol engine in the industry edge product, comprising:
    obtaining data to be processed and running parameter information sent from a user terminal, wherein the running parameter information comprises environment configuration information and service logic information;
    performing data source configuration for the data to be processed so as to obtain a data source;
    configuring a running parameter for processing of the data to be processed according to the running parameter information so as to obtain a configuration job; and
    submitting the data source and the configuration job to at least one distributed computing orchestrator in the industry edge product, to use the distributed computing orchestrator to process the data source according to the configuration job.
  2. The method according to claim 1, characterized in that the step of performing data source configuration for the data to be processed so as to obtain a data source comprises:
    transforming a protocol format of the data to be processed into a predefined unified protocol format so as to obtain a parsing data source; and
    performing data modeling on the parsing data source according to a predefined data configuration template so as to obtain the data source.
  3. The method according to claim 1, characterized in that after the step of performing data source configuration for the data to be processed and before the step of submitting the data source to at least one distributed computing orchestrator in the industry edge product, the method further comprises:
    using a data store interface to store the data source in a data store format defined by the user; and
    the step of submitting the data source to at least one distributed computing orchestrator in the industry edge product comprises:
    transmitting the data source stored in the data store format defined by the user into the distributed computing orchestrator.
  4. The method according to claim 1, characterized in that the step of configuring a running parameter for processing of the data to be processed according to the running parameter information so as to obtain a configuration job comprises:
    using the environment configuration information to configure a processing environment of the data to be processed; and
    using the service logic information to configure a processing logic of the data to be processed.
  5. The method according to claim 4, characterized in that the step of using the environment configuration information to configure a processing environment of the data to be processed comprises:
    using the environment configuration information to configure a data type of the data to be processed; and
    using the environment configuration information to configure an output path for a result which is obtained by processing the data source according to the configuration job by the distributed computing orchestrator.
  6. The method according to claim 4, characterized in that the step of submitting the configuration job to at least one distributed computing orchestrator in the industry edge product comprises:
    submitting the configuration job obtained after environment configuration and logic configuration, via a first application programming interface provided in advance, to the distributed computing orchestrator.
  7. The method according to claim 1, characterized in that the step of configuring a running parameter for processing of the data to be processed according to the running parameter information comprises:
    determining a computing resource component which can be used for task processing, according to the running parameter information; and
    allocating a task corresponding to the running parameter to the computing resource component so as to use the computing resource component to configure the running parameter.
  8. The method according to claim 1, characterized in that after the step of submitting the data source and the configuration job to at least one distributed computing orchestrator in the industry edge product, the method further comprises:
    monitoring the distributed computing orchestrator in each industry edge product in real time, and collecting the processing result obtained by the distributed computing  orchestrator; and
    sending the processing result back to the user terminal.
  9. The method according to claim 8, characterized in that the step of sending the processing result back to the user terminal comprises:
    sending predefined callback interface information back to the user terminal;
    and/or
    sending the processing result back to the user terminal in a way of message queue.
  10. A method for processing data of an industry edge product, characterized in that it is applied to a user terminal, comprising:
    sending data to be processed of the industry edge product and running parameter information for processing of the data to be processed, via a second application programming interface provided by a distributed computing protocol engine, into the distributed computing protocol engine; and
    receiving a processing result for the data to be processed sent back from the distributed computing protocol engine, via the second application programming interface.
  11. A method for processing data of an industry edge product, characterized in that it is applied to a distributed computing orchestrator in the industry edge product, comprising:
    receiving a data source and a configuration job sent from a distributed computing protocol engine;
    using the configuration job to perform data processing on the data source so as to obtain a processing result; and
    sending the processing result back to the distributed computing protocol engine.
  12. A distributed computing protocol engine, characterized in that it comprises: an obtaining module, a data source configuration module, a job configuration module and a submitting module,
    the obtaining module is used for obtaining data to be processed and running parameter information sent from a user terminal, wherein the running parameter information comprises environment configuration information and service logic information;
    the data source configuration module is used for performing data source configuration for the data to be processed as obtained by the obtaining module so as to obtain a data source;
    the job configuration module is used for configuring a running parameter for processing of the data to be processed according to the running parameter information as obtained by the obtaining module so as to obtain a configuration job; and
    the submitting module is used for submitting the data source as obtained by the data source configuration module and the configuration job as obtained by the job configuration module to at least one distributed computing orchestrator in the industry edge product, to use the distributed computing orchestrator to process the data source according to the configuration job.
  13. The distributed computing protocol engine according to claim 12, characterized in that the data source configuration module, when performing data source configuration for the data to be processed so as to obtain the data source, is configured to perform the following operations:
    transforming a protocol format of the data to be processed into a predefined unified protocol format so as to obtain a parsing data source; and
    performing data modeling on the parsing data source according to a predefined data configuration template so as to obtain the data source.
  14. The distributed computing protocol engine according to claim 12, characterized in that it further comprises: a data source store module,
    the data source store module is used for using a data store interface to store the data source in a data store format defined by the user; and
    the submitting module, when submitting the data source to at least one distributed computing orchestrator in the industry edge product, is configured to perform the following operation:
    transmitting the data source stored in the data store format defined by the user into the distributed computing orchestrator.
  15. The distributed computing protocol engine according to claim 12, characterized in that the job configuration module, when configuring the running parameter for processing of the data to be processed according to the running parameter information so as to obtain the configuration job, is configured to perform the following operations:
    using the environment configuration information to configure a processing environment of the data to be processed; and
    using the service logic information to configure a processing logic of the data to be processed.
  16. The distributed computing protocol engine according to claim 15, characterized in that the job configuration module, when using the environment configuration information to configure the processing environment of the data to be processed, is configured to perform the following operation:
    using the environment configuration information to configure a data type of the data to be processed; and
    using the environment configuration information to configure an output path for a result which is obtained by processing the data source according to the configuration job by the distributed computing orchestrator.
  17. The distributed computing protocol engine according to claim 15, characterized in that the submitting module, when submitting the configuration job to at least one distributed computing orchestrator in the industry edge product, is configured to perform the following operation:
    submitting the configuration job obtained after environment configuration and logic configuration, via a first application programming interface provided in advance, to the distributed computing orchestrator.
  18. The distributed computing protocol engine according to claim 12, characterized in that the job configuration module, when configuring the running parameter for processing of the data to be processed according to the running parameter information, is configured to perform the following operations:
    determining a computing resource component which can be used for task processing, according to the running parameter information; and
    allocating a task corresponding to the running parameter to the computing resource component so as to use the computing resource component to configure the running parameter.
  19. The distributed computing protocol engine according to claim 12, characterized in that it further comprises: a processing result feedback module which is configured to perform the following operations:
    monitoring the distributed computing orchestrator in each industry edge product in real  time, and collecting the processing result obtained by the distributed computing orchestrator; and
    sending the processing result back to the user terminal.
  20. The distributed computing protocol engine according to claim 19, characterized in that the processing result feedback module, when sending the processing result back to the user terminal, is configured to perform the following operations:
    sending predefined callback interface information back to the user terminal;
    and/or
    sending the processing result back to the user terminal in a way of message queue.
  21. A user terminal, characterized in that it is configured to perform the following operations:
    sending data to be processed of the industry edge product and running parameter information for processing of the data to be processed, via a second application programming interface provided by a distributed computing protocol engine, to the distributed computing protocol engine; and
    receiving a processing result for the data to be processed sent back from the distributed computing protocol engine, via the second application programming interface.
  22. A distributed computing orchestrator, characterized in that it is configured to perform the following operations:
    receiving a data source and a configuration job sent from a distributed computing protocol engine;
    using the configuration job to perform data processing on the data source so as to obtain a processing result; and
    sending the processing result back to the distributed computing protocol engine.
  23. A computing apparatus, characterized in that it comprises at least one storage and at least one processor, wherein
    the at least one storage is used for storing a machine-readable program; and
    the at least one processor is used for invoking the machine-readable program to perform the method according to any one of claims 1-11.
  24. A computer-readable medium, characterized in that computer instructions are stored thereon which, when executed by a processor, cause the processor to perform the method according to any one of claims 1-11.

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
PCT/CN2021/117198 | 2021-09-08 | 2021-09-08 | Data processing method of industry edge product and distributed computing protocol engine thereof (WO2023035147A1)

Publications (1)

Publication Number | Publication Date
WO2023035147A1 | 2023-03-16

Family

ID=85507094

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
PCT/CN2021/117198 | Data processing method of industry edge product and distributed computing protocol engine thereof (WO2023035147A1) | 2021-09-08 | 2021-09-08

Country Status (1)

Country | Link
WO | WO2023035147A1 (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party

Publication number | Priority date | Publication date | Assignee | Title
CN109491301A * | 2019-01-23 | 2019-03-19 | 东莞固高自动化技术有限公司 | Industrial internet intelligent controller based on edge computing system architecture
US20210042160A1 * | 2019-04-05 | 2021-02-11 | Mimik Technology Inc. | Method and system for distributed edge cloud computing
CN110336703A * | 2019-07-12 | 2019-10-15 | 河海大学常州校区 | Industrial big data monitoring system based on edge computing
WO2021079357A1 * | 2019-10-26 | 2021-04-29 | Mimik Technology Inc. | Method and system for distributed edge cloud computing
CN111782374A * | 2020-07-05 | 2020-10-16 | 樊垚 | Task processing system based on edge computing under trusted cloud computing foundation

Legal Events

121 | Ep: the epo has been informed by wipo that ep was designated in this application
    Ref document number: 21956335 | Country of ref document: EP | Kind code of ref document: A1
NENP | Non-entry into the national phase
    Ref country code: DE
122 | Ep: pct application non-entry in european phase
    Ref document number: 21956335 | Country of ref document: EP | Kind code of ref document: A1