CN111694789A - Embedded reconfigurable heterogeneous determination method, system, storage medium and processor

Info

Publication number: CN111694789A
Application number: CN202010323642.0A
Authority: CN
Inventors: 杨鹏飞; 吴自力; 吕文凯; 党佳乐; 张璐璐; 舒洁琼; 张鹤于; 王振翼; 张昊
Original assignee: Xidian University
Current assignee: Xidian University
Priority date: 2020-04-22
Filing date: 2020-04-22
Publication date: 2020-09-22

Abstract

The invention belongs to the technical field of reconfigurable computing and discloses an embedded reconfigurable heterogeneous determination method, system, storage medium, and processor. The computing power of multiple distributed embedded computing boards is pooled to build a task-driven reconfigurable heterogeneous computing platform; the tasks and heterogeneous computing resources of the cluster are managed in a unified manner by means of dynamic cluster construction, and a reconfigurable virtual computing environment is built using virtualization technology. The invention constructs a heterogeneous computing platform with self-organizing resource coordination, unified management capability, and a reconfigurable computing environment. The Web visualization module in the user interface layer provides an interactive interface for the user, and its security mechanism divides users into multiple levels to guarantee access and authorization for different kinds of users. A graphical interface is used to manage tasks, resources, and users, which reduces complexity for the user and provides task load balancing.

Description

Embedded reconfigurable heterogeneous determination method, system, storage medium and processor

Technical field

The invention belongs to the technical field of reconfigurable computing, and in particular relates to an embedded reconfigurable heterogeneous determination method, system, storage medium, and processor.

Background

To meet increasingly diverse computing needs, the traditional CPU-based general-purpose computing architecture can no longer satisfy the demand for high computing power from fields such as artificial intelligence. More and more scenarios introduce hardware such as GPUs and FPGAs for acceleration, and heterogeneous computing has emerged as a result. Heterogeneous computing refers to a computing system composed of computing units that use different instruction sets and architectures. It is a technology that balances performance, cost, and power consumption: based on the parallelism of the computing tasks, each task is assigned to the computing resource best suited to execute it, so that the total execution time is minimized and performance and cost are optimized. Special-purpose processor chips suffer from low integration, long design cycles, weak computing power, and poor flexibility, so they cannot meet the needs of new technologies. Reconfigurable computing allows the hardware structure of a computer to be upgraded or modified so that it can better satisfy flexible and changing task requirements. Technologies related to reconfigurable computing are developing rapidly and show a trend toward, and a technical requirement for, architecture convergence, resource sharing, and collaborative processing. However, a single computing node has a limited payload and limited computing power, so task processing easily reaches the performance limit of the payload, which reduces task processing efficiency. It is therefore necessary to interconnect independent payloads in an effective way to form a computing platform of larger scale and higher computing power, and thereby relieve the payload computing bottleneck.

Because the concept of reconfigurable computing was proposed at the hardware level, to address the long development cycle and poor flexibility of application-specific integrated circuits, most research still focuses on reconfigurable logic devices and on improving system hardware performance. Less attention is paid to how the platform-level architecture, the system management software, and the scheduling strategy can improve system performance, so the gains available from system-level management and scheduling and from reconfigurable computing at the software level are overlooked. For reconfigurable computing to realize its efficiency advantage, reconfigurable hardware is the foundation, the platform architecture is the key, the theory and technology of self-organizing coordination and dynamic self-evolution of resources, communication, and architecture are the core, and fast information storage and retrieval is an effective way to improve platform efficiency. These topics therefore warrant in-depth research. In addition, in a task-driven model, the process from task access to task completion involves automatic fault detection, automatic switching of computing nodes, and automatic task recovery. High availability of the computing platform is necessary to keep tasks executing continuously and efficiently. There is therefore an urgent need for a task-driven, highly available reconfigurable heterogeneous computing platform in which heterogeneous resources are coordinated in a self-organizing way and managed in a unified manner.

Based on the above analysis, the problems and defects of the prior art are:

(1) A single computing node has a limited payload and limited computing power, so task processing easily reaches the performance limit of the payload, which reduces task processing efficiency.

(2) Most current research on reconfigurable computing still focuses on improving system hardware performance, and less attention is paid to how the platform-level architecture, the system management software, and the scheduling strategy can improve system performance.

The difficulty of solving the above problems and defects is as follows:

An embedded computing platform built from embedded computing nodes is physically distributed and heterogeneous. Using inter-node communication to achieve self-organizing coordination and unified management of resources is the basis of cluster construction. The computing platform should no longer depend strongly on the master control node, and keeping the platform highly available when the master control node fails is the key to platform construction. Task execution on a single computing node carries the risk of a single point of failure, and guaranteeing highly available task execution at the platform level is the focus of the research.

The significance of solving the above problems and defects is as follows:

Once these problems and defects are solved, multiple embedded computing nodes can be cascaded by means of communication technology to achieve self-organizing coordination and unified management of heterogeneous resources. The cluster no longer depends strongly on the master control node, and tasks on a failed node are migrated automatically. The result is a "centerless", high-performance, highly available embedded computing platform that provides a task-driven, reconfigurable computing environment.

Summary of the invention

In view of the problems in the prior art, the present invention provides an embedded reconfigurable heterogeneous determination method, system, storage medium, and processor.

The present invention is realized as follows. An embedded reconfigurable heterogeneous determination method pools the computing power of multiple distributed embedded computing boards that carry various heterogeneous computing resources, and builds a task-driven reconfigurable heterogeneous computing platform. The tasks and heterogeneous computing resources of the cluster are managed in a unified manner by means of dynamic cluster construction, and a reconfigurable virtual computing environment is built using virtualization technology.

Further, the embedded reconfigurable heterogeneous determination method accepts task input from the user, provides a Web-based visual interactive page, divides the nodes into multiple independent clusters, and builds a proxy layer above the clusters to provide load balancing.

Among the proxy nodes, multiple nodes share one virtual IP based on the VRRP protocol. When the Master node to which the virtual IP is bound fails, the virtual IP drifts to a Slave node, and that Slave node is promoted to be the new Master and continues to provide task access and load balancing.
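The virtual IP failover described here can be illustrated with a small simulation. This is only a sketch under stated assumptions: it models the priority-based ownership of the virtual IP and nothing else (no advertisement timers, no network traffic), and the names `ProxyNode` and `vip_owner`, the priorities, and the address `192.0.2.10` (taken from a documentation address range) are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ProxyNode:
    name: str
    priority: int       # assumption: the live node with the highest priority holds the VIP
    alive: bool = True


def vip_owner(nodes: List[ProxyNode]) -> Optional[ProxyNode]:
    """Return the live proxy node that should hold the virtual IP, or None."""
    live = [n for n in nodes if n.alive]
    return max(live, key=lambda n: n.priority) if live else None


VIP = "192.0.2.10"  # hypothetical virtual IP
proxies = [ProxyNode("proxy-a", 150), ProxyNode("proxy-b", 100), ProxyNode("proxy-c", 50)]

print(VIP, "is held by", vip_owner(proxies).name)   # Master before the failure
proxies[0].alive = False                            # the Master fails
print(VIP, "is held by", vip_owner(proxies).name)   # a Slave is promoted
```

Clients keep using the same address throughout; only the node answering for it changes.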

Further, the embedded reconfigurable heterogeneous determination method builds the cluster through a dynamic cluster building module, builds the task pool and the resource pool of the cluster through a task pool building module and a resource pool building module, and matches tasks with resources through a virtual computing environment building module to generate the virtual computing environment in which a task executes.

Further, the dynamic cluster construction of the embedded reconfigurable heterogeneous determination method includes heartbeat detection, database consistency, and a dynamic center election strategy.

Heartbeat detection checks the liveness of every node under a master-slave service mode in which all nodes have equal status, adds new computing nodes to the cluster, and removes failed nodes. When a computing node is removed, the heartbeat detection of the system automatically detects the fault information of the removed development board. After the removal is complete, the system resource records in the database are synchronized through the database consistency strategy.
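A minimal sketch of timeout-based heartbeat detection follows. The patent does not state how liveness is decided, so the fixed timeout, the class name `HeartbeatMonitor`, and the methods `beat` and `reap` are assumptions made for illustration. Timestamps are passed in explicitly so that the example is deterministic.

```python
from typing import Dict, List


class HeartbeatMonitor:
    """Tracks the last heartbeat of each node (hypothetical helper)."""

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout
        self.last_seen: Dict[str, float] = {}

    def beat(self, node: str, now: float) -> bool:
        """Record a heartbeat; return True if the node is new to the cluster."""
        is_new = node not in self.last_seen
        self.last_seen[node] = now
        return is_new

    def reap(self, now: float) -> List[str]:
        """Remove and return the nodes whose last heartbeat is older than the timeout."""
        dead = [n for n, t in self.last_seen.items() if now - t > self.timeout]
        for n in dead:
            del self.last_seen[n]
        return sorted(dead)


monitor = HeartbeatMonitor(timeout=3.0)
monitor.beat("node-1", now=0.0)      # node-1 joins
monitor.beat("node-2", now=0.0)      # node-2 joins
monitor.beat("node-1", now=2.0)      # only node-1 keeps reporting
print(monitor.reap(now=4.0))         # node-2 has been silent for longer than the timeout
```

In the platform described here, the nodes returned by `reap` would then be cleared from the resource records and the change propagated through the database consistency strategy.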

The database consistency strategy synchronizes the data in the databases so that every computing node knows the task and resource configuration of the cluster. When the master node fails, a new master node is elected using the dynamic center election strategy and the data in the databases is synchronized.

Dynamic center election selects the master control node of the cluster dynamically. Building on the database consistency module, an election strategy chooses the master node dynamically, so that the cluster has no fixed center. The elected master node is responsible for dispatching tasks: it distributes each task and its configuration information to a computing node according to a distribution strategy. When a computing node fails, the master node handles the failover and re-dispatches the tasks that were on the failed node.
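The election strategy itself is not specified in the text. The sketch below assumes the simplest deterministic rule, namely that the live node with the smallest ID becomes master, purely to show how a cluster can re-elect without a fixed center. `elect_master` and `Cluster` are hypothetical names.

```python
from typing import Iterable, Optional, Set


def elect_master(live_nodes: Iterable[int]) -> Optional[int]:
    """Assumed rule: the live node with the smallest ID wins the election."""
    return min(live_nodes, default=None)


class Cluster:
    def __init__(self, node_ids: Iterable[int]) -> None:
        self.live: Set[int] = set(node_ids)
        self.master: Optional[int] = elect_master(self.live)

    def fail(self, node_id: int) -> None:
        """Remove a failed node and re-elect only if it was the master."""
        self.live.discard(node_id)
        if node_id == self.master:
            self.master = elect_master(self.live)


cluster = Cluster([3, 1, 2])
print("master:", cluster.master)
cluster.fail(1)                      # the current master fails
print("master:", cluster.master)     # a surviving node takes over
```

Because every node evaluates the same rule over the same synchronized membership data, all survivors reach the same result without a coordinator, which is why the election builds on database consistency.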

Further, in the embedded reconfigurable heterogeneous determination method, when a task X arrives, the master node allocates it according to the resource state of the computing nodes and sends the task and its configuration information to a computing node i that meets the requirements. If resources are insufficient, the task waits in a queue. After the resources are allocated successfully, the database of the master node is updated and the operation is then synchronized to the other databases. When computing node i fails, heartbeat detection discovers the failure and clears the resource information of node i from the database. After the databases are synchronized through the database consistency strategy, the master node learns that node i has failed and determines that task X, which was executing on node i, has failed. The master node then dispatches task X again according to information such as the resource state of the surviving nodes in the cluster, and task X is sent to node j for re-execution.
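The allocation and failover flow for task X can be sketched as follows. This is an illustrative model, not the patented scheduler: resources are reduced to a single integer per node, the placement rule (the node with the most free units, ties broken by name) is an assumption, and task completion and queue draining are omitted. The class `Master` and its methods are hypothetical.

```python
from collections import deque
from typing import Deque, Dict, List, Optional


class Master:
    def __init__(self, capacity: Dict[str, int]) -> None:
        self.free = dict(capacity)               # node -> free resource units
        self.demand: Dict[str, int] = {}         # task -> units requested
        self.placement: Dict[str, str] = {}      # task -> node executing it
        self.queue: Deque[str] = deque()         # tasks waiting for resources

    def submit(self, task: str, units: int) -> Optional[str]:
        """Place a task on a node, or queue it if no node has enough resources."""
        self.demand[task] = units
        return self._place(task)

    def _place(self, task: str) -> Optional[str]:
        units = self.demand[task]
        candidates = [n for n, f in self.free.items() if f >= units]
        if not candidates:
            self.queue.append(task)
            return None
        node = min(candidates, key=lambda n: (-self.free[n], n))
        self.free[node] -= units
        self.placement[task] = node
        return node

    def node_failed(self, node: str) -> List[str]:
        """Clear the failed node's resources and re-dispatch its tasks."""
        self.free.pop(node, None)
        lost = [t for t, n in self.placement.items() if n == node]
        for task in lost:
            del self.placement[task]
            self._place(task)
        return lost


master = Master({"i": 4, "j": 4})
print("X runs on", master.submit("X", units=3))
master.node_failed("i")
print("X re-dispatched to", master.placement["X"])
```

In the described platform, each change to `free` and `placement` would also be written to the master's database and synchronized to the other nodes.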

Further, when an embedded computing board starts, the resource pool building module of the embedded reconfigurable heterogeneous determination method scans the resource configuration, obtains the information of the registered devices, and performs a health check on each device. The information on available resources is stored in the resource list of the database, which implements resource discovery and availability detection, and the resource pool of the cluster is built with the help of the database consistency strategy.
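Resource discovery followed by availability detection amounts to a scan and a filter. The sketch below assumes that a board's registered devices are available as simple records and that the health check is a caller-supplied predicate; the field names (`id`, `type`, `status`) and the functions `build_resource_list` and `merge_into_pool` are hypothetical.

```python
from typing import Callable, Dict, Iterable, List

Device = Dict[str, str]


def build_resource_list(registered: Iterable[Device],
                        is_healthy: Callable[[Device], bool]) -> List[Device]:
    """Keep only the registered devices that pass the health check."""
    return [dev for dev in registered if is_healthy(dev)]


def merge_into_pool(pool: Dict[str, List[Device]], node: str,
                    resources: List[Device]) -> None:
    """Record one board's usable resources in the cluster-wide pool."""
    pool[node] = resources


# Hypothetical scan result of one embedded computing board at start-up.
scanned = [
    {"id": "cpu0", "type": "CPU", "status": "ok"},
    {"id": "fpga0", "type": "FPGA", "status": "ok"},
    {"id": "dsp0", "type": "DSP", "status": "fault"},
]

pool: Dict[str, List[Device]] = {}
usable = build_resource_list(scanned, lambda dev: dev["status"] == "ok")
merge_into_pool(pool, "board-1", usable)
print([dev["id"] for dev in pool["board-1"]])
```

Here `pool` stands in for the resource list stored in the database; replicating it to the other nodes is the job of the database consistency strategy.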

Virtual computing environment construction matches a task with the resources it requires. According to the requirements of the task, the master node builds the configuration information for the task execution environment and sends it to the computing node together with the task, and the computing node combines the task with the configured environment to build the virtual computing environment. The application and its runtime environment are packaged as a Docker image and uploaded to a Docker registry, and the virtual computing environment is built with Docker, which starts quickly, on the order of seconds. The computing node pulls the image from the Docker registry according to the configuration information sent by the master node and builds the virtual computing environment for task execution, so that application installation and environment configuration are completed automatically.
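As a rough illustration, a computing node could turn the configuration received from the master node into Docker command lines. The sketch only builds the argument lists and does not execute them. The `EnvConfig` fields, the registry host `registry.example.com`, the image name, and the device path `/dev/xdma0` are hypothetical; the `docker pull` and `docker run` options used (`--rm`, `--name`, `--cpus`, `--memory`, `--device`) are standard Docker CLI flags.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EnvConfig:
    """Per-task environment description sent by the master node (assumed shape)."""
    task_id: str
    image: str
    cpus: float = 1.0
    memory: str = "256m"
    devices: List[str] = field(default_factory=list)


def docker_commands(cfg: EnvConfig) -> List[List[str]]:
    """Return the pull command and the run command for one task."""
    run = ["docker", "run", "--rm", "--name", f"task-{cfg.task_id}",
           "--cpus", str(cfg.cpus), "--memory", cfg.memory]
    for dev in cfg.devices:
        run += ["--device", dev]     # expose an accelerator device to the container
    run.append(cfg.image)
    return [["docker", "pull", cfg.image], run]


cfg = EnvConfig(task_id="X", image="registry.example.com/tasks/fft:1.0",
                cpus=2.0, devices=["/dev/xdma0"])
for cmd in docker_commands(cfg):
    print(" ".join(cmd))
```

Passing each list to `subprocess.run(cmd, check=True)` on the computing node would pull the image and start the container; varying the limits and devices per task is what makes the environment reconfigurable.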

Further, the hardware resources of the embedded reconfigurable heterogeneous determination method adopt a bus-component architecture, in which the controller boards carrying heterogeneous computing resources communicate over a network through the standard interface defined by the message bus. Heterogeneous resources on the embedded computing boards are accessed in a unified way and the boards are interconnected by a network, which provides a standard, scalable high-speed system bus and unified component encapsulation and access for heterogeneous resources.

Another object of the present invention is to provide a program storage medium that receives user input, the stored computer program causing an electronic device to execute the method of any one of the claims, including the following steps: pooling the computing power of multiple distributed embedded computing boards that carry various heterogeneous computing resources to build a task-driven reconfigurable heterogeneous computing platform; and managing the tasks and heterogeneous computing resources of the cluster in a unified manner by means of dynamic cluster construction, and building a reconfigurable virtual computing environment using virtualization technology.

Another object of the present invention is to provide an embedded reconfigurable heterogeneous determination system that implements the embedded reconfigurable heterogeneous determination method. The system comprises:

a user interface layer, which provides Web-based visual task access and load balancing;

a system middleware layer, which builds the cluster and manages heterogeneous computing resources in a unified manner;

a hardware layer, which provides unified access to heterogeneous resources on the embedded computing boards and networked interconnection between boards.

The user interface layer includes:

a Web visualization module, which provides a Web-based visual interface for managing tasks, resources, and users, and guarantees access and authorization;

a task access module, which provides a unified access address, uses proxies to keep the unified access point highly available, and distributes tasks among the clusters according to a load balancing strategy.

The system middleware layer includes a dynamic cluster building module, a task pool building module, a resource pool building module, and a virtual computing environment building module:

the dynamic cluster building module, which includes heartbeat detection, database consistency, and a dynamic center election strategy, builds and manages the cluster and schedules tasks and resources in a unified manner;

the task pool building module builds the task pool of the cluster;

the resource pool building module implements discovery and health checking of heterogeneous resources.

Another object of the present invention is to provide a processor that carries the embedded reconfigurable heterogeneous determination system.

Taking all of the above technical solutions together, the advantages and positive effects of the present invention are as follows. The invention provides a task-driven reconfigurable heterogeneous computing platform on an embedded platform, achieves self-organizing coordination and unified management of heterogeneous resources, builds a reconfigurable virtual computing environment for tasks, and guarantees high availability of task access, task dispatch, and task execution. The invention pools the computing power of distributed embedded nodes and constructs a heterogeneous computing platform with self-organizing resource coordination, unified management capability, and a reconfigurable computing environment. The Web visualization module in the user interface layer provides an interactive interface for users, and its security mechanism divides users into multiple levels to guarantee access and authorization for different kinds of users. A graphical interface is used to manage tasks, resources, and users, which reduces complexity for the user and provides task load balancing.

The invention achieves continuously highly available task access. In the user interface layer, a proxy cluster is built above the clusters, and the nodes of the proxy cluster share one virtual IP (VIP) based on the VRRP protocol. When the node to which the VIP is bound fails, the VIP automatically drifts to a new proxy node. Users are thus given a unified, highly available access interface, and task access remains highly available.

The invention achieves continuously highly available task dispatch. It adopts a centerless technique of dynamically electing the master control node, which keeps the platform highly available for task dispatch. When the current master node of the cluster fails, it is removed from the cluster through heartbeat detection and the database consistency strategy. After the databases are synchronized, a new master node is elected through the dynamic center election strategy and continues resource allocation and task dispatch. The cluster therefore has no fixed center: it no longer relies on a static master control node, a master node crash does not bring down the whole cluster, and both the master role and task dispatch remain highly available.

The invention achieves continuously highly available task execution. When a computing node fails, it is removed from the cluster through heartbeat detection and the database consistency strategy. All tasks that were executing on that node are marked as failed, and the master node reallocates resources and dispatches them again. This implements task migration and keeps task execution highly available.

The invention achieves a dynamically reconfigurable task execution environment. On the basis of virtualization technology, every task is given an independent virtual computing environment whose computing resources change dynamically to fit the needs of the task. The execution environment is thus customized per task, that is, dynamically reconfigurable.

The invention achieves self-organizing coordination and unified management of heterogeneous resources. When a new computing node joins the cluster or a node fails (leaves the cluster), heartbeat detection determines the liveness of the node, and the resource discovery and availability detection of the resource pool building module add or remove heterogeneous resources dynamically. The database consistency strategy synchronizes the resource state of the cluster. Computing nodes can therefore be hot-plugged, and heterogeneous resources are attached and removed adaptively and then managed by the cluster in a unified manner.

Description of drawings

To explain the technical solutions of the embodiments of the present application more clearly, the drawings used in the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present application; a person of ordinary skill in the art can obtain other drawings from them without creative effort.

FIG. 1 is a flowchart of the embedded reconfigurable heterogeneous determination method provided by an embodiment of the present invention.

FIG. 2 is a schematic structural diagram of the embedded reconfigurable heterogeneous determination system provided by an embodiment of the present invention.

In the figure: 1. user interface layer; 2. system middleware layer; 3. hardware layer.

FIG. 3 is a schematic architecture diagram of the embedded reconfigurable heterogeneous determination system provided by an embodiment of the present invention.

FIG. 4 is a schematic structural diagram of the user interface layer proxy provided by an embodiment of the present invention.

FIG. 5 is a schematic diagram of the dynamic center implementation provided by an embodiment of the present invention.

FIG. 6 is a schematic diagram of task dispatch and failover provided by an embodiment of the present invention.

FIG. 7 is a schematic diagram of virtual computing environment construction provided by an embodiment of the present invention.

FIG. 8 is a schematic diagram of the interconnection of heterogeneous computing resource boards provided by an embodiment of the present invention.

Detailed description

To make the objectives, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention and do not limit it.

In view of the problems in the prior art, the present invention provides an embedded reconfigurable heterogeneous determination method, system, storage medium, and processor. The invention is described in detail below with reference to the accompanying drawings.

As shown in FIG. 1, the embedded reconfigurable heterogeneous determination method provided by the present invention comprises the following steps:

S101: pool the computing power of multiple distributed embedded computing boards that carry various heterogeneous computing resources (such as CPUs, GPUs, FPGAs, and DSPs) to build a task-driven reconfigurable heterogeneous computing platform;

S102: manage the tasks and heterogeneous computing resources of the cluster in a unified manner by means of dynamic cluster construction, and build a reconfigurable virtual computing environment using virtualization technology.

As shown in FIG. 2, the embedded reconfigurable heterogeneous determination system provided by the present invention includes:

User interface layer 1, which provides Web-based visual task access and load balancing.

System middleware layer 2, which builds the cluster and manages heterogeneous computing resources in a unified manner.

Hardware layer 3, which provides unified access to heterogeneous resources on the embedded computing boards and networked interconnection between boards.

In the present invention, user interface layer 1 includes:

A Web visualization module, which provides users with a Web-based visual interface for managing tasks, resources, and users, and guarantees access and authorization for different kinds of users.

A task access module, which provides users with a unified access address, uses proxies to keep the unified access point highly available, and distributes tasks among the clusters according to a load balancing strategy.

The Web visualization module provides users with a visual interactive interface for management. It also divides user identities into multiple levels based on a security authentication and authorization mechanism and provides each kind of user with the corresponding management services.
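One simple way to model multi-level user authorization is an ordered set of levels with a minimum level per action. The text does not name the levels or the actions, so `Level`, `REQUIRED`, and `is_allowed` below, along with the three example levels, are assumptions for illustration only.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical user levels, ordered from least to most privileged."""
    VIEWER = 1
    OPERATOR = 2
    ADMIN = 3


# Hypothetical mapping from management action to the minimum level required.
REQUIRED = {
    "view_tasks": Level.VIEWER,
    "submit_task": Level.OPERATOR,
    "manage_users": Level.ADMIN,
}


def is_allowed(level: Level, action: str) -> bool:
    """Allow an action only if it is known and the user's level is high enough."""
    required = REQUIRED.get(action)
    return required is not None and level >= required


print(is_allowed(Level.OPERATOR, "submit_task"))
print(is_allowed(Level.OPERATOR, "manage_users"))
```

Unknown actions are denied by default, which keeps the check safe when new management functions are added before their required level is defined.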

The task access module receives task requests from users. It is a proxy cluster built from multiple computing nodes. Based on the Virtual Router Redundancy Protocol (VRRP), the nodes of the proxy cluster share one virtual IP (VIP). When the node to which the VIP is bound fails, the VIP automatically drifts to a new proxy node, which gives users a unified, highly available access interface and keeps task access highly available. The proxy cluster also applies a load balancing strategy to distribute tasks to the nodes of multiple clusters, which increases task concurrency and throughput and improves resource utilization.
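The load balancing strategy is left open in the text. As one possible choice, the sketch below sends each incoming task to the cluster that currently has the fewest tasks, breaking ties by cluster name. `pick_cluster` and `dispatch` are hypothetical names, and least-loaded selection is an assumption rather than the strategy claimed by the patent.

```python
from typing import Dict, Iterable


def pick_cluster(load: Dict[str, int]) -> str:
    """Choose the cluster with the fewest running tasks (ties broken by name)."""
    return min(load, key=lambda cluster: (load[cluster], cluster))


def dispatch(tasks: Iterable[str], load: Dict[str, int]) -> Dict[str, str]:
    """Assign each task to a cluster and update the load counters in place."""
    assignment: Dict[str, str] = {}
    for task in tasks:
        cluster = pick_cluster(load)
        load[cluster] += 1
        assignment[task] = cluster
    return assignment


load = {"cluster-A": 0, "cluster-B": 0}
print(dispatch(["t1", "t2", "t3"], load))
print(load)
```

A weighted variant that accounts for each cluster's free heterogeneous resources would follow the same pattern with a different key function.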

在本发明中，系统中间件层2，包括动态集群构建模块、任务池构建模块、资源池构建模块和虚拟计算环境构建模块。In the present invention, the system middleware layer 2 includes a dynamic cluster building module, a task pool building module, a resource pool building module and a virtual computing environment building module.

动态集群构建模块，包括心跳检测、数据库一致性和动态中心选举策略，用于构建和管理集群，实现任务、资源的统一调度。Dynamic cluster building modules, including heartbeat detection, database consistency, and dynamic center election strategy, are used to build and manage clusters to achieve unified scheduling of tasks and resources.

心跳检测实现集群中节点健康状态的检测，在节点地位对等的主从服务方式下，对各个节点的进行存活检测，将新的计算节点加入集群，对故障节点进行删除。当删除计算节点时，通过数据库一致性策略同步数据库中系统资源的内容，从而保证任务不会下发到已经故障的计算节点上；Heartbeat detection implements the detection of the health status of nodes in the cluster. In the master-slave service mode with equal node status, the survival detection of each node is performed, new computing nodes are added to the cluster, and faulty nodes are deleted. When a computing node is deleted, the content of the system resources in the database is synchronized through the database consistency policy, so as to ensure that the task will not be sent to the failed computing node;

Database consistency keeps the data in the databases synchronized, so that every computing node knows the cluster's task and resource configuration. When the master node fails, the dynamic center election strategy elects a new master node and the database data is synchronized, which solves the failover problem. Because the databases are consistent, any node in the cluster can become the new master node when the master control node fails, which keeps the system highly available.

Dynamic center election selects the cluster's master control node dynamically. On the basis of database consistency, an election strategy chooses the master node at run time, which allows the cluster to be built dynamically and keeps it highly available. The elected master node is responsible for dispatching tasks: it distributes each task, together with its configuration information, to a computing node according to a distribution strategy. When a computing node fails, the master node handles failover and re-dispatches the tasks that were running on the failed node.

The task pool building module builds the cluster's task pool.

The resource pool building module comprises resource discovery and availability detection, which together discover heterogeneous resources and check their health. Resource discovery runs when an embedded computing board starts: it scans the resource configuration, obtains the registered device information, and collects the heterogeneous computing resource information. Availability detection health-checks the discovered heterogeneous resources and stores the available resource information in the resource list of the database.

The hardware layer 3 provides unified access to the heterogeneous resources on the embedded computing boards and networked interconnection between boards. It implements a standard, scalable high-speed system bus and unified component encapsulation and access for heterogeneous resources, so that heterogeneous resources can be managed and can communicate through a unified calling interface and protocol. This gives the system hardware support for high scalability and for componentized services built on heterogeneous resources.

The technical solutions of the present invention are further described below with reference to the accompanying drawings.

The embedded reconfigurable heterogeneous measurement system provided by the present invention includes:

User interface layer 1. As shown in Figure 4, the user interface layer accepts the user's task input and provides a Web visual interaction page. The nodes are also divided into multiple independent clusters, and a proxy is deployed above the clusters to provide load balancing. This keeps the databases of different clusters relatively independent, similar in effect to cluster sharding. Within a single cluster, task concurrency depends on the resource allocation performed by that cluster's one master node; with multiple clusters, concurrent tasks are spread across several master nodes, which raises task concurrency and throughput. Resource utilization improves as well.
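The load balancing performed by the proxy can be illustrated with a minimal sketch. The text does not name a specific load-balancing strategy, so the least-pending-tasks rule below, the function names `pick_cluster` and `dispatch`, and the cluster names are all assumptions made for illustration.

```python
def pick_cluster(pending_by_cluster):
    """Return the cluster with the fewest pending tasks (ties broken by name)."""
    return min(sorted(pending_by_cluster), key=lambda c: pending_by_cluster[c])


def dispatch(tasks, pending_by_cluster):
    """Assign each task to the currently least-loaded cluster (assumed strategy)."""
    placement = {}
    for task in tasks:
        cluster = pick_cluster(pending_by_cluster)
        pending_by_cluster[cluster] += 1   # the chosen cluster now has one more task
        placement[task] = cluster
    return placement


# Hypothetical starting load: cluster-a already has two pending tasks.
load = {"cluster-a": 2, "cluster-b": 0}
placement = dispatch(["t1", "t2", "t3"], load)
```

Each cluster's own master node would then place the task on a computing node; the sketch only covers the choice of cluster made at the proxy.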

Among the proxy nodes, VRRP lets multiple nodes share one virtual IP (VIP). When the Master node bound to the VIP fails, the VIP floats to a Slave node, which is promoted to the new Master and continues to provide task access and load balancing. Users therefore see only one unified access interface, which keeps the proxy nodes, and in turn task access, highly available.
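The VIP failover behaviour can be pictured with a small simulation. This is not an implementation of the VRRP wire protocol (advertisements and timers are omitted); it only models the outcome that the alive node with the highest priority holds the VIP. The class `ProxyNode`, the function `vip_owner`, and the node names and priorities are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ProxyNode:
    name: str
    priority: int      # higher value wins, mirroring the VRRP priority idea
    alive: bool = True


def vip_owner(nodes):
    """Return the name of the alive node with the highest priority, or None."""
    candidates = [n for n in nodes if n.alive]
    if not candidates:
        return None
    return max(candidates, key=lambda n: n.priority).name


proxies = [ProxyNode("proxy-a", 150), ProxyNode("proxy-b", 100), ProxyNode("proxy-c", 50)]
owner_before = vip_owner(proxies)   # proxy-a is the Master and holds the VIP
proxies[0].alive = False            # the current Master fails
owner_after = vip_owner(proxies)    # the VIP floats to the next Slave, proxy-b
```

From the user's side nothing changes across the failure: requests keep going to the same VIP.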

System middleware layer 2. It builds the cluster through the dynamic cluster building module, builds the cluster's task pool and resource pool through the task pool building module and the resource pool building module, and matches tasks to resources through the virtual computing environment building module to generate the virtual computing environment in which each task runs.

The dynamic cluster building module consists of heartbeat detection, database consistency, and a dynamic center election strategy.

Heartbeat detection handles the dynamic joining and removal of nodes in the cluster. In a master-slave service mode in which all nodes have equal status, it checks each node for liveness, adds new computing nodes to the cluster, and removes failed nodes; it is the basis for self-organizing resource coordination and unified management. When a computing node is removed, the system's heartbeat detection automatically detects the failure of the removed board. After the removal completes, the database consistency strategy synchronizes the system resource records in the database, so that tasks are not dispatched to a computing node that has already failed.
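A timeout-based liveness check is one common way to realize heartbeat detection, and a minimal sketch is given here. The timeout value, the class `HeartbeatMonitor`, and its methods are assumptions; the text does not specify the heartbeat interval or message format, and the database synchronization step is left out.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat time per node and evicts silent nodes."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.last_seen = {}          # node id -> time of last heartbeat

    def beat(self, node_id, now):
        """Record a heartbeat; an unknown node id joins the cluster."""
        is_new = node_id not in self.last_seen
        self.last_seen[node_id] = now
        return is_new

    def sweep(self, now):
        """Remove and return nodes whose last heartbeat is older than the timeout."""
        dead = sorted(n for n, t in self.last_seen.items() if now - t > self.timeout_s)
        for n in dead:
            del self.last_seen[n]
        return dead


monitor = HeartbeatMonitor(timeout_s=3.0)
monitor.beat("board-1", now=0.0)
monitor.beat("board-2", now=0.0)
monitor.beat("board-1", now=2.5)        # board-2 stays silent from here on
removed = monitor.sweep(now=4.0)        # board-2 exceeds the 3.0 s timeout
```

In the described system, the nodes returned by `sweep` would be the ones whose resource records are cleared from the database before synchronization.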

Database consistency keeps the data in the databases synchronized, so that every computing node knows the cluster's task and resource configuration. When the master node fails, the dynamic center election strategy described below elects a new master node and the database data is synchronized, which solves the failover problem. Because the databases are consistent, any node in the cluster can become the new master node when the master control node fails, which keeps the system highly available.

The dynamic center election module selects the cluster's master control node dynamically, as shown in Figure 5. On the basis of database consistency, an election strategy chooses the master node at run time, so the cluster has no fixed center. If the master node fails and goes down, a new master node is elected to keep the cluster's services running. This removes the cluster's dependence on a static master node and keeps the cluster, and in turn task dispatch, highly available. The elected master node is responsible for dispatching tasks: it distributes each task, together with its configuration information, to a computing node according to a distribution strategy. When a computing node fails, the master node handles failover and re-dispatches the tasks that were running on the failed node.
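The text does not state which election strategy is used. As one possible illustration, the sketch below assumes a deterministic rule: keep the current master while it is alive, otherwise choose the alive node with the smallest identifier. Because every node holds the same synchronized view of the cluster, each node evaluating this rule reaches the same result. The function `elect_master` and the node names are hypothetical.

```python
def elect_master(alive_nodes, current_master=None):
    """Keep the current master while it is alive; otherwise pick the smallest id.

    The smallest-id rule is an assumed strategy, not one stated in the text.
    """
    if not alive_nodes:
        raise ValueError("no alive nodes to elect from")
    if current_master in alive_nodes:
        return current_master
    return min(alive_nodes)


alive = {"node-2", "node-3", "node-5"}
master = elect_master(alive)                              # initial election
alive.discard(master)                                     # the master goes down
new_master = elect_master(alive, current_master=master)   # a new master is elected
```

Keeping a healthy master in place avoids needless re-elections when other nodes join or leave.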

As shown in Figure 6, when a task X arrives, the master node allocates it according to the resource state of the computing nodes and sends the task and its configuration information to a computing node i that meets the requirements. If resources are insufficient, the task is queued. After the resources are allocated successfully, the master node's database is updated, and the operation is then synchronized to the other databases. As shown in Figure 7, when computing node i fails, heartbeat detection detects the failure and clears node i's resource information from the database. After the databases are synchronized through the database consistency strategy, the master node learns of node i's failure and determines that task X, which was running on node i, has failed. The master node then re-dispatches task X according to information such as the resource state of the surviving nodes in the cluster, and task X is sent to node j for re-execution.
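The allocation, queueing, and re-dispatch flow for task X can be sketched as follows. The sketch assumes that resources are counted in free CPU cores and that the first node with enough free cores is chosen; both are simplifications, since the text does not define the resource model or distribution strategy. Database updates and synchronization are omitted, and the class `Master` is hypothetical.

```python
from collections import deque


class Master:
    """Toy master node: dispatches tasks by free CPU cores and re-dispatches on failure."""

    def __init__(self, free_cpus):
        self.free = dict(free_cpus)   # node id -> free CPU cores
        self.running = {}             # task id -> (node id, cores)
        self.queue = deque()          # tasks waiting for resources

    def submit(self, task, cores):
        """Place the task on the first node with enough free cores, else queue it."""
        for node in sorted(self.free):
            if self.free[node] >= cores:
                self.free[node] -= cores
                self.running[task] = (node, cores)
                return node
        self.queue.append((task, cores))
        return None

    def node_failed(self, node):
        """Drop the node's resources and re-dispatch the tasks that ran on it."""
        self.free.pop(node, None)
        lost = [(t, c) for t, (n, c) in self.running.items() if n == node]
        for task, _ in lost:
            del self.running[task]
        return {task: self.submit(task, cores) for task, cores in lost}


m = Master({"i": 4, "j": 4})
first = m.submit("X", cores=3)        # task X lands on node i
moved = m.node_failed("i")            # node i fails; X is re-dispatched to node j
```

A further task that needs more cores than node j has left would return `None` from `submit` and wait in `queue`, matching the queueing behaviour described above.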

The task pool building module builds the cluster's task pool.

When an embedded computing board starts, the resource pool building module scans the resource configuration, obtains the registered device information, and health-checks the devices. It stores the available resource information in the resource list of the database, which accomplishes resource discovery and availability detection, and it relies on the database consistency strategy to build the cluster's resource pool.
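The discovery-then-health-check step can be sketched as a filter over the registered devices. The record fields (`id`, `kind`, `board`, `ok`), the board address, and the function `build_resource_list` are illustrative assumptions; the text does not describe the actual device records or the health-check procedure.

```python
def build_resource_list(registered_devices, is_healthy):
    """Return the devices that pass the health check, keyed by device id."""
    pool = {}
    for dev in registered_devices:
        if is_healthy(dev):
            pool[dev["id"]] = {"kind": dev["kind"], "board": dev["board"]}
    return pool


# Hypothetical scan result for one board; field names and values are illustrative.
devices = [
    {"id": "cpu0", "kind": "CPU", "board": "10.0.0.11", "ok": True},
    {"id": "fpga0", "kind": "FPGA", "board": "10.0.0.11", "ok": True},
    {"id": "gpu0", "kind": "GPU", "board": "10.0.0.11", "ok": False},
]
resource_list = build_resource_list(devices, is_healthy=lambda d: d["ok"])
```

Only the entries in `resource_list` would be written to the database and, after synchronization, become part of the cluster-wide resource pool.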

The virtual computing environment building module matches each task with the resources it requires. According to the task's requirements, the master node constructs the virtual computing environment for task execution, that is, the configuration information describing the task's execution environment, and sends it to the computing node together with the task. The computing node combines the task with this configuration to build the virtual computing environment. On this task-driven computing platform, the application and its runtime environment are packaged as a Docker image and uploaded to a Docker registry, and the virtual computing environment is built with Docker, which starts within seconds. As shown in Figure 7, the computing node pulls the image from the Docker registry according to the configuration information sent by the master node and builds the virtual computing environment for task execution, so that application installation and environment configuration are completed automatically.
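One way a computing node could turn the received configuration into a container is to translate it into `docker pull` and `docker run` command lines. The sketch below only constructs the argument lists, so it runs without Docker installed. The configuration keys, the registry address, and the function `docker_commands` are assumptions; `-d`, `--name`, `--cpus`, `--memory`, and `-e` are standard `docker run` options.

```python
import shlex


def docker_commands(config):
    """Translate a task configuration into `docker pull` and `docker run` argument lists."""
    image = config["image"]
    run = ["docker", "run", "-d", "--name", config["task_id"],
           "--cpus", str(config["cpus"]), "--memory", config["memory"]]
    for key, value in sorted(config.get("env", {}).items()):
        run += ["-e", f"{key}={value}"]   # pass task settings as environment variables
    run.append(image)
    return ["docker", "pull", image], run


# Hypothetical configuration sent by the master node together with task X.
task_config = {
    "task_id": "task-x",
    "image": "registry.example.local/demo/app:1.0",
    "cpus": 1.5,
    "memory": "512m",
    "env": {"MODE": "batch"},
}
pull_cmd, run_cmd = docker_commands(task_config)
run_line = shlex.join(run_cmd)
```

A computing node could hand `pull_cmd` and `run_cmd` to `subprocess.run`; the resource limits in the run command are what let several tasks share one board, each holding only the share it requested.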

In virtual computing environment construction, the execution environment is generated dynamically from the task and its configuration information; that is, the execution environment is task-driven and dynamically reconfigurable.

Hardware layer 3, whose structure is shown in Figure 8. The hardware resource module uses a "bus-component" architecture: controller boards carrying heterogeneous computing resources are interconnected over the network through the standard interface defined by the message bus. When a new computing board is connected, only the corresponding board IP needs to be configured in the system for the new resources to be discovered and located; removing a board follows the same procedure. The hardware layer provides unified access to the heterogeneous resources on the embedded computing boards and networked interconnection between boards. It implements a standard, scalable high-speed system bus and unified component encapsulation and access for heterogeneous resources, so that heterogeneous resources can be managed and can communicate through a unified calling interface and protocol. This gives the system hardware support for high scalability and for componentized services built on heterogeneous resources.

Container-based task allocation and deployment refines the granularity at which tasks occupy resources. The granularity is no longer a whole chip but the amount of resources a task actually needs, so multiple tasks can each occupy part of the same resource, which greatly improves resource utilization.

To compare CPU computing performance and resource occupancy when a task runs in Docker versus on the host (Host), tests were run on the Xilinx UltraScale+ MPSoC ZCU102 platform. Linpack measured CPU computing performance, with peak floating-point throughput as the metric; ApacheBenchmark stress-tested an Nginx service; and Nmon monitored resource occupancy.

(1) Linpack test

Assuming the CPU performs 1 floating-point operation per clock cycle, the theoretical CPU floating-point peak is 1.2 GHz × 1 × 4 cores = 4.8 Gflops.

The Linpack test results are shown in Table 1, where Max is the measured CPU floating-point peak and the maximum efficiency is defined as the ratio of Max to the theoretical CPU floating-point peak.

Table 1. Measured CPU floating-point peak and maximum efficiency in Host and Docker

| Environment | Max (Gflops) | Maximum efficiency (Max / theoretical peak) |
| --- | --- | --- |
| Host | 4.065e-01 | 8.47% |
| Docker (Alpine) | 5.6567e-01 | 11.78% |

Table 1 shows no obvious performance loss in Docker compared with Host.
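The efficiency figures in Table 1 follow directly from the stated definition and can be re-derived with a few lines of arithmetic. The function names below are illustrative; the clock rate, core count, and measured values are the ones given in the text.

```python
def theoretical_peak_gflops(clock_ghz, flops_per_cycle, cores):
    """Theoretical peak = clock rate x floating-point ops per cycle x core count."""
    return clock_ghz * flops_per_cycle * cores


def efficiency_percent(measured_gflops, peak_gflops):
    """Maximum efficiency as defined in the text: measured Max over theoretical peak."""
    return round(100.0 * measured_gflops / peak_gflops, 2)


peak = theoretical_peak_gflops(1.2, 1, 4)            # 4.8 Gflops
host_eff = efficiency_percent(4.065e-01, peak)       # Host row of Table 1
docker_eff = efficiency_percent(5.6567e-01, peak)    # Docker (Alpine) row of Table 1
```

The results reproduce the 8.47% and 11.78% entries of the table.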

(2) ApacheBenchmark test

Remote-access stress tests were run against the Nginx service on Host and in Docker, and the monitoring results are shown in Table 2. CPU(usr%+sys%) is the CPU occupancy in user space plus kernel space, and CPU-WAvg is the weighted average CPU occupancy.

Table 2. CPU resource occupancy in Host and Docker

| Environment | CPU(usr%+sys%) | CPU-WAvg |
| --- | --- | --- |
| Host | 25%+70% | 17.6% |
| Docker | 20%+50% | 11.0% |

Table 2 shows that the weighted average CPU occupancy in Docker is 6.6 percentage points lower than on Host (11.0% versus 17.6%); that is, Docker occupies fewer resources.

Taken together, the two tests show that, compared with Host, the Docker environment causes no obvious drop in computing performance while reducing resource occupancy. In addition, containerization improves task deployment efficiency and reduces operation and maintenance complexity.

It should be noted that the embodiments of the present invention may be implemented in hardware, in software, or in a combination of the two. The hardware portion may be implemented with dedicated logic; the software portion may be stored in memory and executed by a suitable instruction execution system, such as a microprocessor or specially designed hardware. Those of ordinary skill in the art will understand that the devices and methods described above may be implemented using computer-executable instructions and/or processor control code, provided for example on a carrier medium such as a disk, CD, or DVD-ROM, on programmable memory such as read-only memory (firmware), or on a data carrier such as an optical or electronic signal carrier. The device of the present invention and its modules may be implemented by hardware circuits such as very-large-scale integrated circuits or gate arrays, semiconductors such as logic chips and transistors, or programmable hardware devices such as field-programmable gate arrays and programmable logic devices; by software executed by various types of processors; or by a combination of such hardware circuits and software, for example firmware.

The above are only specific embodiments of the present invention, and the protection scope of the present invention is not limited to them. Any modification, equivalent replacement, or improvement made by a person skilled in the art within the technical scope disclosed by the present invention, and within its spirit and principles, shall fall within the protection scope of the present invention.

Claims

1. An embedded reconfigurable heterogeneous measurement method, characterized in that the method centralizes the computing power of multiple distributed embedded computing boards carrying various heterogeneous computing resources to build a task-driven reconfigurable heterogeneous computing platform; manages the cluster's tasks and heterogeneous computing resources in a unified manner by means of dynamic cluster construction; and builds a reconfigurable virtual computing environment using virtualization technology.

2. The embedded reconfigurable heterogeneous measurement method according to claim 1, wherein the method accepts a user's task input, provides a Web visual interaction page, divides the nodes into multiple independent clusters, and deploys a proxy above the clusters to provide load balancing;

among the proxy nodes, multiple nodes share one virtual IP based on the VRRP protocol; when the Master node bound to the virtual IP fails, the virtual IP floats to a Slave node, which becomes the new Master and continues to provide task access and load balancing.

3. The embedded reconfigurable heterogeneous measurement method according to claim 1, wherein the method builds the cluster through a dynamic cluster building module, builds the cluster's task pool and resource pool through a task pool building module and a resource pool building module, and matches tasks to resources through a virtual computing environment building module to generate a virtual computing environment for task execution.

4. The embedded reconfigurable heterogeneous measurement method according to claim 3, wherein the dynamic cluster construction comprises heartbeat detection, database consistency, and a dynamic center election strategy;

in a master-slave service mode in which all nodes have equal status, heartbeat detection checks each node for liveness, adds new computing nodes to the cluster, and removes failed nodes; when a computing node is removed, the system's heartbeat detection automatically detects the failure of the removed board; after the removal completes, the system resource records in the database are synchronized through the database consistency strategy;

database consistency keeps the data in the databases synchronized, so that every computing node knows the cluster's task and resource configuration information; when the master node fails, the dynamic center election strategy elects a new master node and the data in the databases is synchronized;

dynamic center election selects the cluster's master control node dynamically; on the basis of database consistency, an election strategy chooses the master node at run time, so that the cluster has no fixed center; the elected master node is responsible for dispatching tasks, distributing each task and its configuration information to a computing node according to a distribution strategy; when a computing node fails, the master node handles failover and re-dispatches the tasks on the failed node.

5. The embedded reconfigurable heterogeneous measurement method according to claim 3, wherein, when a task X arrives, the master node allocates the task according to the resource state of the computing nodes and sends the task and its configuration information to a computing node i that meets the requirements; if resources are insufficient, the task is queued; after the resources are allocated successfully, the master node's database is updated and the operation is then synchronized to the other databases; when computing node i fails, heartbeat detection detects the failure and clears node i's resource information from the database; after the databases are synchronized through the database consistency strategy, the master node learns of node i's failure, determines that task X running on node i has failed, and re-dispatches task X according to information such as the resource state of the surviving nodes in the cluster, so that task X is sent to node j for re-execution.

6. The embedded reconfigurable heterogeneous measurement method according to claim 3, wherein the resource pool is built as follows: when an embedded computing board starts, the resource configuration is scanned, the registered device information is obtained and the devices are health-checked, and the available resource information is stored in the resource list of the database, thereby realizing resource discovery and availability detection; the cluster's resource pool is built with the help of the database consistency strategy;

the virtual computing environment is built by matching each task with the resources it requires; according to the task's requirements, the master node constructs the configuration information describing the task's execution environment and sends it to the computing node together with the task, and the computing node combines the task with this configuration to build the virtual computing environment; the application and its runtime environment are packaged as a Docker image and uploaded to a Docker registry, and the virtual computing environment is built with Docker, which starts within seconds; the computing node pulls the image from the Docker registry according to the configuration information sent by the master node and builds the virtual computing environment for task execution, so that application installation and environment configuration are completed automatically.

7. The embedded reconfigurable heterogeneous measurement method according to claim 1, wherein the hardware resources adopt a bus-component architecture, in which the controller boards carrying heterogeneous computing resources are interconnected over the network and communicate through the standard interface defined by the message bus; the heterogeneous resources on the embedded computing boards are accessed in a unified manner and the boards are networked, realizing a standard, scalable high-speed system bus and unified component encapsulation and access for heterogeneous resources.

8. A program storage medium for receiving user input, wherein the stored computer program causes an electronic device to perform any one of the following steps: centralizing the computing power of multiple distributed embedded computing boards carrying various heterogeneous computing resources to build a task-driven reconfigurable heterogeneous computing platform; managing the cluster's tasks and heterogeneous computing resources in a unified manner by means of dynamic cluster construction; and building a reconfigurable virtual computing environment using virtualization technology.

9. An embedded reconfigurable heterogeneous measurement system for implementing the embedded reconfigurable heterogeneous measurement method according to any one of claims 1 to 7, wherein the embedded reconfigurable heterogeneous measurement system includes:

The user interface layer is used to provide the task access method of Web visualization and provide the function of load balancing;

The system middleware layer is used for cluster construction and unified management of heterogeneous computing resources;

The hardware layer is used for unified access of heterogeneous resources on embedded computing boards and network interconnection between boards;

The user interface layer includes:

Web visualization module, which provides Web visualization interface to manage tasks, resources and users, and provides guarantee for access and authorization;

The task access module provides a unified access address, uses a proxy to keep that unified access highly available, and distributes tasks among clusters according to the load-balancing strategy;

The system middleware layer includes a dynamic cluster building module, a task pool building module, a resource pool building module and a virtual computing environment building module;

The dynamic cluster building module, which comprises heartbeat detection, database consistency, and a dynamic center election strategy, is used to build and manage the cluster and to provide unified scheduling of tasks and resources;

The task pool building module is used to realize the construction of the cluster task pool;

The resource pool building module is used to discover heterogeneous resources and check their health.

10. A processor, wherein the processor is equipped with the embedded reconfigurable heterogeneous measurement system according to claim 9.