JP6740210B2

JP6740210B2 - System and method for parallel processing using dynamically configurable proactive co-processing cells

Info

Publication number: JP6740210B2
Application number: JP2017503021A
Authority: JP
Inventors: Alfonso Iniguez
Original assignee: Alfonso Iniguez
Priority date: 2014-07-24
Filing date: 2015-07-10
Publication date: 2020-08-12
Anticipated expiration: 2035-07-10
Also published as: JP2017521796A

Description

(Priority data)
This application is a continuation of US Application No. 13/750,696, filed January 25, 2013, which is incorporated herein by reference.

The present invention relates generally to parallel-process computing, and more particularly to a processing architecture with autonomous coprocessors configured to proactively retrieve tasks from a task pool populated by a central processing unit.

The Internet of Things (also called the Cloud of Things) refers to an ad hoc network of uniquely identifiable embedded computing devices within the existing Internet infrastructure. The Internet of Things (IoT) foreshadows advanced connectivity of devices, systems, and services that goes beyond machine-to-machine (M2M) communication. The range of things contemplated by the IoT is unlimited and may include devices such as heart-monitoring implants, biochip transponders, automobile sensors, aerospace and defense field-operation devices, and public-safety applications that assist firefighters in search and rescue operations. Current market examples include home-based networks with smart thermostats, light bulbs, and washers/dryers that use Wi-Fi for remote monitoring. Owing to the ubiquity of connected objects in the IoT, it has been estimated that more than 30 billion devices will be wirelessly connected to the Internet of Things by 2020. Harnessing the processing capacity of the controllers and processors associated with these devices is one of the objects of the present invention.

Computer processors traditionally execute machine-coded instructions serially. To run multiple applications concurrently, a single processor interleaves instructions from the various programs and executes them serially, although from the user's perspective the applications appear to be processed in parallel. True parallel or multi-core processing, on the other hand, is a computational approach that breaks a large computational task into individual blocks of computation and distributes them among two or more processors. Computing architectures that use task parallelism (parallel processing) divide a large computational requirement into discrete modules of executable code. The modules are then executed concurrently or sequentially, based on their respective priorities.

A typical multiprocessor system includes a central processing unit ("CPU") and one or more coprocessors. The CPU partitions the computational requirement into tasks and distributes the tasks to the coprocessors. Completed threads are reported to the CPU, which continues to distribute additional threads to the coprocessors as needed. Presently known multiprocessing approaches are disadvantageous in that a significant amount of CPU bandwidth is consumed by task distribution, by waiting for tasks to complete before distributing new tasks (which often depend on previous tasks), by responding to interrupts from coprocessors when a task is completed, and by responding to other messages from coprocessors. In addition, coprocessors often remain idle while waiting for a new task from the CPU.

A multiprocessor architecture is therefore needed that reduces CPU management overhead and makes more effective use of the available co-processing resources.

Various embodiments of a parallel processing computing architecture include a CPU configured to populate a task pool, and one or more coprocessors configured to proactively retrieve threads (tasks) from the task pool. Each coprocessor notifies the task pool upon completing a task, and pings the task pool until another task becomes available for processing. In this way, the CPU communicates directly with the task pool, and communicates indirectly with the coprocessors through the task pool.
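The populate, retrieve, and notify cycle described above can be illustrated with a minimal single-process Python sketch. The names `TaskPool`, `Coprocessor`, `populate`, `retrieve`, `notify_done`, and `ping` are hypothetical and chosen only for this illustration; they do not appear in the specification, and the loop merely simulates polling rather than real hardware.

```python
from collections import deque


class TaskPool:
    """Hypothetical shared pool: the CPU adds tasks, coprocessors pull them."""

    def __init__(self):
        self.pending = deque()
        self.completed = []

    def populate(self, task):
        # Called by the CPU only; the CPU never addresses a coprocessor directly.
        self.pending.append(task)

    def retrieve(self):
        # Called by a coprocessor; returns None when nothing is available yet.
        return self.pending.popleft() if self.pending else None

    def notify_done(self, label, result):
        # Called by a coprocessor when it finishes a task.
        self.completed.append((label, result))


class Coprocessor:
    def __init__(self, name, pool):
        self.name = name
        self.pool = pool

    def ping(self):
        """One polling step: pull a task if available, run it, report back."""
        task = self.pool.retrieve()
        if task is None:
            return False
        label, func, arg = task
        self.pool.notify_done(label, func(arg))
        return True


pool = TaskPool()
for n in (2, 3, 4):  # the CPU populates the pool
    pool.populate((f"square-{n}", lambda x: x * x, n))

cells = [Coprocessor("cell-A", pool), Coprocessor("cell-B", pool)]
# Every cell pings once per round; stop after a round in which no cell found work.
while any([cell.ping() for cell in cells]):
    pass

print(pool.completed)
```

In this sketch the CPU side touches only `populate`, and the coprocessors touch only `retrieve` and `notify_done`, which mirrors the indirect CPU-to-coprocessor communication through the pool.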

The coprocessors may also be capable of acting autonomously; that is, they may interact with the task pool independently of the CPU. In a preferred embodiment, each coprocessor includes an agent that interrogates the task pool to seek a task to perform. As a result, the coprocessors work together "in solidarity" with one another and with the task pool to complete the aggregate computational requirement by autonomously retrieving and completing individual tasks, which may or may not be interrelated. By way of non-limiting example, suppose a task B involves computing an average temperature over time. By defining a task A to include capturing temperature readings over time, and further defining task B to include obtaining the captured readings, the CPU and the various coprocessors may thereby inferentially communicate with one another via the task pool.
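The task A / task B example can be sketched as follows. This is only an illustration under stated assumptions: the dictionary layout, the `needs` field, the `agent_step` helper, and the temperature values are all hypothetical and are not taken from the specification.

```python
from statistics import mean

# Hypothetical pool: a task list plus a store for results that later tasks can read.
pool = {"tasks": [], "results": {}}

# The CPU defines task B (average) before task A (capture) to show that the
# dependency, not the submission order, decides what runs first.
pool["tasks"].append({"name": "B", "needs": "A",
                      "run": lambda readings: mean(readings)})
pool["tasks"].append({"name": "A", "needs": None,
                      "run": lambda _: [20.0, 22.0, 24.0]})  # made-up readings


def agent_step(pool):
    """Take the first task whose input is already available, run it, store the result."""
    for task in pool["tasks"]:
        needs = task["needs"]
        if needs is None or needs in pool["results"]:
            pool["tasks"].remove(task)
            task_input = pool["results"].get(needs)
            pool["results"][task["name"]] = task["run"](task_input)
            return task["name"]
    return None  # nothing runnable right now


order = []
while (done := agent_step(pool)) is not None:
    order.append(done)

print(order, pool["results"]["B"])
```

The cell that runs task B never talks to the cell that ran task A; it only reads what task A left in the pool, which is one way to picture the inferential communication described above.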

In various embodiments, the coprocessors are referred to as autonomous, proactive solidarity cells. In this context, the term autonomous implies that a coprocessor may interact with the task pool without being instructed to do so by the CPU or by the task pool. The term proactive suggests that each coprocessor may be configured (e.g., programmed) to periodically send an agent to monitor the task pool for available tasks appropriate to that coprocessor. The term solidarity implies that the co-processing cells share a common objective in monitoring and executing all available tasks within the task pool.

A solidarity cell (coprocessor) may be a general-purpose or special-purpose processor, and may therefore have the same or a different instruction set, architecture, and microarchitecture compared with the CPU and other solidarity cells in the system. Further, the software programs to be executed and the data to be processed may be contained in one or more memory units. In a typical computer system, for example, a software program consists of a series of instructions that may require data to be used by the program. For example, if the program corresponds to a media player, the data contained in memory may be compressed audio data that is read by a coprocessor and eventually played on a speaker.

Each solidarity cell in the system may be configured to communicate with the task pool, ohmically or wirelessly, through a crossbar switch, also known as a fabric. In a purely wireless mesh configuration, the radio signals themselves may constitute the fabric. In various embodiments, the coprocessors may also communicate directly with the CPU. The switching fabric facilitates communication among system resources. Each solidarity cell is proactive in that it obtains a task to perform by sending its agent to the task pool when the cell has no processing to perform or, alternatively, when the cell is able to contribute processing cycles without impeding its normal operation. By way of non-limiting example, in the context of the Internet of Things (discussed in greater detail below), a coprocessor associated with a device such as a light bulb may be programmed, as its normal operation, to wait for "on" and "off" commands from a master device (such as a smartphone), but its processing resources may also be harnessed through the task pool.
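The light-bulb example can be sketched as a cell that keeps serving its normal commands and dispatches its agent only when it judges that it has spare capacity. Everything here is an assumption made for illustration: the class `LightBulbCell`, the `load` figure, the `spare_threshold` value, and the use of a plain list as the pool are hypothetical.

```python
class LightBulbCell:
    """Hypothetical cell embedded in a light bulb."""

    def __init__(self, spare_threshold=0.5):
        self.on = False
        self.load = 0.0  # assumed fraction of cycles used by normal operation
        self.spare_threshold = spare_threshold

    def handle_command(self, command):
        # Normal operation: obey "on" / "off" from a master device.
        if command == "on":
            self.on = True
        elif command == "off":
            self.on = False

    def has_spare_capacity(self):
        return self.load <= self.spare_threshold

    def maybe_dispatch_agent(self, pool):
        """Fetch a task only if normal operation would not be impeded."""
        if not self.has_spare_capacity() or not pool:
            return None
        return pool.pop(0)


pool = ["hash-block-1", "hash-block-2"]  # placeholder task labels
bulb = LightBulbCell()
bulb.handle_command("on")

bulb.load = 0.9                       # busy: the agent stays home
busy_result = bulb.maybe_dispatch_agent(pool)

bulb.load = 0.1                       # mostly idle: the agent fetches a task
idle_result = bulb.maybe_dispatch_agent(pool)

print(busy_result, idle_result, pool)
```

The threshold test stands in for whatever capacity determination a real cell would make; the point is only that the decision is taken by the cell itself, not by the CPU.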

In the context of the various embodiments described herein, the term agent refers to a software module, analogous to a network packet, that is associated with a coprocessor and that interacts with the task pool to obtain available tasks appropriate for that coprocessor cell. The solidarity cells may execute tasks sequentially, when a task is contingent on the execution of a previous task, or in parallel, when more than one solidarity cell is available and more than one matching task is available for execution. Tasks may be executed independently or collaboratively, depending on the task thread restrictions (if any) provided by the CPU. Interdependent tasks within the task pool may be logically combined. The task pool notifies the CPU when a task thread is completed. If the task thread consists of a single task, the task pool may notify the CPU upon completion of that task. If the task thread consists of multiple tasks, the task pool may notify the CPU upon completion of that series of tasks. Because task threads may be logically combined, it is also contemplated that the task pool may notify the CPU after completion of logically combined task threads.
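The notification rule above (notify the CPU once per task thread, whether the thread holds one task or several) can be sketched as follows. The class `ThreadTrackingPool` and its methods are hypothetical names for this illustration, and the CPU is represented by a simple callback.

```python
class ThreadTrackingPool:
    """Hypothetical pool that reports to the CPU once per completed task thread."""

    def __init__(self, notify_cpu):
        self.remaining = {}          # thread id -> set of unfinished task ids
        self.notify_cpu = notify_cpu

    def add_thread(self, thread_id, task_ids):
        self.remaining[thread_id] = set(task_ids)

    def task_done(self, thread_id, task_id):
        # Called when a cell reports that one task has finished.
        tasks = self.remaining[thread_id]
        tasks.discard(task_id)
        if not tasks:                # the last task of the thread just finished
            del self.remaining[thread_id]
            self.notify_cpu(thread_id)


notifications = []                   # stands in for the CPU's inbox
pool = ThreadTrackingPool(notifications.append)
pool.add_thread("t1", ["only"])      # single-task thread
pool.add_thread("t2", ["a", "b"])    # multi-task thread

pool.task_done("t2", "a")            # thread t2 not finished: no notification
pool.task_done("t1", "only")         # thread t1 finished: CPU notified
pool.task_done("t2", "b")            # thread t2 finished: CPU notified

print(notifications)
```

Under this sketch the CPU receives two messages for three completed tasks, which illustrates how per-thread notification can reduce the interrupt traffic attributed to conventional approaches.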

Those skilled in the art will appreciate that interoperability between the CPU and the coprocessors may be facilitated by configuring the CPU to compose and/or structure tasks at a level of abstraction that is independent of the instruction set architectures associated with the various coprocessors, thereby allowing the components to communicate at the task level rather than at the instruction level. Accordingly, devices and their associated coprocessors may be added to a network on a "plug and play" basis. Another aspect of the invention provides interoperability within a heterogeneous array of CPUs with different instruction set architectures.

Various features of the present invention are applicable to, among other things, networks of Internet of Things devices and sensors; heterogeneous computing environments; high-performance computing; two-dimensional and three-dimensional monolithic integrated circuits; and motion control and robotics.
The present invention provides, for example, the following.
(Item 1)
A processing system, comprising:
a task pool;
a controller configured to populate the task pool with a first task; and
a first coprocessor configured to proactively retrieve the first task from the task pool.
(Item 2)
The processing system of item 1, wherein the first coprocessor comprises a first agent configured to retrieve the first task from the task pool without communicating with the controller.
(Item 3)
The processing system of item 2, wherein the first task includes indicia of a first task type, the first coprocessor is configured to perform tasks of the first type, and the first agent is configured to search the task pool for tasks of the first type.
(Item 4)
The processing system of item 1, wherein the first coprocessor is further configured to process the first task and to notify the task pool upon completion of the first task.
(Item 5)
The processing system of item 1, wherein the task pool is configured to notify the controller upon completion of the first task.
(Item 6)
The processing system of item 1, wherein the controller and the first coprocessor are configured to communicate with each other only through the task pool.
(Item 7)
The processing system of item 1, wherein the controller and the first coprocessor are configured to communicate with each other directly through the task pool.
(Item 8)
The processing system of item 2, wherein the first coprocessor is configured to determine that it has available processing capacity and to dispatch the agent to the task pool in response to the determination.
(Item 9)
The processing system of item 3, wherein the controller is further configured to populate the task pool with a second task, and the system further comprises a second coprocessor having a second agent configured to proactively retrieve the second task from the task pool.
(Item 10)
The processing system of item 9, wherein the second task includes indicia of a second task type, the second coprocessor is configured to perform tasks of the second type, and the second agent is configured to search the task pool for tasks of the second type.
(Item 11)
The processing system of item 1, wherein the controller and the task pool reside on a monolithic integrated circuit (IC) and the first coprocessor does not reside on the IC.
(Item 12)
The processing system of item 9, wherein the controller, the task pool, and the first and second coprocessors reside on a monolithic integrated circuit (IC).
(Item 13)
A method of dynamically controlling processing resources in a network of the type including a central processing unit (CPU) configured to populate a task pool with a first task having a first task type, the method comprising:
programming a first cell to perform the first task type;
adding the programmed first cell to the network;
proactively sending a first agent from the first cell to the task pool;
searching the task pool, by the first agent, for tasks of the first type;
retrieving, by the first agent, the first task from the task pool;
transporting, by the first agent, the first task to the first cell;
processing the first task by the first cell; and
sending, from the first cell to the task pool, a notification that the first task has been completed.
(Item 14)
The method of item 13, further comprising:
marking, by the task pool, the first task as completed; and
sending, from the task pool to the CPU, a notification that the first task has been completed.
(Item 15)
The method of item 13, further comprising configuring the first cell to determine that the first cell has available processing capacity as a predicate to proactively sending the first agent to the task pool.
(Item 16)
The method of item 13, further comprising integrating the first cell into a first device prior to adding the programmed first cell to the network.
(Item 17)
The method of item 16, wherein the first device comprises one of a sensor, a light bulb, a power switch, an appliance, a biometric device, a medical device, a diagnostic device, a laptop, a tablet, a smartphone, a motor controller, and a security device.
(Item 18)
The method of item 13, wherein adding the programmed first cell to the network comprises establishing a communication link between the first cell and the task pool.
(Item 19)
The method of item 13, wherein the CPU is further configured to populate the task pool with a second task having a second task type, the method further comprising:
programming a second cell to perform the second task type;
establishing a communication link between the second cell and the task pool;
proactively sending a second agent from the second cell to the task pool;
searching the task pool, by the second agent, for tasks of the second type;
retrieving, by the second agent, the second task from the task pool;
transporting, by the second agent, the second task to the second cell;
processing the second task by the second cell;
sending, from the second cell to the task pool, a notification that the second task has been completed;
marking, by the task pool, the second task as completed; and
sending, from the task pool to the CPU, a notification that the second task has been completed.
(Item 20)
A system for controlling distributed processing resources within an Internet of Things (IoT) computing environment, comprising:
a CPU configured to partition an aggregate computing requirement into a plurality of tasks and to place the tasks in a pool; and
a plurality of devices, each having a unique dedicated agent configured to proactively retrieve tasks from the pool without direct communication with the CPU.

The present invention will hereinafter be described in conjunction with the appended drawing figures, wherein like numerals denote like elements.

FIG. 1 is a schematic block diagram of a parallel processing architecture including a CPU, memory, a task pool, and a plurality of coprocessors configured to communicate through a fabric, according to an embodiment.

FIG. 2 is a schematic block diagram illustrating details of an exemplary task pool, according to an embodiment.

FIG. 3 is a schematic block diagram of a network including co-processing cells and their corresponding agents interacting with a task pool, according to an embodiment.

FIG. 4 is a schematic layout of an Internet of Things network including available plug-and-play devices, according to an embodiment.

FIG. 5 is a schematic layout of an exemplary Internet of Things use case illustrating dynamic harnessing of nearby devices, according to an embodiment.

FIG. 6 is a flowchart illustrating the operation of an exemplary parallel computing environment, according to an embodiment.

Various embodiments relate to parallel processing computing systems and environments ranging from simple switching and control functions to complex programs and algorithms, for example in the context of the Internet of Things, including, without limitation: data encryption; graphics, video, and audio processing; direct memory access; mathematical computations; data mining; game algorithms; Ethernet packet and other network protocol processing, including constructing, receiving, and transmitting data outside the network; financial services and business methods; search engines; Internet data streaming and other web-based applications; execution of internal or external software programs; and switching on and off and/or otherwise controlling or manipulating appliances, light bulbs, consumer electronics, and the like.

Various features may be incorporated into any presently known or later-developed computer architecture. For example, parallel processing issues relating to synchronization, data security, out-of-order execution, and main-processor interruption may be addressed using the inventive concepts described herein.

Referring now to FIG. 1, a distributed processing system 10 includes a single-core or multi-core CPU 11 and one or more solidarity or co-processing cells 12A-12 configured to communicate with a task pool 13 through a crossbar switching fabric 14. The solidarity cells 12 may also communicate with one another through the switching fabric 14 or through a separate cell bus (not shown). The CPU 11 may communicate with the task pool 13 directly or through the switching fabric 14. One or more memory units 15 each contain data and/or instructions. In this context, the term "instructions" includes a software program that may be compiled for execution by the CPU 11. The memory units 15, cells 12, and task pool 13 may be ohmically or wirelessly interconnected to communicate with the CPU and/or with one another via the switching fabric 14. In some embodiments, the CPU 11 communicates with the cells 12 only indirectly, through the task pool. In other embodiments, the CPU 11 may also communicate directly with the cells 12 without using the task pool as an intermediary.

In some embodiments, the system 10 may include more than one CPU 11 and more than one task pool 13, in which case a particular CPU 11 may interact exclusively with a particular task pool 13, or multiple CPUs 11 may share one or more task pools 13. Further, each solidarity cell may be configured to interact with more than one task pool 13. Alternatively, a particular cell may be configured to interact with a single designated task pool, for example in a high-performance or high-security context.

In various embodiments, a cell may be dynamically paired with a task pool, ohmically (plug and play) or wirelessly (on the fly), when the following three conditions are met.
1) The cell is able to communicate with the task pool ohmically or wirelessly. The connection to the task pool may be through a port on the task pool itself or through a switching fabric connected to the task pool.
2) The task pool recognizes the agent sent by the cell as trusted, using input from a user: for example, through traditional Wi-Fi, Bluetooth®, or similar pairing, with or without a password; manually, through a graphical software program running on a smartphone or tablet; or by any other secure or non-secure method.
3) At least one of the available tasks in the task pool is compatible with the capabilities of the solidarity cell.
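The three pairing conditions can be expressed as a single predicate. The sketch below is an illustration only: the `Cell` and `Pool` data classes, their field names, and the idea of representing trust as a set of approved cell identifiers are assumptions, not details given in the specification.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    cell_id: str
    reachable: bool        # condition 1: an ohmic or wireless link exists
    capabilities: set      # task types this cell is able to perform


@dataclass
class Pool:
    trusted_ids: set       # condition 2: cells whose agents the pool trusts
    task_types: list       # types of the tasks currently available


def can_pair(cell, pool):
    """Return True only when all three pairing conditions hold."""
    linked = cell.reachable
    trusted = cell.cell_id in pool.trusted_ids
    compatible = any(t in cell.capabilities for t in pool.task_types)  # condition 3
    return linked and trusted and compatible


pool = Pool(trusted_ids={"bulb-1", "meter-7"}, task_types=["encrypt", "average"])

print(can_pair(Cell("bulb-1", True, {"average"}), pool))   # all three hold
print(can_pair(Cell("bulb-1", False, {"average"}), pool))  # no link
print(can_pair(Cell("rogue", True, {"average"}), pool))    # not trusted
print(can_pair(Cell("meter-7", True, {"render"}), pool))   # no compatible task
```

How trust is actually established (pairing, password, user confirmation) is left open by the text; the set lookup here simply marks the point where that decision would plug in.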

In the case of a multiprocessor environment with multiple task pools, the foregoing dynamic pairing conditions apply, except that a given cell may be locked or restricted to work with only one of the task pools; otherwise, the cell may connect with one or more task pools on a first-found basis, on a round-robin basis, or using any other selection scheme. It is also possible to assign priorities to the tasks within a task pool, whereby cells give preference to higher-priority tasks and handle lower-priority tasks when not otherwise engaged with higher-priority tasks.
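A round-robin visit over several pools, combined with priority ordering inside each pool, might look like the following sketch. The names `PriorityPool` and `drain_round_robin` are hypothetical, and the convention that a lower number means a higher priority is an assumption made for this example.

```python
import heapq
from itertools import cycle


class PriorityPool:
    """Hypothetical pool that hands out its highest-priority task first."""

    def __init__(self, name):
        self.name = name
        self._heap = []
        self._seq = 0      # tie-breaker keeps equal-priority tasks in FIFO order

    def add(self, priority, task):
        # Assumption: lower number = higher priority.
        heapq.heappush(self._heap, (priority, self._seq, task))
        self._seq += 1

    def take(self):
        return heapq.heappop(self._heap)[2] if self._heap else None


def drain_round_robin(pools):
    """Visit pools in turn, taking one task per visit, until all are empty."""
    order = []
    idle_streak = 0
    for pool in cycle(pools):
        task = pool.take()
        if task is None:
            idle_streak += 1
            if idle_streak == len(pools):   # one full lap with no work
                break
        else:
            idle_streak = 0
            order.append((pool.name, task))
    return order


p1, p2 = PriorityPool("p1"), PriorityPool("p2")
p1.add(2, "log")
p1.add(1, "encrypt")
p2.add(1, "render")

order = drain_round_robin([p1, p2])
print(order)
```

A cell locked to a single pool corresponds to calling `drain_round_robin` with a one-element list; a first-found scheme would replace the cycling loop with a scan that stops at the first non-empty pool.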

The CPU 11 may be any single-core or multi-core processor, application processor, or microcontroller used to execute software programs. The system 10 may be implemented on a personal computer, smartphone, tablet, or Internet of Things device, in which case the CPU 11 may be any personal computer central processor or processor cluster, such as an Intel® Pentium® or a multi-core processor that is local to or remote from the immediate computing environment. Alternatively, the system 10 may be implemented on a supercomputer, and the CPU 11 may be a reduced instruction set computer ("RISC") processor, application processor, microcontroller, or the like.

In other embodiments, the system 10 may be implemented on a series of locally connected personal computers, such as a Beowulf cluster, in which case the CPU 11 may include the central processors of all, a subset, or one of the networked computers. Alternatively, the system 10 may be implemented on a network of remotely connected computers, in which case the CPU 11 may be a presently known or later-developed central processor for a server or mainframe. The particular manner in which the CPU 11 carries out the subject parallel processing methods within the system 10 described herein may be influenced by the CPU's operating system. For example, the CPU 11 may be configured for use within the system 10 by programming it to recognize and communicate with the task pool 13 and to divide the computing requirement into threads, as described below.

It is further contemplated that system 10 may be retroactively implemented on any computer or computer network having an operating system that can be modified or otherwise configured to implement the functionality described herein. As is known in the art, the data to be processed is contained within memory unit 15, for example in the context of an addressable region or partition of random access or read-only memory, cache memory for CPU 11, or other forms of data storage such as flash memory and magnetic storage. Memory unit 15 contains not only the data to be processed but also the locations in which to place the results of the processed data. Not all tasks are required to access memory unit 15, as in the case of, for example, smart meters and automotive instruments that may return data to system 10, or robots and motor controllers that may actuate a mechanism.

Each cell 12 is a conceptually or logically independent computing unit capable of executing one or more tasks/threads. A cell 12 may be a microcontroller, a microprocessor, an application processor, a "dumb" switch, or a stand-alone computer such as a machine in a Beowulf cluster.

A cell 12 may be a general-purpose or special-purpose coprocessor configured to complement, perform all of, or perform a limited range of the functions of the CPU, or functions unrelated to CPU 11 such as ambient monitoring and robotic actuators. A special-purpose processor may be a dedicated hardware module designed, programmed, or otherwise configured to perform a specialized task, or it may be a general-purpose processor configured to perform a specialized task such as graphics processing, floating-point arithmetic, or data encryption.

In an embodiment, any cell 12 that is a special-purpose processor may also be configured to access and write to memory, to execute descriptors as described below, and to execute other software programs.

Further, any number of cells 12 may comprise a heterogeneous computing environment, that is, a system that uses more than one type of processor, such as AMD-based and/or Intel-based processors, or a mix of 32-bit and 64-bit processors.

Each cell 12 is configured to perform one or more specialized tasks, as illustrated by the following sequence of events. During the polling phase, each cell periodically sends an agent to the task pool until a matching task is found. To facilitate this matching, both the cell and the task pool may be equipped with transceivers. In the case of the task pool, the transceiver may be located in the task pool itself or in the switching fabric to which the task pool is connected. When a task match is found in the task pool, the task pool transmits an acknowledgment to the cell. The next step is the "communication channel" phase, during which the cell receives the task and begins executing it. In one implementation, once the first task is complete, the communication channel is maintained so that the solidarity cell can fetch another task without repeating the "polling" and "acknowledgment" phases.
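By way of non-limiting illustration, the polling, acknowledgment, and communication channel phases described above might be sketched as follows. The class `TaskPool`, the function `run_cell`, and the `max_polls` limit are hypothetical names introduced only for this sketch; the disclosure does not prescribe any particular software interface.

```python
from collections import deque

class TaskPool:
    """Hypothetical in-memory task pool holding (task_type, name) pairs."""
    def __init__(self, tasks):
        self.queue = deque(tasks)

    def poll(self, capabilities):
        """Polling phase: True acts as the acknowledgment that a match exists."""
        return any(task_type in capabilities for task_type, _ in self.queue)

    def fetch(self, capabilities):
        """Communication channel phase: hand over the first matching task."""
        for item in self.queue:
            if item[0] in capabilities:
                self.queue.remove(item)   # safe: we return right after mutating
                return item
        return None

def run_cell(pool, capabilities, max_polls=3):
    """Poll until acknowledged, then keep the channel open while tasks match."""
    executed = []
    for _ in range(max_polls):
        if pool.poll(capabilities):       # acknowledgment received
            break
    else:
        return executed                   # no match within the poll budget
    # The channel stays open: further tasks are fetched without re-polling.
    while (task := pool.fetch(capabilities)) is not None:
        executed.append(task[1])          # stand-in for executing the task
    return executed

pool = TaskPool([("A", "t1"), ("B", "t2"), ("A", "t3")])
done = run_cell(pool, {"A"})
```

In this sketch a cell capable only of type A tasks drains both type A tasks over a single channel and leaves the type B task in the pool for another cell.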

System 10 may include a plurality of cells, some of which are capable of performing the same task types as other cells, thereby providing redundancy within system 10. The set of task types performed by a given cell 12 may be a subset of the set of task types performed by another cell. For example, in FIG. 1, system 10 may divide an aggregate computational problem into groups of tasks and populate task pool 13 with tasks of a first type, a second type, and a third type. A first cell 12A may be capable of performing only tasks of the first type, a second cell 12B tasks of the second type, a third cell 12C tasks of the third type, a fourth cell 12D tasks of the second or third type, and a fifth cell 12N tasks of all three types. System 10 may be configured with this redundancy so that, if a given cell is removed from system 10 (or is currently busy or otherwise unavailable), system 10 can continue to function seamlessly. Likewise, if cells are dynamically added to system 10, system 10 can continue to function seamlessly with the benefit of improved performance.
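The redundancy in the FIG. 1 example can be checked with a short sketch. The dictionary `cells` and the helper names `covered` and `survives_removal` are assumptions made for illustration; the capability sets follow the example of cells 12A through 12N given above.

```python
# Task types each cell can perform, per the FIG. 1 example.
cells = {"12A": {1}, "12B": {2}, "12C": {3}, "12D": {2, 3}, "12N": {1, 2, 3}}
ALL_TYPES = {1, 2, 3}

def covered(available):
    """Task types that at least one available cell can perform."""
    types = set()
    for name in available:
        types |= cells[name]
    return types

def survives_removal(removed):
    """True if every task type is still covered after removing the given cells."""
    remaining = set(cells) - set(removed)
    return covered(remaining) == ALL_TYPES

# Removing any single cell leaves all three task types covered.
single_failures_ok = all(survives_removal([name]) for name in cells)
```

Removing both 12A and 12N, by contrast, would leave the first task type without any capable cell, which marks the limit of the redundancy in this particular configuration.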

Referring now to FIGS. 1 and 2, task pool 13 may occupy a region of physical memory that is accessible by CPU 11. Alternatively, task pool 13 may be accessible by MAC address or IP address. Multiple embodiments are contemplated for task pool 13: it may be physically located with the CPU within the same 2D or 3D monolithic IC, or it may be implemented as a stand-alone IC and physically interconnected to a computer board, smartphone, tablet, router, or Internet of Things device. In a further alternative embodiment, the task pool may be a stand-alone, multi-port, wired and/or wirelessly connected device that is shared among multiple CPU 11 systems or dedicated to a given CPU 11. Task pool 13 may also be addressable by cells 12. Task pool 13 may be located in a dedicated hardware block to provide maximum access speed for CPU 11 and cells 12. Alternatively, task pool 13 may be software-based, in which case its contents are stored in memory, as in the hardware-based embodiment, but are represented by data structures.

Once populated by CPU 11, task pool 13 contains one or more task threads 21. Each task thread 21 represents a computational task that may be a component or subset of a larger aggregate computational requirement imposed on CPU 11. In one embodiment, CPU 11 may initialize task pool 13 and then populate it with threads 21 that can be executed concurrently. Each thread 21 may include one or more discrete tasks 22. A task 22 may have a task type and a descriptor. The task type indicates which cells 12 are capable of performing the task 22. Task pool 13 may also use the task type to prioritize tasks 22 that have the same type. In one embodiment, task pool 13 may maintain a priority table (not shown) that records the solidarity cells 12 present in system 10, the types of tasks 22 each cell is capable of performing, and whether each cell is currently processing a task 22. Task pool 13 may use the priority table to determine which of the eligible tasks 22 to assign to a requesting cell, as described below.

In some embodiments, CPU 11 may retrieve a task or thread from the task pool and execute it. Further, CPU 11 may abort any task that is determined to be stale, broken, stuck, or erroneous. In such a case, CPU 11 may refresh the task and make it available for subsequent processing. Nothing prevents CPU 11 from implementing adaptive task management, as may be required by artificial intelligence, for example, whereby CPU 11 may add, remove, or change tasks within an existing, unfinished thread 21.

The descriptor, if present, may contain one or more of the following: the specific instruction to be executed, the mode of execution, the location (e.g., address) of the data to be processed, and the location in which to place the task results. The location for placing results is optional, as in the case of animation and multimedia tasks, which often present their results on a display rather than storing them in memory. Further, task descriptors may be chained together, as in a linked list, so that the data to be processed can be accessed with fewer memory calls than if the descriptors were not chained. In an embodiment, the descriptor is a data structure containing a header and a plurality of reference pointers to memory locations, and task 22 includes the memory address of the data structure. The header defines the function or instruction to be executed. A first pointer references the location of the data to be processed. A second, optional pointer references the location in which to place the processed data. If the descriptor is linked to another descriptor to be executed sequentially, the descriptor may include a third pointer that references the next descriptor. In an alternative embodiment in which the descriptor is a data structure, task 22 may include the complete data structure.
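A minimal sketch of such a descriptor, assuming a simple in-memory representation, is given below. The field names (`header`, `data_addr`, `result_addr`, `next_desc`), the addresses, and the instruction names "decode" and "render" are hypothetical and chosen only to mirror the header and the three pointers described above.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Descriptor:
    header: str                               # function or instruction to execute
    data_addr: int                            # first pointer: data to be processed
    result_addr: Optional[int] = None         # second, optional pointer: result location
    next_desc: Optional["Descriptor"] = None  # third pointer: next chained descriptor

def walk(descriptor):
    """Yield the headers of a descriptor chain in execution order."""
    while descriptor is not None:
        yield descriptor.header
        descriptor = descriptor.next_desc

# A rendering step presents its output on a display, so it has no result pointer.
render = Descriptor("render", data_addr=0x2000)
decode = Descriptor("decode", data_addr=0x1000, result_addr=0x2000, next_desc=render)
chain = list(walk(decode))
```

Because the second descriptor is reachable from the first, a cell that receives only the address of `decode` can follow the chain without returning to the task pool for each step.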

Thread 21 may further comprise a "recipe" that describes the order in which tasks 22 may be performed and any conditions that affect the order of performance. According to the recipe, tasks 22 may be executed sequentially, concurrently, out of order, interdependently, or conditionally according to Boolean operations. For example, in FIG. 2, thread 21A comprises four tasks: 22A, 22B, 22C, and 22D. In the illustrated embodiment, the first task 22A must be completed before either the second task 22B or the third task 22C can begin. According to the recipe, once either the second task 22B or the third task 22C is complete, the fourth task 22D can begin.
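The recipe for thread 21A can be expressed, purely as an illustrative sketch, as a set of Boolean conditions over the tasks already completed. The dictionary `recipe` and the function `ready_tasks` are assumed names, not part of the disclosure.

```python
# Each task maps to a Boolean condition over the set of completed tasks.
recipe = {
    "22A": lambda done: True,                          # no prerequisite
    "22B": lambda done: "22A" in done,                 # after 22A
    "22C": lambda done: "22A" in done,                 # after 22A
    "22D": lambda done: "22B" in done or "22C" in done,  # after 22B OR 22C
}

def ready_tasks(recipe, done):
    """Tasks that are not yet completed and whose condition is satisfied."""
    return sorted(t for t, cond in recipe.items() if t not in done and cond(done))

step0 = ready_tasks(recipe, set())            # only 22A may start
step1 = ready_tasks(recipe, {"22A"})          # 22B and 22C may run concurrently
step2 = ready_tasks(recipe, {"22A", "22C"})   # 22D is released by 22C alone
```

A task whose condition is not yet satisfied simply does not appear in the ready list, which is one way a task pool could keep it from being acquired by a cell.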

Threads 21 may also be interdependent. For example, as shown in FIG. 2, owing to a Boolean operation in thread 21B, completed task 22C may allow processing of the tasks in thread 21B to continue. Task pool 13 may lock a task 22 while that task awaits completion of another task 22 on which it depends. While a task 22 is locked, it cannot be acquired by a cell. When the tasks 22 of a thread 21 are complete, task pool 13 may notify CPU 11 of the completion. The CPU may then advance processing beyond the completed thread 21.

The cells advantageously maintain solidarity with one another and with CPU 11, thereby helping system 10 perform complex computations by autonomously and proactively retrieving tasks from task pool 13. Cells 12 act autonomously in that they may act independently of CPU 11 or any other coprocessor. Alternatively, a cell may act in response to, or be directly instructed by, the CPU. Each cell acts proactively in that it seeks a task 22 from task pool 13 as soon as the cell becomes available for further processing.

More specifically, in an embodiment, a cell 12 obtains a task from the task pool by sending an agent 30 to query (search) the task pool and retrieve an available task 22 that requires completion, is not locked, and has a task type that the cell can perform. Typically, system 10 has the same number of agents as solidarity co-processing cells. In this context, an agent is generally analogous to a data frame in the networking sense, in that an agent may be equipped with a source address, a destination address, and a payload. In an embodiment, the destination address is the address of task pool 13 when agent 30 is seeking a task 22, and the address of the corresponding cell 12 when agent 30 is returning to its cell with a task 22. Correspondingly, the source address is the address of cell 12 when agent 30 is seeking a task 22, and the address of task pool 13 when agent 30 is returning to its cell with a task 22.

In addition, the source and destination addresses may facilitate frame synchronization. That is, system 10 may be configured to distinguish addresses unambiguously from payload data, so that when the contents of agent 30 are read, the destination address indicates the start of the frame and the source address indicates the end of the frame, or vice versa. This allows the payload, which is placed between the addresses, to vary in size. In another embodiment of a variable-size payload, agent 30 may include a header that indicates the payload size. The header information may be compared with the payload to verify data integrity. In yet another embodiment, the payload may be of fixed length. When agent 30 is dispatched to task pool 13 by its coprocessor cell, the payload contains identifying information for the types of tasks that cell 12 can perform. When agent 30 returns from task pool 13, the payload contains the descriptor of task 22, in the form of either a memory location or the complete descriptor data structure.
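As a sketch of the header-based variant only, an agent might be serialized with a destination address, a source address, and a payload-size header that is checked against the payload on receipt. The 16-bit field widths, the byte order, the address values, and the function names `encode_agent` and `decode_agent` are all assumptions made for illustration.

```python
import struct

HEADER = ">HHH"                      # dest, src, payload size: three 16-bit fields
HEADER_LEN = struct.calcsize(HEADER)

def encode_agent(dest, src, payload: bytes) -> bytes:
    """Pack the addresses and payload size ahead of a variable-size payload."""
    return struct.pack(HEADER, dest, src, len(payload)) + payload

def decode_agent(frame: bytes):
    """Unpack a frame and verify the payload against the size in the header."""
    dest, src, size = struct.unpack(HEADER, frame[:HEADER_LEN])
    payload = frame[HEADER_LEN:]
    if len(payload) != size:
        raise ValueError("payload size does not match header")
    return dest, src, payload

POOL_ADDR, CELL_ADDR = 0x000D, 0x012A   # hypothetical addresses

# Outbound: the payload identifies the task types the cell can perform.
outbound = encode_agent(POOL_ADDR, CELL_ADDR, b"A,B")
dest, src, payload = decode_agent(outbound)

# Return trip: addresses are swapped and the payload carries a descriptor address.
inbound = encode_agent(src, dest, (0x1000).to_bytes(4, "big"))
```

The size check is a simple stand-in for the integrity comparison between header and payload; a real implementation might well use a checksum instead.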

In other embodiments, some or all of agents 30 are autonomous representatives of their respective cells 12. That is, each agent 30 may be dispatched by its corresponding cell 12 to retrieve a task 22 whenever the cell is idle or capable of performing additional processing. In this way, the processing capacity of solidarity cells 12 can be more fully exploited, because the cells need not sit idle waiting for instructions from CPU 11. This approach has the additional benefit of reducing CPU overhead by relieving the CPU of the need to send a request to a cell to retrieve a task from the task pool. These advantages make system 10 more efficient than traditional computer architectures in which auxiliary modules and coprocessors depend on instructions from the main CPU.

Further, solidarity cells 12A-12n are indifferent to the particular composition of the threads themselves. Rather, an agent is concerned only with finding a match between the capabilities of its corresponding cell and the available tasks 22 to be completed in task pool 13. That is, as long as there are available tasks 22 in task pool 13 and those tasks match the capabilities of a cell, the system can effectively utilize that cell's processing capacity.

Some or all of solidarity cells 12A-12n may function independently of one another, or may communicate with one another directly, through switching fabric 14, through task pool 13, or pursuant to a command or request from the CPU, in order to invoke another solidarity cell to assist in processing, moving, or transmitting data. In one embodiment, agent 30A may search for a match between the task type of a ready task 22 and the types of tasks that cell 12A is capable of performing. This architecture may involve hard-coding the types of tasks that CPU 11 is configured to create. Thus, if task pool 13 contains three types of tasks 22 and a larger computational requirement includes a fourth type of task, the fourth type of task may not be placed in task pool 13 even if a cell capable of performing it is included in, or added to, system 10. Accordingly, CPU 11 may be configured to "learn", or be taught, how to create the fourth type of task in order to more fully exploit the available processing resources.

In another embodiment, agent 30A searches the task 22 descriptors for an executable instruction that matches one of the instructions that cell 12A is capable of executing. When a matching task 22 is found, agent 30A delivers the descriptor of the matching task 22 to cell 12A, after which cell 12A begins processing the task 22. In particular, agent 30A may deliver the memory address of the descriptor to cell 12A, and cell 12A retrieves the data structure from memory. Alternatively, if the entire data structure of the descriptor is contained within task 22, agent 30A may deliver the complete data structure to cell 12A for processing. The descriptor tells cell 12A which instruction to execute, the location in memory unit 15 where the data to be processed can be found, and the location in memory unit 15 where the results are to be placed. Upon completion of task 22, cell 12A notifies task pool 13 to change the status of the selected task 22 from "to be completed" to "completed". Further, once cell 12A finishes a task 22, the cell may dispatch its agent 30A to task pool 13 to seek another task 22.

Some or all of agents 30A-30n may travel through system 10 by wire or wirelessly, for example using a Wi-Fi network, wireless Ethernet, wireless USB, a wireless bridge, a wireless repeater, a wireless router, Zigbee®, ANT+®, or Bluetooth® pairing, depending on the particular architecture and/or implementation of system 10. In an embodiment, agent 30 may be guided wirelessly to task pool 13 by including a receptor feature in task pool 13 and a transmitter feature in cell 12. Similarly, the task pool may respond wirelessly to a cell by equipping the task pool with a transmitter and the solidarity cell with a receiver. In this manner, a cell can communicate wirelessly with the task pool with or without the use of a switching fabric.

In a preferred embodiment, however, some form of switching fabric 14 is used. Switching fabric 14 facilitates connections for data transfer and arbitration between system resources. Switching fabric 14 may be a router or crossbar switch that provides connectivity between the various cells and the task pool. Switching fabric 14 may further provide connectivity between each solidarity cell 12A-12n and system resources such as CPU 11, memory unit 15, and traditional system components including, without limitation, direct memory access units, transmitters, hard disks and their controllers, displays and other input/output devices, and other coprocessors. Cells 12A-12n may be physically connected to switching fabric 14, or the cells may be connected wirelessly.

Wireless connection of cells to system 10 facilitates the dynamic addition and/or removal of cells for use within system 10. For example, CPU 11 may recruit cells from other cell systems, enabling dynamic expansion and improved performance. In this manner, two or more cell systems (e.g., networks) may share solidarity cells. In one embodiment, a cell that becomes idle may seek out, and/or be recruited by, another system that has a need for additional processing resources, that is, one that has available processing tasks needing completion. Similarly, system 10 may expand its performance by incorporating additional clusters of cells for particular tasks. For example, system 10 may enhance the performance of encryption/decryption functions, or of audio and/or video data processing, by incorporating nearby cells capable of performing those tasks.

To prevent undesirable connections, CPU 11 may provide task pool 13 with a list of trusted and/or untrusted cells and of authentication requirements or protocols, or alternatively with criteria for identifying them. Further, the task pool itself may exclude particular cells on the basis of low performance, unreliable connections, poor data throughput, or suspicion of malicious or otherwise inappropriate activity. In various embodiments, cells 12 may be added to or excluded from task pool 13 by a user through the use of a smartphone, tablet, or other device or application. In one embodiment, a graphical application interface may provide the user with useful statistics and/or iconic information, such as the locations of available cells and other devices and the performance gains or penalties that result from adding or removing particular cells from the network.
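One way such admission criteria might be combined is sketched below. The function `admit_cell`, the `min_throughput` threshold, the cell identifiers, and the rule that a non-empty trusted list acts as an allow-list are all assumptions for illustration; the disclosure leaves the specific criteria open.

```python
def admit_cell(cell_id, trusted, untrusted, throughput, min_throughput=1.0):
    """Return True if the task pool should accept the cell (illustrative policy)."""
    if cell_id in untrusted:
        return False                      # explicitly listed as untrusted by the CPU
    if trusted and cell_id not in trusted:
        return False                      # an allow-list exists and the cell is absent
    return throughput >= min_throughput   # pool-side exclusion for poor throughput

trusted = {"thermostat", "laptop"}
untrusted = {"unknown-phone"}
decisions = {
    cell: admit_cell(cell, trusted, untrusted, rate)
    for cell, rate in [("thermostat", 5.0), ("laptop", 0.2), ("unknown-phone", 9.0)]
}
```

Here the laptop is trusted but is still excluded for poor throughput, which reflects the two separate sources of exclusion described above: the list supplied by the CPU and the task pool's own observations.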

In an alternative embodiment, some or all of the co-processing cells may connect directly to task pool 13, such as by a wired configuration that does not require switching fabric 14 for communication. Wired connection of cells may likewise facilitate dynamic expansion and contraction of system 10 similar to the wireless configurations discussed above, although a wired connection may entail physical (e.g., manual) integration and extraction of peripheral devices. In either case, the scalability of the system is greatly enhanced over conventional parallel processing schemes, because coprocessors can be added and removed without reprogramming CPU 11 to account for the changes to system 10.

Referring now to FIG. 3, a network 300 includes a CPU 302, a first memory 304, a second memory 306, a task pool 308, a switching fabric 310, a first co-processing cell 312 configured to perform (execute) type A tasks, a second cell 314 configured to perform type B tasks, a third cell 316 configured to perform type C tasks, and a fourth cell 318 configured to perform both type A and type B tasks. As shown, task pool 308 is populated (e.g., by CPU 302) with tasks (or task threads) 330 and 332 of task type A, tasks 334 and 336 of task type B, and tasks 340 and 342 of task type C. In an embodiment, each cell preferably has a unique dedicated agent. In particular, cell 312 includes agent 320, cell 314 includes agent 322, cell 316 includes agent 324, and cell 318 includes agent 326. Each agent preferably includes an information field or header that identifies the types of tasks its associated cell is configured to perform, for example a single task or a combination of tasks A, B, and C.

In operation, when a cell is either idle or otherwise has available processing capacity, its agent proactively queries the task pool to determine whether any tasks appropriate for that particular cell are in the task queue. For example, cell 312 may dispatch its agent 320 to retrieve one or both of tasks 330 and 332, which correspond to task type A. Similarly, cell 314 may dispatch its agent 322 to retrieve either of tasks 334 or 336, which correspond to task type B (depending on their relative priority), and so on. For a cell capable of performing more than one task type, such as cell 318, which is configured to perform task types A and B, agent 326 may retrieve any one of tasks 330, 332, 334, and/or 336.
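The matching in the FIG. 3 example can be reproduced with a short sketch. The data layout and the function `eligible` are assumptions introduced for illustration; the reference numerals and task types are those of FIG. 3 as described above.

```python
# Task pool 308 contents as (task number, task type), per FIG. 3.
task_pool = [(330, "A"), (332, "A"), (334, "B"), (336, "B"), (340, "C"), (342, "C")]

# Task types each cell is configured to perform, per FIG. 3.
cell_types = {312: {"A"}, 314: {"B"}, 316: {"C"}, 318: {"A", "B"}}

def eligible(cell):
    """Tasks in the pool whose type the given cell is configured to perform."""
    return [task for task, task_type in task_pool if task_type in cell_types[cell]]

for_312 = eligible(312)
for_318 = eligible(318)
```

The list for cell 318 is the union of the lists for cells 312 and 314, so cell 318 can absorb type A or type B work whenever either of those cells is busy or absent.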

Upon retrieving a task from the task pool, the cell may then process that task, typically by retrieving data from a particular location in first memory 304, processing the data, and storing the processed data at a particular location in second memory 306. When the task is complete, the cell notifies the task pool, the task pool marks the task as completed, and the task pool notifies the CPU that the task is complete. Alternatively, the task pool may notify the CPU upon completion of a task thread, inasmuch as a task thread may comprise a single task, a series of tasks, or a Boolean combination of tasks. Significantly, the retrieval of tasks and the processing of data by the cells can occur without direct communication between the CPU and the various cells.
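The flow from retrieval through completion might be sketched as follows, with the two memories modeled as dictionaries keyed by address. The names `TaskPool`, `run_task`, and `notify_cpu`, the addresses, and the status strings are assumptions made for illustration.

```python
first_memory = {0x10: [1, 2, 3]}   # stands in for first memory 304
second_memory = {}                 # stands in for second memory 306
cpu_notifications = []             # records what the CPU is told

class TaskPool:
    """Hypothetical pool that tracks task status and notifies the CPU."""
    def __init__(self, notify_cpu):
        self.status = {}
        self.notify_cpu = notify_cpu

    def add(self, task_id):
        self.status[task_id] = "to be completed"

    def complete(self, task_id):
        self.status[task_id] = "completed"   # pool marks the task as completed
        self.notify_cpu(task_id)             # pool, not the cell, informs the CPU

def run_task(pool, task_id, src, dst, fn):
    """Cell side: read input, process it, store the result, notify the pool."""
    second_memory[dst] = fn(first_memory[src])
    pool.complete(task_id)

pool = TaskPool(cpu_notifications.append)
pool.add(330)
run_task(pool, 330, src=0x10, dst=0x20, fn=lambda xs: [x * 2 for x in xs])
```

Note that `run_task` never references the CPU directly: the only path from the cell to the CPU runs through the task pool, consistent with the absence of direct CPU-to-cell communication.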

Referring now to FIG. 4, an Internet of Things network 400 includes a controller (CPU) 402, a task pool 408, and various devices 410-422, some or all of which include an associated or embedded microcontroller, such as an integrated circuit (IC) chip or other component that embodies processing capacity. By way of non-limiting example, the devices may include a light bulb 410, a thermostat 412, an electrical receptacle 414, a power switch 416, an appliance (e.g., a toaster) 418, a vehicle 420, a keyboard 422, and virtually any other plug-and-play device or application capable of interfacing with the network.

In the illustrated embodiment, the controller 402 may be a smartphone, tablet, laptop, or other device, which may include a display 404 and a user interface (e.g., a keypad) 406 for facilitating user interaction with the various devices on the network. To the extent that the processing capacity (e.g., bandwidth) of the controller 402 may be insufficient to adequately support the network, the controller may effectively harvest or recruit processing resources from the peripheral devices via the task pool, for example as described below in conjunction with FIG. 5.

Referring now to FIG. 5, a use case for an Internet of Things network 500 illustrates the dynamic harnessing of nearby (or otherwise available) devices. The network 500 includes a primary control unit 502 (e.g., a laptop, tablet, or gaming device), a task pool 504, a first coprocessor device 506, and a second coprocessor device 508. An exemplary use case in the context of network 500 will now be described.

Suppose a user is playing a video game on his or her laptop computer 502. The video game requires detailed computer-generated imagery. Perhaps the processing power of laptop 502 is sufficient to render a single photorealistic character, but when a second character is introduced on the screen, the image quality degrades and the characters' movement is no longer continuous. The present invention proposes a method of harnessing the processing power of underutilized computing resources located in the user's vicinity or otherwise available to the user.

To address the need for additional processing power, laptop 502 connects to task pool 504. In this regard, the laptop itself may be equipped with the task pool, or the task pool may take the form of an external device or application located within wireless range of laptop 502. In the case of an external task pool, the task pool itself may serve as a switching fabric with ports, allowing connection to multiple co-processing cells. Laptop 502 submits computationally intensive tasks to task pool 504. A nearby underutilized device, such as smartphone 508, then connects to task pool 504 and sends its agent to fetch tasks of a matching task type. As a result, smartphone 508 becomes a coprocessor that seamlessly assists laptop 502, thereby enhancing the video game experience. The same method may be repeated wherever other underutilized processing resources exist and are needed. Indeed, even the processing power of an available light bulb 506 may serve as a coprocessor to the laptop.
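As a rough illustration of this use case, the following sketch uses a thread-safe queue as a stand-in for task pool 504 and two threads as stand-ins for smartphone 508 and light bulb 506. The device names, the frame-numbered tasks, and the string standing in for rendered output are all assumptions for this example; a real deployment would involve networked devices rather than threads.

```python
import queue
import threading


def device_worker(name, pool, results, lock):
    """A nearby device pulls tasks from the shared pool until none remain."""
    while True:
        try:
            frame_id = pool.get_nowait()
        except queue.Empty:
            return
        rendered = f"frame-{frame_id}"    # stand-in for real rendering work
        with lock:
            results[frame_id] = (name, rendered)
        pool.task_done()


pool = queue.Queue()
for frame_id in range(8):     # the laptop submits work; it addresses no device
    pool.put(frame_id)

results = {}
lock = threading.Lock()
devices = [threading.Thread(target=device_worker, args=(n, pool, results, lock))
           for n in ("smartphone-508", "light-bulb-506")]
for d in devices:
    d.start()
for d in devices:
    d.join()
```

Which device ends up handling which frame is not determined in advance; the submitting side only observes that every frame was eventually processed.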

FIG. 6 is a flow chart illustrating the operation of an exemplary parallel computing environment. In particular, method 600 includes submitting tasks to a task pool (step 602), proactively dispatching one or more agents from one or more corresponding cells to the task pool (step 604), retrieving and processing the tasks (step 606), and notifying the task pool and the CPU that a task thread has been performed (step 608). Method 600 further includes dynamically incorporating additional devices into the network, as needed (step 610).
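The steps of method 600 can be walked through in a minimal sequential sketch such as the one below. The class and method names, the string task labels, and the round-robin `run` helper are assumptions made for illustration; the comments indicate which step of FIG. 6 each line is meant to correspond to.

```python
from collections import deque


class TaskPool:
    def __init__(self):
        self.pending = deque()
        self.completed = []
        self.cells = []

    def submit(self, task):                # step 602: CPU submits a task
        self.pending.append(task)

    def attach(self, cell):                # step 610: add a device at run time
        self.cells.append(cell)

    def mark_completed(self, task):        # step 608 (pool side)
        self.completed.append(task)


class Cell:
    def __init__(self, name):
        self.name = name
        self.processed = []

    def dispatch_agent(self, pool):        # step 604: the agent goes to the pool
        return pool.pending.popleft() if pool.pending else None

    def work(self, pool):
        task = self.dispatch_agent(pool)
        if task is None:
            return False
        self.processed.append(task)        # step 606: stand-in for processing
        pool.mark_completed(task)          # step 608: notify the pool
        return True


def run(pool):
    """Round-robin over attached cells until no pending tasks remain."""
    while pool.pending and pool.cells:
        for cell in pool.cells:
            cell.work(pool)


pool = TaskPool()
for label in ["t1", "t2", "t3", "t4"]:
    pool.submit(label)

first = Cell("first")
pool.attach(first)
first.work(pool)             # the first cell takes one task on its own

second = Cell("second")
pool.attach(second)          # a second device joins while work is outstanding
run(pool)
```

The second cell begins retrieving tasks as soon as it is attached, with no change to how tasks were submitted.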

Accordingly, a processing system is provided that includes a task pool, a controller configured to submit a first task to the task pool, and a first coprocessor configured to proactively retrieve the first task from the task pool.

In an embodiment, the first coprocessor comprises a first agent configured to retrieve the first task from the task pool without communicating with the controller.

In an embodiment, the first task includes indicia of a first task type, the first coprocessor is configured to perform tasks of the first type, and the first agent is configured to search the task pool for tasks of the first type.

In an embodiment, the first coprocessor is further configured to process the first task and to notify the task pool upon completion of the first task, and the task pool is configured to notify the controller upon completion of the first task.

In an embodiment, the controller and the first coprocessor are configured to communicate with each other only through the task pool.

In an embodiment, the controller and the first coprocessor are configured to communicate with each other directly through the task pool.

In an embodiment, the first coprocessor is configured to determine that it has available processing capacity and to dispatch the agent to the task pool in response to the determination.

In an embodiment, the controller is further configured to submit a second task to the task pool, and the system further comprises a second coprocessor having a second agent configured to proactively retrieve the second task from the task pool.

In an embodiment, the second task includes indicia of a second task type, the second coprocessor is configured to perform tasks of the second type, and the second agent is configured to search the task pool for tasks of the second type.

In an embodiment, the controller and the task pool reside on a monolithic integrated circuit (IC), and the first coprocessor does not reside on the IC.

In another embodiment, the controller, the task pool, and the first and second coprocessors reside on a monolithic integrated circuit (IC).

Also provided is a method of dynamically controlling processing resources in a network of the type that includes a central processing unit (CPU) configured to submit a first task having a first task type to a task pool. The method includes: programming a first cell to perform the first task type; adding the programmed first cell to the network; proactively sending a first agent from the first cell to the task pool; searching the task pool, by the first agent, for tasks of the first type; retrieving the first task from the task pool by the first agent; transporting the first task to the first cell by the first agent; processing the first task by the first cell; and sending, from the first cell to the task pool, a notification that the first task has been completed.

In an embodiment, the method also includes marking, by the task pool, the first task as completed, and sending, from the task pool to the CPU, a notification that the first task has been completed.

In an embodiment, the method also includes configuring the first cell to determine that it has available processing capacity as a precondition for proactively sending the first agent to the task pool.

In an embodiment, the method also includes integrating the first cell into a first device prior to adding the programmed first cell to the network.

In an embodiment, the first device includes one of a sensor, a light bulb, a power switch, an appliance, a biometric device, a medical device, a diagnostic device, a laptop, a tablet, a smartphone, a motor controller, and a security device.

In an embodiment, adding the programmed first cell to the network includes establishing a communication link between the first cell and the task pool.

In an embodiment, the CPU is further configured to submit a second task having a second task type to the task pool, and the method further includes: programming a second cell to perform the second task type; establishing a communication link between the second cell and the task pool; proactively sending a second agent from the second cell to the task pool; searching the task pool, by the second agent, for tasks of the second type; retrieving the second task from the task pool by the second agent; transporting the second task to the second cell by the second agent; processing the second task by the second cell; sending, from the second cell to the task pool, a notification that the second task has been completed; marking, by the task pool, the second task as completed; and sending, from the task pool to the CPU, a notification that the second task has been completed.

Also provided is a system for controlling distributed processing resources in an Internet of Things (IoT) computing environment. The system includes a CPU configured to partition an aggregate computing requirement into a plurality of tasks and place the tasks in a pool, and a plurality of devices, each having a unique, dedicated agent configured to proactively retrieve tasks from the pool without direct communication with the CPU.

While the foregoing description, which includes the best mode known to the inventor, enables various embodiments, it will be understood by those skilled in the art that various changes and modifications may be made, and equivalents may be substituted for various elements, without departing from the scope of the invention. Accordingly, the invention disclosed herein is not intended to be limited to the particular embodiments disclosed; rather, the invention is intended to include all embodiments falling within the literal and equivalent scope of the appended claims.

Claims

A device for parallel processing of large scale computing requirements, said device comprising:
A central processing unit (“CPU”),
A task pool in electronic communication with the CPU,
A first joint cell in electronic communication with the task pool, the first joint cell comprising a first agent configured to proactively retrieve, from the task pool and without requiring an instruction from the CPU, a matching task to be processed by the first joint cell,
Wherein the CPU populates the task pool by dividing the requirement into one or more threads and placing the threads in the task pool, each thread comprising one or more tasks, the matching task being one of those tasks,
Each task comprises a descriptor, the descriptor containing at least:
The function to be performed, and
A memory location of the data on which the function is to be performed,
The first agent is a data frame comprising a source address, a destination address and a payload,
The first agent retrieves the matching task by:
The first agent being dispatched to the task pool by the first joint cell, during which the source address is the address of the first joint cell, the destination address is the address of the task pool, and the payload comprises a list of functions that the first joint cell is configured to perform;
The first agent searching the task pool for a task that is in a processable state and that has a function the first joint cell can perform; and
The first agent returning to the first joint cell, during which the source address is the address of the task pool, the destination address is the address of the first joint cell, and the payload comprises the descriptor of the matching task.
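For illustration only, the agent recited above, a data frame whose source address, destination address, and payload change between its outbound and return trips, might be sketched as follows. The string addresses, the function names "fft" and "sort", the memory locations, and the rule that a retrieved task stops being processable are assumptions of this sketch and are not recited in the claim.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Descriptor:
    function: str         # the function to be performed
    data_location: int    # memory location of the data to operate on


@dataclass
class Task:
    descriptor: Descriptor
    ready: bool = True    # whether the task is in a processable state


@dataclass
class AgentFrame:
    source: str
    destination: str
    payload: object       # function list outbound, descriptor on return


def dispatch(cell_addr, pool_addr, functions):
    """Outbound trip: cell -> pool, carrying the functions the cell can perform."""
    return AgentFrame(cell_addr, pool_addr, list(functions))


def search_and_return(frame, tasks) -> Optional[AgentFrame]:
    """Find a ready task the cell can perform; return with its descriptor."""
    for task in tasks:
        if task.ready and task.descriptor.function in frame.payload:
            task.ready = False   # assumption: a retrieved task is no longer available
            # Return trip: the source and destination addresses are swapped.
            return AgentFrame(frame.destination, frame.source, task.descriptor)
    return None


tasks = [Task(Descriptor("fft", 0x1000)), Task(Descriptor("sort", 0x2000))]
outbound = dispatch("cell-1", "pool", ["sort"])
inbound = search_and_return(outbound, tasks)
```

Here the same frame type serves both directions; only the addresses and the payload differ between the dispatch and the return.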

The device according to claim 1, wherein the task pool notifies the CPU when a task of the thread is completed.

The device according to claim 1, wherein each task includes a task type selected from a set of task types, and the first joint cell is configured to perform tasks of one or more of the task types.

The device of claim 3, wherein the matching task is a task that is in a processable state and that has a task type the first joint cell is capable of performing.

A processing system comprising:
A task pool,
A controller configured to submit a plurality of first tasks and a plurality of second tasks to the task pool,
A first coprocessor configured to continuously retrieve a first task from the task pool without any communication between the first coprocessor and the controller, process the first task, generate first result data, and update the task pool to reflect completion of the first task, and
A second coprocessor configured to continuously retrieve a second task from the task pool without any communication between the second coprocessor and the controller, process the second task, generate second result data, and update the task pool to reflect completion of the second task,
Wherein the processing system is configured to dynamically accept the first coprocessor, the second coprocessor, and additional coprocessors into the processing system on a plug-and-play basis without communicating with the controller.

The processing system of claim 5, wherein the first task includes indicia of a first task type, the first coprocessor is configured to perform tasks of the first type, and the first agent is configured to search the task pool for tasks of the first type, and
The second task includes indicia of a second task type, the second coprocessor is configured to perform tasks of the second type, and the second agent is configured to search the task pool for tasks of the second type.

The processing system of claim 6, wherein the first coprocessor is configured to determine when it has available processing capacity and, in response to the determination, to dispatch the first agent to the task pool.

The processing system of claim 6, wherein the controller and the task pool reside on a monolithic integrated circuit (IC) and the first coprocessor and the second coprocessor do not reside on the IC.

The processing system of claim 6, wherein the controller, the task pool, and the first coprocessor and the second coprocessor reside on a monolithic integrated circuit (IC).

The processing system of claim 6, further comprising:
A first device associated with the first coprocessor, and
A second device associated with the second coprocessor,
Wherein the first device and the second device each include one of a sensor, a light bulb, a power switch, an appliance, a biometric device, a medical device, a diagnostic device, a laptop, a tablet, a smartphone, a motor controller, and a security device.