CN106537343A - System and method for parallel processing using dynamically configurable proactive co-processing cells - Google Patents


Info

Publication number
CN106537343A
Authority
CN
China
Prior art keywords
task
pool
task pool
unit
cpu
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201580039190.0A
Other languages
Chinese (zh)
Inventor
阿方索·伊尼格斯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US14/340,332 (US9852004B2)
Application filed by Individual
Publication of CN106537343A


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G06F2209/5011 Pool
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G06F2209/509 Offload

Abstract

A parallel processing architecture includes a CPU, a task pool populated by the CPU, and a plurality of autonomous co-processing cells, each having an agent configured to proactively interrogate the task pool to retrieve tasks appropriate for a particular co-processor. Each co-processor communicates with the task pool through a switching fabric, which facilitates connections for data transfer and arbitration between all system resources. Each co-processor notifies the task pool when a task or task thread is completed, whereupon the task pool notifies the CPU.

Description

System and method for parallel processing using dynamically configurable proactive co-processing cells
This application is a continuation of U.S. Application Serial No. 13/750,696, filed January 25, 2013, which is incorporated herein by reference.
Technical Field
The present invention relates generally to parallel-processing computing, and in particular to a processing architecture that involves proactive co-processors configured to actively retrieve tasks from a task pool populated by a central processing unit.
Background
The Internet of Things (also referred to as the Internet of Things cloud) refers to an ad hoc network of uniquely identifiable embedded computing devices within the existing Internet infrastructure. The Internet of Things (IoT) promises advanced connectivity of devices, systems, and services that goes beyond machine-to-machine (M2M) communication. The range of things contemplated by the IoT is unlimited and may include, for example, cardiac monitoring implants, biochip transponders, automobile sensors, aerospace and defense equipment, and field-operation equipment such as public safety devices that assist firefighters in search-and-rescue operations. Current market examples include home-based networks involving smart thermostats, light bulbs, and washers/dryers that are monitored remotely over Wi-Fi. Because of the ubiquitous nature of the objects connected to the Internet of Things, it is estimated that more than 30 billion devices will be wirelessly connected to the Internet of Things by 2020. One object of the present invention is to harness the processing power of the controllers and processors associated with these devices.
Computer processors generally execute machine-code instructions serially. To run multiple applications at the same time, a single processor interleaves instructions from the various programs and executes them serially, although from the user's point of view the applications appear to be processed in parallel. True parallel processing, or multi-core processing, on the other hand, is a computational approach that divides a large computing task into separate blocks and distributes them among two or more computers. Computing architectures that use task parallelism (parallel processing) divide a large computing demand into discrete modules of executable code. The modules are then executed simultaneously or sequentially, based on their respective priorities.
A typical multiprocessor system includes a central processing unit (CPU) and one or more co-processors. The CPU divides the computing demand into tasks and distributes those tasks to the co-processors. Completed threads are reported to the CPU, which continues to distribute additional threads to the co-processors as needed. The drawbacks of currently known multiprocessing approaches are that substantial CPU bandwidth is consumed by: distributing tasks; waiting for tasks to complete before distributing new tasks (often because of dependencies on previous tasks); responding to interrupts from co-processors when a task is completed; and responding to other messages from co-processors. Moreover, co-processors often sit idle while waiting for new tasks from the CPU.
A multiprocessor architecture is therefore needed that reduces CPU management overhead and that also processes and uses the available co-processing resources more effectively.
Summary
Various embodiments of a parallel-processing computing architecture include a CPU configured to populate a task pool, and one or more co-processors configured to proactively retrieve threads (tasks) from the task pool. Each co-processor notifies the task pool upon completing a task, and pings the task pool until another task becomes available for processing. In this way, the CPU communicates directly with the task pool, and communicates indirectly with the co-processors through the task pool.
The co-processors may operate autonomously; that is, they may interact with the task pool independently of the CPU. In a preferred embodiment, each co-processor includes an agent that interrogates the task pool to seek a task to perform. The co-processors thus work as peers with one another and with the task pool, accomplishing an aggregate computing demand by autonomously retrieving and completing individual tasks that may or may not be interrelated. By way of non-limiting example, suppose that a task B involves computing an average temperature over time. By defining a task A to include capturing temperature readings over time, and further defining task B to include obtaining the captured readings, the CPU and the various co-processors can communicate with one another via the task pool.
In various embodiments, the co-processors are referred to as autonomous, proactive peer cells. In this context, the term "autonomous" means that a co-processor may interact with the task pool without being instructed to do so by the CPU or by the task pool. The term "proactive" means that each co-processor is configured (e.g., programmed) to periodically send an agent to monitor the task pool for available tasks appropriate to that co-processor. The term "peer" means that the co-processing cells share a common objective in monitoring and executing all of the available tasks in the task pool.
Peer cells (co-processors) may be general-purpose or special-purpose processors, and thus may have the same or a different instruction set, architecture, and microarchitecture compared with the CPU or with other cells in the system. In addition, the software programs to be executed and the data to be processed may be contained in one or more memory units. In a conventional computer system, for example, a software program consists of a series of instructions that may require data to be used by the program. If the program corresponds to a media player, for instance, the data contained in memory may be compressed audio data, which is read by a co-processor and eventually played on a speaker.
Each cell in the system may be configured to communicate with the task pool, by wire or wirelessly, through a crossbar switch (also referred to as a fabric). In a purely wireless mesh topology, the radio signals themselves may constitute the fabric. In various embodiments, the co-processors may also communicate directly with the CPU. The switching fabric facilitates communication among system resources. Each cell is proactive in that it sends its agent to the task pool to obtain a task to perform whenever the cell is not processing, or whenever the cell can contribute processing cycles without impeding its normal operation. By way of non-limiting example, in the context of the Internet of Things (discussed in more detail below), a co-processor associated with a device such as a light bulb may be programmed to listen for "on" and "off" commands from a master device (such as a smartphone) as its normal operation, but its processing resources may also be harnessed through the task pool.
In the context of the various embodiments described herein, the term "agent" refers to a software module associated with a co-processor, analogous to a network packet, that interacts with the task pool to obtain available tasks appropriate for that co-processing cell. Cells may execute tasks sequentially, when a task depends on the execution of a previous task, or in parallel, when more than one cell is available and more than one matching task is available for execution. Tasks may be executed independently or collaboratively, depending on the task-thread constraints (if any) provided by the CPU. Interdependent tasks in the task pool may be logically combined. The task pool notifies the CPU when a task thread is completed. If a task thread consists of a single task, the task pool may notify the CPU when that task is completed. If a task thread consists of multiple tasks, the task pool notifies the CPU when the chain of tasks is completed. Because task threads may themselves be logically combined, the task pool may also notify the CPU only after the logically combined task threads are completed.
Those skilled in the art will appreciate that interoperability between the CPU and the co-processors may be facilitated by configuring the CPU to compose and/or structure tasks at a level of abstraction that is independent of the instruction-set architectures associated with the various co-processors, thereby allowing components to communicate at the task level rather than at the instruction level. Devices and their associated co-processors may therefore be added to a network on a "plug and play" basis. Another aspect of the invention provides interoperability within a heterogeneous array of CPUs having different instruction-set architectures.
Various features of the invention are particularly suited to networks of Internet of Things devices and sensors; heterogeneous computing environments; high-performance computing; two- and three-dimensional monolithic integrated circuits; and motor control and robotics.
Brief Description of the Drawings
The present invention will hereinafter be described in conjunction with the following drawing figures, wherein like numerals denote like elements, and in which:
Fig. 1 is a schematic block diagram of a parallel processing architecture according to an embodiment, including a CPU, memory, a task pool, and a plurality of co-processors configured to communicate via a fabric;
Fig. 2 is a schematic block diagram illustrating details of an exemplary task pool according to an embodiment;
Fig. 3 is a schematic block diagram of a network according to an embodiment, including co-processing cells and their corresponding agents interacting with a task pool;
Fig. 4 is a schematic layout of an Internet of Things network including available plug-and-play devices according to an embodiment;
Fig. 5 is a schematic layout illustrating an exemplary Internet of Things use case involving dynamic processing with nearby devices according to an embodiment;
Fig. 6 is a flow chart illustrating the operation of an exemplary parallel computing environment according to an embodiment.
Detailed Description
Various embodiments relate to parallel-processing computing systems and environments, ranging from simple switching and control functions to complex programs and algorithms, including but not limited to: data encryption; graphics, video, and audio processing; direct memory access; mathematical computation; data mining; game algorithms; Ethernet packet and other network protocol processing, including construction, reception, and transmission of data to and from external networks; financial services and business methods; search engines; Internet data streaming and other network applications; execution of internal or external software programs; and, in the context of the Internet of Things, switching on and off and/or otherwise controlling or manipulating appliances, light bulbs, consumer electronics, and the like.
Various features may be incorporated into any currently known or later-developed computer architecture. For example, parallel-processing issues relating to synchronization, data security, out-of-order execution, and main-processor interruption may be addressed using the inventive concepts described herein.
Referring now to Fig. 1, a distributed processing system 10 includes a single-core or multi-core CPU 11 and one or more peer (co-processing) cells 12A to 12N configured to communicate with a task pool 13 via a crossbar switching fabric 14. The cells 12 may also communicate with one another via the switching fabric 14 or via a separate cell bus (not shown). The CPU 11 may communicate with the task pool 13 directly or via the switching fabric 14. Each of one or more memory units 15 contains data and/or instructions. In this context, the term "instructions" includes compiled software programs executable by the CPU 11. The memory units 15, the cells 12, and the task pool 13 may use wired or wireless interconnects to communicate with the CPU and/or with one another via the switching fabric 14. In some embodiments, the CPU 11 communicates with the cells 12 only indirectly, through the task pool. In other embodiments, the CPU 11 may also communicate with the cells 12 directly, without using the task pool as an intermediary.
In some embodiments, the system 10 may include more than one CPU 11 and more than one task pool 13, in which case a particular CPU 11 may interact with a particular task pool 13, or multiple CPUs 11 may share one or more task pools 13. In addition, each cell may be configured to interact with more than one task pool 13. Alternatively, a particular cell may be configured to interact with a single designated task pool, for example in a high-performance or high-security environment.
In various embodiments, a cell may pair dynamically with a task pool, by wire (plug and play) or wirelessly (over the air), when the following three conditions are met:
1) the cell is able to communicate with the task pool by wire or wirelessly. The connection to the task pool may be made through a port on the task pool itself, or through the switching fabric connected to the task pool;
2) the task pool recognizes the agent sent by the cell as trusted, for example through input from a user with or without a password, through conventional Wi-Fi, Bluetooth, or similar pairing, manually through a graphical software program running on a smartphone or tablet, or through any other secure or non-secure method;
3) at least one available task in the task pool is compatible with the capabilities of the cell.
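The three pairing conditions above can be sketched as a single predicate. This is a minimal illustration only: the `Cell` and `TaskPool` classes, the `credential` token, and the `trusted_credentials` set are hypothetical names chosen here, not structures defined by the patent, and the trust check stands in for whatever pairing method an implementation actually uses.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    name: str
    capabilities: set          # task types this cell can execute
    reachable: bool = True     # condition 1: a wired or wireless link exists
    credential: str = ""       # hypothetical token presented by the cell's agent


@dataclass
class TaskPool:
    trusted_credentials: set                            # hypothetical trust list
    available_types: set = field(default_factory=set)   # types of unfinished tasks


def can_pair(cell: Cell, pool: TaskPool) -> bool:
    """Return True only when all three pairing conditions hold."""
    linked = cell.reachable                                       # condition 1
    trusted = cell.credential in pool.trusted_credentials         # condition 2
    compatible = bool(cell.capabilities & pool.available_types)   # condition 3
    return linked and trusted and compatible


pool = TaskPool(trusted_credentials={"paired-bulb"},
                available_types={"fft", "encrypt"})
bulb = Cell("bulb", {"encrypt"}, credential="paired-bulb")
stranger = Cell("stranger", {"encrypt"}, credential="unknown")
```

Here `can_pair(bulb, pool)` holds, while `stranger` fails the trust condition even though it is reachable and has a compatible capability.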
In a multi-processor environment with multiple task pools, the foregoing dynamic pairing conditions apply, except that a given cell may be locked or restricted to working with only one particular task pool; otherwise, a cell may connect with one or more task pools on a first-found basis, a round-robin basis, or according to any other selection scheme. Tasks in a task pool may also be assigned priorities, whereby cells give precedence to high-priority tasks and service lower-priority tasks when not otherwise occupied by higher-priority tasks.
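The priority rule described above, in which a cell services the highest-priority task it is able to run and falls back to lower priorities only when nothing more urgent is waiting, might look like the following sketch. The dictionary keys (`"type"`, `"priority"`, `"claimed"`) and the convention that a larger number means a higher priority are assumptions made for illustration.

```python
def pick_task(tasks, capabilities):
    """Return the highest-priority unclaimed task the cell can run, or None.

    `tasks` is a list of dicts with hypothetical keys "type", "priority"
    (larger means more urgent) and "claimed".
    """
    candidates = [t for t in tasks
                  if not t["claimed"] and t["type"] in capabilities]
    if not candidates:
        return None
    best = max(candidates, key=lambda t: t["priority"])
    best["claimed"] = True      # keep other cells from taking the same task
    return best


tasks = [
    {"id": "log-rotate", "type": "io", "priority": 1, "claimed": False},
    {"id": "decrypt", "type": "crypto", "priority": 9, "claimed": False},
    {"id": "thumbnail", "type": "io", "priority": 5, "claimed": False},
]
first = pick_task(tasks, {"io"})
second = pick_task(tasks, {"io"})
third = pick_task(tasks, {"io"})
```

A cell that can only run `"io"` tasks skips the more urgent `"decrypt"` task, takes `"thumbnail"` before `"log-rotate"`, and then finds nothing left to do.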
The CPU 11 may be a single-core or multi-core processor, an application processor, or a microcontroller used to execute software programs. The system 10 may be implemented in a personal computer, smartphone, tablet, or Internet appliance, in which case the CPU 11 may be any personal-computer central processor or processor cluster, such as […] or a local or remote multi-core processor of the immediate computing environment. Alternatively, the system 10 may be implemented on a supercomputer, and the CPU 11 may be a reduced instruction set computer ("RISC") processor, an application processor, a microprocessor, or the like.
In other embodiments, the system 10 may be implemented on a series of locally connected personal computers (such as a Beowulf cluster), in which case the CPU 11 may comprise the central processors of all, a subset, or one of the networked computers. Alternatively, the system 10 may be implemented on a network of remotely connected computers, in which case the CPU 11 may be a currently known or later-developed central processor for a server or mainframe. The specific manner in which the CPU 11 performs the subject parallel-processing method within the presently described system 10 may be affected by the CPU's operating system. For example, as described below, the CPU 11 may be configured for use in the system 10 by being programmed to recognize and communicate with the task pool 13 and to divide a computing demand into threads.
It is further contemplated that the system 10 may be implemented retroactively on any computer or computer network having an operating system that can be modified or otherwise configured to implement the functionality described herein. As is known in the art, the data to be processed is contained in the memory units 15, for example in addressable regions or partitions of random-access or read-only memory, in cache memory for the CPU 11, or in other forms of data storage such as flash memory and magnetic storage. The memory units 15 contain the data to be processed and the locations where the results of processing are placed. Not every task requires access to the memory units 15; examples include smart meters and automobile instruments, which may return data to the system 10, and robots and motor controllers, which may cause a machine to actuate.
Each cell 12 is conceptually or logically an independent computing unit capable of running one or more tasks/threads. A cell 12 may be a microcontroller, a microprocessor, an application processor, a "dumb" switch, or a stand-alone computer, such as a machine in a Beowulf cluster.
A cell 12 may be a general-purpose or special-purpose co-processor configured to supplement the functions of the CPU, to perform all of them, or to perform a narrow range of them, or to perform functions that are external to the CPU 11, such as environmental monitoring and robotic actuation. A special-purpose processor may be a dedicated hardware module that is designed, programmed, or otherwise configured to perform a specialized task, or it may be a general-purpose processor configured to perform a specialized task such as graphics processing, floating-point arithmetic, or data encryption.
In an embodiment, any cell 12 that is a special-purpose processor may also be configured to access and write to memory and to execute descriptors, as described below, as well as other software programs.
Furthermore, any number of cells 12 may constitute a heterogeneous computing environment; that is, a system that uses more than one type of processor, such as a mix of AMD-based and/or Intel-based processors, or of 32-bit and 64-bit processors.
Each cell 12 is configured to perform one or more specialized tasks, as illustrated by the following sequence of events. During the polling phase, each cell periodically sends an agent to the task pool until a matching task is found. To facilitate this matching, the cell and the task pool may be equipped with transceivers. In the case of the task pool, the transceiver may be located in the task pool itself or in the switching fabric connected to the task pool. When a task match is found in the task pool, the task pool sends an acknowledgement to the cell. The next step is the "communication channel" phase, in which the cell receives the task and begins executing it. In one embodiment, once the first task is completed, the communication channel is maintained so that the cell can capture further tasks without repeating the "polling" and "acknowledgement" phases.
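The polling, acknowledgement, and communication-channel phases described above can be traced with a small simulation. This is a sketch under simplifying assumptions: the pool is reduced to a list of task-type strings, `max_polls` is an invented limit so that the example terminates, and no real transceiver or timing behaviour is modelled.

```python
def serve(cell_types, pool_tasks, max_polls=3):
    """Trace the phases a cell goes through; returns the list of events.

    `pool_tasks` is a list of task-type strings still waiting in the pool
    (a hypothetical, simplified stand-in for the task pool).
    """
    events = []
    polls = 0
    channel_open = False
    while True:
        match = next((t for t in pool_tasks if t in cell_types), None)
        if match is None:
            if channel_open or polls >= max_polls:
                return events               # nothing left, or polling gave up
            polls += 1
            events.append("poll")           # agent sent, no match yet
            continue
        if not channel_open:
            events.append("poll")           # the poll that finds a match
            events.append("acknowledge")    # pool confirms the match
            channel_open = True             # communication channel phase begins
        pool_tasks.remove(match)
        events.append(f"execute:{match}")   # channel is kept open for more tasks


trace = serve({"a"}, ["a", "b", "a"])
```

Because the channel stays open after the first match, the second `"a"` task is executed without a further poll or acknowledgement.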
The system 10 may include a plurality of cells, some of which are able to perform the same task types as other cells, thereby creating redundancy in the system 10. The set of task types performed by a given cell 12 may be a subset of the set of task types performed by another cell. For example, in Fig. 1, the system 10 may divide an aggregate computational problem into a group of tasks, populating the task pool 13 with tasks of a first type, a second type, and a third type. A first cell 12A may be able to perform only tasks of the first type; a second cell 12B may be able to perform tasks of the second type; a third cell 12C may be able to perform tasks of the third type; a fourth cell 12D may be able to perform tasks of the second or third type; and a fifth cell 12N may be able to perform all three task types. The system 10 may be configured with this redundancy so that if a given cell is removed from the system 10 (or is currently busy or otherwise unavailable), the system 10 can continue to operate seamlessly. Moreover, if a cell is dynamically added to the system 10, the system 10 can continue to operate seamlessly with the benefit of increased performance.
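The redundancy in this five-cell example can be checked mechanically. The sketch below encodes the capabilities of cells 12A to 12N exactly as listed in the paragraph above; the function names `cells_for` and `still_covered` are illustrative and not taken from the patent.

```python
# Task types (1, 2, 3) each cell can perform, as in the example above.
CELL_CAPABILITIES = {
    "12A": {1},
    "12B": {2},
    "12C": {3},
    "12D": {2, 3},
    "12N": {1, 2, 3},
}


def cells_for(task_type, capabilities):
    """List the cells able to execute a given task type."""
    return sorted(name for name, types in capabilities.items()
                  if task_type in types)


def still_covered(removed, capabilities, task_types=(1, 2, 3)):
    """Check that every task type keeps at least one capable cell
    after the cell named `removed` leaves the system."""
    remaining = {n: t for n, t in capabilities.items() if n != removed}
    return all(cells_for(tt, remaining) for tt in task_types)
```

With this configuration, removing any single cell still leaves at least one cell for every task type, which is the seamless-operation property the paragraph describes.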
Referring now to Figs. 1 and 2, the task pool 13 may occupy a region of physical memory accessible by the CPU 11. Alternatively, the task pool 13 may be accessible by MAC address or IP address. Multiple embodiments are envisioned for the task pool 13: it may be physically located with the CPU in the same 2D or 3D monolithic IC, or it may be implemented as a stand-alone IC physically interconnected to a computer board, smartphone, tablet, router, or Internet of Things device. In yet another embodiment, the task pool may be a stand-alone, multi-port, wired and/or wirelessly connected device that is dedicated to a given CPU 11 or shared among multiple CPU 11 systems. The task pool 13 may also be addressable by the cells 12. The task pool 13 may be implemented in a dedicated hardware block to provide maximum access speed for the CPU 11 and the cells 12. Alternatively, the task pool 13 may be software-based, in which case, as in the hardware-based embodiment, the contents of the task pool 13 are stored in memory but are represented by data structures.
When populated by the CPU 11, the task pool 13 contains one or more task threads 21. Each task thread 21 represents a computing task that may be a component or subset of a larger aggregate computing demand imposed on the CPU 11. In one embodiment, the CPU 11 may initialize the task pool 13 and then populate it with concurrently executable threads 21. Each thread 21 may include one or more discrete tasks 22. A task 22 may have a task type and a descriptor. The task type indicates which cells 12 are able to execute the task 22. The task pool 13 may also use the task type to prioritize tasks 22 of the same type. In one embodiment, the task pool 13 may maintain a priority table (not shown) that records the cells 12 present in the system 10, the types of tasks 22 each cell can execute, and whether each cell is currently processing a task. As described below, the task pool 13 may use the priority table to determine which eligible task 22 to assign to a requesting cell.
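For the software-based embodiment, the thread, task, and priority-table structure described above could be represented roughly as follows. This is a sketch, not the patent's layout: the class and field names, the use of a dictionary for the priority table, and the task type strings are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    task_id: str
    task_type: str        # indicates which cells can execute the task
    descriptor: dict      # what to execute and where the data lives
    done: bool = False


@dataclass
class Thread:
    thread_id: str
    tasks: list = field(default_factory=list)


@dataclass
class TaskPool:
    threads: list = field(default_factory=list)
    # Priority table: cell name -> executable task types and a busy flag.
    priority_table: dict = field(default_factory=dict)

    def register_cell(self, name, types):
        self.priority_table[name] = {"types": set(types), "busy": False}

    def eligible_tasks(self, cell_name):
        """Unfinished tasks whose type the requesting cell can execute."""
        entry = self.priority_table[cell_name]
        if entry["busy"]:
            return []
        return [t for th in self.threads for t in th.tasks
                if not t.done and t.task_type in entry["types"]]


pool = TaskPool()
pool.threads.append(Thread("21A", [Task("22A", "capture", {}),
                                   Task("22B", "average", {})]))
pool.register_cell("12A", ["capture"])
```

A pool consulted on behalf of cell `12A` offers only task `22A`, and offers nothing while the table marks the cell as busy.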
In some embodiments, the CPU 11 may retrieve and execute tasks or threads from the task pool. In addition, the CPU 11 may interrupt any task that is determined to be stale, corrupted, blocked, or erroneous. In that case, the CPU 11 may update the task so that it is available for subsequent processing. Nothing prevents the CPU 11 from implementing adaptive task management, as artificial intelligence may require, for example, whereby the CPU 11 may add, remove, or modify tasks in an existing, uncompleted thread 21.
A descriptor may contain one or more of: the specific instruction to be executed, the execution mode, the location (e.g., address) of the data to be processed, and the placement location (if any) of the task result. The result placement location is optional, as in the case of animation and multimedia tasks, which typically present results to a display rather than storing them in memory. In addition, task descriptors may be linked together, as in a linked list, so that the data to be processed can be accessed with fewer memory calls than if the descriptors were not linked. In one embodiment, the descriptor is a data structure containing a header and a plurality of reference pointers to memory locations, and the task 22 contains the memory address of the data structure. The header defines the function or instruction to be executed. A first pointer references the location of the data to be processed. A second, optional pointer references the placement location of the processed data. If the descriptor is linked to another descriptor to be executed serially, the descriptor may contain a third pointer that references the next descriptor. In an alternative embodiment in which the descriptor is a data structure, the task 22 may contain the complete data structure.
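The header-plus-pointers layout described above maps naturally onto a linked record. In the sketch below, integer keys into a dictionary stand in for memory addresses, and the `ops` table that maps header names to functions is a hypothetical convenience; neither is prescribed by the patent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Descriptor:
    header: str                                 # function or instruction to execute
    src: int                                    # first pointer: location of the input data
    dst: Optional[int] = None                   # second, optional pointer: result location
    next_desc: Optional["Descriptor"] = None    # third pointer: next descriptor in the chain


def run_chain(first: Descriptor, memory: dict, ops: dict) -> int:
    """Execute a chain of descriptors serially; return how many ran."""
    count = 0
    d = first
    while d is not None:
        result = ops[d.header](memory[d.src])
        if d.dst is not None:       # the result placement is optional
            memory[d.dst] = result
        d = d.next_desc
        count += 1
    return count


memory = {0x10: [20.5, 21.0, 22.5]}
ops = {"sum": sum, "halve": lambda x: x / 2}
second = Descriptor("halve", src=0x20, dst=0x30)
first = Descriptor("sum", src=0x10, dst=0x20, next_desc=second)
ran = run_chain(first, memory, ops)
```

Following the third pointer lets a cell process both descriptors from a single hand-off, which is the saving in memory calls that the linked form is meant to provide.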
A thread 21 may also include a "recipe" describing the order in which the tasks 22 are to be executed and any conditions that affect the order of execution. According to the recipe, the tasks 22 may be executed sequentially, simultaneously, out of order, interdependently, or conditionally according to Boolean operations. For example, as shown in Fig. 2, thread 21A includes four tasks: 22A, 22B, 22C, and 22D. In the illustrated embodiment, the first task 22A must be completed before the second task 22B or the third task 22C can begin. According to the recipe, once either the second task 22B or the third task 22C is completed, the fourth task 22D may begin.
Threads 21 may also be interdependent. For example, as shown in Fig. 2, owing to a Boolean operation in thread 21B, the completion of task 22C may allow the processing of tasks in thread 21B to continue. The task pool 13 may lock a task 22 while that task is waiting for the completion of another task 22 on which it depends. A locked task 22 cannot be acquired by a cell. When the tasks 22 of a thread 21 are completed, the task pool 13 may notify the CPU 11 of the completion. The CPU may then advance processing beyond the completed thread 21.
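The recipe for thread 21A and the locking rule can be expressed as Boolean predicates over the set of completed tasks. The sketch below uses the task names and dependencies stated in the text (22A before 22B or 22C; 22D after either 22B or 22C); the dictionary-of-lambdas encoding and the `notify_cpu` callback are illustrative assumptions.

```python
# Recipe for thread 21A: each task maps to a predicate over the set of
# completed task names. A task is locked while its predicate is False.
RECIPE_21A = {
    "22A": lambda done: True,
    "22B": lambda done: "22A" in done,
    "22C": lambda done: "22A" in done,
    "22D": lambda done: "22B" in done or "22C" in done,
}


def unlocked(recipe, done):
    """Tasks that are not finished and whose dependencies are satisfied."""
    return sorted(t for t, ready in recipe.items()
                  if t not in done and ready(done))


def complete(recipe, done, task, thread_id, notify_cpu):
    """Mark a task done; notify the CPU once the whole thread is finished."""
    done.add(task)
    if done == set(recipe):
        notify_cpu(thread_id)


notifications = []
done = set()
start = unlocked(RECIPE_21A, done)
complete(RECIPE_21A, done, "22A", "21A", notifications.append)
after_a = unlocked(RECIPE_21A, done)
complete(RECIPE_21A, done, "22C", "21A", notifications.append)
after_c = unlocked(RECIPE_21A, done)
```

Only 22A is obtainable at the start; finishing it unlocks 22B and 22C, and finishing 22C alone is enough to unlock 22D. The CPU is notified only after all four tasks are done.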
The cells advantageously remain peers of one another and of the CPU 11, thereby helping the system 10 perform complex computations by autonomously and proactively retrieving tasks from the task pool 13. The cells 12 operate autonomously in that they can act independently of the CPU 11 and of any other co-processor. Alternatively, a cell may be acted upon or instructed directly by the CPU. Each cell acts proactively in that it seeks a task 22 from the task pool 13 as soon as the cell becomes available for further processing.
More specifically, in one embodiment, a cell 12 obtains a task from the task pool by sending an agent 30 to interrogate (search) the task pool and retrieve an available task 22, that is, a task that needs to be completed, is not locked, and has a task type the cell can execute. Typically, the system 10 has the same number of agents as cells. In this context, an agent is broadly analogous to a data frame in the networking sense, in that an agent may be equipped with a source address, a destination address, and a payload. In an embodiment, the destination address is the address of the task pool 13 when the agent 30 is seeking a task 22, and the address of the corresponding cell 12 when the agent 30 is returning to its cell with a task 22. Correspondingly, the source address is the address of the cell 12 when the agent 30 is seeking a task 22, and the address of the task pool 13 when the agent 30 is returning to its cell with a task 22.
Furthermore, the source and destination addresses may facilitate frame synchronization. That is, the system 10 may be configured to distinguish unambiguously between addresses and payload data, so that when the contents of an agent 30 are read, the destination address indicates the beginning of the frame and the source address indicates the end of the frame, or vice versa. This allows the payload, which is placed between the addresses, to vary in size. In another embodiment with a variable-size payload, the agent 30 may include a header that indicates the size of the payload. The header information may be compared with the payload to verify data integrity. In yet another embodiment, the payload may be of fixed length. When an agent 30 is dispatched to the task pool 13 by its cell, the payload contains identifying information about the types of tasks the cell 12 can perform. When the agent 30 returns from the task pool 13, the payload contains the descriptor of the task 22, either in the form of a memory location or in the form of the complete descriptor data structure.
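The agent-as-frame idea, with addresses that swap between the outbound and return trips and that delimit a variable-length payload, can be sketched as follows. The address strings, the `|` separator, and the text encoding are all hypothetical; the patent does not specify a wire format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    source: str        # address of the sender
    destination: str   # address of the receiver
    payload: tuple     # task types on the way out, a descriptor on the way back


def outbound(cell_addr, pool_addr, task_types):
    """Agent dispatched by a cell: the payload advertises what the cell can run."""
    return Agent(cell_addr, pool_addr, tuple(task_types))


def inbound(request: Agent, descriptor):
    """Agent returning from the pool: addresses swap, payload carries the descriptor."""
    return Agent(request.destination, request.source, (descriptor,))


def encode(agent: Agent) -> str:
    """Frame the agent as destination|payload...|source so that the addresses
    delimit a payload of any length (a hypothetical text encoding)."""
    return "|".join([agent.destination, *map(str, agent.payload), agent.source])


req = outbound("cell-12A", "pool-13", ["capture", "average"])
resp = inbound(req, "descriptor@0x4000")
```

In this encoding the destination address opens the frame and the source address closes it, so a reader can recover the payload whatever its length.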
In another embodiment, some or all of the agents 30 are autonomous representatives of their respective cells 12. That is, each agent 30 may be dispatched by its corresponding cell 12 to retrieve a task 22 whenever the cell is idle or able to perform additional processing. In this way the processing capacity of the cells 12 can be used more fully, because the cells need not wait idly for instructions from the CPU 11. This approach has the additional advantage of reducing CPU overhead by relieving the CPU of the need to send requests to the cells to retrieve tasks from the task pool. These advantages make the system 10 more efficient than conventional computer architectures, in which auxiliary modules and co-processors depend on instructions from the main CPU.
Additionally, equivalent cell 12A to 12n is contradiction for the concrete composition of thread itself.Conversely, agency is concerned only with and looks for To matching between the ability and the available task 22 that will be completed in task pool 13 of its corresponding units.As long as that is, appointing There is available task 22, and the ability of 22 matching unit of available task in pond 13 in business, then system can effectively utilize unit Disposal ability.
Some of equivalent cell 12A to 12n all can be worked independently of one another, or can pass through switching construction 14, lead to Cross task pool 13 or communicated with one another to wake up another equivalent cell to help process, move according to the order from CPU or request Or send data.In one embodiment, act on behalf of 30A may search for the task type and unit 12A of ready task 22 can Matching between the type of the task of execution.The framework can relate to the hard volume that CPU 11 is configured to the type of creating for task Code.Therefore, if task pool 13 includes the task 22 of three types, and big calculating demand includes the task of the 4th type, Then the task of the 4th type can be not placed in task pool 13, even if the task of being able to carry out the 4th type is included in being System 10 in or be added in system 10.Therefore, CPU 11 can be configured to " study " or be taught how to create the 4th type Task, more fully to utilize available process resource.
In another embodiment, the agent 30A searches the descriptors of the tasks 22 for an executable instruction that matches one of the instructions the unit 12A can execute. When a matching task 22 is found, the agent 30A delivers the descriptor of the matching task 22 to the unit 12A, whereupon the unit 12A begins processing the task 22. Specifically, the agent 30A may deliver the memory address of the descriptor to the unit 12A, and the unit 12A then retrieves the data structure from memory. Alternatively, where the complete data structure of the descriptor is included in the task 22, the agent 30A may deliver the complete data structure to the unit 12A for processing. The descriptor tells the unit 12A which instruction to execute, where in the memory unit 15 the data to be processed can be found, and where in the memory 15 the result is to be placed. On completing the task 22, the unit 12A notifies the task pool 13 to change the state of the selected task 22 from "to be completed" to "completed". In addition, once the unit 12A has completed the task 22, the unit can dispatch its agent 30A to the task pool 13 to search for another task 22.
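The descriptor-driven flow in this paragraph can be sketched in Python as follows. The field names, the intermediate "locked" state, and the use of a dictionary as a stand-in for the memory unit are assumptions for illustration, not details taken from the description.

```python
from dataclasses import dataclass

@dataclass
class Descriptor:
    instruction: str          # which instruction the unit should execute
    src: str                  # where the input data is located in memory
    dst: str                  # where the result is to be placed
    state: str = "to be completed"

def find_match(pool, supported):
    # The agent looks for a pending task whose instruction the unit supports.
    for d in pool:
        if d.state == "to be completed" and d.instruction in supported:
            return d
    return None

def run_once(unit_instructions, pool, memory):
    d = find_match(pool, unit_instructions)
    if d is None:
        return False
    d.state = "locked"        # assumed: keeps other agents from taking it
    memory[d.dst] = unit_instructions[d.instruction](memory[d.src])
    d.state = "completed"     # the unit notifies the pool on completion
    return True

memory = {"in0": [1, 2, 3]}   # stand-in for memory unit 15
pool = [Descriptor("sort", "in0", "out1"), Descriptor("sum", "in0", "out0")]
unit_12a = {"sum": sum}       # this hypothetical unit supports only "sum"
first = run_once(unit_12a, pool, memory)
second = run_once(unit_12a, pool, memory)
```

Here the "sort" task stays pending because no instruction of the unit matches it, which mirrors the matching rule described above.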
Some or all of the agents 30A to 30n may traverse the system 10 by wire or wirelessly (for example, using a Wi-Fi network, wireless Ethernet, wireless USB, a wireless bridge, a wireless repeater, a wireless router, or Bluetooth pairing), depending on the particular architecture and/or implementation of the system 10. In an embodiment, an agent 30 can be directed wirelessly to the task pool 13 by including a receiver feature in the task pool 13 and a transmitter feature in the unit 12. Similarly, the task pool can respond wirelessly to a unit by equipping the task pool with a transmitter and the unit with a receiver. In this way, the units can communicate wirelessly with the task pool with or without a switching fabric.
In a preferred embodiment, however, some form of switching fabric 14 is used. The switching fabric 14 facilitates connections for data transfer and arbitration between system resources. The switching fabric 14 may be a router or crossbar switch that provides connectivity between the various units and the task pool. The switching fabric 14 may also provide connectivity between each unit 12A to 12n and system resources such as the CPU 11, the memory unit 15, and legacy system components including, but not limited to, direct memory access units, transmitters, hard disks and their controllers, displays and other input/output devices, and other coprocessors. The units 12A to 12n may be physically connected to the switching fabric 14, or they may be connected wirelessly.
Connecting units wirelessly to the system 10 facilitates the dynamic addition and/or removal of units used in the system 10. For example, the CPU 11 can recruit units from other such systems, allowing dynamic expansion and improved performance. In this way, two or more such systems (for example, networks) can share units. In one embodiment, a unit that becomes idle can seek out and/or be recruited by another system that needs additional processing resources, that is, one that has available processing tasks needing completion. Similarly, the system 10 can scale its performance by incorporating clusters of additional units dedicated to specific tasks. For example, the system 10 can enhance the performance of encryption/decryption functions, or the processing of audio and/or video data, by incorporating nearby units capable of performing those tasks.
To prevent undesirable connections, the CPU 11 can provide the task pool 13 with a list identifying trusted and/or untrusted units, together with authentication requirements or protocols, or alternatively with criteria for identifying trusted and/or untrusted units. In addition, the task pool itself can exclude particular units on the basis of poor performance, unreliable connections, poor data throughput, or malicious or otherwise improper behavior. In various embodiments, a unit 12 can be added to or excluded from the task pool 13 by a user through a smartphone, tablet, or other device or application. In one embodiment, a graphical application interface can present the user with useful statistical and/or iconic information, such as the locations of available units and other devices, and the performance gains or trade-offs that result from adding a particular unit to, or removing it from, the network.
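One possible way to picture this admission control is the short Python sketch below, in which the task pool admits only listed units and excludes a unit after repeated failures. The class name, the failure threshold, and the unit identifiers are hypothetical; the description leaves the actual criteria and authentication protocol open.

```python
from dataclasses import dataclass, field

@dataclass
class PoolAccess:
    trusted: set                       # list supplied by the CPU
    max_failures: int = 3              # assumed exclusion threshold
    blocked: set = field(default_factory=set)
    failures: dict = field(default_factory=dict)

    def admit(self, unit_id: str) -> bool:
        # A unit is admitted only if it is trusted and not excluded.
        return unit_id in self.trusted and unit_id not in self.blocked

    def report_failure(self, unit_id: str) -> None:
        # The pool itself excludes units that misbehave or perform poorly.
        self.failures[unit_id] = self.failures.get(unit_id, 0) + 1
        if self.failures[unit_id] >= self.max_failures:
            self.blocked.add(unit_id)

access = PoolAccess(trusted={"phone-508", "bulb-506"})
admitted_before = access.admit("bulb-506")
for _ in range(3):
    access.report_failure("bulb-506")
admitted_after = access.admit("bulb-506")
```

A user-facing application could call the same interface to add or remove units manually.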
In an alternative embodiment, some or all of the co-processing units can be connected directly to the task pool 13, for example by a wired configuration of the switching fabric 14 used for communication. Wired connection of units may also facilitate dynamic expansion and contraction of the system 10, similar to the wireless configuration described above, although a wired connection may require physical (for example, manual) integration and extraction of peripheral devices. In either case, the scalability of the system is greatly enhanced compared with conventional parallel processing schemes, because coprocessors can be added and removed without reprogramming the CPU 11 to account for the changes to the system 10.
Referring now to Fig. 3, a network 300 includes a CPU 302, a first memory 304, a second memory 306, a task pool 308, a switching fabric 310, a first co-processing unit 312 configured to perform (run) type A tasks, a second unit 314 configured to perform type B tasks, a third unit 316 configured to perform type C tasks, and a fourth unit 318 configured to perform both type A and type B tasks. As described above, the task pool 308 is populated (for example, by the CPU 302) with tasks (or task threads) 330 and 332 of task type A, tasks 334 and 336 of task type B, and tasks 340 and 342 of task type C. In an embodiment, each unit preferably has a unique, dedicated agent. Specifically, unit 312 includes agent 320; unit 314 includes agent 322; unit 316 includes agent 324; and unit 318 includes agent 326. Each agent preferably includes an information field or header identifying the type of task its associated unit is configured to perform, for example a single task type or a combination of task types A, B, and C.
During operation, when a unit is idle or otherwise has available processing capacity, its agent proactively interrogates the task pool to determine whether any task in the task queue is suitable for that particular unit. For example, unit 312 can dispatch its agent 320 to retrieve one or both of tasks 330 and 332, which correspond to task type A. Similarly, unit 314 can dispatch its agent 322 to retrieve task 334 or 336 (according to their respective priorities), which correspond to task type B, and so on. For a unit capable of performing more than one task type, such as unit 318, which is configured to perform task types A and B, agent 326 can retrieve any of tasks 330, 332, 334, and/or 336.
Once a task has been retrieved from the task pool, the unit can then process it, typically by retrieving data from a particular location in the first memory 304, processing the data, and storing the processed data at a particular location in the second memory 306. When the task is completed, the unit notifies the task pool, the task pool marks the task as completed, and the task pool notifies the CPU that the task is complete. Alternatively, the task pool can notify the CPU when a task thread has been completed, since a task thread may comprise a single task, a string of tasks, or a Boolean combination of tasks. Importantly, the retrieval of tasks and the processing of data by the units can take place without any direct communication between the CPU and the units.
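The Fig. 3 arrangement can be mimicked with a small, single-threaded Python simulation. The reference numerals are reused as dictionary keys for readability; the `TaskPool` class, the "ready"/"locked"/"completed" state names, and the doubling operation that stands in for real processing are all assumptions of this sketch.

```python
class TaskPool:
    def __init__(self, tasks, notify_cpu):
        self.tasks = {tid: {"type": t, "state": "ready"} for tid, t in tasks.items()}
        self.notify_cpu = notify_cpu

    def retrieve(self, capable_types):
        # Called by an agent: hand out the first ready task of a matching type.
        for tid, task in self.tasks.items():
            if task["state"] == "ready" and task["type"] in capable_types:
                task["state"] = "locked"
                return tid
        return None

    def complete(self, tid):
        self.tasks[tid]["state"] = "completed"
        self.notify_cpu(tid)           # the pool, not the unit, informs the CPU

tasks = {330: "A", 332: "A", 334: "B", 336: "B", 340: "C", 342: "C"}
units = {312: {"A"}, 314: {"B"}, 316: {"C"}, 318: {"A", "B"}}
first_memory = {tid: tid for tid in tasks}    # input data, one item per task
second_memory = {}                            # processed results
cpu_log = []
pool = TaskPool(tasks, cpu_log.append)

progress = True
while progress:                               # each pass lets every unit poll once
    progress = False
    for unit_id, types in units.items():
        tid = pool.retrieve(types)
        if tid is not None:
            second_memory[tid] = (unit_id, first_memory[tid] * 2)
            pool.complete(tid)
            progress = True
```

Note that the CPU appears only as the recipient of completion notices; no unit is ever addressed by it directly.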
Referring now to Fig. 4, an Internet of Things network 400 includes a controller (CPU) 402, a task pool 408, and various devices 410 to 422, some or all of which include an associated or embedded microcontroller, such as an integrated circuit (IC) chip or other component that provides processing capability. As non-limiting examples, the devices may include a light bulb 410, a thermostat 412, an electrical outlet 414, a power switch 416, an appliance (for example, a toaster) 418, a vehicle 420, a keyboard 422, and virtually any other plug-and-play device or application capable of interacting with the network.
In the illustrated embodiment, the controller 402 can be a smartphone, tablet, laptop, or other device that may include a display 404 and a user interface (for example, a keyboard) 406 to facilitate user interaction with the various devices in the network. To the extent that the processing capability (for example, bandwidth) of the controller 402 is insufficient to support the network, the controller can effectively acquire or recruit processing resources from peripheral devices via the task pool, as explained with reference to Fig. 5.
Referring now to Fig. 5, an Internet of Things network 500 use case illustrates the dynamic utilization of nearby (or otherwise available) devices. The network 500 includes a main control unit 502 (for example, a laptop, tablet, or gaming device), a task pool 504, a first coprocessor device 506, and a second coprocessor device 508. An exemplary use case in the context of the network 500 will now be described.
Suppose a user is playing a video game on her laptop computer 502. The video game requires processing power for detailed computer-generated imagery. The laptop computer 502 may be sufficient to render a single realistic-looking character, but when a second character is introduced on screen, image quality deteriorates and the characters' movement is no longer continuous. The present invention proposes a method of harnessing the processing capability of underutilized computing resources that are near the user or otherwise available to the user.
To address the need for additional processing power, the laptop computer 502 connects to the task pool 504. In this regard, the laptop computer may itself be equipped with a task pool, or the task pool may reside in an external device or application within wireless range of the laptop computer 502. In the case of an external task pool, the task pool itself can perform the role of a switching fabric with ports, allowing multiple co-processing units to connect to it. The laptop computer 502 populates the task pool 504 with computation-intensive tasks. A nearby underutilized device (such as the smartphone 508) then connects to the task pool 504 and sends its agent to retrieve tasks of a matching type. The smartphone 508 thus becomes a seamless coprocessor assisting the laptop computer 502, enhancing the video game experience. The same approach can be repeated wherever other underutilized processing resources are available and needed. Indeed, even the processing capability available in the light bulb 506 can become a coprocessor for the laptop computer.
Fig. 6 is a flow chart illustrating the operation of an exemplary parallel computing environment. Specifically, method 600 includes: populating a task pool with tasks (step 602); proactively dispatching one or more agents from one or more corresponding units to the task pool (step 604); retrieving and processing tasks (step 606); and notifying the task pool and the CPU that the task thread has been executed (step 608). The method also includes dynamically incorporating additional devices into the network as needed (step 610).
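The steps of method 600 can be sketched with Python threads standing in for independent devices. This is only an illustration under stated assumptions: the `Pool` class, the device names, and the use of a lock to serialize access to the pool are not taken from the description.

```python
import threading

class Pool:
    def __init__(self):
        self._lock = threading.Lock()
        self._ready = []
        self.done = []

    def populate(self, tasks):                 # step 602
        with self._lock:
            self._ready.extend(tasks)

    def retrieve(self, types):                 # steps 604 and 606
        with self._lock:
            for i, (_, task_type) in enumerate(self._ready):
                if task_type in types:
                    return self._ready.pop(i)
        return None

    def notify_done(self, task_id, device):    # step 608
        with self._lock:
            self.done.append((task_id, device))

def device_loop(name, types, pool):
    # Each device keeps dispatching its agent until no matching task remains.
    while (task := pool.retrieve(types)) is not None:
        pool.notify_done(task[0], name)

pool = Pool()
pool.populate([(i, "A" if i % 2 else "B") for i in range(20)])
devices = [threading.Thread(target=device_loop, args=("laptop", {"A"}, pool))]
# Step 610: an additional device joins simply by connecting to the pool.
devices.append(threading.Thread(target=device_loop, args=("smartphone", {"A", "B"}, pool)))
for d in devices:
    d.start()
for d in devices:
    d.join()
```

Which device ends up with a given type A task depends on thread scheduling, but every task is completed exactly once because retrieval removes it from the ready list under the lock.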
Accordingly, a processing system is provided that includes a task pool; a controller configured to populate the task pool with a first task; and a first coprocessor configured to proactively retrieve the first task from the task pool.
In an embodiment, the first coprocessor includes a first agent configured to retrieve the first task from the task pool without communicating with the controller.
In an embodiment, the first task includes an indicium of a first task type, the first coprocessor is configured to perform tasks of the first type, and the first agent is configured to search the task pool for tasks of the first type.
In an embodiment, the first coprocessor is further configured to process the first task and to notify the task pool when the first task is completed, and the task pool is configured to notify the controller when the first task is completed.
In an embodiment, the controller and the first coprocessor are configured to communicate with each other only through the task pool.
In an embodiment, the controller and the first coprocessor are configured to communicate with each other both directly and through the task pool.
In an embodiment, the first coprocessor is configured to determine that it has available processing capacity and, in response to that determination, to dispatch the agent to the task pool.
In an embodiment, the controller is further configured to populate the task pool with a second task, and the system further includes a second coprocessor having a second agent configured to proactively retrieve the second task from the task pool.
In an embodiment, the second task includes an indicium of a second task type, the second coprocessor is configured to perform tasks of the second type, and the second agent is configured to search the task pool for tasks of the second type.
In an embodiment, the controller and the task pool reside on a monolithic integrated circuit (IC), and the first coprocessor does not reside on the IC.
In another embodiment, the controller, the task pool, and the first and second coprocessors reside on a monolithic integrated circuit (IC).
A method is also provided for dynamically controlling processing resources in a network of the type that includes a central processing unit (CPU) configured to populate a task pool with a first task having a first task type. The method includes the following steps: programming a first unit to perform the first task type; adding the programmed first unit to the network; proactively sending a first agent from the first unit to the task pool; the first agent searching the task pool for a task of the first type; the first agent retrieving the first task from the task pool; the first agent transporting the first task to the first unit; the first unit processing the first task; and sending a notification that the first task is completed from the first unit to the task pool.
In an embodiment, the method further includes: the task pool marking the first task as completed; and sending a notification that the first task is completed from the task pool to the CPU.
In an embodiment, the method further includes configuring the first unit to determine that it has available processing capacity as a predicate to proactively sending the first agent to the task pool.
In an embodiment, the method further includes integrating the first unit into a first device before adding the programmed first unit to the network.
In an embodiment, the first device includes one of a sensor, a light bulb, a power switch, an appliance, a biometric device, a medical device, a diagnostic device, a laptop, a tablet, a smartphone, a motor controller, and a security device.
In an embodiment, adding the programmed first unit to the network includes establishing a communication link between the first unit and the task pool.
In an embodiment, the CPU is further configured to populate the task pool with a second task having a second task type, and the method further includes the following steps: programming a second unit to perform the second task type; establishing a communication link between the second unit and the task pool; proactively sending a second agent from the second unit to the task pool; the second agent searching the task pool for a task of the second type; the second agent retrieving the second task from the task pool; the second agent transporting the second task to the second unit; the second unit processing the second task; sending a notification that the second task is completed from the second unit to the task pool; the task pool marking the second task as completed; and sending a notification that the second task is completed from the task pool to the CPU.
A system is also provided for controlling distributed processing resources in an Internet of Things (IoT) computing environment, the system including: a CPU configured to divide an aggregate computing requirement into a plurality of tasks and place the tasks in a pool; and a plurality of devices, each device having a unique dedicated agent configured to proactively retrieve tasks from the pool without communicating directly with the CPU.
While an enabling description of various embodiments, including the best mode known to the inventor, has been presented, those skilled in the art will understand that various changes and modifications may be made, and equivalents may be substituted for various elements, without departing from the scope of the invention. The invention is therefore not intended to be limited to the specific embodiments disclosed herein, but is to include all embodiments falling within the literal and equivalent scope of the claims.

Claims (20)

1. A processing system, comprising:
a task pool;
a controller configured to populate the task pool with a first task; and
a first coprocessor configured to proactively retrieve the first task from the task pool.
2. The processing system of claim 1, wherein the first coprocessor includes a first agent configured to retrieve the first task from the task pool without communicating with the controller.
3. The processing system of claim 2, wherein the first task includes an indicium of a first task type, the first coprocessor is configured to perform tasks of the first type, and the first agent is configured to search the task pool for tasks of the first type.
4. The processing system of claim 1, wherein the first coprocessor is further configured to process the first task and to notify the task pool when the first task is completed.
5. The processing system of claim 1, wherein the task pool is configured to notify the controller when the first task is completed.
6. The processing system of claim 1, wherein the controller and the first coprocessor are configured to communicate with each other only through the task pool.
7. The processing system of claim 1, wherein the controller and the first coprocessor are configured to communicate with each other both directly and through the task pool.
8. The processing system of claim 2, wherein the first coprocessor is configured to determine that it has available processing capacity and, in response to the determination, to dispatch the agent to the task pool.
9. The processing system of claim 3, wherein the controller is further configured to populate the task pool with a second task, and wherein the system further comprises a second coprocessor having a second agent, the second agent being configured to proactively retrieve the second task from the task pool.
10. The processing system of claim 9, wherein the second task includes an indicium of a second task type, the second coprocessor is configured to perform tasks of the second type, and the second agent is configured to search the task pool for tasks of the second type.
11. The processing system of claim 1, wherein the controller and the task pool reside on a monolithic integrated circuit (IC), and the first coprocessor does not reside on the IC.
12. The processing system of claim 9, wherein the controller, the task pool, the first coprocessor, and the second coprocessor reside on a monolithic integrated circuit (IC).
13. A method of dynamically controlling processing resources in a network of the type including a central processing unit (CPU), the CPU being configured to populate a task pool with a first task having a first task type, the method comprising the steps of:
programming a first unit to perform the first task type;
adding the programmed first unit to the network;
proactively sending a first agent from the first unit to the task pool;
the first agent searching the task pool for a task of the first type;
the first agent retrieving the first task from the task pool;
the first agent transporting the first task to the first unit;
the first unit processing the first task; and
sending a notification that the first task is completed from the first unit to the task pool.
14. The method of claim 13, further comprising: the task pool marking the first task as completed; and sending a notification that the first task is completed from the task pool to the CPU.
15. The method of claim 13, further comprising: configuring the first unit to determine that the first unit has available processing capacity as a predicate to proactively sending the first agent to the task pool.
16. The method of claim 13, further comprising: integrating the first unit into a first device before adding the programmed first unit to the network.
17. The method of claim 16, wherein the first device includes one of a sensor, a light bulb, a power switch, an appliance, a biometric device, a medical device, a diagnostic device, a laptop, a tablet, a smartphone, a motor controller, and a security device.
18. The method of claim 13, wherein adding the programmed first unit to the network comprises:
establishing a communication link between the first unit and the task pool.
19. The method of claim 13, wherein the CPU is further configured to populate the task pool with a second task having a second task type, the method further comprising the steps of:
programming a second unit to perform the second task type;
establishing a communication link between the second unit and the task pool;
proactively sending a second agent from the second unit to the task pool;
the second agent searching the task pool for a task of the second type;
the second agent retrieving the second task from the task pool;
the second agent transporting the second task to the second unit;
the second unit processing the second task;
sending a notification that the second task is completed from the second unit to the task pool;
the task pool marking the second task as completed; and
sending a notification that the second task is completed from the task pool to the CPU.
20. A system for controlling distributed processing resources in an Internet of Things (IoT) computing environment, comprising:
a CPU configured to divide an aggregate computing requirement into a plurality of tasks and place the tasks in a pool; and
a plurality of devices, each device having a unique dedicated agent, the agent being configured to proactively retrieve tasks from the pool without communicating directly with the CPU.
CN201580039190.0A 2014-07-24 2015-07-10 System and method for parallel processing using dynamically configurable proactive co-processing cells Pending CN106537343A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US14/340,332 2014-07-24
US14/340,332 US9852004B2 (en) 2013-01-25 2014-07-24 System and method for parallel processing using dynamically configurable proactive co-processing cells
PCT/US2015/039993 WO2016014263A2 (en) 2014-07-24 2015-07-10 System and method for parallel processing using dynamically configurable proactive co-processing cells

Publications (1)

Publication Number Publication Date
CN106537343A true CN106537343A (en) 2017-03-22

Family

ID=55165563

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201580039190.0A Pending CN106537343A (en) 2014-07-24 2015-07-10 System and method for parallel processing using dynamically configurable proactive co-processing cells

Country Status (3)

Country Link
EP (1) EP3172669A4 (en)
CN (1) CN106537343A (en)
WO (1) WO2016014263A2 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112713993A (en) * 2020-12-24 2021-04-27 天津国芯科技有限公司 Encryption algorithm module accelerator and high-speed data encryption method
CN112799792A (en) * 2021-02-01 2021-05-14 安徽芯纪元科技有限公司 Method for protecting task context register of embedded operating system
CN113535405A (en) * 2021-07-30 2021-10-22 上海壁仞智能科技有限公司 Cloud service system and operation method thereof

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117389731B (en) * 2023-10-20 2024-04-02 上海芯高峰微电子有限公司 Data processing method and device, chip, device and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2000011561A1 (en) * 1998-08-21 2000-03-02 Corporate Media Partners Doing Business As Americast System and method for a master scheduler
US20030005029A1 (en) * 2001-06-27 2003-01-02 Shavit Nir N. Termination detection for shared-memory parallel programs
US20070074207A1 (en) * 2005-09-27 2007-03-29 Sony Computer Entertainment Inc. SPU task manager for cell processor
US20110310977A1 (en) * 2009-02-18 2011-12-22 Nec Corporation Task allocation device, task allocation method, and storage medium storing tas allocation program
CN102427577A (en) * 2011-12-06 2012-04-25 安徽省徽商集团有限公司 System for pushing information from collaboration server to mobile terminal and method thereof
US8209702B1 (en) * 2007-09-27 2012-06-26 Emc Corporation Task execution using multiple pools of processing threads, each pool dedicated to execute different types of sub-tasks

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8108867B2 (en) * 2008-06-24 2012-01-31 Intel Corporation Preserving hardware thread cache affinity via procrastination
US8732713B2 (en) * 2010-09-29 2014-05-20 Nvidia Corporation Thread group scheduler for computing on a parallel thread processor
US8949853B2 (en) * 2011-08-04 2015-02-03 Microsoft Corporation Using stages to handle dependencies in parallel tasks
US8990833B2 (en) * 2011-12-20 2015-03-24 International Business Machines Corporation Indirect inter-thread communication using a shared pool of inboxes

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112713993A (en) * 2020-12-24 2021-04-27 天津国芯科技有限公司 Encryption algorithm module accelerator and high-speed data encryption method
CN112799792A (en) * 2021-02-01 2021-05-14 安徽芯纪元科技有限公司 Method for protecting task context register of embedded operating system
CN112799792B (en) * 2021-02-01 2023-12-05 安徽芯纪元科技有限公司 Task context register protection method of embedded operating system
CN113535405A (en) * 2021-07-30 2021-10-22 上海壁仞智能科技有限公司 Cloud service system and operation method thereof

Also Published As

Publication number Publication date
WO2016014263A3 (en) 2016-03-17
WO2016014263A2 (en) 2016-01-28
EP3172669A4 (en) 2018-03-14
EP3172669A2 (en) 2017-05-31

Similar Documents

Publication Publication Date Title
US20200183735A1 (en) System and Method For Swarm Collaborative Intelligence Using Dynamically Configurable Proactive Autonomous Agents
CN108268425B (en) Programmable matrix processing engine
CN103597784B (en) The method and system of the master device-slave unit pair in the switching fabric of dynamic creation and service portable computing with across described switching fabric
JP3836840B2 (en) Multiprocessor system
US20120290756A1 (en) Managing Bandwidth Allocation in a Processing Node Using Distributed Arbitration
US11010313B2 (en) Method, apparatus, and system for an architecture for machine learning acceleration
JP2013501299A (en) Data multicasting in distributed processor systems.
CN106537343A (en) System and method for parallel processing using dynamically configurable proactive co-processing cells
US10713026B2 (en) Heterogeneous distributed runtime code that shares IOT resources
US20090113138A1 (en) Combined Response Cancellation for Load Command
WO2007074905A1 (en) Network equipment system
JP2005209207A (en) Method for managing data in array processor, and array processor
Dogan et al. Accelerating graph and machine learning workloads using a shared memory multicore architecture with auxiliary support for in-hardware explicit messaging
JP2020027613A (en) Artificial intelligence chip and instruction execution method used in artificial intelligence chip
Si et al. Direct MPI library for Intel Xeon Phi co-processors
CN103455371A (en) Mechanism for optimized intra-die inter-nodelet messaging communication
JP3836837B2 (en) Method, processing unit, and data processing system for microprocessor communication in a multiprocessor system
US10339059B1 (en) Global socket to socket cache coherence architecture
CN103093446A (en) Multi-source image fusion device and method based on on-chip system of multiprocessor
Zimmer et al. Nocmsg: Scalable noc-based message passing
JP2021507384A (en) On-chip communication system for neural network processors
EP4022446B1 (en) Memory sharing
CN113556242B (en) Method and equipment for performing inter-node communication based on multi-processing nodes
CN112948001A (en) Method for setting tensor hardware configuration, readable storage medium and device
CN103294623B (en) A kind of multi-thread dispatch circuit of configurable SIMD system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 1232997; Country of ref document: HK)

WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20170322

REG Reference to a national code (Ref country code: HK; Ref legal event code: WD; Ref document number: 1232997; Country of ref document: HK)