CN106537343A - System and method for parallel processing using dynamically configurable proactive co-processing cells - Google Patents
- Publication number
- CN106537343A (application CN201580039190.0A)
- Authority
- CN
- China
- Prior art keywords
- task
- pool
- task pool
- unit
- cpu
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/50—Indexing scheme relating to G06F9/50
- G06F2209/5011—Pool
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/50—Indexing scheme relating to G06F9/50
- G06F2209/509—Offload
Abstract
A parallel processing architecture includes a CPU, a task pool populated by the CPU, and a plurality of autonomous co-processing cells, each having an agent configured to proactively interrogate the task pool to retrieve tasks appropriate for a particular co-processor. Each co-processor communicates with the task pool through a switching fabric, which facilitates connections for data transfer and arbitration between all system resources. Each co-processor notifies the task pool when a task or task thread is completed, whereupon the task pool notifies the CPU.
Description
This application is a continuation of U.S. Application Serial No. 13/750,696, filed January 25, 2013, which is incorporated herein by reference.
Technical field
The present invention relates generally to parallel-processing computing, and in particular to a processing architecture in which proactive co-processors actively retrieve tasks from a task pool populated by a central processing unit.
Background technology
The Internet of Things (also referred to as the Cloud of Things) refers to an ad hoc network of uniquely identifiable embedded computing devices within the existing Internet infrastructure. The Internet of Things (IoT) promises advanced connectivity of devices, systems, and services that goes beyond machine-to-machine (M2M) communication. The scope of things contemplated by the IoT is unlimited and may include devices such as heart-monitoring implants, biochip transponders, automobile sensors, aerospace and defense field-operation devices, and public-safety applications that, for example, assist firefighters in search and rescue operations. Current market examples include home-based networks involving smart thermostats, light bulbs, and washer/dryers that use Wi-Fi for remote monitoring. Because of the ubiquitous nature of connected objects in the IoT, it is estimated that by 2020 more than 30 billion devices will be wirelessly connected to the Internet of Things. One object of the present invention is to harness the processing capability of the controllers and processors associated with these devices.
Computer processors generally execute machine-code instructions serially. To run multiple applications at once, a single processor interleaves instructions from the various programs and executes them serially, although from the user's point of view the applications appear to be processed in parallel. True parallel processing, or multi-core processing, by contrast, is a computational approach that breaks a large computational task into individual blocks of computation and distributes them among two or more processors. A computing architecture that uses task parallelism (parallel processing) divides a large computational requirement into discrete modules of executable code. The modules are then executed concurrently or sequentially, according to their respective priorities.
A typical multiprocessor system includes a central processing unit (CPU) and one or more co-processors. The CPU partitions the computational requirement into tasks and distributes those tasks to the co-processors. Completed threads are reported to the CPU, which continues to distribute additional threads to the co-processors as needed. Currently known multiprocessing approaches have several drawbacks. A significant amount of CPU bandwidth is consumed by distributing tasks; by waiting for tasks to complete before distributing new tasks (often with dependencies on previous tasks); by responding to interrupts from co-processors when a task is completed; and by responding to other messages from co-processors. In addition, co-processors often remain idle while waiting for a new task from the CPU.
Accordingly, a multiprocessor architecture is needed that reduces CPU management overhead and makes more effective use of the available co-processing resources.
Summary of the invention
Various embodiments of a parallel-processing computing architecture include a CPU configured to populate a task pool and one or more co-processors configured to proactively retrieve threads (tasks) from the task pool. Each co-processor notifies the task pool when a task is completed, and pings the task pool until another task becomes available for processing. In this way, the CPU communicates directly with the task pool and communicates indirectly with the co-processors through the task pool.
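The pull model described above can be illustrated with a minimal Python sketch. It is not taken from the patent: the names `TaskPool`, `Cell`, `populate`, `retrieve`, `notify_done`, and `poll_once` are hypothetical, and a plain callback stands in for the notification the task pool sends to the CPU.

```python
class TaskPool:
    """Hypothetical task pool: the CPU fills it, cells pull from it."""

    def __init__(self, on_complete):
        self._pending = []               # tasks waiting for a cell
        self._on_complete = on_complete  # stands in for "notify the CPU"

    def populate(self, task_type, payload):
        # Called by the CPU to place a task in the pool.
        self._pending.append((task_type, payload))

    def retrieve(self, supported_types):
        # Called on behalf of a cell: hand out the first task it can run.
        for index, task in enumerate(self._pending):
            if task[0] in supported_types:
                return self._pending.pop(index)
        return None  # nothing suitable; the cell asks again later

    def notify_done(self, task, result):
        # Called by a cell when a task finishes; the pool informs the CPU.
        self._on_complete(task, result)


class Cell:
    """Hypothetical co-processor that pulls work instead of being assigned it."""

    def __init__(self, name, handlers):
        self.name = name
        self.handlers = handlers  # maps task type -> function

    def poll_once(self, pool):
        task = pool.retrieve(set(self.handlers))
        if task is None:
            return False
        task_type, payload = task
        pool.notify_done(task, self.handlers[task_type](payload))
        return True


completed = []  # what the CPU has been told about
pool = TaskPool(on_complete=lambda task, result: completed.append((task[0], result)))
pool.populate("sum", [1, 2, 3])
pool.populate("max", [4, 9, 2])

adder = Cell("adder", {"sum": sum})
general = Cell("general", {"sum": sum, "max": max})

adder.poll_once(pool)    # takes the "sum" task
general.poll_once(pool)  # takes the remaining "max" task
```

In this sketch the CPU never addresses a cell directly; it only populates the pool and receives completion notices, which mirrors the indirect communication path described in the paragraph above.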
The co-processors may operate autonomously; that is, they may interact with the task pool independently of the CPU. In a preferred embodiment, each co-processor includes an agent that interrogates the task pool to seek a task to perform. The co-processors thus work together as "peers" with one another and with the task pool, completing aggregate computational requirements by autonomously retrieving and completing individual tasks that may or may not be interrelated. As a non-limiting example, suppose task B involves computing an average temperature over time. By defining task A to include capturing temperature readings over time, and further defining task B to include obtaining the captured readings, the CPU and the various co-processors can communicate with one another via the task pool.
In various embodiments, the co-processors are referred to as autonomous, proactive peer cells. In this context, the term "autonomous" means that a co-processor can interact with the task pool without being instructed to do so by the CPU or by the task pool. The term "proactive" means that each co-processor is configured (e.g., programmed) to periodically send an agent to monitor the task pool for available tasks suited to that co-processor. The term "peer" means that the co-processing cells share a common goal in monitoring and executing all available tasks in the task pool.
A peer cell (co-processor) may be a general-purpose or special-purpose processor and may therefore have the same or a different instruction set, architecture, and microarchitecture compared with the CPU or other peer cells in the system. In addition, the software programs to be executed and the data to be processed may be contained in one or more memory units. In a conventional computer system, for example, a software program consists of a series of instructions that may require data to be used by the program. For example, if the program corresponds to a media player, the data contained in memory may be compressed audio data, which is read by a co-processor and eventually played on a speaker.
Each peer cell in the system may be configured to communicate with the task pool, by wire or wirelessly, through a crossbar switch (also referred to as a fabric). In a purely wireless mesh topology, the radio signals may themselves constitute the fabric. In various embodiments, the co-processors may also communicate directly with the CPU. The switching fabric facilitates communication among system resources. Each peer cell is proactive in that it sends an agent to the task pool to obtain a task to perform whenever the cell has no processing to perform, or whenever it can contribute processing cycles without impeding its normal operation. As a non-limiting example, in the context of the Internet of Things (discussed in more detail below), a co-processor associated with a device such as a light bulb may be programmed to listen for "on" and "off" commands from a master device (such as a smartphone) as its normal operation, while its processing resources may also be used by the task pool.
In the context of the various embodiments described herein, the term "agent" refers to a software module, analogous to a network packet, that is associated with a co-processor and that interacts with the task pool to obtain available tasks suited to that co-processing cell. Peer cells may execute tasks sequentially, when a task depends on the prior execution of another task, or in parallel, when more than one peer cell is available and more than one matching task is available for execution. Tasks may be executed independently or collaboratively, depending on the task-thread restrictions (if any) provided by the CPU. Interdependent tasks in the task pool may be logically combined. The task pool notifies the CPU when a task thread is completed. If a task thread consists of a single task, the task pool may notify the CPU when that task is completed. If a task thread consists of multiple tasks, the task pool notifies the CPU when the chain of tasks is completed. Because task threads may themselves be logically combined, it is also contemplated that the task pool notifies the CPU after a logically combined group of task threads has been completed.
Those skilled in the art will appreciate that interoperability between the CPU and the co-processors can be facilitated by configuring the CPU to compose and/or structure tasks at a level of abstraction that is independent of the instruction set architectures of the various co-processors, so that the components communicate at the task level rather than at the instruction level. Devices and their associated co-processors can therefore be added to the network on a "plug and play" basis. Another aspect of the invention provides interoperability within a heterogeneous array of CPUs having different instruction set architectures.
Various features of the invention are particularly suited to networks of IoT devices and sensors; heterogeneous computing environments; high-performance computing; two- and three-dimensional monolithic integrated circuits; and motor control and robotics.
Description of the drawings
The present invention is described below in conjunction with the following drawings, in which like numerals denote like elements:
Fig. 1 is a schematic block diagram of a parallel processing architecture including a CPU, memory, a task pool, and a plurality of co-processors configured to communicate via a fabric, according to an embodiment;
Fig. 2 is a schematic block diagram illustrating details of an exemplary task pool, according to an embodiment;
Fig. 3 is a schematic block diagram of a network including co-processing cells and their corresponding agents interacting with a task pool, according to an embodiment;
Fig. 4 is a schematic layout of an Internet of Things network including available plug-and-play devices, according to an embodiment;
Fig. 5 is a schematic layout of an exemplary Internet of Things use case illustrating dynamic processing by nearby devices, according to an embodiment;
Fig. 6 is a flow chart illustrating the operation of an exemplary parallel computing environment, according to an embodiment.
Detailed description
Various embodiments relate to parallel-processing computing systems and environments, ranging from simple switching and control functions to complex programs and algorithms, including but not limited to: data encryption; graphics, video, and audio processing; direct memory access; mathematical computation; data mining; game algorithms; Ethernet packet and other network protocol processing, including the construction, reception, and transmission of data to and from external networks; financial services and business methods; search engines; Internet data streaming and other web-based applications; execution of internal or external software programs; and, in the context of the Internet of Things, switching on and off and/or otherwise controlling or manipulating appliances, light bulbs, consumer electronics, and the like.
The various features may be incorporated into any currently known or later-developed computer architecture. For example, parallel-processing problems relating to synchronization, data security, out-of-order execution, and interruption of the main processor may be addressed using the inventive concepts described herein.
Referring now to Fig. 1, a distributed processing system 10 includes a single-core or multi-core CPU 11 and one or more peer or co-processing cells 12A to 12N configured to communicate with a task pool 13 via a crossbar switching fabric 14. The peer cells 12 may also communicate with one another via the switching fabric 14 or via a separate cell bus (not shown). The CPU 11 may communicate with the task pool 13 directly or via the switching fabric 14. Each of one or more memory units 15 contains data and/or instructions. In this context, the term "instructions" includes compiled software programs executable by the CPU 11. The memory units 15, cells 12, and task pool 13 may be interconnected, by wire or wirelessly, to communicate with the CPU and/or with one another via the switching fabric 14. In some embodiments, the CPU 11 communicates with the cells 12 only indirectly, through the task pool. In other embodiments, the CPU 11 may also communicate with the cells 12 directly, without using the task pool as an intermediary.
In some embodiments, the system 10 may include more than one CPU 11 and more than one task pool 13, in which case a particular CPU 11 may interact with a particular task pool 13, or multiple CPUs 11 may share one or more task pools 13. In addition, each peer cell may be configured to interact with more than one task pool 13. Alternatively, a particular cell may be configured to interact with a single designated task pool, for example in a high-performance or high-security environment.
In various embodiments, a cell may be dynamically paired with a task pool, by wire (plug and play) or wirelessly (over the air), when the following three conditions are met:
1) The cell is capable of wired or wireless communication with the task pool. The connection to the task pool may be made through a port on the task pool itself or through a switching fabric connected to the task pool;
2) The task pool recognizes the agent sent by the cell as trusted, for example by input from a user with or without a password, by conventional Wi-Fi, Bluetooth, or similar pairing, manually through a graphical software program running on a smartphone or tablet, or by any other secure or insecure method;
3) At least one available task in the task pool is compatible with the capabilities of the peer cell.
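The three pairing conditions can be read as a simple conjunction. The sketch below is an illustration only: `PairingRequest` and `can_pair` are hypothetical names, and the first two conditions are reduced to booleans because the text leaves the link check and the trust mechanism open.

```python
from dataclasses import dataclass, field


@dataclass
class PairingRequest:
    reachable: bool        # condition 1: a wired or wireless link exists
    agent_trusted: bool    # condition 2: the pool accepts the agent as trusted
    capabilities: set = field(default_factory=set)  # task types the cell can run


def can_pair(request, available_task_types):
    # Condition 3: at least one available task matches the cell's capabilities.
    has_compatible_task = bool(request.capabilities & set(available_task_types))
    return request.reachable and request.agent_trusted and has_compatible_task
```

A cell that can reach the pool and is trusted still does not pair under this reading if none of the currently available task types fall within its capabilities.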
In a multiprocessor environment with multiple task pools, the foregoing dynamic pairing conditions apply, except that a given cell may be locked or restricted to working with only one particular task pool; otherwise, the cell may connect to one or more task pools on a first-found basis, a round-robin basis, or under any other selection scheme. Tasks in a task pool may also be assigned priorities, whereby a cell gives preference to higher-priority tasks and services lower-priority tasks when not otherwise occupied by a higher-priority task.
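One way to picture priority-based retrieval is sketched below. The class `PriorityPool`, its methods, and the convention that a larger number means higher priority are all assumptions made for illustration; the text does not specify how priorities are encoded.

```python
import itertools


class PriorityPool:
    """Hypothetical pool that hands out the highest-priority compatible task."""

    def __init__(self):
        self._tasks = []
        self._counter = itertools.count()  # tie-breaker: earlier tasks first

    def add(self, priority, task_type, name):
        # Assumption: a larger number means a higher priority.
        self._tasks.append((priority, next(self._counter), task_type, name))

    def take(self, supported_types):
        candidates = [t for t in self._tasks if t[2] in supported_types]
        if not candidates:
            return None
        # Highest priority first; among equals, the task added earliest.
        best = min(candidates, key=lambda t: (-t[0], t[1]))
        self._tasks.remove(best)
        return best[3]


pool = PriorityPool()
pool.add(1, "log", "rotate-logs")
pool.add(5, "video", "decode-frame")
pool.add(5, "video", "encode-frame")
pool.add(3, "log", "flush-logs")

first = pool.take({"video", "log"})  # a cell able to run both task types
```

A cell that supports only the lower-priority type still gets work: it is served the best task within its own capabilities rather than waiting on tasks it cannot run.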
The CPU 11 may be any single-core or multi-core processor, application processor, or microcontroller for executing software programs. The system 10 may be implemented in a personal computer, smartphone, tablet, or Internet appliance, in which case the CPU 11 may be any personal-computer central processor or processor cluster, including a multi-core processor local to or remote from the immediate computing environment. Alternatively, the system 10 may be implemented on a supercomputer, and the CPU 11 may be a reduced instruction set computer ("RISC") processor, application processor, microprocessor, or the like.
In other embodiments, the system 10 may be implemented on a series of locally connected personal computers, such as a Beowulf cluster, in which case the CPU 11 may comprise all, a subset, or one of the central processors of the networked computers. Alternatively, the system 10 may be implemented on a network of remotely connected computers, in which case the CPU 11 may be a currently known or later-developed central processor for a server or mainframe. The particular manner in which the CPU 11 carries out the subject parallel-processing method within the presently described system 10 may be affected by the CPU's operating system. For example, as described below, the CPU 11 may be configured for use in the system 10 by programming it to recognize and communicate with the task pool 13 and to divide computational requirements into threads.
It is also contemplated that the system 10 may be implemented retroactively on any computer or computer network having an operating system that can be modified or otherwise configured to implement the functionality described herein. As is known in the art, the data to be processed is contained in the memory units 15, for example in addressable regions or partitions of random-access or read-only memory, in cache memory for the CPU 11, or in other forms of data storage such as flash memory and magnetic storage. The memory units 15 contain the data to be processed as well as the locations in which the results of processing are to be placed. Not every task requires access to the memory units 15, as in the case of smart meters and automobile instruments, which may return data to the system 10, or in the case of robots and motor controllers, which may actuate machinery.
Each cell 12 is conceptually or logically an independent computational unit capable of running one or more tasks/threads. A cell 12 may be a microcontroller, a microprocessor, an application processor, a "dumb" switch, or a standalone computer, such as a machine in a Beowulf cluster.
A cell 12 may be a general-purpose or special-purpose co-processor configured to supplement, perform all of, or perform a narrow set of the CPU's functions, or to perform functions that are external to the CPU 11, such as environmental monitoring and robotic actuation. A special-purpose processor may be a dedicated hardware module designed, programmed, or otherwise configured to perform a specialized task, or it may be a general-purpose processor configured to perform a specialized task such as graphics processing, floating-point arithmetic, or data encryption.
In embodiments, any cell 12 that is a special-purpose processor may also be configured to access and write to memory and to execute descriptors, as described below, as well as other software programs.
In addition, any number of cells 12 may constitute a heterogeneous computing environment; that is, a system that uses more than one type of processor, such as a mix of AMD-based and/or Intel-based processors, or of 32-bit and 64-bit processors.
As shown in the following sequence of events, each cell 12 is configured to perform one or more specialized tasks. During the polling phase, each cell periodically sends an agent to the task pool until a matching task is found. To facilitate this matching, the cell and the task pool may be equipped with transceivers. In the case of the task pool, the transceiver may be located at the task pool itself or in the switching fabric connected to the task pool. When a task match is found in the task pool, the task pool sends an acknowledgment to the cell. The next step is the "communication channel" phase, in which the cell receives the task and begins executing it. In one embodiment, once the first task is completed, the communication channel is maintained so that the peer cell can capture further tasks without repeating the "polling" and "acknowledgment" phases.
The system 10 may include multiple cells, some of which are capable of performing the same task types as other cells, so as to create redundancy in the system 10. The set of task types performed by a given cell 12 may be a subset of the set of task types performed by another cell. For example, in Fig. 1, the system 10 may divide an aggregate computational problem into groups of tasks and populate the task pool 13 with tasks of a first type, a second type, and a third type. The first cell 12A may be able to perform only tasks of the first type; the second cell 12B may be able to perform tasks of the second type; the third cell 12C may be able to perform tasks of the third type; the fourth cell 12D may be able to perform tasks of the second or third type; and the fifth cell 12N may be able to perform all three task types. The system 10 may be configured with this redundancy so that, if a given cell is removed from the system 10 (or is currently busy or otherwise unavailable), the system 10 can continue to operate seamlessly. Moreover, if a cell is dynamically added to the system 10, the system 10 can continue to operate seamlessly, with the benefit of higher performance.
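The redundancy in the five-cell example can be checked with a short sketch. The capability sets below come from the example above; the helper names `cells_for` and `survives_removal` are hypothetical and only illustrate the coverage argument.

```python
# Task types each cell can perform, as in the five-cell example above.
CELL_CAPABILITIES = {
    "12A": {1},
    "12B": {2},
    "12C": {3},
    "12D": {2, 3},
    "12N": {1, 2, 3},
}


def cells_for(task_type, capabilities=CELL_CAPABILITIES):
    # All cells able to perform the given task type.
    return sorted(cell for cell, types in capabilities.items() if task_type in types)


def survives_removal(removed_cell, capabilities=CELL_CAPABILITIES):
    # True if every task type is still covered after one cell is removed.
    needed = set().union(*capabilities.values())
    remaining = [types for cell, types in capabilities.items() if cell != removed_cell]
    covered = set().union(*remaining) if remaining else set()
    return needed <= covered
```

With these capability sets, every task type is covered by at least two cells, so removing any single cell leaves all three types serviceable.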
Referring now to Figs. 1 and 2, the task pool 13 may occupy a region of physical memory that is accessible by the CPU 11. Alternatively, the task pool 13 may be accessible by MAC address or IP address. Several embodiments are contemplated for the task pool 13: it may be physically located with the CPU in the same 2D or 3D monolithic IC, or it may be implemented as a separate IC and physically interconnected to a computer board, smartphone, tablet, router, or IoT device. In yet another alternative embodiment, the task pool may be a standalone, multi-port, wired and/or wirelessly connected device that is dedicated to a given CPU 11 or shared among multiple CPU 11 systems. The task pool 13 may also be addressable by the cells 12. The task pool 13 may be disposed in a dedicated hardware block to provide maximum access speed for the CPU 11 and the cells 12. Alternatively, the task pool 13 may be software-based, in which case, as in the hardware-based embodiment, the contents of the task pool 13 are stored in memory but are represented by data structures.
When populated by the CPU 11, the task pool 13 contains one or more task threads 21. Each task thread 21 represents a computational task that may be a component or subset of a larger aggregate computational requirement imposed on the CPU 11. In one embodiment, the CPU 11 may initialize and then populate the task pool 13 with concurrently executable threads 21. Each thread 21 may include one or more discrete tasks 22. A task 22 may have a task type and a descriptor. The task type indicates which cells 12 are capable of performing the task 22. The task pool 13 may also use the task type to prioritize tasks 22 having the same type. In one embodiment, the task pool 13 may maintain a priority table (not shown) that records the peer cells 12 present in the system 10, the types of tasks 22 each cell is capable of performing, and whether each cell is currently processing a task. As described below, the task pool 13 may use the priority table to determine which eligible task 22 to assign to a requesting cell.
In some embodiments, the CPU 11 may retrieve and execute tasks or threads from the task pool. In addition, the CPU 11 may abort any task that is determined to be stale, corrupted, blocked, or erroneous. In that case, the CPU 11 may refresh the task so that it becomes available for subsequent processing. Nothing prevents the CPU 11 from implementing adaptive task management, as may be required by artificial intelligence, for example, whereby the CPU 11 adds, removes, or changes tasks within an existing, uncompleted thread 21.
The descriptor may contain one or more of: a specific instruction to be executed, a mode of execution, the location (e.g., address) of the data to be processed, and the location (if any) in which the task result is to be placed. The result location is optional, as in the case of animation and multimedia tasks, which typically present their results to a display rather than storing them in memory. In addition, task descriptors may be linked together, as in a linked list, so that the data to be processed can be accessed with fewer memory calls than if the descriptors were not linked. In one embodiment, the descriptor is a data structure containing a header and a plurality of reference pointers to memory locations, and the task 22 includes the memory address of the data structure. The header defines the function or instruction to be executed. A first pointer references the location of the data to be processed. A second, optional pointer references the location in which the processed data is to be placed. If the descriptor is linked to another descriptor to be executed sequentially, the descriptor may include a third pointer that references the next descriptor. In an alternative embodiment in which the descriptor is a data structure, the task 22 may include the complete data structure.
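The descriptor layout described above (a header plus up to three pointers) can be sketched as a small linked structure. The field names, the example addresses, and the `walk` helper are hypothetical; Python object references stand in for memory pointers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Descriptor:
    header: str                          # function or instruction to perform
    data_addr: int                       # first pointer: data to be processed
    result_addr: Optional[int] = None    # second, optional pointer: result location
    next: Optional["Descriptor"] = None  # third pointer: next chained descriptor


def walk(descriptor):
    # Follow the chain of linked descriptors in execution order.
    while descriptor is not None:
        yield descriptor
        descriptor = descriptor.next


# A display task needs no result location, so result_addr stays None.
render = Descriptor(header="render", data_addr=0x2000)
decode = Descriptor(header="decode", data_addr=0x1000, result_addr=0x2000, next=render)

chain = [d.header for d in walk(decode)]
```

Here the `decode` descriptor writes its result where the chained `render` descriptor reads its input, which is one way the linking could reduce separate lookups.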
A thread 21 may also include a "recipe" describing the order in which the tasks 22 may be executed and any conditions that affect the order of execution. According to the recipe, the tasks 22 may be executed sequentially, concurrently, out of order, interdependently, or conditionally according to Boolean operations. For example, as shown in Fig. 2, thread 21A includes four tasks: 22A, 22B, 22C, and 22D. In the illustrated embodiment, the first task 22A must be completed before either the second task 22B or the third task 22C can begin. According to the recipe, once either the second task 22B or the third task 22C is completed, the fourth task 22D may begin.
Threads 21 may also be interdependent. For example, as shown in Fig. 2, because of a Boolean operation in thread 21B, completion of task 22C may allow the processing of tasks in thread 21B to continue. The task pool 13 may lock a task 22 while that task is waiting for the completion of another task 22 on which it depends. While a task 22 is locked, it cannot be acquired by a cell. When the tasks 22 of a thread 21 are completed, the task pool 13 may notify the CPU 11 of the completion. The CPU may then advance processing beyond the completed thread 21.
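The recipe for thread 21A and the locking behavior can be sketched with Boolean predicates over the set of completed tasks. The `TaskThread` class and its methods are hypothetical, and the sketch ignores tasks that are in progress; it only shows which tasks are unlocked at each point.

```python
class TaskThread:
    """Hypothetical thread whose recipe maps each task to an unlock condition."""

    def __init__(self, recipe):
        self.recipe = recipe  # task name -> predicate over completed tasks
        self.done = set()

    def unlocked(self):
        # Tasks that are not yet done and whose condition is satisfied.
        return sorted(
            task for task, ready in self.recipe.items()
            if task not in self.done and ready(self.done)
        )

    def complete(self, task):
        if task not in self.unlocked():
            raise ValueError(f"{task} is locked or already completed")
        self.done.add(task)
        # True means the whole thread is finished and the CPU would be notified.
        return self.done == set(self.recipe)


# Recipe for thread 21A as described above.
thread_21a = TaskThread({
    "22A": lambda done: True,
    "22B": lambda done: "22A" in done,
    "22C": lambda done: "22A" in done,
    "22D": lambda done: "22B" in done or "22C" in done,
})
```

Initially only 22A is unlocked; completing it unlocks 22B and 22C, and completing either of those unlocks 22D.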
The cells advantageously remain peers of one another and of the CPU 11, helping the system 10 perform complex computations by autonomously and proactively retrieving tasks from the task pool 13. The cells 12 operate autonomously in that they can act independently of the CPU 11 or any other co-processor. Alternatively, a cell may be acted on or instructed directly by the CPU. Each cell acts proactively in that, as soon as it becomes available for further processing, it seeks a task 22 from the task pool 13.
More specifically, in one embodiment, unit 12 acquires a task from the task pool by dispatching an agent 30 to query (search) the task pool and retrieve an available task 22, that is, a task that needs to be completed, is not locked, and is of a task type the unit can execute. Typically, system 10 has the same number of agents as co-processing units. In this context, an agent is broadly analogous to a data frame in the networking sense, in that the agent can be equipped with a source address, a destination address, and a payload. In an embodiment, when agent 30 is seeking a task 22, the destination address is the address of task pool 13; when agent 30 returns to its unit with a task 22, the destination address is the address of the corresponding unit 12. Correspondingly, when agent 30 is seeking a task 22, the source address is the address of unit 12, and when agent 30 returns to its unit with a task 22, the source address is the address of task pool 13.
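As an illustration only, the address swap between the outbound and returning agent can be sketched with a small frame type. The numeric addresses, the payload contents, and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentFrame:
    source: int
    destination: int
    payload: bytes


POOL_ADDRESS = 0x0D  # hypothetical address of task pool 13


def outbound(unit_address: int, task_types: bytes) -> AgentFrame:
    """Agent leaving its unit: source is the unit, destination is the task pool."""
    return AgentFrame(source=unit_address, destination=POOL_ADDRESS, payload=task_types)


def returning(request: AgentFrame, descriptor: bytes) -> AgentFrame:
    """Agent coming back with a task: the two addresses are swapped."""
    return AgentFrame(
        source=request.destination,
        destination=request.source,
        payload=descriptor,
    )
```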
In addition, the source and destination addresses can facilitate frame synchronization. That is, system 10 can be configured to distinguish addresses unambiguously from payload data, so that when the contents of agent 30 are read, the destination address marks the start of the frame and the source address marks the end of the frame, or vice versa. This allows the payload, which is placed between the addresses, to vary in size. In another embodiment with a variable-size payload, agent 30 may include a header that describes the payload. The header information can be compared with the payload to verify data integrity. In yet another embodiment, the payload can be of fixed length. When agent 30 is dispatched to task pool 13 by its co-processing unit, the payload includes information identifying the task types that unit 12 can execute. When agent 30 returns from task pool 13, the payload includes the descriptor of task 22, either as a memory location or as the entire descriptor data structure.
In another embodiment, some or all of agents 30 are autonomous representatives of their respective units 12. That is, each agent 30 can be dispatched by its corresponding unit 12 to retrieve a task 22 whenever the unit is idle or able to perform additional processing. In this way, the processing capacity of the equivalent units 12 can be used more fully, because the units need not wait idly for instructions from CPU 11. This approach has the additional advantage of reducing CPU overhead by relieving the CPU of the need to send requests to the units to retrieve tasks from the task pool. These advantages make system 10 more efficient than conventional computer architectures, in which auxiliary modules and co-processors depend on instructions from the host CPU.
Furthermore, equivalent units 12A to 12n are indifferent to the particular composition of the thread itself. Instead, an agent is concerned only with finding a match between the capabilities of its corresponding unit and an available task 22 to be completed in task pool 13. That is, as long as there is an available task 22 in task pool 13 and that task matches the capabilities of a unit, the system can make effective use of the unit's processing capacity.
Some or all of equivalent units 12A to 12n can work independently of one another, or can communicate with one another through switch fabric 14, through task pool 13, or according to a command or request from the CPU, in order to wake another equivalent unit to help with processing or to move or send data. In one embodiment, agent 30A can search for a match between the task type of a ready task 22 and the types of task that unit 12A can execute. This architecture can involve hard-coding the types of task that CPU 11 is configured to create. Thus, if task pool 13 includes three types of task 22, and the overall computing requirement includes a task of a fourth type, the task of the fourth type may not be placed in task pool 13, even if a unit able to perform the fourth type of task is included in or added to system 10. Accordingly, CPU 11 can be configured to "learn", or be taught, how to create tasks of the fourth type so as to make fuller use of the available processing resources.
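The limitation described above, and its remedy, can be illustrated with a short sketch. The class name, the `teach` method, and the tuple representation of a task are hypothetical and serve only to make the point concrete.

```python
from typing import List, Set, Tuple

Task = Tuple[str, str]  # (task type, work item)


class Cpu:
    def __init__(self, creatable_types: Set[str]) -> None:
        # Task types the CPU has been configured (hard-coded) to create.
        self.creatable_types = set(creatable_types)

    def teach(self, task_type: str) -> None:
        """Extend the CPU so that it can create one more task type."""
        self.creatable_types.add(task_type)

    def populate(self, pool: List[Task], requirement: List[Task]) -> List[Task]:
        """Place creatable tasks in the pool; return the tasks that were left out."""
        left_out = []
        for task in requirement:
            if task[0] in self.creatable_types:
                pool.append(task)
            else:
                left_out.append(task)
        return left_out
```

In this sketch a task of type "D" never reaches the pool until `teach("D")` has been called, regardless of whether a unit able to perform type "D" tasks is present.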
In another embodiment, agent 30 searches the descriptors of tasks 22 for an executable instruction that matches one of the instructions unit 12A can execute. When a matching task 22 is found, agent 30A delivers the descriptor of the matching task 22 to unit 12A, whereupon unit 12A begins processing task 22. Specifically, agent 30A can deliver the memory address of the descriptor to unit 12A, and unit 12A retrieves the data structure from memory. Alternatively, where the complete data structure of the descriptor is included in task 22, agent 30A can deliver the complete data structure to unit 12A for processing. The descriptor tells unit 12A which instruction to execute, where in memory unit 15 the data to be processed can be found, and where in memory 15 the result is to be placed. On completing task 22, unit 12A notifies task pool 13 to change the state of the selected task 22 from "to be done" to "done". In addition, once unit 12A has completed task 22, the unit can dispatch its agent 30A to task pool 13 to search for another task 22.
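The search, execute, and mark-done cycle described above might look like the following single-threaded sketch. The list used as a task pool, the dictionary used as memory, and all names are assumptions for illustration; a pool shared by several concurrently running units would additionally need to prevent two agents from taking the same task.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Task:
    instruction: str           # instruction named in the descriptor
    src: str                   # where the data to be processed is found
    dst: str                   # where the result is to be placed
    state: str = "to be done"


class Unit:
    def __init__(self, instructions: Dict[str, Callable[[int], int]]) -> None:
        self.instructions = instructions  # instructions this unit can execute

    def dispatch_agent(self, pool: List[Task]) -> Optional[Task]:
        """Search the pool for a pending task whose instruction this unit supports."""
        for task in pool:
            if task.state == "to be done" and task.instruction in self.instructions:
                return task
        return None

    def run_once(self, pool: List[Task], memory: Dict[str, int]) -> bool:
        """Fetch and process one matching task; return False if none was found."""
        task = self.dispatch_agent(pool)
        if task is None:
            return False
        operation = self.instructions[task.instruction]
        memory[task.dst] = operation(memory[task.src])
        task.state = "done"  # stands in for notifying the task pool
        return True
```

Calling `run_once` repeatedly models the unit dispatching its agent again after each completed task.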
Some or all of agents 30A to 30n can traverse system 10 by wire or wirelessly (for example, using a Wi-Fi network, wireless Ethernet, wireless USB, a wireless bridge, a wireless repeater, a wireless router, or Bluetooth pairing), depending on the particular architecture and/or implementation of system 10. In an embodiment, agent 30 can be directed wirelessly to task pool 13 by including a receiver feature in task pool 13 and a transmitter feature in unit 12. Similarly, the task pool can respond wirelessly to a unit by equipping the task pool with a transmitter and the equivalent unit with a receiver. In this way, a unit can communicate wirelessly with the task pool with or without the use of the switch fabric.
In a preferred embodiment, however, some form of switch fabric 14 is used. Switch fabric 14 facilitates connections for data transfer and arbitration between system resources. Switch fabric 14 can be a router or crossbar switch that provides connections between the various units and the task pool. Switch fabric 14 may also provide connections between each equivalent unit 12A to 12n and system resources such as CPU 11, memory unit 15, and legacy system components including, but not limited to, direct memory access units, transmitters, hard disks and their controllers, displays and other input/output devices, and other co-processors. Units 12A to 12n can be physically connected to switch fabric 14, or the units can be connected wirelessly.
Connecting units to system 10 wirelessly facilitates dynamically adding units to, and/or removing units from, system 10. For example, CPU 11 can recruit units from other cell systems, allowing dynamic expansion and improved performance. In this way, two or more cell systems (for example, networks) can share equivalent units. In one embodiment, a unit that becomes idle can seek out, and/or be recruited by, another system that needs additional processing resources, that is, a system with available processing tasks that need to be completed. Similarly, system 10 can scale its performance by incorporating clusters of additional units for specific tasks. For example, system 10 can enhance the performance of encryption/decryption functions, or the processing of audio and/or video data, by incorporating nearby units able to perform those tasks.
To prevent unwanted connections, CPU 11 can provide task pool 13 with a list of trusted and/or untrusted units, together with authentication requirements or protocols, or alternatively with criteria for identifying trusted and/or untrusted units. In addition, the task pool itself can exclude particular units on the basis of poor performance, unreliable connections, poor data throughput, or malicious or improper behavior. In various embodiments, a user can add units 12 to task pool 13, or exclude them from it, using a smartphone, tablet, or other device or application. In one embodiment, a graphical application interface can provide the user with useful statistical and/or icon-based information, such as the locations of available units and other devices, and the performance gain or performance cost that results from adding a particular unit to the network or removing it.
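A minimal sketch of such admission control is given below, assuming a trusted list supplied by the CPU and a simple failure counter kept by the task pool. The threshold, the unit identifiers, and the class name are hypothetical; authentication protocols are outside the scope of this sketch.

```python
from typing import Dict, Iterable, Set


class AdmissionPolicy:
    def __init__(self, trusted: Iterable[str], max_failures: int = 3) -> None:
        self.trusted: Set[str] = set(trusted)  # list supplied by the CPU
        self.excluded: Set[str] = set()        # units the pool has excluded itself
        self.failures: Dict[str, int] = {}
        self.max_failures = max_failures

    def admits(self, unit_id: str) -> bool:
        """A unit may connect only if it is trusted and has not been excluded."""
        return unit_id in self.trusted and unit_id not in self.excluded

    def report_failure(self, unit_id: str) -> None:
        """Record poor performance or misbehavior; exclude after too many reports."""
        self.failures[unit_id] = self.failures.get(unit_id, 0) + 1
        if self.failures[unit_id] >= self.max_failures:
            self.excluded.add(unit_id)
```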
In an alternative embodiment, some or all of the co-processing units can be connected directly to task pool 13, for example by a wired configuration of switch fabric 14 used for communication. Wired connection of units can also facilitate dynamic expansion and contraction of system 10 in a manner similar to the wireless configuration described above, although a wired connection may involve physical (for example, manual) integration and removal of peripheral devices. In either case, the scalability of the system is greatly enhanced compared with conventional parallel processing schemes, because co-processors can be added and removed without reprogramming CPU 11 to account for the change to system 10.
Referring now to Fig. 3, network 300 includes CPU 302, first memory 304, second memory 306, task pool 308, switch fabric 310, a first co-processing unit 312 configured to perform (run) type A tasks, a second unit 314 configured to perform type B tasks, a third unit 316 configured to perform type C tasks, and a fourth unit 318 configured to perform both type A and type B tasks. As described above, task pool 308 is populated (for example, by CPU 302) with tasks (or task threads) 330 and 332 of task type A, tasks 334 and 336 of task type B, and tasks 340 and 342 of task type C. In an embodiment, each unit preferably has its own dedicated agent. Specifically, unit 312 includes agent 320; unit 314 includes agent 322; unit 316 includes agent 324; and unit 318 includes agent 326. Each agent preferably includes an information field or header identifying the type of task that its associated unit is configured to perform, for example a single task type or a combination of task types A, B, and C.
During operation, when a unit is idle or otherwise has available processing capacity, its agent proactively queries the task pool to determine whether any task in the task queue is suitable for that particular unit. For example, unit 312 can dispatch its agent 320 to retrieve one or both of tasks 330 and 332, which correspond to task type A. Similarly, unit 314 can dispatch its agent 322 to retrieve task 334 or 336 (according to their respective priorities), which correspond to task type B, and so on. For a unit able to perform more than one task type, such as unit 318, which is configured to perform task types A and B, agent 326 can retrieve any of tasks 330, 332, 334, and/or 336.
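The type matching in the Fig. 3 example can be restated compactly as follows. The reference numerals are taken from the description; the dictionary layout and the function name are assumptions for illustration.

```python
from typing import Dict, List, Set

# Task reference numeral -> task type, as populated in task pool 308.
TASK_TYPES: Dict[int, str] = {330: "A", 332: "A", 334: "B", 336: "B", 340: "C", 342: "C"}

# Unit reference numeral -> task types the unit is configured to perform.
UNIT_TYPES: Dict[int, Set[str]] = {312: {"A"}, 314: {"B"}, 316: {"C"}, 318: {"A", "B"}}


def eligible_tasks(unit: int) -> List[int]:
    """Tasks the unit's agent may retrieve, based on task type alone."""
    return sorted(task for task, kind in TASK_TYPES.items() if kind in UNIT_TYPES[unit])
```

For unit 318 this yields tasks 330, 332, 334, and 336, matching the description; priorities and locking are ignored in this sketch.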
Once a task has been retrieved from the task pool, the unit can then process it, typically by retrieving data from a particular location in first memory 304, processing the data, and storing the processed data at a particular location in second memory 306. When the task is done, the unit notifies the task pool, the task pool marks the task as completed, and the task pool notifies the CPU that the task is complete. Alternatively, the task pool notifies the CPU when a task thread has been completed, since a task thread may comprise a single task, a string of tasks, or a Boolean combination of tasks. Importantly, retrieval of tasks and processing of data by the units can take place without direct communication between the CPU and the units.
Referring now to Fig. 4, Internet of Things network 400 includes a controller (CPU) 402, a task pool 408, and various devices 410 to 422, some or all of which include an associated or embedded microcontroller, such as an integrated circuit (IC) chip or another component providing processing capability. As non-limiting examples, the devices may include a light bulb 410, a thermostat 412, an electrical outlet 414, a power switch 416, an appliance (for example, a toaster) 418, a vehicle 420, a keyboard 422, and virtually any other plug-and-play device or application able to interact with the network.
In the illustrated embodiment, controller 402 can be a smartphone, tablet, laptop, or other device that may include a display 404 and a user interface (for example, a keyboard) 406 to facilitate user interaction with the various devices in the network. To the extent that the processing capacity (for example, bandwidth) of controller 402 is insufficient to support the network, the controller can effectively acquire or recruit processing resources from peripheral devices via the task pool, as explained, for example, with reference to Fig. 5.
Referring now to Fig. 5, a use case for Internet of Things network 500 illustrates the dynamic use of nearby (or otherwise available) devices. Network 500 includes a master control unit 502 (for example, a laptop, tablet, or gaming device), a task pool 504, a first co-processor device 506, and a second co-processor device 508. An exemplary use case in the context of network 500 will now be described.
Suppose a user is playing a video game on her laptop computer 502. The video game requires detailed computer-generated imagery, and the processing power of laptop computer 502 may be sufficient to render a single realistic-looking character, but when a second character is introduced on screen, image quality deteriorates and the characters' movement is no longer continuous. The present invention proposes a method of exploiting the processing power of underused computing resources that are near, or available to, the user.
To meet the need for additional processing power, laptop computer 502 connects to task pool 504. In this regard, the laptop computer itself may be equipped with a task pool, or the task pool may reside in an external device or application within wireless reach of laptop computer 502. In the case of an external task pool, the task pool itself can take on the role of a switch fabric with ports, allowing connection to multiple co-processing units. Laptop computer 502 populates task pool 504 with computation-intensive tasks. A nearby underused device (such as smartphone 508) then connects to task pool 504 and sends its agent to fetch tasks of a matching type. Smartphone 508 thus becomes a seamless co-processor assisting laptop computer 502, thereby enhancing the video game experience. The same approach can be repeated wherever other underused processing resources are available and needed. Indeed, even the processing capacity of light bulb 506 can be made to serve as a co-processor for the laptop computer.
Fig. 6 is a flow chart illustrating the operation of an exemplary parallel computing environment. Specifically, method 600 includes: populating a task pool with tasks (step 602); proactively dispatching one or more agents from one or more corresponding units to the task pool (step 604); retrieving and processing tasks (step 606); and notifying the task pool and the CPU that a task thread has been executed (step 608). The method also includes dynamically incorporating additional devices into the network as needed (step 610).
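Steps 602 to 608 might be simulated as follows, using Python threads as stand-ins for units and a thread-safe queue as a stand-in for the task pool. This is a sketch under those assumptions: the squaring operation is a placeholder workload, the function name is hypothetical, and step 610 (adding devices at run time) is not modeled.

```python
import queue
import threading
from typing import List, Tuple


def run_method_600(tasks: List[int], unit_count: int) -> List[Tuple[int, int]]:
    if unit_count < 1:
        raise ValueError("at least one unit is required")

    pool: "queue.Queue[int]" = queue.Queue()
    for task in tasks:            # step 602: the CPU populates the task pool
        pool.put(task)

    completed: List[Tuple[int, int]] = []
    completed_lock = threading.Lock()

    def unit() -> None:
        while True:
            try:
                task = pool.get_nowait()   # steps 604/606: the unit fetches a task itself
            except queue.Empty:
                return                     # no work left: the unit goes idle
            result = task * task           # placeholder for the actual processing
            with completed_lock:
                completed.append((task, result))
            pool.task_done()               # step 608: completion is reported to the pool

    units = [threading.Thread(target=unit) for _ in range(unit_count)]
    for u in units:
        u.start()
    pool.join()                            # the CPU learns that all tasks are done
    for u in units:
        u.join()
    return sorted(completed)
```

Note that the code standing in for the CPU never addresses a unit directly; it interacts only with the pool, which is the property the method relies on.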
Thus, a processing system is provided that includes: a task pool; a controller configured to populate the task pool with a first task; and a first co-processor configured to proactively retrieve the first task from the task pool.
In an embodiment, the first co-processor includes a first agent configured to retrieve the first task from the task pool without communicating with the controller.
In an embodiment, the first task includes an indication of a first task type, the first co-processor is configured to perform tasks of the first type, and the first agent is configured to search the task pool for tasks of the first type.
In an embodiment, the first co-processor is further configured to process the first task and to notify the task pool when the first task is completed, and the task pool is configured to notify the controller when the first task is completed.
In an embodiment, the controller and the first co-processor are configured to communicate with each other only through the task pool.
In an embodiment, the controller and the first co-processor are configured to communicate with each other both directly and through the task pool.
In an embodiment, the first co-processor is configured to determine that it has available processing capacity and to dispatch the agent to the task pool in response to that determination.
In an embodiment, the controller is further configured to populate the task pool with a second task, and the system further includes a second co-processor having a second agent configured to proactively retrieve the second task from the task pool.
In an embodiment, the second task includes an indication of a second task type, the second co-processor is configured to perform tasks of the second type, and the second agent is configured to search the task pool for tasks of the second type.
In an embodiment, the controller and the task pool reside on a monolithic integrated circuit (IC), and the first co-processor does not reside on the IC.
In another embodiment, the controller, the task pool, the first co-processor, and the second co-processor reside on a monolithic integrated circuit (IC).
Also provided is a method of dynamically controlling processing resources in a network of the type that includes a central processing unit (CPU) configured to populate a task pool with a first task having a first task type. The method includes the following steps: programming a first unit to perform the first task type; adding the programmed first unit to the network; proactively sending a first agent from the first unit to the task pool; the first agent searching the task pool for a task of the first type; the first agent retrieving the first task from the task pool; the first agent transporting the first task to the first unit; the first unit processing the first task; and sending a notification that the first task is completed from the first unit to the task pool.
In an embodiment, the method further includes: the task pool marking the first task as completed; and sending a notification that the first task is completed from the task pool to the CPU.
In an embodiment, the method further includes: configuring the first unit to determine that it has available processing capacity as a predicate for proactively sending the first agent to the task pool.
In an embodiment, the method further includes: integrating the first unit into a first device before adding the programmed first unit to the network.
In an embodiment, the first device includes one of a sensor, a light bulb, a power switch, an appliance, a biometric device, a medical device, a diagnostic device, a laptop, a tablet, a smartphone, a motor controller, and a security device.
In an embodiment, adding the programmed first unit to the network includes establishing a communication link between the first unit and the task pool.
In an embodiment, the CPU is further configured to populate the task pool with a second task having a second task type, and the method further includes the following steps: programming a second unit to perform the second task type; establishing a communication link between the second unit and the task pool; proactively sending a second agent from the second unit to the task pool; the second agent searching the task pool for a task of the second type; the second agent retrieving the second task from the task pool; the second agent transporting the second task to the second unit; the second unit processing the second task; sending a notification that the second task is completed from the second unit to the task pool; the task pool marking the second task as completed; and sending a notification that the second task is completed from the task pool to the CPU.
Also provided is a system for controlling distributed processing resources in an Internet of Things (IoT) computing environment, the system including: a CPU configured to partition an overall computing requirement into a plurality of tasks and to place the tasks in a pool; and a plurality of devices, each device having its own dedicated agent configured to proactively retrieve tasks from the pool without direct communication with the CPU.
Although an enabling description of various embodiments, including the best mode known to the inventor, has been presented, those skilled in the art will understand that various changes and modifications can be made, and equivalents can be substituted for various elements, without departing from the scope of the invention. It is therefore intended that the invention not be limited to the specific embodiments disclosed herein, but that it include all embodiments falling within the literal and equivalent scope of the claims.
Claims (20)
1. A processing system, comprising:
a task pool;
a controller configured to populate the task pool with a first task; and
a first co-processor configured to proactively retrieve the first task from the task pool.
2. The processing system of claim 1, wherein the first co-processor includes a first agent configured to retrieve the first task from the task pool without communicating with the controller.
3. The processing system of claim 2, wherein the first task includes an indication of a first task type, the first co-processor is configured to perform tasks of the first type, and the first agent is configured to search the task pool for tasks of the first type.
4. The processing system of claim 1, wherein the first co-processor is further configured to process the first task and to notify the task pool when the first task is completed.
5. The processing system of claim 1, wherein the task pool is configured to notify the controller when the first task is completed.
6. The processing system of claim 1, wherein the controller and the first co-processor are configured to communicate with each other only through the task pool.
7. The processing system of claim 1, wherein the controller and the first co-processor are configured to communicate with each other both directly and through the task pool.
8. The processing system of claim 2, wherein the first co-processor is configured to determine that it has available processing capacity, and to dispatch the agent to the task pool in response to the determination.
9. The processing system of claim 3, wherein the controller is further configured to populate the task pool with a second task, and wherein the system further comprises a second co-processor having a second agent configured to proactively retrieve the second task from the task pool.
10. The processing system of claim 9, wherein the second task includes an indication of a second task type, the second co-processor is configured to perform tasks of the second type, and the second agent is configured to search the task pool for tasks of the second type.
11. The processing system of claim 1, wherein the controller and the task pool reside on a monolithic integrated circuit (IC), and the first co-processor does not reside on the IC.
12. The processing system of claim 9, wherein the controller, the task pool, the first co-processor, and the second co-processor reside on a monolithic integrated circuit (IC).
13. A method of dynamically controlling processing resources in a network of the type that includes a central processing unit (CPU) configured to populate a task pool with a first task having a first task type, the method comprising the steps of:
programming a first unit to perform the first task type;
adding the programmed first unit to the network;
proactively sending a first agent from the first unit to the task pool;
searching, by the first agent, the task pool for a task of the first type;
retrieving, by the first agent, the first task from the task pool;
transporting, by the first agent, the first task to the first unit;
processing the first task by the first unit; and
sending a notification that the first task is completed from the first unit to the task pool.
14. The method of claim 13, further comprising: marking, by the task pool, the first task as completed; and sending a notification that the first task is completed from the task pool to the CPU.
15. The method of claim 13, further comprising: configuring the first unit to determine that it has available processing capacity as a predicate for proactively sending the first agent to the task pool.
16. The method of claim 13, further comprising: integrating the first unit into a first device before adding the programmed first unit to the network.
17. The method of claim 16, wherein the first device comprises one of a sensor, a light bulb, a power switch, an appliance, a biometric device, a medical device, a diagnostic device, a laptop, a tablet, a smartphone, a motor controller, and a security device.
18. The method of claim 13, wherein adding the programmed first unit to the network comprises establishing a communication link between the first unit and the task pool.
19. The method of claim 13, wherein the CPU is further configured to populate the task pool with a second task having a second task type, the method further comprising the steps of:
programming a second unit to perform the second task type;
establishing a communication link between the second unit and the task pool;
proactively sending a second agent from the second unit to the task pool;
searching, by the second agent, the task pool for a task of the second type;
retrieving, by the second agent, the second task from the task pool;
transporting, by the second agent, the second task to the second unit;
processing the second task by the second unit;
sending a notification that the second task is completed from the second unit to the task pool;
marking, by the task pool, the second task as completed; and
sending a notification that the second task is completed from the task pool to the CPU.
20. A system for controlling distributed processing resources in an Internet of Things (IoT) computing environment, comprising:
a CPU configured to partition an overall computing requirement into a plurality of tasks and to place the tasks in a pool; and
a plurality of devices, each device having its own dedicated agent, the agent being configured to proactively retrieve tasks from the pool without direct communication with the CPU.
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/340,332 | 2014-07-24 | ||
US14/340,332 US9852004B2 (en) | 2013-01-25 | 2014-07-24 | System and method for parallel processing using dynamically configurable proactive co-processing cells |
PCT/US2015/039993 WO2016014263A2 (en) | 2014-07-24 | 2015-07-10 | System and method for parallel processing using dynamically configurable proactive co-processing cells |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106537343A true CN106537343A (en) | 2017-03-22 |
Family
ID=55165563
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201580039190.0A Pending CN106537343A (en) | 2014-07-24 | 2015-07-10 | System and method for parallel processing using dynamically configurable proactive co-processing cells |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP3172669A4 (en) |
CN (1) | CN106537343A (en) |
WO (1) | WO2016014263A2 (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112713993A (en) * | 2020-12-24 | 2021-04-27 | 天津国芯科技有限公司 | Encryption algorithm module accelerator and high-speed data encryption method |
CN112799792A (en) * | 2021-02-01 | 2021-05-14 | 安徽芯纪元科技有限公司 | Method for protecting task context register of embedded operating system |
CN113535405A (en) * | 2021-07-30 | 2021-10-22 | 上海壁仞智能科技有限公司 | Cloud service system and operation method thereof |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117389731B (en) * | 2023-10-20 | 2024-04-02 | 上海芯高峰微电子有限公司 | Data processing method and device, chip, device and storage medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2000011561A1 (en) * | 1998-08-21 | 2000-03-02 | Corporate Media Partners Doing Business As Americast | System and method for a master scheduler |
US20030005029A1 (en) * | 2001-06-27 | 2003-01-02 | Shavit Nir N. | Termination detection for shared-memory parallel programs |
US20070074207A1 (en) * | 2005-09-27 | 2007-03-29 | Sony Computer Entertainment Inc. | SPU task manager for cell processor |
US20110310977A1 (en) * | 2009-02-18 | 2011-12-22 | Nec Corporation | Task allocation device, task allocation method, and storage medium storing tas allocation program |
CN102427577A (en) * | 2011-12-06 | 2012-04-25 | 安徽省徽商集团有限公司 | System for pushing information from collaboration server to mobile terminal and method thereof |
US8209702B1 (en) * | 2007-09-27 | 2012-06-26 | Emc Corporation | Task execution using multiple pools of processing threads, each pool dedicated to execute different types of sub-tasks |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8108867B2 (en) * | 2008-06-24 | 2012-01-31 | Intel Corporation | Preserving hardware thread cache affinity via procrastination |
US8732713B2 (en) * | 2010-09-29 | 2014-05-20 | Nvidia Corporation | Thread group scheduler for computing on a parallel thread processor |
US8949853B2 (en) * | 2011-08-04 | 2015-02-03 | Microsoft Corporation | Using stages to handle dependencies in parallel tasks |
US8990833B2 (en) * | 2011-12-20 | 2015-03-24 | International Business Machines Corporation | Indirect inter-thread communication using a shared pool of inboxes |
-
2015
- 2015-07-10 EP EP15825147.0A patent/EP3172669A4/en not_active Ceased
- 2015-07-10 CN CN201580039190.0A patent/CN106537343A/en active Pending
- 2015-07-10 WO PCT/US2015/039993 patent/WO2016014263A2/en active Application Filing
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2000011561A1 (en) * | 1998-08-21 | 2000-03-02 | Corporate Media Partners Doing Business As Americast | System and method for a master scheduler |
US20030005029A1 (en) * | 2001-06-27 | 2003-01-02 | Shavit Nir N. | Termination detection for shared-memory parallel programs |
US20070074207A1 (en) * | 2005-09-27 | 2007-03-29 | Sony Computer Entertainment Inc. | SPU task manager for cell processor |
US8209702B1 (en) * | 2007-09-27 | 2012-06-26 | Emc Corporation | Task execution using multiple pools of processing threads, each pool dedicated to execute different types of sub-tasks |
US20110310977A1 (en) * | 2009-02-18 | 2011-12-22 | Nec Corporation | Task allocation device, task allocation method, and storage medium storing task allocation program |
CN102427577A (en) * | 2011-12-06 | 2012-04-25 | 安徽省徽商集团有限公司 | System for pushing information from collaboration server to mobile terminal and method thereof |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112713993A (en) * | 2020-12-24 | 2021-04-27 | 天津国芯科技有限公司 | Encryption algorithm module accelerator and high-speed data encryption method |
CN112799792A (en) * | 2021-02-01 | 2021-05-14 | 安徽芯纪元科技有限公司 | Method for protecting task context register of embedded operating system |
CN112799792B (en) * | 2021-02-01 | 2023-12-05 | 安徽芯纪元科技有限公司 | Task context register protection method of embedded operating system |
CN113535405A (en) * | 2021-07-30 | 2021-10-22 | 上海壁仞智能科技有限公司 | Cloud service system and operation method thereof |
Also Published As
Publication number | Publication date |
---|---|
WO2016014263A3 (en) | 2016-03-17 |
WO2016014263A2 (en) | 2016-01-28 |
EP3172669A4 (en) | 2018-03-14 |
EP3172669A2 (en) | 2017-05-31 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20200183735A1 (en) | System and Method For Swarm Collaborative Intelligence Using Dynamically Configurable Proactive Autonomous Agents | |
CN108268425B (en) | Programmable matrix processing engine | |
CN103597784B (en) | Method and system for dynamically creating and servicing master-slave pairs within and across switch fabrics of a portable computing device | |
JP3836840B2 (en) | Multiprocessor system | |
US20120290756A1 (en) | Managing Bandwidth Allocation in a Processing Node Using Distributed Arbitration | |
US11010313B2 (en) | Method, apparatus, and system for an architecture for machine learning acceleration | |
JP2013501299A (en) | Data multicasting in distributed processor systems. | |
CN106537343A (en) | System and method for parallel processing using dynamically configurable proactive co-processing cells | |
US10713026B2 (en) | Heterogeneous distributed runtime code that shares IOT resources | |
US20090113138A1 (en) | Combined Response Cancellation for Load Command | |
WO2007074905A1 (en) | Network equipment system | |
JP2005209207A (en) | Method for managing data in array processor, and array processor | |
Dogan et al. | Accelerating graph and machine learning workloads using a shared memory multicore architecture with auxiliary support for in-hardware explicit messaging | |
JP2020027613A (en) | Artificial intelligence chip and instruction execution method used in artificial intelligence chip | |
Si et al. | Direct MPI library for Intel Xeon Phi co-processors | |
CN103455371A (en) | Mechanism for optimized intra-die inter-nodelet messaging communication | |
JP3836837B2 (en) | Method, processing unit, and data processing system for microprocessor communication in a multiprocessor system | |
US10339059B1 (en) | Global socket to socket cache coherence architecture | |
CN103093446A (en) | Multi-source image fusion device and method based on on-chip system of multiprocessor | |
Zimmer et al. | Nocmsg: Scalable noc-based message passing | |
JP2021507384A (en) | On-chip communication system for neural network processors | |
EP4022446B1 (en) | Memory sharing | |
CN113556242B (en) | Method and equipment for performing inter-node communication based on multi-processing nodes | |
CN112948001A (en) | Method for setting tensor hardware configuration, readable storage medium and device | |
CN103294623B (en) | Multi-thread dispatch circuit for a configurable SIMD system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| C06 | Publication | |
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| REG | Reference to a national code | Ref country code: HK; Ref legal event code: DE; Ref document number: 1232997; Country of ref document: HK |
| WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20170322 |
| REG | Reference to a national code | Ref country code: HK; Ref legal event code: WD; Ref document number: 1232997; Country of ref document: HK |