CN103491024A - Job scheduling method and device for streaming data - Google Patents


Info

Publication number
CN103491024A
Authority
CN
China
Prior art keywords
processing unit
physical node
resource
queue
job
Prior art date
Legal status
Granted
Application number
CN201310451552.XA
Other languages
Chinese (zh)
Other versions
CN103491024B (en)
Inventor
王旻 (Wang Min)
韩冀中 (Han Jizhong)
李勇 (Li Yong)
张章 (Zhang Zhang)
孟丹 (Meng Dan)
Current Assignee
Institute of Information Engineering of CAS
Original Assignee
Institute of Information Engineering of CAS
Priority date
Filing date
Publication date
Application filed by Institute of Information Engineering of CAS
Priority to CN201310451552.XA
Publication of CN103491024A
Application granted
Publication of CN103491024B
Status: Active

Landscapes

  • Computer And Data Communications (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention relates to a job scheduling method for streaming data, comprising the following steps: a scheduling manager fetches jobs to be scheduled from a scheduling queue in real time and uses a directed acyclic graph to generate a processing-unit queue from the job information; according to each physical node's non-local communication count and dominant resource ratio, the scheduling manager selects a physical node for every processing unit in the queue and distributes the units to their respective nodes; when an executor starts a processing unit, it first creates a Linux container on the unit's physical node and then starts the unit inside that container. The method schedules processing units onto physical nodes with few non-local communications and low load, so that units that communicate frequently are concentrated on the same physical node and network traffic across physical nodes is reduced.

Description

Job scheduling method and device for streaming data
Technical field
The present invention relates to the field of computer parallel computing, and in particular to a job scheduling method and device for streaming data.
Background
In recent years, with the growth of applications such as real-time search, advertisement recommendation, social networking and online log analysis, a new data form has emerged: streaming data. Streaming data is a large, fast, uninterrupted sequence of events; depending on the scenario it may take forms such as real-time queries, user clicks, online logs or streaming media. Streaming applications emphasize real-time interaction, and excessive response latency severely degrades their function or the user experience. Systems such as S4 have been built to process this kind of data.
An event is the basic unit of streaming data and occurs in key-value form. A processing unit is the basic unit for handling events: it is bound to a specific event type and key and handles only events with that type and key. A processing unit receives streaming data, processes the events in it, and then emits new events or publishes results directly.
Streaming data processing is characterized by heavy communication traffic, a large amount of computation and fast response times, and places high demands on system performance. Processing units are distributed over physical nodes and consume their CPU, memory and network bandwidth; an overloaded physical node directly degrades the performance of its processing units. In addition, processing units must communicate with each other to transfer streaming data, which introduces network latency and communication overhead. Scheduling processing units sensibly according to the characteristics of streaming data is therefore a key problem in streaming data processing.
Existing streaming data processing systems do not consider communication between processing units or the real-time load of physical nodes; they only balance the number of processing units per node. Because different processing units demand different amounts of each resource, balancing unit counts does not balance resource usage or actual load, and units that communicate frequently with each other may be scheduled onto distant nodes, increasing their communication cost and latency. Existing scheduling methods therefore fail to meet the demands of streaming data processing, causing unbalanced node loads and high communication latency and hurting overall performance.
Summary of the invention
Technical problem to be solved by this invention is to provide a kind of by calculating non-local number of communications and leading resource ratio, and as standard, physical node is sorted, can be by the job scheduling method towards stream data on the physical node that processing unit dispatching is less to non-local communication, load is lower and device.
The technical scheme that the present invention solves the problems of the technologies described above is as follows: a kind of job scheduling method towards stream data comprises the following steps:
Step 1: scheduler handler treats that from storage the scheduling queue of schedule job, Real-time Obtaining is treated schedule job; and utilize directed acyclic graph to generate the processing unit queue that comprises a plurality of processing units according to the information for the treatment of schedule job; described scheduler handler is arranged on the physical node of high configuration, is not provided with on other physical nodes of scheduler handler and is respectively arranged with an actuator;
Step 2: scheduler handler is according to non-local number of communications and the leading resource ratio of physical node, for each processing unit in the processing unit queue is selected respectively physical node, all processing units are distributed to respectively to corresponding physical node, and the quantity that is provided with processing unit on each physical node is zero to a plurality of;
Step 3: actuator, when starting processing unit, first creates a linux container on this physical node, then at the linux internal tank, starts processing unit.
The invention has the beneficial effects as follows: the method will need the processing unit of frequent communication to focus on the Same Physical node, reduce the network service across physical node.Simultaneously, in the situation that communication cost is identical or the physical node load too high, the method can select the lower physical node of load to be disposed, and the phenomenon of avoiding transshipping occurs.The present invention has reduced communication cost and network delay that stream data is processed, has realized load balancing, has improved the overall performance that stream data is processed.
On the basis of technique scheme, the present invention can also do following improvement.
Further, step 1 specifically comprises the following steps:
Step 1.1: the scheduling manager takes a job to be scheduled from the scheduling queue that stores pending jobs on a first-in, first-out basis;
Step 1.2: according to the job's predetermined business demand and concurrency demand information, the job is decomposed into a directed acyclic graph whose vertices are processing units and whose edges are data paths;
Step 1.3: a vertex with no incoming edges is selected from the directed acyclic graph and added to the processing-unit queue;
Step 1.4: the vertex selected in step 1.3 is deleted from the directed acyclic graph, together with all edges leaving it;
Step 1.5: if vertices remain in the directed acyclic graph, go to step 1.3; otherwise finish.
Further, step 2 comprises the following steps:
Step 2.1: compute each physical node's non-local communication count and dominant resource ratio; the non-local communication count is the number of processing units that must communicate over the network with the unit being scheduled but do not reside on the same physical node, and the dominant resource ratio is the highest of the node's per-resource demand-to-availability ratios;
Step 2.2: sort the physical nodes by non-local communication count and dominant resource ratio to obtain a sorted list, and mark every node in the list as unread;
Step 2.3: select the first unread physical node from the sorted list and mark it as read;
Step 2.4: if the weighted load value of the selected node is below a predetermined value, place the processing unit on that node and finish;
Step 2.5: if unread physical nodes remain in the sorted list, go to step 2.3; otherwise select the first physical node in the sorted list.
Further, computing a physical node's non-local communication count comprises:
initializing the non-local communication count to 0;
assuming the processing unit to be scheduled has been placed on the candidate physical node, and traversing all processing units whose scheduling is complete;
adding one to the count for every scheduled processing unit that must communicate with the unit to be scheduled but does not reside on the same physical node; the final value is the non-local communication count.
Further, the dominant resource ratio is computed as follows:
for every resource of the physical node, compute the demand-to-availability ratio, i.e. the processing unit's demand for the resource divided by the physical node's available amount of that resource;
the highest of these ratios is the physical node's dominant resource ratio.
Further, in step 2.2, sorting the physical nodes in ascending order by non-local communication count and dominant resource ratio specifically comprises:
comparing the non-local communication counts of two physical nodes first: if they differ, the node with the smaller count comes first; if they are equal, the node with the smaller dominant resource ratio comes first.
Further, the weighted load value is computed as:
weighted load value = CPU utilization × 0.3 + memory utilization × 0.3 + network bandwidth utilization × 0.4.
Further, a job scheduling device for streaming data comprises a scheduling manager, executors and a scheduling queue.
The scheduling manager runs on a highly configured physical node. It fetches jobs to be scheduled from the scheduling queue in real time, uses a directed acyclic graph to generate a processing-unit queue containing a plurality of processing units from each job's information, selects a physical node for every processing unit in the queue according to each node's non-local communication count and dominant resource ratio, and distributes the units to their respective nodes; each physical node hosts at least one processing unit.
Each executor starts processing units: it places every processing unit that the scheduling manager dispatches to its physical node inside a Linux container and starts the unit inside that container.
The scheduling queue is deployed on the same physical node as the scheduling manager and stores the jobs to be scheduled.
Further, the scheduling manager comprises a collection module and a scheduling module.
The collection module collects the IP address and communication port of each executor's physical node, the total and available amount of every resource, and the resource usage of the executors and of the processing units.
The scheduling module fetches jobs to be scheduled from the scheduling queue, generates a processing-unit queue containing a plurality of processing units from each job's information, selects a physical node for every processing unit in the queue, and dispatches the units to their respective nodes.
Further, each processing unit carries a unique processing-unit identifier.
Brief description of the drawings
Fig. 1 is a flow chart of the steps of the method of the invention;
Fig. 2 is a flow chart of generating the processing-unit queue;
Fig. 3 is a schematic diagram of a processing-unit directed acyclic graph of the invention;
Fig. 4 is a schematic diagram of a transformation of the processing-unit directed acyclic graph;
Fig. 5 is a flow chart of the processing-unit scheduling method of the invention;
Fig. 6 is a flow chart of deploying a processing unit;
Fig. 7 is a structural diagram of the device of the invention.
In the drawings, the reference numerals denote: 1, scheduling manager; 2, executor; 3, scheduling queue.
Detailed description of the embodiments
The principles and features of the invention are described below with reference to Figures 1 to 7; the examples serve only to explain the invention and do not limit its scope.
Embodiment 1
The embodiment of the invention implements a streaming data processing system comprising several executors and one scheduling manager. An executor is a daemon running on a physical node; every physical node managed by the system runs one executor, except the node hosting the scheduling manager.
An executor starts and stops processing units on its physical node. To start a processing unit, the executor first creates a Linux container with a specified resource capacity on the node and then starts the unit's task inside the container. Processing units and Linux containers correspond one to one: each unit is placed in its own container, which allocates the specified resources to the processes inside it. Because streaming workloads usually involve heavy communication, the system allocates a fairly comprehensive set of resource types, including CPU, memory and network bandwidth. Each processing unit therefore runs independently inside its container using the resources the system assigned to it, which isolates resources, avoids contention, and improves the units' overall performance and running stability.
The executor also monitors the running state and resource usage of its processing units; because each container holds exactly one unit, monitoring a unit reduces to monitoring its Linux container. The executor periodically sends a heartbeat to the collection module of the scheduling manager: each time, it gathers the resource usage of the units it manages and its own overall resource usage, packages them into a heartbeat, and sends it to the collection module. The heartbeat interval is set and managed through a configuration file.
In a streaming data processing system, sequences of events are transferred between processing units, so the invention must support communication between units; the system provides a name-space mechanism for this purpose. The system assigns each processing unit a globally unique identifier (ID), and at initialization a unit only records the IDs of the units it communicates with and the corresponding business logic relations. The name space maintains the mapping from each unit's ID to its communication address (IP address and port). The first time a unit communicates with another unit, it queries the name space for that unit's address and then communicates with it.
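The name-space mechanism can be sketched as a small registry, as below. The class and method names here are assumptions for illustration, not the patent's API.

```python
class NameSpace:
    """Minimal sketch of the name-space mechanism: maps a globally unique
    processing-unit ID to its communication address (IP address, port)."""

    def __init__(self):
        self._table = {}

    def register(self, unit_id, ip, port):
        # called when a unit is started on some physical node
        self._table[unit_id] = (ip, port)

    def resolve(self, unit_id):
        # looked up once, on first communication with a peer unit
        return self._table[unit_id]

# hypothetical unit ID and address
ns = NameSpace()
ns.register("PE-7", "10.0.0.3", 9100)
print(ns.resolve("PE-7"))  # ('10.0.0.3', 9100)
```

A unit would cache the resolved address after the first lookup, matching the text's "first access the name space, obtain the address, then communicate".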
The scheduling manager is the core controller of the system and consists of two parts: the collection module and the scheduling module. To keep any single process from growing too large and hurting performance and stability, the system implements the two modules as separate processes that communicate through remote procedure calls (RPC). In theory the two modules can be deployed on different physical nodes, but to reduce communication overhead they should run on the same physical node in practice.
The collection module maintains global executor resource information, including the IP address, communication port and the total and available amount of every resource of each executor's physical node; the scheduling module bases its decisions on this resource information. After the scheduling module starts or stops a processing unit, the collection module updates the global resource information according to the unit's resource demand and deployment node. The collection module also receives the periodic heartbeats of the executors, which carry the resource usage of each executor and of its processing units, chiefly their states and the utilization of the various resources.
The scheduling module periodically takes pending jobs from the scheduling queue, generates processing units from the job information, and, on the basis of the global resource information obtained from the collection module, applies the processing-unit scheduling method to schedule and start the units. According to the needs of the system or an administrator's instructions, the scheduling module can also control or dynamically migrate processing units. Administrators and external programs interact with the system through a client that talks to the scheduling module, submitting jobs or issuing commands.
The embodiment uses two kinds of configuration file, one for the scheduling manager and one for the executors. The scheduling manager's configuration file contains the communication addresses of the scheduling module and the collection module, resource allocation policy options, Linux container configuration and so on; the modules read it to initialize when they start. An executor's configuration file contains its communication port, the collection module's communication address, the network interface bound on its physical node and similar resource management information; the executor likewise reads it at startup to initialize, then sends a heartbeat to the collection module to register.
Fig. 1 is the flow chart of the steps of the method of the invention. The method schedules and deploys jobs according to the following steps:
Step 1: the scheduling manager fetches, in real time, a job to be scheduled from the scheduling queue that stores pending jobs, and uses a directed acyclic graph to generate a processing-unit queue containing a plurality of processing units from the job's information; the scheduling manager runs on a highly configured physical node, and every other physical node runs one executor;
Step 2: according to each physical node's non-local communication count and dominant resource ratio, the scheduling manager selects a physical node for every processing unit in the processing-unit queue and distributes the units to their respective nodes; each physical node may host zero or more processing units;
Step 3: when an executor starts a processing unit, it first creates a Linux container on the unit's physical node and then starts the processing unit inside the container.
In the embodiment of the invention, step 1 obtains a job to be scheduled on a first-in, first-out basis: the job at the head of the job queue is taken. The queue is shared by all users; a user submits jobs into the system's job scheduling queue with a client management tool, and the job submitted first is scheduled first.
Fig. 2 is the flow chart of generating the processing-unit queue in the embodiment of the invention, which produces a queue of processing units to be scheduled from the job information. Its steps are as follows:
Step 1 specifically comprises the following steps:
Step 1.1: the scheduling manager takes a job to be scheduled from the scheduling queue that stores pending jobs on a first-in, first-out basis;
Step 1.2: according to the job's predetermined business demand and concurrency demand information, the job is decomposed into a directed acyclic graph (DAG) whose vertices are processing units and whose edges are data paths;
Step 1.3: a vertex with no incoming edges is selected from the directed acyclic graph and added to the processing-unit queue;
Step 1.4: the vertex selected in step 1.3 is deleted from the directed acyclic graph, together with all edges leaving it;
Step 1.5: if vertices remain in the directed acyclic graph, go to step 1.3; otherwise finish.
In the embodiment of the invention, the directed acyclic graph of step 1.2 consists of vertices and edges, where the vertices are processing units and the edges are data paths. The processing-unit queue is the result of topologically sorting the processing units. A topological sort is an ordering of the vertices of a directed acyclic graph such that whenever there is a path from vertex A to vertex B, B appears after A in the ordering.
Fig. 3 is a schematic diagram of the processing-unit directed acyclic graph of the embodiment. In the example of Fig. 3, the system has generated eight processing units A, B, C, D, E, F, G, H from the job information. External streaming data flows into the system through the three units A, B and C, is processed in a series of steps, finally converges at unit H, and the results are output. From the viewpoint of a single unit, D for example receives the event sequences sent by A, B and C, generates new events after processing, and sends them to F.
Fig. 4 is a schematic diagram of a transformation of the processing-unit directed acyclic graph of the embodiment. A topological sort is obtained by repeatedly transforming the graph: each time, a vertex with no incoming edges is selected, then that vertex and the edges leaving it are deleted. Fig. 4 is the intermediate result of applying this transformation once to Fig. 3: vertex A, which has no incoming edges, is selected and its two outgoing edges are deleted, yielding the graph shown in Fig. 4. Carrying the topological sort of the graph of Fig. 3 to completion yields the sequence A, B, C, D, E, F, G, H.
Fig. 5 is the flow chart of the processing-unit scheduling method of the embodiment, which selects a suitable physical node for each processing unit. Its steps are as follows:
Step 2.1: compute each physical node's non-local communication count and dominant resource ratio; the non-local communication count is the number of processing units that must communicate over the network with the unit being scheduled but do not reside on the same physical node, and the dominant resource ratio is the highest of the node's per-resource demand-to-availability ratios;
Step 2.2: sort the physical nodes by non-local communication count and dominant resource ratio to obtain a sorted list L, and mark every node in the list as unread;
Step 2.3: select the first unread physical node N from the sorted list L and mark it as read;
Step 2.4: if the weighted load value of the selected node is below 80%, place the processing unit on that node and finish;
Step 2.5: if unread physical nodes remain in L, go to step 2.3; otherwise select the first physical node in L.
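The selection loop of steps 2.2 to 2.5 can be sketched as follows; the dictionary keys and sample values are illustrative, not from the patent.

```python
def select_node(nodes):
    """nodes: list of dicts with 'nonlocal' (non-local communication count),
    'ratio' (dominant resource ratio) and 'load' (weighted load value).
    Sort ascending (step 2.2), prefer the first node whose weighted load is
    under the 80% threshold (steps 2.3-2.4), otherwise fall back to the
    head of the sorted list (step 2.5)."""
    ordered = sorted(nodes, key=lambda n: (n["nonlocal"], n["ratio"]))
    for node in ordered:              # the mark-as-read walk over the list
        if node["load"] < 0.8:        # step 2.4 threshold
            return node
    return ordered[0]                 # all nodes over threshold: take the first

# Illustrative values: node "b" wins because "a", though ranked first,
# is over the load threshold.
nodes = [{"name": "a", "nonlocal": 1, "ratio": 0.2, "load": 0.9},
         {"name": "b", "nonlocal": 1, "ratio": 0.3, "load": 0.5},
         {"name": "c", "nonlocal": 2, "ratio": 0.1, "load": 0.1}]
print(select_node(nodes)["name"])  # b
```

The fallback in the last line mirrors step 2.5: when every node is overloaded, the node with the best communication/resource ranking is still chosen.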
In the embodiment of the invention, the non-local communication count of step 2.1 is the number of processing units that must communicate over the network with the unit being scheduled but are not on the same physical node. It is computed as follows: initialize the count to 0, assume the unit being scheduled has been deployed to the candidate node, then traverse all units whose scheduling is complete; for every unit that must communicate with the unit being scheduled and is not on the same physical node, add one to the count. Taking the units of Fig. 3 as an example, suppose the system has already scheduled units A, B and C onto three different physical nodes a, b and c, and unit D must now be scheduled. Because D must communicate with A, B and C, the non-local communication count of each of a, b and c with respect to D is 2. If the system contains a further physical node d, its count is 3.
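The count and the worked example above can be expressed directly; the data-structure shapes are assumptions for illustration.

```python
def nonlocal_count(candidate_node, placement, comm):
    """placement: already-scheduled unit -> physical node; comm: set of
    units the unit being scheduled must communicate with.  Assume the new
    unit is placed on candidate_node and count communicating partners
    that sit on other nodes."""
    return sum(1 for peer, node in placement.items()
               if peer in comm and node != candidate_node)

# Worked example from the text: A, B, C already on nodes a, b, c, and the
# new unit D must communicate with all three.
placement = {"A": "a", "B": "b", "C": "c"}
comm = {"A", "B", "C"}
print([nonlocal_count(n, placement, comm) for n in ("a", "b", "c", "d")])
# [2, 2, 2, 3]
```

Each of a, b, c keeps one partner local (count 2), while a fourth node d keeps none local (count 3), matching the text.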
In the embodiment, step 2.1 computes a physical node's dominant resource ratio as follows: for every resource of the node, compute the demand-to-availability ratio, i.e. the processing unit's demand for the resource divided by the node's available amount of it; the highest of these ratios is the node's dominant resource ratio. For example, the system manages three resources: CPU, memory and network bandwidth. If a processing unit's resource demand is <1 CPU, 2G memory, 1 Mbit/s> and a node's available resources are <3 CPU, 10G memory, 4 Mbit/s>, the ratios for CPU, memory and bandwidth are 1/3, 1/5 and 1/4 respectively. Since 1/3 is the largest, the node's dominant resource ratio is 1/3 and its dominant resource is the CPU.
Note that if the available amount of some resource on a node is smaller than the processing unit's demand for it, the node is considered to lack sufficient resources: its dominant resource ratio is marked as infinity and it does not take part in the sorting. For example, if the unit's demand is <1 CPU, 2G memory, 1 Mbit/s> and the node's available resources are <3 CPU, 1G memory, 4 Mbit/s>, the node lacks sufficient resources and its ratio is marked infinite. If no physical node has sufficient resources, the scheduling attempt fails; the processing unit is put back into the scheduling queue and rescheduled after a while.
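The ratio computation, including the insufficient-resources case, can be sketched as:

```python
import math

def dominant_ratio(demand, available):
    """Per-resource demand-to-availability ratios; the largest is the
    dominant resource ratio.  If any resource cannot be satisfied, the
    node is marked with an infinite ratio and excluded from sorting."""
    ratios = [d / a for d, a in zip(demand, available)]
    if any(r > 1 for r in ratios):
        return math.inf
    return max(ratios)

demand = (1, 2, 1)                    # <1 CPU, 2G memory, 1 Mbit/s>
print(dominant_ratio(demand, (3, 10, 4)))   # 0.333... (CPU dominates)
print(dominant_ratio(demand, (3, 1, 4)))    # inf (too little memory)
```

The two calls reproduce the worked examples above: ratios 1/3, 1/5, 1/4 give a dominant resource ratio of 1/3, and a node with only 1G of memory free is marked infinite.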
In the embodiment, step 2.2 sorts the physical nodes in ascending order by non-local communication count and dominant resource ratio: two nodes are compared first by non-local communication count, the node with the smaller count coming first; if the counts are equal, the node with the smaller dominant resource ratio comes first.
In the embodiment, the weighted load value of step 2.4 is a load computed from the three resources CPU, memory and network bandwidth: weighted load value = CPU utilization × 0.3 + memory utilization × 0.3 + network bandwidth utilization × 0.4.
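As a small worked instance of the formula (utilizations expressed as fractions):

```python
def weighted_load(cpu, mem, net):
    """Weighted load value from the three monitored utilizations,
    weighted 0.3 / 0.3 / 0.4 as in the embodiment."""
    return cpu * 0.3 + mem * 0.3 + net * 0.4

# 50% CPU, 50% memory, 50% bandwidth give a weighted load of exactly 50%,
# which is below the 80% threshold of step 2.4.
print(weighted_load(0.5, 0.5, 0.5))  # 0.5
```

Giving bandwidth the largest weight reflects the text's observation that streaming workloads are communication-heavy.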
Fig. 6 is the flow chart of deploying a processing unit in the embodiment. After a physical node has been selected, step 3 deploys the unit as follows:
Step 3.1: deduct the unit's resource quota from the global resource information;
Step 3.2: send the unit's information down to the selected physical node;
Step 3.3: create a Linux container (Linux Container) on the node and set its resource quota according to the unit's resource demand;
Step 3.4: start the processing unit inside the container created in step 3.3.
In embodiments of the present invention, the global resource information referred to in step 3.1 is the total amount and the available amount of every resource on all physical nodes, as maintained by the collection module of the scheduling manager. The resource totals of a physical node may be expressed as <32 CPUs, 64 GB memory, 100 Mb/s>, and its current available amount as <8 CPUs, 15 GB memory, 30 Mb/s>. Suppose the available resources of the selected physical node a are <8 CPUs, 15 GB memory, 30 Mb/s> and the resource demand of the processing unit to be deployed is <2 CPUs, 4 GB memory, 10 Mb/s>; the demand of the processing unit is deducted from a's available amount, and after the update the available resources of a are <6 CPUs, 11 GB memory, 20 Mb/s>.
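The deduction in step 3.1 can be sketched as a simple per-resource subtraction; the dict keys are illustrative names, not from the patent:

```python
def deduct_quota(available, demand):
    """Subtract a processing unit's resource demand from a node's available amounts."""
    return {res: amount - demand.get(res, 0) for res, amount in available.items()}

# Worked example from the description: node a before and after deployment.
node_a = {"cpu": 8, "mem_gb": 15, "net_mbps": 30}
unit = {"cpu": 2, "mem_gb": 4, "net_mbps": 10}
print(deduct_quota(node_a, unit))  # {'cpu': 6, 'mem_gb': 11, 'net_mbps': 20}
```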
The system uses Linux Containers (LXC) to provide a resource-isolated environment for each processing unit. A Linux container is a lightweight virtualization technology whose performance overhead is very small, usually negligible, which suits the performance-critical nature of stream data processing. In addition, a Linux container can adjust its resource capacity dynamically, which makes the automatic resource-quota scaling method of the system feasible. A processing unit started inside a Linux container can, by the nature of the container, use only the resource quota allocated to it and cannot preempt the resources of other processing units when resources are tight.
A job scheduling device for streaming data comprises a scheduling manager 1, executors 2, and a scheduling queue 3.
The scheduling manager 1 is deployed on a highly configured physical node. It obtains jobs to be scheduled from the scheduling queue 3 in real time, uses a directed acyclic graph to generate, from the information of a job to be scheduled, a processing unit queue comprising a plurality of processing units, selects a physical node for each processing unit in the queue according to the non-local communication counts and dominant resource ratios of the physical nodes, and distributes all processing units to their corresponding physical nodes; each physical node is provided with at least one processing unit.
The executor 2 starts processing units: each processing unit that the scheduling manager 1 dispatches to its physical node is placed inside a Linux container and started inside that container.
The scheduling queue 3 is deployed on the same physical node as the scheduling manager and stores the jobs to be scheduled.
The scheduling manager 1 comprises a collection module and a scheduling module.
The collection module collects, for the physical node hosting each executor, the IP address, the communication port, the total amount and available amount of every resource, the resource usage of the executor, and the resource usage of each processing unit.
The scheduling module obtains jobs to be scheduled from the scheduling queue, generates from the job information a processing unit queue comprising a plurality of processing units, selects a physical node for each processing unit in the queue, and dispatches all processing units to their corresponding physical nodes.
Each processing unit is provided with a unique processing unit identifier.
The foregoing are only preferred embodiments of the present invention and are not intended to limit the present invention; any modification, equivalent substitution, improvement, etc. made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (10)

1. A job scheduling method for streaming data, characterized in that it comprises the following steps:
Step 1: a scheduling manager obtains a job to be scheduled in real time from a scheduling queue that stores jobs to be scheduled, and uses a directed acyclic graph to generate, from the information of the job to be scheduled, a processing unit queue comprising a plurality of processing units; the scheduling manager is deployed on a highly configured physical node, and each of the other physical nodes, on which no scheduling manager is deployed, is provided with one executor;
Step 2: the scheduling manager selects a physical node for each processing unit in the processing unit queue according to the non-local communication counts and dominant resource ratios of the physical nodes, and distributes all processing units to their corresponding physical nodes; the number of processing units deployed on each physical node ranges from zero to a plurality;
Step 3: when starting a processing unit, the executor first creates a Linux container on the physical node, and then starts the processing unit inside the Linux container.
2. The job scheduling method for streaming data according to claim 1, characterized in that step 1 specifically comprises the following steps:
Step 1.1: the scheduling manager obtains a job to be scheduled, according to the first-in first-out principle, from the scheduling queue that stores jobs to be scheduled;
Step 1.2: according to predetermined business demand information and concurrency demand information of the job to be scheduled, decompose the job into a directed acyclic graph whose vertices are processing units and whose edges are data paths;
Step 1.3: select from the directed acyclic graph a vertex with no incoming edge, and add this vertex to the processing unit queue;
Step 1.4: delete from the directed acyclic graph the vertex selected in step 1.3, together with all edges issuing from this vertex;
Step 1.5: determine whether any vertex remains in the directed acyclic graph; if so, go to step 1.3; if not, finish.
3. The job scheduling method for streaming data according to claim 1, characterized in that step 2 further comprises the following steps:
Step 2.1: compute the non-local communication count and the dominant resource ratio of each physical node, wherein the non-local communication count is the number of already scheduled processing units that communicate over the network with the processing unit to be scheduled, assumed placed on the current physical node, but do not reside on the same physical node, and the dominant resource ratio is the highest of the demand availability ratios of all resources;
Step 2.2: sort the physical nodes according to non-local communication count and dominant resource ratio to obtain a sorted list, and mark every physical node in the list as "unread";
Step 2.3: select the first "unread" physical node from the sorted list and mark it as "read";
Step 2.4: determine whether the weighted load value of the selected first "unread" physical node is less than a predetermined value; if it is less than the predetermined value, place the processing unit to be scheduled on this physical node and finish;
Step 2.5: determine whether any "unread" physical node remains in the sorted list; if so, go to step 2.3; if not, select the first physical node in the sorted list.
4. The job scheduling method for streaming data according to claim 3, characterized in that computing the non-local communication count of a physical node further comprises:
initializing the non-local communication count to 0;
assuming that the processing unit to be scheduled has been dispatched to the current physical node, and then traversing all processing units whose scheduling has been completed;
if a scheduled processing unit needs to communicate with the processing unit to be scheduled and the two processing units are not on the same physical node, incrementing the current non-local communication count by one; the final value obtained is the non-local communication count.
5. The job scheduling method for streaming data according to claim 3, characterized in that the dominant resource ratio is computed as follows:
compute the demand availability ratio of every resource of the physical node, the demand availability ratio being the ratio of the processing unit's demand for a resource to the physical node's available amount of that resource;
select the highest of the demand availability ratios of all resources as the dominant resource ratio of the physical node.
6. The job scheduling method for streaming data according to claim 3, characterized in that:
in step 2.2, sorting the physical nodes in ascending order according to non-local communication count and dominant resource ratio specifically comprises the following steps:
first compare the non-local communication counts of the two physical nodes; if they differ, the node with the smaller count ranks first; if they are equal, then compare the dominant resource ratios of the two, and the node with the smaller dominant resource ratio ranks first.
7. The job scheduling method for streaming data according to claim 3, characterized in that the weighted load value is specifically computed as:
weighted load value = CPU utilization × 0.3 + memory utilization × 0.3 + network bandwidth utilization × 0.4.
8. A job scheduling device for streaming data, characterized by comprising a scheduling manager (1), executors (2), and a scheduling queue (3);
the scheduling manager (1) is deployed on a highly configured physical node and is configured to obtain jobs to be scheduled from the scheduling queue (3) in real time, to use a directed acyclic graph to generate, from the information of a job to be scheduled, a processing unit queue comprising a plurality of processing units, to select a physical node for each processing unit in the queue according to the non-local communication counts and dominant resource ratios of the physical nodes, and to distribute all processing units to their corresponding physical nodes, each physical node being provided with at least one processing unit;
the executor (2) is configured to start processing units: each processing unit that the scheduling manager (1) dispatches to its physical node is placed inside a Linux container and started inside that container;
the scheduling queue (3) is deployed on the same physical node as the scheduling manager and is configured to store jobs to be scheduled.
9. The job scheduling device for streaming data according to claim 8, characterized in that the scheduling manager (1) comprises a collection module and a scheduling module;
the collection module is configured to collect, for the physical node hosting each executor, the IP address, the communication port, the total amount and available amount of every resource, the resource usage of the executor, and the resource usage of each processing unit;
the scheduling module is configured to obtain jobs to be scheduled from the scheduling queue, to generate from the job information a processing unit queue comprising a plurality of processing units, to select a physical node for each processing unit in the queue, and to dispatch all processing units to their corresponding physical nodes.
10. The job scheduling device for streaming data according to claim 8, characterized in that each processing unit is provided with a unique processing unit identifier.
CN201310451552.XA 2013-09-27 2013-09-27 Job scheduling method and device for streaming data Active CN103491024B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310451552.XA CN103491024B (en) 2013-09-27 2013-09-27 Job scheduling method and device for streaming data

Publications (2)

Publication Number Publication Date
CN103491024A true CN103491024A (en) 2014-01-01
CN103491024B CN103491024B (en) 2017-01-11

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105515864A (en) * 2015-12-11 2016-04-20 深圳市中润四方信息技术有限公司 Container resource adaptive adjustment method and container resource adaptive adjustment system
WO2017005115A1 (en) * 2015-07-08 2017-01-12 阿里巴巴集团控股有限公司 Adaptive optimization method and device for distributed dag system
CN106471773A (en) * 2014-06-30 2017-03-01 微软技术许可有限责任公司 The distribution of integrated form global resource and load balance
CN106506254A (en) * 2016-09-20 2017-03-15 北京理工大学 A kind of bottleneck node detection method of extensive stream data processing system
CN107105009A (en) * 2017-03-22 2017-08-29 北京荣之联科技股份有限公司 Job scheduling method and device based on Kubernetes system docking workflow engines
CN107508765A (en) * 2017-08-15 2017-12-22 华为技术有限公司 A kind of message treatment method and equipment
CN109101575A (en) * 2018-07-18 2018-12-28 广东惠禾科技发展有限公司 Calculation method and device
WO2019024508A1 (en) * 2017-08-04 2019-02-07 北京奇虎科技有限公司 Resource allocation method, master device, slave device, and distributed computing system
US10334028B2 (en) * 2016-04-11 2019-06-25 Fujitsu Limited Apparatus and method for processing data
CN110990059A (en) * 2019-11-28 2020-04-10 中国科学院计算技术研究所 Stream type calculation engine operation method and system for tilt data

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101694709A (en) * 2009-09-27 2010-04-14 华中科技大学 Service-oriented distributed work flow management system
CN102004670A (en) * 2009-12-17 2011-04-06 华中科技大学 Self-adaptive job scheduling method based on MapReduce
CN102364447A (en) * 2011-10-28 2012-02-29 北京航空航天大学 Operation scheduling method for optimizing communication energy consumption among multiple tasks
CN102387173A (en) * 2010-09-01 2012-03-21 中国移动通信集团公司 MapReduce system and method and device for scheduling tasks thereof
CN102426542A (en) * 2011-10-28 2012-04-25 中国科学院计算技术研究所 Resource management system for data center and operation calling method

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101963921A (en) * 2010-09-21 2011-02-02 卓望数码技术(深圳)有限公司 Operation scheduling method and system
CN102256369B (en) * 2011-06-28 2014-05-21 武汉理工大学 Task scheduling method for wireless sensor grid based on energy and communication overhead

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
LIU, Yong: "Research on Embedded Reconfigurable Computing Systems and Their Task Scheduling Mechanisms", China Doctoral Dissertations Full-text Database, Information Science and Technology, 15 February 2007 (2007-02-15), pages 137-16 *
AN, Xifeng: "Research and Implementation of High-Performance Computing Cluster Management Systems and Job Scheduling Technology", China Master's Theses Full-text Database, Information Science and Technology, 15 April 2005 (2005-04-15), pages 138-38 *
WANG, Fei: "A Service Deployment System for MPI Applications in Cloud Computing Environments", China Master's Theses Full-text Database, Information Science and Technology, 15 July 2013 (2013-07-15), pages 137-10 *
DENG, Zili: "Network Topology Design and Hadoop Platform Research in Cloud Computing", China Master's Theses Full-text Database, Information Science and Technology, 15 July 2010 (2010-07-15), pages 139-16 *

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant