CN111291973A - Space crowdsourcing task allocation method based on alliance - Google Patents


Info

Publication number
CN111291973A
Authority
CN
China
Prior art keywords
task
worker
workers
federation
task allocation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010051354.4A
Other languages
Chinese (zh)
Other versions
CN111291973B (en)
Inventor
郑凯
赵艳
李响
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Maikesi Wuxi Data Technology Co ltd
Original Assignee
Mycos Suzhou Data Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mycos Suzhou Data Technology Co Ltd filed Critical Mycos Suzhou Data Technology Co Ltd
Priority to CN202010051354.4A priority Critical patent/CN111291973B/en
Publication of CN111291973A publication Critical patent/CN111291973A/en
Application granted granted Critical
Publication of CN111291973B publication Critical patent/CN111291973B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 - Administration; Management
    • G06Q10/06 - Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 - Operations research, analysis or management
    • G06Q10/0631 - Resource planning, allocation, distributing or scheduling for enterprises or organisations
    • G06Q10/06311 - Scheduling, planning or task assignment for a person or group
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/004 - Artificial life, i.e. computing arrangements simulating life
    • G06N3/006 - Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 - Administration; Management
    • G06Q10/06 - Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 - Operations research, analysis or management
    • G06Q10/0631 - Resource planning, allocation, distributing or scheduling for enterprises or organisations
    • G06Q10/06312 - Adjustment or analysis of established resource schedule, e.g. resource or task levelling, or dynamic rescheduling

Landscapes

  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Strategic Management (AREA)
  • General Physics & Mathematics (AREA)
  • Educational Administration (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Game Theory and Decision Science (AREA)
  • Development Economics (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • General Factory Administration (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an alliance-based spatial crowdsourcing task allocation method, which comprises the following steps: (1) organizing workers into worker alliances; (2) formulating a contract-based greedy task allocation method; (3) formulating an equilibrium-based algorithm; (4) formulating a simulated annealing method to find a better Nash equilibrium. In this manner, given a set of workers and a set of tasks during spatial crowdsourcing task allocation, the alliance-based spatial crowdsourcing task allocation method determines a stable worker alliance for each task so as to maximize the total reward of the task allocation, and has broad market prospects for the popularization of alliance-based spatial crowdsourcing task allocation.

Description

Space crowdsourcing task allocation method based on alliance
Technical Field
The invention relates to the field of spatial crowdsourcing, in particular to a spatial crowdsourcing task allocation method based on alliances.
Background Art
Spatial Crowdsourcing (SC) is a new class of crowdsourcing that enables people, acting as multimodal mobile sensors, to instantly collect and share various types of high-fidelity spatiotemporal data. Specifically, task requesters can post spatial tasks to the SC server, which then employs the carriers of smart devices as workers; the workers physically travel to the designated locations and complete the tasks, a process known as task assignment.
A large amount of existing research has focused on individual task assignment, where each task can only be assigned to one worker. However, in practice there are inevitably SC applications in which a single worker cannot effectively perform a task on her own, such as monitoring traffic conditions or cleaning rooms. In such cases, workers must form a small group, or alliance, to collectively perform tasks that exceed the capability of a single worker. Furthermore, each worker may prefer to work with other workers who can reward her with better reputation or money.
Some recent work has explored redundant task assignment, which allows each spatial task to be assigned to and completed by several reachable workers in its vicinity. However, a large portion of these studies employ centralized task allocation systems, in which computing matches between all reachable workers and tasks can incur high computational cost in large-scale SC scenarios. As a result, they greatly increase the difficulty of system implementation and the human effort needed to apply these methods in practice. In addition, they neglect the differences in worker behavior during alliance formation, which leads to unstable alliances because the utility of alliance members cannot be guaranteed. In other words, members whose achieved utility is unsatisfactory may leave the alliance, causing the task to remain incomplete. More importantly, the above existing strategies inherently lack cooperation. Recently, Cheng Peng et al. were the first to consider cooperation in task assignment, in which workers cooperate and complete tasks together to achieve a higher overall cooperation quality score. They assume that workers are willing to perform tasks for free, which is unrealistic, because workers may not be motivated to perform assigned tasks unless the costs of their participation (e.g., mobile device battery consumption for sensing and data processing) are compensated by a satisfactory reward. In addition, that work only aims at finding a Nash equilibrium and does not further explore the better equilibrium points that may exist.
Disclosure of Invention
The technical problem to be solved by the invention is to provide an alliance-based spatial crowdsourcing task allocation method. The invention formulates a novel form of task allocation in spatial crowdsourcing, namely alliance-based task allocation, in which workers need to form worker alliances to interact with others in order to perform the corresponding tasks. Given a set of workers and a set of tasks, a stable worker alliance is determined for each task so as to maximize the total reward of the task allocation. Further, a contract-based greedy task allocation method is adopted to allocate tasks effectively, in which a penalty contract is formulated and enforced so that workers who leave an alliance are penalized. The alliance-based task allocation problem is then converted into a multiplayer game and an equilibrium-based solution is proposed, in which a Nash equilibrium is found based on a best response framework; meanwhile, a simulated annealing strategy is further introduced when updating the workers' strategies to improve the effectiveness of task allocation. The method therefore has broad market prospects for the popularization of alliance-based spatial crowdsourcing task allocation.
In order to solve the above technical problem, the invention provides an alliance-based spatial crowdsourcing task allocation method, which comprises the following steps:
(1) organizing workers into worker alliances to perform the corresponding tasks, so that the workers can interact with one another;
(2) formulating a contract-based greedy task allocation method to form a group of worker alliances so as to effectively allocate tasks;
(3) formulating an equilibrium-based algorithm, wherein a Nash equilibrium is found based on a best response framework so as to form stable worker alliances and obtain a higher total reward;
(4) because the equilibrium point obtained by the best response method in step (3) is not unique and is not necessarily optimal in terms of total reward, formulating a simulated annealing method to find a better Nash equilibrium, wherein the workers' strategies are continuously and asynchronously updated by the best response method so as to reach a pure Nash equilibrium, and the best response method with simulated annealing converges to an equilibrium point.
In a preferred embodiment of the present invention, the worker broadly refers to a person who performs spatial tasks only for payment.
In a preferred embodiment of the invention, the worker's work modes include an online mode and an offline mode; the online mode indicates that the worker is ready to accept tasks, and once the server assigns a task to a worker, that worker is considered to be in the offline mode until the assigned task is completed.
In a preferred embodiment of the present invention, the constraint conditions in step (1) include the worker's reachable range and the task expiration time.
In a preferred embodiment of the present invention, the contract-based greedy task assignment method in step (2) assumes that workers all behave in a selfish manner.
In a preferred embodiment of the present invention, the contract-based greedy task allocation method in step (2) is implemented by making and implementing a penalty contract, which imposes a penalty on workers leaving the federation.
In a preferred embodiment of the present invention, the penalty contract in step (2) is embodied in that, once a worker alliance is established to perform a task, alliance members who leave the alliance are penalized by a fine, the amount of which is equal to the reward due for the task.
In a preferred embodiment of the present invention, the equilibrium-based algorithm in step (3) is specifically that workers form alliances in sequence and update their strategies in turn, each selecting the best response task to maximize its utility, until a Nash equilibrium is reached in which no worker can unilaterally switch from its assigned alliance to another alliance to increase its utility.
In a preferred embodiment of the invention, the utility of a worker is the reward the worker brings to the alliance it joins.
The invention has the following beneficial effects: the invention discloses an alliance-based spatial crowdsourcing task allocation method, which formulates a novel form of task allocation in spatial crowdsourcing, namely alliance-based task allocation, in which workers need to form worker alliances to interact with others in order to perform the corresponding tasks. Given a set of workers and a set of tasks, a stable worker alliance is determined for each task so as to maximize the total reward of the task allocation. Further, a contract-based greedy task allocation method is adopted to allocate tasks effectively, in which a penalty contract is formulated and enforced so that workers who leave an alliance are penalized. The alliance-based task allocation problem is converted into a multiplayer game and an equilibrium-based solution is proposed, in which a Nash equilibrium is found based on a best response framework; meanwhile, a simulated annealing strategy is further introduced when updating the workers' strategies, which improves the effectiveness of task allocation and achieves the highest total reward of the task allocation. The method has broad market prospects for the popularization of alliance-based spatial crowdsourcing task allocation.
Drawings
In order to illustrate the method solutions in the embodiments of the present invention more clearly, the drawings that are needed in the description of the embodiments will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings can be obtained by those skilled in the art without inventive effort, wherein:
FIG. 1 is a flowchart of an exemplary embodiment of a federation-based spatial crowd-sourced task allocation method of the present invention;
FIG. 2 is a task reward pricing model of a preferred embodiment of the federation-based spatial crowd-sourced task allocation method of the present invention;
FIG. 3 is the workload allocation of workers {w2, w3} for task s2 in a preferred embodiment of the federation-based spatial crowdsourcing task allocation method of the present invention;
FIG. 4 is a diagram illustrating the effect of the synthetic data set | S | according to a preferred embodiment of the federation-based spatial crowd-sourcing task distribution method of the present invention;
FIG. 5 is a diagram illustrating the effect of the gMission data set | S | in a preferred embodiment of the federation-based spatial crowd-sourcing task allocation method of the present invention;
FIG. 6 is the effect of the synthetic data set | W | of a preferred embodiment of the federation-based spatial crowd-sourcing task distribution method of the present invention;
FIG. 7 is the effect of the gMission dataset | W | of a preferred embodiment of the federation-based spatial crowd-sourcing task allocation method of the present invention;
FIG. 8 is a diagram of the impact of the synthetic data set r of a preferred embodiment of the federation-based spatial crowd-sourced task allocation method of the present invention;
FIG. 9 is a diagram of the impact of a synthetic dataset e of a preferred embodiment of the federation-based spatial crowd-sourced task allocation method of the present invention;
FIG. 10 is a diagram of the impact of the synthetic data sets d-e of a preferred embodiment of the federation-based spatial crowd-sourced task allocation method of the present invention;
FIG. 11 is a main flow of a greedy contract-based method according to a preferred embodiment of the federation-based spatial crowd-sourcing task allocation method of the present invention;
FIG. 12 is a main flow of the best response method of a preferred embodiment of the federation-based spatial crowd-sourcing task distribution method of the present invention;
FIG. 13 shows experimental parameters of a preferred embodiment of the federation-based spatial crowd-sourcing task allocation method of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below. It is obvious that the described embodiments are only a part of the embodiments of the present invention, rather than all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
Referring to fig. 1 to 13, an embodiment of the present invention includes:
a space crowdsourcing task distribution method based on alliance includes the following steps:
(1) organizing workers into worker alliances to perform the corresponding tasks, so that the workers can interact with one another;
(2) formulating a contract-based greedy task allocation method to form a group of worker alliances so as to effectively allocate tasks;
(3) formulating an equilibrium-based algorithm, wherein a Nash equilibrium is found based on a best response framework so as to form stable worker alliances and obtain a higher total reward;
(4) because the equilibrium point obtained by the best response method in step (3) is not unique and is not necessarily optimal in terms of total reward, formulating a simulated annealing method to find a better Nash equilibrium, wherein the workers' strategies are continuously and asynchronously updated by the best response method so as to reach a pure Nash equilibrium, and the best response method with simulated annealing converges to an equilibrium point.
Preferably, the worker broadly refers to a person who performs spatial tasks only for payment.
Preferably, the worker's work modes include an online mode and an offline mode; the online mode indicates that the worker is ready to accept tasks, and once the server assigns a task to a worker, that worker is considered to be in the offline mode until the assigned task is completed.
Preferably, the constraint conditions in step (1) include the worker's reachable range and the task expiration time.
Preferably, the contract-based greedy task assignment method in step (2) assumes that workers all behave in a selfish manner.
Preferably, the contract-based greedy task allocation method in step (2) is implemented by making and implementing a penalty contract, and applying a penalty to workers leaving the alliance.
Preferably, the penalty contract in step (2) is embodied in that, once a worker alliance is established to perform a task, alliance members who leave the alliance are penalized by a fine equal to the reward due for the task.
Preferably, the equilibrium-based algorithm in step (3) is specifically that workers form alliances in sequence and update their strategies in turn, each selecting the best response task to maximize its utility, until a Nash equilibrium is reached in which no worker can unilaterally switch from its assigned alliance to another alliance to increase its utility.
Preferably, the utility of a worker is the reward the worker brings to the alliance it joins.
In this embodiment, the relevant symbols are defined as follows: s denotes a spatial task; s.l denotes the location of spatial task s; s.p denotes the release time of s; s.e denotes the expected completion time of s; s.d denotes the expiration time of s; s.wl denotes the workload of s; s.maxR denotes the maximum reward of s; s.pr denotes the penalty rate of s; S denotes a set of spatial tasks; w denotes a worker; w.l denotes the location of worker w; w.r denotes the reachable radius of worker w; W denotes the set of workers; AW(s) denotes the set of available workers for task s; t(a, b) denotes the travel time from location a to location b; d(a, b) denotes the travel distance from location a to location b; w.RS denotes the set of reachable tasks of worker w; WC(s) denotes a worker alliance for task s; R_WC(s) denotes the reward for the worker alliance WC(s) completing task s; s.t_e denotes the completion time at which WC(s) completes task s; s.t_s denotes the start time (i.e., the allocation time) of task s; T_WC(s) denotes the duration of task s (i.e., the elapsed time from allocation to completion); w.wl(WC(s)) denotes the work contribution of worker w when performing task s in WC(s); MWC(s) denotes the minimum worker alliance of task s; A denotes a spatial task allocation; R_A denotes the total reward of task allocation A; 𝔸 denotes the set of all possible allocations; and A* denotes the globally optimal allocation.
I. Definitions
Definition 1 (spatial task): A spatial task can be represented as s = <s.l, s.p, s.e, s.d, s.wl, s.maxR, s.pr>, where s.l denotes the publication location of the spatial task s, which can be described by a point (x, y) in two-dimensional space, s.p denotes the release time of s, s.e denotes the expected completion time of s, and s.d denotes the expiration time of s. Each task is also labeled with the workload s.wl required for an average worker to complete the task s (in the present invention we use the time required to complete the task to denote s.wl). s.maxR denotes the maximum reward that the requester of task s can provide, and s.pr is a penalty rate that relates the task completion time to the reward.
Definition 2 (worker): A worker is a person who performs spatial tasks only for payment and is denoted as w = <w.l, w.r>. A worker can be in either online mode or offline mode. When the worker is in online mode, it indicates that he is ready to accept tasks.
Each worker w on line has a location w.l where it is currently located and the area accessible to the worker is a circular area centered at w.l and having a radius of w.r, and the worker can only accept the allocation of space tasks within that area.
In the model we define, each worker can only handle one task at a particular time, whether alone or as part of a federation that together strives to accomplish the task, which is justified in real life. Once the server assigns a task to a worker, the worker is considered to be in an offline mode until the assigned task is completed.
Due to the constraints of worker reach and task dead time, each task can only be completed by a small fraction of workers (i.e., an available worker set), which is defined as follows.
Definition 3 (available worker set): The set of available workers for task s, denoted AW(s), is the set of workers w ∈ W that satisfy the following two conditions:

1) worker w can reach the location of task s before task s expires, i.e., t_now + t(w.l, s.l) < s.d;

2) task s is within the reachable range of worker w, i.e., d(w.l, s.l) ≤ w.r;

where t_now denotes the current time, t(a, b) denotes the travel time from location a to location b, and d(a, b) is the distance between a given location a and location b. The two conditions together ensure that the worker can travel directly from her current location to the location of task s before the task expires. If worker w is available for task s, i.e., w ∈ AW(s), we say that s is a reachable task of w, and we denote the set of reachable tasks of worker w as w.RS.
Definition 4 (worker alliance): Given a task s to be assigned and its available worker set AW(s), a worker alliance for task s, denoted WC(s), is a subset of AW(s) such that all workers in WC(s) together have sufficient time to complete task s before it expires, i.e., Σ_{w∈WC(s)} (s.d − (t_now + t(w.l, s.l))) ≥ s.wl.
Taking Fig. 1 as an example, the available worker set of task s1 is {w2, w3, w5, w6}, where every available worker can reach s1.l before s1.d and s1 is within the worker's reachable range. Since all workers in each of the following alliances can collaborate to complete task s1 before s1.d, the worker alliances of task s1 include {w3}, {w2, w3}, and {w2, w3, w6}.
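For illustration only, the following Python sketch mirrors Definitions 3 and 4 above; the Task and Worker structures, the Euclidean distance, and the constant travel speed SPEED are assumptions introduced here and are not prescribed by the invention.

```python
import math
from dataclasses import dataclass

SPEED = 1.0  # assumed constant travel speed (distance units per time unit)

@dataclass(frozen=True)
class Task:
    l: tuple      # location s.l as (x, y)
    p: float      # release time s.p
    e: float      # expected completion time s.e
    d: float      # expiration time s.d
    wl: float     # workload s.wl (in time units)
    maxR: float   # maximum reward s.maxR
    pr: float     # penalty rate s.pr

@dataclass(frozen=True)
class Worker:
    l: tuple      # current location w.l as (x, y)
    r: float      # reachable radius w.r

def dist(a, b):
    """Travel distance d(a, b) between two locations."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def travel_time(a, b):
    """Travel time t(a, b), assuming a constant speed."""
    return dist(a, b) / SPEED

def available_workers(s, workers, t_now):
    """Definition 3: workers that can reach s.l before s.d and have s in range."""
    return [w for w in workers
            if t_now + travel_time(w.l, s.l) < s.d       # condition 1)
            and dist(w.l, s.l) <= w.r]                   # condition 2)

def is_worker_coalition(s, members, t_now):
    """Definition 4: every member is available and the combined residual time,
    i.e., the sum of (s.d - (t_now + t(w.l, s.l))), covers s.wl."""
    if len(available_workers(s, members, t_now)) != len(members):
        return False
    residual = sum(s.d - (t_now + travel_time(w.l, s.l)) for w in members)
    return residual >= s.wl
```

In the running example of Fig. 1, any subset of AW(s1) that passes is_worker_coalition for s1 corresponds to a worker alliance of s1 as defined above.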
As for task Reward calculation, we use a Reward Pricing Model (RPM), which effectively quantifies the time constraints of the task. More specifically, RPM considers a single task s and one of the worker leagues, with the emphasis on task completion time and true rewards (i.e., actual payment of the task by the requester), as shown in fig. 2.
Subject to the task primary time constraints (i.e., release time s.p, expected completion time s.e, and expiration time s.d for the task), penalty rate s.pr, and maximum reward s.maxr, RPM may be expressed as the following equation:
R_WC(s) = s.maxR, if s.t_e ≤ s.e;
R_WC(s) = s.maxR − s.pr · (s.t_e − s.e), if s.e < s.t_e < s.d;        (1)
R_WC(s) = 0, otherwise;

where R_WC(s) denotes the actual reward that the task requester pays to the workers in WC(s), and s.t_e denotes the completion time at which the worker alliance WC(s) completes task s.
To calculate s.t_e, we use s.t_s to denote the start time (i.e., the allocation time) of task s, T_WC(s) to denote the duration of task s (i.e., the elapsed time from allocation to completion), and w.wl(WC(s)) to denote the work contribution (measured in time) of worker w when task s is performed by WC(s). This is easily understood from Fig. 3, which depicts the workload allocation of workers {w2, w3} for task s1 (the workers and the task are taken from the running example in Fig. 1). For any w ∈ WC(s), the duration of the task equals the travel time of worker w plus her work contribution, i.e.:

T_WC(s) = t(w.l, s.l) + w.wl(WC(s)).

Summing the above equation over all workers in the worker alliance WC(s), we obtain:

|WC(s)| · T_WC(s) = Σ_{w∈WC(s)} t(w.l, s.l) + Σ_{w∈WC(s)} w.wl(WC(s)).

Since s.wl = Σ_{w∈WC(s)} w.wl(WC(s)), the above equation can be rewritten as:

T_WC(s) = (s.wl + Σ_{w∈WC(s)} t(w.l, s.l)) / |WC(s)|.

Finally, s.t_e can be calculated as s.t_e = s.t_s + T_WC(s), and the workload of each worker can be calculated as w.wl(WC(s)) = T_WC(s) − t(w.l, s.l). From the perspective of a worker alliance, we consider that the goal of a worker joining an alliance is to increase the overall reward of the alliance, thereby providing satisfactory monetary and reputation rewards for that worker.
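To make the above derivation concrete, the sketch below (reusing the hypothetical Task/Worker structures and travel helpers from the previous sketch) computes T_WC(s), s.t_e, each member's contribution w.wl(WC(s)), and the reward; the linear-penalty form of the reward curve is an assumption consistent with the description of formula (1) and Fig. 2.

```python
def coalition_reward(s, coalition, t_s):
    """Duration, completion time, per-worker contributions and RPM reward
    for a worker coalition WC(s) that starts task s at time t_s."""
    n = len(coalition)
    total_travel = sum(travel_time(w.l, s.l) for w in coalition)
    # Summing t(w.l, s.l) + w.wl(WC(s)) = T_WC(s) over all members gives
    # |WC(s)| * T_WC(s) = s.wl + total travel time, hence:
    T = (s.wl + total_travel) / n                      # duration T_WC(s)
    t_e = t_s + T                                      # completion time s.t_e
    contrib = {w: T - travel_time(w.l, s.l) for w in coalition}  # w.wl(WC(s))
    # A member with a non-positive contribution travels at least as long as the
    # task lasts and, per the remark below, should be removed from WC(s).
    # Reward Pricing Model (assumed form): full reward up to s.e, then a linear
    # penalty at rate s.pr until s.d, and no reward once the task has expired.
    if t_e <= s.e:
        reward = s.maxR
    elif t_e < s.d:
        reward = max(0.0, s.maxR - s.pr * (t_e - s.e))
    else:
        reward = 0.0
    return T, t_e, contrib, reward
```

The returned contrib dictionary realizes the relation w.wl(WC(s)) = T_WC(s) − t(w.l, s.l) used above.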
Notably, because we require w.wl(WC(s)) > 0, if the travel time of a worker is no less than the duration of the task, i.e., t(w.l, s.l) ≥ T_WC(s), then worker w does not contribute to task s and should be removed from WC(s). Additionally, as shown by the RPM in Fig. 2, when a worker alliance WC(s) can collaborate to complete a task s before its expected completion time s.e, the alliance obtains the maximum reward (s.maxR) for this task, and adding more workers to WC(s) cannot yield a higher reward. In other words, more workers in WC(s) does not mean an earlier completion time or a higher total reward. This observation motivates the concept of a minimum worker alliance.
Definition 5 (minimum worker alliance): A worker alliance WC(s) for task s is minimal, denoted MWC(s), if no proper subset of WC(s) can obtain a reward equal to R_WC(s).
Although {w3}, {w2, w3}, and {w2, w3, w6} are all worker alliances of task s1, {w2, w3, w6} is not a minimum worker alliance, because its proper subset {w2, w3} can generate the same reward as {w2, w3, w6} (i.e., R_{w2,w3} = R_{w2,w3,w6}). For task s4, {w7} and {w1, w7} are both worker alliances, but only {w7} is its minimum worker alliance, because when performing task s4 alone, worker w7 can obtain the reward of task s4 without the cooperation of others.
Definition 6 (spatial task allocation): Given a worker set W and a task set S, a spatial task allocation A consists of a set of <task, MWC> pairs of the form <s1, MWC(s1)>, <s2, MWC(s2)>, ..., <s_|S|, MWC(s_|S|)>, where the MWC(s_i) are pairwise disjoint, since each worker can be assigned at most one task at a time. We use R_A to denote the total reward of task allocation A, i.e., R_A = Σ_{s∈S} R_MWC(s), where R_MWC(s) can be calculated by formula (1), and we use 𝔸 to denote the set of all possible allocations. The problem studied in the present invention can be formally expressed as follows.

Problem definition: Given a set of workers W and a set of tasks S at a time instance, the alliance-based task assignment (CTA) problem aims to find the globally optimal allocation A* such that the total reward R_{A*} is maximized, i.e., A* = argmax_{A∈𝔸} R_A.
Introduction 1: the CTA problem is the NP-hard problem.
Proof 1: the lemma can be justified by the convention to the 0-1 knapsack problem. The 0-1 backpack problem can be described as follows: given a C set containing n items, where each item CiC are all marked with a weight miSum value vi. The 0-1 knapsack problem is to determine a subset C' of C such that
Figure BDA00023712900800001112
In that
Figure BDA00023712900800001113
Maximization under action, where M represents the maximum weight capacity.
For a given 0-1 knapsack problem, it can be converted to an example of a CTA problem as follows. We present a spatial set S of tasks comprising n tasks, where each task SiAnd a release time siP is 0, expected completion time si E 1, time to failure si D 1, workload si.wl=miAnd maximum prize si.maxR=viAnd (4) associating. In addition, we present a worker set W containing M workers, each worker WiOnly one workload is allowed to complete. All workers and tasks are located in the same location. In this case, to obtain each task siIs awarded viThe system needs to allocate miTask for individual worker si
Given this mapping, we can demonstrate that the converted CTA problem can only be solved if and only if the 0-1 knapsack problem can be solved. Since the 0-1 backpack problem is known to be the NP-hard problem, the CTA problem is also the NP-hard problem.
II. Contract-based greedy method
A straightforward way to maintain the stability of worker alliances is to enforce contracts, under which workers cannot leave the alliance they have formed at will. In this section, we therefore design a contract-based greedy algorithm that encourages worker alliances to obtain a higher total reward and manages worker behavior through contracts. The algorithm is based on the following observations: because the reward does not increase over time (as shown by the Reward Pricing Model (RPM) in Fig. 2, the maximum reward provided by the task requester first remains stable and then gradually decreases), nearby workers who choose to perform a task can generate a higher reward; moreover, nearby workers can handle a larger amount of work and thereby obtain more reward. In addition, in order to eliminate the instability of worker alliances (i.e., the departure of some workers from an alliance that has already been formed may cause the corresponding task to remain incomplete), a mandatory contract among workers is designed. The basic principle of the contract is that once a worker alliance has been established to perform a task, alliance members who leave the alliance will be penalized by a fine equal to the reward due for the task.
Algorithm 1 outlines the main flow of the contract-based greedy method. It takes a worker set W and a task set S as input, and first computes the available worker set AW(s) for each task s. After initialization, for each task s ∈ S, the algorithm generates a minimum worker alliance MWC(s) by selecting nearby workers that can contribute a higher total reward for task s, and assigns that alliance MWC(s) to s. Further, if a worker w ∈ MWC(s) leaves MWC(s) and stops performing task s, the algorithm invokes the penalty term of the contract, and worker w must pay a fine equal to the total reward R_MWC(s) (the penalty is calculated based on the alliance MWC(s)). In this case the assignment of task s fails and s will be reassigned (line 22). Eventually, Algorithm 1 obtains a suitable task assignment result.
Algorithm 1 is shown in figure 11 of the specification.
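A compact sketch of the contract-based greedy flow of Algorithm 1 (Fig. 11) follows, again reusing the helpers from the earlier sketches. The earliest-deadline task ordering, the nearest-first member selection, and the simple contract bookkeeping are illustrative assumptions and do not reproduce the exact pseudocode of Fig. 11.

```python
def contract_greedy_assign(tasks, workers, t_now):
    """For each task, greedily grow a coalition of nearby available workers until
    it can finish before s.d, then bind each member with a penalty contract whose
    fine equals the coalition's reward R_MWC(s)."""
    assignment = {}                      # task -> list of assigned workers
    contracts = {}                       # worker index -> (task, penalty)
    free = list(range(len(workers)))
    for s in sorted(tasks, key=lambda t: t.d):           # assumed: earliest deadline first
        cand = [i for i in free
                if t_now + travel_time(workers[i].l, s.l) < s.d
                and dist(workers[i].l, s.l) <= workers[i].r]
        cand.sort(key=lambda i: travel_time(workers[i].l, s.l))   # nearby workers first
        coalition = []
        for i in cand:
            coalition.append(i)
            members = [workers[j] for j in coalition]
            residual = sum(s.d - (t_now + travel_time(w.l, s.l)) for w in members)
            if residual < s.wl:
                continue                                 # not yet enough combined time
            T, t_e, contrib, reward = coalition_reward(s, members, t_now)
            if reward > 0 and all(c > 0 for c in contrib.values()):
                assignment[s] = members
                for j in coalition:
                    free.remove(j)
                    contracts[j] = (s, reward)           # leaving costs the full reward
            break                                        # stop growing this coalition
    return assignment, contracts

def leave_coalition(i, contracts, pending_tasks):
    """Contract enforcement: a member who leaves pays the agreed fine and the
    affected task is returned to the pool for reassignment (cf. line 22)."""
    s, fine = contracts.pop(i)
    pending_tasks.append(s)
    return fine
```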
As illustrated in Fig. 1, the contract-based greedy algorithm may obtain the task assignment {<s1, {w2, w3, w5}>, <s3, {w1, w4}>, <s4, {w7}>}, whose total reward is 8.53.
III. Equilibrium-based method
Although the contract-based algorithm can alleviate the instability problem of alliances to some extent, it reduces workers' willingness to join alliances in the first place. The essence of the CTA problem is that each worker needs to select a task to perform by interacting with other workers during task assignment, which means that one worker's task selection depends on the decisions made by other workers. Such interdependent decisions can be modeled by game theory, with workers regarded as independent players participating in a game. On this basis, the CTA problem can be viewed as a multiplayer game.
More specifically, our problem can be modeled as a potential game, which has at least one Nash equilibrium in pure strategies (i.e., a pure Nash equilibrium). We then use a best response algorithm, one of the most basic algorithms for exact potential games, because it can effectively resolve conflicts among players. In the best response dynamics, the participants sequentially and asynchronously update their own strategies in a myopic manner, each playing a best response (in terms of its utility function) to the other participants' current strategies, and a pure Nash equilibrium is finally reached. A Nash equilibrium represents a state of the game in which, while the other workers remain in their assigned alliances, no single worker can increase its utility by unilaterally switching from its assigned alliance to another alliance; this suggests that, when workers are free to choose, they will voluntarily choose their assigned tasks. In this case, the formed worker alliances can be considered stable. However, since there may be multiple equilibrium points, the Nash equilibrium reached by the best response algorithm may not be optimal. To solve this problem, we introduce a Simulated Annealing (SA) strategy into the best response dynamics, which finds a better Nash equilibrium corresponding to a near-optimal task allocation. With the help of the SA strategy, the update process has a better chance of achieving a near-optimal task allocation. Finally, we analyze the feasibility of our algorithms.
(I) Game modeling and Nash equilibrium
We first formulate the CTA problem as a strategic game of n players, which can be expressed as a triple G = <W, ST, U> consisting of the players, the strategy space, and the utility functions. The detailed description is as follows:
W = {w1, ..., wn} (n ≥ 2) denotes the finite set of workers acting as the players of the game. In the following, we use players and workers interchangeably when the context is clear.

ST = ST_1 × ST_2 × ... × ST_n is the set of strategies of all players (i.e., the strategy space of the game). ST_i is the strategy set of w_i, which contains w_i's reachable task set and the null task (meaning that w_i selects no task to perform); it can be denoted as ST_i = {w_i.RS, null}, where w_i.RS denotes the reachable task set of worker w_i and null denotes the empty task.

U = {U_1, ..., U_n} denotes the utility functions of all players, where U_i(st_i, st_-i) is the utility function of player w_i. For each joint strategy st = (st_1, ..., st_n), the utility of player w_i can be calculated as follows:

U_i(st_i = s, st_-i) = R_{MWC(s)∪{w_i}} − R_{MWC(s)}

where R_{MWC(s)∪{w_i}} is the total reward of the alliance MWC(s) ∪ {w_i}, MWC(s) ∪ {w_i} is the new worker alliance consisting of MWC(s) and w_i, MWC(s_0) − {w_i} denotes the worker alliance obtained by deleting {w_i} from MWC(s_0), and MWC(s) and MWC(s_0) denote, respectively, the worker alliance that w_i is willing to join and the worker alliance that w_i is currently in; for the current alliance MWC(s_0), the utility of w_i is computed analogously as R_{MWC(s_0)} − R_{MWC(s_0)−{w_i}}. When the context of U_i(st_i, st_-i) is clear, we use U_i to denote U_i(st_i, st_-i).
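The sketch below illustrates one possible reading of the utility function above, namely the marginal reward a worker adds to (or would remove from) the alliance of its chosen task; this reading, and the reuse of coalition_reward from the earlier sketch, are assumptions made for illustration only.

```python
def utility(w, s, assignment, t_s):
    """Utility of worker w choosing task s (None = null strategy), computed as
    the marginal reward w contributes to the coalition of s, an assumed reading
    of the utility function described above."""
    if s is None:
        return 0.0                                     # the null strategy yields no utility
    coalition = assignment.get(s, [])
    if w in coalition:
        with_w, without_w = coalition, [x for x in coalition if x is not w]
    else:
        with_w, without_w = coalition + [w], coalition
    r_with = coalition_reward(s, with_w, t_s)[3]
    r_without = coalition_reward(s, without_w, t_s)[3] if without_w else 0.0
    return r_with - r_without
```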
In a strategic game, a strategy profile π* = (π*_1, ..., π*_n) (where π*_i is a probability distribution over ST_i) is called a Nash equilibrium in mixed strategies (i.e., a mixed Nash equilibrium) if and only if, for every w_i ∈ W, the following condition holds:

U_i(π*_i, π*_-i) ≥ U_i(π_i, π*_-i) for all π_i ∈ Σ_i,

where Σ_i denotes the strategy space of player w_i and, for any given strategy profile π = (π_1, ..., π_n), π_-i denotes the strategies of all players other than w_i. A Nash equilibrium is a pure Nash equilibrium (i.e., a Nash equilibrium in pure strategies) only when every player plays a deterministic strategy, which means that worker w_i selects one strategy from ST_i with probability 1 and every other strategy in ST_i with probability 0.
As demonstrated by John F. Nash, every game with a finite number of players and finite strategy sets has a mixed Nash equilibrium, which merely means a stable probability distribution over strategy profiles rather than a deterministic play of a particular joint strategy profile. This uncertainty is unacceptable in our task assignment scenario, where each worker needs an explicit strategy (i.e., choosing a task to perform or doing nothing). Therefore, we next show that our CTA game has a pure Nash equilibrium, in which each player can deterministically choose a strategy.
Given a joint strategy st = (st_1, ..., st_n), st_i (i.e., a task s ∈ w_i.RS or null) denotes the strategy of w_i (0 < i ≤ n). For player w_i, a joint strategy st can be expressed as st = (st_i, st_-i), where st_-i denotes the joint strategy of all other players.
Lemma 2: The CTA game has a pure Nash equilibrium.
Proof 2: to demonstrate lemma 2, there is a need to demonstrate that the CTA game is an Exact potential field game (EPG) in which the motivation of all players can be mapped to the global potential function. For EPGs, the best response framework can always converge to pure nash equalization for countable strategies.
Next, we introduce the definition of EPG and illustrate that the CTA game is EPG.
Definition 7 (exact potential game): A strategic game G = <W, ST, U> is an exact potential game (EPG) if and only if there exists a potential function Φ: ST → ℝ such that, for all st_-i ∈ ST_-i and for any worker w_i ∈ W,

U_i(st_i', st_-i) − U_i(st_i, st_-i) = Φ(st_i', st_-i) − Φ(st_i, st_-i),

where st_i' and st_i are strategies selectable by worker w_i, st_-i is the joint strategy of the workers other than w_i, and Φ is called the potential function of the game G.
Lemma 3: The CTA game is an exact potential game (EPG).
Proof 3: we define the potential function as
Figure BDA0002371290080000167
It represents the total reward for all tasks in S. It can then be calculated as follows:
Figure BDA0002371290080000168
wherein in the policy sti' and stiIs s respectivelykAnd sgThe strategic game of the CTA problem is a strict force field game according to definition 7, for which reason the CTA game has nash equilibrium in the pure strategy.
We use BR_i(st_-i) to denote the best response strategy of player w_i to the joint strategy st_-i of the other players, i.e., the strategy that maximizes the utility U_i(st_i, st_-i) for a given st_-i. With the joint strategy st* = (BR_1(st*_-1), ..., BR_n(st*_-n)), a pure Nash equilibrium can be achieved, such that no player can gain any utility by unilaterally changing its strategy.
(II) Best response method
Because the CTA game has a pure Nash equilibrium, we solve the CTA problem using a best response approach, which forms a set of stable worker alliances to perform the tasks by reaching a pure Nash equilibrium. Specifically, in the designed best response algorithm, players adjust their own strategies in turn according to the latest strategies of the others, finally reaching a Nash equilibrium that corresponds to a locally optimal task allocation. Algorithm 2 describes the overall framework of the best response method.
Given a set of workers W and a set of tasks S to be assigned, the task allocation A is initialized to ∅ (line 1). The algorithm first selects one available worker for each task, obtains the corresponding strategy of each worker (i.e., one reachable task or nothing), and updates the task allocation A accordingly. The algorithm then iteratively adjusts each worker's strategy to its best response strategy, which maximizes the reward growth of the alliance given the current joint strategy of the others (as shown in equation (9)), until a Nash equilibrium is found (i.e., no one changes his strategy). In each iteration, only one worker is allowed to select his best response strategy, and the game is played sequentially.
Algorithm 2 is shown in figure 12 of the specification.
Specifically, for each worker w_i ∈ W, we first find the best response task s* with the largest reward increase, which can be calculated by equation (9):

s* = argmax_{s ∈ w_i.RS ∪ {null}} U_i(st_i = s, st_-i)        (9)
For the current task allocation, if worker w_i has no best response task, w_i does not change its strategy. For a worker who obtains a best response task, we examine her current strategy as follows:

If her current strategy is to do nothing, i.e., w_i.st = null, we assign the best response task to her (i.e., w_i.st = s*) and update the minimum worker alliance of task s*.

If her current strategy contains a task (denoted w_i.st), which means that worker w_i belongs to the alliance MWC(w_i.st) of task w_i.st, then the alliance MWC(w_i.st) is updated depending on whether its remaining members can still complete task w_i.st on time. Then, w_i's strategy and the corresponding worker alliances are updated.

Finally, we update the task allocation A according to the obtained Nash equilibrium.
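A condensed sketch of the best response dynamics of Algorithm 2 (Fig. 12) is given below; the round-robin player order, the initialization to null strategies, and the reuse of the helpers from the earlier sketches are illustrative simplifications of the pseudocode in Fig. 12.

```python
def best_response_dynamics(workers, tasks, t_now, max_rounds=1000):
    """Players update strategies one at a time to a best response until no
    worker changes its strategy, i.e., a pure Nash equilibrium is reached."""
    strategy = {i: None for i in range(len(workers))}      # start from null strategies
    assignment = {s: [] for s in tasks}                    # task -> current coalition
    for _ in range(max_rounds):
        changed = False
        for i, w in enumerate(workers):                    # sequential, asynchronous updates
            reachable = [s for s in tasks
                         if dist(w.l, s.l) <= w.r
                         and t_now + travel_time(w.l, s.l) < s.d]
            best_s, best_u = None, 0.0                     # the null strategy has utility 0
            for s in reachable:
                u = utility(w, s, assignment, t_now)
                if u > best_u:                             # cf. equation (9)
                    best_s, best_u = s, u
            if best_s is not strategy[i]:
                if strategy[i] is not None:
                    assignment[strategy[i]].remove(w)      # leave the old coalition
                if best_s is not None:
                    assignment[best_s].append(w)           # join the new coalition
                strategy[i] = best_s
                changed = True
        if not changed:                                    # Nash equilibrium: nobody moves
            break
    return strategy, assignment
```

At termination, no worker can increase its utility by unilaterally switching tasks, which matches the stability condition of worker alliances described above.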
(III) optimization strategy based on simulated annealing
Although the pure Nash equilibrium computed by the best response algorithm can produce acceptable task allocation results with stable worker alliances, it is a locally optimal solution to the CTA problem and is not necessarily unique. When there are multiple pure Nash equilibria, we would like to obtain better results than those produced by the best response algorithm alone. As its name implies, Simulated Annealing (SA) originates from the annealing process of metals and is a stochastic optimization procedure. Inspired by David E. Kaufman et al., who used SA to solve discrete optimization problems, we use SA to find a task allocation closer to the global optimum.
In particular, when each worker sequentially updates her strategy st_i according to a given st_-i to maximize the utility function U_i(st_i, st_-i), the workers may reach a Nash equilibrium, i.e., a steady state. Considering that the search space is discrete (i.e., the strategy set ST is discrete), a simulated annealing method can be applied to update each worker's strategy to achieve a better local optimum. SA can be regarded as an effective probabilistic solution over the discrete-time non-homogeneous Markov chain X(k) = (st_1, ..., st_n). In the present invention, the state X(k) = (st_1, ..., st_n) is the combination of the workers' strategies in the k-th iteration of Algorithm 2. For worker w_i, the strategy st_i can either keep the current task s_0 or change to another reachable task (i.e., a task in w_i.RS − {s_0}). For the thermal simulation (randomness), we consider that worker w_i can randomly change its current strategy to any of the other (|w_i.RS| − 1) reachable tasks with equal probability 1/(|w_i.RS| − 1), where st_i' = s (s ∈ w_i.RS − {s_0}). Each worker may then update its own strategy according to the following rules.
If U_i(st_i', st_-i) ≥ U_i(st_i, st_-i), then st_i ← st_i';

If U_i(st_i', st_-i) < U_i(st_i, st_-i), then st_i ← st_i' with the following probability:

exp(−(U_i(st_i, st_-i) − U_i(st_i', st_-i)) / T(k))        (10)

where T(k) > 0 represents the temperature in the k-th iteration, which gradually decreases throughout the update process;

Otherwise, st_i remains unchanged.
by following the above rules, we can update lines 15-28 in Algorithm 2, and we omit the detailed pseudo code due to space limitations. Formally, the transition probability can be calculated using the following formula:
Figure BDA0002371290080000198
function(s)
Figure BDA0002371290080000199
Non-incremental, called cooling schedule, in which
Figure BDA00023712900800001910
Is a set of positive integers. As can be seen from equation (11), when t (k) is large, the policy selection is almost random, and when t (k) is close to 0, a better policy with greater utility is more likely to be selected. While allowing less effective task selection may result in a reduction in overall effectiveness, this "irregular" strategy selection is more likely to result in better nash equilibrium (i.e., better task allocation), which is verified experimentally.
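A minimal sketch of this simulated-annealing update rule is given below; the specific cooling schedule beta / log(k + 2) and the reuse of the utility() helper are assumptions made for illustration, not the schedule fixed by the invention.

```python
import math
import random

def sa_update(w, i, strategy, assignment, tasks, t_now, k, beta=1.0):
    """One simulated-annealing update for worker w_i: propose a uniformly random
    alternative reachable task; accept it if it is not worse, otherwise accept it
    with probability exp(-(U_current - U_alternative) / T(k))."""
    current = strategy[i]
    alternatives = [s for s in tasks
                    if dist(w.l, s.l) <= w.r
                    and t_now + travel_time(w.l, s.l) < s.d
                    and s is not current]
    if not alternatives:
        return current                                    # nothing to switch to
    alt = random.choice(alternatives)                     # uniform over the other options
    u_cur = utility(w, current, assignment, t_now)
    u_alt = utility(w, alt, assignment, t_now)
    temperature = beta / math.log(k + 2)                  # assumed cooling schedule T(k)
    if u_alt >= u_cur or random.random() < math.exp((u_alt - u_cur) / temperature):
        return alt                                        # adopt the proposed strategy
    return current                                        # keep the current strategy
```

Replacing the pure best-response update in the loop of the previous sketch with sa_update, and letting T(k) shrink as k grows, would approximate the BR + SA variant evaluated in the experiments.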
(IV) Convergence analysis
In the field of game theory, the problem of convergence to nash equilibrium has attracted a wide range of attention. We then demonstrate that our solution converges to a pure nash equilibrium point, where no worker is encouraged to deviate unilaterally.
Lemma 4: The best response algorithm converges to a pure Nash equilibrium.
Proof 4: as shown in equation (8), the utility of all workers is mapped to the potential function (i.e.,. phi.), indicating that each worker individually adjusts his strategy to cause the same amount of change in utility and potential functions. For a potential game, each worker would update its strategy for maximum utility in turn through a best response algorithm, and the potential function would correspondingly reach a local maximum, where the best response is dynamically equivalent to a local search of the potential function of the potential game. Noam Nisan et al have demonstrated that in any limited potential game, the sequential updates with the best response dynamics always converge to Nash equilibrium.
In terms of convergence time, Fabrikant et al. showed that when the best response of each player can be found in polynomial time, determining a pure Nash equilibrium in a potential game belongs to the class of Polynomial Local Search (PLS) problems. In our CTA game, each worker w_i has only |w_i.RS| strategies associated with his reachable task set w_i.RS, and |w_i.RS| is not large in practical applications due to the spatiotemporal constraints of workers and tasks. Within polynomial time, each worker can choose one task from his reachable tasks to maximize his utility. Therefore, convergence to a Nash equilibrium is rather fast.
Lemma 5: After adjusting the cooling schedule, the best response algorithm with simulated annealing optimization converges to a pure Nash equilibrium.
Proof 5: by integrating the simulated annealing strategy into the optimal response algorithm, randomness is added to the update process of the worker strategy (i.e., worker w)iBy using others with probability (| w)iRS | -1) reachable tasks to randomly alter their current policy). Specifically, this process "warms up" before "cool down" to help the potential function avoid local optimality (obtained by the best response algorithm) and converge to another better nash equalization where the cooling schedule should be adjusted so that the process will eventually "freeze" (i.e., converge). Liu Yanqing et al have demonstrated that when a cooling schedule is set
Figure BDA0002371290080000201
When (β ≧ d*Is a normal number) can ensure the convergence of the simulated annealing strategy. If it is not
Figure BDA0002371290080000202
To
Figure BDA0002371290080000203
There is a path, then d*Representing the maximum depth of a path from any of the federation policies
Figure BDA0002371290080000211
Begin and end at the final federated policy
Figure BDA0002371290080000212
The convergence speed of the best response algorithm with simulated annealing must be slower than that of the best response algorithm. In experiments, we provide experimental evidence of this statement and demonstrate that pure nash equilibrium can be computed efficiently in a limited time, as described in lemma 4 and lemma 5.
IV. Experiments
In this section, we evaluate our approach on real and synthetic datasets. All experiments were performed on a machine with an Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz and 128GB RAM.
(I) Experimental setup
The experiments are performed on two datasets, the gMission dataset (denoted GM) and a synthetic dataset (denoted SYN). Specifically, gMission is an open-source crowdsourcing platform in which each task is associated with its location, deadline, and reward, and each worker is associated with its location and reachable radius. Since the gMission data do not contain the generation time, workload, expected completion time, and penalty rate of tasks, we generate these attributes according to a uniform distribution. For the synthetic dataset, based on observations of the real dataset, we generate the location, generation time, expected completion time, deadline, workload, and penalty rate of each task according to a uniform distribution, and generate its maximum reward according to a normal distribution. The workers' locations and reachable radii are generated from uniform distributions.
We studied and compared the performance of the following algorithms:
and OTA: an Optimal Task Assignment (Optimal Task Assignment) algorithm based on a tree decomposition strategy, the OTA finds the minimum worker alliance of each Task by using a dynamic programming strategy, and then obtains the Optimal Task Assignment with the maximum reward by using the tree decomposition-based algorithm.
CGTA: our Contract-based Greedy task assignment (CGTA) algorithm.
BR: the Best Response (BR) method we propose.
BR + SA: our optimal response method with simulated annealing optimization, where the cooling schedule is set to
Figure BDA0002371290080000221
And k represents the kth iteration of the algorithm.
We compare the above algorithms using three metrics for finding the final task assignment: total reward, CPU time, and update number, which denotes the number of strategy updates performed by the workers. Fig. 13 shows our experimental settings, where the default value of each parameter is underlined.
(II) Experimental results
a) Effect of |S|. To investigate the scalability of all algorithms, we randomly generated 5 datasets containing 1000 to 5000 (100 to 500) tasks from the synthetic dataset (gMission dataset). As shown in Figs. 4(a) and 5(a), the total reward of all methods shows a similar growth trend as |S| increases. Obviously, OTA obtains the highest total reward on both the synthetic and gMission datasets, followed by BR + SA, BR, and CGTA. The reward of BR + SA is always higher than that of BR, which indicates the superiority of the simulated annealing optimization strategy. In Figs. 4(b), 4(c) and 5(b), 5(c), although the CPU time of all methods increases with |S|, our proposed algorithms (BR + SA, BR, and CGTA) perform significantly better than OTA, whose efficiency degrades much faster. As expected, CGTA is the fastest algorithm, although it produces a smaller reward than the other algorithms, while BR + SA achieves an excellent trade-off between efficiency and accuracy. As for the update numbers in Figs. 4(d) and 5(d), a worker updates her current strategy more frequently in BR + SA than in BR, because BR + SA allows the worker to randomly change her current strategy, whereas a worker in BR updates her current strategy only when a best response strategy exists. We also note that as the number of tasks |S| increases, the update numbers of both BR + SA and BR increase; the reason is straightforward: the more tasks to be allocated, the more reachable tasks (i.e., available strategies) each worker has, and thus the more opportunities he has to update his strategy.
b) Effect of |W|. Next we investigate the effect of |W|, i.e., the number of workers to be allocated. As shown in Figs. 6(a), 6(b), 7(a), and 7(b), the total rewards earned by BR + SA and BR are higher than that of CGTA, at the cost of some efficiency; however, the computational efficiency of BR + SA and BR remains acceptable. Although the total reward of OTA is certainly the highest, OTA is also the most time-consuming, as can be seen in Figs. 6(c) and 7(c). More precisely, BR + SA can achieve 98.6% of the maximum reward, while its CPU time is almost negligible compared to that of OTA. From Figs. 6(d) and 7(d), we can see that the update number shows an upward trend as |W| grows, because more workers need to change their strategies when the number of workers to be allocated becomes larger. Notably, the number of worker strategy updates is much lower than the number of workers to be allocated. This can be explained by two reasons. The first is that some workers stick to their originally assigned tasks (the BR + SA and BR algorithms give each worker an initial strategy, i.e., one of the reachable tasks or the empty task; see Algorithm 2) without any strategy updates. The second is that some workers have no reachable tasks, which means their number of strategy updates is 0. To save space, in the following experiments we show neither the CPU time of OTA (which is large) nor the results on the gMission dataset (which are similar to those on the synthetic dataset).
c) The influence of r. FIG. 8 depicts the effect of the worker's reachable radius r on the performance of the algorithm, with r ranging from 1km to 5 km. As the worker's reachable radius increases, the worker has more tasks reachable and as a result they have more opportunities to select higher rewarding tasks, which also explains the trend in fig. 8(a) for the total reward to rise as r increases. Accordingly, as r changes, the CPU time for all methods increases in FIG. 8(b), as the worker must search for more reachable tasks to find a suitable one. In addition, as can be seen from fig. 8(c), the number of updates of the BR + SA algorithm is always higher than that of the BR algorithm regardless of r.
d) Effect of e. Next we investigate how the expected completion time of tasks affects the efficiency and effectiveness of all methods. Clearly, as shown in Fig. 9(a), the total reward of all methods increases progressively as e increases, since a larger e means that more tasks can reach their maximum reward. OTA still obtains the highest reward, and BR + SA performs better than BR and CGTA. However, it is worth noting that all methods tend to remain stable when e > 8h, possibly because most tasks can be completed within 8 hours and thus reach their maximum reward. As for CPU time, the trend for all approaches is to ramp up and then stay stable: initially, as e increases, each worker tends to have more reachable tasks and more CPU time is required to search them; then, as the expected completion time continues to grow, the number of reachable tasks per worker stays stable due to spatiotemporal constraints (e.g., the worker's reachable radius), so the CPU time for finding the task assignment also stays stable. In Fig. 9(c), the update numbers of BR + SA and BR decrease as e increases, since with a larger e workers are more likely to select satisfactory tasks and thus avoid strategy updates.
e) Effect of d − e. In the last set of experiments, we investigate the effect of d − e. Not surprisingly, as shown in Figs. 10(a) and 10(b), all methods obtain higher rewards and consume more CPU time when the deadlines are more relaxed. As expected, a larger d − e means more reachable tasks per worker on average, which can result in higher total rewards and more CPU time. Another observation is that the gap in total reward between the BR-related methods and the CGTA algorithm grows wider, because the total reward is more sensitive to the average number of available workers per task when the BR-related algorithms are applied; in this case, the benefit of the BR-related approaches becomes more significant. As shown in Fig. 10(c), the update numbers of BR and BR + SA both tend to increase.
The alliance-based spatial crowdsourcing task allocation method has the following beneficial effects:
a novel form of task allocation, namely alliance-based task allocation, is introduced into the process of spatial crowdsourcing task allocation, in which workers build worker unions and interact with one another to perform the corresponding tasks. Given a set of workers and a set of tasks, a stable worker union is determined for each task so as to achieve the highest total reward of the task allocation. Further, a contract-based greedy task allocation method is adopted to allocate tasks effectively, in which a penalty contract is established and enforced so that a penalty is imposed on workers who leave the alliance. The alliance-based task allocation problem is then converted into a multi-player game and an equilibrium-based solution is provided, in which a Nash equilibrium is found on the basis of the best-response framework; meanwhile, during the worker strategy updates, a simulated annealing strategy is further introduced to improve the effectiveness of the task allocation, so that the highest total reward of the task allocation is achieved.
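As a rough illustration of the contract-based greedy allocation with a penalty contract described above, the Python sketch below forms federations in descending order of task reward and records, for every member, a penalty equal to the task reward that would be charged for leaving. The data layout, the required-worker count per task and the reachability predicate are simplifying assumptions made for this sketch; it is not the exact procedure of the specification.

from collections import namedtuple

# Simplified data: locations, deadlines and travel are abstracted into the
# reachable(worker, task) predicate supplied by the caller.
TaskSpec = namedtuple("TaskSpec", ["name", "reward", "required"])

def greedy_federations(tasks, workers, reachable):
    # Visit tasks in descending reward order, give each one a federation of
    # reachable, still-unassigned workers, and record a penalty contract
    # whose penalty for leaving equals the reward of the task.
    assigned = set()
    federations = {}   # task name -> list of worker ids
    contracts = {}     # (worker id, task name) -> penalty for leaving
    for task in sorted(tasks, key=lambda t: t.reward, reverse=True):
        members = [w for w in workers if w not in assigned and reachable(w, task)]
        members = members[:task.required]
        if len(members) < task.required:
            continue   # not enough reachable workers, the task is skipped
        federations[task.name] = members
        for w in members:
            assigned.add(w)
            contracts[(w, task.name)] = task.reward   # penalty = task reward
    return federations, contracts

# Toy usage: two tasks, three workers, everyone can reach everything.
tasks = [TaskSpec("t1", reward=10.0, required=2), TaskSpec("t2", reward=6.0, required=1)]
fed, con = greedy_federations(tasks, ["w1", "w2", "w3"], lambda w, t: True)
print(fed)   # {'t1': ['w1', 'w2'], 't2': ['w3']}
print(con)   # leaving t1's federation costs 10.0, leaving t2's costs 6.0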
The above description is only an embodiment of the present invention and is not intended to limit the scope of the present invention; all equivalent structural or process modifications made by using the content of the present specification, or any direct or indirect application thereof in other related technical fields, are likewise included in the scope of the present invention.

Claims (9)

1. A space crowdsourcing task allocation method based on alliance is characterized by comprising the following steps:
(1) establishing worker unions through which workers interact with one another to perform the corresponding tasks, wherein the workers are limited by constraint conditions;
(2) formulating a contract-based greedy task allocation method to form a group of worker alliances so as to effectively allocate tasks;
(3) formulating an equilibrium-based algorithm, wherein a Nash equilibrium is found based on a best-response framework to form a stable worker union for each task so as to obtain a higher overall return;
(4) because the equilibrium point obtained by the best-response method in step (3) is not unique and is not optimal in terms of total reward, formulating a simulated annealing method to find a better Nash equilibrium, wherein the strategies of the workers are continuously and asynchronously updated by the best-response method until a pure Nash equilibrium is reached, and the best-response and simulated annealing methods converge to one equilibrium point.
2. A federation-based spatial crowd-sourced task allocation method as recited in claim 1, wherein a worker is, in a broad sense, a person who performs a spatial task only in exchange for payment.
3. A federation-based spatial crowd-sourced task allocation method as recited in claim 2, wherein the working modes of a worker include an online mode and an offline mode, the online mode indicating that the worker is ready to accept tasks; once the server allocates a task to a worker, that worker is considered to be in the offline mode until the allocated task is completed.
4. A federation-based spatial crowd-sourced task allocation method as recited in claim 1, wherein the constraint conditions in step (1) include the reachable range of the worker and the deadline of the task.
5. The alliance-based spatial crowdsourcing task allocation method of claim 1, wherein the contract-based greedy task allocation method in step (2) assumes that workers all behave in a selfish manner.
6. A federation-based spatial crowdsourcing task allocation method as claimed in claim 1, wherein the contract-based greedy task allocation method in step (2) specifically establishes and enforces a penalty contract, imposing a penalty on workers who leave the federation.
7. A federation-based spatial crowdsourcing task allocation method as claimed in claim 6, wherein the penalty contract in step (3) is embodied in that, once a worker federation is established to perform a task, a federation member who leaves the federation is penalized, the penalty being equal to the reward of the task.
8. A federation-based spatial crowd-sourced task allocation method as recited in claim 1, wherein the equilibrium-based algorithm in step (3) specifically forms federations for the workers in sequence and updates their strategies in turn, each worker selecting the best-response task that maximizes its utility, until a Nash equilibrium is reached in which no worker can unilaterally switch from its allocated federation to another federation to improve its utility (an illustrative sketch follows the claims).
9. A federation-based spatial crowd-sourced task allocation method as recited in claim 8, wherein the utility is the reward a worker obtains from the federation to which the worker belongs.
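For illustration of the equilibrium-based procedure recited in claims 1, 3, 4 and 8 (the sketch referenced in claim 8), the Python code below performs asynchronous best-response updates with a simulated-annealing perturbation. The utility function, temperature schedule and iteration bound are assumptions made only for this sketch, not part of the claimed method.

import math
import random

def best_response_with_sa(workers, tasks, utility, reachable,
                          temperature=1.0, cooling=0.9, rounds=50):
    # Each worker's strategy is a reachable task or the empty task (None).
    policy = {w: None for w in workers}
    for _ in range(rounds):
        changed = False
        for w in workers:                      # asynchronous, worker by worker
            options = [None] + [t for t in tasks if reachable(w, t)]
            best = max(options, key=lambda t: utility(w, t, policy))
            candidate = best
            # Simulated-annealing step: occasionally accept a worse strategy;
            # the acceptance probability shrinks as the temperature cools.
            other = random.choice(options)
            delta = utility(w, other, policy) - utility(w, best, policy)
            if delta < 0 and random.random() < math.exp(delta / max(temperature, 1e-9)):
                candidate = other
            if candidate != policy[w]:
                policy[w] = candidate
                changed = True
        temperature *= cooling                 # behave more greedily over time
        if not changed:                        # nobody wants to deviate:
            break                              # a pure Nash equilibrium
    return policy

In such a sketch, utility(w, t, policy) would plausibly return the worker's share of task t's reward minus any penalty owed under the contract of claim 6 when the worker abandons its current federation, which is how the greedy allocation and the equilibrium refinement fit together; this pairing is an assumption for illustration.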
CN202010051354.4A 2020-01-17 2020-01-17 Space crowdsourcing task allocation method based on alliance Active CN111291973B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010051354.4A CN111291973B (en) 2020-01-17 2020-01-17 Space crowdsourcing task allocation method based on alliance

Publications (2)

Publication Number Publication Date
CN111291973A true CN111291973A (en) 2020-06-16
CN111291973B CN111291973B (en) 2023-09-29

Family

ID=71018057

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010051354.4A Active CN111291973B (en) 2020-01-17 2020-01-17 Space crowdsourcing task allocation method based on alliance

Country Status (1)

Country Link
CN (1) CN111291973B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106204117A (en) * 2016-06-30 2016-12-07 河南蓝海通信技术有限公司 Mass-rent platform pricing method under multitask environment
CN106570679A (en) * 2016-10-27 2017-04-19 南京邮电大学 Participant capability based excitation method for crowdsourcing task of group
CN107944768A (en) * 2017-12-18 2018-04-20 扬州大学 A kind of method for allocating tasks under overlapping coalition formation based on negotiation
CN108133330A (en) * 2018-01-12 2018-06-08 东北大学 One kind is towards social crowdsourcing method for allocating tasks and its system

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11386299B2 (en) 2018-11-16 2022-07-12 Yandex Europe Ag Method of completing a task
US11727336B2 (en) 2019-04-15 2023-08-15 Yandex Europe Ag Method and system for determining result for task executed in crowd-sourced environment
US11416773B2 (en) 2019-05-27 2022-08-16 Yandex Europe Ag Method and system for determining result for task executed in crowd-sourced environment
US11475387B2 (en) 2019-09-09 2022-10-18 Yandex Europe Ag Method and system for determining productivity rate of user in computer-implemented crowd-sourced environment
US11481650B2 (en) 2019-11-05 2022-10-25 Yandex Europe Ag Method and system for selecting label from plurality of labels for task in crowd-sourced environment
US11727329B2 (en) 2020-02-14 2023-08-15 Yandex Europe Ag Method and system for receiving label for digital task executed within crowd-sourced environment
CN112465391A (en) * 2020-12-11 2021-03-09 福州大学 Distributed intelligent factory supply task allocation method based on game theory
CN112465391B (en) * 2020-12-11 2022-06-14 福州大学 Distributed intelligent factory supply task allocation method based on game theory
CN115694877A (en) * 2022-08-30 2023-02-03 电子科技大学长三角研究院(衢州) Space crowdsourcing task allocation method based on federal preference learning
CN115694877B (en) * 2022-08-30 2023-08-15 电子科技大学长三角研究院(衢州) Space crowdsourcing task allocation method based on federal preference learning

Also Published As

Publication number Publication date
CN111291973B (en) 2023-09-29

Similar Documents

Publication Publication Date Title
CN111291973A (en) Space crowdsourcing task allocation method based on alliance
Zhao et al. Coalition-based task assignment in spatial crowdsourcing
Chen et al. Multitask offloading strategy optimization based on directed acyclic graphs for edge computing
Zhao et al. Fairness-aware task assignment in spatial crowdsourcing: Game-theoretic approaches
Chi et al. Ad-hoc cloudlet based cooperative cloud gaming
He et al. Three-stage stackelberg game enabled clustered federated learning in heterogeneous UAV swarms
Zhu et al. Agent-based dynamic scheduling for earth-observing tasks on multiple airships in emergency
Hao et al. Achieving socially optimal outcomes in multiagent systems with reinforcement social learning
Jin et al. DPDA: A differentially private double auction scheme for mobile crowd sensing
Xu et al. Aoi-guaranteed incentive mechanism for mobile crowdsensing with freshness concerns
Xie et al. Satisfaction-aware task assignment in spatial crowdsourcing
Zhao et al. Coalition-based task assignment with priority-aware fairness in spatial crowdsourcing
Liu et al. Population game based energy and time aware task offloading for large amounts of competing users
Spradling et al. Stability in role based hedonic games
Kattan et al. Multi-agent multi-issue negotiations with incomplete information: a genetic algorithm based on discrete surrogate approach
Wang et al. Aggregation of incentivized learning models in mobile federated learning environments
Chen et al. Delay-aware incentive mechanism for crowdsourcing with vehicles in smart cities
Poulet et al. Auction-based strategies for the open-system patrolling task
Xu et al. BASIC: Distributed task assignment with auction incentive in UAV-enabled crowdsensing system
Li et al. Research on multi-robot cooperative task assignment based on cooperative game
An et al. Characterizing contract-based multiagent resource allocation in networks
Zhu et al. A real-time decentralized algorithm for task scheduling in multi-agent system with continuous damage
Briceno et al. Resource allocation in a client/server system for massive multi-player online games
Li et al. Update schedules for improving consistency in multi-server distributed virtual environments
Spradling Optimizing expected utility and stability in role based hedonic games

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20230831

Address after: A603, Cancer Tower, Software Park, No. 18-14 Zhenze Road, Xinwu District, Wuxi City, Jiangsu Province, 214028

Applicant after: Maikesi (Wuxi) Data Technology Co.,Ltd.

Address before: Room 1105-2, 11 / F, building 1, No. 2000, Majian Road, high tech Zone, Suzhou, Jiangsu 215000

Applicant before: Mycos (Suzhou) data Technology Co.,Ltd.

GR01 Patent grant