CN109377218B - Method, server and mobile terminal for suppressing false sensing attack

Info

Publication number: CN109377218B
Application number: CN201811101427.5A
Authority: CN
Inventors: 刘杨; 张珍杰; 关建峰; 许长桥
Original assignee: Beijing University of Posts and Telecommunications
Current assignee: Beijing University of Posts and Telecommunications
Priority date: 2018-09-20
Filing date: 2018-09-20
Publication date: 2020-10-27
Anticipated expiration: 2038-09-20
Also published as: CN109377218A

Abstract

Embodiments of the invention provide a method, a server and a mobile terminal for suppressing false sensing attacks. The method comprises: acquiring a first perception task and formulating a first payment rule according to the first perception task; sending the first perception task and the first payment rule to a plurality of mobile terminals so that each mobile terminal can choose whether to participate in the first perception task according to the first payment rule; acquiring perception data sent by the mobile terminals participating in the first perception task, and evaluating the perception data with an expectation-maximization (EM) algorithm to obtain the perception accuracy corresponding to each item of perception data; paying, according to the first payment rule, a corresponding reward to each mobile terminal based on the perception accuracy of its perception data; and acquiring a second payment rule, wherein the second payment rule is obtained with a Q-learning algorithm or a DQN algorithm according to the perception accuracy of the first perception task and is used for the next perception task. Embodiments of the invention thereby discourage users from sending false sensing data.

Description

Method, server and mobile terminal for suppressing false sensing attack

Technical Field

The embodiment of the invention relates to the field of crowd sensing, in particular to a method, a server and a mobile terminal for suppressing false sensing attacks.

Background

With the rapid growth of mobile devices such as smartphones, tablet computers, smart watches and smart bracelets, more and more mobile devices are equipped with sensors of various kinds, such as accelerometers, gyroscopes, global positioning systems and thermometers. Mobile crowd sensing (MCS) networks have gradually formed that use these mobile devices as basic sensing units and, in cooperation with the mobile internet, distribute sensing tasks and collect the sensing data uploaded by the devices to complete large-scale sensing tasks. In environment, network and traffic monitoring, the MCS platform or server therefore provides many services by recruiting mobile users to monitor the conditions of their surroundings. With the rapid development of intelligent programmable wireless devices, users can freely control their devices; for example, a user can precisely choose the effort spent on a sensing task by manipulating specific embedded sensors, which in turn affects data quality. Because a smartphone user, as a self-interested individual, may choose the sensing effort that maximizes personal revenue, a crowd sensing system must motivate users to provide accurate sensing reports and suppress attacks that submit falsified sensing data. Otherwise, if users know that sending false sensing data in an MCS task goes unpunished, some smartphone users will be motivated to upload false sensing data as an attack, and the MCS server will receive a large number of low-quality or falsified sensing reports.

To address the above issues, game theory is an important tool for modelling MCS processes; for example, auction-based, price-based or reputation-based mechanisms are used to incentivize users to contribute to MCS tasks. Auction-based MCS schemes, for instance, pay the users with the lowest bids in order to save costs. Note that the utility of the MCS server depends not only on the payments made to users for their service, but also on the users' locations, sensing effort and sensor quality. The MCS server can therefore improve its sensing performance by assessing sensing quality and recruiting only the smartphones that provide accurate reports. The mobile sensing server applies data mining and learning algorithms to detect false sensing reports and thereby suppress the motivation for fraud.

Because of evaluation errors, however, it remains challenging for the server to motivate users to provide accurate reports without knowing the users' sensing models. Therefore, there is a need for a method of suppressing false sensing attacks.

Disclosure of Invention

Embodiments of the invention provide a method, a server and a mobile terminal for suppressing false sensing attacks, aiming to overcome the low accuracy of the sensing reports provided by users in mobile crowd sensing (MCS) in the prior art.

In a first aspect, an embodiment of the present invention provides a method for suppressing a false sensing attack, including:

101. acquiring a first perception task, and formulating a first payment rule according to the first perception task;

102. sending the first perception task and the first payment rule to a plurality of mobile terminals so that the mobile terminals can select whether to participate in the first perception task according to the first payment rule;

103. acquiring perception data sent by a plurality of mobile terminals participating in the first perception task, and evaluating the perception data with an expectation-maximization (EM) algorithm to obtain the perception accuracy corresponding to each item of perception data;

104. according to the first payment rule, paying corresponding payment to each mobile terminal based on the perception accuracy corresponding to each perception data;

105. and acquiring a second payment rule, wherein the second payment rule is obtained based on a Q-learning algorithm or a DQN algorithm according to the sensing accuracy of the first sensing task, and the second payment rule is used for the next sensing task.

In a second aspect, an embodiment of the present invention provides a method for suppressing a false sensing attack, including:

acquiring a perception task and a payment rule;

acquiring a pre-estimated perception task reward, wherein the pre-estimated perception task reward is obtained based on perception quality pre-estimation according to the perception task and the payment rule;

selecting whether to accept the perception task according to the estimated task reward, and if so, sending perception data of the perception task to a server after the perception task is completed;

and receiving the corresponding reward of the perception task, wherein the corresponding reward of the perception task is obtained according to the perception accuracy of the perception data and the payment rule.

In a third aspect, an embodiment of the present invention provides a server for suppressing a false sensing attack, including:

the first processing module is used for acquiring a first perception task and formulating a first payment rule according to the first perception task;

the first sending module is used for sending the first perception task and the first payment rule to a plurality of mobile terminals so that the mobile terminals can select whether to participate in the first perception task or not according to the first payment rule;

the second processing module is used for acquiring perception data sent by a plurality of mobile terminals participating in the first perception task, and evaluating the perception data with an expectation-maximization (EM) algorithm to obtain the perception accuracy corresponding to each item of perception data;

the first payment module is used for paying corresponding payment to each mobile terminal based on the perception accuracy corresponding to each perception data according to the first payment rule;

and the third processing module is used for acquiring a second payment rule, the second payment rule is obtained based on a Q-learning algorithm or a DQN algorithm according to the perception accuracy of the first perception task, and the second payment rule is used for the next perception task.

In a fourth aspect, an embodiment of the present invention provides a mobile terminal for suppressing a false sensing attack, including:

the first acquisition module is used for acquiring the perception task and the payment rule;

the second acquisition module is used for acquiring a pre-estimated sensing task reward, and the pre-estimated sensing task reward is obtained through pre-estimation based on sensing quality according to the sensing task and the payment rule;

the selection module is used for selecting whether to accept the perception task according to the estimated task reward, and if so, sending perception data of the perception task to a server after the perception task is completed;

and the reward receiving module is used for receiving the perception task reward, and the perception task reward is obtained according to the perception accuracy of the perception data and the payment rule.

In a fifth aspect, embodiments of the present invention provide a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method of suppressing false perception attacks as described in the first or second aspect when executing the program.

In a sixth aspect, embodiments of the invention provide a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method of suppressing false sensing attacks as set forth in the first or second aspect.

According to the method and the device for suppressing false sensing attacks provided by the embodiments of the invention, the perception data of each perception task is evaluated through an expectation-maximization (EM) algorithm, and the best payment rule is learned from the perception data by using a Q-learning algorithm or a deep Q-network (DQN) algorithm, so that users are motivated to send the most accurate perception data and are discouraged from sending false sensing data.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.

Fig. 1 is a schematic flow chart of a method for suppressing a false sensing attack according to an embodiment of the present invention;

Fig. 2 is a schematic flow chart of another method for suppressing a false sensing attack according to an embodiment of the present invention;

Fig. 3 is a schematic structural diagram of a server for suppressing false sensing attacks according to an embodiment of the present invention;

Fig. 4 is a schematic structural diagram of a mobile terminal for suppressing a false sensing attack according to an embodiment of the present invention;

Fig. 5 is a schematic structural diagram of a computer device according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The MCS is a wireless network which takes mobile equipment such as a smart phone, a tablet personal computer or wearable equipment of a user as a basic sensing node, issues sensing tasks and collects sensing data through the mobile internet. The game theory is an important means for researching the crowd-sourcing perception network at present, and for example, auction theory, pricing theory and reputation system are applied to the crowd-sourcing perception network to stimulate users to participate in perception tasks. However, some malicious users send false sensing data to the MCS for their own benefit, which reduces the network efficiency on one hand and also degrades the quality of the network sensing report on the other hand, even causing network congestion.

Fig. 1 is a schematic flow diagram of a method for suppressing a false sensing attack according to an embodiment of the present invention, and as shown in fig. 1, an embodiment of the present invention provides a method for suppressing a false sensing attack, including:

step 101, acquiring a first perception task, and formulating a first payment rule according to the first perception task;

The MCS server is responsible for data collection, processing and application, and comprises a plurality of sensing platform servers. The server first classifies the perception tasks. With the development and deepening of crowd sensing network research, perception tasks are currently divided into environment monitoring, infrastructure monitoring, social behavior, social medical information and the like. According to the requirements of the classified task, the server collects perception information from specific users and, so that the sensing effort spent by users is reasonably rewarded, formulates a payment rule according to the perception task. It should be noted that the first perception task and the first payment rule in the embodiment of the present invention are set at the initial stage of establishing the crowd sensing network. The first payment rule is formulated according to the perception task together with historical data in the network, human experience or classified reference data from the internet, whereas the second payment rule described later, and every new payment rule obtained afterwards, is obtained through iterative learning by an algorithm. At the initial stage of the MCS the first payment rule may therefore allocate rewards unreasonably or with low accuracy, and the payment rule gradually tends toward the optimum through subsequent iterative learning.

step 102, sending the first perception task and the first payment rule to a plurality of mobile terminals so that the mobile terminals can select whether to participate in the first perception task according to the first payment rule;

First, the MCS server broadcasts recruitment information comprising the first perception task and the first payment rule. The first payment rule carries the reward information for completing the first perception task, which encourages users to participate actively. After receiving the recruitment information through the mobile terminal, the user decides whether to participate in the perception task by weighing the resources to be spent, such as sensing time, battery consumption or CPU usage, against whether the reward obtained for spending them meets expectations.

step 103, acquiring perception data sent by a plurality of mobile terminals participating in the first perception task, and evaluating the perception data with an expectation-maximization (EM) algorithm to obtain the perception accuracy corresponding to each item of perception data;

In the MCS, any user carrying a mobile terminal can accept a perception task, so the accuracy of the perception data uploaded by users cannot be guaranteed. In particular, when a user uploads inaccurate data through improper operation of the mobile terminal, or a malicious user launches a false sensing attack, the reliability of the perception data needs to be evaluated so that the accuracy of the whole MCS can be improved. In the embodiment of the invention, the accuracy of the perception data submitted by a user is unknown and must be evaluated. Because of the hidden variable, namely the accurate value (interval) of each perception task, maximum likelihood estimation cannot be applied directly, so an EM algorithm is adopted.

In the embodiment of the present invention, an effort matrix e^k is set for each user a_k. Here e^k is an m × m matrix whose element in row i and column j (i = 1, 2, …, m; j = 1, 2, …, m) represents the case in which the perception data submitted by user a_k lies in interval d_j while the accurate perception data lies in interval d_i. In particular, the matrix contains the m possible cases in which the perception data submitted by the user is correct, and the values in the matrix satisfy the following equation:

Then, the set of probabilities with which the correct value of each perception task falls in each possible interval is defined as

P is initialized according to the collected data set S. Step E of the EM algorithm is then executed, that is, the values of the effort matrix e are estimated; step M of the EM algorithm is then executed, in which the correct-interval probabilities P of the task are estimated in turn from the obtained matrix e. Steps E and M are repeated until convergence, where t denotes the t-th perception task.
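The alternating estimation described above can be sketched in Python as follows. This is a minimal illustration rather than the implementation of the embodiment: it assumes that each row of the effort matrix is normalized to sum to 1, that every user reports on every task, and that all intervals are equally likely a priori. The names `em_accuracy` and `user_accuracy`, the smoothing constant and the fixed iteration count are hypothetical choices made for the example.

```python
def em_accuracy(reports, m, iters=50, smooth=1e-2):
    """reports[k][t] is the interval index (0..m-1) that user k reported for task t.

    Returns (e, P): e[k][i][j] estimates how often user k reports interval j
    when the accurate interval is i; P[t][i] estimates the probability that
    the accurate interval of task t is i.
    """
    n_users, n_tasks = len(reports), len(reports[0])
    # Initialize P from the collected data: share of users voting for each interval.
    P = []
    for t in range(n_tasks):
        counts = [0.0] * m
        for k in range(n_users):
            counts[reports[k][t]] += 1.0
        P.append([c / n_users for c in counts])
    e = []
    for _ in range(iters):
        # Estimate each user's effort matrix from the current P.
        e = []
        for k in range(n_users):
            mat = [[smooth] * m for _ in range(m)]
            for t in range(n_tasks):
                j = reports[k][t]
                for i in range(m):
                    mat[i][j] += P[t][i]
            # Assumption: rows are normalized so that each sums to 1.
            e.append([[v / sum(row) for v in row] for row in mat])
        # Re-estimate P for each task from the effort matrices.
        new_P = []
        for t in range(n_tasks):
            weights = [1.0] * m
            for i in range(m):
                for k in range(n_users):
                    weights[i] *= e[k][i][reports[k][t]]
            total = sum(weights)
            new_P.append([w / total for w in weights])
        P = new_P
    return e, P


def user_accuracy(effort_matrix):
    """Mean of the diagonal: how often the user reports the accurate interval."""
    m = len(effort_matrix)
    return sum(effort_matrix[i][i] for i in range(m)) / m
```

In this sketch a user who consistently reports the wrong interval ends up with a diagonal close to 0, which is the signal the server would use when grading the report.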

step 104, paying, according to the first payment rule, a corresponding reward to each mobile terminal based on the perception accuracy corresponding to each item of perception data;

The effort of user j in time period k, as estimated by the EM algorithm, is set to

The total number of reports with estimated accuracy level i obtained by the server can be represented by the following equation:

where I is an indicator function that equals 1 when its condition holds and 0 otherwise.

The benefit of the server is the gain obtained by the server minus the rewards to be paid, and the formula is as follows:

where G^(i) is the benefit associated with perception data of accuracy level i. In the embodiment of the invention, factors such as sensing position and submission time affect the contribution that the data submitted by a user makes to the server, so an influence factor λ is introduced in the formula; for data of level i submitted by user j, the server obtains the benefit λ_j G^(i). y denotes a payment rule. In the embodiment of the invention, the perception accuracy corresponding to payment rule y is divided into H levels, represented as

where P_i contains H different payment rules. Thus, based on the estimated accuracy and the payment rule, the reward given by the server to a user in time period k can be expressed as:

The reward given to all users participating in the task can be expressed as:

Through the above formulas, corresponding rewards are paid to the users participating in the perception task according to the perception accuracy evaluated by the EM algorithm in combination with the payment rule, and at the same time the total benefit of the crowd sensing network in the current state is obtained.
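The bookkeeping of step 104 can be illustrated with a short Python sketch. The exact formulas are not reproduced in the text above, so the sketch rests on stated assumptions: the server's benefit is taken to be the sum of λ_j G^(i) over all reports minus the sum of the rewards paid, and the payment rule is modelled as a simple list giving the reward for each accuracy level. The function name `server_utility` and all parameter names are hypothetical.

```python
def server_utility(levels, influence, gain, payment):
    """Sketch of the per-period accounting under the assumptions stated above.

    levels[j]    : estimated accuracy level of user j's report (0..H-1)
    influence[j] : influence factor lambda_j of user j
    gain[i]      : benefit G^(i) associated with a report of level i
    payment[i]   : reward paid for a report of level i (the payment rule y)
    """
    # Number of reports per level, counted with the indicator I(level_j == i).
    counts = [sum(1 for lv in levels if lv == i) for i in range(len(gain))]
    # Reward paid to each user according to the payment rule.
    rewards = [payment[lv] for lv in levels]
    # Assumed form: benefit minus total payment.
    benefit = sum(lam * gain[lv] for lam, lv in zip(influence, levels))
    return counts, rewards, benefit - sum(rewards)
```

For example, with levels `[0, 2, 1, 2]`, influence factors `[1.0, 1.0, 0.5, 1.0]`, gains `[0.0, 2.0, 4.0]` and payments `[0.0, 1.0, 2.0]`, the sketch counts one report each at levels 0 and 1 and two at level 2, pays 5.0 in total and leaves the server a utility of 4.0.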

step 105, acquiring a second payment rule, wherein the second payment rule is obtained based on a Q-learning algorithm or a DQN algorithm according to the perception accuracy of the first perception task, and the second payment rule is used for the next perception task.

Since the payment rule in the MCS affects the effort that a mobile terminal user puts into a perception task, in an embodiment of the present invention the payment process can be described as a finite Markov decision process (MDP). In a dynamic environment in which the sensing models of active users are difficult to obtain, the MCS can apply the Q-learning algorithm, a model-free reinforcement learning method, and obtain the optimal payment strategy through trials in the finite Markov decision process. More specifically, the Q-learning based crowd sensing network determines the payment strategy for each perception task based on the previously observed quality of the sensing reports, the previous payment strategy and the quality function (the Q function, which gives the discounted long-term reward). For example, in the Q-learning algorithm the payment rule is regarded as the action, and the environment state in time period k is denoted s^(k), which comprises the number of reports at each accuracy level and the payment rule of the previous period; the formula is:

Let the Q function on which the Q-learning based MCS payment strategy relies be Q(s, y). The expected long-term benefit of the state-action pair (s, y) is then updated according to the Bellman equation as:

Q(s,y)←(1-α)Q(s,y)+α(u_s(s,y)+γV(s'));

where s' is the next state after payment strategy y is executed in state s; the value function V(s') gives the maximum of the Q function in state s'; γ is the discount factor, which expresses that rewards obtained further in the future count for less; and α ∈ [0,1] is the learning rate of the transition s-y-s'.
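The update rule above translates almost directly into code. The following Python sketch stores the Q function in a table keyed by (state, payment rule) and applies Q(s,y)←(1-α)Q(s,y)+α(u_s(s,y)+γV(s')) with V(s') taken as the maximum Q value in s'. The class name `PaymentQLearner`, the default values of α and γ, and the use of arbitrary hashable objects as states are assumptions of this example.

```python
from collections import defaultdict


class PaymentQLearner:
    """Tabular Q-learning over payment rules (hypothetical helper class)."""

    def __init__(self, actions, alpha=0.5, gamma=0.9):
        self.actions = list(actions)   # candidate payment rules y
        self.alpha = alpha             # learning rate
        self.gamma = gamma             # discount factor
        self.q = defaultdict(float)    # Q(s, y), zero for unseen pairs

    def value(self, state):
        # V(s) = max over y of Q(s, y)
        return max(self.q[(state, y)] for y in self.actions)

    def update(self, s, y, utility, s_next):
        # Q(s,y) <- (1 - alpha) * Q(s,y) + alpha * (u_s(s,y) + gamma * V(s'))
        target = utility + self.gamma * self.value(s_next)
        self.q[(s, y)] = (1 - self.alpha) * self.q[(s, y)] + self.alpha * target
        return self.q[(s, y)]
```

Here `utility` stands for the server benefit u_s(s, y) observed after the perception task has been paid out, and `s_next` for the state built from the newly observed accuracy levels and the payment rule just used.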

Based on the current system state s^(k) and the values of the Q function calculated for each action, the MCS server applies the ε-greedy algorithm to select the action, which prevents the selection from becoming stuck at a local optimum. Specifically, when selecting an action, the strategy predicted to be optimal in the current state is selected with a high probability of 1-ε:

Other strategies are selected at random with probability ε.
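A minimal Python sketch of this selection step is given below. It keeps the strategy with the highest Q value most of the time and occasionally picks one of the other strategies at random. The function name `epsilon_greedy` and the parameter `explore_prob` (the probability of randomly selecting another strategy) are names chosen for this example, and the Q values are assumed to be passed as a dictionary for the current state.

```python
import random


def epsilon_greedy(q_values, explore_prob, rng=random):
    """Pick a payment rule from q_values (dict: payment rule -> Q value).

    With probability explore_prob one of the non-optimal rules is chosen
    uniformly at random; otherwise the rule with the highest Q value is kept.
    """
    best = max(q_values, key=q_values.get)
    others = [y for y in q_values if y != best]
    if others and rng.random() < explore_prob:
        return rng.choice(others)
    return best
```

Passing a seeded `random.Random` instance as `rng` makes the selection reproducible, which is convenient when comparing payment strategies in simulation.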

When the state space reaches a certain size, the Q-learning based MCS operates inefficiently because of the large amount of computation required; this problem can be solved well by a deep Q-network (DQN). More specifically, the DQN combines a deep convolutional neural network (CNN), which has been applied in many fields, with the Q-learning algorithm, which secures the perception task, compresses the learning state space and improves operating efficiency.

According to the method and the device for suppressing the false sensing attack, after the MCS server issues the sensing task, an initial payment rule is firstly formulated, the user is stimulated to upload the sensing data, the sensing data of each sensing task is evaluated through the EM algorithm, the optimal payment rule is learned through the Q-learning algorithm or the DQN algorithm according to the sensing data, the user is further stimulated to send the most accurate sensing data, and the effect of suppressing the user from sending the false sensing data is achieved.

On the basis of the foregoing embodiment, in step 105, after the obtaining a second payment rule, the second payment rule is obtained based on a Q-learning algorithm or a DQN algorithm according to the sensing accuracy of the first sensing task, and the second payment rule is used for a next sensing task, the method includes:

and repeating the steps 102 to 105 based on the next sensing task and the second payment rule, and updating the benefit value of the current state until the total benefit of the MCS is converged.

The Q-learning algorithm or DQN algorithm is an artificial intelligence technique that learns behavior through trial and error in a dynamic environment; through learning, the ideal behavior for a specific environment is selected automatically in later actions, so that the optimal state is reached. In the embodiment of the invention, when the MCS is in its initial stage, prior knowledge of the external network environment and of the perception data uploaded by mobile terminals is lacking, so the established payment rule is imperfect, and an ordinary mobile terminal user selects the sensing action that yields the maximum expected benefit under the payment rule.

However, some malicious mobile terminal users may find loopholes in the imperfect payment rule and upload false sensing data at little cost to trick the MCS into paying, so the MCS is very vulnerable to false sensing attacks. When the MCS learns iteratively through the Q-learning algorithm or DQN algorithm, the server gradually learns from the observed sensing report quality, payment rules and the like, gradually improves its pricing of the perception task according to the most recently observed perception data, and adjusts the pricing table. The server finally obtains the optimal payment rule, so that the probability that malicious users launch false sensing attacks is minimized and reaches a stable value.

In the embodiment of the invention, the payment rule is obtained by iterating the Q-learning algorithm or DQN algorithm, which on the one hand motivates users to upload high-quality perception data rather than launch false sensing attacks, and on the other hand keeps the reward that the MCS pays for high-quality perception data as low as possible.

On the basis of the foregoing embodiment, in step 105, the obtaining a second payment rule, where the second payment rule is obtained based on a Q-learning algorithm or a DQN algorithm according to the sensing accuracy of the first sensing task, includes:

obtaining a plurality of payment rules based on a Q-learning algorithm or a DQN algorithm according to the perception accuracy of the first perception task;

selecting a target payment rule from the plurality of payment rules as the second payment rule according to an ε-greedy algorithm.

In the learning process of the Q-learning algorithm or DQN algorithm, the optimal payment rule is determined according to the ε-greedy algorithm in order to avoid false convergence of the learning result. In the current perception task, the server selects the payment rule with the highest benefit with probability ε and randomly selects one of the other payment rules with probability 1-ε, where 0<ε<1 and ε approaches 1.

In the embodiment of the invention, the optimal payment rule is determined through the ε-greedy algorithm, so that the payment rule selected each time brings the MCS as much benefit as possible while avoiding being trapped at a local maximum, thereby obtaining the globally optimal payment rule.

On the basis of the foregoing embodiment, in step 101, the obtaining a first perception task and formulating a first payment rule according to the first perception task includes:

taking the classified sensing data as initial sensing data;

and acquiring a first perception task, and formulating a first payment rule according to the first perception task and corresponding initial perception data.

In the embodiment of the invention, because a newly initialized MCS lacks prior experience, existing classified perception data whose accuracy has already been evaluated is input into the MCS to improve learning efficiency. The classified perception data is historical perception data obtained according to the prior art, and the specific manner is not repeated here. When the server receives the first perception task, it classifies the task. It should be noted that the classified perception data serves only as a reference sample for formulating a payment rule: at the initial stage of MCS establishment, the initial payment rule, that is, the first payment rule, is formulated from the existing perception data and the perception task. In addition, the classified perception data also includes malicious sensing attacks and data that is unqualified because of low sensing quality.

Inputting the existing initial perception data into the initialized MCS improves the learning efficiency of the network, so that the whole MCS reaches benefit convergence in a short time and false sensing attacks are suppressed.

On the basis of the foregoing embodiment, step 104, the paying a corresponding reward to each mobile terminal based on the sensing accuracy corresponding to each sensing data according to the first payment rule includes:

if the sensing accuracy is smaller than or equal to a first threshold value, determining that the sensing data is a false data attack;

if the perception accuracy is larger than a first threshold and smaller than or equal to a second threshold, determining that the perception data is target perception data;

if the perception accuracy is larger than the second threshold value, determining that the perception data is excess perception data;

paying corresponding remuneration for the mobile terminal which is determined to be the target perception data and the excess perception data;

wherein the second threshold is greater than the first threshold.
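The two-threshold classification listed above can be sketched in Python as follows. The function names `classify_report` and `is_paid` and the string labels are hypothetical; the comparisons follow the conditions of the embodiment, with the third case applied to the perception accuracy.

```python
def classify_report(accuracy, first_threshold, second_threshold):
    """Classify a report by its evaluated perception accuracy."""
    if second_threshold <= first_threshold:
        raise ValueError("second threshold must be greater than first threshold")
    if accuracy <= first_threshold:
        return "false"    # treated as a false data attack
    if accuracy <= second_threshold:
        return "target"   # target perception data
    return "excess"       # quality beyond what the task requires


def is_paid(label):
    """Only target and excess perception data are rewarded."""
    return label in ("target", "excess")
```

With thresholds 0 and 1, an accuracy of 0 is classified as a false data attack, 0.5 as target perception data and 1.5 as excess perception data.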

In the embodiment of the invention, after the perception data of the perception task has been evaluated, the perception data is classified by accuracy level. Given that each item of perception data can be assigned to one of the accuracy levels, a first threshold and a second threshold are set, where the second threshold corresponds to the highest-quality perception data that still meets the requirement of the perception task. That is, if a given perception task only requires perception data of medium quality and a user submits perception data whose quality exceeds the task requirement, that perception data is judged not to match the requirement of the perception task. It should be noted that the values of the first threshold and the second threshold may be set according to actual requirements, which is not specifically limited in the embodiment of the present invention.

For example, the first threshold is set to 0 and the second threshold is set to 1. A sensing report whose perception accuracy is less than or equal to 0 is defined as a false sensing attack. In actual evaluation, a perception accuracy less than 0 may also be defined as the case where the user accepts the task but does not carry it out, in which case a perception accuracy equal to 0 is still defined as a false sensing attack. No reward is sent to a mobile terminal that submits such perception data. When the perception accuracy is greater than 0 and less than or equal to 1, the perception data is defined as target perception data, and the corresponding reward is paid to the mobile terminal according to the payment rule. Similarly, when the perception accuracy is greater than 1, the corresponding reward is paid to the mobile terminal according to the payment rule. The difference is that the higher the quality of the perception data, the further it deviates from the requirement of the perception task, so the payment rule is generally set such that the higher the quality, the lower the reward. Users with higher-quality sensing devices can thus weigh their own resource consumption and decide whether to accept the perception task.

By setting the first threshold and the second threshold, the embodiment of the invention reduces the probability that users launch false sensing attacks, leads users with higher sensing quality to keep silent under the budget limit, and reduces the redundancy of perception data and the loss of transmission power.

Fig. 2 shows another method for suppressing false sensing attacks according to an embodiment of the present invention. As shown in Fig. 2, an embodiment of the present invention provides a method for suppressing false sensing attacks, including:

Step 201, acquiring a perception task and a payment rule;

Step 202, acquiring an estimated perception task reward, wherein the estimated perception task reward is estimated based on perception quality according to the perception task and the payment rule;

Step 203, selecting whether to accept the perception task according to the estimated perception task reward, and if so, sending perception data of the perception task to a server after the perception task is completed;

Step 204, receiving the reward corresponding to the perception task, wherein the reward is obtained according to the perception accuracy of the perception data and the payment rule.

In the embodiment of the invention, the MCS server first broadcasts recruitment information comprising a perception task and a payment rule, wherein the payment rule is intended to encourage users to actively participate in the perception task. After receiving the recruitment information through the mobile terminal, each user decides on a perception strategy. For example, the user decides whether to accept the perception task and, if so, how many resources to allocate to it. The quality of the perception data depends on the perception strength of the sensors of the mobile terminal, such as the time and battery power spent on perception. The mobile terminal can therefore estimate the perception task reward from the resources the user expects to allocate and the perception strength of the mobile terminal. If the estimated perception task reward reaches the user's expected value, the user accepts the perception task and sends the completed perception data to the MCS server through the mobile terminal; the server then evaluates the perception accuracy of the perception data and sends the corresponding reward to the mobile terminal according to the payment rule.
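The decision made on the mobile terminal side can be sketched as follows. This is only an illustration under stated assumptions: the linear accuracy model in `estimate_accuracy`, the names `sensor_gain`, `cost_per_effort` and `expected_value`, and the subtraction of a resource cost from the estimated reward are all hypothetical and are not specified by the embodiment.

```python
from typing import Callable


def estimate_accuracy(effort: float, sensor_gain: float) -> float:
    """Assumed model: achievable accuracy grows linearly with allocated effort."""
    return sensor_gain * effort


def accept_task(payment_rule: Callable[[float], float], effort: float,
                sensor_gain: float, cost_per_effort: float,
                expected_value: float) -> bool:
    """Return True if the estimated net reward reaches the user's expected value."""
    estimated_reward = payment_rule(estimate_accuracy(effort, sensor_gain))
    # Assumed: resource consumption (time, battery) is charged per unit of effort.
    net_reward = estimated_reward - cost_per_effort * effort
    return net_reward >= expected_value


def example_rule(accuracy: float) -> float:
    """Hypothetical broadcast payment rule: pay only inside the target range (0, 1]."""
    return 10.0 * accuracy if 0.0 < accuracy <= 1.0 else 0.0
```

For instance, `accept_task(example_rule, effort=1.0, sensor_gain=0.8, cost_per_effort=2.0, expected_value=5.0)` evaluates an estimated reward of 8.0 against a cost of 2.0 and accepts the task, whereas raising `expected_value` to 7.0 leads the terminal to decline.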

In the embodiment of the invention, the mobile terminal estimates the reward of the perception task according to the payment rule in combination with the perception strength of the mobile terminal. At the early stage of receiving the perception task, the user can compare this estimate against the user's expected value and select the corresponding next action, thereby improving the accuracy of the perception data and reducing the probability of false sensing attacks.

Fig. 3 is a schematic structural diagram of a server for suppressing false sensing attacks according to an embodiment of the present invention. As shown in Fig. 3, an embodiment of the present invention provides a server for suppressing false sensing attacks, including: a first processing module 301, a sending module 302, a second processing module 303, a payment module 304 and a third processing module 305, wherein the first processing module 301 is configured to acquire a first perception task and formulate a first payment rule according to the first perception task; the sending module 302 is configured to send the first perception task and the first payment rule to a plurality of mobile terminals, so that the mobile terminals select whether to participate in the first perception task according to the first payment rule; the second processing module 303 is configured to acquire perception data sent by the plurality of mobile terminals participating in the first perception task, perform EM algorithm evaluation on the perception data, and acquire the perception accuracy corresponding to each perception data; the payment module 304 is configured to pay a corresponding reward to each mobile terminal based on the perception accuracy corresponding to each perception data according to the first payment rule; the third processing module 305 is configured to acquire a second payment rule, where the second payment rule is obtained based on a Q-learning algorithm or a DQN algorithm according to the perception accuracy of the first perception task, and the second payment rule is used for the next perception task.

In the embodiment of the present invention, the third processing module 305 iterates on the perception data of the perception task based on the Q-learning algorithm or the DQN algorithm to obtain the payment rule for the next perception task. On one hand, this encourages users to upload high-quality perception data rather than launch false sensing attacks; on the other hand, it reduces, as far as possible, the reward that the MCS server pays for the high-quality perception data uploaded by users.
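A tabular Q-learning update of the kind the third processing module could rely on is sketched below. The sketch makes several assumptions that are not part of the embodiment: the state is taken to be a discretized accuracy level observed in the previous perception task, the action is an index into a hypothetical list `PAYMENT_LEVELS`, the `utility` argument stands for whatever benefit value the server assigns to a completed task, and the learning rate, discount factor and exploration rate are arbitrary example values.

```python
import random
from collections import defaultdict

ALPHA = 0.1      # learning rate (assumed value)
GAMMA = 0.9      # discount factor (assumed value)
EPSILON = 0.1    # exploration rate (assumed value)
PAYMENT_LEVELS = [0.0, 1.0, 2.0, 3.0]  # hypothetical discrete payment levels


def choose_payment(q, state, rng, epsilon=EPSILON):
    """Epsilon-greedy choice of a payment level index for the next task."""
    if rng.random() < epsilon:
        return rng.randrange(len(PAYMENT_LEVELS))  # explore
    values = [q[(state, a)] for a in range(len(PAYMENT_LEVELS))]
    return values.index(max(values))               # exploit


def q_update(q, state, action, utility, next_state, alpha=ALPHA, gamma=GAMMA):
    """Standard Q-learning update; returns the new Q-value of (state, action)."""
    best_next = max(q[(next_state, a)] for a in range(len(PAYMENT_LEVELS)))
    q[(state, action)] += alpha * (utility + gamma * best_next - q[(state, action)])
    return q[(state, action)]


# Usage: the Q-table starts at zero; each finished task yields one update.
q_table = defaultdict(float)
rng = random.Random(0)
action = choose_payment(q_table, state=0, rng=rng)
q_update(q_table, state=0, action=action, utility=10.0, next_state=1)
```

Repeating the select, observe and update cycle over successive perception tasks lets the table converge toward payment levels with higher long-run utility; a DQN variant would replace the table with a neural network approximating the same Q-values.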

Fig. 4 is a schematic structural diagram of a mobile terminal for suppressing false sensing attacks according to an embodiment of the present invention. As shown in Fig. 4, the embodiment of the present invention provides a mobile terminal for suppressing false sensing attacks, including: a first acquisition module 401, a second acquisition module 402, a selection module 403 and a reward receiving module 404, wherein the first acquisition module 401 is configured to acquire a perception task and a payment rule; the second acquisition module 402 is configured to acquire an estimated perception task reward, where the estimated perception task reward is estimated based on perception quality according to the perception task and the payment rule; the selection module 403 is configured to select whether to accept the perception task according to the estimated perception task reward, and if so, send perception data of the perception task to a server after the perception task is completed; the reward receiving module 404 is configured to receive the reward corresponding to the perception task, which is obtained according to the perception accuracy of the perception data and the payment rule.

In the embodiment of the invention, the second acquisition module 402 estimates the reward of the perception task according to the payment rule in combination with the perception strength of the mobile terminal. At the early stage of receiving the perception task, the user can compare this estimate against the user's expected value and select the corresponding next action, so that the accuracy of the perception data is improved and the probability of false sensing attacks is reduced.

The apparatus provided in the embodiment of the present invention is used for executing the above method embodiments; for the specific process and details, reference is made to the above embodiments, which are not described herein again.

Fig. 5 is a schematic structural diagram of a computer device according to an embodiment of the present invention. As shown in Fig. 5, the computer device may include: a processor 501, a communications interface 502, a memory 503, and a communication bus 504, wherein the processor 501, the communications interface 502, and the memory 503 communicate with each other via the communication bus 504. The processor 501 may call logic instructions in the memory 503 to perform the following method: acquiring a first perception task, and formulating a first payment rule according to the first perception task; sending the first perception task and the first payment rule to a plurality of mobile terminals so that the mobile terminals can select whether to participate in the first perception task according to the first payment rule; acquiring perception data sent by a plurality of mobile terminals participating in the first perception task, and performing EM (expectation-maximization) algorithm evaluation on the perception data to acquire perception accuracy corresponding to each perception data; according to the first payment rule, paying corresponding payment to each mobile terminal based on the perception accuracy corresponding to each perception data; acquiring a second payment rule, wherein the second payment rule is obtained based on a Q-learning algorithm or a DQN algorithm according to the sensing accuracy of the first sensing task, and the second payment rule is used for the next sensing task;

or acquiring a perception task and a payment rule; acquiring a pre-estimated perception task reward, wherein the pre-estimated perception task reward is obtained based on perception quality pre-estimation according to the perception task and the payment rule; selecting whether to accept the perception task according to the estimated task reward, and if so, sending perception data of the perception task to a server after the perception task is completed; and receiving the corresponding reward of the perception task, wherein the corresponding reward of the perception task is obtained according to the perception accuracy of the perception data and the payment rule.

In addition, the logic instructions in the memory 503 may be implemented in the form of software functional units and, when sold or used as independent products, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and various other media capable of storing program code.

An embodiment of the present invention discloses a computer program product, which includes a computer program stored on a non-transitory computer-readable storage medium, the computer program including program instructions which, when executed by a computer, enable the computer to execute the methods provided by the above method embodiments, for example, the method includes: acquiring a first perception task, and formulating a first payment rule according to the first perception task; sending the first perception task and the first payment rule to a plurality of mobile terminals so that the mobile terminals can select whether to participate in the first perception task according to the first payment rule; acquiring perception data sent by a plurality of mobile terminals participating in the first perception task, and performing EM (expectation-maximization) algorithm evaluation on the perception data to acquire perception accuracy corresponding to each perception data; according to the first payment rule, paying corresponding payment to each mobile terminal based on the perception accuracy corresponding to each perception data; acquiring a second payment rule, wherein the second payment rule is obtained based on a Q-learning algorithm or a DQN algorithm according to the sensing accuracy of the first sensing task, and the second payment rule is used for the next sensing task;

An embodiment of the present invention provides a non-transitory computer-readable storage medium storing server instructions, where the server instructions cause a computer to execute the method for suppressing false perception attacks provided in the foregoing embodiment, for example, the method includes: acquiring a first perception task, and formulating a first payment rule according to the first perception task; sending the first perception task and the first payment rule to a plurality of mobile terminals so that the mobile terminals can select whether to participate in the first perception task according to the first payment rule; acquiring perception data sent by a plurality of mobile terminals participating in the first perception task, and performing EM (expectation-maximization) algorithm evaluation on the perception data to acquire perception accuracy corresponding to each perception data; according to the first payment rule, paying corresponding payment to each mobile terminal based on the perception accuracy corresponding to each perception data; acquiring a second payment rule, wherein the second payment rule is obtained based on a Q-learning algorithm or a DQN algorithm according to the sensing accuracy of the first sensing task, and the second payment rule is used for the next sensing task;

Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims

1. A method of suppressing false perception attacks, comprising:

2. The method of claim 1, wherein after the obtaining a second payment rule, the second payment rule being obtained based on a Q-learning algorithm or a DQN algorithm according to the perceptual accuracy of the first perceptual task, the second payment rule being used for a next perceptual task, the method comprises:

and repeating steps 102 to 105 based on the next sensing task and the second payment rule, and updating the benefit value of the current state until the total benefit of the crowd sensing network converges.

3. The method according to claim 1 or 2, wherein said obtaining a second payment rule, said second payment rule being derived based on a Q-learning algorithm or a DQN algorithm according to the perceptual accuracy of the first perceptual task, comprises:

4. The method of claim 1, wherein obtaining the first perceptual task and formulating a first payment rule based on the first perceptual task comprises:

taking the classified sensing data as initial sensing data;

5. The method according to claim 1, wherein the paying a corresponding reward to each mobile terminal based on the perceived accuracy corresponding to each perception data according to the first payment rule comprises:

wherein the second threshold is greater than the first threshold.

6. A method of suppressing false perception attacks, comprising:

acquiring a perception task and a payment rule, wherein the payment rule is acquired based on a Q-learning algorithm or a DQN algorithm according to perception accuracy, and the perception accuracy is acquired by performing an EM (expectation-maximization) algorithm on perception data;

selecting whether to accept the perception task according to the estimated perception task reward, and if so, sending perception data of the perception task to a server after the perception task is completed;

7. A server for suppressing false perception attacks, comprising:

the sending module is used for sending the first perception task and the first payment rule to a plurality of mobile terminals so that the mobile terminals can select whether to participate in the first perception task or not according to the first payment rule;

the payment module is used for paying corresponding remuneration to each mobile terminal based on the perception accuracy corresponding to each perception data according to the first payment rule;

8. A mobile terminal for suppressing false perception attacks, comprising:

the first acquisition module is used for acquiring a perception task and a payment rule, wherein the payment rule is obtained based on a Q-learning algorithm or a DQN algorithm according to perception accuracy, and the perception accuracy is obtained by performing an EM algorithm on perception data;

the selection module is used for selecting whether to accept the perception task according to the estimated perception task reward, and if so, sending perception data of the perception task to a server after the perception task is completed;

9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor when executing the program implements a method of suppressing false perception attacks as claimed in any one of claims 1 to 6.

10. A non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method of suppressing false perception attacks according to any one of claims 1 to 6.