CN109308418B - Model training method and device based on shared data


Info

Publication number
CN109308418B
Authority
CN
China
Prior art keywords
data
value
trusted execution
execution environment
model
Legal status
Active
Application number
CN201710632357.5A
Other languages
Chinese (zh)
Other versions
CN109308418A (en)
Inventor
王力
周俊
李小龙
Current Assignee
Advanced New Technologies Co Ltd
Original Assignee
Advanced New Technologies Co Ltd
Application filed by Advanced New Technologies Co Ltd
Priority to CN201710632357.5A
Publication of CN109308418A
Application granted
Publication of CN109308418B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/602 Providing cryptographic facilities or services
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes

Abstract

A model training method and device based on shared data are disclosed. On one hand, joint training can be performed according to data provided by a plurality of data providers, so that a more accurate and comprehensive data model is obtained; on the other hand, operations related to private data in the model training process (such as data decryption and local model parameter updating) are encapsulated in a trusted execution environment of the data provider for execution. No data plaintext can be obtained outside the trusted execution environment.

Description

Model training method and device based on shared data
Technical Field
The embodiment of the specification relates to the technical field of data mining, in particular to a model training method and device based on shared data.
Background
In the big data era, mining massive data can yield useful information in various forms, so the importance of data is self-evident. Different organizations own their respective data, but the data mining effect of any single organization is limited by the amount and kinds of data it owns. A direct solution to this problem is for multiple organizations to cooperate and share data with each other, thereby achieving a better data mining effect and a win-win outcome.
However, data itself is a highly valuable asset for its owner, and owners are often unwilling to provide data directly, due to requirements such as protecting privacy and preventing leaks, which makes "data sharing" difficult to put into practice. Therefore, how to implement data sharing while sufficiently ensuring data security has become a problem of great concern in the industry.
Disclosure of Invention
In view of the above technical problems, embodiments of the present specification provide a method and an apparatus for model training based on shared data, and the technical solution is as follows:
according to a first aspect of embodiments herein, there is provided a method for model training based on shared data, the method comprising:
performing iterative training by using the following steps until the model training requirement is met:
respectively obtaining ciphertext data provided by at least 1 data provider;
respectively inputting the ciphertext data of each data provider into a trusted execution environment of the data provider;
obtaining an output value of each trusted execution environment, wherein the output value is obtained by calculation according to the ciphertext data;
calculating a deviation value between a model predicted value and a true value according to a given training target model; the model predicted value is determined according to the output value of each trusted execution environment, and the true value is a global label value determined according to the data of each data provider;
respectively returning the deviation values to the trusted execution environments, so that the trusted execution environments respectively update the local model parameters;
wherein the following steps are performed inside any trusted execution environment:
decrypting the input ciphertext data to obtain a plaintext data characteristic value;
calculating an output value corresponding to the characteristic value of the plaintext data according to the current local model parameter;
and updating the local model parameters according to the returned deviation value.
According to a second aspect of the embodiments of the present specification, there is provided a shared data-based model training method, including:
respectively inputting ciphertext data of at least 1 data provider to a trusted execution environment of the data provider; in each trusted execution environment, respectively decrypting the input ciphertext data to obtain each plaintext data characteristic value;
performing iterative training by using the following steps until the model training requirement is met:
in each trusted execution environment, calculating an output value corresponding to a plaintext data characteristic value according to a current local model parameter;
calculating a deviation value between a model predicted value and a true value according to a given training target model; the model predicted value is determined according to the output value of each trusted execution environment, and the true value is a global label value determined according to the data of each data provider;
respectively returning the deviation values to the trusted execution environments;
and in each trusted execution environment, updating local model parameters according to the returned deviation value.
According to a third aspect of embodiments herein, there is provided a shared data-based model training method, the method including:
performing iterative training by using the following steps until the model training requirement is met:
respectively obtaining data provided by a plurality of data providers, wherein the data provided by at least 1 data provider is ciphertext data, and the data provided by other data providers is plaintext data;
if the data form provided by the data provider is ciphertext data, inputting the ciphertext data into a trusted execution environment of the data provider correspondingly;
obtaining an output value of each trusted execution environment, wherein the output value is obtained by calculation according to the ciphertext data;
calculating the deviation value between the model predicted value and the true value by using the output values of the trusted execution environments and the plaintext data provided by the other data providers; the model predicted value is determined according to the output value of each trusted execution environment and the plaintext data feature values; the true value is a global label value determined according to the data of each data provider;
respectively returning the deviation values to the trusted execution environments, so that the trusted execution environments respectively update the local model parameters;
wherein the following steps are performed inside any trusted execution environment:
decrypting the input ciphertext data to obtain a plaintext data characteristic value;
calculating an output value corresponding to the characteristic value of the plaintext data according to the current local model parameter;
and updating the local model parameters according to the returned deviation value.
According to a fourth aspect of the embodiments of the present specification, there is provided a data prediction method based on shared data modeling, the method including:
respectively obtaining ciphertext data provided by at least 1 data provider;
respectively inputting the ciphertext data of each data provider into a trusted execution environment of the data provider;
obtaining an output value of each trusted execution environment, wherein the output value is obtained by calculation according to the ciphertext data;
inputting the output value of each trusted execution environment into a pre-trained prediction model, and calculating to obtain a predicted value;
wherein the following steps are performed inside any trusted execution environment:
decrypting the input ciphertext data to obtain a plaintext data characteristic value;
and calculating an output value corresponding to the characteristic value of the plaintext data according to the current local model parameter.
According to a fifth aspect of the embodiments of the present specification, there is provided a shared data-based model training apparatus, including the following modules for implementing iterative training:
the data acquisition module is used for respectively acquiring ciphertext data provided by at least 1 data provider;
the data input module is used for correspondingly inputting the ciphertext data of each data provider into the trusted execution environment of the data provider;
an output value obtaining module, configured to obtain an output value of each trusted execution environment, where the output value is obtained through calculation according to the ciphertext data;
the deviation value calculation module is used for calculating the deviation value between the model predicted value and the true value according to a given training target model; the model predicted value is determined according to the output value of each trusted execution environment, and the true value is a global label value determined according to the data of each data provider;
the deviation value returning module is used for returning the deviation values to the trusted execution environments respectively so that the trusted execution environments update the local model parameters respectively;
wherein the inside of any trusted execution environment includes:
the decryption submodule is used for decrypting the input ciphertext data to obtain a plaintext data characteristic value;
the output value operator module is used for calculating an output value corresponding to the characteristic value of the plaintext data according to the current local model parameter;
and the parameter updating submodule is used for updating the local model parameters according to the returned deviation value.
According to a sixth aspect of the embodiments of the present specification, there is provided a shared data-based model training apparatus, including the following modules for implementing iterative training:
the data acquisition module is used for respectively acquiring data provided by a plurality of data providers, wherein the data provided by at least 1 data provider is in a ciphertext data form, and the data provided by other data providers is in a plaintext data form;
the data input module is used for correspondingly inputting the ciphertext data into a trusted execution environment of a data provider if the data provided by the data provider is ciphertext data;
an output value obtaining module, configured to obtain an output value of each trusted execution environment, where the output value is obtained through calculation according to the ciphertext data;
the deviation value calculation module is used for calculating the deviation value between the model predicted value and the true value by using the output value of each trusted execution environment and the plaintext data provided by the other data providers; the model predicted value is determined according to the output value of each trusted execution environment and the plaintext data feature values; the true value is a global label value determined according to the data of each data provider;
the deviation value returning module is used for returning the deviation values to the trusted execution environments respectively so that the trusted execution environments update the local model parameters respectively;
wherein the inside of any trusted execution environment includes:
the decryption submodule is used for decrypting the input ciphertext data to obtain a plaintext data characteristic value;
the output value operator module is used for calculating an output value corresponding to the characteristic value of the plaintext data according to the current local model parameter;
and the parameter updating submodule is used for updating the local model parameters according to the returned deviation value.
According to a seventh aspect of the embodiments of the present specification, there is provided a data prediction apparatus based on shared data modeling, the apparatus including:
the data acquisition module is used for respectively acquiring ciphertext data provided by at least 1 data provider;
the data input module is used for correspondingly inputting the ciphertext data of each data provider into the trusted execution environment of the data provider;
an output value obtaining module, configured to obtain an output value of each trusted execution environment, where the output value is obtained through calculation according to the ciphertext data;
the predicted value calculation module is used for inputting the output value of each trusted execution environment into a pre-trained prediction model and calculating to obtain a predicted value;
wherein the inside of any trusted execution environment E_u includes:
the decryption submodule is used for decrypting the input ciphertext data to obtain a plaintext data characteristic value;
and the output value operator module is used for calculating an output value corresponding to the characteristic value of the plaintext data according to the current local model parameter.
The technical scheme provided by the embodiments of the present specification is as follows: on one hand, joint training can be performed according to data provided by a plurality of data providers, so that a more accurate and comprehensive data model is obtained; on the other hand, the operations related to private data in the model training process (such as data decryption and local model parameter updating) are encapsulated in a trusted execution environment of the data provider for execution. That is to say, the data plaintext cannot be acquired outside the trusted execution environment, so the data security of each shared-data provider is effectively ensured.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of embodiments of the invention as claimed.
Drawings
In order to more clearly illustrate the embodiments of the present disclosure or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some of the embodiments described in the present disclosure, and those skilled in the art can obtain other drawings based on them.
FIG. 1 is a schematic diagram of a data sharing collaboration mode;
FIG. 2 is a schematic diagram of the architecture of the model training system disclosed in this specification;
FIG. 3 is a schematic diagram of the architecture of the data prediction system disclosed herein;
FIG. 4a is a schematic diagram of the architecture of a model training system of one embodiment of the present description;
FIG. 4b is a schematic diagram of the architecture of a model training system of another embodiment of the present description;
FIG. 5 is a schematic diagram of a shared data based model training apparatus disclosed in the present specification;
FIG. 6 is a schematic diagram of a data prediction device based on shared data modeling as disclosed in the present specification;
FIG. 7 is a schematic structural diagram of a computer device disclosed in the present specification.
Detailed Description
In order to enable those skilled in the art to better understand the technical solutions in the embodiments of the present specification, these solutions are described in detail below with reference to the drawings in the embodiments. Obviously, the described embodiments are only a part of the embodiments of the present specification, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present specification shall fall within the scope of protection of the embodiments of the present specification.
As shown in FIG. 1, the cooperative mode of "data sharing" involves several roles: data providers, a data miner, and data attackers. Multiple data providers jointly give data to a data miner for shared data mining, but to protect data privacy they do not wish to hand the data over as-is. On the other hand, the data providers also need to prevent data attackers from stealing the data. In a broad sense, for any data provider, the data miner and the other data providers actually constitute potential attackers.
Thus, one of the basic requirements for implementing secure data sharing is: a data provider provides its data to the data miner after some encryption processing. The encrypted data still retains a certain amount of information, so that the data miner can still perform data mining with it, but cannot obtain the specific data content.
In view of the above-mentioned needs, the embodiments of the present specification provide a data sharing scheme based on trusted execution environments. The scheme trains a data model from massive data samples, where the samples come from multiple data providers and different providers can each contribute sample features of different dimensions; after the shared data of all the providers are integrated, data samples with richer feature dimensions can be formed, so that a better-performing data model can be trained.
First, the Trusted Execution Environment (TEE) technique involved is briefly described. A trusted execution environment is a secure area on a device's processor that can guarantee the security, confidentiality, and integrity of the code and data loaded inside it. A trusted execution environment provides an isolated execution environment, with security features including: isolated execution, integrity of trusted applications, confidentiality of trusted data, secure storage, and so on. In general, a trusted execution environment can provide a higher level of security than the operating system. Trusted execution environments were originally proposed for mobile devices (e.g., smart phones, tablet computers, set-top boxes, smart televisions), but their application scenarios are no longer limited to the mobile domain. Common trusted execution environment implementations include AMD's PSP (Platform Security Processor), ARM's TrustZone, and Intel's SGX (Software Guard Extensions).
In an application scenario of data sharing, for any data provider, every party other than itself is considered untrustworthy. Trusted execution environments can therefore be created for the data providers respectively, and the operations carrying data security risks are encapsulated in the trusted execution environments for execution, thereby meeting the data providers' requirements on data security.
The architecture of the data sharing system provided by the embodiments of the present specification is shown in FIG. 2. Assume there are U data providers in total: 1, 2, … U, jointly providing data to the data miner so that the data miner can train a global model. The overall working principle of the data sharing is as follows:
each data provider provides the encrypted data to the data miner;
and the data mining party respectively creates a trusted execution environment for each data provider, and then correspondingly inputs the ciphertext data into each trusted execution environment.
In the trusted execution environment, the ciphertext data is decrypted and an output value is then calculated using the model parameters stored inside it. The output value of the trusted execution environment can be used for the data miner's model training, but the data miner cannot obtain specific data content from it.
And the data mining party performs joint training according to the output values of the trusted execution environments, so as to obtain a global data model.
Throughout the whole process, the operations involving data plaintext are encapsulated in the respective trusted execution environments, which are completely isolated from the outside, so the data security of each data provider can be effectively guaranteed.
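Before walking through the step-by-step flow, the working principle above can be summarized in a minimal Python sketch. All names here (Enclave, decrypt_record, h, z) are illustrative assumptions standing in for a real TEE and for the unspecified algorithms of this scheme, not an actual SGX/TrustZone API:

```python
import math

def z(outputs):
    # Union function of the outputs O_1 ... O_U; here simply their sum.
    return sum(outputs)

def h(value):
    # Global model function, e.g. the logistic form h(z) = 1 / (1 + e^-z).
    return 1.0 / (1.0 + math.exp(-value))

def decrypt_record(key, blob):
    # Placeholder for the provider-specific decryption logic; one concrete
    # possibility (AES-GCM) is sketched later in this description.
    raise NotImplementedError

class Enclave:
    """Stand-in for E_u: the key, plaintext and parameters W_u stay inside."""
    def __init__(self, key, n_features, alpha=0.01):
        self._key = key                 # decryption key, never exported
        self._w = [0.0] * n_features    # local model parameters W_u
        self._alpha = alpha             # learning rate
        self._x = None                  # plaintext features X_u

    def load_ciphertext(self, blob):
        # Function 1: decryption happens only inside the enclave.
        self._x = decrypt_record(self._key, blob)

    def output_value(self):
        # Function 2: O_u = sum_j w_j * x_j, computed from the local W_u.
        return sum(w * x for w, x in zip(self._w, self._x))

    def update(self, delta):
        # Function 3: W_u <- W_u - alpha * delta * X_u.
        self._w = [w - self._alpha * delta * x
                   for w, x in zip(self._w, self._x)]

def train_step(enclaves, ciphertexts, y):
    for e, c in zip(enclaves, ciphertexts):
        e.load_ciphertext(c)                          # feed ciphertext in
    outputs = [e.output_value() for e in enclaves]    # collect O_1 ... O_U
    delta = y - h(z(outputs))                         # deviation value
    for e in enclaves:
        e.update(delta)                               # return it to each E_u
```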
The following describes a model training method provided in the embodiments of the present specification with reference to a specific flow of steps:
Training a data model means searching for optimal model parameter values through repeated iteration: the model parameters are updated in each iteration, and training ends when the updated parameters meet the training requirement. Referring to FIG. 2, a complete iteration of an embodiment of the present disclosure is described below:

S101: respectively obtaining ciphertext data provided by the U data providers;

The data providers provide the sample data for model training. The data provided by different data providers may have completely different features, or the same or partially overlapping features. In practical applications, each data provider and the data miner can agree in advance on which features need to be uploaded for model training; the embodiments of the present specification do not limit this.
Assuming that there are a total of U data providers (where U ≧ 2), the data each provides is represented as follows:

Data provider 1: (x_1^1, x_2^1, x_3^1, …), noted as (x_1, x_2, x_3, …)^1 → X_1

Data provider 2: (x_1^2, x_2^2, x_3^2, …), noted as (x_1, x_2, x_3, …)^2 → X_2

……

Data provider U: (x_1^U, x_2^U, x_3^U, …), noted as (x_1, x_2, x_3, …)^U → X_U

Here x_1, x_2, x_3, … respectively represent different feature values of a piece of data, and the superscripts 1, 2, … U identify the respective data providers. For convenience of description, the notation may be uniformly written with a single superscript applied to the whole tuple. It will be appreciated that the features provided by the various data providers may have the same or different meanings, and the number of features uploaded by each data provider may vary. However, from the data obtained from the multiple data providers, multiple groups of data describing the same object can be extracted to form global training samples.
For example, suppose there are 3 data providers, each providing data samples with different features:

The features uploaded by data provider 1 include: age, occupation;

The features uploaded by data provider 2 include: sex, height, weight;

The features uploaded by data provider 3 include: sex, blood pressure, heart rate;

If the feature data of any person i can be obtained from all 3 data providers, the data provided by the 3 providers can be combined into a large number of training samples, so that a model with 7 features (age, occupation, sex, height, weight, blood pressure, heart rate) can be jointly trained, as illustrated by the sketch below.
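As a toy illustration of how such a 7-feature global sample could be assembled (the record contents and the person identifier are invented for this sketch):

```python
# Hypothetical per-provider records, keyed by a shared person identifier.
provider1 = {"p001": {"age": 35, "occupation": "teacher"}}
provider2 = {"p001": {"sex": "F", "height": 165, "weight": 58}}
provider3 = {"p001": {"sex": "F", "blood_pressure": 118, "heart_rate": 72}}

def build_global_samples(*providers):
    """Join feature dicts on person id; duplicated features (sex) collapse."""
    shared_ids = set.intersection(*(set(p) for p in providers))
    samples = {}
    for pid in shared_ids:
        merged = {}
        for p in providers:
            merged.update(p[pid])
        samples[pid] = merged
    return samples

# One joined sample with the 7 features: age, occupation, sex,
# height, weight, blood_pressure, heart_rate.
print(build_global_samples(provider1, provider2, provider3))
```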
In order to ensure data security, each data provider encrypts its data and provides it to the data miner in ciphertext form. It can be considered that the encryption logic used by each data provider is known only to itself, so the encrypted data can be securely stored or transmitted in an untrusted environment.
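This description does not prescribe a particular cipher; as one hedged sketch, a provider could protect each record with an authenticated cipher such as AES-GCM (via the third-party `cryptography` package), so that only the key holder, i.e. the provider and its trusted execution environment, can recover the plaintext:

```python
import json
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def encrypt_record(key: bytes, record: dict) -> bytes:
    """Run by the data provider before handing data to the miner."""
    nonce = os.urandom(12)                    # unique nonce per record
    plaintext = json.dumps(record).encode()
    return nonce + AESGCM(key).encrypt(nonce, plaintext, None)

def decrypt_record(key: bytes, blob: bytes) -> dict:
    """To be executed only inside the provider's trusted execution environment."""
    nonce, ciphertext = blob[:12], blob[12:]
    return json.loads(AESGCM(key).decrypt(nonce, ciphertext, None))

key = AESGCM.generate_key(bit_length=128)     # known only to the provider
blob = encrypt_record(key, {"x1": 35.0, "x2": 1.0, "x3": 0.6})
assert decrypt_record(key, blob) == {"x1": 35.0, "x2": 1.0, "x3": 0.6}
```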
In specific implementation, the data miner can request data from a data provider via network transmission, with the data provider encrypting the data and then sending the ciphertext to the data miner over the network; in another embodiment, the ciphertext data may also be stored in a storage device of the data miner in advance, so that the data miner reads the ciphertext directly from local storage.

S102: respectively inputting the ciphertext data of each data provider into the trusted execution environment of that data provider;

The data miner creates a trusted execution environment E_1, E_2, … E_U for each data provider 1, 2, … U respectively, to ensure that data-security-related operations on the data provided by any data provider u (u = 1, 2, … U) can only be performed in its corresponding E_u; outside E_u, these operations can neither be perceived nor influenced.
Under different implementation schemes, specific ways of creating the trusted execution environment are different, and the embodiments of the present specification do not limit the specific ways of creating the trusted execution environment. In addition, the operation of creating the trusted execution environment by the data mining party can be executed after ciphertext data of the data providing party is obtained for the first time, or can be executed in advance.
Each trusted execution environment provides an input interface and an output interface to the outside; one function of the input interface is to receive ciphertext data from outside. After the data miner obtains the ciphertext data of each data provider u, it determines the corresponding E_u and inputs the ciphertext data into E_u through the ciphertext data input interface.

S103: obtaining the output values O_1, O_2, … O_U of the trusted execution environments;

Inside each E_u, the input ciphertext data is first decrypted to obtain the plaintext data X_u; then, according to a preset algorithm and the internal local model parameters W_u, the corresponding output value O_u is calculated from X_u, and O_u is output to the outside of E_u through one of E_u's output interfaces. In this way, the data miner can respectively obtain the output values O_1, O_2, … O_U of the multiple trusted execution environments.

To fully describe the implementation of the system, this embodiment first describes the overall processing flow of data model training from the perspective of the data miner. Since the operations inside E_u are not visible to the outside, each E_u can be regarded as a black box in this embodiment; the specific implementation inside each E_u will be described in detail in later embodiments.

S104: calculating the deviation value Δ = Y - h[z(O_1, O_2, … O_U)] according to a given training target model;

The deviation value is an important quantity to be calculated in each model training iteration. Suppose a data sample i is of the form (X_i, y_i), wherein:

X_i = (x_i1, x_i2, …), where x_i1, x_i2, … are the feature values of data sample i;

y_i is the label value of data sample i.

Assuming the training target model is of the form y = h(X), then for the data sample (X_i, y_i), the prediction deviation value equals the difference between the label value y_i and the model predicted value h(X_i), namely:

Δ_i = h(X_i) - y_i or Δ_i = y_i - h(X_i)
The deviation value Δ_i plays two main roles in model training:

On one hand, it is used to evaluate the fitting effect of the model on the training sample set: for any data sample i, the smaller Δ_i is, the better the model fits; if there are n groups of data samples, the smaller the n values of Δ_i are as a whole, the better the fitting effect. In practical applications, generally speaking, Σ_i Δ_i is calculated to evaluate the fitting effect of the function on the training sample set as a whole.

On the other hand, it participates in the iterative update of the model parameters: assume there is a set of model parameters W = (w_1, w_2, …); the basic form of the iterative parameter update (various variations are possible in practical applications) is as follows:

W ← W - αΔX
in the whole model training process, model parameters are continuously updated in an iterative manner, so that the fitting effect of the model on the training sample set meets the training requirement (for example, the deviation value is small enough). The following is a brief description of the parameter update formula, and for the specific derivation process of the parameter update formula, reference may be made to the description of the prior art.
In the above update formula, "W" on the right side of the arrow indicates the parameter value before each update, and "W" on the left side of the arrow indicates the parameter value after each update, and it can be seen that the change amount of each update is the product of α, Δ, and X.
α represents the learning rate, also called the step size, which determines the update amplitude of the parameters in each iteration. If the learning rate is too small, the process of meeting the training requirement may be slow; if it is too large, the update may overshoot the minimum, that is, the model cannot approach a good fit as the updates proceed. As for how to select an appropriate learning rate, reference may be made to the prior art; in the embodiments of this specification, α is regarded as a preset value.
X represents the characteristic value of the data sample, and may represent different forms of the characteristic value according to the selected update formula, which will be further exemplified in the embodiments later in this specification.
In the embodiments of the present specification, the data miner can respectively obtain the output values O_1, O_2, … O_U of the multiple trusted execution environments. Let y = h(z) be the global training target model function, where z is a function of O_1, O_2, … O_U, denoted z(O_1, O_2, … O_U), i.e. a union function of the output values of the U data providers. Since O_1, O_2, … O_U are in turn functions of X_1, X_2, … X_U respectively, it can be seen that y = h(z) is also a function of X_1, X_2, … X_U.

Define Δ = Y - h[z(O_1, O_2, … O_U)] or Δ = h[z(O_1, O_2, … O_U)] - Y;

wherein h[z(O_1, O_2, … O_U)] is the model predicted value for z(O_1, O_2, … O_U); Y is the true value corresponding to z(O_1, O_2, … O_U), namely the global label value determined according to the data of each data provider; and the difference Δ between them is the deviation value.
The actual form of h(z) can be selected according to the training requirement, for example a linear regression model or a logistic regression model; the embodiments in this specification do not limit it.

In addition, for each group O_1, O_2, … O_U, the corresponding global label value Y may be determined in a variety of ways, as will be described in detail in later embodiments.

S105: respectively returning Δ to E_1, E_2, … E_U, so that E_1, E_2, … E_U respectively update the local model parameters W_1, W_2, … W_U. The updated parameters are used to calculate the output values O_u in the next iteration.
In the above, from the perspective of the data mining party, the overall processing flow of the data model training is introduced, and the processing logic inside the trusted execution environment is specifically described as follows:
As shown in FIG. 2, any trusted execution environment E_u implements 3 basic functions internally:
1) Data decryption:

Corresponding to the encryption logic used by data provider u, the matching decryption logic is stored in E_u, such as decryption algorithm information and key information. Based on this information, the input ciphertext data can be decrypted inside E_u to obtain the plaintext data feature values X_u = (x_1, x_2, …)^u.

The data decryption operation is performed after S102.
2) Calculating an output value:
Local model parameters W_u = (w_1, w_2, …)^u are stored in E_u; inside E_u, the output value O_u corresponding to X_u can be calculated according to the current local model parameters W_u. During the whole training process, W_u is continuously updated iteratively, and initialized parameter values are used in the first iterative calculation.

The specific way O_u is calculated depends on the global model y = h(z) = h[z(O_1, O_2, … O_U)]. For example, for both the linear regression and the logistic regression model, the global model can be expressed as y = h(z) = h(w_1x_1 + w_2x_2 + …); then the corresponding O_u and the union function z(O_1, O_2, … O_U) may take the form:

O_u = w_1^u x_1^u + w_2^u x_2^u + …, noted as (w_1x_1 + w_2x_2 + …)^u

z(O_1, O_2, … O_U) = O_1 + O_2 + … + O_U
In practice, the above O_u may also include a constant term parameter b_u, namely:

O_u = b_u + w_1^u x_1^u + w_2^u x_2^u + …

In fact, if we let b_u = w_0^u, understand w_0^u as the parameter corresponding to a feature x_0^u, and let x_0^u be constantly equal to 1, the expression of O_u can be written as:

O_u = w_0^u x_0^u + w_1^u x_1^u + w_2^u x_2^u + …

It can be seen that the form of the overall expression is uniform regardless of whether the constant term parameter is present; therefore the expression O_u = w_1^u x_1^u + w_2^u x_2^u + … should be understood to cover both the case with a constant term parameter and the case without one. In practical applications, for any u, the model parameters may or may not include the constant term parameter.
Of course, the above forms of O_u and of the union function z(O_1, O_2, … O_U) are merely illustrative and should not be construed as limiting the embodiments of the present invention.
The output value calculation operation is performed after the data decryption operation, and after the output value is calculated, S104 is continuously performed.
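A small sketch of this output-value calculation, with the constant term folded in as w_0 via a constant feature x_0 = 1 as described above (the parameter and feature values are invented):

```python
def output_value(w, x):
    """O_u = w_0*x_0 + w_1*x_1 + ... with x_0 = 1 folding in the bias b_u."""
    x = [1.0] + list(x)              # prepend the constant feature x_0
    assert len(w) == len(x)
    return sum(wj * xj for wj, xj in zip(w, x))

# W_u = (b_u, w_1, w_2) = (0.5, 2.0, -1.0) and X_u = (3.0, 4.0):
print(output_value([0.5, 2.0, -1.0], [3.0, 4.0]))   # 0.5 + 6.0 - 4.0 = 2.5
```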
3) Updating parameters:
Current local model parameters W_u are stored inside each E_u. After the deviation value Δ returned from outside E_u is received, W_u is updated according to a parameter update formula of the form W_u ← W_u - αΔX_u (initialized parameter values are used before the first update). Of course, the parameter update formula actually used is not limited to the above form. For example:

If 1 piece of data i is read from the data source and input to E_u each time as the training sample, the parameter update formula is: W_u ← W_u - αΔ_i X_i^u;

If multiple pieces of data are read from the data source and input to E_u each time as training samples, and the gradient descent method (gradient descent) is used for parameter updating, the parameter update formula is:

W_u ← W_u - α Σ_i Δ_i X_i^u

that is, all training samples participate in the update operation;

If multiple pieces of data are read from the data source and input to E_u each time as training samples, and the stochastic gradient descent method (stochastic gradient descent) is used for parameter updating, the parameter update formula is: W_u ← W_u - αΔ_i X_i^u, where i is an arbitrary value, that is, a training sample is randomly selected to participate in the update operation;

The above update algorithms are for illustration only and should not be construed as limiting the scheme. For example, to reduce the over-fitting phenomenon, a regularization correction term may be added to the update formula. Other update algorithms are also available; this specification does not enumerate them exhaustively. A brief code sketch of these variants is given after this subsection.
The parameter updating operation is executed after the above step S105. After the parameters are updated, one iteration is complete, and the updated parameters are used to calculate the output value O_u in the next iteration.
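The update variants above admit the following sketch (assumptions: `samples` is a list of (X_i, Δ_i) pairs already decrypted inside E_u, the features include the folded-in constant term, and `lam` is a hypothetical regularization weight):

```python
import random

def update_single(w, x_i, delta_i, alpha):
    """W_u <- W_u - alpha * Δ_i * X_i^u: one sample per update."""
    return [wj - alpha * delta_i * xj for wj, xj in zip(w, x_i)]

def update_batch(w, samples, alpha):
    """Gradient descent: all N samples participate in each update."""
    grad = [sum(d * x[j] for x, d in samples) for j in range(len(w))]
    return [wj - alpha * gj for wj, gj in zip(w, grad)]

def update_sgd(w, samples, alpha):
    """Stochastic gradient descent: one randomly chosen sample per update."""
    x_i, delta_i = random.choice(samples)
    return update_single(w, x_i, delta_i, alpha)

def update_batch_l2(w, samples, alpha, lam):
    """Batch update with an added L2 regularization correction term."""
    grad = [sum(d * x[j] for x, d in samples) + lam * w[j]
            for j in range(len(w))]
    return [wj - alpha * gj for wj, gj in zip(w, grad)]
```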
A complete iteration process has been introduced above. Iteration proceeds through the above steps until the model training requirement is met, which may be, for example: the deviation value Δ of the global model is small enough; the difference between the Δ values of two adjacent iterations is small enough; the difference between the O_u values calculated in two successive iterations inside E_u is small enough; or a preset number of iterations is reached; and so on. Of course, an additional validation set may also be used for verification; this specification does not limit the specific model training requirement.
It can be seen that, by applying the above scheme, the operations related to private data in the model training process (such as data decryption and local model parameter updating) are all encapsulated in the trusted execution environments of the data providers for execution. That is to say, the data plaintext cannot be acquired outside the trusted execution environment, and in some specific embodiments even the specific local model parameters cannot be acquired outside it, so the data security of the shared-data providers is effectively guaranteed.
The model training scheme based on shared data provided by the embodiments of the present specification has been introduced above as a whole. In terms of detailed design, combined with practical application requirements, there are several alternative implementations, for example:

In S101 to S102, either one piece of data or multiple pieces of data may be read into the trusted execution environment at a time. In each iteration, N pieces of ciphertext data are respectively obtained from each data provider, where N may be a preset value not less than 1, and the training samples are varied by obtaining data with different contents each time.
In addition, it can be understood that the acquisition of the training sample data may be performed successively along with the iterative process, or may be performed once. For example, if the number of data required for each iteration is N, N pieces of data may be obtained in each iteration and decrypted in the trusted execution environment; or the data with the quantity larger than N (for example, the full quantity of data, or a multiple of N, etc.) is acquired at one time and then input into the trusted execution environment, and N pieces of data are decrypted at the trusted execution environment as required each time; the method can also be implemented by inputting the data with the quantity larger than N (such as full quantity data, or multiple of N, etc.) into the trusted execution environment at one time, and decrypting the input data at the trusted execution environment at one time; and so on.
It can be seen that the steps that must be executed in every iteration are S103 to S105 together with the "output value calculation" and "parameter update" steps inside the trusted execution environment, while S101 and S102 and the "data decryption" step inside the trusted execution environment do not have to be executed in every iteration. In a word, the manner of acquiring sample data can be set flexibly according to the actual situation, and it does not affect the implementation of the whole scheme.
The association of data among multiple data providers can be realized through some common identification features, such as identification numbers, which ensure that the data obtained from multiple data providers describe the same person. The identification feature does not need to participate in model training, and its security can be improved by hashing or similar means, as in the sketch below.
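For instance, a join key could be derived by hashing the identification number, so that records can be aligned without exchanging the raw identifier (a sketch only; in practice a salted or keyed hash agreed among the providers would be needed to resist dictionary attacks):

```python
import hashlib

def join_key(id_number: str, salt: bytes = b"agreed-shared-salt") -> str:
    """Derive an alignment key; the raw ID number itself is never shared."""
    return hashlib.sha256(salt + id_number.encode()).hexdigest()

# Two providers compute identical keys for the same person, so their
# records can be associated without revealing the identification number.
assert join_key("110101199001011234") == join_key("110101199001011234")
```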
Each E_u is created based on information provided by the corresponding data provider itself. The overall design of each E_u should meet the basic design criteria, but the specific implementations need not be exactly the same; for example, different data decryption algorithms, different parameter update algorithms, and so on may be employed.
For each group O_1, O_2, … O_U, the global label value Y may be determined in a number of ways, for example:

1) a label value Y_u provided by a certain data provider u is taken as Y;

2) Y is determined from the label values Y_u1, Y_u2, … provided by a plurality of data providers u1, u2, …, for example by calculating a weighted average, a logical AND, a logical OR, and so on;

3) Y is determined through channels other than the data providers.

For example, suppose a prediction model is to be established for the prevalence of a disease known to be associated with 5 features (age, occupation, sex, height, weight), and:

Institution 1 can provide the feature data: age, occupation;

Institution 2 can provide the feature data: sex, height, weight.

The prediction model is assumed to be a binary classification model, i.e. the model output value is either "diseased" or "not diseased" (the corresponding prediction results may be presented as "high risk" and "low risk"). Along with the feature data it provides, each institution may or may not be able to provide a label value, i.e. the "diseased or not" result. Accordingly, there may be various strategies for determining the global label value, such as:

The global label value follows the label value provided by a certain institution; in practice this may be because that institution is more authoritative, or because the other institution cannot provide a label value.

The global label value is determined jointly from the label values provided by the two institutions, for example: if the label value provided by at least one institution is "diseased", the global label value is determined to be "diseased".

In addition, in some cases the data miner may directly know the "diseased or not" result of a group of users from other channels, and the requirement is to further mine the relationship between this result and other features; the "other features" can be obtained from the data providers, and the previously known result can be directly used as the global label value.
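The label-determination strategies above could be sketched as follows (the function names, weights, and the 0/1 encoding of "diseased" are illustrative assumptions):

```python
def label_from_authority(labels: dict, authority: str) -> int:
    """Strategy 1: the global label follows one designated provider."""
    return labels[authority]

def label_logical_or(labels: dict) -> int:
    """Strategy 2, logical OR: 'diseased' (1) if any provider reports 1."""
    return int(any(labels.values()))

def label_weighted(labels: dict, weights: dict, threshold: float = 0.5) -> int:
    """Strategy 2, weighted average of provider labels against a threshold."""
    score = (sum(weights[u] * y for u, y in labels.items())
             / sum(weights.values()))
    return int(score >= threshold)

labels = {"institution1": 1, "institution2": 0}
print(label_logical_or(labels))                                            # 1
print(label_weighted(labels, {"institution1": 0.7, "institution2": 0.3}))  # 1
```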
After training is finished, each E_u can output the parameters obtained in the last update to the data miner, so that the data miner maintains a complete data model. Alternatively, the parameters can be kept stored in distributed fashion in the respective E_u, which further improves security.
If the scheme in which each E_u keeps its parameters inside E_u is adopted, then in the model using stage each data provider uploads ciphertext data to its E_u, E_u decrypts the ciphertext data and calculates an output value, and the data miner finally calculates the output result of the global model according to the outputs of the E_u. FIG. 3 illustrates a data prediction method based on shared data modeling, which may include the following steps:

S201: respectively obtaining ciphertext data provided by U data providers, where U ≧ 2;

S202: respectively inputting the ciphertext data of each data provider into the corresponding trusted execution environment E_1, E_2, … E_U;

S203: obtaining the output values O_1, O_2, … O_U of the trusted execution environments;

S204: calculating the predicted value Y = h[z(O_1, O_2, … O_U)] according to the pre-trained prediction model;
Comparing FIG. 2 and FIG. 3, it can be seen that the model using stage still adopts a system architecture similar to that of the model training stage, except that no iterative parameter update is required; the predicted result value y is output directly from the input data.
Accordingly, for any trusted execution environment E_u, the local model parameters W_u are stored in advance, and in the model using stage it implements 2 basic functions:

1) decrypting the input ciphertext data to obtain the plaintext data feature values X_u;

2) calculating the output value O_u corresponding to X_u according to the current local model parameters W_u.
In the model using phase, specific implementation of each step may refer to a corresponding step in the model training phase, and a description of this embodiment is not repeated.
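Reusing the hypothetical Enclave, z, and h from the earlier training sketch, the prediction stage S201 to S204 reduces to a loop with no parameter update:

```python
def predict(enclaves, ciphertexts):
    """S201-S204: ciphertext in, O_u out of each E_u, then Y = h[z(...)]."""
    for e, c in zip(enclaves, ciphertexts):
        e.load_ciphertext(c)                        # S202: decrypt inside E_u
    outputs = [e.output_value() for e in enclaves]  # S203: collect O_1 ... O_U
    return h(z(outputs))                            # S204: predicted value
```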
In the above embodiments, an implementation scheme in which 2 or more data providers jointly provide data for joint model training has been introduced. It can be understood that other adaptations can be made on this basis to meet the application requirements of some specific scenarios, for example:

When only 1 data provider provides data to the data miner and has a confidentiality requirement on the data, model training can be realized with the following scheme:

The data miner performs iterative training using the following steps until the model training requirement is met:

S101′: obtaining ciphertext data provided by the 1 data provider;

S102′: inputting the ciphertext data of the data provider into the trusted execution environment E of that data provider;

S103′: obtaining the output value O of the trusted execution environment E;

S104′: calculating the deviation value Δ between the model predicted value and the true value according to a given training target model;

S105′: returning the deviation value Δ to the trusted execution environment E, so that the trusted execution environment updates the model parameters.

This embodiment can be applied to scenarios in which a certain data provider entrusts a data miner to perform data mining but does not want to reveal data details to the data miner.
Compared with S101 to S105, S101′ to S105′ are the case where the U data providers are simplified to 1 data provider; the other implementation details are basically the same and will not be repeated in this embodiment. Inside the trusted execution environment E, the three functions of data decryption, output value calculation, and parameter update are still implemented.

When there are multiple data providers providing data to the data miner and some of them have no confidentiality requirement for their data, model training can be implemented with the following scheme:

The data miner performs iterative training using the following steps until the model training requirement is met:

S101″: respectively obtaining data provided by U data providers (where U ≧ 2), wherein the data provided by at least 1 data provider is ciphertext data and the data provided by the other data providers is plaintext data;

S102″: if the data provided by a data provider u is in ciphertext form, inputting the ciphertext data into the trusted execution environment E_u of that provider; here u designates a data provider with data confidentiality requirements;

S103″: obtaining the output value O_u of each trusted execution environment;

S104″: calculating the deviation value Δ between the model predicted value and the true value by using the output values O_u of the trusted execution environments E_u and the plaintext data provided by the other data providers;

The difference between this step and S104 is: for a provider supplying plaintext data, the data miner can directly obtain the corresponding plaintext data to participate in the global calculation, without going through a trusted execution environment.

S105″: respectively returning the deviation value to each trusted execution environment, so that each trusted execution environment respectively updates its local model parameters;

The difference between this step and S105 is: for providers supplying plaintext data, the data miner can directly be responsible for maintaining and updating the corresponding local model parameters.
In contrast to S101 to S105, S101″ to S105″ divide the U data providers into two categories: for a data provider without data confidentiality requirements, the data miner can directly acquire the plaintext data it provides for model training; for a data provider with data confidentiality requirements, the ciphertext data it provides still needs to be processed through a trusted execution environment, inside which the three functions of data decryption, output value calculation, and parameter update are implemented.
The scheme of this embodiment is suitable for scenarios in which some of the data features required by the global model have no confidentiality requirement. Of course, from the data privacy perspective, "no confidentiality requirement" here is generally not meant in an absolute sense, but rather means no confidentiality requirement with respect to the data miner. For example, if a data provider has a deep cooperative relationship with the data miner, or if the data miner itself owns data that can participate in the global model training (the data miner can then be considered one of the data providers), such data without confidentiality requirements can be used directly by the data miner without going through a trusted execution environment.
The scheme of the embodiments of the present specification is described below with reference to a specific example.

Assume the overall training requirement is: establish a model for predicting whether a user has the ability to repay a large loan on schedule, according to the user property data provided by two banking institutions.

Bank 1 can provide the data features x_1, x_2, x_3;

Bank 2 can provide the data features x_4, x_5.
The global modeling uses a logistic regression model, of the functional form:

y = h(z) = 1 / (1 + e^(-z))   (1)

wherein:

z = w_1x_1 + w_2x_2 + w_3x_3 + w_4x_4 + w_5x_5   (2)

w_1, w_2, w_3 are the local parameters of bank 1, and w_4, w_5 are the local parameters of bank 2.

Define:

sum1 = w_1x_1 + w_2x_2 + w_3x_3   (3)

sum2 = w_4x_4 + w_5x_5   (4)

Then, according to equations (1) to (4), the deviation value calculation function of the global model can be obtained:

Δ = Y - 1 / (1 + e^(-(sum1 + sum2)))   (5)
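As a quick numeric check of equation (5), with invented values sum1 = 1.1, sum2 = -0.3 and global label Y = 1:

```python
import math

sum1, sum2, Y = 1.1, -0.3, 1                           # invented illustration values
prediction = 1.0 / (1.0 + math.exp(-(sum1 + sum2)))    # h(z), equations (1)-(2)
delta = Y - prediction                                 # equation (5)
print(round(prediction, 3), round(delta, 3))           # 0.69 0.31
```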
the trusted execution environment is realized by adopting the SGX technology of Intel, the created trusted execution environment is called enclave, specifically, the method is to encapsulate the security operation of legal software in one enclave, and once the software and data are located in the enclave, even an operating system or a VMM (Hypervisor) cannot influence the code and data in the enclave. The security boundary of enclave contains only the CPU and itself.
The overall architecture of the system implementing model training is shown in FIG. 4a; the following describes the implementation from the perspectives of the data providers and the data miner:
1) The data provider:

Each bank encrypts the data it needs to provide to the data miner, and the encrypted data can be stored in a hard disk of the data provider. Of course, depending on the actual application requirements, some parts of the data may also be provided in plaintext.

Each bank respectively provides an enclave definition file (.edl) and a corresponding dynamic link library (.dll or .so); the generated enclave includes the following functions or interfaces:
1.1) Decrypt the ciphertext data that is input from outside the enclave and was encrypted in advance by the bank, obtaining the plaintext data. N pieces of ciphertext data are input in each iteration, each piece corresponding to one user; for any user i, the plaintext data of bank 1 is x_i1, x_i2, x_i3, and the plaintext data of bank 2 is x_i4, x_i5.

1.2) According to the current local parameter values, respectively calculate the output value of each piece of data and output it to the outside of the enclave. For any user i, the output value of bank 1 is sum1_i and the output value of bank 2 is sum2_i.
1.3) According to the Δ_i returned from outside the enclave, update the local parameters using the gradient descent method; all N pieces of data participate in the operation in each iteration, and the update formula is:

W ← W - α Σ_i Δ_i X_i   (6)

namely:

w_1 ← w_1 - α Σ_i Δ_i x_i1

w_2 ← w_2 - α Σ_i Δ_i x_i2

w_3 ← w_3 - α Σ_i Δ_i x_i3

w_4 ← w_4 - α Σ_i Δ_i x_i4

w_5 ← w_5 - α Σ_i Δ_i x_i5

wherein

Δ_i = Y_i - 1 / (1 + e^(-(sum1_i + sum2_i)))   (7)
α is a preset learning rate; the α adopted by bank 1 and by bank 2 can be the same or different.
2) The data miner:

The data miner first unifies a global label value Y, which represents whether a user who has already taken out a large loan can repay it on schedule. This information may be obtained from the two banks or from other lending institutions.
The data miner loads the enclave information provided by the two banks, creates enclave1 and enclave2 respectively, and builds a model training application based on enclave1 and enclave2. The running mechanism of the application is as follows:
2.1) In each iteration, read a batch of ciphertext data from the hard disk; assume the number of pieces read each time is N. The data of the two banks can be read in association through the identification card number. The ciphertext data of bank 1 is input into enclave1, and the ciphertext data of bank 2 is input into enclave2.
2.2) Decrypt the ciphertext data in enclave1 and enclave2 respectively; according to the current local parameters (using the initial parameter values in the first iteration), calculate sum1_i and sum2_i using equations (3) and (4) respectively, and output them to the outside.

2.3) According to the sum1_i and sum2_i output by enclave1 and enclave2, calculate Δ_i from equation (7), and return Δ_i to enclave1 and enclave2 respectively;
2.4) within enclave1 and enclave2, the parameters are updated using equation (6), respectively.
Repeat the iteration until the model training condition is met, obtaining the final parameter values w_1, w_2, w_3, w_4, w_5; substituting these values into equations (1) and (2) yields the model to be trained.
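Steps 2.1) to 2.4) could be driven by a loop of the following shape (a sketch only: `read_batch` and the enclave objects are hypothetical stand-ins, with `output_values` returning the per-user sum1_i or sum2_i and `update` applying equation (6) inside the enclave):

```python
import math

def train(enclave1, enclave2, n_iterations, batch_size):
    for _ in range(n_iterations):
        # 2.1) read N associated ciphertext records (joined via ID number)
        batch1, batch2, labels = read_batch(batch_size)  # hypothetical I/O
        enclave1.load_ciphertext(batch1)
        enclave2.load_ciphertext(batch2)
        # 2.2) decrypt inside each enclave; compute sum1_i / sum2_i, eqs (3)-(4)
        sum1 = enclave1.output_values()
        sum2 = enclave2.output_values()
        # 2.3) Δ_i = Y_i - 1 / (1 + e^-(sum1_i + sum2_i)), equation (7)
        deltas = [y - 1.0 / (1.0 + math.exp(-(s1 + s2)))
                  for y, s1, s2 in zip(labels, sum1, sum2)]
        # 2.4) each enclave applies W <- W - α Σ_i Δ_i X_i, equation (6)
        enclave1.update(deltas)
        enclave2.update(deltas)
```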
FIG. 4b shows another overall architecture of a system implementing model training. The corresponding overall training requirement is: the data miner itself owns some user property data, and a joint model for predicting whether a user has the ability to repay a large loan needs to be established according to the property data provided by bank 1 together with the data the miner owns, wherein:

Bank 1 can provide the data features x_1, x_2, x_3; the corresponding local parameters are w_1, w_2, w_3.

The data miner owns the data features x_4, x_5; the corresponding local parameters are w_4, w_5.

Compared with the previous embodiment, the overall model training idea is basically the same; the difference is that an enclave is created only for bank 1. For the features x_4, x_5, the data miner can directly read the plaintext data to participate in the model training calculation.
Corresponding to the above method embodiment, an embodiment of the present specification further provides a shared data-based model training apparatus, and as shown in fig. 5, the apparatus may include the following modules for implementing iterative training:
a data obtaining module 110, configured to obtain ciphertext data provided by at least 1 data provider;
the data input module 120 is configured to respectively input the ciphertext data of each data provider to a trusted execution environment of the data provider;
an output value obtaining module 130, configured to obtain an output value of each trusted execution environment, where the output value is obtained through calculation according to the ciphertext data;
the deviation value calculation module 140 is used for calculating a deviation value between the model predicted value and the real value according to a given training target model; the model predicted value is determined according to the output value of each trusted execution environment, and the real value is a global tag value determined according to the data of each data provider;
an offset value returning module 150, configured to return the offset value to each trusted execution environment respectively, so that each trusted execution environment updates the local model parameter respectively;
wherein, arbitrary trusted execution environment includes internally:
the decryption submodule is used for decrypting the input ciphertext data to obtain a plaintext data characteristic value;
the output value operator module is used for calculating an output value corresponding to the characteristic value of the plaintext data according to the current local model parameter;
and the parameter updating submodule is used for updating the local model parameters according to the returned deviation value.
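The three submodules can be pictured together as a single in-enclave object, as in the illustrative grouping below; the class layout, names, and the injected `decrypt_fn` are hypothetical, and the code merely restates the decrypt / weighted-sum / parameter-update behavior just described.

```python
import numpy as np

class EnclaveModules:
    """Illustrative grouping of the three submodules; names are hypothetical."""

    def __init__(self, W, alpha, decrypt_fn):
        self.W = np.asarray(W, dtype=float)   # current local model parameters
        self.alpha = alpha                    # preset learning rate
        self.decrypt_fn = decrypt_fn          # decryption available only in-enclave
        self._X = None

    def decryption_submodule(self, cipher_batch):
        # decrypt the input ciphertext data into plaintext feature values
        self._X = np.array([self.decrypt_fn(c) for c in cipher_batch])

    def output_value_submodule(self):
        # output value of each record under the current local model parameters
        return self._X @ self.W

    def parameter_update_submodule(self, deltas):
        # update the local model parameters from the returned deviation values
        self.W = self.W - self.alpha * (deltas @ self._X)
```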
In one embodiment provided in this specification, when there are a plurality of data providers providing data to the data miner and some of those data providers have no secrecy requirement for their data, the module functions of the above apparatus may be configured as follows:
the data obtaining module 110 is configured to obtain data provided by multiple data providers, where at least 1 data provider provides data in a ciphertext data form, and other data providers provide data in a plaintext data form;
the data input module 120 is configured to, when the data provided by a data provider is in ciphertext form, input the ciphertext data into the corresponding trusted execution environment of that data provider;
an output value obtaining module 130, configured to obtain an output value of each trusted execution environment, where the output value is obtained through calculation according to the ciphertext data;
the deviation value calculation module 140 is configured to calculate a deviation value between the model predicted value and the real value by using the output value of each trusted execution environment and plaintext data provided by another data provider; the model prediction value is determined according to the output value of each trusted execution environment and the characteristic value of plaintext data; the real value is a global label value determined according to the data of each data provider;
an offset value returning module 150, configured to return the offset value to each trusted execution environment respectively, so that each trusted execution environment updates the local model parameter respectively;
wherein, arbitrary trusted execution environment includes internally:
the decryption submodule is used for decrypting the input ciphertext data to obtain a plaintext data characteristic value;
the output value operator module is used for calculating an output value corresponding to the characteristic value of the plaintext data according to the current local model parameter;
and the parameter updating submodule is used for updating the local model parameters according to the returned deviation value.
Referring to fig. 6, an embodiment of the present specification further provides a data prediction apparatus based on shared data modeling, where the apparatus may include:
a data obtaining module 210, configured to obtain ciphertext data provided by at least 1 data provider;
the data input module 220 is configured to respectively input the ciphertext data of each data provider to a trusted execution environment of the data provider;
an output value obtaining module 230, configured to obtain an output value of each trusted execution environment, where the output value is obtained through calculation according to the ciphertext data;
the predicted value calculation module 240 is configured to input the output value of each trusted execution environment into a pre-trained prediction model, and calculate to obtain a predicted value;
wherein the inside of any trusted execution environment E_u includes:
the decryption submodule is used for decrypting the input ciphertext data to obtain a plaintext data characteristic value;
and the output value operator module is used for calculating an output value corresponding to the characteristic value of the plaintext data according to the current local model parameter.
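The prediction flow of this apparatus can be sketched as below, reusing the hypothetical `sigmoid` and enclave wrappers from the training sketches; treating the pre-trained prediction model as the logistic model of the embodiments is an assumption.

```python
def predict(enclaves, cipher_batches):
    """Prediction with a trained model; enclave wrappers as in the sketches
    above, and the logistic form of the prediction model is assumed."""
    # each enclave decrypts its own ciphertext and returns only its weighted
    # sum; the data miner merely combines the outputs
    total = sum(e.compute(c) for e, c in zip(enclaves, cipher_batches))
    return sigmoid(total)   # predicted value, e.g. repayment probability
```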
Embodiments of the present specification further provide a computer device, which at least includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor can implement the aforementioned model training method or data prediction method when executing the program.
Fig. 7 is a more specific hardware structure diagram of a computing device provided in an embodiment of the present specification, where the device may include: a processor 1010, a memory 1020, an input/output interface 1030, a communication interface 1040, and a bus 1050. Wherein the processor 1010, memory 1020, input/output interface 1030, and communication interface 1040 are communicatively coupled to each other within the device via bus 1050.
The processor 1010 may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits, and is configured to execute related programs to implement the technical solutions provided in the embodiments of the present disclosure.
The Memory 1020 may be implemented in the form of a ROM (Read Only Memory), a RAM (Random Access Memory), a static storage device, a dynamic storage device, or the like. The memory 1020 may store an operating system and other application programs, and when the technical solution provided by the embodiments of the present specification is implemented by software or firmware, the relevant program codes are stored in the memory 1020 and called to be executed by the processor 1010.
The input/output interface 1030 is used for connecting an input/output module to input and output information. The i/o module may be configured as a component in a device (not shown) or may be external to the device to provide a corresponding function. The input devices may include a keyboard, a mouse, a touch screen, a microphone, various sensors, etc., and the output devices may include a display, a speaker, a vibrator, an indicator light, etc.
The communication interface 1040 is used for connecting a communication module (not shown in the drawings) to implement communication interaction between the present apparatus and other apparatuses. The communication module can realize communication in a wired mode (such as USB, network cable and the like) and also can realize communication in a wireless mode (such as mobile network, WIFI, Bluetooth and the like).
Bus 1050 includes a path that transfers information between various components of the device, such as processor 1010, memory 1020, input/output interface 1030, and communication interface 1040.
It should be noted that although the above-mentioned device only shows the processor 1010, the memory 1020, the input/output interface 1030, the communication interface 1040 and the bus 1050, in a specific implementation, the device may also include other components necessary for normal operation. In addition, those skilled in the art will appreciate that the above-described apparatus may also include only those components necessary to implement the embodiments of the present description, and not necessarily all of the components shown in the figures.
Embodiments of the present specification also provide a computer readable storage medium, on which a computer program is stored, which when executed by a processor implements the aforementioned model training method or data prediction method:
computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, a computer readable medium does not include a transitory computer readable medium such as a modulated data signal and a carrier wave.
From the above description of the embodiments, it is clear to those skilled in the art that the embodiments of the present disclosure can be implemented by software plus necessary general hardware platform. Based on such understanding, the technical solutions of the embodiments of the present specification may be essentially or partially implemented in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments of the present specification.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. A typical implementation device is a computer, which may take the form of a personal computer, laptop computer, cellular telephone, camera phone, smart phone, personal digital assistant, media player, navigation device, email messaging device, game console, tablet computer, wearable device, or a combination of any of these devices.
The embodiments in the present specification are described in a progressive manner; for the same and similar parts among the embodiments, reference may be made to one another, and each embodiment focuses on its differences from the other embodiments. In particular, the apparatus embodiment is described relatively simply since it is substantially similar to the method embodiment, and reference may be made to the relevant descriptions of the method embodiment. The above-described apparatus embodiments are merely illustrative; the modules described as separate components may or may not be physically separate, and when implementing the embodiments of the present disclosure, the functions of the modules may be implemented in one or more pieces of software and/or hardware. Part or all of the modules can also be selected according to actual needs to achieve the purpose of the embodiment's scheme. Those of ordinary skill in the art can understand and implement the embodiments without inventive effort.
The foregoing describes only specific embodiments of the present specification. It should be noted that those skilled in the art can make various modifications and refinements without departing from the principle of the embodiments of the present disclosure, and such modifications and refinements should also be regarded as falling within the protection scope of the embodiments of the present disclosure.

Claims (42)

1. A method of model training based on shared data, the method comprising:
performing iterative training by using the following steps until the model training requirement is met:
obtaining ciphertext data provided by a plurality of data providers;
respectively inputting the ciphertext data of each data provider into a trusted execution environment of the data provider;
obtaining an output value of each trusted execution environment, wherein the output value is obtained by calculation according to the ciphertext data;
calculating a deviation value between a model predicted value and a true value according to a given training target model; the model predicted value is determined according to the output value of each trusted execution environment, and the real value is a global tag value determined according to the data of each data provider;
respectively returning the deviation values to the trusted execution environments, so that the trusted execution environments respectively update the local model parameters;
wherein the following steps are performed inside any trusted execution environment:
decrypting the input ciphertext data to obtain a plaintext data characteristic value;
calculating an output value corresponding to the characteristic value of the plaintext data according to the current local model parameter;
and updating the local model parameters according to the returned deviation value.
2. The method according to claim 1, wherein the calculating an output value corresponding to the plaintext data feature value according to the current local model parameter comprises:
according to the current local model parameters W_u = (w_1, w_2, …)_u, calculating a weighted sum O_u = (w_1x_1 + w_2x_2 + …)_u of the plaintext data feature values, wherein u represents the identification of the data provider and x represents a data feature value.
3. The method of claim 1, the training target model being a logistic regression model.
4. The method of claim 1, the trusted execution environment being: enclave created using software protection extension SGX technology.
5. The method of claim 1, said updating local model parameters based on returned bias values, comprising:
updating local model parameters by using a gradient descent method according to the returned deviation value; or
And updating local model parameters by using a random gradient descent method according to the returned deviation value.
6. The method of claim 1, wherein the global tag value is determined from a tag value provided by one data provider or jointly determined from tag values provided by a plurality of data providers.
7. A method of model training based on shared data, the method comprising:
respectively inputting ciphertext data of a data provider to a trusted execution environment of the data provider, wherein the number of the data providers is multiple; in each trusted execution environment, respectively decrypting the input ciphertext data to obtain each plaintext data characteristic value;
performing iterative training by using the following steps until the model training requirement is met:
in each trusted execution environment, calculating an output value corresponding to a plaintext data characteristic value according to a current local model parameter;
calculating a deviation value between a model predicted value and a true value according to a given training target model; the model predicted value is determined according to the output value of each trusted execution environment, and the real value is a global tag value determined according to the data of each data provider;
respectively returning the deviation values to the trusted execution environments;
and in each trusted execution environment, updating local model parameters according to the returned deviation value.
8. The method according to claim 7, wherein calculating the output value corresponding to the plaintext data feature value according to the current local model parameter comprises:
according to the current local model parameters W_u = (w_1, w_2, …)_u, calculating a weighted sum O_u = (w_1x_1 + w_2x_2 + …)_u of the plaintext data feature values, wherein u represents the identification of the data provider and x represents a data feature value.
9. The method of claim 7, the training target model being a logistic regression model.
10. The method of claim 7, the trusted execution environment being: enclave created using software protection extension SGX technology.
11. The method of claim 7, wherein updating local model parameters based on the returned bias values comprises:
updating local model parameters by using a gradient descent method according to the returned deviation value; or
And updating local model parameters by using a random gradient descent method according to the returned deviation value.
12. The method of claim 7, wherein the global tag value is determined from a tag value provided by one data provider or jointly determined from tag values provided by a plurality of data providers.
13. A method of model training based on shared data, the method comprising:
performing iterative training by using the following steps until the model training requirement is met:
respectively obtaining data provided by a plurality of data providers, wherein the data provided by at least 1 data provider is ciphertext data, and the data provided by other data providers is plaintext data;
if the data form provided by the data provider is ciphertext data, inputting the ciphertext data into a trusted execution environment of the data provider correspondingly;
obtaining an output value of each trusted execution environment, wherein the output value is obtained by calculation according to the ciphertext data;
calculating deviation values of the model predicted values and the real values by utilizing the output values of the trusted execution environments and plaintext data provided by other data providers; the model prediction value is determined according to the output value of each trusted execution environment and the characteristic value of plaintext data; the real value is a global label value determined according to the data of each data provider;
respectively returning the deviation values to the trusted execution environments, so that the trusted execution environments respectively update the local model parameters;
wherein the following steps are performed inside any trusted execution environment:
decrypting the input ciphertext data to obtain a plaintext data characteristic value;
calculating an output value corresponding to the characteristic value of the plaintext data according to the current local model parameter;
and updating the local model parameters according to the returned deviation value.
14. The method according to claim 13, wherein calculating the output value corresponding to the plaintext data feature value according to the current local model parameter comprises:
according to the current local model parameters W_u = (w_1, w_2, …)_u, calculating a weighted sum O_u = (w_1x_1 + w_2x_2 + …)_u of the plaintext data feature values, wherein u represents the identification of the data provider and x represents a data feature value.
15. The method of claim 13, the training target model is a logistic regression model.
16. The method of claim 13, the trusted execution environment being: enclave created using software protection extension SGX technology.
17. The method of claim 13, wherein updating local model parameters based on the returned bias values comprises:
updating local model parameters by using a gradient descent method according to the returned deviation value; or
And updating local model parameters by using a random gradient descent method according to the returned deviation value.
18. The method of claim 13, wherein the global tag value is determined from a tag value provided by one data provider or jointly determined from tag values provided by a plurality of data providers.
19. A method of data prediction based on shared data modeling, the method comprising:
obtaining ciphertext data provided by a plurality of data providers;
respectively inputting the ciphertext data of each data provider into a trusted execution environment of the data provider;
obtaining an output value of each trusted execution environment, wherein the output value is obtained by calculation according to the ciphertext data;
inputting the output value of each trusted execution environment into a pre-trained prediction model, and calculating to obtain a predicted value;
wherein the following steps are performed inside any trusted execution environment:
decrypting the input ciphertext data to obtain a plaintext data characteristic value;
and calculating an output value corresponding to the characteristic value of the plaintext data according to the current local model parameter.
20. The method of claim 19, wherein calculating the output value corresponding to the plaintext data feature value according to the current local model parameter comprises:
according to the current local model parameters W_u = (w_1, w_2, …)_u, calculating a weighted sum O_u = (w_1x_1 + w_2x_2 + …)_u of the plaintext data feature values, wherein u represents the identification of the data provider and x represents a data feature value.
21. The method of claim 19, the training target model is a logistic regression model.
22. The method of claim 19, the trusted execution environment being: enclave created using software protection extension SGX technology.
23. A shared data based model training apparatus, the apparatus comprising the following means for performing iterative training:
the data acquisition module is used for acquiring ciphertext data provided by data providers, and the number of the data providers is multiple;
the data input module is used for correspondingly inputting the ciphertext data of each data provider into the trusted execution environment of the data provider;
an output value obtaining module, configured to obtain an output value of each trusted execution environment, where the output value is obtained through calculation according to the ciphertext data;
the deviation value calculation module is used for calculating the deviation value between the model predicted value and the actual value according to a given training target model; the model predicted value is determined according to the output value of each trusted execution environment, and the real value is a global tag value determined according to the data of each data provider;
the deviation value returning module is used for returning the deviation values to the trusted execution environments respectively so that the trusted execution environments update the local model parameters respectively;
wherein, arbitrary trusted execution environment includes internally:
the decryption submodule is used for decrypting the input ciphertext data to obtain a plaintext data characteristic value;
the output value operator module is used for calculating an output value corresponding to the characteristic value of the plaintext data according to the current local model parameter;
and the parameter updating submodule is used for updating the local model parameters according to the returned deviation value.
24. The apparatus of claim 23, wherein the output value calculation module is specifically configured to:
according to the current local model parameters W_u = (w_1, w_2, …)_u, calculating a weighted sum O_u = (w_1x_1 + w_2x_2 + …)_u of the plaintext data feature values, wherein u represents the identification of the data provider and x represents a data feature value.
25. The apparatus of claim 23, the training target model is a logistic regression model.
26. The apparatus of claim 23, the trusted execution environment being: enclave created using software protection extension SGX technology.
27. The apparatus according to claim 23, wherein the parameter update sub-module is specifically configured to:
updating local model parameters by using a gradient descent method according to the returned deviation value; or
And updating local model parameters by using a random gradient descent method according to the returned deviation value.
28. The apparatus of claim 23, wherein the global tag value is determined from a tag value provided by one data provider or jointly determined from tag values provided by a plurality of data providers.
29. A shared data based model training apparatus, the apparatus comprising the following means for performing iterative training:
the data acquisition module is used for respectively acquiring data provided by a plurality of data providers, wherein the data provided by at least 1 data provider is in a ciphertext data form, and the data provided by other data providers is in a plaintext data form;
the data input module is used for correspondingly inputting the ciphertext data into a trusted execution environment of a data provider if the data provided by the data provider is ciphertext data;
an output value obtaining module, configured to obtain an output value of each trusted execution environment, where the output value is obtained through calculation according to the ciphertext data;
the deviation value calculation module is used for calculating the deviation value between the model predicted value and the real value by utilizing the output value of each trusted execution environment and plaintext data provided by other data providers; the model prediction value is determined according to the output value of each trusted execution environment and the characteristic value of plaintext data; the real value is a global label value determined according to the data of each data provider;
the deviation value returning module is used for returning the deviation values to the trusted execution environments respectively so that the trusted execution environments update the local model parameters respectively;
wherein, arbitrary trusted execution environment includes internally:
the decryption submodule is used for decrypting the input ciphertext data to obtain a plaintext data characteristic value;
the output value operator module is used for calculating an output value corresponding to the characteristic value of the plaintext data according to the current local model parameter;
and the parameter updating submodule is used for updating the local model parameters according to the returned deviation value.
30. The apparatus of claim 29, wherein the output value calculation module is specifically configured to:
according to the current local model parameters W_u = (w_1, w_2, …)_u, calculating a weighted sum O_u = (w_1x_1 + w_2x_2 + …)_u of the plaintext data feature values, wherein u represents the identification of the data provider and x represents a data feature value.
31. The apparatus of claim 29, the training target model is a logistic regression model.
32. The apparatus of claim 29, the trusted execution environment being: enclave created using software protection extension SGX technology.
33. The apparatus according to claim 29, wherein the parameter update sub-module is specifically configured to:
updating local model parameters by using a gradient descent method according to the returned deviation value; or
And updating local model parameters by using a random gradient descent method according to the returned deviation value.
34. The apparatus of claim 29, wherein the global tag value is determined from a tag value provided by one data provider or jointly determined from tag values provided by a plurality of data providers.
35. A data prediction apparatus based on shared data modeling, the apparatus comprising:
the data acquisition module is used for acquiring ciphertext data provided by data providers, and the number of the data providers is multiple;
the data input module is used for correspondingly inputting the ciphertext data of each data provider into the trusted execution environment of the data provider;
an output value obtaining module, configured to obtain an output value of each trusted execution environment, where the output value is obtained through calculation according to the ciphertext data;
the predicted value calculation module is used for inputting the output value of each trusted execution environment into a pre-trained prediction model and calculating to obtain a predicted value;
wherein the inside of any trusted execution environment E_u includes:
the decryption submodule is used for decrypting the input ciphertext data to obtain a plaintext data characteristic value;
and the output value operator module is used for calculating an output value corresponding to the characteristic value of the plaintext data according to the current local model parameter.
36. The apparatus of claim 35, wherein the output value obtaining module is specifically configured to:
according to the current local model parameters W_u = (w_1, w_2, …)_u, calculating a weighted sum O_u = (w_1x_1 + w_2x_2 + …)_u of the plaintext data feature values.
37. The apparatus of claim 35, the training target model is a logistic regression model.
38. The apparatus of claim 35, the trusted execution environment being: enclave created using software protection extension SGX technology.
39. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of any of claims 1 to 6 when executing the program.
40. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of any of claims 7 to 12 when executing the program.
41. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of any of claims 13 to 18 when executing the program.
42. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of any one of claims 19 to 22 when executing the program.
CN201710632357.5A 2017-07-28 2017-07-28 Model training method and device based on shared data Active CN109308418B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710632357.5A CN109308418B (en) 2017-07-28 2017-07-28 Model training method and device based on shared data

Publications (2)

Publication Number Publication Date
CN109308418A CN109308418A (en) 2019-02-05
CN109308418B true CN109308418B (en) 2021-09-24

Family

ID=65205429

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710632357.5A Active CN109308418B (en) 2017-07-28 2017-07-28 Model training method and device based on shared data

Country Status (1)

Country Link
CN (1) CN109308418B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105261358A (en) * 2014-07-17 2016-01-20 中国科学院声学研究所 N-gram grammar model constructing method for voice identification and voice identification system
CN106664563A (en) * 2014-08-29 2017-05-10 英特尔公司 Pairing computing devices according to a multi-level security protocol
CN105989441A (en) * 2015-02-11 2016-10-05 阿里巴巴集团控股有限公司 Model parameter adjustment method and device
CN104732247A (en) * 2015-03-09 2015-06-24 北京工业大学 Human face feature positioning method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20191211

Address after: P.O. Box 31119, grand exhibition hall, hibiscus street, 802 West Bay Road, Grand Cayman, Cayman Islands

Applicant after: Innovative advanced technology Co., Ltd

Address before: A four-storey 847 mailbox in Grand Cayman Capital Building, British Cayman Islands

Applicant before: Alibaba Group Holding Co., Ltd.

REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40004192

Country of ref document: HK

GR01 Patent grant