WO2021147487A1 - Data processing method and device - Google Patents

Data processing method and device

Info

Publication number: WO2021147487A1
Application number: PCT/CN2020/129124
Authority: WO (WIPO, PCT)
Prior art keywords: data, common, common cutting, participant, feature
Other languages: English (en), French (fr)
Inventors: 衣志昊, 程勇, 刘洋, 陈天健
Original assignee: 深圳前海微众银行股份有限公司
Application filed by 深圳前海微众银行股份有限公司
Publication of WO2021147487A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/2433 Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04 Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange

Definitions

  • the present invention relates to the technical field of financial technology (Fintech), in particular to a data processing method and device.
  • Abnormal data detection is a commonly used data processing method in the financial field.
  • the detection model is used to detect abnormal transaction data in massive transaction data, which helps operation and maintenance personnel discover abnormal transaction scenarios in time and improves stability in the financial sector.
  • devices of each participant can usually only use their locally stored data to be trained to train a detection model.
  • Because a detection model trained on a single participant's device cannot reflect the data characteristics of other participants' devices, it can only accurately detect the abnormal data of that participant's device; the abnormal data of other participants' devices cannot be detected, or may be misjudged, resulting in lower detection accuracy of abnormal data.
  • the present invention provides a data processing method and device, which are used for training to obtain a general detection model, so as to detect abnormal data of each participant's equipment, thereby improving the accuracy of abnormal data detection.
  • the present invention provides a data processing method applied to a federated server. The method includes: the federated server combines the data to be trained of each participant's device to determine the common cutting feature of each participant's device at a common cutting point, constructs a detection model based on the common cutting feature at the common cutting point, and sends the detection model to each participant's device, so that each participant's device uses the detection model to detect the data to be detected and determine whether the data to be detected is abnormal data.
  • the common cut feature is a feature that distinguishes abnormal data from normal data.
  • the common cutting feature may include a common cutting feature dimension and a common cutting feature value.
  • the federation server combines the data to be trained of each participant's device to determine the common cutting feature of each participant's device at the common cutting point, including: the federation server determines the common cutting feature dimension at the common cutting point according to the feature dimensions of the data to be trained of each participant's device at the common cutting point, and sends the common cutting feature dimension at the common cutting point to each participant device, so that the participant device determines, based on the common cutting feature dimension at the common cutting point, the cutting feature value of the participant's device in that common cutting feature dimension.
  • the federation server receives the cutting feature value of each participant's device in the common cutting feature dimension reported by each participant's device, and determines the common cutting feature value according to the cutting feature values of each participant's device in the common cutting feature dimension.
  • the federation server determines the common cutting feature dimension at the common cutting point according to the feature dimensions of the data to be trained at the common cutting point of each participant's device, including: the federation server determines the common feature dimensions of the data to be trained at the common cutting point of each participant's device according to those feature dimensions, and selects the common cutting feature dimension from the common feature dimensions.
  • the federation server constructs a detection model based on the common cutting feature at the common cutting point, including: the federation server associates any common cutting point with the common cutting feature at that common cutting point, then connects the common cutting points according to the inclusion relationship of the data to be trained at each common cutting point of each participant's device to obtain a binary tree model, and uses the binary tree model as the detection model.
  • the federated server combines the data to be trained of each participant's device to determine the common cutting feature of each participant's device at the common cutting point, including: the federated server combines the data to be trained of each participant's device in any one training, determines the common cutting feature of each participant's device at the common cutting point corresponding to that training, and then constructs a detection model corresponding to that training based on the common cutting feature at the common cutting point.
  • the detection model is delivered to each participant device, including: the federation server sends the detection model corresponding to each training to each participant device, so that each participant device uses the detection model corresponding to each training to detect the data to be detected and determine whether the data to be detected is abnormal data.
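The server-side flow claimed above can be sketched as follows. This is an illustrative sketch, not the patent's implementation: the function names are invented, and averaging is only one possible way to merge the reported cutting feature values, since the claims do not fix a concrete aggregation rule.

```python
import random

def choose_common_cut_dimension(candidate_dims, used_dims):
    """Pick a common cutting feature dimension that has not yet been used
    at another common cutting point (illustrative selection rule)."""
    unused = [d for d in candidate_dims if d not in used_dims]
    return random.choice(unused) if unused else None

def aggregate_cut_values(reported_values):
    """Merge the cutting feature values reported by each participant device
    into a single common cutting feature value (here: a plain average,
    which is an assumption for illustration)."""
    return sum(reported_values) / len(reported_values)
```

For example, given candidate dimensions `["amount", "age"]` with `"amount"` already used, the server would select `"age"` and then combine the values each participant reports for that dimension.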
  • the present invention provides a data processing method applied to a participant device.
  • the method includes: the participant device receives a detection model sent by a federated server, and uses the detection model to detect the data to be detected to determine whether the data to be detected is abnormal data.
  • the detection model is constructed by the federation server based on the common cutting feature at the common cutting point, where the federation server combines the data to be trained of each participant's device to determine the common cutting feature of each participant's device at the common cutting point; the common cutting feature is used to distinguish abnormal data from normal data.
  • before the participant device receives the detection model sent by the federation server, it also receives the common cutting feature dimension at the common cutting point issued by the federation server, determines the cutting feature value of the participant device in the common cutting feature dimension based on that dimension, and reports this cutting feature value to the federation server, so that the federation server can determine the common cutting feature value according to the cutting feature values of each participant device in the common cutting feature dimension.
  • the common cutting feature dimension at the common cutting point is determined by the federation server according to the feature dimensions of the data to be trained at the common cutting point of each participant's device.
  • the participant device uses the detection model to detect the data to be detected to determine whether the data to be detected is abnormal data, including: the participant device uses the common cutting feature at each common cutting point to cut the data to be detected and determines the common cutting point to which the data to be detected is finally cut. If the weight corresponding to that final common cutting point is greater than a first preset threshold, the data to be detected is determined to be abnormal data; otherwise it is determined to be normal data.
  • the participant device receiving the detection model sent by the federation server includes: the participant device receiving the detection model corresponding to each training sent by the federation server.
  • the participant device uses the detection model to detect the data to be detected to determine whether the data to be detected is abnormal data, including: for the detection model corresponding to any one training, the participant device uses the common cutting feature at each common cutting point in that model to cut the data to be detected and determines the common cutting point to which the data is finally cut in that model; it then calculates the average weight corresponding to the final common cutting points across the detection models of all trainings. If the average weight is greater than a second preset threshold, the data to be detected is determined to be abnormal data; otherwise it is determined to be normal data.
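The participant-side detection just described can be sketched as follows, assuming a dict-based binary tree whose internal nodes hold a cutting dimension and cutting value and whose leaves hold weights; the tree layout and field names are illustrative, not from the patent.

```python
def detect(trees, x, threshold):
    """Cut x with each training's detection model (a binary tree of common
    cutting points), read the weight of the final common cutting point it
    reaches, and average these weights across models; x is judged abnormal
    if the average weight exceeds the threshold."""
    weights = []
    for tree in trees:
        node = tree
        while node.get("left") is not None:  # internal common cutting point
            node = node["left"] if x[node["dim"]] < node["value"] else node["right"]
        weights.append(node["weight"])       # leaf common cutting point
    avg = sum(weights) / len(weights)
    return avg > threshold
```

With a single tree, this reduces to the single-model case above: the weight of the final common cutting point is compared directly against the threshold.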
  • the present invention provides a data processing device, the device includes: a determination module, used to combine the data to be trained of each participant's device to determine the common cutting feature of each participant's device at a common cutting point; a building module, used to construct a detection model based on the common cutting feature at the common cutting point; and a transceiver module, used to deliver the detection model to each participant's device, so that each participant's device uses the detection model to detect the data to be detected and determine whether the data to be detected is abnormal data.
  • the common cut feature is a feature that distinguishes abnormal data from normal data.
  • the common cutting feature may include a common cutting feature dimension and a common cutting feature value.
  • the determining module is specifically configured to: determine the common cutting feature dimension at the common cutting point according to the feature dimensions of the data to be trained at the common cutting point of each participant's device, and issue the common cutting feature dimension at the common cutting point to each participant device, so that the participant device determines its cutting feature value in the common cutting feature dimension based on the common cutting feature dimension at the common cutting point.
  • the determining module receives the cutting feature value of each participant device in the common cutting feature dimension reported by each participant device, and determines the common cutting feature value according to those cutting feature values.
  • the determining module is specifically configured to: determine the common feature dimensions of the data to be trained at the common cutting point of each participant device according to the feature dimensions of the data to be trained at the common cutting point of each participant device, and select the common cutting feature dimension from the common feature dimensions.
  • the building module is specifically configured to: associate any common cutting point with the common cutting feature at that common cutting point, and connect the common cutting points according to the inclusion relationship of the data to be trained at each common cutting point of each participant's device to obtain a binary tree model, using the binary tree model as the detection model.
  • the determining module is specifically configured to: combine the data to be trained of each participant's device in any one training to determine the common cutting feature of each participant's device at the common cutting point corresponding to that training, and construct the detection model corresponding to that training based on the common cutting feature at the common cutting point.
  • the transceiver module is specifically configured to: send the detection model corresponding to each training to each participant device, so that each participant device uses the detection model corresponding to each training to detect the data to be detected and determine whether it is abnormal data.
  • the present invention provides a data processing device, which includes: a transceiver module for receiving a detection model sent by a federated server; and a detection module for using the detection model to detect the data to be detected to determine whether the data to be detected is abnormal data.
  • the detection model is constructed by the federation server based on the common cutting feature at the common cutting point, where the federation server combines the data to be trained of each participant's device to determine the common cutting feature of each participant's device at the common cutting point; the common cutting feature is used to distinguish abnormal data from normal data.
  • before the transceiver module receives the detection model sent by the federation server, it also receives the common cutting feature dimension at the common cutting point issued by the federation server, determines the cutting feature value of the participant's device in the common cutting feature dimension based on that dimension, and reports this cutting feature value to the federation server, so that the federation server determines the common cutting feature value according to the cutting feature values of each participant's device in the common cutting feature dimension.
  • the common cutting feature dimension at the common cutting point is determined by the federation server according to the feature dimensions of the data to be trained at the common cutting point of each participant's device.
  • the detection module is specifically configured to: use the common cutting features at each common cutting point to cut the data to be detected, and determine the common cutting point to which the data to be detected is finally cut. If the weight corresponding to the common cutting point is greater than the first preset threshold, it is determined that the data to be detected is abnormal data; otherwise, the data to be detected is determined to be normal data.
  • the transceiver module is specifically configured to: receive the detection model corresponding to each training sent by the federation server.
  • the detection module is specifically configured to: for the detection model corresponding to any one training, use the common cutting feature at each common cutting point in that model to cut the data to be detected and determine the common cutting point to which the data is finally cut in that model; then calculate the average weight corresponding to the final common cutting points across the detection models of all trainings. If the average weight is greater than the second preset threshold, the data to be detected is determined to be abnormal data; otherwise it is determined to be normal data.
  • the present invention provides a computing device including at least one processing unit and at least one storage unit, wherein the storage unit stores a computer program which, when executed by the processing unit, causes the processing unit to execute any method described in the first or second aspect above.
  • the present invention provides a computer-readable storage medium that stores a computer program executable by a computing device. When the program runs on the computing device, the computing device executes any of the methods described in the first or second aspect above.
  • the common cutting feature at the common cutting point is determined by combining the to-be-trained data of each participant's device, so that the common cutting feature can simultaneously reflect the data characteristics of each participant's device.
  • the detection model constructed based on the common cutting feature can accurately detect the abnormal data of each participant's device.
  • the detection model has good versatility and high detection accuracy.
  • FIG. 1 is a schematic diagram of a suitable system architecture provided by an embodiment of the present invention
  • FIG. 2 is a schematic flowchart corresponding to a data processing method provided by an embodiment of the present invention
  • FIG. 3 is a schematic diagram of a process method for determining a common cutting feature according to an embodiment of the present invention
  • Figure 4 is a schematic diagram of the distribution of the data to be trained at each common cutting point of the participant's equipment
  • FIG. 5 is a schematic flowchart of a method for determining the next common cutting point according to an embodiment of the present invention.
  • FIG. 6 is a schematic structural diagram of a detection model provided by an embodiment of the present invention.
  • FIG. 7 is a schematic structural diagram of a data processing device provided by an embodiment of the present invention.
  • FIG. 8 is a schematic structural diagram of another data processing device provided by an embodiment of the present invention.
  • FIG. 9 is a schematic structural diagram of a terminal device according to an embodiment of the present invention.
  • FIG. 10 is a schematic structural diagram of a back-end device provided by an embodiment of the present invention.
  • FIG. 1 is a schematic diagram of an applicable system architecture provided by an embodiment of the present invention.
  • the system architecture may include a federation server 110 and at least two participant devices, such as participant device 121, participant device 122, and participant device 123.
  • the federation server 110 may be connected to each participant's device, for example, it may be connected in a wired manner, or may be connected in a wireless manner, which is not specifically limited.
  • FIG. 2 is a schematic diagram of the interaction flow corresponding to a data processing method provided by an embodiment of the present invention.
  • the method is applicable to a federation server and participant devices, such as the federation server 110 shown in FIG. 1 and any participant device, such as participant device 121, participant device 122, or participant device 123.
  • the method includes:
  • Step 201: The federation server combines the data to be trained of each participant's device to determine the common cutting feature of each participant's device at a common cutting point.
  • the common cutting point is a unified cutting node determined jointly with each participant's device when cutting the data to be trained of each participant's device.
  • Each participant's device has its own data to be trained at the common cutting point.
  • the federation server can combine the data to be trained at the common cutting point of each participant's device to determine the common cutting feature at the common cutting point, and then deliver it to each participant's device.
  • any participant device uses the common cutting feature at the common cutting point to cut its data to be trained at that common cutting point into the subsequent common cutting points. The above process is performed cyclically until no further cutting is possible, whereby all common cutting points and the common cutting features at each common cutting point are obtained.
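The cyclic cutting process above can be sketched as a recursive routine. This is a single-machine illustration that omits the federated message exchange, and the leaf weight 1/(depth+1) is an assumed scoring rule (data isolated early receives a higher weight); the embodiment does not specify the weight formula.

```python
import random

def build_cut_tree(data, dims, depth=0, max_depth=8):
    """Recursively cut the data: choose a cutting feature dimension and a
    cutting value between that column's min and max, split the data, and
    recurse until no further cutting is possible (one record left, all
    values equal, or max depth reached)."""
    if len(data) <= 1 or depth >= max_depth:
        return {"left": None, "weight": 1.0 / (depth + 1)}  # leaf common cutting point
    dim = random.choice(dims)
    col = [row[dim] for row in data]
    lo, hi = min(col), max(col)
    if lo == hi:  # all values equal: cannot cut further in this dimension
        return {"left": None, "weight": 1.0 / (depth + 1)}
    cut = random.uniform(lo, hi)
    return {
        "dim": dim, "value": cut,
        "left": build_cut_tree([r for r in data if r[dim] < cut], dims, depth + 1, max_depth),
        "right": build_cut_tree([r for r in data if r[dim] >= cut], dims, depth + 1, max_depth),
    }
```

The resulting dict-based binary tree corresponds to the detection model described below: internal nodes carry the common cutting feature, and leaves carry the weights used at detection time.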
  • the common cut feature is a feature that distinguishes abnormal data from normal data. Normal data and abnormal data are defined relative to the whole set of data to be trained: normal data refers to data whose features are similar to those of most of the data to be trained, and abnormal data refers to data whose features differ greatly from those of most of the data to be trained.
  • Step 202: The federation server constructs a detection model according to the common cutting feature at the common cutting point.
  • Step 203: The federation server delivers the detection model to each participant's device.
  • each participant device uses the detection model to detect the data to be detected to determine whether the data to be detected is abnormal data.
  • the common cutting feature at the common cutting point is determined by combining the to-be-trained data of each participant's device, so that the common cutting feature can simultaneously reflect the data characteristics of each participant's device.
  • the detection model constructed based on the common cutting feature can accurately detect the abnormal data of each participant's device; the detection model therefore has good versatility and a high detection accuracy.
  • the federation server can determine the common cutting feature of each participant's device at any common cutting point in the manner shown in FIG. 3 below.
  • the common cutting feature may include a common cutting feature dimension and a common cutting feature value.
  • FIG. 3 is a schematic diagram of a process method for determining a common cutting feature at any common cutting point according to an embodiment of the present invention.
  • the method is applicable to a federation server and participant devices, such as the federation server 110 shown in FIG. 1 and any participant device, such as participant device 121, participant device 122, or participant device 123.
  • the method includes:
  • Step 301: The federation server determines the common cutting feature dimension at the common cutting point according to the feature dimensions of the data to be trained at the common cutting point of each participant's device.
  • In specific implementation, the federation server 110 may first determine the common feature dimensions of the data to be trained at the common cutting point of each participant device according to the feature dimensions of the data to be trained at the common cutting point, and then select the common cutting feature dimension at the common cutting point from the common feature dimensions. There are many ways to obtain the feature dimensions of the data to be trained at the common cutting point of each participant device. For example, the federation server 110 may send a dimension acquisition request carrying the identifier of the common cutting point to each participant device, so that each participant device can determine and report the feature dimensions of its data to be trained at the common cutting point according to the request. Alternatively, each participant device may report the feature dimensions of its data to be trained at the common cutting point periodically or in real time, which is not limited.
  • the common cutting feature dimensions at any two common cutting points can be different.
  • the federation server 110 may first obtain the feature dimensions of the data to be trained on each participant's device, and then use the common feature dimensions of the data to be trained on each participant's device to construct a common feature dimension set. In this way, for any common cutting point, the federation server 110 may first determine from the common feature dimension set the common feature dimensions that differ from the common cutting feature dimensions at other common cutting points, and then select one of them as the common cutting feature dimension of each participant's device at the common cutting point.
  • the common feature dimension set can have the following possible situations:
  • Case 1: Only the common feature dimensions that differ from the common cutting feature dimensions at other common cutting points are stored in the common feature dimension set.
  • the federation server 110 can directly select a common feature dimension at random from the common feature dimension set, and use it as the common cutting feature dimension of each participant at the common cutting point.
  • Correspondingly, after the cutting at the common cutting point ends, the federated server 110 may delete the common cutting feature dimension of each participant at the common cutting point from the common feature dimension set, so that the set only stores common feature dimensions that differ from the common cutting feature dimensions at other common cutting points.
  • Case 2: The common feature dimension set stores all common feature dimensions and their states; the state of any common feature dimension is selected or unselected. The selected state indicates that the common feature dimension is already the common cutting feature dimension at another common cutting point, and the unselected state indicates that it is not.
  • In specific implementation, the federation server 110 may first determine the state of each common feature dimension in the set, and then randomly select a common feature dimension from those whose state is unselected, taking it as the common cutting feature dimension at the common cutting point. Correspondingly, after the cutting at the common cutting point ends, the federation server 110 can update the state of that dimension in the set to selected, so that the state of each common feature dimension in the set is updated in real time and the accuracy of determining the common cutting feature dimension is guaranteed.
  • In the above manner, each common cutting point can be cut using a different feature dimension, so that the data to be trained can be cut more evenly using the data features of each feature dimension, improving the accuracy of the common cutting feature at each common cutting point.
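Case 2 above can be sketched as follows, assuming the dimension states are kept in a plain dict (an illustrative data structure, not from the patent):

```python
import random

def pick_cut_dimension(dim_states):
    """dim_states maps each common feature dimension to 'selected' or
    'unselected'. Randomly pick an unselected dimension as the common
    cutting feature dimension and mark it selected, so that a different
    dimension is used at each common cutting point."""
    unselected = [d for d, s in dim_states.items() if s == "unselected"]
    if not unselected:
        return None  # every common feature dimension has already been used
    dim = random.choice(unselected)
    dim_states[dim] = "selected"
    return dim
```

Marking the dimension selected in place mirrors the real-time state update described above; Case 1 would instead delete the chosen dimension from the set.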
  • Step 302: The federation server issues the common cutting feature dimension at the common cutting point to each participant device.
  • Step 303: The participant device determines the cutting feature value of the participant device in the common cutting feature dimension based on the common cutting feature dimension at the common cutting point.
  • the common cutting point can be any common cutting point in the 1st to Nth level common cutting layers. The common cutting point in the 1st level common cutting layer is the root common cutting point, and the data to be trained of the participant device at the root common cutting point includes all data to be trained in the model training of the participant's device.
  • A common cutting point in the i-th level common cutting layer (1 < i ≤ N, i an integer) is an intermediate common cutting point or a leaf common cutting point. Any intermediate common cutting point in the i-th level common cutting layer is connected to at least one common cutting point in the (i+1)-th level common cutting layer, and the data to be trained of the participant device at any intermediate common cutting point in the i-th level includes its data to be trained at all common cutting points in the (i+1)-th level connected to that intermediate common cutting point. A leaf common cutting point in the i-th level common cutting layer is not connected to any common cutting point in the (i+1)-th level common cutting layer.
  • Fig. 4 is a schematic diagram of the distribution of the data to be trained at each common cutting point of the participant's device.
  • The root common cutting point A1 is set in the first-level common cutting layer, and the data to be trained at the root common cutting point A1 includes all the data to be trained of the participant's device, namely the data to be trained a1, a2, a3, a4, a5, a6 and a7.
  • The root common cutting point A1 connects the intermediate common cutting point A21 and the leaf common cutting point A22 in the second-level common cutting layer.
  • The data to be trained of the participant's device at the intermediate common cutting point A21 includes the data to be trained a1, a3, a4, a5, a6 and a7.
  • The intermediate common cutting point A21 connects the leaf common cutting point A31 and the leaf common cutting point A32 in the third-level common cutting layer.
  • The data to be trained of the participant's device at the leaf common cutting point A31 includes the data to be trained a1, a4 and a7, and at the leaf common cutting point A32 includes the data to be trained a3, a5 and a6.
  • In specific implementation, the participant device can first obtain its data to be trained at the common cutting point, and then determine, according to the common cutting feature dimension at the common cutting point, each feature value of that data in the common cutting feature dimension. If the common cutting feature dimension is a feature dimension with discrete feature values, the participant device can randomly select one feature value from the feature values of the data to be trained in that dimension as the participant device's cutting feature value at the common cutting point.
• If the common cutting feature dimension is a feature dimension with continuous feature values, the participant device can randomly select an intermediate feature value between the maximum and minimum feature values of the data to be trained under the common cutting feature dimension as the cutting feature value of the participant's equipment at the common cutting point.
• the method of selecting the intermediate feature value can be set by those skilled in the art based on experience.
• For example, the intermediate feature value can be selected randomly, the average of the maximum and minimum feature values can be used as the intermediate feature value, or a weighted average of the maximum and minimum feature values can be used as the intermediate feature value, which is not specifically limited.
  • Table 1 is a schematic table of the data to be trained at the common cutting point of the participant's equipment.
• the data to be trained at the common cutting point A21 of the participant's equipment includes the data to be trained a1, a3, a4, a5, a6 and a7.
• the feature dimensions of the participant's equipment at the common cutting point A21 include consumption amount, purchase time, age, and shopping category.
• For the consumption amount, the participant device can first query Table 1 to determine each feature value of the data to be trained at the common cutting point A21 under the consumption amount, namely 210, 600, 53, 1000, 860 and 100. Further, since the consumption amount is a feature dimension with continuous feature values, the participant device can first determine that the maximum consumption amount of the data to be trained at the common cutting point A21 is 1000 and the minimum is 53, and then randomly select a consumption amount from [53, 1000], such as 520, as the cutting feature value of the participant's device at the common cutting point A21.
• For the shopping category, the participant device can first query Table 1 to determine the feature values of the data to be trained at the common cutting point A21 under the shopping category, namely heater, furniture, snacks, game console, washing machine and clothes. Further, since the shopping category is a feature dimension with discrete feature values, the participant device can randomly select one feature value from these feature values, such as game console, as the cutting feature value of the participant device at the common cutting point A21.
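The two selection rules above can be sketched as follows. This is an illustrative Python sketch, not the patent's implementation; the function names are invented, and the Table 1 values are copied from the example:

```python
import random

# Feature values of the data to be trained at common cutting point A21 (Table 1).
consumption = [210, 600, 53, 1000, 860, 100]                  # continuous dimension
categories = ["heater", "furniture", "snacks",
              "game console", "washing machine", "clothes"]   # discrete dimension

def pick_cut_value_continuous(values, rng=random):
    """Randomly pick an intermediate value between the min and max feature values."""
    return rng.uniform(min(values), max(values))

def pick_cut_value_discrete(values, rng=random):
    """Randomly pick one of the observed discrete feature values."""
    return rng.choice(values)

cut_amount = pick_cut_value_continuous(consumption)   # somewhere in [53, 1000]
cut_category = pick_cut_value_discrete(categories)
```

Averaging or weighted averaging of the minimum and maximum, as mentioned above, would replace the `uniform` draw with a deterministic combination.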
  • step 304 the participant device reports the cutting feature value of the participant device in the common cutting feature dimension to the federation server.
  • the federation server determines the common cutting characteristic value according to the cutting characteristic value of each participant's device in the common cutting characteristic dimension.
• the federation server 110 may determine the common cutting feature value in a variety of ways. For example, it can randomly select one of the reported cutting feature values as the common cutting feature value, use the average of the cutting feature values as the common cutting feature value, or use a weighted average of the cutting feature values as the common cutting feature value, which is not limited.
  • step 306 the federation server uses the common cutting feature dimension and the common cutting feature value as the common cutting feature at the common cutting point.
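The three aggregation options the server may use (random pick, average, weighted average) can be sketched as follows; the function name, the `mode` flag, and the example values are assumptions for illustration only:

```python
import random

def aggregate_cut_values(values, mode="mean", weights=None, rng=random):
    """Combine the cutting feature values reported by the participant devices."""
    if mode == "random":                      # pick one reported value at random
        return rng.choice(values)
    if mode == "mean":                        # plain average of all reported values
        return sum(values) / len(values)
    if mode == "weighted":                    # weighted average with given weights
        return sum(v * w for v, w in zip(values, weights)) / sum(weights)
    raise ValueError(f"unknown mode: {mode}")

reported = [520, 480, 610]                    # e.g. values from three devices
common_value = aggregate_cut_values(reported, mode="mean")
```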
• In the foregoing process, the participant device only needs to report the cutting feature value to the federation server, without reporting the data to be trained, thereby protecting the security of the data to be trained in the participant device, reducing the amount of transmitted data, and improving training efficiency.
• Since the cutting feature value is generated from the feature values of the data to be trained in the participant device under the common cutting feature dimension, the cutting feature value can accurately reflect the data characteristics of the data to be trained in the participant device.
• Therefore, the detection model trained based on the common cutting feature value, which is determined from the cutting feature values reported by each participant's device, can reflect the data characteristics of the data to be trained in each participant's device.
• As a result, the detection model has good versatility and high detection accuracy.
  • FIG. 5 is a schematic flowchart of a method for determining the next common cutting point according to an embodiment of the present invention.
• the method is applicable to a federation server and participant devices, such as the federation server 110 and any participant device shown in Fig. 1,
• for example, the participant device 121, the participant device 122, or the participant device 123. As shown in Fig. 5, the method includes:
  • step 501 the federation server delivers the common cut feature value at the common cut point to each participant's device.
• the federation server 110 may directly deliver the common cutting feature value at the common cutting point to each participant's device, or it may first encrypt the common cutting feature value at the common cutting point and then deliver the encrypted cutting feature value to each participant's device to ensure the security of data transmission, which is not specifically limited.
  • step 502 the participant device uses the common cutting feature value at the common cutting point to cut the to-be-trained data of the participant device at the common cutting point to obtain a cutting result.
• In an optional embodiment, the common cutting feature value can be used directly to cut the data to be trained of the participant's equipment at the common cutting point, and after the cutting is completed, the cutting result is confirmed as a successful cut.
• For example, when the common cutting point is the common cutting point A21, the participant's device can use the consumption amount 500 to cut the data to be trained a1, a3, a4, a5, a6 and a7 at the common cutting point A21.
• Since the consumption amounts of the data to be trained a1 (210), a4 (53) and a7 (100) are all less than 500, the data to be trained a1, a4 and a7 are divided into the common cutting point A31 of the third-level common cutting layer.
• Since the consumption amounts of the data to be trained a3 (600), a5 (1000) and a6 (860) are all greater than or equal to 500, the data to be trained a3, a5 and a6 are divided into the common cutting point A32 of the third-level common cutting layer. When the cutting is completed, the participant equipment determines that the cutting result is a successful cut. Or, when the common cutting point is the common cutting point A22, as shown in Fig. 4, the common cutting point A22 contains only one piece of data to be trained (a2) and therefore cannot be cut, so the participant equipment determines that the cutting result is a cutting failure.
• the division method can be set by those skilled in the art based on experience. For example, it may also be set to divide the data to be trained with a consumption amount greater than or equal to 500 into the common cutting point A31, and the data to be trained with a consumption amount less than 500 into the common cutting point A32.
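The cut at A21 with the common cutting value 500 can be sketched as below; the dict mapping sample names to their consumption amounts is an assumed layout, with the values taken from the Table 1 example:

```python
# Data to be trained at common cutting point A21 (consumption amounts from Table 1).
data_at_A21 = {"a1": 210, "a3": 600, "a4": 53, "a5": 1000, "a6": 860, "a7": 100}

def cut(data, threshold):
    """Samples below the threshold go to the left child, the rest to the right."""
    left = {k: v for k, v in data.items() if v < threshold}
    right = {k: v for k, v in data.items() if v >= threshold}
    return left, right

left_A31, right_A32 = cut(data_at_A21, 500)
# left_A31 holds a1, a4, a7; right_A32 holds a3, a5, a6, matching the example.
```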
  • step 503 the participant device reports the cutting result to the federation server.
  • step 504 the federation server determines whether the end condition of the model training is satisfied according to the cutting result of each participant's device, if not, execute step 505, and if yes, execute step 506.
• the end condition of the model training can be any one or more of the following: the depth of the common cutting point (the distance between the common cutting point and the root common cutting point) is greater than or equal to the preset cutting depth; there is no uncut and cuttable common cutting point in any participant's device; the number of cuts that have been executed is greater than or equal to the preset number of cuts; the duration of cutting is greater than or equal to the preset cutting duration; the number of common cutting points included in the highest-level common cutting layer is greater than or equal to the preset number.
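The end-condition check can be sketched as a simple disjunction; the field names and limit values below are invented for illustration, and the "no uncut and cuttable common cutting point remains" condition is omitted because it requires querying the devices:

```python
def training_finished(depth, cuts_done, elapsed_s, leaf_count, limits):
    """True if any of the listed end conditions of the model training holds."""
    return (depth >= limits["max_depth"]
            or cuts_done >= limits["max_cuts"]
            or elapsed_s >= limits["max_seconds"]
            or leaf_count >= limits["max_leaves"])

limits = {"max_depth": 4, "max_cuts": 100, "max_seconds": 60, "max_leaves": 8}
```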
  • the data processing method in the embodiment of the present invention can have a wider application range and meet the needs of users more.
  • step 505 the federation server combines the to-be-trained data of each participant's device in each common cutting point to determine the next common cutting point.
• In step 506, the federation server determines that there is no next common cutting point. The following takes the case where the end conditions of the model training include the above items as an example to describe the specific implementation process of the above step 505 and step 506:
• Step a: After receiving the cutting results sent by each participant's device, the federation server first determines whether the number of cuts that have been executed is greater than or equal to the preset number of cuts, and/or whether the duration of cutting is greater than or equal to the preset cutting duration, and/or whether the number of common cutting points included in the highest-level common cutting layer is greater than or equal to the preset number. If at least one of these is yes, it can be determined that the end condition of the model training is met, and step b is executed; if all are no, step c is executed.
  • Step b The federation server determines that there is no next common cutting point.
• Step c: The federation server determines, according to the cutting result of each participant's device, whether each participant's device cut successfully. If no participant's device cut successfully, the common cutting point is a leaf common cutting point that can no longer be cut downward, and step e is executed. If at least one participant's device cut successfully, it is determined whether the depth of the common cutting point obtained by cutting is greater than or equal to the preset cutting depth; if not, step d is executed, and if yes, step e is executed.
• Step d: The federation server uses the left common cutting point in the next-level common cutting layer connected to the common cutting point as the next common cutting point.
• That is, if the federation server determines that one or more participant devices cut successfully and the current branch has not reached the preset cutting depth, it can continue cutting the current branch, taking the left common cutting point in the next-level common cutting layer connected to the common cutting point on the current branch as the next common cutting point.
  • Step e The federation server issues a query instruction to each participant's device.
• Step f: The participant device determines, according to the query instruction, whether there is an uncut and cuttable common cutting point in the participant device. If there is, the uncut and cuttable common cutting point with the deepest cutting depth is taken as the next common cutting point that can be cut in the participant's equipment. If not, it is determined that there is no next common cutting point in the participant's device.
• an uncut and cuttable common cutting point refers to a common cutting point at which the participant's equipment has more than one piece of data to be trained and whose depth is less than the preset cutting depth.
• In a specific implementation, each participant device may first determine by query whether there is an uncut and cuttable common cutting point in the participant device. If there is, the deepest common cutting point can be obtained from all uncut and cuttable common cutting points, and the query result can be generated according to the hierarchical relationship of that common cutting point.
  • the hierarchical relationship of the common cutting point may include the common cutting layer where the common cutting point is located and the position of the common cutting point in the common cutting layer.
• If there is not, the query result can be generated according to an indication message that the next common cutting point does not exist in the participant device.
  • Step g each participant device reports the query result to the federation server.
  • the query result is the hierarchical relationship of the next common cutting point that can be cut in the participant device, or an indication message that the next common cutting point does not exist in the participant device.
• Step h: If, according to the query results reported by each participant's device, the federation server determines that there is no next common cutting point that can be cut in any participant's device, it determines that the end condition of the model training has been met, and executes step b. If it is determined that there is a next common cutting point that can be cut in one or more participant devices, step i is executed.
  • Step i The federation server selects the deepest and closest common cutting point as the next common cutting point of each participant device according to the hierarchical relationship of each next common cutting point reported by one or more participant devices.
  • the common cutting point with the deepest level and the closest position refers to the common cutting point that is the deepest in the common cutting layer and is closest to the cut common cutting point in the common cutting layer.
• In other words, when the common cutting point on the initial branch has been cut to the preset cutting depth or can no longer be cut, if the federation server finds that there are uncut and cuttable common cutting points in the participant devices, it can select the deepest common cutting point among them as the next common cutting point, and repeat the cutting cycle until no uncut and cuttable common cutting point remains in any participant's equipment. It can be seen that cutting the common cutting points from deep to shallow based on the cutting depth ensures orderly cutting, avoids omitting common cutting points, and improves the accuracy of data processing and the detection effect of the detection model.
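Steps f through i can be sketched as follows: each device filters its uncut nodes down to cuttable ones and reports the deepest, and the server keeps the deepest overall. Representing a node's hierarchical relationship as a `(depth, position)` pair, and breaking ties by position, are simplifying assumptions:

```python
def deepest_cuttable(nodes, max_depth):
    """nodes: (depth, position, n_samples) tuples for uncut common cutting points.

    Cuttable means more than one sample and depth below the preset cutting
    depth; among the cuttable nodes, return the deepest one's (depth, position),
    or None if no node qualifies.
    """
    candidates = [(d, p) for d, p, n in nodes if n > 1 and d < max_depth]
    return max(candidates, default=None)

reports = [(2, 1, 3), (3, 0, 2), (3, 1, 1)]          # last node holds only one sample
next_point = deepest_cuttable(reports, max_depth=4)  # (3, 0)
```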
  • step 506 the federation server determines that there is no next common cutting point, and constructs a detection model according to the common cutting features at each common cutting point.
• the federation server can construct the detection model in the following way: associate each common cutting point with the common cutting feature of each participant's device at that common cutting point, connect the common cutting points according to the inclusion relationship of the data to be trained of each participant's device at each common cutting point to obtain a binary tree model, and use the binary tree model as the detection model.
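A minimal node structure for such a binary tree model might look like this; the class and field names are illustrative, not taken from the patent:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CutNode:
    feature_dim: Optional[str] = None   # common cutting feature dimension
    cut_value: Optional[float] = None   # common cutting feature value
    left: Optional["CutNode"] = None    # samples with feature value <= cut_value
    right: Optional["CutNode"] = None   # samples with feature value > cut_value

# Assemble the first two cuts of the Fig. 6 walkthrough: point 1 splits into
# points 2 and 3, then point 2 splits into points 4 and 5.
root = CutNode("dim1", 1.0,
               left=CutNode("dim2", 2.0, left=CutNode(), right=CutNode()),
               right=CutNode())
```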
  • Fig. 6 is a schematic structural diagram of a detection model provided by an embodiment of the present invention.
  • the first cutting uses the common cutting feature dimension 1 and the common cutting feature value 1.
• any participant's device cuts the data to be trained at the common cutting point 1, dividing the data to be trained at the common cutting point 1 into the common cutting point 2 and the common cutting point 3. Since the cutting depth at this time is 2, which has not reached the preset cutting depth, the left common cutting point in the next-level common cutting layer (i.e., common cutting point 2) can be used as the next common cutting point.
• the second cutting uses the common cutting feature dimension 2 and the common cutting feature value 2 to cut the data to be trained at the common cutting point 2 of any participant's device, dividing that data into the common cutting point 4 and the common cutting point 5.
  • the cutting depth at this time is 3, which does not reach the preset cutting depth. Therefore, the left common cutting point in the next-level common cutting layer of the common cutting point 2 (that is, the common cutting point 4) can be used as the next common cutting point.
• the third cutting uses the common cutting feature dimension 3 and the common cutting feature value 3 to cut the data to be trained at the common cutting point 4 of any participant's device, dividing that data into the left sample space and the right sample space of the common cutting point 4. Since the cutting depth at this time is 4, which has reached the preset cutting depth, the federation server determines that the current branch can no longer be cut.
• the federation server sends a query instruction to each participant's device and determines, according to the query results returned by each participant's device, that the deepest cuttable common cutting point is the common cutting point 5. Therefore, the common cutting point 5 is taken as the next common cutting point.
• the fourth cutting uses the common cutting feature dimension 4 and the common cutting feature value 4 to cut the data to be trained at the common cutting point 5 of any participant's device, dividing that data into the left sample space and the right sample space of the common cutting point 5. Since the cutting depth at this time is 4, which has reached the preset cutting depth, the federation server determines that the current branch can no longer be cut.
• the federation server resends the query instruction to each participant's device and determines, according to the query results returned by each participant's device, that the deepest cuttable common cutting point is the common cutting point 3. Therefore, the common cutting point 3 is taken as the next common cutting point.
• the fifth cutting uses the common cutting feature dimension 5 and the common cutting feature value 5 to cut the data to be trained at the common cutting point 3 of any participant's device, dividing that data into the left sample space and the right sample space of the common cutting point 3. Since the cutting depth at this time is 4, which has reached the preset cutting depth, the federation server determines that the current branch can no longer be cut.
• the federation server continues to send query instructions to each participant's device and determines, according to the query results returned by each participant's device, that no cuttable common cutting point remains in any participant's device, so the federation server determines that the end condition of the model training has been satisfied.
• In this way, the federation server can first associate each common cutting point with the common cutting features of each participant's device at that common cutting point, and then connect all the common cutting points according to the inclusion relationship of the data to be trained at each common cutting point of each participant's device, obtaining the binary tree model shown in Fig. 6, i.e., the detection model.
• It should be noted that while the federation server determines the common cutting feature at each common cutting point in conjunction with each participant's device, each participant's device also uses the common cutting feature at the common cutting point to cut its own data to be trained at that common cutting point. In this way, the federation server and each participant's device actually achieve the synchronous effect of testing while training.
• That is, when the detection model is obtained by training, the data to be trained in each participant's device has also been divided into the different common cutting points, so the abnormality of the data to be trained in each participant's device has also been determined.
• Therefore, the embodiment of the present invention can realize multiple model applications through one model training: in the process of training the detection model, the detection of the data to be trained in each participant's equipment is achieved synchronously, so the efficiency of model detection is high.
• the federation server 110 may perform only one model training in conjunction with each participant's device to obtain one detection model, or may perform multiple model trainings in conjunction with each participant's device to obtain multiple detection models. If only one detection model is obtained through training, each participant's device can use all of its data to be trained as the data for this model training. If multiple detection models are obtained by training, each participant's device can, before each model training, select part of the data to be trained from all the data to be trained as the data for that model training.
• the amount of data to be trained selected by each participant's device for each model training can be the same or different, and the data to be trained used by the same participant's device in each model training may not be exactly the same, so as to ensure that the detection models can capture the data features of different data to be trained and improve the detection effect of the detection models.
• In a specific implementation, the federation server 110 may first issue a sample confirmation instruction to each participant's device before performing model training. After any participant's device receives the sample confirmation instruction, if it determines that the amount of all data to be trained in the participant's device is less than or equal to the preset number, all the data to be trained in the participant's device can be used as the data for this model training, and all the data to be trained is used as the data to be trained at the root common cutting point.
• Correspondingly, if the amount of data to be trained in the participant's device is greater than the preset number, part of the data to be trained can be selected from all the data to be trained as the data for this model training, and the selected part of the data to be trained is used as the data to be trained at the root common cutting point.
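The sample-confirmation step can be sketched as below; the preset number and the uniform random subsampling are assumptions for illustration:

```python
import random

def select_training_data(all_data, preset_number, rng=random):
    """Use all samples when there are few; otherwise pick a random subset."""
    data = list(all_data)
    if len(data) <= preset_number:
        return data                       # all data goes to the root cutting point
    return rng.sample(data, preset_number)  # per-training subset for the root

subset = select_training_data(range(20), preset_number=8)  # 8 of the 20 samples
```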
  • the participant device can detect the abnormality of the data to be detected in the following manner:
  • the participant device can first use the common cutting feature at each common cutting point to cut the data to be detected, and determine the common cutting point to which the data to be detected is finally cut. If the weight corresponding to the finally cut common cutting point is greater than the first preset threshold, the data to be detected is determined to be abnormal data, otherwise, the data to be detected is determined to be normal data.
  • the first preset threshold may be set by those skilled in the art based on experience, or may be set according to actual needs, and is not specifically limited. In an example, the first preset threshold may be set to 0.5.
• Specifically, the common cutting feature dimension 1 and the common cutting feature value 1 at the common cutting point 1 can be used to cut the data to be detected. If the feature value of the data to be detected under the common cutting feature dimension 1 is less than or equal to the common cutting feature value 1, the data to be detected is cut into the common cutting point 2; if it is greater than the common cutting feature value 1, the data to be detected is cut into the common cutting point 3. Taking the case where the data to be detected is cut to the common cutting point 2 as an example, the participant device can then use the common cutting feature dimension 2 and the common cutting feature value 2 at the common cutting point 2 to cut the data to be detected.
• If the feature value of the data to be detected under the common cutting feature dimension 2 is less than or equal to the common cutting feature value 2, the data to be detected is cut into the common cutting point 4; if it is greater than the common cutting feature value 2, the data to be detected is cut into the common cutting point 5. The above process continues until the data to be detected is cut to a common cutting point that can no longer be cut.
• In a specific implementation, the federation server 110 may first determine the common cutting point to which the data to be detected is finally cut, and then obtain the weight of that common cutting point. The weight of any common cutting point is inversely related to the distance between that common cutting point and the root common cutting point.
• Further, the federation server 110 can determine whether the weight is greater than the first preset threshold; if it is, it determines that the data to be detected is abnormal data, and if not, it determines that the data to be detected is normal data.
• Since the common cutting points of the same-level common cutting layer all have the same distance from the root common cutting point, the same weight can be set for each common cutting point of the same-level common cutting layer.
• For example, the common cutting point 2 and the common cutting point 3 of the second-level common cutting layer in Fig. 6 are set with a weight of 0.8, and the common cutting point 4 and the common cutting point 5 of the third-level common cutting layer are set with a weight of 0.3.
• Taking the first preset threshold set to 0.5 as an example, if the data to be detected is finally cut into the right sample space of the common cutting point 5, the last common cutting point is determined to be the common cutting point 5. Since the weight of the common cutting point 5 is 0.3 (less than 0.5), the data to be detected is normal data.
• If the last common cutting point is determined to be the common cutting point 3, since the weight of the common cutting point 3 is 0.8 (greater than 0.5), the data to be detected is abnormal data.
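The whole detection step — walking the tree with the common cutting features, then thresholding the weight of the final node — can be sketched as follows. The nested-dict tree, the per-level weight table, and the 0.5 threshold mirror the Fig. 6 example; the dimension names and structure are otherwise illustrative:

```python
def detect(sample, node, level_weights, threshold=0.5):
    """Return (is_abnormal, weight) for one piece of data to be detected."""
    depth = 1
    while node.get("dim") is not None:               # descend until a leaf
        branch = "left" if sample[node["dim"]] <= node["value"] else "right"
        node = node[branch]
        depth += 1
    weight = level_weights[depth]                    # weight of the final node
    return weight > threshold, weight

tree = {"dim": "amount", "value": 500,
        "left": {"dim": None},                       # level-2 leaf
        "right": {"dim": "age", "value": 30,         # level-2 node, cut once more
                  "left": {"dim": None}, "right": {"dim": None}}}
level_weights = {2: 0.8, 3: 0.3}                     # shallower final node => more abnormal

shallow = detect({"amount": 210}, tree, level_weights)            # (True, 0.8)
deep = detect({"amount": 600, "age": 25}, tree, level_weights)    # (False, 0.3)
```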
  • the participant device can detect the abnormality of the data to be detected in the following manner:
• Specifically, the participant device can first use the common cutting features at each common cutting point in the detection model corresponding to any training to cut the data to be detected, and determine the common cutting point to which the data to be detected is finally cut in that detection model. It then calculates the average weight of the common cutting points to which the data to be detected is cut in the detection models corresponding to the trainings. If the average weight is greater than the second preset threshold, the data to be detected is determined to be abnormal data; otherwise, it is determined to be normal data.
  • the second preset threshold may be set by those skilled in the art based on experience, or may be set according to actual needs, and is not specifically limited. In an example, the second preset threshold may be set to 0.5.
• the participant device may also calculate a weighted average weight over the common cutting points that are finally cut to, and determine the abnormality of the data to be detected by comparing the weighted average weight with the second preset threshold.
• the weighting can be determined based on the loss functions of the multiple detection models: the smaller the loss function of a detection model, the better its detection effect, and a larger weight can be set for that detection model; the larger the loss function, the worse the detection effect, and a smaller weight can be set for that detection model.
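Combining several detection models might look like the sketch below; taking each model's weight proportional to the inverse of its loss is one reasonable reading of the rule above, not a formula given in the text:

```python
def ensemble_score(node_weights, losses=None):
    """Average (or loss-weighted average) of per-model final-node weights."""
    if losses is None:
        return sum(node_weights) / len(node_weights)
    inv = [1.0 / l for l in losses]          # smaller loss -> larger model weight
    return sum(w * i for w, i in zip(node_weights, inv)) / sum(inv)

def is_abnormal(node_weights, threshold=0.5, losses=None):
    """Compare the (weighted) average weight with the second preset threshold."""
    return ensemble_score(node_weights, losses) > threshold
```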
• In the embodiment of the present invention, the federation server combines the data to be trained of each participant's equipment to determine the common cutting feature of each participant's equipment at the common cutting point; the common cutting feature is a feature that distinguishes abnormal data from normal data. In this way, the federation server constructs a detection model based on the common cutting features at the common cutting points, and delivers the detection model to each participant device so that each participant device can use the detection model to detect the data to be detected and determine whether the data to be detected is abnormal data.
• In the above solution, the common cutting feature at the common cutting point is determined by combining the data to be trained of each participant's device, so that the common cutting feature can simultaneously reflect the data characteristics of each participant's device.
• In this way, the detection model constructed based on the common cutting features can accurately detect the abnormal data of each participant's equipment.
• As a result, the detection model has good versatility and high accuracy of abnormality detection.
  • an embodiment of the present invention also provides a data processing device, and the specific content of the device can be implemented with reference to the foregoing method.
  • FIG. 7 is a schematic structural diagram of a data processing device provided by an embodiment of the present invention. As shown in FIG. 7, the device includes:
  • the determining module 701 is configured to combine the training data of each participant's device to determine the common cut feature of each participant's device at a common cut point; the common cut feature is a feature that distinguishes abnormal data from normal data;
  • the construction module 702 is configured to construct and obtain a detection model according to the common cutting feature at the common cutting point;
  • the transceiver module 703 is configured to deliver the detection model to each participant device; each participant device is also configured to use the detection model to detect the data to be detected to determine whether the data to be detected is abnormal data.
  • the common cutting characteristic may include a common cutting characteristic dimension and a common cutting characteristic value.
• the determining module 701 is specifically configured to: first determine the common cutting feature dimension at the common cutting point according to the feature dimensions of the data to be trained at the common cutting point of each participant device, then deliver the common cutting feature dimension at the common cutting point to each participant device so that each participant device determines its cutting feature value in the common cutting feature dimension based on the common cutting feature dimension at the common cutting point, then receive the cutting feature value of each participant device in the common cutting feature dimension reported by each participant device, and determine the common cutting feature value according to the cutting feature values of each participant device in the common cutting feature dimension.
• the determining module 701 is specifically configured to: determine, according to the feature dimensions of the data to be trained at the common cutting point of each participant device, the common feature dimensions of the data to be trained of each participant device at the common cutting point, and select the common cutting feature dimension from the common feature dimensions.
  • the construction module 702 is specifically configured to: associate each common cutting point with the common cutting feature at that point, connect the common cutting points according to the inclusion relationships of the data to be trained at each common cutting point, obtain a binary tree model, and use the binary tree model as the detection model.
  • the determining module 701 is specifically configured to: combine the data to be trained by each participant device in any one training round to determine the common cutting features of each participant device at the common cutting points corresponding to that round, and construct the detection model corresponding to that round according to those common cutting features;
  • the transceiver module 703 is specifically configured to: deliver the detection model corresponding to each training round to each participant device, so that each participant device uses the models of all rounds to detect the data to be detected and determine whether it is abnormal data.
  • FIG. 8 is a schematic structural diagram of another data processing device provided by an embodiment of the present invention. As shown in FIG. 8, the device includes:
  • the transceiver module 801 is configured to receive a detection model sent by the federation server; the detection model is constructed by the federation server by combining the data to be trained of each participant device, determining the common cutting feature of each participant device at a common cutting point, and building the model according to the common cutting feature at the common cutting point; the common cutting feature is used to distinguish abnormal data from normal data;
  • the detection module 802 is configured to detect the data to be detected using the detection model to determine whether the data to be detected is abnormal data.
  • before receiving the detection model sent by the federation server, the transceiver module 801 is further configured to: receive the common cutting feature dimension at the common cutting point delivered by the federation server, determine the participant device's cutting feature value in that dimension based on it, and report the value to the federation server; the federation server then determines the common cutting feature value according to the cutting feature values of all participant devices in the common feature dimension.
  • the common cut feature dimension at the common cut point is determined by the federation server according to the feature dimension of the data to be trained at the common cut point of each participant's device.
  • the detection module 802 is specifically configured to: use the common cutting feature at each common cutting point to cut the data to be detected and determine the common cutting point the data is finally cut to; if the weight corresponding to that final common cutting point is greater than a first preset threshold, the data to be detected is determined to be abnormal data, otherwise it is determined to be normal data.
  • the transceiver module 801 is specifically configured to: receive the detection model corresponding to each training round sent by the federation server.
  • the detection module 802 is specifically configured to: use the common cutting features at the common cutting points in the detection model corresponding to any one training round to cut the data to be detected, determine the common cutting point the data is finally cut to in that round's model, and compute the average of the weights of the final common cutting points across the models of all rounds; if the average weight is greater than a second preset threshold, the data to be detected is determined to be abnormal data, otherwise it is determined to be normal data.
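The multi-round decision rule above can be sketched as follows (a minimal Python illustration; the function name and the representation of each round's model as a scoring function are assumptions for the example, not part of the embodiment):

```python
def detect_with_ensemble(round_scores, record, threshold=0.5):
    """Ensemble decision over the detection models of all training rounds.

    `round_scores` holds one scoring function per round; each maps a record
    to the weight of the common cutting point the record is finally cut to
    in that round's detection model. The record is flagged abnormal when the
    average of these weights exceeds the (second) preset threshold.
    """
    weights = [score(record) for score in round_scores]
    return sum(weights) / len(weights) > threshold
```

For example, with final-cut-point weights 0.8 and 0.4 from two rounds, the average 0.6 exceeds a threshold of 0.5, so the record would be flagged abnormal.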
  • in summary, the federation server combines the data to be trained on each participant device to determine the common cutting feature of each participant device at the common cutting point; the common cutting feature is a feature that distinguishes abnormal data from normal data. The federation server then constructs a detection model according to the common cutting features at the common cutting points and delivers it to each participant device, so that each participant device can use the detection model to determine whether data to be detected is abnormal data.
  • because the common cutting feature at the common cutting point is determined by combining the data to be trained of all participant devices, it reflects the data characteristics of every participant device at once; a detection model built on such common cutting features can therefore accurately detect the abnormal data of each participant device, giving the detection model good generality and high abnormality-detection accuracy.
  • a computing device provided by an embodiment of the present invention includes at least one processing unit and at least one storage unit, wherein the storage unit stores a computer program, and when the program is executed by the processing unit, The processing unit is caused to execute any of the methods described in FIGS. 2 to 5 above.
  • an embodiment of the present invention provides a computer-readable storage medium that stores a computer program executable by a computing device; when the program runs on the computing device, the computing device executes any of the methods described in FIGS. 2 to 5.
  • an embodiment of the present invention provides a terminal device. As shown in FIG. 9, it includes at least one processor 901 and a memory 902 connected to the at least one processor.
  • the embodiment of the present invention does not limit the specific connection medium between the processor 901 and the memory 902; in FIG. 9, the processor 901 and the memory 902 are connected through a bus as an example.
  • the bus can be divided into address bus, data bus, control bus and so on.
  • the memory 902 stores instructions that can be executed by at least one processor 901, and the at least one processor 901 can execute the steps included in the aforementioned data processing method by executing the instructions stored in the memory 902.
  • the processor 901 is the control center of the terminal device; it can use various interfaces and lines to connect the various parts of the terminal device, and realizes data processing by running or executing instructions stored in the memory 902 and calling data stored in the memory 902.
  • the processor 901 may include one or more processing units, and the processor 901 may integrate an application processor and a modem processor.
  • the application processor mainly processes the operating system, user interface, and application programs.
  • the modem processor mainly handles communication-related instructions. It can be understood that the foregoing modem processor may also not be integrated into the processor 901.
  • the processor 901 and the memory 902 may be implemented on the same chip, and in some embodiments, they may also be implemented on separate chips.
  • the processor 901 may be a general-purpose processor such as a central processing unit (CPU) or digital signal processor, an application-specific integrated circuit (ASIC), a field-programmable gate array or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and can implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of the present invention.
  • the general-purpose processor may be a microprocessor or any conventional processor or the like. The steps of the method disclosed in the data processing embodiment can be directly embodied as being executed and completed by a hardware processor, or executed and completed by a combination of hardware and software modules in the processor.
  • the memory 902 as a non-volatile computer-readable storage medium, can be used to store non-volatile software programs, non-volatile computer-executable programs, and modules.
  • the memory 902 may include at least one type of storage medium, for example flash memory, hard disk, multimedia card, card-type memory, random access memory (RAM), static random access memory (SRAM), programmable read-only memory (PROM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), magnetic memory, magnetic disk, optical disc, and so on.
  • the memory 902 may also be any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto.
  • the memory 902 in the embodiment of the present invention may also be a circuit or any other device capable of realizing a storage function for storing program instructions and/or data.
  • an embodiment of the present invention provides a back-end device. As shown in FIG. 10, it includes at least one processor 1001 and a memory 1002 connected to the at least one processor.
  • the embodiment of the present invention does not limit the specific connection medium between the processor 1001 and the memory 1002; in FIG. 10, the connection between the processor 1001 and the memory 1002 through a bus is taken as an example.
  • the bus can be divided into address bus, data bus, control bus and so on.
  • the memory 1002 stores instructions that can be executed by at least one processor 1001, and the at least one processor 1001 can execute the steps included in the aforementioned data processing method by executing the instructions stored in the memory 1002.
  • the processor 1001 is the control center of the back-end equipment, which can use various interfaces and lines to connect to various parts of the back-end equipment, and by running or executing instructions stored in the memory 1002 and calling data stored in the memory 1002, Realize data processing.
  • the processor 1001 may include one or more processing units, and the processor 1001 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, application programs, and the like, and the modem processor mainly parses received instructions and received results. It can be understood that the foregoing modem processor may also not be integrated into the processor 1001.
  • the processor 1001 and the memory 1002 may be implemented on the same chip, and in some embodiments, they may also be implemented on separate chips.
  • the processor 1001 may be a general-purpose processor such as a central processing unit (CPU) or digital signal processor, an application-specific integrated circuit (ASIC), a field-programmable gate array or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and can implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of the present invention.
  • the general-purpose processor may be a microprocessor or any conventional processor or the like. The steps of the method disclosed in combination with the data processing embodiment can be directly embodied as being executed and completed by a hardware processor, or executed and completed by a combination of hardware and software modules in the processor.
  • the memory 1002 as a non-volatile computer-readable storage medium, can be used to store non-volatile software programs, non-volatile computer-executable programs, and modules.
  • the memory 1002 may include at least one type of storage medium, for example flash memory, hard disk, multimedia card, card-type memory, random access memory (RAM), static random access memory (SRAM), programmable read-only memory (PROM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), magnetic memory, magnetic disk, optical disc, and so on.
  • the memory 1002 may also be any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto.
  • the memory 1002 in the embodiment of the present invention may also be a circuit or any other device capable of realizing a storage function for storing program instructions and/or data.
  • the embodiments of the present invention may be provided as a method or a computer program product. Therefore, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, and so on) containing computer-usable program code.
  • these computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing equipment to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device that implements the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.
  • these computer program instructions may also be loaded onto a computer or other programmable data processing equipment, so that a series of operation steps are executed on the computer or other programmable equipment to produce computer-implemented processing; the instructions executed on the computer or other programmable equipment thus provide steps for implementing the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.


Abstract

A data processing method and apparatus, relating to the field of financial technology (Fintech), for solving the problem that the prior art cannot train a general-purpose detection model. The method includes: a federation server combines the data to be trained of each participant device to determine the common cutting feature at a common cutting point, constructs a detection model according to the common cutting feature at the common cutting point, and delivers the model to each participant device, so that each participant device can use the detection model to detect the abnormality of data to be detected. Because the common cutting feature at the common cutting point is determined by combining the data to be trained of all participant devices, it reflects the data characteristics of every participant device at once; a detection model built on such common cutting features can therefore accurately detect the abnormal data in each participant device, so the model has good generality and high abnormality-detection accuracy.

Description

Data processing method and apparatus
Cross-reference to related applications
This application claims priority to the Chinese patent application No. 202010072413.6, entitled "Data processing method and apparatus" and filed with the Chinese Patent Office on January 21, 2020, the entire contents of which are incorporated herein by reference.
Technical field
The present invention relates to the technical field of financial technology (Fintech), and in particular to a data processing method and apparatus.
Background
With the development of computer technology, more and more technologies are applied in the financial field, and the traditional financial industry is gradually shifting toward financial technology (Fintech). However, because the financial industry has high requirements for security and real-time performance, the Fintech field also places higher demands on technology. Abnormal-data detection is a data processing method commonly used in the financial field: when processing massive transaction data, using a detection model to pick out abnormal transactions from the massive transaction data helps operation and maintenance personnel detect abnormal trading scenarios in time and improves the stability of the financial field.
At present, each participant device can usually train a detection model only on its locally stored data to be trained. However, because a detection model trained by a single participant device cannot reflect the data characteristics of the other participant devices, it can accurately detect only that participant device's abnormal data; it either fails to detect, or misjudges, the abnormal data of the other participant devices, so the detection accuracy for abnormal data is low.
Summary
The present invention provides a data processing method and apparatus for training a general-purpose detection model that can detect the abnormal data of every participant device, thereby improving the accuracy of abnormal-data detection.
In a first aspect, the present invention provides a data processing method applied to a federation server. The method includes: the federation server combines the data to be trained of each participant device to determine the common cutting feature of each participant device at a common cutting point, constructs a detection model according to the common cutting feature at the common cutting point, and delivers the detection model to each participant device, so that each participant device uses the detection model to detect data to be detected and determine whether it is abnormal data. The common cutting feature is a feature that distinguishes abnormal data from normal data.
In a possible implementation, the common cutting feature may include a common cutting feature dimension and a common cutting feature value. In this case, determining the common cutting feature includes: the federation server determines the common cutting feature dimension at the common cutting point according to the feature dimensions of each participant device's data to be trained at that point and delivers the dimension to each participant device, so that each participant device determines its cutting feature value in that dimension based on it; the federation server then receives the cutting feature values reported by the participant devices and determines the common cutting feature value according to them.
In a possible implementation, determining the common cutting feature dimension includes: the federation server determines, according to the feature dimensions of each participant device's data to be trained at the common cutting point, the common feature dimensions of the participant devices' data at that point, and then selects the common cutting feature dimension from those common feature dimensions.
In a possible implementation, constructing the detection model includes: the federation server associates each common cutting point with the common cutting feature at that point, connects the common cutting points according to the inclusion relationships of the participant devices' data to be trained at each common cutting point, obtains a binary tree model, and uses the binary tree model as the detection model.
In a possible implementation, the federation server combines the data to be trained of each participant device in any one training round to determine the common cutting features at the common cutting points corresponding to that round, and constructs the detection model corresponding to that round accordingly. Correspondingly, delivering the detection model includes: the federation server delivers the detection model corresponding to each training round to each participant device, so that each participant device uses the models of all rounds to detect the data to be detected and determine whether it is abnormal data.
In a second aspect, the present invention provides a data processing method applied to a participant device. The method includes: the participant device receives the detection model sent by the federation server and uses it to detect data to be detected and determine whether the data is abnormal data. The detection model is constructed by the federation server by combining the data to be trained of each participant device, determining the common cutting feature of each participant device at a common cutting point, and building the model according to the common cutting features at the common cutting points; the common cutting feature is used to distinguish abnormal data from normal data.
In a possible implementation, before receiving the detection model, the participant device also receives the common cutting feature dimension at the common cutting point delivered by the federation server, determines its cutting feature value in that dimension based on it, and reports the value to the federation server, so that the federation server can determine the common cutting feature value according to the values reported by all participant devices. The common cutting feature dimension at the common cutting point is determined by the federation server according to the feature dimensions of each participant device's data to be trained at that point.
In a possible implementation, detecting the data includes: the participant device uses the common cutting feature at each common cutting point to cut the data to be detected and determines the common cutting point the data is finally cut to; if the weight corresponding to that point is greater than a first preset threshold, the data is determined to be abnormal data, otherwise it is determined to be normal data.
In a possible implementation, the participant device receives the detection model corresponding to each training round from the federation server. Correspondingly, the participant device uses the common cutting features in the model of any one round to cut the data to be detected, determines the common cutting point the data is finally cut to in that round's model, and computes the average of the weights of the final common cutting points across the models of all rounds; if the average weight is greater than a second preset threshold, the data is determined to be abnormal data, otherwise it is determined to be normal data.
In a third aspect, the present invention provides a data processing apparatus, including: a determining module, configured to combine the data to be trained of each participant device and determine the common cutting feature of each participant device at a common cutting point; a construction module, configured to construct a detection model according to the common cutting feature at the common cutting point; and a transceiver module, configured to deliver the detection model to each participant device, so that each participant device uses the model to detect data to be detected and determine whether it is abnormal data. The common cutting feature is a feature that distinguishes abnormal data from normal data.
In a possible implementation, the common cutting feature may include a common cutting feature dimension and a common cutting feature value. In this case, the determining module is specifically configured to: determine the common cutting feature dimension at the common cutting point according to the feature dimensions of the participant devices' data to be trained at that point, and deliver the dimension to the participant devices so that each determines its cutting feature value in that dimension. The determining module then receives the reported cutting feature values and determines the common cutting feature value according to them.
In a possible implementation, the determining module is specifically configured to: determine, according to the feature dimensions of the participant devices' data to be trained at the common cutting point, the common feature dimensions of that data, and select the common cutting feature dimension from them.
In a possible implementation, the construction module is specifically configured to: associate each common cutting point with the common cutting feature at that point, connect the common cutting points according to the inclusion relationships of the participant devices' data to be trained at each point, obtain a binary tree model, and use the binary tree model as the detection model.
In a possible implementation, the determining module is specifically configured to: combine the data to be trained of the participant devices in any one training round, determine the common cutting features at the common cutting points corresponding to that round, and construct the detection model corresponding to that round from those features. Correspondingly, the transceiver module is specifically configured to: deliver the detection model corresponding to each training round to each participant device, so that each participant device uses the models of all rounds to detect the data to be detected and determine whether it is abnormal data.
In a fourth aspect, the present invention provides a data processing apparatus, including: a transceiver module, configured to receive the detection model sent by the federation server; and a detection module, configured to use the detection model to detect data to be detected and determine whether it is abnormal data. The detection model is constructed by the federation server by combining the data to be trained of each participant device, determining the common cutting feature of each participant device at a common cutting point, and building the model according to the common cutting features; the common cutting feature is used to distinguish abnormal data from normal data.
In a possible implementation, before receiving the detection model, the transceiver module also receives the common cutting feature dimension at the common cutting point delivered by the federation server, determines the participant device's cutting feature value in that dimension based on it, and reports the value to the federation server, so that the federation server can determine the common cutting feature value according to the values reported by all participant devices. The common cutting feature dimension at the common cutting point is determined by the federation server according to the feature dimensions of each participant device's data to be trained at that point.
In a possible implementation, the detection module is specifically configured to: use the common cutting feature at each common cutting point to cut the data to be detected and determine the common cutting point the data is finally cut to; if the weight corresponding to that point is greater than a first preset threshold, the data is determined to be abnormal data, otherwise it is determined to be normal data.
In a possible implementation, the transceiver module is specifically configured to: receive the detection model corresponding to each training round sent by the federation server. Correspondingly, the detection module is specifically configured to: use the common cutting features in the model of any one round to cut the data to be detected, determine the common cutting point the data is finally cut to in that round's model, and compute the average of the weights of the final common cutting points across the models of all rounds; if the average weight is greater than a second preset threshold, the data is determined to be abnormal data, otherwise it is determined to be normal data.
In a fifth aspect, the present invention provides a computing device including at least one processing unit and at least one storage unit, where the storage unit stores a computer program that, when executed by the processing unit, causes the processing unit to perform any of the methods of the first or second aspect.
In a sixth aspect, the present invention provides a computer-readable storage medium storing a computer program executable by a computing device; when the program runs on the computing device, the computing device performs any of the methods of the first or second aspect.
In the present invention, the common cutting feature at the common cutting point is determined by combining the data to be trained of all participant devices, so the common cutting feature reflects the data characteristics of every participant device at once. A detection model built on such common cutting features can therefore accurately detect the abnormal data of each participant device; the model has good generality and high detection accuracy.
These and other aspects of the present invention will be more clearly understood from the description of the following embodiments.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the following briefly introduces the accompanying drawings needed in the description of the embodiments. Obviously, the drawings described below are only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of an applicable system architecture according to an embodiment of the present invention;
FIG. 2 is a schematic flowchart corresponding to a data processing method according to an embodiment of the present invention;
FIG. 3 is a schematic flowchart of a method for determining a common cutting feature according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the distribution of a participant device's data to be trained at each common cutting point;
FIG. 5 is a schematic flowchart of a method for determining the next common cutting point according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a detection model according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of another data processing apparatus according to an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of a terminal device according to an embodiment of the present invention;
FIG. 10 is a schematic structural diagram of a back-end device according to an embodiment of the present invention.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is further described in detail below with reference to the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
FIG. 1 is a schematic diagram of an applicable system architecture according to an embodiment of the present invention. As shown in FIG. 1, the architecture may include a federation server 110 and at least two participant devices, such as participant devices 121, 122, and 123. The federation server 110 may be connected to each participant device, for example by wire or wirelessly, without specific limitation.
Based on the system architecture shown in FIG. 1, FIG. 2 is a schematic interaction flowchart corresponding to a data processing method according to an embodiment of the present invention. The method applies to a federation server and participant devices, for example the federation server 110 shown in FIG. 1 and any participant device, such as participant device 121, 122, or 123. As shown in FIG. 2, the method includes:
Step 201: the federation server combines the data to be trained of each participant device to determine the common cutting feature of each participant device at a common cutting point.
In this embodiment of the present invention, a common cutting point is a unified cutting node determined jointly with all participant devices when the data to be trained of each participant device is cut. Each participant device has its own data to be trained at the common cutting point. The federation server can combine the data of the participant devices at the common cutting point to determine the common cutting feature at that point, and then deliver it to each participant device. Correspondingly, each participant device uses the common cutting feature at the common cutting point to cut its own data at that point; that is, its data at the common cutting point is cut into the subsequent common cutting points. This process is repeated until no further cutting is possible, at which time all common cutting points and the common cutting features at each of them have been obtained. The common cutting feature is a feature that distinguishes abnormal data from normal data. Normal and abnormal are relative to the data to be trained: normal data refers to data whose features are similar to those of most of the data to be trained, while abnormal data refers to data whose features differ greatly from those of most of the data to be trained.
Step 202: the federation server constructs a detection model according to the common cutting features at the common cutting points.
Step 203: the federation server delivers the detection model to each participant device.
Step 204: each participant device uses the detection model to detect the data to be detected and determine whether it is abnormal data.
In this embodiment of the present invention, because the common cutting feature at the common cutting point is determined by combining the data to be trained of all participant devices, it reflects the data characteristics of every participant device at once. A detection model built on such common cutting features can therefore accurately detect the abnormal data of each participant device; the model has good generality and high detection accuracy.
In step 201 above, the federation server may determine the common cutting feature of each participant device at any common cutting point in the manner illustrated in FIG. 3 below. The common cutting feature may include a common cutting feature dimension and a common cutting feature value.
FIG. 3 is a schematic flowchart of a method for determining the common cutting feature at any common cutting point according to an embodiment of the present invention. The method applies to a federation server and participant devices, for example the federation server 110 shown in FIG. 1 and any participant device, such as participant device 121, 122, or 123. As shown in FIG. 3, the method includes:
Step 301: the federation server determines the common cutting feature dimension at the common cutting point according to the feature dimensions of each participant device's data to be trained at that point.
In one example, for any common cutting point, the federation server 110 may first determine, according to the feature dimensions of each participant device's data to be trained at the point, the common feature dimensions shared by the participant devices' data at that point, and then select the common cutting feature dimension from those common feature dimensions. The feature dimensions of each participant device's data at the common cutting point can be determined in multiple ways. For example, the federation server 110 may send each participant device a dimension acquisition request carrying the identifier of the common cutting point, so that each participant device determines and reports the feature dimensions of its data to be trained at the point. Alternatively, the participant devices may report these feature dimensions periodically or in real time, without limitation.
In a possible manner, the common cutting feature dimensions at any two common cutting points may differ. In specific implementation, before model training the federation server 110 may first obtain the feature dimensions of the data to be trained in each participant device and construct a common-feature-dimension set from the feature dimensions common to all participant devices. Then, for any common cutting point, the federation server 110 may first determine, from the set, the common feature dimensions that differ from the common cutting feature dimensions at the other common cutting points, and select one of them as the common cutting feature dimension of the participant devices at this point.
In this manner, the common-feature-dimension set may take the following possible forms:
Case 1: the set stores only the common feature dimensions that differ from the common cutting feature dimensions at the other common cutting points.
In specific implementation, for any common cutting point, the federation server 110 may directly select one common feature dimension from the set at random as the common cutting feature dimension of the participant devices at that point. Correspondingly, after cutting at the point finishes, the federation server 110 may delete that dimension from the set, so that the set always stores only common feature dimensions that differ from the common cutting feature dimensions already used at other common cutting points.
Case 2: the set stores all the common feature dimensions together with the state of each; the state of a dimension is either selected or unselected, where selected indicates that the dimension is the common cutting feature dimension at some other common cutting point, and unselected indicates that it is not.
In specific implementation, for any common cutting point, the federation server 110 may first determine the state of each common feature dimension in the set and then randomly select one dimension whose state is unselected as the common cutting feature dimension at the point. Correspondingly, after cutting at the point finishes, the federation server 110 may update the state of that dimension in the set to selected, keeping the state of every dimension up to date in real time and ensuring the accuracy of common-cutting-feature-dimension selection.
In the above manner, by selecting a common feature dimension different from the common cutting feature dimensions at the other common cutting points, each common cutting point cuts along a different feature dimension, so the data characteristics of the various feature dimensions are used in a more balanced way to cut the data to be trained, improving the accuracy of the common cutting features at the common cutting points.
Step 302: the federation server delivers the common cutting feature dimension at the common cutting point to each participant device.
Step 303: each participant device determines its cutting feature value in the common cutting feature dimension based on the common cutting feature dimension at the common cutting point.
In step 303 above, the common cutting point may be any common cutting point in the 1st to Nth common cutting layers. The common cutting point in the 1st layer is the root common cutting point, and a participant device's data to be trained at the root includes all of its data in this round of model training. Correspondingly, a common cutting point in the ith layer (0 < i < N-2, i an integer) is either an intermediate common cutting point or a leaf common cutting point. Any intermediate point in the ith layer connects to at least one common cutting point in the (i+1)th layer; a participant device's data at an intermediate point in the ith layer includes all of its data at the points in the (i+1)th layer connected to that intermediate point, while a leaf point in the ith layer connects to no point in the (i+1)th layer.
For example, FIG. 4 is a schematic diagram of the distribution of a participant device's data to be trained at each common cutting point. As shown in FIG. 4, the 1st common cutting layer contains the root common cutting point A1, which includes all of the participant device's data to be trained, namely a1, a2, a3, a4, a5, a6, and a7. Correspondingly, the root A1 connects to intermediate common cutting point A21 and leaf common cutting point A22 in the 2nd layer; the participant device's data at A21 includes a1, a3, a4, a5, a6, and a7, and its data at A22 includes a2. Further, intermediate point A21 connects to leaf points A31 and A32 in the 3rd layer; the data at A31 includes a1, a4, and a7, and the data at A32 includes a3, a5, and a6.
In specific implementation, for any common cutting point, a participant device may first obtain its data to be trained at the point, and then determine, according to the common cutting feature dimension at the point, the feature values of that data in the dimension. If the dimension is one with discrete feature values, the participant device may randomly select one of those feature values as its cutting feature value at the point. Correspondingly, if the dimension is one with continuous feature values, the participant device may randomly select an intermediate feature value between the maximum and minimum feature values of its data in the dimension as its cutting feature value at the point. How the intermediate value is chosen can be set empirically by a person skilled in the art: it may be chosen at random, or the average of the maximum and minimum feature values may be used, or a weighted average of them, without specific limitation.
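The per-party cut-value selection just described can be sketched as follows (a minimal Python illustration; the function name and list-based data layout are assumptions, not part of the embodiment):

```python
import random

def propose_cut_value(values, discrete):
    """Propose one party's candidate cutting feature value at a cut point.

    `values` are the feature values, in the common cutting feature dimension,
    of this party's data to be trained at the current common cutting point;
    `discrete` says whether that dimension has discrete or continuous values.
    """
    if discrete:
        # Discrete dimension: randomly pick one of the observed values.
        return random.choice(list(values))
    # Continuous dimension: randomly pick an intermediate value
    # between the minimum and maximum observed values.
    lo, hi = min(values), max(values)
    return random.uniform(lo, hi)
```

With the consumption amounts 210, 600, 53, 1000, 860, and 100 from the example below, a continuous proposal would fall in [53, 1000].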
For example, Table 1 illustrates a participant device's data to be trained at a common cutting point.
[Table 1 is rendered as an image in the published document (PCTCN2020129124-appb-000001); it lists, for the data at common cutting point A21, the feature values in the dimensions consumption amount, purchase time, age, and shopping category.]
As shown in Table 1, the participant device's data to be trained at common cutting point A21 includes a1, a3, a4, a5, a6, and a7, and its feature dimensions at A21 include consumption amount, purchase time, age, and shopping category.
In specific implementation, if the common cutting feature dimension is consumption amount, the participant device may first look up Table 1 to determine the consumption amounts of the data at A21, namely 210, 600, 53, 1000, 860, and 100. Further, because consumption amount is a dimension with continuous feature values, the participant device may determine that the maximum consumption amount at A21 is 1000 and the minimum is 53, and then randomly select a value from [53, 1000], for example 520, as its cutting feature value at A21. Alternatively, if the common cutting feature dimension is shopping category, the participant device may look up Table 1 to determine the category values of the data at A21, namely heater, furniture, snacks, game console, washing machine, and clothes. Because shopping category is a dimension with discrete feature values, the participant device may randomly select one of these values, for example game console, as its cutting feature value at A21.
Step 304: each participant device reports its cutting feature value in the common cutting feature dimension to the federation server.
Step 305: the federation server determines the common cutting feature value according to the cutting feature values of the participant devices in the common cutting feature dimension.
In specific implementation, after receiving the participant devices' cutting feature values in the common cutting feature dimension, the federation server 110 may determine the common cutting feature value in multiple ways: for example, by randomly selecting one of the reported cutting feature values, or by taking their average, or by taking their weighted average, without limitation.
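The server-side aggregation just described might look like this (a hedged Python sketch; the mapping-based interface and function name are assumptions for illustration):

```python
import random

def aggregate_cut_values(party_values, method="mean"):
    """Combine the cut values reported by the parties into one common value.

    The embodiment allows several strategies: randomly pick one reported
    value, or take the plain mean (a weighted mean would work similarly).
    `party_values` maps a party identifier to the cutting feature value that
    party reported for the chosen common cutting feature dimension.
    """
    vals = list(party_values.values())
    if method == "random":
        return random.choice(vals)
    if method == "mean":
        return sum(vals) / len(vals)
    raise ValueError(f"unknown aggregation method: {method}")
```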
Step 306: the federation server uses the common cutting feature dimension and the common cutting feature value as the common cutting feature at the common cutting point.
In this embodiment of the present invention, a participant device only needs to report cutting feature values to the federation server, never the data to be trained itself, which protects the security of the participant device's data while reducing the amount of transmitted data and improving training efficiency. Moreover, because a cutting feature value is generated from the feature values of the participant device's data in the common cutting feature dimension, it accurately reflects the data characteristics of that device's data to be trained. A detection model trained on common cutting feature values determined from the values reported by all participant devices therefore reflects the data characteristics of the data in every participant device; the model has good generality and high detection accuracy.
The above describes how the common cutting feature dimension and common cutting feature value at any common cutting point are determined; the following describes how the next common cutting point is determined.
FIG. 5 is a schematic flowchart of a method for determining the next common cutting point according to an embodiment of the present invention. The method applies to a federation server and participant devices, for example the federation server 110 shown in FIG. 1 and any participant device, such as participant device 121, 122, or 123. As shown in FIG. 5, the method includes:
Step 501: the federation server delivers the common cutting feature value at the common cutting point to each participant device.
In specific implementation, the federation server 110 may deliver the common cutting feature value directly, or may first encrypt it and deliver the encrypted value to each participant device to guarantee the security of data transmission, without specific limitation.
Step 502: each participant device uses the common cutting feature value at the common cutting point to cut its data to be trained at that point, obtaining a cutting result.
In specific implementation, if the amount of a participant device's data at the common cutting point is less than or equal to 1, the data cannot be cut, so the cutting result can be determined to be a failure. Correspondingly, if the amount of data is greater than 1, the common cutting feature value can be used directly to cut the participant device's data at the point, and after the cut completes the cutting result is determined to be a success.
For example, based on the data to be trained shown in Table 1, when the common cutting point is A21 and the cutting feature value in the consumption-amount dimension is 500, the participant device cuts its data a1, a3, a4, a5, a6, and a7 at A21 using the consumption amount 500, as shown in FIG. 4. Because the consumption amounts of a1 (210), a4 (53), and a7 (100) are all below 500, these can be divided into common cutting point A31 of the 3rd common cutting layer. Correspondingly, because the consumption amounts of a3 (600), a5 (1000), and a6 (860) are all greater than or equal to 500, these can be divided into common cutting point A32 of the 3rd layer. After the cutting completes, the participant device determines that the cutting result is a success. Alternatively, when the common cutting point is A22, as shown in FIG. 4, the participant device's data at A22 includes only a2, so the data at A22 cannot be cut, and the participant device determines that the cutting result is a failure.
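The per-party cutting step in this example can be sketched as follows (a minimal Python illustration; the record layout and function name are assumptions, not part of the embodiment):

```python
def cut(data, dim, cut_value):
    """Cut one party's records at a common cutting point.

    Records whose value in the cutting dimension is below the common cutting
    feature value go to the left child point; the rest go to the right child.
    Returns (left, right, success): success is False when the node holds at
    most one record, i.e. it cannot be cut further.
    """
    if len(data) <= 1:
        return [], [], False
    left = [r for r in data if r[dim] < cut_value]
    right = [r for r in data if r[dim] >= cut_value]
    return left, right, True
```

Cutting the six consumption amounts of the example at 500 sends 210, 53, and 100 left and 600, 1000, and 860 right, matching the division into A31 and A32.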
It should be noted that the above is only an exemplary description and does not limit this solution; in specific implementation, the manner of division can be set empirically by a person skilled in the art. For example, data whose consumption amount is greater than or equal to 500 could instead be divided into common cutting point A31, and data whose consumption amount is below 500 into A32.
Step 503: each participant device reports its cutting result to the federation server.
Step 504: the federation server determines, according to the cutting results of the participant devices, whether the end condition of model training is met; if not, step 505 is executed, and if so, step 506 is executed.
In this embodiment of the present invention, the end condition of model training may be any one or more of the following: the depth of the common cutting point (its distance from the root common cutting point) is greater than or equal to a preset cutting depth; no uncut and cuttable common cutting point exists in any participant device; the number of cuts already executed is greater than or equal to a preset number of cuts; the time already spent cutting is greater than or equal to a preset cutting duration; the number of common cutting points in the deepest common cutting layer is greater than or equal to a preset number. These end conditions give the data processing method in this embodiment a wider range of applications and better satisfy users' needs.
Step 505: the federation server combines the data to be trained of the participant devices at the common cutting points to determine the next common cutting point.
For ease of understanding, taking the case where the end condition of model training includes all of the above items as an example, the specific implementation of steps 505 and 506 is described below:
Step a: after receiving the cutting results from the participant devices, the federation server first judges whether the number of executed cuts is greater than or equal to the preset number of cuts, and/or whether the cutting duration is greater than or equal to the preset cutting duration, and/or whether the number of common cutting points in the deepest common cutting layer is greater than or equal to the preset number. If at least one of these is true, the cutting results meet the end condition of model training and step b is executed; if all are false, step c is executed.
Step b: the federation server determines that no next common cutting point exists.
Step c: the federation server determines, from each participant device's cutting result, whether each device cut successfully. If no participant device cut successfully, the common cutting point is a leaf common cutting point that cannot be cut further downward, and step e is executed. If some participant device cut successfully, the federation server judges whether the depth of the resulting common cutting points is greater than or equal to the preset depth; if not, step d is executed, and if so, step e is executed.
Step d: the federation server takes the left common cutting point in the next common cutting layer connected to the current common cutting point as the next common cutting point.
In this embodiment of the present invention, if the federation server determines that one or more participant devices cut successfully and the current branch has not yet reached the set cutting depth, the cutting of the current branch continues: the left point in the next layer connected to the current point on the branch becomes the next common cutting point. By automatically performing the next cut while the current branch's cutting is unfinished, the left common cutting points of the branch are cut until they can no longer be cut or the preset depth is reached, which guarantees the continuity of cutting and improves the efficiency of data processing.
Step e: the federation server delivers a query instruction to each participant device.
Step f: according to the query instruction, each participant device determines whether it contains an uncut and cuttable common cutting point. If so, the uncut and cuttable point with the deepest cutting depth is taken as the device's cuttable next common cutting point; if not, the device determines that no next common cutting point exists in it.
In step f above, an uncut and cuttable common cutting point is one at which the participant device's data to be trained is greater than 1 and whose depth is less than the preset cutting depth.
In specific implementation, after receiving the query instruction, each participant device may first check whether it contains an uncut and cuttable common cutting point. If so, it selects the deepest such point and generates a query result from that point's hierarchy, where the hierarchy includes the common cutting layer the point belongs to and its position within the layer. If not, it generates a query result from an indication message that no next common cutting point exists in the device.
Step g: each participant device reports its query result to the federation server. The query result is either the hierarchy of the device's cuttable next common cutting point, or an indication message that no next common cutting point exists in the device.
Step h: based on the query results reported by the participant devices, if no cuttable next common cutting point exists in any device, the federation server determines that the end condition of model training is met and executes step b. If a cuttable next common cutting point exists in one or more devices, step i is executed.
Step i: from the hierarchies of the next common cutting points reported by the devices, the federation server selects the deepest and nearest common cutting point as the next common cutting point for all participant devices, where the deepest and nearest point is the one in the deepest common cutting layer whose position in that layer is closest to the common cutting points already cut.
In this embodiment of the present invention, when the points on the initial branch have been cut to the preset cutting depth or can no longer be cut, if the federation server finds that uncut and cuttable common cutting points still exist in the participant devices, it selects the deepest of them as the next common cutting point and repeats the cutting until no uncut and cuttable common cutting point remains in any participant device. Cutting the common cutting points from deep to shallow by cutting depth thus keeps the cutting orderly, avoids missing common cutting points, and improves the accuracy of data processing and the detection effect of the model.
Step 506: the federation server determines that no next common cutting point exists, and constructs the detection model according to the common cutting features at the common cutting points.
In a possible implementation, the federation server may construct the detection model as follows: associate each common cutting point with the common cutting features of the participant devices at that point, connect the common cutting points according to the inclusion relationships of the participant devices' data to be trained at each point, obtain a binary tree model, and use the binary tree model as the detection model.
FIG. 6 is a schematic structural diagram of a detection model according to an embodiment of the present invention. As shown in FIG. 6, when the preset cutting depth is 4, the first cut uses common cutting feature dimension 1 and common cutting feature value 1 to cut any participant device's data at common cutting point 1 into common cutting points 2 and 3. Because the cutting depth at this moment is 2, below the preset depth, the left point of the next common cutting layer (common cutting point 2) can be taken as the next common cutting point. On this basis, the second cut uses dimension 2 and value 2 to cut the data at common cutting point 2 into common cutting points 4 and 5.
Correspondingly, the cutting depth is now 3, still below the preset depth, so the left point of the next layer under common cutting point 2 (common cutting point 4) becomes the next common cutting point. The third cut uses dimension 3 and value 3 to cut the data at common cutting point 4 into point 4's left and right sample spaces. Because the cutting depth is now 4, the preset depth has been reached, and the federation server determines that the current branch can no longer be cut.
Further, the federation server sends a query instruction to the participant devices, determines from the returned query results that the deepest cuttable common cutting point is point 5, and therefore takes point 5 as the next common cutting point. The fourth cut uses dimension 4 and value 4 to cut the data at point 5 into point 5's left and right sample spaces. The cutting depth is again 4, so the federation server determines that the current branch can no longer be cut.
Correspondingly, the federation server re-sends the query instruction, determines from the returned results that the deepest cuttable point is common cutting point 3, and takes it as the next common cutting point. The fifth cut uses dimension 5 and value 5 to cut the data at point 3 into point 3's left and right sample spaces. The cutting depth is 4, so the federation server determines that the current branch can no longer be cut.
The federation server then sends the query instruction to the participant devices again, determines from the returned query results that no cuttable common cutting point remains in any participant device, and therefore determines that the end condition of model training is met.
After determining that cutting is complete, the federation server may first associate each common cutting point with the common cutting features of the participant devices at that point, and then connect the points according to the inclusion relationships of the participant devices' data at each point, obtaining the binary tree model shown in FIG. 6, namely the detection model.
In this embodiment of the present invention, while the federation server jointly determines the common cutting features at the common cutting points with the participant devices, each participant device is also using those features to cut its data at the common cutting points. The federation server and the participant devices thus effectively achieve the synchronized effect of training and detecting at the same time: when the detection model is obtained, the data to be trained in each participant device has also already been divided into the different common cutting points, so the abnormality of that data has also been determined. Clearly, a single round of model training in this embodiment achieves multiple model applications at once, since the process of training one detection model simultaneously detects the data to be trained in every participant device, so the efficiency of model detection is higher.
In this embodiment of the present invention, the federation server 110 may perform only one round of model training with the participant devices, obtaining one detection model, or may perform multiple rounds with them, obtaining multiple detection models. If only one model is trained, each participant may use all of its data to be trained as the data for that round. If multiple models are trained, each participant device may, before every round, select part of its data to be trained as the data for that round. The amounts of data selected by the devices per round may be the same or different, and the data a device uses across rounds need not be identical; this lets the detection model aggregate the data characteristics of different data to be trained and improves its detection effect.
In a possible implementation, before performing model training, the federation server 110 may first deliver a sample-confirmation instruction to each participant device. After receiving the instruction, if a participant device determines that the total amount of its data to be trained is less than or equal to a preset amount, it may use all of that data as the data for this round of model training and as the data at the root common cutting point. Correspondingly, if the total amount is greater than the preset amount, it may first select part of its data as the data for this round and use the selected part as the data at the root common cutting point. By using all the data when it is scarce and selecting part of it when it is plentiful, this implementation fully preserves the sample diversity of the data to be trained while reducing the data volume of each round of model training, improving both the efficiency of data processing and the accuracy of the detection model.
基于此,若仅训练得到一个检测模型,则在上述步骤204中,参与方设备可以通过如下方式检测待检测数据的异常性:
针对于任一待检测数据,参与方设备可以先使用各个公共切割点处的公共切割特征对待检测数据进行切割,确定出待检测数据最终被切到的公共切割点。若最终被切到的公共切割点对应的权重大于第一预设阈值,则确定待检测数据为异常数据,否则确定待检测数据为正常数据。其中,第一预设阈值可以由本领域技术人员根据经验进行设置,或者可以根据实际需要进行设置,具体不作限定。在一个示例中,第一预设阈值可以设置为0.5。举例来说,参照图6所示意的检测模型,可以先使用公共切割点1处的公共切割特征维度1和公共切割特征值1对待检测数据进行切割,若待检测数据在公共切割特征维度1下的特征值小于或等于公共切割特征值1,则可以将待检测数据切割至公共切割点2中。若待检测数据在公共切割特征维度1下的特征值大于公共切割特征值1,则可以将待检测数据切割至公共切割点3中。以待检测数据被切割至公共切割点2中为例,参与方设备可以使用公共切割点2处的公共切割特征维度2和公共切割特征值2对待检测数据进行切割,若待检测数据在公共切割特征维度2下的特征值小于或等于公共切割特征值2,则可以将待检测数据切割至公共切割点4中。若待检测数据在公共切割特征维度2下的特征值大于公共切割特征值2,则可以将待检测数据切割至公共切割点5中。不断执行上述过程,直至待检测数据被切割至不可切割为止。
进一步地,在切割完成后,联邦服务器110可以先确定待检测数据最后被切到的公共切割点,然后获取最后被切到的公共切割点的权重。其中,任一公共切割点的权重和该公共切割点与根公共切割点的距离成反对应关系。待检测数据最终被切割到的公共切割点与根公共切割点的距离越远,说明待检测数据与大部分待训练数据越相似,待检测数据的异常程度越轻,待检测数据最终被切割到的公共切割点与根公共切割点的距离越近,说明待检测数据与大部分待训练数据的差异越大,待检测数据的异常程度越严重。基于此,在获取最后被切到的公共切割点的权重后,联邦服务器110可以判断该权重是否大于第一预设阈值,若是,则确定待检测数据为异常数据,若否,则确定待检测数据为正常数据。
其中,由于同一级公共切割层的各个公共切割点与根公共切割点的距离相同,因此可以为同一级公共切割层的各个公共切割点设置相同的权重。比如为图6中第二级公共切割层的公共切割点2和公共切割点3设置权重0.8,为第三级公共切割层的公共切割点4和公共切割点5设置权重0.3。如此,当第一预设阈值设置为0.5时,若待检测数据最后被切割到公共切割点5的右样本空间中,则确定最后被切到的公共切割点为公共切割点5。由于公共切割点5的权重为0.3(小于0.5),因此待检测数据为正常数据。相应地,若待检测数据最后被切割到公共切割点3的左样本空间中,则确定最后被切到的公共切割点为公共切割点3,由于公共切割点3的权重为0.8(大于0.5),因此待检测数据为异常数据。
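按公共切割层设置权重并与第一预设阈值比较的判定逻辑,可以用如下示意性片段表达(权重的初值与衰减方式为本示例假设,专利仅要求权重随与根公共切割点的距离增大而减小;`layer_weights`、`is_abnormal` 为自拟名称):

```python
def layer_weights(num_layers, w0=0.8, decay=0.5):
    """为每一级公共切割层生成权重:同层共用一个权重,且层级越深(离根越远)权重越小。"""
    return [w0 * (decay ** d) for d in range(num_layers)]

def is_abnormal(final_weight, threshold=0.5):
    """最终被切到的公共切割点的权重大于第一预设阈值,即判定待检测数据为异常数据。"""
    return final_weight > threshold
```
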
相应地,若训练得到多个检测模型,则在上述步骤204中,参与方设备可以通过如下方式检测待检测数据的异常性:
针对于任一待检测数据,参与方设备可以先使用任一次训练对应的检测模型中的各个公共切割点处的公共切割特征对待检测数据进行切割,确定待检测数据在任一次训练对应的检测模型中最终被切到的公共切割点,然后计算待检测数据在各次训练对应的检测模型中最终被切到的公共切割点对应的平均权重。若平均权重大于第二预设阈值,则确定待检测数据为异常数据,否则确定待检测数据为正常数据。其中,第二预设阈值可以由本领域技术人员根据经验进行设置,或者可以根据实际需要进行设置,具体不作限定。在一个示例中,第二预设阈值可以设置为0.5。
需要说明的是,上述仅是一种示例性的说明,并不构成对本方案的限定。在具体实施中,参与方设备也可以计算得到最终被切到的公共切割点对应的加权平均权重,并通过对比加权平均权重和第二预设阈值来确定待检测数据的异常性。其中,加权的权值可以基于多个检测模型的损失函数来确定:检测模型的损失函数越小,说明检测效果越好,可以为该检测模型设置较大的权值;检测模型的损失函数越大,说明检测效果越差,可以为该检测模型设置较小的权值。
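多个检测模型下的平均权重判定与按损失加权的加权平均权重判定,可示意如下(以损失的倒数作归一化权值仅为"损失越小、权值越大"的一种示例性选择,函数名为本示例自拟):

```python
def average_decision(final_weights, threshold=0.5):
    """各次训练的检测模型中,最终被切到的公共切割点权重的平均值与第二预设阈值比较。"""
    return sum(final_weights) / len(final_weights) > threshold

def loss_weighted_decision(final_weights, losses, threshold=0.5):
    """损失越小的检测模型权值越大:此处取损失的倒数并归一化,计算加权平均权重。"""
    inv = [1.0 / loss for loss in losses]
    total = sum(inv)
    score = sum(w * v for w, v in zip(final_weights, inv)) / total
    return score > threshold
```

例如两个检测模型给出的最终切割点权重为 0.8 和 0.2,若前者损失更小(权值更大),加权平均权重更接近 0.8,判定结果也随之偏向异常。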
本发明的上述实施例中,联邦服务器联合各个参与方设备的待训练数据,确定所述各个参与方设备在公共切割点处的公共切割特征;所述公共切割特征为对异常数据与正常数据进行区分的特征;如此,联邦服务器根据所述公共切割点处的公共切割特征,构建得到检测模型,并将所述检测模型下发给所述各个参与方设备,以便于各个参与方设备使用所述检测模型对待检测数据进行检测,确定所述待检测数据是否为异常数据。本发明实施例中,通过联合各个参与方设备的待训练数据确定公共切割点处的公共切割特征,使得公共切割特征能够同时反映各个参与方设备的数据特性,如此,基于公共切割特征构建得到的检测模型能够准确地对各个参与方设备的异常数据进行检测,检测模型的通用性较好,异常检测的准确率较高。
针对上述方法流程,本发明实施例还提供一种数据处理装置,该装置的具体内容可以参照上述方法实施。
图7为本发明实施例提供的一种数据处理装置的结构示意图,如图7所示,该装置包括:
确定模块701,用于联合各个参与方设备的待训练数据,确定所述各个参与方设备在公共切割点处的公共切割特征;所述公共切割特征为对异常数据与正常数据进行区分的特征;
构建模块702,用于根据所述公共切割点处的公共切割特征,构建得到检测模型;
收发模块703,用于将所述检测模型下发给所述各个参与方设备;所述各个参与方设备还用于使用所述检测模型对待检测数据进行检测,以确定所述待检测数据是否为异常数据。
可选地,所述公共切割特征可以包括公共切割特征维度和公共切割特征值。在这种情况下,所述确定模块701具体用于:先根据所述各个参与方设备在所述公共切割点处的待训练数据的特征维度,确定所述公共切割点处的公共切割特征维度,再将所述公共切割点处的公共切割特征维度下发给所述各个参与方设备,以使各个参与方设备基于所述公共切割点处的公共切割特征维度确定所述参与方设备在所述公共切割特征维度下的切割特征值,之后,接收所述各个参与方设备上报的所述各个参与方设备在所述公共切割特征维度下的切割特征值,并根据所述各个参与方设备在所述公共切割特征维度下的切割特征值,确定所述公共切割特征值。
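其中"根据各个参与方设备上报的切割特征值确定公共切割特征值"的聚合步骤可示意如下(此处以均值聚合为例,专利并未限定具体聚合算法,`aggregate_cut_value` 为自拟名称):

```python
def aggregate_cut_value(reported_values):
    """联邦服务器汇聚各参与方在公共切割特征维度下上报的切割特征值,取均值作为公共切割特征值。"""
    return sum(reported_values) / len(reported_values)
```
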
可选地,所述确定模块701具体用于:根据所述各个参与方设备在所述公共切割点处的待训练数据的特征维度,确定所述各个参与方设备在所述公共切割点处的待训练数据的公共特征维度,从所述公共特征维度中选取所述公共切割特征维度。
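上述"先求公共特征维度、再从中选取公共切割特征维度"的过程可示意如下(`pick_common_cut_dim` 为本示例自拟的名称,以各参与方特征维度的交集为公共特征维度,随机选取仅为一种示例策略):

```python
import random

def pick_common_cut_dim(party_dims, seed=None):
    """求各参与方待训练数据特征维度的交集(公共特征维度),并从中选取公共切割特征维度。"""
    common = sorted(set.intersection(*(set(dims) for dims in party_dims)))
    return common, random.Random(seed).choice(common)
```
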
可选地,所述构建模块702具体用于:关联任一公共切割点与所述公共切割点处的公共切割特征,并根据所述各个参与方设备在所述各个公共切割点处的待训练数据的包含关系,连接所述各个公共切割点,得到二叉树模型,将所述二叉树模型作为所述检测模型。
可选地,所述确定模块701具体用于:联合各个参与方设备在任一次训练中的待训练数据,确定所述各个参与方设备在所述任一次训练对应的公共切割点处的公共切割特征,根据所述公共切割点处的公共切割特征,构建得到所述任一次训练对应的检测模型;
所述收发模块703具体用于:将各次训练对应的检测模型下发给所述各个参与方设备,以使所述各个参与方设备使用所述各次训练对应的检测模型对所述待检测数据进行检测,以确定所述待检测数据是否为异常数据。
图8为本发明实施例提供的另一种数据处理装置的结构示意图,如图8所示,该装置包括:
收发模块801,用于接收联邦服务器发送的检测模型;所述检测模型为所述联邦服务器联合各个参与方设备的待训练数据,确定所述各个参与方设备在公共切割点处的公共切割特征,根据所述公共切割点处的公共切割特征构建得到的;所述公共切割特征用于区分异常数据与正常数据;
检测模块802,用于使用所述检测模型对待检测数据进行检测,以确定所述待检测数据是否为异常数据。
可选地,所述收发模块801接收联邦服务器发送的检测模型之前,还用于:接收联邦服务器下发的公共切割点处的公共切割特征维度,基于所述公共切割点处的公共切割特征维度确定所述参与方设备在所述公共切割特征维度下的切割特征值,将所述参与方设备在所述公共切割特征维度下的切割特征值上报给所述联邦服务器;所述联邦服务器还用于根据所述各个参与方设备在所述公共切割特征维度下的切割特征值,确定所述公共切割特征值。其中,公共切割点处的公共切割特征维度为所述联邦服务器根据各个参与方设备在所述公共切割点处的待训练数据的特征维度确定的。
可选地,所述检测模块802具体用于:使用各个公共切割点处的公共切割特征对所述待检测数据进行切割,确定待检测数据最终被切到的公共切割点,若所述最终被切到的公共切割点对应的权重大于第一预设阈值,则确定所述待检测数据为异常数据,否则确定所述待检测数据为正常数据。
可选地,所述收发模块801具体用于:接收所述联邦服务器发送的各次训练对应的检测模型。对应的,所述检测模块802具体用于:使用任一次训练对应的检测模型中的各个公共切割点处的公共切割特征对待检测数据进行切割,确定待检测数据在所述任一次训练对应的检测模型中最终被切到的公共切割点,计算所述待检测数据在各次训练对应的检测模型中最终被切到的公共切割点对应的平均权重,若所述平均权重大于第二预设阈值,则确定所述待检测数据为异常数据,否则确定所述待检测数据为正常数据。
从上述内容可以看出:本发明的上述实施例中,联邦服务器联合各个参与方设备的待训练数据,确定所述各个参与方设备在公共切割点处的公共切割特征;所述公共切割特征为对异常数据与正常数据进行区分的特征;如此,联邦服务器根据所述公共切割点处的公共切割特征,构建得到检测模型,并将所述检测模型下发给所述各个参与方设备,以便于各个参与方设备使用所述检测模型对待检测数据进行检测,确定所述待检测数据是否为异常数据。本发明实施例中,通过联合各个参与方设备的待训练数据确定公共切割点处的公共切割特征,使得公共切割特征能够同时反映各个参与方设备的数据特性,如此,基于公共切割特征构建得到的检测模型能够准确地对各个参与方设备的异常数据进行检测,检测模型的通用性较好,异常检测的准确率较高。
基于同一发明构思,本发明实施例提供的一种计算设备,包括至少一个处理单元以及至少一个存储单元,其中,所述存储单元存储有计算机程序,当所述程序被所述处理单元执行时,使得所述处理单元执行上述图2至图5任意所述的方法。
基于同一发明构思,本发明实施例提供的一种计算机可读存储介质,其存储有可由计算设备执行的计算机程序,当所述程序在所述计算设备上运行时,使得所述计算设备执行图2至图5任意所述的方法。
基于相同的技术构思,本发明实施例提供了一种终端设备,如图9所示,包括至少一个处理器901,以及与至少一个处理器连接的存储器902,本发明实施例中不限定处理器901与存储器902之间的具体连接介质,图9中以处理器901和存储器902之间通过总线连接为例。总线可以分为地址总线、数据总线、控制总线等。
在本发明实施例中,存储器902存储有可被至少一个处理器901执行的指令,至少一个处理器901通过执行存储器902存储的指令,可以执行前述的数据处理方法中所包括的步骤。
其中,处理器901是终端设备的控制中心,可以利用各种接口和线路连接终端设备的各个部分,通过运行或执行存储在存储器902内的指令以及调用存储在存储器902内的数据,从而实现数据处理。可选的,处理器901可包括一个或多个处理单元,处理器901可集成应用处理器和调制解调处理器,其中,应用处理器主要处理操作系统、用户界面和应用程序等,调制解调处理器主要处理下发指令。可以理解的是,上述调制解调处理器也可以不集成到处理器901中。在一些实施例中,处理器901和存储器902可以在同一芯片上实现,在一些实施例中,它们也可以在独立的芯片上分别实现。
处理器901可以是通用处理器,例如中央处理器(CPU)、数字信号处理器、专用集成电路(Application Specific Integrated Circuit,ASIC)、现场可编程门阵列或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件,可以实现或者执行本发明实施例中公开的各方法、步骤及逻辑框图。通用处理器可以是微处理器或者任何常规的处理器等。结合数据处理实施例所公开的方法的步骤可以直接体现为硬件处理器执行完成,或者用处理器中的硬件及软件模块组合执行完成。
存储器902作为一种非易失性计算机可读存储介质,可用于存储非易失性软件程序、非易失性计算机可执行程序以及模块。存储器902可以包括至少一种类型的存储介质,例如可以包括闪存、硬盘、多媒体卡、卡型存储器、随机访问存储器(Random Access Memory,RAM)、静态随机访问存储器(Static Random Access Memory,SRAM)、可编程只读存储器(Programmable Read Only Memory,PROM)、只读存储器(Read Only Memory,ROM)、带电可擦除可编程只读存储器(Electrically Erasable Programmable Read-Only Memory,EEPROM)、磁性存储器、磁盘、光盘等等。存储器902是能够用于携带或存储具有指令或数据结构形式的期望的程序代码并能够由计算机存取的任何其他介质,但不限于此。本发明实施例中的存储器902还可以是电路或者其它任意能够实现存储功能的装置,用于存储程序指令和/或数据。
基于相同的技术构思,本发明实施例提供了一种后端设备,如图10所示,包括至少一个处理器1001,以及与至少一个处理器连接的存储器1002,本发明实施例中不限定处理器1001与存储器1002之间的具体连接介质,图10中以处理器1001和存储器1002之间通过总线连接为例。总线可以分为地址总线、数据总线、控制总线等。
在本发明实施例中,存储器1002存储有可被至少一个处理器1001执行的指令,至少一个处理器1001通过执行存储器1002存储的指令,可以执行前述的数据处理方法中所包括的步骤。
其中,处理器1001是后端设备的控制中心,可以利用各种接口和线路连接后端设备的各个部分,通过运行或执行存储在存储器1002内的指令以及调用存储在存储器1002内的数据,从而实现数据处理。可选的,处理器1001可包括一个或多个处理单元,处理器1001可集成应用处理器和调制解调处理器,其中,应用处理器主要处理操作系统、应用程序等,调制解调处理器主要对接收到的指令进行解析以及对接收到的结果进行解析。可以理解的是,上述调制解调处理器也可以不集成到处理器1001中。在一些实施例中,处理器1001和存储器1002可以在同一芯片上实现,在一些实施例中,它们也可以在独立的芯片上分别实现。
处理器1001可以是通用处理器,例如中央处理器(CPU)、数字信号处理器、专用集成电路(Application Specific Integrated Circuit,ASIC)、现场可编程门阵列或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件,可以实现或者执行本发明实施例中公开的各方法、步骤及逻辑框图。通用处理器可以是微处理器或者任何常规的处理器等。结合数据处理实施例所公开的方法的步骤可以直接体现为硬件处理器执行完成,或者用处理器中的硬件及软件模块组合执行完成。
存储器1002作为一种非易失性计算机可读存储介质,可用于存储非易失性软件程序、非易失性计算机可执行程序以及模块。存储器1002可以包括至少一种类型的存储介质,例如可以包括闪存、硬盘、多媒体卡、卡型存储器、随机访问存储器(Random Access Memory,RAM)、静态随机访问存储器(Static Random Access Memory,SRAM)、可编程只读存储器(Programmable Read Only Memory,PROM)、只读存储器(Read Only Memory,ROM)、带电可擦除可编程只读存储器(Electrically Erasable Programmable Read-Only Memory,EEPROM)、磁性存储器、磁盘、光盘等等。存储器1002是能够用于携带或存储具有指令或数据结构形式的期望的程序代码并能够由计算机存取的任何其他介质,但不限于此。本发明实施例中的存储器1002还可以是电路或者其它任意能够实现存储功能的装置,用于存储程序指令和/或数据。
本领域内的技术人员应明白,本发明的实施例可提供为方法、或计算机程序产品。因此,本发明可采用完全硬件实施例、完全软件实施例、或结合软件和硬件方面的实施例的形式。而且,本发明可采用在一个或多个其中包含有计算机可用程序代码的计算机可用存储介质(包括但不限于磁盘存储器、CD-ROM、光学存储器等)上实施的计算机程序产品的形式。
本发明是参照根据本发明实施例的方法、设备(系统)、和计算机程序产品的流程图和/或方框图来描述的。应理解可由计算机程序指令实现流程图和/或方框图中的每一流程和/或方框、以及流程图和/或方框图中的流程和/或方框的结合。可提供这些计算机程序指令到通用计算机、专用计算机、嵌入式处理机或其他可编程数据处理设备的处理器以产生一个机器,使得通过计算机或其他可编程数据处理设备的处理器执行的指令产生用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的装置。
这些计算机程序指令也可存储在能引导计算机或其他可编程数据处理设备以特定方式工作的计算机可读存储器中,使得存储在该计算机可读存储器中的指令产生包括指令装置的制造品,该指令装置实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能。
这些计算机程序指令也可装载到计算机或其他可编程数据处理设备上,使得在计算机或其他可编程设备上执行一系列操作步骤以产生计算机实现的处理,从而在计算机或其他可编程设备上执行的指令提供用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的步骤。
尽管已描述了本发明的优选实施例,但本领域内的技术人员一旦得知了基本创造性概念,则可对这些实施例作出另外的变更和修改。所以,所附权利要求意欲解释为包括优选实施例以及落入本发明范围的所有变更和修改。
显然,本领域的技术人员可以对本发明进行各种改动和变型而不脱离本发明的精神和范围。这样,倘若本发明的这些修改和变型属于本发明权利要求及其等同技术的范围之内,则本发明也意图包含这些改动和变型在内。

Claims (20)

  1. 一种数据处理方法,其特征在于,应用于联邦服务器,所述方法包括:
    联合各个参与方设备的待训练数据,确定所述各个参与方设备在公共切割点处的公共切割特征;所述公共切割特征为对异常数据与正常数据进行区分的特征;
    根据所述公共切割点处的公共切割特征,构建得到检测模型;
    将所述检测模型下发给所述各个参与方设备;所述各个参与方设备还用于使用所述检测模型对待检测数据进行检测,以确定所述待检测数据是否为异常数据。
  2. 根据权利要求1所述的方法,其特征在于,所述公共切割特征包括公共切割特征维度和公共切割特征值;
    所述联合各个参与方设备的待训练数据,确定所述各个参与方设备在公共切割点处的公共切割特征,包括:
    根据所述各个参与方设备在所述公共切割点处的待训练数据的特征维度,确定所述公共切割点处的公共切割特征维度;
    将所述公共切割点处的公共切割特征维度下发给所述各个参与方设备,所述参与方设备还用于基于所述公共切割点处的公共切割特征维度确定所述参与方设备在所述公共切割特征维度下的切割特征值;
    接收所述各个参与方设备上报的所述各个参与方设备在所述公共切割特征维度下的切割特征值,并根据所述各个参与方设备在所述公共切割特征维度下的切割特征值,确定所述公共切割特征值。
  3. 根据权利要求2所述的方法,其特征在于,所述根据所述各个参与方设备在所述公共切割点处的待训练数据的特征维度,确定所述公共切割点处的公共切割特征维度,包括:
    根据所述各个参与方设备在所述公共切割点处的待训练数据的特征维度,确定所述各个参与方设备在所述公共切割点处的待训练数据的公共特征维度;从所述公共特征维度中选取所述公共切割特征维度。
  4. 根据权利要求1至3中任一项所述的方法,其特征在于,所述根据所述公共切割点处的公共切割特征,构建得到检测模型,包括:
    关联任一公共切割点与所述公共切割点处的公共切割特征;
    根据所述各个参与方设备在所述各个公共切割点处的待训练数据的包含关系,连接所述各个公共切割点,得到二叉树模型,将所述二叉树模型作为所述检测模型。
  5. 根据权利要求1至3中任一项所述的方法,其特征在于,所述联合各个参与方设备的待训练数据,确定所述各个参与方设备在公共切割点处的公共切割特征,包括:
    联合各个参与方设备在任一次训练中的待训练数据,确定所述各个参与方设备在所述任一次训练对应的公共切割点处的公共切割特征,根据所述公共切割点处的公共切割特征,构建得到所述任一次训练对应的检测模型;
    所述将所述检测模型下发给所述各个参与方设备,包括:
    将各次训练对应的检测模型下发给所述各个参与方设备,以使所述各个参与方设备使用所述各次训练对应的检测模型对所述待检测数据进行检测,以确定所述待检测数据是否为异常数据。
  6. 一种数据处理方法,其特征在于,应用于参与方设备,所述方法包括:
    接收联邦服务器发送的检测模型;所述检测模型为所述联邦服务器联合各个参与方设备的待训练数据,确定所述各个参与方设备在公共切割点处的公共切割特征,根据所述公共切割点处的公共切割特征构建得到的;所述公共切割特征用于区分异常数据与正常数据;
    使用所述检测模型对待检测数据进行检测,以确定所述待检测数据是否为异常数据。
  7. 根据权利要求6所述的方法,其特征在于,所述接收联邦服务器发送的检测模型之前,还包括:
    接收联邦服务器下发的公共切割点处的公共切割特征维度;所述公共切割点处的公共切割特征维度为所述联邦服务器根据各个参与方设备在所述公共切割点处的待训练数据的特征维度确定的;
    基于所述公共切割点处的公共切割特征维度确定所述参与方设备在所述公共切割特征维度下的切割特征值;
    将所述参与方设备在所述公共切割特征维度下的切割特征值上报给所述联邦服务器;所述联邦服务器还用于根据所述各个参与方设备在所述公共切割特征维度下的切割特征值,确定所述公共切割特征值。
  8. 根据权利要求6或7任一项所述的方法,其特征在于,所述使用所述检测模型对待检测数据进行检测,以确定所述待检测数据是否为异常数据,包括:
    使用各个公共切割点处的公共切割特征对所述待检测数据进行切割,确定待检测数据最终被切到的公共切割点;
    若所述最终被切到的公共切割点对应的权重大于第一预设阈值,则确定所述待检测数据为异常数据,否则确定所述待检测数据为正常数据。
  9. 根据权利要求6或7任一项所述的方法,其特征在于,所述接收联邦服务器发送的检测模型,包括:
    接收所述联邦服务器发送的各次训练对应的检测模型;
    所述使用所述检测模型对待检测数据进行检测,以确定所述待检测数据是否为异常数据,包括:
    使用任一次训练对应的检测模型中的各个公共切割点处的公共切割特征对待检测数据进行切割,确定待检测数据在所述任一次训练对应的检测模型中最终被切到的公共切割点;
    计算所述待检测数据在各次训练对应的检测模型中最终被切到的公共切割点对应的平均权重,若所述平均权重大于第二预设阈值,则确定所述待检测数据为异常数据,否则确定所述待检测数据为正常数据。
  10. 一种数据处理装置,其特征在于,所述装置包括:
    确定模块,用于联合各个参与方设备的待训练数据,确定所述各个参与方设备在公共切割点处的公共切割特征;所述公共切割特征为对异常数据与正常数据进行区分的特征;
    构建模块,用于根据所述公共切割点处的公共切割特征,构建得到检测模型;
    收发模块,用于将所述检测模型下发给所述各个参与方设备;所述各个参与方设备还用于使用所述检测模型对待检测数据进行检测,以确定所述待检测数据是否为异常数据。
  11. 根据权利要求10所述的装置,其特征在于,所述公共切割特征包括公共切割特征维度和公共切割特征值;
    所述确定模块具体用于:
    根据所述各个参与方设备在所述公共切割点处的待训练数据的特征维度,确定所述公共切割点处的公共切割特征维度;
    将所述公共切割点处的公共切割特征维度下发给所述各个参与方设备,所述参与方设备还用于基于所述公共切割点处的公共切割特征维度确定所述参与方设备在所述公共切割特征维度下的切割特征值;
    接收所述各个参与方设备上报的所述各个参与方设备在所述公共切割特征维度下的切割特征值,并根据所述各个参与方设备在所述公共切割特征维度下的切割特征值,确定所述公共切割特征值。
  12. 根据权利要求11所述的装置,其特征在于,所述确定模块具体用于:
    根据所述各个参与方设备在所述公共切割点处的待训练数据的特征维度,确定所述各个参与方设备在所述公共切割点处的待训练数据的公共特征维度;从所述公共特征维度中选取所述公共切割特征维度。
  13. 根据权利要求10至12中任一项所述的装置,其特征在于,所述构建模块具体用于:
    关联任一公共切割点与所述公共切割点处的公共切割特征;
    根据所述各个参与方设备在所述各个公共切割点处的待训练数据的包含关系,连接所述各个公共切割点,得到二叉树模型,将所述二叉树模型作为所述检测模型。
  14. 根据权利要求10至12中任一项所述的装置,其特征在于,所述确定模块具体用于:
    联合各个参与方设备在任一次训练中的待训练数据,确定所述各个参与方设备在所述任一次训练对应的公共切割点处的公共切割特征,根据所述公共切割点处的公共切割特征,构建得到所述任一次训练对应的检测模型;
    所述收发模块具体用于:
    将各次训练对应的检测模型下发给所述各个参与方设备,以使所述各个参与方设备使用所述各次训练对应的检测模型对所述待检测数据进行检测,以确定所述待检测数据是否为异常数据。
  15. 一种数据处理装置,其特征在于,所述装置包括:
    收发模块,用于接收联邦服务器发送的检测模型;所述检测模型为所述联邦服务器联合各个参与方设备的待训练数据,确定所述各个参与方设备在公共切割点处的公共切割特征,根据所述公共切割点处的公共切割特征构建得到的;所述公共切割特征用于区分异常数据与正常数据;
    检测模块,用于使用所述检测模型对待检测数据进行检测,以确定所述待检测数据是否为异常数据。
  16. 根据权利要求15所述的装置,其特征在于,在所述收发模块接收联邦服务器发送的检测模型之前,所述收发模块还用于:
    接收联邦服务器下发的公共切割点处的公共切割特征维度;所述公共切割点处的公共切割特征维度为所述联邦服务器根据各个参与方设备在所述公共切割点处的待训练数据的特征维度确定的;
    基于所述公共切割点处的公共切割特征维度确定所述参与方设备在所述公共切割特征维度下的切割特征值;
    将所述参与方设备在所述公共切割特征维度下的切割特征值上报给所述联邦服务器;所述联邦服务器还用于根据所述各个参与方设备在所述公共切割特征维度下的切割特征值,确定所述公共切割特征值。
  17. 根据权利要求15或16任一项所述的装置,其特征在于,所述检测模块具体用于:
    使用各个公共切割点处的公共切割特征对所述待检测数据进行切割,确定待检测数据最终被切到的公共切割点;
    若所述最终被切到的公共切割点对应的权重大于第一预设阈值,则确定所述待检测数据为异常数据,否则确定所述待检测数据为正常数据。
  18. 根据权利要求15或16任一项所述的装置,其特征在于,所述收发模块具体用于:
    接收所述联邦服务器发送的各次训练对应的检测模型;
    所述检测模块具体用于:
    使用任一次训练对应的检测模型中的各个公共切割点处的公共切割特征对待检测数据进行切割,确定待检测数据在所述任一次训练对应的检测模型中最终被切到的公共切割点;
    计算所述待检测数据在各次训练对应的检测模型中最终被切到的公共切割点对应的平均权重,若所述平均权重大于第二预设阈值,则确定所述待检测数据为异常数据,否则确定所述待检测数据为正常数据。
  19. 一种计算设备,其特征在于,包括至少一个处理单元以及至少一个存储单元,其中,所述存储单元存储有计算机程序,当所述程序被所述处理单元执行时,使得所述处理单元执行权利要求1~9任一权利要求所述的方法。
  20. 一种计算机可读存储介质,其特征在于,其存储有可由计算设备执行的计算机程序,当所述程序在所述计算设备上运行时,使得所述计算设备执行权利要求1~9任一权利要求所述的方法。
PCT/CN2020/129124 2020-01-21 2020-11-16 一种数据处理方法及装置 WO2021147487A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010072413.6A CN111291801B (zh) 2020-01-21 2020-01-21 一种数据处理方法及装置
CN202010072413.6 2020-01-21




