CN114692098A - Intelligent software behavior control method based on blockchain and federated learning
- Publication number
- CN114692098A CN202210610745.4A
- Authority
- CN
- China
- Prior art keywords
- model
- software
- node
- software behavior
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/10—Protecting distributed programs or content, e.g. vending or licensing of copyrighted material ; Digital rights management [DRM]
- G06F21/12—Protecting executable software
- G06F21/121—Restricting unauthorised execution of programs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6245—Protecting personal data, e.g. for financial or medical purposes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/64—Protecting data integrity, e.g. using checksums, certificates or signatures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention belongs to the technical field of the Internet and discloses an intelligent software behavior control method based on blockchain and federated learning, comprising the following steps: (1) constructing a single-node software behavior classification model; (2) constructing a consortium chain for the multi-node joint training task of the software behavior classification model, taking the model constructed in step (1) as the local training model of each node, and combining blockchain and federated learning so that the nodes on the consortium chain cooperatively train the software behavior classification model without being driven by a central node; (3) designing an intelligent software behavior control client tool that senses, in real time, the software behavior data generated while a user uses a computer, feeds the data into the software behavior classification model, and intelligently controls the software behavior according to the classification result. The invention improves the classification accuracy of software pop-up behaviors while protecting the privacy of user data, and enables intelligent control of pop-up behaviors according to users' software preferences.
Description
Technical Field
The invention belongs to the technical field of the Internet, and in particular relates to an intelligent software behavior control method based on blockchain and federated learning.
Background
At present, some software providers bundle large numbers of advertisement and information pop-up windows with the application software they develop, which degrades users' experience of operating their systems. Software pop-ups obstruct the user's view, interfere with operation, reduce the efficiency of computer use, and slow down the network, all of which greatly affect the user experience. Some security software that can intercept pop-ups is already on the market. For example, the pop-up interception method disclosed in patent No. CN202010503928.7 monitors pop-up events through a target hook function, obtains at least one attribute feature of the pop-up corresponding to the event from the process that triggered it, determines from that feature whether the pop-up is an advertisement, and, if so, refuses to respond to the pop-up event. This method can intercept an advertisement pop-up before it is displayed and thereby improves the user experience. However, it judges advertisement pop-up behavior by fixed rules, whereas the characteristics of software pop-ups change continuously, so it is prone to missed detections and misjudgments.
In existing schemes, pop-up control mainly depends on rules configured by the user, who must set the pop-up control options manually. No mechanism exists for sharing software pop-up behavior information among multiple users, so pop-up behavior cannot be controlled intelligently on the basis of the software preferences of a user group.
For the detection and classification of software behaviors, traditional machine learning and deep learning algorithms can learn the relevant features from large amounts of data and then classify software behaviors accurately. However, large volumes of high-quality software behavior training data are lacking, and research on software behavior detection and classification faces data islands: the software behavior data held by each participant cannot be used effectively when training machine learning models, which hinders improvement of the models. A software behavior sensing method therefore needs to be designed to collect data such as the user interface content and processes of the pop-ups of various software, providing data support for the classification and control of software behaviors.
In addition, collected software behavior data involve user privacy: screenshots of software windows may contain users' personal preference information, and window processes may reveal how users use software. These data are scattered across nodes. A single node training a model on its limited data achieves limited results, while gathering the data of all nodes for centralized training may leak private information. The data islands therefore need to be broken, and model accuracy improved, while data privacy is protected.
Disclosure of Invention
To address the defects of the prior art, the invention provides an intelligent software behavior control method based on blockchain and federated learning. It combines machine learning, federated learning and blockchain technology, draws on the data of multiple parties to cooperatively train a software behavior classification model while protecting the privacy of users' software behavior data, and controls software pop-up behavior in a timely manner according to the classification result. Compared with the conventional approach of centralizing data to train a model, the method improves the classification accuracy of software pop-up behaviors while protecting user data privacy, and also supports pop-up control strategies shared among users.
To solve the above technical problems, the invention adopts the following technical solution:
An intelligent software behavior control method based on blockchain and federated learning, comprising the following steps:
(1) constructing a single-node software behavior classification model, including sensing and storing software behaviors, constructing a software behavior data set, identifying the content of software pop-up user interfaces, and designing the network structure of the software behavior classification model;
(2) constructing a consortium chain for the multi-node joint training task of the software behavior classification model, taking the model constructed in step (1) as the model trained locally at each node, and combining blockchain and federated learning so that the nodes on the consortium chain cooperatively train the model without being driven by a central node, wherein the consortium chain architecture includes designing the joint training task, defining the on-chain block structure and the joint training flow, designing the model aggregation algorithm, and writing the smart contracts used in the joint training process;
(3) designing an intelligent software behavior control client tool that senses in real time the software behavior data generated while a user uses a computer, feeds the data into the constructed software behavior classification model to classify the software behaviors, and intelligently controls the software behaviors according to the classification result.
Further, the specific steps of step (1) are as follows:
(1.1) software behavior sensing and storage: analyzing the automatic pop-up mechanisms of different pop-up software, sensing multi-dimensional behavior data of various software in real time, and storing the data;
(1.2) constructing a software behavior data set: cleaning the stored multi-dimensional software behavior data, labeling the pop-up software behavior data according to whether the user likes the content, and using the labeled data as the final training and test data set;
(1.3) identifying the content of the software pop-up user interface: recognizing the text in the software pop-up user interface and using it as part of the input of the subsequent software behavior classification model;
(1.4) constructing the software behavior classification model: mapping the recognition result of the pop-up user interface content, the process behavior data and the network behavior data each into a feature vector, fusing the feature vectors with different weights, performing feature learning through a deep neural network, and finally feeding the learned feature vector into a probability output layer that outputs the probability of each category, thereby classifying software pop-up behaviors.
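Step (1.4) can be pictured with a small sketch. The source does not say how the weighted fusion is carried out, so the version below assumes each feature vector is scaled by a per-source weight and the results are concatenated; a single linear layer stands in for the deep neural network. The names `MODALITIES`, `fuse_features` and `classify`, and all numeric values, are hypothetical.

```python
import math
from typing import Dict, List, Sequence

# Hypothetical source names and order; the text only says that UI text,
# process behavior and network behavior features are fused with weights.
MODALITIES = ("ui_text", "process", "network")


def fuse_features(features: Dict[str, Sequence[float]],
                  weights: Dict[str, float]) -> List[float]:
    """Scale each source's vector by its weight, then concatenate (assumed fusion)."""
    fused: List[float] = []
    for name in MODALITIES:
        fused.extend(weights[name] * value for value in features[name])
    return fused


def softmax(logits: Sequence[float]) -> List[float]:
    """Probability output layer: map raw scores to class probabilities."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(fused: Sequence[float],
             layer: Sequence[Sequence[float]],
             bias: Sequence[float]) -> List[float]:
    """One linear layer standing in for the deep network, followed by softmax."""
    logits = [sum(w * x for w, x in zip(row, fused)) + b
              for row, b in zip(layer, bias)]
    return softmax(logits)


features = {"ui_text": [1.0, 0.0], "process": [0.5], "network": [0.2]}
weights = {"ui_text": 0.5, "process": 0.3, "network": 0.2}
fused = fuse_features(features, weights)
# Two classes with made-up layer parameters, purely for illustration.
probs = classify(fused,
                 layer=[[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]],
                 bias=[0.0, 0.0])
```

In a real model the fusion weights and the layer parameters would be learned or tuned rather than fixed by hand.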
Further, step (2) designs, on the basis of Hyperledger Fabric, a consortium chain architecture for the joint training task of the software behavior classification model and establishes a decentralized joint training framework. To ensure the privacy of software behavior data and the security of the model, the framework uses a consortium chain as its blockchain type. Compared with a public chain, a consortium chain imposes stricter identity authentication on each node, which improves the security of the system. To spare each node the storage burden of on-chain data and to protect the privacy of users' local data, the consortium chain records only the model training task information, the summary information of each node's local model, and the shared model parameters; no node needs to upload its local software behavior data. Nodes in the consortium chain act as either task coordination nodes or data providing nodes. A task coordination node issues the joint training task, initializes the global model, and finally completes the task by extracting the parameters of the global model. A data providing node responds to the joint training task issued by the task coordination node, trains the model on its local software behavior data set, and completes the task by uploading the summary information and model parameters of its local model.
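As an illustration of the constraint that only task information, local-model summary information and shared parameters go on the chain, the sketch below shows what one such record could look like. The field names, the SHA-256 digest and the hex encoding are assumptions made for this example; the source does not define a record layout or name an encryption scheme, so the ciphertext is treated as opaque bytes.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ModelUpdateRecord:
    """Hypothetical on-chain record: summary plus shared parameters, no raw data."""
    task_id: str
    node_id: str
    round_index: int
    accuracy: float
    loss: float
    sample_count: int
    encrypted_params_hex: str  # ciphertext produced off-chain; scheme unspecified
    params_digest: str         # SHA-256 of the ciphertext, for integrity checks


def build_record(task_id: str, node_id: str, round_index: int,
                 accuracy: float, loss: float, sample_count: int,
                 encrypted_params: bytes) -> ModelUpdateRecord:
    """Package a node's round result; local behavior data never enters the record."""
    digest = hashlib.sha256(encrypted_params).hexdigest()
    return ModelUpdateRecord(task_id, node_id, round_index, accuracy, loss,
                             sample_count, encrypted_params.hex(), digest)


record = build_record("task-1", "node-A", 1, 0.91, 0.27, 1200, b"\x01\x02\x03")
payload = json.dumps(asdict(record), sort_keys=True)  # what a transaction might carry
```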
On a decentralized network in which the consortium chain has been initialized and constructed, the joint training process of the software behavior classification model proceeds as follows:
(2.1) The task coordination node publishes a description of the training task to the consortium chain system, initializes the global model parameters, and then adds the software behavior classification joint training task; the initial global model is the software behavior classification model constructed in step (1).
(2.2) The data providing nodes participating in the task form a set, obtain the joint training task information and the global model parameters from the consortium chain, decrypt them, and perform federated learning on their local software behavior data sets, cooperatively training the same machine learning model.
(2.3) After each round of iterative training, each data providing node initiates a transaction whose content is the summary data of its local model (such as model accuracy and loss value) and the encrypted model parameters. A smart contract then evaluates each node's contribution to the model, and only the model transactions that meet the requirements are retained on the consortium chain.
(2.4) After each round of iterative training, a smart contract calculates the contribution weight of each node to the global model, aggregates the model parameters on the basis of these weights, updates the global model, and determines whether the global model has converged or the maximum number of iterations has been reached. If so, the joint training task is judged complete and the task coordination node is notified; if not, the aggregated global model is packed into a block, the next round of iterative training is issued, and each node starts the next round from step (2.2).
(2.5) The task coordination node extracts the global model parameters and verifies the validity of the model, completing the joint training task.
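The round structure of steps (2.1) to (2.5) can be sketched as a loop that stops on convergence or at a maximum round count. This is a toy under stated assumptions: the model is a single number, local training is one gradient step on squared error, and aggregation is a plain average rather than the contribution-weighted aggregation the method describes. The chain, the encryption and the contracts are left out entirely, and `local_update` and `run_joint_training` are hypothetical names.

```python
from typing import List, Sequence, Tuple


def local_update(global_w: float, data: Sequence[float], lr: float = 0.5) -> float:
    """One gradient step on mean squared error against the node's private data."""
    grad = sum(2 * (global_w - x) for x in data) / len(data)
    return global_w - lr * grad


def run_joint_training(node_data: List[Sequence[float]],
                       init_w: float = 0.0,
                       max_rounds: int = 50,
                       tol: float = 1e-6) -> Tuple[float, int]:
    """Repeat local training and aggregation until converged or out of rounds."""
    global_w = init_w  # step (2.1): the coordinator initializes the global model
    rounds_run = 0
    for round_index in range(1, max_rounds + 1):
        rounds_run = round_index
        # Step (2.2): every data providing node trains on its own data only.
        local_ws = [local_update(global_w, data) for data in node_data]
        # Step (2.4): aggregate; a plain average is used here for brevity.
        new_w = sum(local_ws) / len(local_ws)
        converged = abs(new_w - global_w) < tol
        global_w = new_w
        if converged:
            break
    # Step (2.5): the coordinator extracts the final global parameters.
    return global_w, rounds_run


final_w, rounds = run_joint_training([[1.0, 3.0], [5.0]])
```

With the learning rate of 0.5 each node lands exactly on the mean of its own data, so the toy run settles on the average of the node means after two rounds.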
Further, the model aggregation algorithm in step (2) adjusts the aggregation weight of each local model according to the summary information of each data providing node's local model, thereby adjusting the share of each local model's parameters in the updated global model parameters and increasing the share of high-quality model parameters in the aggregated model. The contribution of each data providing node to the model is determined by the node's model loss in the $k$-th local training iteration. The cross-entropy $L$ is used to evaluate the local model loss of each node, and the model contribution weight of the model aggregation algorithm is defined as follows:
[equation not preserved in this text; the symbols used in the definitions below are placeholders]
where $L_i$ and $L_j$ are the cross-entropy loss functions of node $i$ and node $j$ respectively, $x$ is the input and $y$ is the desired output. The process of updating the global model parameters in the model aggregation algorithm based on the model contribution weights can be expressed as:
[equation not preserved in this text]
where $w_{k-1}$ denotes the global model parameters after round $k-1$ of aggregation, $n$ is the number of nodes participating in the current round of model aggregation, $p_i^k$ is the model contribution weight of node $i$ in round $k$ of model aggregation, $w_i^k$ is the model parameters of node $i$ after its local update in round $k$, $g_i$ is the average gradient of node $i$ on its local data at $w_{k-1}$, and $w_k$ is the global model parameters after round $k$ of aggregation.
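Because the weight formula itself is not available in this text, the following is only a sketch of one plausible loss-based weighting, not the patented formula. It assumes each node's weight is proportional to the inverse of its cross-entropy loss, normalized so the weights sum to one, and that the global parameters are the weighted sum of the local parameters. `contribution_weights` and `aggregate` are hypothetical names.

```python
from typing import List, Sequence


def contribution_weights(losses: Sequence[float], eps: float = 1e-12) -> List[float]:
    """ASSUMED weighting: proportional to 1 / loss, normalized to sum to 1."""
    inverse = [1.0 / (loss + eps) for loss in losses]  # eps guards against loss == 0
    total = sum(inverse)
    return [value / total for value in inverse]


def aggregate(local_params: Sequence[Sequence[float]],
              weights: Sequence[float]) -> List[float]:
    """Weighted sum of the local parameter vectors, coordinate by coordinate."""
    dim = len(local_params[0])
    return [sum(w * params[d] for w, params in zip(weights, local_params))
            for d in range(dim)]


losses = [0.2, 0.4, 0.8]  # per-node cross-entropy losses for one round (made up)
weights = contribution_weights(losses)
global_params = aggregate([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], weights)
```

Under this assumption the node with the lowest loss receives the largest share, which matches the stated goal of increasing the proportion of high-quality model parameters in the aggregated model.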
Furthermore, smart contracts are written for the joint training process, and each iteration comprises three processes:
1) A task coordination node or data providing node calls a smart contract to publish the joint training task of the software behavior classification model. After the task information is recorded on the blockchain, each data providing node requests and receives the data relevant to the model training task; in this process the data providing node calls the smart contract to obtain the data by querying the ledger.
2) After obtaining the global model parameters, each data providing node trains locally on its private software behavior data set and then sends the resulting model summary information and encrypted model parameters back to the blockchain network. A smart contract function is then called that takes as input the model summary information and model parameters submitted by each node in the round, verifies the model summary information in each transaction, and retains only the model data that improve model accuracy or whose sample count reaches the specified minimum.
3) The last process is the aggregation of the software behavior classification models. The smart contract takes as input the model summary information and model parameters retained on the chain by the model contribution evaluation of step (2.3), calculates the model contribution weight of each data providing node, and updates the global model with the aggregation algorithm based on these weights. It then determines whether the global model has converged or the maximum number of iterations has been reached, and accordingly either marks the joint training task as complete or issues the next round of iterative training.
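The retention rule in the second process can be sketched as a simple predicate. Hyperledger Fabric chaincode is normally written in Go, Node.js or Java, so the Python below illustrates only the decision logic, not deployable contract code. The `Submission` type, the comparison against the current global accuracy, and the threshold values are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Submission:
    """Hypothetical summary information a node submits for one round."""
    node_id: str
    accuracy: float
    sample_count: int


def keep_submission(sub: Submission, global_accuracy: float, min_samples: int) -> bool:
    """Keep a submission if it beats the current accuracy or has enough samples."""
    return sub.accuracy > global_accuracy or sub.sample_count >= min_samples


def filter_round(subs: List[Submission], global_accuracy: float,
                 min_samples: int) -> List[Submission]:
    """Return the submissions that would stay on the chain for aggregation."""
    return [s for s in subs if keep_submission(s, global_accuracy, min_samples)]


submissions = [
    Submission("node-A", accuracy=0.92, sample_count=100),   # improves accuracy
    Submission("node-B", accuracy=0.80, sample_count=5000),  # enough samples
    Submission("node-C", accuracy=0.70, sample_count=50),    # neither
]
kept = filter_round(submissions, global_accuracy=0.85, min_samples=1000)
kept_ids = [s.node_id for s in kept]
```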
Furthermore, the nodes in the consortium chain iteratively optimize the model through the joint training process, finally output the best-performing optimized software behavior classification model, and extract that model for actual software behavior classification.
Compared with the prior art, the invention has the following advantages:
(1) The invention applies deep learning to the classification and control of software behaviors. It automatically senses multi-dimensional software behavior data, fuses the software process, network traffic and user interface content data that most accurately express software behavior characteristics, and learns the features of these data to classify software behaviors. The classification results are reliable, intelligent control can be applied according to them, and the user's experience of operating the system is improved.
(2) The joint training scheme designed by the invention keeps each node's data local while making full use of every participant's data to cooperatively train a more accurate software behavior classification model, which both protects the privacy of user data and improves model performance.
(3) Traditional federated learning relies on a central server to drive training, so neither the openness of the learning process nor the tracking and tracing of the whole federated learning process can be guaranteed. To solve these problems, the invention combines blockchain and federated learning to design a consortium chain framework for the joint training task of the software behavior classification model. Without a central node driving the process, the framework stores summary data of the training process, such as model accuracy and loss value, on the blockchain, and the smart contracts written for it make the federated learning process automatic and transparent.
(4) The invention provides a model parameter aggregation algorithm based on model contribution weights. It adjusts the aggregation weight of each model according to the summary information of each data providing node's local model and increases the share of high-quality model parameters in the aggregated model, which strengthens mutual trust between federated learning nodes and improves the accuracy of the aggregated model.
Drawings
To illustrate the technical solutions of the embodiments of the invention more clearly, the drawings used in describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the invention; those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a flowchart of the method of the invention;
FIG. 2 is a flowchart of constructing the single-node software behavior classification model;
FIG. 3 shows the consortium chain framework for the multi-node joint training task of the software behavior classification model;
FIG. 4 is a flowchart of publishing and obtaining the model training task;
FIG. 5 is a flowchart of model contribution evaluation;
FIG. 6 is a flowchart of model aggregation;
FIG. 7 is a functional structure diagram of the intelligent software behavior control client tool.
Detailed Description
The invention is further described with reference to the following figures and specific embodiments.
Referring to FIG. 1, the intelligent software behavior control method based on blockchain and federated learning includes the following steps:
(1) Construct a single-node software behavior classification model, including sensing and storing software behaviors, constructing a software behavior data set, identifying the content of software pop-up user interfaces, and designing the network structure of the software behavior classification model.
(2) Construct a consortium chain for the multi-node joint training task of the software behavior classification model, take the model constructed in step (1) as the model trained locally at each node, and combine blockchain and federated learning so that the nodes on the consortium chain cooperatively train the model without being driven by a central node.
(3) Feed the real-time software behavior data generated while a user uses a computer into the constructed software behavior classification model to classify the software behaviors, and intelligently control the software behaviors according to the classification result.
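How a classification result might drive control in step (3) is not spelled out in the source, so the following is a minimal sketch of one possible policy. The class labels `wanted` and `unwanted`, the 0.8 threshold and the three actions are all hypothetical.

```python
from typing import Dict


def decide_action(probabilities: Dict[str, float], block_threshold: float = 0.8) -> str:
    """Map class probabilities to a control action (assumed policy)."""
    label = max(probabilities, key=probabilities.get)  # most likely class
    if label == "unwanted" and probabilities[label] >= block_threshold:
        return "block"      # confident enough to suppress the pop-up
    if label == "unwanted":
        return "ask_user"   # likely unwanted, but not confident enough
    return "allow"


action_high = decide_action({"wanted": 0.1, "unwanted": 0.9})
action_mid = decide_action({"wanted": 0.4, "unwanted": 0.6})
action_low = decide_action({"wanted": 0.7, "unwanted": 0.3})
```

A threshold of this kind trades missed pop-ups against wrongly blocked windows; the right value would have to come from the deployed model's measured error rates.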
The steps are described below with reference to the drawings. FIG. 2 shows the process of constructing the single-node software behavior classification model; the specific steps of step (1) are as follows:
(1.1) Software behavior sensing and storage: for the Internet application software on the Windows operating system in which pop-ups are ubiquitous, and taking into account the technical differences between software built on different development frameworks, the automatic pop-up mechanisms at run time are studied. A software behavior information collection client tool is designed and developed on the basis of the Windows API and the WinPcap library. It senses in real time multi-dimensional software behavior data of all kinds of software, such as window creation, window images, and the resources occupied by running processes (mainly CPU, memory and network), and stores the data in an SQLite database and PCAP files, providing data support for software behavior modeling and analysis.
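The SQLite side of this storage step could look roughly like the sketch below, which uses Python's standard `sqlite3` module. The table name, the columns and the sample values are assumptions for illustration; the source does not give a schema, and capturing windows or packets through the Windows API and WinPcap is outside the scope of the example.

```python
import sqlite3
import time

# Hypothetical schema for sensed window and process data.
SCHEMA = """
CREATE TABLE IF NOT EXISTS window_events (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    captured_at REAL NOT NULL,
    process_name TEXT NOT NULL,
    window_title TEXT,
    cpu_percent REAL,
    memory_mb REAL,
    screenshot_path TEXT
)
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the behavior database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record_event(conn: sqlite3.Connection, process_name: str, window_title: str,
                 cpu_percent: float, memory_mb: float, screenshot_path: str) -> int:
    """Insert one sensed event and return its row id."""
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "INSERT INTO window_events (captured_at, process_name, window_title,"
            " cpu_percent, memory_mb, screenshot_path) VALUES (?, ?, ?, ?, ?, ?)",
            (time.time(), process_name, window_title,
             cpu_percent, memory_mb, screenshot_path),
        )
    return cursor.lastrowid


conn = open_store()
row_id = record_event(conn, "example.exe", "Special offer", 1.5, 42.0, "shots/0001.png")
rows = conn.execute("SELECT process_name, window_title FROM window_events").fetchall()
```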
(1.2) Constructing a software behavior data set: the stored multi-dimensional software behavior data are cleaned, the pop-up software behavior data are labeled according to whether the user likes the content, and the labeled data are used as the final training and test data set.
(1.3) Identifying the content of the software pop-up user interface: text makes up the main part of a software pop-up user interface image, and users generally learn what a pop-up contains by reading that text. The method therefore recognizes the text in the software pop-up user interface and uses it as part of the input of the subsequent software behavior classification model.
In the advertisement pop-up user interface images collected here, the text does not fill the whole image. If a text recognition model is applied directly, it is disturbed by background information and its performance drops sharply. Therefore, before the text recognition model is applied to an advertisement pop-up user interface image, a text region detection algorithm is used to locate the text in the image, and the detected text regions serve as the input of the text recognition model, which improves recognition.
The text region detection network normalizes the text regions in the software pop-up user interface content. A deep convolutional neural network and a sequence labeling module then integrate character feature extraction and sequence prediction, achieving end-to-end recognition of user interface content, and the recognition result of the software pop-up user interface content is output.
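The source does not name the decoding step that turns the sequence predictions into text. Purely for illustration, the sketch below assumes connectionist temporal classification (CTC) with greedy decoding, a common choice for convolutional plus sequence-model recognizers: take the best class per frame, collapse repeats, and drop blanks. The alphabet and the per-frame scores are made up.

```python
from typing import List, Sequence

BLANK = 0  # index reserved for the CTC blank symbol (assumed convention)


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]], alphabet: str) -> str:
    """Pick the best class per frame, collapse repeats, and remove blanks."""
    best: List[int] = [max(range(len(frame)), key=lambda i: frame[i])
                       for frame in frame_scores]
    chars: List[str] = []
    previous = BLANK
    for index in best:
        # Emit a character only when it is not blank and differs from the
        # previous frame, so a repeated letter needs a blank in between.
        if index != BLANK and index != previous:
            chars.append(alphabet[index - 1])  # class 1 maps to alphabet[0]
        previous = index
    return "".join(chars)


frames = [
    [0.1, 0.8, 0.1],    # A
    [0.2, 0.7, 0.1],    # A (repeat, collapsed)
    [0.9, 0.05, 0.05],  # blank
    [0.1, 0.1, 0.8],    # D
    [0.1, 0.2, 0.7],    # D (repeat, collapsed)
    [0.8, 0.1, 0.1],    # blank
    [0.1, 0.1, 0.8],    # D (new character after the blank)
]
text = ctc_greedy_decode(frames, alphabet="AD")
```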
(1.4) constructing a software behavior classification model: respectively mapping the recognition result, the process behavior data and the network behavior data of the software popup user interface content into a feature vector form, fusing the feature vectors with different weights, then performing feature learning through a deep neural network, finally inputting the learned feature vectors into a probability output layer, outputting the probability of each category, and realizing the classification of the software popup behaviors.
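The fusion and probability output described in step (1.4) can be sketched as follows. This is a minimal illustration under stated assumptions: the modality names, the fusion weights, and the single linear layer standing in for the deep neural network are all hypothetical, and the patent does not specify how the fusion weights are chosen.

```python
import math


def fuse(features, weights):
    """Scale each modality's feature vector by its weight and concatenate them."""
    fused = []
    for name, vec in features.items():
        fused.extend(weights[name] * v for v in vec)
    return fused


def softmax(logits):
    """Numerically stable softmax used as the probability output layer."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(features, weights, layer):
    """Return class probabilities.

    `layer` is a list of (row_weights, bias) pairs, one per class; it is a
    stand-in for the deep network, not the patent's architecture.
    """
    x = fuse(features, weights)
    logits = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in layer]
    return softmax(logits)
```

For example, with three modalities (`ui_text`, `process`, `network`) the fused vector is simply the weighted concatenation, and `classify` returns one probability per category, such as "liked" and "disliked".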
With reference to the alliance chain architecture for the multi-node joint training task of the software behavior classification model shown in fig. 3, step (2) designs, based on Hyperledger Fabric, an alliance chain architecture for the joint training task of the software behavior classification model, and establishes a decentralized joint training framework for the software behavior classification model. To ensure the privacy of the software behavior data and the security of the software behavior classification model, the type of blockchain used by the framework is an alliance chain. Compared with a public chain architecture, an alliance chain imposes stricter identity authentication conditions on each node, which can improve the security of the system. In addition, to avoid the storage burden that on-chain data storage would place on each node and to protect the privacy of each user's local data, the alliance chain records only model training task information, summary information of each node's local model, and shared model parameter information; no node needs to upload its local software behavior data. The nodes in the alliance chain play one of two roles: task coordination node or data providing node. The task coordination node is responsible for issuing the joint training task of the software behavior classification model and initializing the global model, and it finally completes the task by extracting the parameters of the global model. The data providing node is responsible for responding to the joint training task issued by the task coordination node: it trains the model on its local software behavior data set and uploads the summary information and model parameters of its local model to complete the task.
On a decentralized network with initialized and constructed alliance chain, the detailed steps of the software behavior classification model joint training process are as follows:
(2.1) The task coordination node issues a learning and training task description to the alliance chain system, initializes the global model parameters, and then adds a software behavior classification joint training task; the initial global model is the software behavior classification model constructed in step (1).
(2.2) The data providing nodes participating in the task form a set, acquire the joint training task information and the global model parameters from the alliance chain and decrypt them, and then execute federal learning on their local software behavior data sets to cooperatively train the same machine learning model.
(2.3) After each round of iterative training, each data providing node initiates a transaction whose content is the summary data of its local model (such as model accuracy and loss value) and the encrypted model parameters. Endorsement nodes then simulate the execution of the transaction and endorse it, the selected ordering service node packs the transactions into a block, the block is broadcast to the whole network for verification, and it is published to the alliance chain once verification passes. When the data providing nodes upload the summary data and encrypted model parameters of their local models, the smart contract evaluates the contribution of each node to the model and retains on the alliance chain only the model transaction information that meets the requirements.
(2.4) After each round of iterative training, the smart contract calculates the weight of each node's contribution to the global model, aggregates the model parameters based on these weights, updates the global model, and judges whether the global model has converged or the maximum number of iterations has been reached. If the condition is met, the joint training task is judged to be complete and the task coordination node is notified; if not, the aggregated global model is packed into a block and the next round of the iterative training task is issued, each node starts the next round of iterative training, and execution continues from step (2.2).
(2.5) The task coordination node extracts the global model parameters and verifies the validity of the model, completing the joint training task.
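The control flow of steps (2.1) to (2.5) can be sketched as a plain in-memory loop. This is a simplified illustration, not the on-chain implementation: there is no blockchain, endorsement or encryption here, and `run_joint_training`, `make_node` and `plain_average` are hypothetical names. The stopping test uses the mean of the reported local losses as a stand-in for the global model's loss, which is an assumption.

```python
def run_joint_training(nodes, init_params, aggregate, loss_threshold, max_rounds):
    """Run rounds until the loss proxy drops below the threshold or rounds run out.

    nodes: callables mapping global_params -> (local_params, local_loss).
    aggregate: callable mapping a list of (local_params, local_loss) -> params.
    """
    global_params = list(init_params)              # (2.1) coordinator initializes the model
    history = []
    for round_no in range(1, max_rounds + 1):
        # (2.2)-(2.3) every data providing node trains locally and reports back
        updates = [node(global_params) for node in nodes]
        # (2.4) the aggregation step produces the new global model
        global_params = aggregate(updates)
        mean_loss = sum(loss for _, loss in updates) / len(updates)
        history.append((round_no, mean_loss))
        if mean_loss < loss_threshold:             # stop early once the proxy loss is small
            break
    return global_params, history                  # (2.5) coordinator extracts the model


def make_node(target, step=0.5):
    """Toy node: moves the parameters part of the way toward a local target."""
    def node(global_params):
        local = [p + step * (t - p) for p, t in zip(global_params, target)]
        loss = sum((t - p) ** 2 for p, t in zip(local, target))
        return local, loss
    return node


def plain_average(updates):
    """Unweighted mean of the local parameters, used here only as a placeholder."""
    n = len(updates)
    dim = len(updates[0][0])
    return [sum(params[i] for params, _ in updates) / n for i in range(dim)]
```

Any aggregation rule with the same signature, including a contribution-weighted one, can be passed in place of `plain_average`.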
The existing model aggregation algorithm is mainly federated averaging (FedAvg), but FedAvg does not consider the degradation in global model quality caused by low-quality models participating in aggregation. The invention provides a model aggregation algorithm based on model contribution weight. In step (2), the model aggregation algorithm adjusts the aggregation weight of each local model according to the summary information of each data providing node's local model, thereby adjusting the proportion of each local model's parameters in the updated global model parameters and increasing the proportion of high-quality model parameters in the aggregated model. The contribution of each data providing node to the model is determined by the node's model loss in the k-th round of local training: cross entropy is used to evaluate the local model loss of each node, and the model contribution weight of the model aggregation algorithm is defined as follows:
[Equation not preserved in this text: model contribution weight]
where the two loss terms in the formula are the cross-entropy loss functions of the two nodes it indexes, each taken over the input parameters and the desired output. The process of updating the global model parameters in the model aggregation algorithm based on model contribution weight can be represented as:
[Equation not preserved in this text: global model parameter update]
where the quantities involved are: the global model parameters after k-1 rounds of aggregation; n, the number of nodes participating in the model aggregation in the current round; the model contribution weight of each node in the k-th round of model aggregation; the model parameters updated locally by each node in the k-th round; the average gradient of each node over its local data; and the global model parameters after the k-th round of aggregation.
The global model $G_k$ is considered to have converged when the loss of $G_k$ is less than a preset value $H$, or when the number of iterations $k$ reaches the maximum number of iterations $MaxIterationNum$, namely:
$$\mathrm{Loss}(G_k) < H \quad \text{or} \quad k \ge MaxIterationNum$$
The model aggregation algorithm based on model contribution weight is shown as Algorithm 1.
Algorithm 1: Model aggregation based on model contribution weight
(steps 1-3, 5 and 7-9 are not preserved in this text)
4 for each local iteration j from 1 to the number of iterations S do
6   for each batch b from 1 to the number of batches B of the batched data set do
10   end for
11 end for
12 obtain the local model parameter update, apply additive homomorphic encryption to it, and upload it together with the local model loss to the blockchain
13 the smart contract performs model parameter aggregation: the received model parameters are weighted and averaged
14 end while
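Algorithm 1 refers to additive homomorphic encryption without naming a scheme. The sketch below uses a toy Paillier cryptosystem, which is an assumption chosen only because it is a well-known additively homomorphic scheme. The tiny primes, the fixed-point `SCALE`, and all function names are illustrative and offer no real security.

```python
import math
import random

SCALE = 1000  # fixed-point scale for float parameters (assumption)


def keygen(p=293, q=433):
    """Return (n, private_key) for toy Paillier with generator g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse; valid because g = n + 1
    return n, (lam, mu)


def encrypt(n, value):
    """Encrypt a float after fixed-point encoding; negatives wrap modulo n."""
    m = round(value * SCALE) % n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    n2 = n * n
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % (n * n)


def decrypt(n, priv, c):
    """Recover the float; values above n // 2 are read as negatives."""
    lam, mu = priv
    u = pow(c, lam, n * n)
    m = ((u - 1) // n * mu) % n
    if m > n // 2:
        m -= n
    return m / SCALE
```

With such a scheme, encrypted updates from several nodes can be summed without decrypting any individual update; only the holder of the private key can read the total.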
According to the method, the local model loss value of each node is evaluated through the model parameter aggregation algorithm based on the model contribution weight, the proportion of the local model parameters in the updated global model parameters is adjusted, and the fairness of the federal learning process and the accuracy of the aggregation model are enhanced.
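To make the idea of loss-based contribution weighting concrete, the following sketch compares it with unweighted averaging. The weighting rule used here (weights proportional to the inverse of each node's loss, normalised to sum to 1) is an assumption made for illustration; the patent's own weight formula is not reproduced in this text, and the function names are hypothetical.

```python
def contribution_weights(losses, eps=1e-12):
    """Assumed rule: a lower local loss gives a larger weight; weights sum to 1."""
    inverse = [1.0 / (loss + eps) for loss in losses]
    total = sum(inverse)
    return [v / total for v in inverse]


def aggregate(local_params, losses):
    """Weighted average of the local parameter vectors using the weights above."""
    weights = contribution_weights(losses)
    dim = len(local_params[0])
    return [
        sum(w * params[i] for w, params in zip(weights, local_params))
        for i in range(dim)
    ]


def fedavg(local_params):
    """Unweighted mean of equally sized contributions, shown for comparison."""
    n = len(local_params)
    dim = len(local_params[0])
    return [sum(params[i] for params in local_params) / n for i in range(dim)]
```

With two nodes reporting losses 0.5 and 1.0, the first node receives roughly two thirds of the weight, so the aggregated parameters lie closer to its update than the plain mean would.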
As shown in figs. 4-6, the invention automates the joint training process of the software behavior classification model and makes it transparent by writing smart contracts. During joint training, each iteration comprises three main processes: 1) The task coordination node or the smart contract issues the joint training task of the software behavior classification model. After the task information is recorded on the blockchain, each data providing node requests and receives the data relevant to the model training task; in this process, the data providing node calls the smart contract to obtain the relevant data by querying the ledger. 2) After acquiring the global model parameters, each data providing node trains locally on its private software behavior data set and then sends the resulting model summary information and encrypted model parameters back to the blockchain network. At this point a function of the smart contract is called; it takes as input the model summary information and model parameters submitted by each node in the current round, verifies the model summary information in each transaction, and retains only the model data that improve the model accuracy or whose sample count reaches the specified minimum number of samples. 3) The last process is the aggregation of the software behavior classification models. The smart contract takes as input the model summary information and model parameters retained on the chain by the model contribution evaluation in the previous step, calculates the model contribution weight of each data providing node, updates the global model using the model aggregation algorithm based on model contribution weight, and then judges whether the global model has been fitted or the maximum number of iterations has been reached, which triggers either the completion of the software behavior classification joint training task or the release of the next round of the iterative training task.
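The retention rule of process 2) can be sketched as a small filter. This is a language-neutral illustration written in Python; Hyperledger Fabric chaincode is normally written in Go, Java or JavaScript/TypeScript, and the `ModelTransaction` fields and the `evaluate_contributions` function are hypothetical names, not the patent's contract interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTransaction:
    """Summary data a node submits with its encrypted parameters (assumed fields)."""
    node_id: str
    accuracy: float
    loss: float
    sample_count: int
    encrypted_params: bytes


def evaluate_contributions(transactions, global_accuracy, min_samples):
    """Keep a transaction if it improves on the current global accuracy
    or if its sample count reaches the required minimum."""
    kept = []
    for tx in transactions:
        if tx.accuracy > global_accuracy or tx.sample_count >= min_samples:
            kept.append(tx)
    return kept
```

Only the transactions returned by the filter would then be passed on to the aggregation of process 3).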
On the premise of protecting user data privacy, the alliance chain framework based on Hyperledger Fabric and federal learning can use the software behavior data of multiple users to train the model cooperatively, which improves the accuracy of the software behavior classification model and allows pop-up behaviors to be classified and controlled according to the software preferences of user groups. It enables a joint training process driven by the nodes themselves without a central node, ensuring mutual trust among the nodes and the credibility of the aggregated model. Writing and deploying smart contracts automates the whole joint training process and makes it transparent. Storing the summary data of the training process on the chain facilitates subsequent tracking and auditing of the joint training task.
When the software behavior classification model is used, each node in the alliance chain continuously iterates the optimization model through a software behavior classification model joint training process, and finally a software behavior classification model with better effect is output, and the model is extracted to be used for actual software behavior classification.
Referring to fig. 7, the invention designs a software behavior intelligent control client tool, which can sense the behavior of various types of software in a system in real time, collect relevant software behavior data, and call a software behavior classification model to classify the software behavior in real time when a user uses a computer. The user can define a software behavior control strategy in the client, and intelligent control is carried out according to the classification result: for the software behavior with the classification result disliked by the user, stopping the software behavior in a way of ending the process and the like; and taking a reservation measure for the software behavior with the classification result being the favorite of the user. Meanwhile, the client tool can record the software behaviors with the classification results disliked by the user, wherein the software behaviors comprise data such as software names, software service providers and the occurrence frequency of the same software behaviors, and the user can inquire the statistical information of the software behaviors regularly. For the software behavior which is frequently not liked by the user, the client prompts the user whether to uninstall the software or not, and the client automatically uninstalls the corresponding software after the user confirms the uninstallation.
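The control strategy described above can be sketched as a small decision routine. The class name, the action strings and the `uninstall_threshold` parameter are hypothetical; the patent does not state how often a behavior must be disliked before the uninstall prompt appears, and the sketch only returns the chosen action instead of ending a process.

```python
from collections import Counter


class BehaviorController:
    """Maps classification results to control actions and keeps dislike statistics."""

    def __init__(self, uninstall_threshold=3):
        self.uninstall_threshold = uninstall_threshold  # assumed, user-configurable
        self.disliked = Counter()

    def handle(self, software, label):
        """label is 'liked' or 'disliked', as produced by the classification model."""
        if label == "liked":
            return "keep"
        self.disliked[software] += 1
        if self.disliked[software] >= self.uninstall_threshold:
            return "prompt_uninstall"
        return "terminate"

    def statistics(self):
        """Return how many disliked behaviors were recorded per software."""
        return dict(self.disliked)
```

A client would call `handle` for each classified pop-up and act on the returned string, for example by ending the process on "terminate" or asking the user for confirmation on "prompt_uninstall".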
In summary, the invention learns the behavior characteristics of the software from multiple dimensions, such as the process data, the network flow and the user interface content of the software, so as to realize the classification of the software behavior and implement the management and control of the software behavior in time according to the classification result. In addition, the invention can build an information sharing mechanism by means of multi-user software behavior data, and realize intelligent control of software behaviors according to the software preferences of user groups by combining block chains and a federal learning technology.
It is understood that the above description is not intended to limit the present invention, and the present invention is not limited to the above examples, and those skilled in the art should understand that they can make various changes, modifications, additions and substitutions within the spirit and scope of the present invention.
Claims (6)
1. The intelligent software behavior control method based on the block chain and the federal learning is characterized by comprising the following steps of:
(1) constructing a software behavior classification model of a single node, comprising the steps of sensing and storing software behaviors, constructing a software behavior data set, designing software popup user interface content identification and a software behavior classification model network structure;
(2) constructing an alliance chain of a multi-node joint training task oriented to a software behavior classification model, taking the software behavior classification model constructed in the last step as a model for local training of each node, and combining a block chain and a federal learning technology to realize that the software behavior classification model is trained in a multi-party cooperation mode under the condition that each node on the alliance chain is not driven by a central node, wherein the alliance chain architecture comprises a design software behavior classification model joint training task, a definition on-chain block structure and a joint training flow, a design model aggregation algorithm and a part for compiling an intelligent contract in the software behavior classification model joint training process;
(3) designing a software behavior intelligent control client tool, sensing software behavior data generated when a user uses a computer in real time, inputting the constructed software behavior classification model, classifying the software behavior, obtaining a classification result, and intelligently controlling the software behavior according to the classification result.
2. The intelligent software behavior control method based on the block chain and the federal learning according to claim 1, wherein the specific steps in the step (1) are as follows:
(1.1) software behavior sensing and storing: analyzing the automatic pop-up window principle of different pop-up window software, sensing the behavior data of various multidimensional software in real time, and storing the behavior data;
(1.2) constructing a software behavior data set: performing data cleaning on the stored multi-dimensional software behavior data, labeling the behavior data of the pop-up window software, judging whether the user likes the content and labeling, and taking the labeled software behavior data as a final training and testing data set;
(1.3) identifying the content of the software popup user interface: recognizing text contents in the software popup user interface, and taking the text contents as a part of input of a subsequent software behavior classification model;
(1.4) constructing a software behavior classification model: respectively mapping the recognition result, the process behavior data and the network behavior data of the software popup user interface content into a feature vector form, fusing the feature vectors by different weights, then performing feature learning through a deep neural network, finally inputting the learned feature vectors into a probability output layer, outputting the probability of each category, and realizing the classification of the software popup behaviors.
3. The intelligent software behavior control method based on the blockchain and the federal learning of claim 1, wherein step (2) designs, based on Hyperledger Fabric, an alliance chain architecture for the software behavior classification model joint training task, and establishes a decentralized joint training framework for the software behavior classification model; the type of the blockchain used by the framework is an alliance chain, the alliance chain records only model training task information, summary information of each node's local model and shared model parameter information, and each node does not need to upload local software behavior data; the roles of the nodes in the alliance chain are divided into task coordination nodes and data providing nodes; the task coordination node is responsible for issuing the software behavior classification model joint training task, initializing the global model and finally completing the task by extracting the global model parameters; the data providing node is responsible for responding to the software behavior classification model joint training task issued by the task coordination node, training the model by using its local software behavior data set, and uploading the summary information and model parameters of its local model to complete the task;
on a decentralized network with initialized and constructed alliance chain, the detailed steps of the software behavior classification model joint training process are as follows:
(2.1) the task coordination node issues a learning training task description to the alliance chain system, initializes global model parameters, and then adds a software behavior classification joint training task, wherein the initial global model selects the software behavior classification model constructed in the step (1);
(2.2) the data providing nodes participating in the task form a set, acquire the joint training task information and the global model parameters from the alliance chain and decrypt them, execute federal learning based on their local software behavior data sets, and cooperatively train the same machine learning model;
(2.3) after each round of iterative training is finished, each data providing node takes the abstract data of the local model and the encrypted model parameters as transaction contents to initiate transaction, then the intelligent contract evaluates the contribution of each node to the model, and only the model transaction information meeting the requirements is reserved on the alliance chain;
(2.4) after each round of iterative training is finished, calculating the weight of the contribution degree of each node to the global model by an intelligent contract, then realizing the aggregation of model parameters based on the weight, updating the global model, and judging whether the global model is fitted or reaches the maximum iteration times; if the condition is met, judging that the joint training task is finished, and then informing a task coordination node; if the condition is not met, packaging the aggregated global model into blocks and issuing a next round of iterative training task, starting the next round of iterative training by each node, and continuing to execute the step (2.2);
(2.5) the task coordination node extracts the global model parameters and verifies the validity of the model to finish the joint training task.
4. The intelligent software behavior control method based on the blockchain and federal learning of claim 3, wherein in the step (2), the model aggregation algorithm adjusts the aggregation weight of each local model according to the summary information of each data providing node's local model, adjusts the proportion of local model parameters in the updated global model parameters, and increases the proportion of high-quality model parameters in the aggregated model; the contribution of each data providing node to the model is determined by the model loss of the node in the k-th round of local training, cross entropy is used to evaluate the local model loss of each node, and the model contribution weight of the model aggregation algorithm is defined as follows:
[Equation not preserved in this text: model contribution weight]
where the two loss terms in the formula are the cross-entropy loss functions of the two nodes it indexes, each taken over the input parameters and the desired output; the process of updating the global model parameters in the model aggregation algorithm based on model contribution weight can be represented as:
[Equation not preserved in this text: global model parameter update]
where the quantities involved are: the global model parameters after k-1 rounds of aggregation; n, the number of nodes participating in the model aggregation in the current round; the model contribution weight of each node in the k-th round of model aggregation; the model parameters updated locally by each node in the k-th round; the average gradient of each node over its local data; and the global model parameters after the k-th round of aggregation.
5. The intelligent software behavior control method based on the blockchain and the federal learning of claim 3, wherein smart contracts are written for the joint training process, and each iteration comprises three processes: 1) the task coordination node or a data providing node calls a smart contract to issue the software behavior classification model joint training task; after the task information is recorded on the blockchain, each data providing node requests and receives the data relevant to the model training task, and in this process the data providing node calls the smart contract to obtain the relevant data by querying the ledger; 2) after acquiring the global model parameters, each data providing node carries out local training based on its private software behavior data set, and then sends the obtained model summary information and the encrypted model parameters back to the blockchain network; at this point a function of the smart contract is called, which takes as input the model summary information and model parameters submitted by each node in the current round, verifies the model summary information in each transaction, and retains only the model data that improve the model accuracy or whose sample count reaches the specified minimum number of samples; 3) the last process is the aggregation of the software behavior classification models: the smart contract takes as input the model summary information and model parameters retained on the chain by the model contribution evaluation in step (2.3), calculates the model contribution weight of each data providing node, updates the global model using the model aggregation algorithm based on model contribution weight, and then judges whether the global model has been fitted or the maximum number of iterations has been reached, which triggers either the completion of the software behavior classification joint training task or the release of the next round of the iterative training task.
6. The intelligent software behavior control method based on the blockchain and the federal learning of claim 5, wherein each node in the alliance chain continuously iterates the optimization model through a software behavior classification model joint training process, and finally outputs an optimized software behavior classification model with the best effect, and the model is extracted for actual software behavior classification.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210610745.4A CN114692098B (en) | 2022-06-01 | 2022-06-01 | Intelligent software behavior control method based on block chain and federal learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210610745.4A CN114692098B (en) | 2022-06-01 | 2022-06-01 | Intelligent software behavior control method based on block chain and federal learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114692098A (en) | 2022-07-01 |
CN114692098B (en) | 2022-08-26 |
Family
ID=82131116
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210610745.4A Active CN114692098B (en) | 2022-06-01 | 2022-06-01 | Intelligent software behavior control method based on block chain and federal learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114692098B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN114692098B (en) | 2022-08-26 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
TR01 | Transfer of patent right | Effective date of registration: 20231102. Address after: 27a, Haiya building, Binhai garden, 1 Shandong Road, Shinan District, Qingdao City, Shandong Province 266071. Patentee after: Shandong Zhidou Digital Technology Co.,Ltd. Address before: 266100 Shandong Province, Qingdao city Laoshan District Songling Road No. 238. Patentee before: OCEAN University OF CHINA |