CN113723220A - Deep counterfeiting traceability system based on big data federated learning architecture - Google Patents


Info

Publication number
CN113723220A
Authority
CN
China
Prior art keywords
data, model, layer, deep, traceability
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110919472.7A
Other languages
Chinese (zh)
Other versions
CN113723220B (en
Inventor
倪志彬
唐龙翔
王昊龙
梁淇奥
何震宇
蒋新科
向芝莹
周啸宇
石爻
李顺
左健甫
杨若辰
吴世涵
张恩华
吉雪莲
常世晴
罗佳源
陈攀宇
王瑞锦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China filed Critical University of Electronic Science and Technology of China
Priority to CN202110919472.7A priority Critical patent/CN113723220B/en
Publication of CN113723220A publication Critical patent/CN113723220A/en
Application granted granted Critical
Publication of CN113723220B publication Critical patent/CN113723220B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Abstract

The invention discloses a deep forgery traceability system based on a big data federated learning architecture, comprising an application layer, an interface layer, a logic layer, a network layer and a storage layer which are sequentially connected. The application layer provides the deep forgery traceability service to users and collects user login and upload data; the interface layer provides interface services and realizes communication between the server side and the web side; the logic layer divides the system functions and designs the algorithms and models that implement the system's functional logic; the network layer exchanges parameters and encrypts gradient information during modeling; the storage layer receives the transmitted parameter information and encrypted information and stores them in a local database and a blockchain network. The invention provides the overall framework of a federated anti-counterfeiting traceability chain and establishes a triple mechanism of federated anti-counterfeiting, anomaly tracing and risk prediction, which guards against Web security threats while effectively solving the data-poisoning and single-point-of-failure problems of federated learning.

Description

Deep counterfeiting traceability system based on big data federated learning architecture
Technical Field
The invention relates to the technical field of deep forgery detection, and in particular to a deep forgery traceability system based on a big data federated learning architecture.
Background
The smart city concept has been around for more than a decade; as of 2020, the number of smart city pilot projects published by the Ministry of Housing and Urban-Rural Development had reached 290. As the new generation of information technology integrates deeply with urban modernization, face recognition systems have been widely applied in many important fields touching the national economy and people's livelihood, becoming a powerful tool for advancing smart city construction. According to statistics from Southern Metropolis Daily, the main application scenarios of face recognition in 2020 are shown in fig. 1-1-1. Thanks to characteristics such as non-replicability, contactless operation, scalability and speed, face recognition has become a standout among biometric technologies. More than one hundred thousand biometric identity authentication systems have been deployed in first-tier cities, generating PB-scale video information every day.
Face information is unique biometric identification information; once leaked, a user's property and privacy are seriously threatened. Yet in practice many merchants use cameras to capture face information without consent for precision marketing and risk screening. Meanwhile, DeepFake, a class of machine learning models that generate realistic fake or manipulated audio and video, has gradually come into public view, becoming both a hotly reported media topic and an emerging network security threat. With the rapid development of artificial intelligence, cybercriminals increasingly use deep-learning-based deep forgery techniques to mount spoofing attacks on camera-based terminal authentication devices. As deep forgery technology continues to advance, even the liveness detection built into existing face recognition systems can still be defeated by deep forgery attacks.
At present, deep forgery software such as Face2Face, FaceSwap and the DeepFakes method, combined with 3D printing, can easily produce high-definition 3D masks, a potentially serious challenge to the security of face recognition systems. Related risk events have already occurred. For example, in January 2020 a criminal ruling published by the Intermediate People's Court of Quzhou, Zhejiang Province disclosed a "wool-pulling" scam in which a criminal gang used software to turn citizens' portrait photos into 3D avatars, deceived Alipay's face recognition, and registered new accounts. As another example, Kneron, an artificial intelligence company in San Diego, USA, ran face recognition tests using high-definition 3D masks made by a Japanese specialty mask maker and successfully fooled face recognition systems around the world, including the Alipay and WeChat face-payment systems, to complete shopping payments. At Schiphol, the largest airport in the Netherlands, the Kneron team deceived the sensor of a self-service boarding terminal with a picture on a phone screen, and the team claims it entered a Chinese train station the same way. Similar events are numerous, and these examples do not exhaust the impact of DeepFake on face recognition.
Against this backdrop, deep forgery detection still faces many technical bottlenecks. First, the accuracy of deep forgery detection algorithms still needs improvement. In the face-video DeepFake Detection Challenge (DFDC) led by Facebook and co-hosted by Microsoft, Amazon, MIT and other well-known organizations, the champion team's cross-entropy loss remained above 0.42, an unsatisfactory detection result. Meanwhile, on Celeb-DF, a new deep forgery dataset released in 2020, the average ROC-AUC score of existing mainstream detection algorithms is only 56.9: a large number of detection methods fail, and new, effective detection algorithms are urgently needed to cope with ever-improving deep forgery generation models. A major pain point is that most existing detection models analyze individual video frames; this is computationally efficient, but it ignores temporal information and cannot analyze changes across frames in depth, so the recognition rate falls short. In addition, the detection process of neural network methods is poorly interpretable, and their black-box nature makes forensic evidence hard to obtain. Even in a federated learning scenario, the data structures that existing federated learning frameworks can handle are simple; most can serve only experimental machine learning scenarios and cannot be applied to more complex ones, a pain point that the overall architecture urgently needs to address.
Second, the data-silo problem created by current privacy protection hinders the training of deep forgery detection models. Data regulation at home and abroad is becoming stricter, and under such conditions the traditional machine learning approach of centralized data collection no longer complies with strict data protection regulations. The success of AI face recognition systems rests largely on the availability of massive image data, yet large volumes of high-quality user data sit inside many organizations that are unwilling to share it directly because of privacy concerns, regulatory risk, and lack of incentives. Data privacy, data silos and related problems around face detection have become the bottleneck of deep forgery detection in a big data environment.
Finally, in practical applications a service user usually wants to trace the source of a forgery and prevent and control the corresponding risks. Video forgery infringes the legitimate interests of the video's owner, such as portrait rights, but merely detecting the forgery leaves the forger unpunished. A tracing function can therefore identify every user through whom a video was uploaded and forwarded, and pinpoint who forged the content by comparing video information. Technical means are thus needed to provide a practical traceability tool and a strongly applicable risk prediction algorithm. In summary, the security of biometric authentication systems has become a bottleneck in smart city construction, and effective measures against the risk of deep forgery abuse are urgently needed.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a deep forgery traceability system based on a big data federated learning architecture. It integrates advanced forgery detection methods in its core algorithm: under a federated learning architecture, it uses the multi-dimensional information of the original dataset and a convolutional neural network based on spatio-temporal features and a residual network (3DRes) to analyze and identify deep-forged videos in a data-driven way. Through performance evaluation of the training results it addresses the deep forgery problem in face recognition, and it can provide safer, more accurate and more scalable services for smart city deployment.
The purpose of the invention is realized by the following technical scheme:
the deep forgery traceability system based on the big data federated learning architecture comprises an application layer, an interface layer, a logic layer, a network layer and a storage layer which are sequentially connected;
the application layer is used for providing a deep forgery traceability service for a user through Web browser application and acquiring user login and upload data;
the interface layer is used for providing interface service and realizing communication between the server side and the web side;
the logic layer is used for dividing system functions and designing an algorithm construction model to realize system function logic;
the network layer is used for exchanging parameters between the logic layer and the storage layer and encrypting gradient information in the modeling process of the logic layer;
and the storage layer is used for receiving the parameter information and the encrypted information transmitted by the network layer and storing them respectively in the local database and the blockchain network.
Specifically, the logic layer comprises a login registration module, an attack prevention module, a federated learning module, a blockchain traceability module, a deep forgery detection module and a risk prediction module; wherein:
the login registration module is used for performing identity authentication and string-format pre-validation on information entered by the user, and returning the judgment result and a login session record to the server;
the attack prevention module is used for screening the model training data for poisoning attacks, and linking the data source accessed by each participant onto the distributed ledger provided by the blockchain for data audit;
the deep forgery detection module is used for implementing the 3DRes deep forgery detection algorithm: it preprocesses the original dataset and performs deep forgery detection on user-uploaded data by constructing a 3DRes detection model;
the federated learning module is used for carrying out horizontal federated learning training on the preprocessed original dataset and the user information passed down from the application layer, and outputting the trained model;
the blockchain traceability module is used for constructing the federated anti-counterfeiting traceability chain: it calls a smart contract to take the user ID, the timestamp and the server-generated video ID passed from the application layer as transaction information, packages them into a block and uploads the block to the chain;
and the risk prediction module is used for building a risk prediction model from the desensitized profile data of user nodes using machine learning algorithms, to predict a node's future risk behavior.
Specifically, the interface layer comprises a blockchain tracing interface, a federated learning online inference API, a foreground data interface and a back-end data interface.
Specifically, the blockchain traceability module is further configured so that, when the deep forgery detection module detects that local data has been tampered with by deep forgery, it finds the first user node at which the error appeared, deletes and marks all tampered data along the traceability path, re-uploads the correct data, and calls the login registration module again for identity authentication.
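The block-packaging and tracing behavior described above can be sketched as a minimal hash chain. This is an illustration only: the helper names (`make_block`, `trace`) and the field layout are assumptions, not the system's actual smart-contract interface.

```python
import hashlib
import json

def make_block(prev_hash, user_id, video_id, timestamp):
    """Package (user ID, timestamp, video ID) as transaction info into a block
    whose hash covers the previous block's hash, forming the traceability chain."""
    payload = {"prev": prev_hash, "user": user_id, "video": video_id, "ts": timestamp}
    digest = hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()
    return {"payload": payload, "hash": digest}

def trace(chain, video_id):
    """Walk the chain and return every user the video passed through, in order."""
    return [b["payload"]["user"] for b in chain if b["payload"]["video"] == video_id]

# Toy chain: video "vid-42" is uploaded by u1 and forwarded by u2 and u3.
chain, prev = [], "0" * 64  # genesis hash
for uid in ["u1", "u2", "u3"]:
    block = make_block(prev, uid, "vid-42", 1000)
    chain.append(block)
    prev = block["hash"]
```

Because every block embeds the previous block's hash, altering a tampered block invalidates all later hashes, which is what makes the tracing path tamper-evident.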
Specifically, the preprocessing of the original dataset is as follows: construct a RetinaFace model, use it to detect and crop faces in the original dataset, retain only the videos that contain faces, and optimize the face detection process through feedback regulation with deformable convolution and a dense regression loss function, yielding the preprocessed original dataset.
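A minimal sketch of the frame-by-frame cropping step. The real system uses a RetinaFace detector; `detect_face` below is a hypothetical stand-in that returns one bounding box per frame so that only the cropping logic is shown.

```python
# Sketch of face-cropping preprocessing; detect_face is a placeholder for RetinaFace.

def detect_face(frame):
    # Placeholder detector: pretend the face occupies a fixed region.
    return (1, 1, 3, 3)  # (top, left, bottom, right), exclusive bounds

def crop_faces(video):
    """Crop every frame (a 2-D pixel grid) to its face box, discarding the
    useless environment information around the face."""
    cropped = []
    for frame in video:
        top, left, bottom, right = detect_face(frame)
        cropped.append([row[left:right] for row in frame[top:bottom]])
    return cropped

# Two 5x5 frames whose pixel value encodes its (row, column) position.
video = [[[r * 10 + c for c in range(5)] for r in range(5)] for _ in range(2)]
faces = crop_faces(video)  # each frame is now a 2x2 face-only crop
```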
Specifically, the horizontal federated learning training on the preprocessed original dataset and the user information passed from the application layer proceeds as follows: the preprocessed data and user information are distributed across different machines; each machine downloads the horizontal federated learning model from the server, trains it on its local data, and returns the parameters the server needs to update; the server aggregates the parameters returned by all machines, updates the model it holds, and feeds the latest model back to each machine for model prediction.
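The download-train-return-aggregate round described above can be sketched as FedAvg-style weighted averaging; `local_update` below is a toy stand-in for real gradient training.

```python
# FedAvg-style sketch: each machine trains the downloaded model locally and
# returns parameters; the server aggregates them by sample-weighted averaging.

def local_update(global_params, local_data):
    # Pretend training: nudge each parameter halfway toward the local data mean.
    mean = sum(local_data) / len(local_data)
    return [p + 0.5 * (mean - p) for p in global_params]

def aggregate(updates, sizes):
    """Server side: weighted average of the parameters returned by all machines."""
    total = sum(sizes)
    dim = len(updates[0])
    return [sum(u[i] * n for u, n in zip(updates, sizes)) / total for i in range(dim)]

global_model = [0.0]            # one-parameter model for illustration
clients = [[1.0, 1.0], [3.0]]   # two machines with local data
for _ in range(10):             # repeat rounds until (near) convergence
    updates = [local_update(global_model, d) for d in clients]
    global_model = aggregate(updates, [len(d) for d in clients])
```

With these toy updates the model converges to the sample-weighted mean 5/3; in the real system the exchanged quantities would be encrypted gradients or weights of the 3DRes model.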
The invention has the beneficial effects that:
1. The invention provides the overall framework of a federated anti-counterfeiting traceability chain and establishes a triple mechanism of federated anti-counterfeiting, anomaly tracing and risk prediction to realize the expected functions; while guarding against common Web security threats, it effectively solves the data-poisoning and single-point-of-failure problems of federated learning.
2. The method uses a 3D convolutional CNN optimized with a residual network: it considers not only single-frame image information but also analyzes the temporal information of the video with 3D convolutions, effectively improving detection accuracy on deep-forged video. Meanwhile, to counter the network degradation caused by an overly deep convolutional network, it borrows the residual network idea and uses skip connections to preserve the progressive relationship of information between layers, keeping network reliability high. Compared with single-frame detection models, this model performs better on every major deep forgery dataset, improving the system's reliability in deep forgery detection.
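The skip connection can be shown in a few lines: a residual block computes F(x) + x, so even if the learned mapping F degrades toward zero in a very deep network, the input information still passes through intact (the functions below are illustrative, not the 3DRes layers).

```python
# Minimal illustration of the residual ("jump") connection.

def residual_block(x, f):
    """y = F(x) + x: apply the learned transform and add the skip connection."""
    return [fx + xi for fx, xi in zip(f(x), x)]

degraded = lambda x: [0.0 for _ in x]   # a transform that has collapsed to zero
halve = lambda x: [0.5 * v for v in x]  # a non-trivial learned transform

x = [0.2, -1.0, 3.5]
```

In the 3DRes model the transform F would be a stack of 3D convolutions sliding over (time, height, width) rather than the toy functions used here.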
3. The invention enriches the available data and improves algorithm accuracy by using federated learning. Considering that under face identity authentication the participants' data share the same feature space but differ in sample space, horizontal federated learning is chosen; the built-in secure multi-party computation, realized with an additively homomorphic encryption algorithm, guarantees privacy and safety between nodes.
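The patent names an additively homomorphic encryption algorithm without specifying which; a toy Paillier-style sketch (a representative choice, with deliberately tiny and insecure primes) shows the key property: multiplying two ciphertexts yields a ciphertext of the sum of the plaintexts, which is what lets a server aggregate encrypted gradients without reading them.

```python
import math
import random

# Toy Paillier cryptosystem: additively homomorphic, illustration only.
p, q = 47, 59
n = p * q
n2 = n * n
g = n + 1
lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)

def L(u):
    return (u - 1) // n

mu = pow(L(pow(g, lam, n2)), -1, n)  # modular inverse (Python 3.8+)

def encrypt(m):
    r = random.randrange(2, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(2, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return (L(pow(c, lam, n2)) * mu) % n

a, b = 123, 456
c_sum = (encrypt(a) * encrypt(b)) % n2  # homomorphic addition in ciphertext space
```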
4. The invention combines federated learning with blockchain technology and proposes the concept of a federated anti-counterfeiting traceability chain. The system uses the blockchain as the underlying data platform to store all users' video information and each video's circulation path, guaranteeing that tracing paths are authentic and reliable and that user information is stored in a distributed way. Using the properties of the blockchain, the system solves the data-poisoning and single-point-of-failure problems of federated learning in artificial intelligence systems.
Drawings
FIG. 1 is a system architecture diagram of the present invention.
FIG. 2 is a schematic diagram of the Deepfakes model architecture of the present invention.
FIG. 3 is a diagram of the CycleGAN model architecture of the present invention.
FIG. 4 is a graph of AUC scores for each data set of the present invention.
FIG. 5 is a schematic diagram of the two-dimensional convolution of the present invention.
FIG. 6 is a schematic diagram of the 3D convolution of the present invention.
FIG. 7 is a diagram of a residual block-introduced neuron according to the present invention.
Fig. 8 is a graph of the Loss variation of the present invention.
FIG. 9 is a graph of the accuracy change of the present invention.
FIG. 10 is a federated learning architecture diagram of the present invention.
Fig. 11 is a blockchain architecture diagram of the present invention.
FIG. 12 is a schematic view of the life cycle of the federated anti-counterfeiting consensus mechanism of the present invention.
FIG. 13 is a flow chart of a risk prediction algorithm of the present invention.
FIG. 14 is a bar graph of user profiles for the present invention.
FIG. 15 is the ROC curve of the logistic regression (Log) model of the present invention.
FIG. 16 is the ROC curve of the naive Bayes model of the present invention.
FIG. 17 is the ROC curve of the XGBoost model of the present invention.
FIG. 18 is the ROC curve of the ensemble model of the present invention.
Detailed Description
In order to more clearly understand the technical features, objects, and effects of the present invention, embodiments of the present invention will now be described with reference to the accompanying drawings.
The first embodiment is as follows:
in this embodiment, as shown in fig. 1, a deep forgery traceability system based on a big data federated learning architecture includes an application layer, an interface layer, a logic layer, a network layer and a storage layer, sequentially connected. The application layer provides the deep forgery traceability service to users through a Web browser application and collects user login and upload data; the interface layer provides interface services and realizes communication between the server side and the web side; the logic layer divides the system functions and designs the algorithms and models that implement the functional logic; the network layer exchanges parameters between the logic layer and the storage layer and encrypts gradient information during the logic layer's modeling; and the storage layer receives the parameter information and encrypted information transmitted by the network layer and stores them in the local database and the blockchain network, respectively.
In this embodiment, at the application layer an operator uses a Web browser application to obtain services such as deep forgery detection or deep forgery tracing. The system front end is built on the Vue framework, which implements the core deep forgery tracing function modules together with an auxiliary module against common Web attacks.
At the interface layer, four types of interfaces are designed around the system functions and the communication between the server and the Web side: a blockchain tracing interface, a federated learning online inference API, a foreground data interface, and a back-end data interface; each interface can be further subdivided during implementation.
At the logic layer, six logic modules are divided according to the functions to be realized: a login registration module, an attack prevention module, a federated learning module, a blockchain traceability module, a deep forgery detection module and a risk prediction module, each containing the logic flow or functional classification of its function.
The login registration module performs identity authentication and string-format pre-validation on information entered by the user, and returns the judgment result and a login session record to the server. The attack prevention module screens the model training data for poisoning attacks, and links the data source accessed by each participant onto the distributed ledger provided by the blockchain for data audit. The deep forgery detection module implements the 3DRes deep forgery detection algorithm, preprocessing the original dataset and performing deep forgery detection on user-uploaded data through a 3DRes detection model. The federated learning module carries out horizontal federated learning training on the preprocessed original dataset and the user information passed from the application layer, and outputs the trained model. The blockchain traceability module constructs the federated anti-counterfeiting traceability chain, calling a smart contract to take the user ID, the timestamp and the server-generated video ID passed from the application layer as transaction information, packaging them into a block and uploading it to the chain. The risk prediction module builds a risk prediction model from the desensitized profile data of user nodes using machine learning algorithms, to predict a node's future risk behavior.
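A hedged sketch of the risk-prediction step: a tiny pure-Python logistic regression fitted on made-up desensitized node features. The feature names are hypothetical, and per figs. 13 to 18 the real module compares several learners (logistic regression, naive Bayes, XGBoost, and an ensemble).

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit(X, y, lr=0.5, epochs=2000):
    """Plain stochastic gradient descent on the logistic loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def predict_risk(w, b, x):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Hypothetical desensitized features per node: [past tamper reports, uploads per day].
X = [[0, 1], [0, 2], [3, 5], [4, 4]]
y = [0, 0, 1, 1]  # 1 = node later showed risky behavior
w, b = fit(X, y)
```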
The network layer follows the requirements of federated learning: gradient information produced during modeling must be encrypted to prevent privacy leakage.
At the storage layer, in consideration of the underlying architecture, a MySQL database and the blockchain serve as the underlying storage platforms.
The embodiment can achieve the following technical effects:
The overall framework of the federated anti-counterfeiting traceability chain designed in this embodiment is distributed: nodes are mutually independent and autonomous, while all nodes can still be combined, logically aggregating the model parameters of every node. The cloud completes the aggregation and update of the model parameters and returns the updated parameters to each participant's terminal, and each terminal starts the next iteration. This procedure repeats until the whole training process converges.
The second embodiment is as follows:
In this embodiment, the functions of the system are elaborated and further optimized on the basis of the first embodiment. To address problems such as deep forgery detection and privacy protection, the following core algorithms are designed. This embodiment mainly introduces the key algorithms used in the system, shows the relevant principles, and presents the corresponding test results.
1. 3DRes deep forgery detection algorithm design
First, the principle of deep forgery is analyzed, laying the groundwork for the proposed 3DRes deep forgery detection algorithm; second, the original dataset is preprocessed; finally, the 3DRes deep forgery detection model is constructed. Experimental results show that, compared with traditional algorithms, 3DRes greatly improves detection accuracy.
1.1 principle of deep forgery
Although every user of the system can continuously update and optimize its DeepFake detection algorithm through federated learning, a detection model is still needed when the system is first built; this model is pre-trained on an open-source DeepFake dataset so that the system performs well at launch. Building this initial DeepFake detection model requires a well-made DeepFake video dataset with sufficient data.
Traditional image face-swapping is mainly based on graphics: the face is modified using graphical transformations to achieve the swap. For example, FaceSwap, developed by Marek Kowalski, is a graphics-based method: it first obtains facial keypoints, renders their positions through a 3D model while continuously reducing the difference between the target shape and the keypoint localization, then blends the rendered model images and applies color correction to obtain the final image.
Although graphics-based face tampering has been studied for many years, its heavy computation, high barrier to use, high development cost and difficulty in fooling the human eye have kept it from spreading. With the development of deep learning in recent years, researchers turned to applying deep neural networks (DNNs) to face swapping. The best known of these is DeepFakes, a milestone deep-learning face-swapping system, from which the term deep forgery (DeepFake) derives. DeepFakes trains two autoencoders whose encoders share weight parameters, so that the two subsequent decoders each learn the ability to reconstruct a face. After training, the two decoders are exchanged in the face-swapping stage to achieve the swap. The model architecture diagram is shown in fig. 2.
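The shared-encoder, two-decoder scheme can be sketched with simple linear maps standing in for the convolutional networks (a conceptual illustration, not the Deepfakes implementation): decoding identity A's code with B's decoder performs the swap.

```python
def encoder(face):                  # shared weights for both identities
    return [v * 0.5 for v in face]

def decoder_a(code):                # trained to reconstruct identity A
    return [v * 2.0 for v in code]

def decoder_b(code):                # trained to reconstruct identity B
    return [v * 2.0 + 1.0 for v in code]

face_a = [0.2, 0.8, 0.4]
reconstructed_a = decoder_a(encoder(face_a))  # training objective: match face_a
swapped = decoder_b(encoder(face_a))          # A's expression through B's decoder
```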
The model can be trained, deployed and used with nothing more than face images of the source and target identities, achieves a good replacement effect, greatly lowers the barrier to use, and can even fool the human eye to a degree; it propelled the development of the whole DeepFake technology.
Although deep-neural-network-based models train well and conveniently, they require certain training skills, otherwise the generator's output quality cannot be guaranteed. Researchers therefore began fusing generative adversarial network (GAN) technology with DeepFake. The open-source faceswap-GAN system developed by shaoanlu et al. combines the traditional Encoder-Decoder DeepFakes architecture with GAN technology: the model uses a CycleGAN-style architecture and introduces the adversarial loss of a discriminator, which judges the similarity between generated and original images during generation, greatly improving the quality of the generated images; a perceptual loss is also introduced to improve the realism of eye movement. Adding GAN technology makes face swaps more vivid and natural, and it further increased the popularity of deep forgery technology. The model architecture is shown in fig. 3.
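The adversarial loss that the discriminator introduces is, in the standard GAN formulation,

```latex
\min_G \max_D V(D,G)
  = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_z(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

where the discriminator D learns to distinguish real images x from generated ones G(z) while the generator G learns to fool it; faceswap-GAN adds a perceptual loss term on top of this objective to improve details such as eye movement.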
1.2 dataset selection
As DeepFake video generation technology matures, multiple large, high-quality open-source DeepFake video datasets are now available, including FaceForensics++, The DeepFake Detection Challenge (DFDC), and Celeb-DF, among others.
Among these, FaceForensics++ is one of the largest and most diverse deep forgery datasets at present. It consists of 1000 YouTube videos labeled as containing faces, news anchors and news broadcasts. Its deep-forged videos are generated with four different methods: DeepFakes (Encoder-Decoder model), Face2Face, FaceSwap, and NeuralTextures.
However, the dataset's quality is not very high: tampering traces are plainly visible to the human eye and the modified contours are obvious; the synthesized fake videos also exhibit face flicker.
The DFDC dataset comes from the DeepFake Detection Challenge, a competition held by Facebook in 2019 to advance deep forgery research. It provides a public dataset of 5214 videos with a real-to-fake ratio of 1:0.28. The original videos were all shot by actors; the forged videos use two tampering modes, with many replacements between similar faces (skin color, hair, eyes, and so on), and each video is a short clip of about 15 s. The full dataset released after the competition contains 119,169 videos, a huge volume of data.
We chose the latest DeepFake video data set, Celeb-DF (v2), published in May 2020, which contains 5639 optimized DeepFake videos and solves many problems of existing DeepFake data sets, such as Low resolution of synthesized faces, Color mismatch of synthesized parts, Inaccurate face masks, and Temporal flicker.
The Celeb-DF data set reduces the inconsistency between the modified region and its surroundings by increasing the size of the generated images and augmenting hue, brightness, contrast, and so on during the training stage; in addition, it reduces the face flicker phenomenon by using more accurate facial key point localization. By optimizing the DeepFake video generation algorithm in these ways, it achieves the highest Mask-SSIM score among existing DeepFake data sets.
Testing existing DeepFake detection models shows that they achieve their lowest accuracy on Celeb-DF among existing large DeepFake video data sets, meaning it is currently the most challenging DeepFake data set. The average AUC score of each data set over the DeepFake detection algorithms is shown in fig. 4. As the figure shows, the average score on Celeb-DF is much lower than on many existing DeepFake data sets, and an AUC score of nearly 50 means many existing DeepFake detection models are almost ineffective on it. We therefore chose this most challenging data set to measure model performance as rigorously as possible.
1.3 preprocessing (Pre-Processing)
In a DeepFake video, the spatio-temporal artifacts we detect appear only in the modified face region, so we need to focus only on the face region of the video. During preprocessing, the face region is cropped frame by frame to form a video containing only the face, deleting the useless environment information in the video. Because the data set is very large and manual cropping and labeling is not practically feasible, we need a model that can perform facial cropping for data set preprocessing.
Face detection is one of the earliest-studied problems in deep learning and one with the most advanced results, so many open-source models are already available, and models proposed in recent years perform even better. One example is the face detection model based on Multi-task Cascaded Convolutional Networks (MTCNN) proposed by Kaipeng Zhang et al. in "Joint Face Detection and Alignment Using Multi-task Cascaded Convolutional Networks", which is divided into three parts: P-Net scans the input image with small convolution kernels to generate face candidates, R-Net then performs regression to eliminate redundant candidate boxes, and O-Net finally produces the face detection result.
MTCNN achieved state-of-the-art performance on the well-known face data set WIDER FACE. Besides MTCNN, face detection models such as Dual Shot Face Detector (DSFD) and RetinaFace also perform well on large open-source face data sets.
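The three-stage MTCNN cascade described above can be sketched as a coarse-to-fine filter. This is a minimal illustration of the control flow only: the three "nets" are stand-in scoring callables and the thresholds are hypothetical, not MTCNN's actual values.

```python
# Minimal sketch of MTCNN's three-stage cascade (P-Net -> R-Net -> O-Net).
# The nets themselves are replaced by hypothetical scoring callables; the
# point is the coarse-to-fine filtering structure, not real detection.

def cascade_detect(candidates, p_net, r_net, o_net,
                   p_thresh=0.6, r_thresh=0.7, o_thresh=0.8):
    """Each stage rescores the surviving boxes and drops the weak ones."""
    stage1 = [b for b in candidates if p_net(b) >= p_thresh]   # cheap proposals
    stage2 = [b for b in stage1 if r_net(b) >= r_thresh]       # refine, prune
    return [b for b in stage2 if o_net(b) >= o_thresh]         # final faces

# Toy usage: boxes are (x, y, w, h); the "nets" just read a stored score.
boxes = [((0, 0, 10, 10), 0.9), ((5, 5, 8, 8), 0.5), ((2, 2, 6, 6), 0.75)]
score = dict(boxes)
faces = cascade_detect([b for b, _ in boxes], score.get, score.get, score.get)
```

Each successive stage is more expensive but sees far fewer candidates, which is what makes the cascade efficient.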
Weighing accuracy against efficiency, we selected the RetinaFace model proposed by InsightFace in 2019. The model uses Extra-supervised and Self-supervised Multi-task learning, improving the robustness of the overall architecture; it also adopts Deformable Convolution and Dense Regression Loss, optimizing the accuracy of the face detection process. The loss function of the model is defined as:
$$L = L_{cls}(p_i, p_i^{*}) + \lambda_1 p_i^{*} L_{box}(t_i, t_i^{*}) + \lambda_2 p_i^{*} L_{pts}(l_i, l_i^{*}) + \lambda_3 p_i^{*} L_{pixel}$$
The loss is the sum of four sub-loss functions: Face Classification Loss, Face Box Regression Loss, Face Landmark Regression Loss, and the Dense Regression Loss mentioned above. Training computes these multi-task losses jointly, which optimizes the overall accuracy of the model to a great extent.
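The structure of such a multi-task loss can be sketched as a gated weighted sum: the regression terms only contribute for positive anchors. The lambda weights below are illustrative placeholders, not values confirmed by this document.

```python
# Hedged sketch of a RetinaFace-style multi-task loss: the total is the
# classification term plus three regression terms that apply only to
# positive anchors (p_star == 1). Lambda weights here are illustrative.

def multi_task_loss(l_cls, l_box, l_pts, l_pixel, p_star,
                    lam1=0.25, lam2=0.1, lam3=0.01):
    return l_cls + p_star * (lam1 * l_box + lam2 * l_pts + lam3 * l_pixel)

# A negative anchor (p_star = 0) contributes only its classification loss.
neg = multi_task_loss(0.7, 1.0, 1.0, 1.0, p_star=0)
pos = multi_task_loss(0.7, 1.0, 1.0, 1.0, p_star=1)
```

Gating by `p_star` means background anchors never pull on the box, landmark, or dense-regression branches, which keeps the tasks from interfering.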
Compared with MTCNN, the most widely used face detection algorithm, RetinaFace performs better on multiple data sets with higher accuracy. The verification accuracy (with ArcFace) on face data sets such as LFW is shown in Table 2-1.
TABLE 2-1 Performance comparison

Model      | LFW   | CFP-FP | AgeDB-30
MTCNN      | 99.83 | 98.37  | 98.15
RetinaFace | 99.86 | 99.49  | 98.60
Unlike MTCNN and similar models, RetinaFace is a Single-Stage face detection algorithm and therefore processes faster. With sampling and re-weighting, the positive/negative sample imbalance that single-stage algorithms face (compared with multi-stage ones) is resolved without sacrificing much accuracy. RetinaFace also provides a lightweight model deployed on the MobileNet architecture, further shortening inference time: it can achieve millisecond-level processing on a CPU, making preprocessing faster. Model inference times under different devices and resolutions are shown in Table 2-2.
TABLE 2-2 Inference time on different devices

Device                              | VGA (640*480) | HD (1920*1080) | 4K
GPU (NVIDIA Tesla P40)              | 1.4 ms        | 6.1 ms         | 25.6 ms
CPU (Intel i7-6700K, single thread) | 5.5 ms        | 50.3 ms        | -
CPU (Intel i7-6700K, multi-thread)  | 17.2 ms       | 130.4 ms       | -
ARM (RK3399)                        | 61.2 ms       | 434.3 ms       | -
In actual use, RetinaFace delivers high accuracy and stable frame-by-frame bounding boxes, which greatly facilitates the subsequent DeepFake discrimination process on the cropped video and improves recognition accuracy to a certain extent.
1.4 DeepFake detection algorithm
For the model, we use a convolutional neural network based on spatio-temporal features (3D Residual Convolutional Neural Network, 3DRes), optimized with a residual network structure. The model extends the image convolutional neural network (CNN) to the time dimension, improving feature extraction and recognition accuracy; at the same time, the residual network structure solves the degradation phenomenon of deep neural networks.
Most existing DeepFake detection models, such as the Two-Stream Network proposed by Peng Zhou et al., the XceptionNet-based model of Andreas Rossler et al., and the Capsule Network used by HH Nguyen et al., are based on analyzing individual video frames. This is generally more efficient and worked to some degree on early DeepFake data sets. However, because temporal information is ignored and inter-frame changes cannot be analyzed deeply, the recognition rate remains insufficient when migrating to higher-quality data sets such as Celeb-DF. The CNN based on Spatio-Temporal Features that we adopted solves this problem: a preprocessed sequence of consecutive video frames is fed into the model for training, features are extracted by 3D convolution, and classification uses the visual artifacts that may appear in a DeepFake video.
However, because a 3D CNN requires a deeper network, many problems may occur, especially when the training set is insufficient: Overfitting, Gradient Vanishing or Gradient Explosion, and the Degradation phenomenon all arise easily. Because of this, we also use a residual network to optimize the model. The Residual Block inside the residual network uses skip connections, which alleviate the gradient vanishing caused by increasing depth in deep neural networks and solve the degradation problem.
(1)3D CNN
The conventional convolutional neural network (CNN) generally performs two-dimensional convolution for image recognition, that is, a two-dimensional convolution over one or more two-dimensional Feature Maps, as shown in fig. 5.
However, features extracted by two-dimensional convolution lose information in the time dimension, and temporal information greatly helps the visual artifact detection process in deep forgery detection; this is one reason why detection algorithms based on single frames (pictures) cannot achieve higher accuracy on more realistic data sets. To capture temporal information, we feed adjacent consecutive frames into the CNN model together, replacing the traditional two-dimensional convolution with three-dimensional convolution and three-dimensional pooling. A 3D Convolution Kernel acts on several adjacent frames simultaneously, so the network obtains the temporal information of the video while extracting features, and the model captures the change information between adjacent frames, which is the basis for detecting spatio-temporal artifacts.
The formalization for the 3D convolution process is described as follows.
$$v_{ij}^{xyz} = \tanh\left(b_{ij} + \sum_{m}\sum_{p=0}^{P_i-1}\sum_{q=0}^{Q_i-1}\sum_{r=0}^{R_i-1} w_{ijm}^{pqr}\, v_{(i-1)m}^{(x+p)(y+q)(z+r)}\right)$$
The convolution form is shown in fig. 6. As the schematic shows, the network performs a convolution operation on 3 consecutive frames: the 3D convolution stacks several consecutive frames into a cube and then applies a 3D convolution kernel inside that cube. In this configuration, each Feature Map in the convolutional layer is connected to multiple adjacent consecutive frames in the previous layer, thereby capturing motion information. As the dots on the right side of the figure show, the value at a position of each convolved Feature Map is obtained by convolving the same position of three consecutive frames of the previous layer.
In implementation, since the purpose of the 3D convolution kernel is to extract the same type of features from consecutive frames, while considering the simplicity of the network, the 3D convolution kernel uses the same weights, i.e., shared weights, in different frames. In addition, various convolution kernels can be adopted to extract various features to expand the number of Feature maps.
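The 3D convolution just described can be written out as a triple sliding window with one shared kernel. The sketch below is a plain-Python illustration (no deep learning framework), with a toy temporal-difference kernel to show how a 3D kernel picks up change between frames.

```python
# A minimal pure-Python 3D convolution over a stack of frames (T x H x W):
# one kernel slides over time, height and width, with the same (shared)
# weights applied at every frame position.

def conv3d(volume, kernel):
    T, H, W = len(volume), len(volume[0]), len(volume[0][0])
    t, h, w = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for z in range(T - t + 1):
        plane = []
        for y in range(H - h + 1):
            row = []
            for x in range(W - w + 1):
                s = sum(volume[z + r][y + q][x + p] * kernel[r][q][p]
                        for r in range(t) for q in range(h) for p in range(w))
                row.append(s)
            plane.append(row)
        out.append(plane)
    return out

# Toy usage: 3 frames of 3x3, and a 3x1x1 temporal-difference kernel that
# responds to change between frames -- the "motion information" in the text.
frames = [[[1] * 3] * 3, [[1] * 3] * 3, [[2] * 3] * 3]  # last frame brightens
kernel = [[[1]], [[0]], [[-1]]]                          # f(t) - f(t+2)
motion = conv3d(frames, kernel)
```

A static scene would yield all zeros here; the nonzero response comes entirely from inter-frame change, which a 2D convolution over a single frame could never see.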
3D CNNs are already well applied in the field of Action Recognition, achieving state-of-the-art results on multiple data sets. Since DeepFake detection, like action recognition, is a video classification task, migrating this technique to deep forgery detection is reasonable, and it indeed achieves good results.
(2)Residual Network
A 3D CNN is a huge model with a rather deep network structure and a large number of parameters; without optimization, it is prone to Overfitting, Gradient Vanishing and Gradient Explosion, and network Degradation.
The first two can be avoided to some extent by model optimization methods: overfitting can be reduced with Dropout regularization, while gradient vanishing and explosion can be addressed by replacing the activation function or by using Batch Normalization to change the distribution of neuron values between layers, so that gradients do not change excessively as they propagate between layers.
These methods, however, have no obvious effect on network degradation. A 3D CNN needs a sufficient number of layers to show good accuracy, and if the network saturates before training completes, it cannot achieve a good result.
Here we apply a residual network model to the 3D CNN. The residual network replaces the traditional neuron structure by introducing the Residual Block, realizing skip connections between different layers of the network and solving the network degradation problem.
As fig. 7 shows, in each residual block the input feature x propagates through two paths. First, as in a conventional neural network, weights and biases are applied and the feature output is obtained through an activation function. In addition, x is combined with the computed features through an Identity Mapping, so that a deeper layer always carries at least as much information as a shallower one, thereby avoiding network degradation. The forward propagation of features and the gradient computation between residual blocks are formalized as:
$$x_L = x_l + \sum_{i=l}^{L-1} F(x_i, W_i)$$

$$\frac{\partial\,\mathrm{loss}}{\partial x_l} = \frac{\partial\,\mathrm{loss}}{\partial x_L}\left(1 + \frac{\partial}{\partial x_l}\sum_{i=l}^{L-1} F(x_i, W_i)\right)$$
After the 3D CNN is optimized with the residual network framework, the number of network layers can be deepened without degradation, further improving the accuracy of model recognition.
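The residual block described above reduces to "output = activation(F(x) + x)". This is a toy sketch with plain lists standing in for tensors and a trivial affine transform standing in for the block's convolution layers, just to show the skip-connection property.

```python
# Sketch of the residual (skip) connection from fig. 7: the block's output is
# its transformed features plus the untouched input x, so there is always a
# direct path for information (and gradients) past the weight layers.

def relu(v):
    return [max(0.0, x) for x in v]

def residual_block(x, weight, bias):
    """out = relu(F(x) + x), with F a toy per-element affine transform."""
    fx = [w * xi + b for xi, w, b in zip(x, weight, bias)]  # "weight layer"
    return relu([f + xi for f, xi in zip(fx, x)])           # identity mapping

# Even a zero-weight block passes x straight through -- the property that
# lets a deeper network never do worse than a shallower one.
x = [1.0, 2.0, 3.0]
out = residual_block(x, weight=[0.0, 0.0, 0.0], bias=[0.0, 0.0, 0.0])
```

Because the identity path contributes the constant 1 in the gradient formula above, the gradient at a shallow layer never vanishes entirely, which is exactly why degradation is avoided.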
1.5 training Effect and analysis
We implemented 3DRes with Keras on the TensorFlow backend. The network structure is based on the 18-layer structure used by Kensho Hara et al. in the field of action recognition, improved and tested to obtain the 3DRes model we use. The model consists of five convolution stages: the input data is first Down-Sampled by a 7×7×7 3D convolution kernel; after 3D pooling, the Feature Maps are expanded through four further 3D convolution stages with gradually increasing numbers of kernels, adding feature extraction; finally a fully connected layer performs the binary classification task, realizing deep forgery detection.
In the implementation, because the data set is too large to load into memory at once, a Data Generator is used to feed training, and the Batch Size is reduced to accommodate training on large data.
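The generator-based feeding just mentioned can be sketched as follows. `load_clip` is a hypothetical loader (the real one would read and decode a preprocessed face clip); the point is that only one batch of samples is ever materialized at a time.

```python
# Sketch of generator-based training: instead of loading the whole data set,
# yield one batch of (clip, label) pairs at a time so that only batch_size
# samples ever sit in memory. load_clip is a hypothetical stand-in loader.

def batch_generator(sample_paths, labels, load_clip, batch_size=4):
    for i in range(0, len(sample_paths), batch_size):
        clips = [load_clip(p) for p in sample_paths[i:i + batch_size]]
        yield clips, labels[i:i + batch_size]

# Toy usage with a stand-in loader that just echoes the path.
paths = [f"clip_{i}.npy" for i in range(10)]
labels = [i % 2 for i in range(10)]
batches = list(batch_generator(paths, labels,
                               load_clip=lambda p: p, batch_size=4))
```

Keras-style training loops accept exactly this kind of generator, which is what makes large-data training feasible with a small memory footprint.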
The variation of Loss with Epoch during training is shown in fig. 8. As the figure shows, the model is basically fitted at about 30 epochs and the overall Loss is at a low level; combined with the results on the validation set, neither overfitting nor network degradation occurs.
In addition, the curve of model accuracy against Epoch is shown in fig. 9. Overall accuracy is high: the classification accuracy reaches more than 90%, with excellent performance on the validation set.
The embodiment can achieve the following technical effects:
In this embodiment, a three-dimensional convolutional CNN optimized by a residual network is used: in addition to single-frame image information, the temporal information of the video is analyzed using 3D convolution, effectively improving the detection accuracy for deep-forged videos. Meanwhile, to solve the neural network degradation caused by an overly deep convolutional network, the residual network idea of skip connections preserves the progressive relation of inter-layer information and keeps the reliability of the network at a high level. Compared with single-frame detection models, this model performs better on all major deep forgery data sets, improving the reliability of the system's deep forgery detection.
Example two:
In this embodiment, the system functions are further designed and optimized in detail on the basis of the first embodiment. Since deep forgery detection mainly targets human faces, and such data is sensitive, choosing a federated learning architecture is reasonable. In this application scenario, given the privacy of the source data, the federated learning ecosystem for deep fake content authentication aims to ensure that each entity's data never leaves its local environment; the federated system then establishes a virtual common model by exchanging parameters under an encryption mechanism, that is, without violating data privacy regulations. This virtual model is equivalent to the optimal model that would be established by aggregating all the user data together. But when the virtual model is built, the data does not move, privacy is not disclosed, and data compliance is not affected; the built models serve only local targets in their respective regions. This shows that the goal can be achieved using federated learning.
Currently, federated learning is divided into horizontal federated learning, vertical federated learning, and federated transfer learning according to differences in the data feature space and sample space. Horizontal federated learning applies mainly when data sets share the same feature space but differ in samples; the vertical case is the opposite; and federated transfer learning targets the case where both the feature spaces and the sample spaces differ. The method provided by this system is applied to face identity authentication, where the data features are the same but the sample data comes from different organizations, so a horizontal federated learning mode is adopted.
The federated learning process is divided into two parts: an autonomous part and a union part.
The autonomous part: first, two or more participants install the initialized model on their respective terminals, each participant holding the same model; each participant then trains the model with its local data. Because the participants hold different data, the models trained at the terminals end up with different model parameters.
The union part: the different model parameters are uploaded to the cloud simultaneously, the cloud aggregates and updates the model parameters, the updated parameters are returned to the participants' terminals, and each terminal starts the next iteration. This procedure repeats until the whole training process converges.
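One round of this autonomous/union cycle can be sketched as a FedAvg-style averaging step. `local_step` is a hypothetical stand-in for each participant's local training; only parameters, never data, cross the boundary.

```python
# Minimal sketch of one federated round: each participant trains locally on
# its own data (autonomous part), and the server averages the returned
# parameters into a new global model (union part). local_step is hypothetical.

def federated_round(global_params, participants, local_step):
    updates = [local_step(dict(global_params), data)      # autonomous part
               for data in participants]
    return {k: sum(u[k] for u in updates) / len(updates)  # union part
            for k in global_params}

# Toy usage: two clients each pull a single weight toward their local mean.
clients = [[1.0, 1.0], [3.0, 5.0]]
step = lambda params, data: {"w": sum(data) / len(data)}
new_global = federated_round({"w": 0.0}, clients, step)
```

In a real deployment the parameter exchange would additionally go through the encryption mechanism the text describes; this sketch shows only the data-stays-local aggregation structure.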
As shown in fig. 10, the system mainly realizes horizontal federated learning with local offline modeling and an online inference service API, under full analysis of the data characteristics, targeting the problems of low accuracy, small data volume, and insufficient privacy protection when deep forgery recognition is performed in a purely local environment. The horizontal federated learning training process in this embodiment includes:
2.1 Horizontal federated learning local modeling
For the application scenario of deep fake content detection, the pain points are the island distribution of data and the risk of data privacy leakage. Since the participants' multimedia video data require fused detection in deep forgery detection, the data sets can be considered to overlap heavily in feature dimensions while overlapping little in users. Federated learning with feature alignment mainly serves to increase the number of training samples, which matches the definition and characteristics of horizontal federated learning. Concretely, the horizontal federated learning mechanism distributes the data across different machines; each machine downloads the model from the server, trains it with local data, and returns the parameters to be updated to the server; the server aggregates the parameters returned by each machine, updates the model, and feeds the latest model back to every machine.
There are many open-source federated learning frameworks to choose from, such as TensorFlow Federated developed by Google, and Clara FL developed by NVIDIA for distributed collaborative federated learning training on NVIDIA NGC-Ready servers. The open-source framework this system builds on is FATE. Compared with other open-source frameworks, FATE is oriented more toward engineers and supports custom modules implementing its federated learning algorithms. FATE originated from WeBank's real-world big-data risk-control modeling, so it is a production system with excellent scalability. FATE ships with a distributed computing framework whose stability is superior to typical research systems, and developers with some development and algorithm skills can use it to implement and complete a closed-loop secure modeling workflow.
The FATE environment is deployed with Docker for local modeling. We split the video data set horizontally (i.e., along the user dimension), extracting for training the parts where the two parties' features are the same but the users are not exactly the same.
Based on the deep learning model described above, we develop it as an independent functional module using a decoupled, modular approach. Meanwhile, considering that FATE federated learning cannot handle multi-source heterogeneous data, we rewrote FATE on top of the original federated learning framework: we reuse the neural network construction module according to its technical characteristics, import the machine learning model proposed by this system, and modify the test_home_nn_keras_temporal configuration. To let federated learning support complex data structures such as the video data this system processes, a preprocessing layer communicating with the preprocessing logic of the first embodiment is provided.
After modifying the names and namespaces of the guest and host data sets, the host data and the guest data need to be uploaded separately, and at this point a configuration file for uploading the data must be defined.
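An upload configuration of this kind might look like the following. The field names follow the common shape of FATE upload configurations, but both the fields and every value here are illustrative placeholders that should be checked against the FATE version in use.

```python
# Illustrative upload configurations for the two parties. All paths, table
# names and namespaces are hypothetical placeholders; verify the exact
# configuration keys against the deployed FATE version's documentation.

guest_upload_conf = {
    "file": "data/guest_faces.csv",   # local path on the guest machine
    "head": 1,                        # first row is a header
    "partition": 4,
    "table_name": "deepfake_guest",
    "namespace": "experiment_guest",
}
host_upload_conf = dict(guest_upload_conf,
                        file="data/host_faces.csv",
                        table_name="deepfake_host",
                        namespace="experiment_host")
```

The key point is that each party uploads under its own table name and namespace, so the two data sets stay separate and addressable during the federated job.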
Considering the requirements of the actual scenario, a pre-trained model service is generally desired before federation, so the model from the first embodiment is serialized here and imported into the FATE federated learning framework, so that subsequent federated learning training can proceed from the model parameters.
Analysis of the full workflow shows that FATE federated learning training should be initiated by the guest party, so we log into the guest's Python container and run the neural network task with fate_flow. After the model is obtained, it is used for model prediction.
2.2 Online inference service API
To run model prediction quickly and directly from a Python script, a pipeline component needs to be defined. Automated components are defined according to the DSL language specification.
2.3 Principle of and defense strategy against sample poisoning attacks on federated learning
Federated learning is not perfect in itself and faces security threats. It gives participants the ability to build powerful machine learning models collaboratively and uses privacy protection mechanisms to protect data privacy. However, federated learning has been questioned to some extent because it is vulnerable to backdoor attacks.
For example, a malicious participant may poison the machine learning model with malicious training samples and use model replacement techniques to damage the performance of the final model. In such data attacks, a malicious attacker cannot directly change the central node's model, but can achieve untargeted or targeted attacks by tampering with a client's data, features, or labels. In federated learning, each participant is an independent individual, and the server generally has no ability to check whether a participant's data is normal. If an attacker poisons data or the model from inside federated learning, as in the attack proposed by Chen et al., a success rate above ninety percent can be achieved with only a small number of poisoned samples, planting hidden dangers in the resulting model, steering the trained parameters toward the attacker's desired result, lowering the model's prediction accuracy, and degrading performance. As with model update attacks, malicious data attacks are difficult to detect from indicators such as global accuracy or a single client's training accuracy alone. Many security protocols have been developed to defend against malicious attacks, such as distillation and training regularization. However, to actively prevent malicious attacks on federated learning rather than passively defend, a mechanism is still needed that can effectively detect intentional attacks and identify the malicious party.
The root cause of data poisoning is that data fed to the model without scrutiny may be erroneous or even deliberately corrupted. The data protection measure proposed here is therefore to audit the source of the data before model training and, whenever data security cannot otherwise be guaranteed, to verify that the data is intact and unmodified. When data is uploaded, the identity authentication mechanism and the deep forgery detection mechanism can be used to ensure the reliability of the uploaded data. At the same time, our chosen technical means is to use a blockchain to prevent sample poisoning attacks against federated learning. Blockchains are tamper-resistant and traceable, making them an effective tool for preventing malicious attacks in federated learning. More specifically, the data sources accessed by each participant can be linked to the distributed ledger provided by the blockchain to audit the model's input data, which helps detect tampering attempts and malicious model replacements.
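The audit idea above, chaining each participant's data contribution into an append-only ledger, can be sketched with a simple hash chain. The record strings below are hypothetical; a real ledger would be the blockchain itself, but the detection property is the same.

```python
# Sketch of a hash-chained audit log: each entry's hash covers the previous
# hash, so modifying any earlier record breaks every later hash and the
# tampering is immediately detectable on verification.

import hashlib

def append_record(ledger, record):
    prev = ledger[-1]["hash"] if ledger else "0" * 64
    h = hashlib.sha256((prev + record).encode()).hexdigest()
    ledger.append({"record": record, "hash": h})

def verify(ledger):
    prev = "0" * 64
    for entry in ledger:
        expect = hashlib.sha256((prev + entry["record"]).encode()).hexdigest()
        if expect != entry["hash"]:
            return False
        prev = entry["hash"]
    return True

ledger = []
append_record(ledger, "clientA:batch1:digest=ab12")   # hypothetical records
append_record(ledger, "clientB:batch1:digest=cd34")
ok_before = verify(ledger)
ledger[0]["record"] = "clientA:poisoned"              # simulated tampering
ok_after = verify(ledger)
```

This is why linking each participant's data source to the distributed ledger helps surface poisoning attempts: a poisoned input no longer matches its committed record.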
2.4 Federal anti-counterfeiting traceability chain implementation principle
A blockchain is a distributed storage structure, with data no longer maintained solely by a trusted center, but instead maintained and stored together by users. Due to the distributed storage characteristic of the block chain, a user can obtain a copy of complete data, and the non-tamper property and the error recovery capability of the data are ensured. Meanwhile, the consensus algorithm of the block chain enables different users to achieve consensus, so that the integrity of data is maintained together. The system integrates federal learning and block chains, and provides a federal anti-counterfeiting traceability chain.
The system adopts the federal anti-counterfeiting traceability chain mechanism to trace the forwarding process of a video. Its bottom layer is built on the Ethereum platform. The marking and tracing of users' forwarding information is realized through independently written smart contracts, and each user's operation information at forwarding time is recorded into the blockchain through Ethereum's built-in POW consensus mechanism. Thanks to the properties of the blockchain, every piece of forwarding information is guaranteed to be tamper-proof and traceable, ensuring that the video traceability information is authentic and reliable.
2.5 video Forwarding information real-time tracing
In this system, each user and each video has a unique ID value. The user's ID value is sent by the front end to the back-end database and the federal anti-counterfeiting traceability chain network at registration time; the video ID value is automatically generated by the back-end server and sent to the traceability chain network when a video is uploaded. These two ID values are the key to tracing a video with the federal anti-counterfeiting traceability chain system. When a user uploads a video, a smart contract is called to automatically package the user ID passed from the front end, a timestamp, and the server-generated video ID into a block as transaction information, which is uploaded to the federal anti-counterfeiting traceability chain under the consensus mechanism, ensuring the authenticity of the traceability information. When a user forwards an uploaded video, a smart contract is likewise called to package the user ID, timestamp, and video ID as transaction information into a block and upload it to the chain.
The smart contract defines a mapping set of nested structs. Each mapping element is a key-value pair: the "key" is the video ID automatically generated by the server, such as "NWRuxXJ75fBWUjBZi", and the "value" is a struct with two members, ViedoInfo and TraceInfo. ViedoInfo holds information about the video, generally the video name or video ID; TraceInfo holds the video's traceability information, recording the users who forwarded the video and the order of forwarding by string concatenation.
When a user needs to query a video's traceability information, the video ID can be entered in the front-end interface; the geViedoInformationWithID function in the smart contract is then run with that ID to obtain the value struct of the mapping element whose key is the video ID, yielding both the video's information and its traceability information. The blockchain architecture of this embodiment is shown in fig. 11.
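The contract's storage layout and query flow can be mirrored in a short Python sketch: a mapping from video ID to a (VideoInfo, TraceInfo) pair, with each forwarding event appended to the trace string. Names and the trace-string format here are illustrative, not the contract's actual code.

```python
# Python mirror of the contract's mapping: video ID -> {ViedoInfo, TraceInfo},
# with forwarding events appended to TraceInfo by string concatenation,
# as described above. The "@timestamp" format is an illustrative choice.

videos = {}

def upload_video(video_id, info, user_id, timestamp):
    videos[video_id] = {"ViedoInfo": info,
                        "TraceInfo": f"{user_id}@{timestamp}"}

def forward_video(video_id, user_id, timestamp):
    videos[video_id]["TraceInfo"] += f" -> {user_id}@{timestamp}"

def get_video_information_with_id(video_id):
    """Stand-in for the contract's query function."""
    return videos.get(video_id)

# Toy usage mirroring an upload followed by one forward.
upload_video("NWRuxXJ75fBWUjBZi", "demo.mp4", "user1", 100)
forward_video("NWRuxXJ75fBWUjBZi", "user2", 200)
trace = get_video_information_with_id("NWRuxXJ75fBWUjBZi")["TraceInfo"]
```

On chain, the same lookup is keyed by the server-generated video ID, which is why that ID is the anchor of the whole tracing scheme.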
2.6 Tamper resistance of traceability information
Compared with traditional tracing means, tracing with the federal anti-counterfeiting traceability chain network gives the packaged on-chain block information the properties of being tamper-proof and traceable, where the tamper resistance is realized by the federal anti-counterfeiting consensus mechanism.
The tracing part of the system is built on the Ethereum platform. The federal anti-counterfeiting traceability chain network is controlled through the Web3.js interface, smart contracts are deployed on Ethereum, and the relevant information of forwarded and uploaded videos is written into blocks as transactions and uploaded to the blockchain; this process relies on the federal anti-counterfeiting consensus mechanism. The life cycle of the federal anti-counterfeiting consensus mechanism of this embodiment is shown in fig. 12.
2.7 Solving the single-point-failure problem of the federated learning server over a blockchain network
This system adopts a federated learning training strategy with all participating institutions as clients. On the one hand, this guarantees the privacy and security of each institution's users within the inter-institution network; on the other hand, the deployment indirectly uses each institution's user data to help train the deep forgery detection model, solving the island distribution of each institution's data. However, the federated learning training process places certain demands on the stability of each participating server: once a node server fails, part of the data cannot take part in training, affecting the updates of all local models and in turn degrading model quality.
The decentralized nature of blockchain technology addresses the poor reliability, high cost, and low efficiency of the current federated learning training mode. Because a blockchain is tamper-proof, traceable, and distributed, every node holds a backup of the blocks in the chain, so when a node server suffers a single-point failure and its data becomes inaccessible, the data stored on that node can be recovered using the blockchain. There is no central server in a blockchain network: each node verifies the validity of blocks and participates in storing and forwarding them, while blocks and transactions that fail validation are discarded. All nodes in the blockchain network are therefore equal and each holds a complete data record, so even if one or several nodes fail, the integrity of the system is unaffected. The system uses the blockchain network in place of a central server to optimize federated learning; while enhancing the robustness of the system, blockchain technology also largely guarantees the security of users' private data. In short, the blockchain ensures both data security and the stability of distributed training.
In the federated anti-counterfeiting traceability system, this is achieved by combining video IDs with the properties of the blockchain. Because the number of videos in the system far exceeds the number of users, the following design is adopted to ease queries, save storage space, and guard against single-point failures:
each time a user uploads a video, a unique video ID is generated; the video itself is stored on a back-end server, while the video ID is stored both on the back-end server and on the blockchain;
the video IDs written to the blockchain are divided into two parts: one records all video IDs held by a node, and the other is used to trace the circulation of each video.
Because the blockchain cannot be tampered with, the validity of every video ID written to it is guaranteed and the traced path is authentic and reliable; meanwhile, thanks to distributed storage, when a node fails, the set of video IDs it held can be obtained from other nodes, and the corresponding videos can then be downloaded from the back-end server to restore the failed node's data.
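The recovery path described above can be sketched as follows. The `Node` class and function names are hypothetical stand-ins for the blockchain ledger and back-end server, not the patent's code:

```python
# Sketch of single-point-failure recovery via replicated video-ID records.
# Every node keeps a copy of every node's ID set, mimicking the blockchain's
# distributed storage; videos themselves live on the back-end server.

class Node:
    def __init__(self, name):
        self.name = name
        self.id_sets = {}  # replicated ledger: node name -> set of video IDs

    def record_upload(self, owner, video_id):
        self.id_sets.setdefault(owner, set()).add(video_id)

def broadcast_upload(nodes, owner, video_id):
    # An upload is recorded on every node, like a block propagated to the chain.
    for n in nodes:
        n.record_upload(owner, video_id)

def recover_ids(failed_name, peers):
    # Fetch the failed node's video-ID set from any surviving peer; the
    # corresponding videos would then be re-downloaded from the back end.
    for peer in peers:
        if failed_name in peer.id_sets:
            return set(peer.id_sets[failed_name])
    return set()
```

Because every peer holds the same ID sets, any surviving node suffices to reconstruct what the failed node stored.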
2.8 Risk prediction Algorithm design
Having detected deep-fake content and completed anomaly tracing, the system considers follow-up countermeasures against the anomalous generating nodes. It makes full use of the profile data produced during system operation, mines the latent information in that data, predicts node anomalies, and rejects all data operations from high-risk nodes.
From a machine-learning perspective, and working from real desensitized data, the system applies the idea of ensemble learning to combine classical algorithms such as XGBoost, naive Bayes, and logistic regression, improving the model's classification ability. Applied to risk-behavior prediction, it mines users' risk characteristics in depth, builds a risk prediction model, analyzes users' multi-indicator behavior, predicts their future risk behavior more accurately, and provides an effective basis for formulating better risk-control strategies.
The system builds an ensemble learning model around the XGBoost algorithm to perform the risk prediction task. The fused model inherits the classifiers of classical models such as XGBoost, naive Bayes, and logistic regression. XGBoost is a boosting algorithm based on gradient-boosted decision trees; compared with ordinary decision-tree algorithms it achieves higher accuracy and generalizes better across data, which is why it has been widely applied to behavior prediction in recent years. Integrating XGBoost with naive Bayes and logistic regression offsets some of XGBoost's weaknesses. The specific flow chart of the algorithm is shown in fig. 13.
2.8.1 selection of datasets
In the model-building stage, for lack of a directly comparable application scenario, a prediction data set from a similar scenario was downloaded from the Kaggle platform and cleaned, so that the results can reasonably be transferred to the present situation. The system works with two data sets: a training set and a validation set.
2.8.2 core Algorithm selection
As a machine-learning application, the system's performance can be divided roughly into predictive performance and runtime performance. On the one hand, improving predictive performance requires attention to feature processing and to model selection and tuning: multiple data sources should be considered together, and the rich numerical, categorical, and time-series features in the given data set should be exploited as fully as possible to design features that capture the trends and associations in the data. In the model-selection stage, beyond index scores, the actual problem scenario must be considered; recall is emphasized, and identification accuracy is improved through comprehensive trade-offs. On the other hand, improving runtime efficiency, especially to approach real-time operation, requires a lightweight model architecture and real-time data processing, and only the features contributing most to the decision should be selected, saving computation time and memory and keeping model identification efficient.
At present, the three base models for this problem, the XGBoost model, the naive Bayes model, and the logistic regression model, show the following issues:
First, the original data distribution is unbalanced, so the three primary models have different preferences between positive and negative samples; in particular, their classification of positive samples is poor.
Second, after statistical analysis and comparison, the best-performing of the three models, naive Bayes, was selected; to try for higher accuracy, a Stacking classifier is adopted to further improve classification ability.
For the first problem, the SMOTE oversampling method is introduced, defined as follows:
SMOTE stands for Synthetic Minority Oversampling Technique and is an improvement on the random oversampling algorithm. Random oversampling increases the minority class simply by copying samples, which easily causes model overfitting, i.e. what the model learns becomes too specific and insufficiently general. The basic idea of SMOTE is instead to analyze the minority-class samples and artificially synthesize new samples from them to add to the data set. Sampling with SMOTE improves the classification ability of the model.
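The core SMOTE idea — interpolating between a minority sample and one of its nearest minority-class neighbours — can be sketched in a few lines. This toy function is an illustrative assumption; a real system would use a maintained implementation such as imbalanced-learn's `SMOTE`:

```python
import numpy as np

# Toy SMOTE step: synthesize one new minority-class sample by picking a
# random minority sample, finding its k nearest minority neighbours, and
# interpolating toward one of them.

def smote_sample(minority, k=3, rng=None):
    rng = rng or np.random.default_rng()
    i = rng.integers(len(minority))
    x = minority[i]
    # distances from x to every minority sample (including itself)
    d = np.linalg.norm(minority - x, axis=1)
    neighbours = np.argsort(d)[1:k + 1]        # skip x itself
    neighbour = minority[rng.choice(neighbours)]
    gap = rng.random()                         # interpolation factor in [0, 1)
    return x + gap * (neighbour - x)           # point on the segment x -> neighbour
```

Unlike simple duplication, every synthesized point is new, which is what mitigates the overfitting problem described above.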
For the second class of problems, we try an ensemble model to improve predictive performance; it uses the three models described above as base models, and the tool used is the StackingClassifier in sklearn. Stacking is an ensemble-learning technique that combines multiple classification models via a meta-classifier. Each classification model is trained on the complete training set; the meta-classifier is then fitted on the outputs (meta-features) of the individual classification models in the ensemble. The meta-classifier can be trained either on the predicted class labels or on the predicted probabilities of the ensemble.
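A minimal sketch of this stacking setup with scikit-learn's `StackingClassifier` follows. `GradientBoostingClassifier` stands in for XGBoost so the example depends only on scikit-learn, and the synthetic imbalanced data set means the scores here are not the patent's results:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Synthetic imbalanced binary data (80% / 20%), echoing the skew in the text.
X, y = make_classification(n_samples=600, n_features=10,
                           weights=[0.8, 0.2], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Base models feed their outputs to a logistic-regression meta-classifier.
stack = StackingClassifier(
    estimators=[("gbdt", GradientBoostingClassifier(random_state=0)),
                ("nb", GaussianNB())],
    final_estimator=LogisticRegression(),
)
stack.fit(X_tr, y_tr)
score = stack.score(X_te, y_te)
```

The meta-classifier learns how much to trust each base model, which is what lets the fused model offset the individual models' weaknesses.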
In practical tests we find that the ensemble model gives a clear improvement in prediction accuracy, so the description below focuses on the classification-model experiments and the ensemble-model training results.
2.8.3 Algorithm construction, data visualization and Performance comparison
Preliminary analysis of the data was performed before modeling. Descriptive statistical analysis shows that the differences between the feature units are small and their scales are close; where features differ greatly in magnitude, non-dimensionalization would be required. We also observe that the numbers of type-0 and type-1 users in the training set train are unbalanced, i.e. the data are skewed, which requires us to handle the data-balancing problem when training the model.
Likewise, the distribution of user features shows that on some features the distributions of type-0 and type-1 users differ markedly; such features may be factors with a large influence on user classification. For example, the two user types differ significantly in the distributions of the var features numbered 0, 3, 6, 9, 12, and 13, while the var feature distributions numbered 3, 4, 7, 8, and 10 are more uniform. In subsequent feature selection for the data set, we may consider screening or reconstructing its features.
Because of the large amount of data, a missing-value query is performed first; the missing-value query function is defined as follows:
[Figure BDA0003206936340000211: the missing-value query function, reproduced only as an image in the original document]
The missing-value query function outputs the total number of missing values in each column together with the corresponding percentage of missing values. Generally, when missing values are a very small fraction of the data set, mean filling or deletion can be used; if missing values occupy a large proportion of the elements, another filling method (typically Lagrange interpolation) should be chosen.
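A plausible reconstruction of the missing-value query follows: for each column, it reports the count of missing entries and their percentage, worst columns first. The original function survives only as an image in the patent, so the name and exact behaviour here are assumptions:

```python
import pandas as pd

# Report missing-value counts and percentages per column, sorted so the
# columns with the most missing values come first.

def missing_report(df):
    total = df.isnull().sum()
    percent = 100 * total / len(df)
    report = pd.DataFrame({"missing": total, "percent": percent})
    return report.sort_values("missing", ascending=False)
```

On a frame with no missing values, every row of the report shows zero, which is exactly the check the text performs on the 200,000-user data set.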
The missing-value query shows that the data set, with a capacity of 200,000 users, contains no missing values, which greatly simplifies our subsequent operations.
The bar chart in fig. 14 shows the distribution of 0 and 1 users in the training set train; the numbers of type-0 and type-1 users are clearly unbalanced, i.e. the data are skewed, again requiring us to handle the data-balancing problem during training. The data set is standardized with the standardization toolkit built into sklearn. These operations yield: (1) a labelled data set df_1 (user classes known) used for training and preliminary validation of the model; (2) an unlabelled data set df_t2 on which the model is deployed to predict user behavior.
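The standardization step can be sketched with scikit-learn's built-in toolkit; the tiny matrix is illustrative, while the real system standardizes the user data set:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# StandardScaler rescales each feature column to zero mean and unit variance,
# so features on very different scales contribute comparably to the model.
X = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [3.0, 600.0]])
X_std = StandardScaler().fit_transform(X)   # per column: (x - mean) / std
```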
Careful observation and preliminary understanding show that the task is classification and prediction on dense data, so classification algorithms should be the primary candidates. The system first tries the simple and practical naive Bayes and logistic regression algorithms (few hyper-parameters, fast training, and modest demands on hardware and tuning) as a preliminary attempt.
Although logistic regression belongs, broadly speaking, to regression analysis, in practice it is mostly applied to binary classification problems.
Here a logistic regression model is imported from sklearn with the parameter C=0.01; because the data set is large, the solver is set to 'sag'. The ROC curve of the resulting logistic model is shown in fig. 15. Modeling then proceeds with a naive Bayes model, whose ROC curve is shown in fig. 16. Finally XGBoost is used for modeling, and the ROC curve of the XGBoost model is drawn in fig. 17.
The performance of the models above is evaluated with the AUC (Area Under the ROC Curve), i.e. the area covered by the ROC curve; the larger the AUC, the better the classifier's classification.
AUC = 1 marks a perfect classifier: with such a prediction model, a perfect prediction is obtained no matter what threshold is set. In most prediction scenarios no perfect classifier exists.
0.5 < AUC < 1: better than random guessing. With a properly set threshold, such a classifier (model) has predictive value.
AUC = 0.5: the same as random guessing; the model has no predictive value.
AUC < 0.5: worse than random guessing, but if its predictions are always inverted it becomes better than random guessing.
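The AUC ranges above can be illustrated directly with scikit-learn's `roc_auc_score`; the labels and scores here are made up for the example:

```python
from sklearn.metrics import roc_auc_score

y_true = [0, 0, 1, 1]
good_scores = [0.1, 0.2, 0.8, 0.9]   # ranks every positive above every negative
bad_scores = [0.9, 0.8, 0.2, 0.1]    # the same ranking, inverted

auc_good = roc_auc_score(y_true, good_scores)   # perfect ranking -> AUC = 1
auc_bad = roc_auc_score(y_true, bad_scores)     # fully inverted  -> AUC = 0
```

Note that `1 - auc_bad` equals `auc_good`: a classifier with AUC below 0.5 becomes useful once its predictions are inverted, as the text observes.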
Preliminary analysis of the three models shows that all three algorithms perform fairly well on the test sets X_test and y_test; but because the data sets themselves are unbalanced, more weight should be given to the AUC score and the other three indices.
In the three preliminarily established models, the three evaluation indices differ markedly between positive and negative examples: scores on negative examples are higher and scores on positive examples lower. That is, the models' coverage and classification accuracy are high for negative samples, while their classification of positive samples is weak.
In short, although the preliminary models classify the test-set samples well overall, judged by classification coverage they tend to cover more negative samples while failing to cover the positive ones.
Optimization with the ensemble-model technique yields the ROC curve of the ensemble model shown in fig. 18; comparing the fused model with the three base models shows that both the accuracy and the performance of the fused algorithm improve.
To sum up, model fusion improves the generalization and classification ability of the model to a certain extent.
With the ensemble model built, the last step of the code persists the model to disk, so that it is ready to predict user behavior in the subsequent deployment.
2.8.4 model deployment
When the system is deployed and brought online, the trained model must be served. An HTTP server can be started quickly with Flask, and different handler functions can be bound to different access paths.
System deployment is based on Flask, serving the machine-learned model from its serialized form. First the model is loaded into memory; a handler registered at the access path /api then calls the model to predict (for simplicity, input validation and exception handling are omitted); finally app.run() starts a server that listens on port 5000 by default.
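A sketch of this Flask deployment follows. For self-containment a tiny model is trained and serialized in place; in the real system the persisted risk model would already exist on disk, and the file name and payload format are assumptions:

```python
import pickle

from flask import Flask, jsonify, request
from sklearn.linear_model import LogisticRegression

# --- stand-in for the persistence step at the end of training ---
clf = LogisticRegression().fit([[0.0], [1.0], [2.0], [3.0]], [0, 0, 1, 1])
with open("model.pkl", "wb") as f:
    pickle.dump(clf, f)

app = Flask(__name__)

with open("model.pkl", "rb") as f:     # load the serialized model into memory
    model = pickle.load(f)

@app.route("/api", methods=["POST"])
def predict():
    # As in the text, input validation and exception handling are omitted.
    features = request.get_json()["features"]
    return jsonify({"prediction": int(model.predict([features])[0])})

# When deployed, app.run() starts the server, listening on port 5000 by
# default; it is left commented out here so the sketch can be tested
# without blocking.
# app.run()
```

A client would POST JSON such as `{"features": [2.5]}` to `/api` and receive the predicted risk label back.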
2.5 System building implementation
2.5.1 JavaScript-based Lab of Magic development and modularization method
2.5.1.1 JavaScript
JavaScript ("JS") is a lightweight, interpreted or just-in-time compiled programming language with first-class functions. Although best known as the scripting language of Web pages, it is also used in many non-browser environments. JavaScript is a prototype-based, multi-paradigm dynamic scripting language supporting object-oriented, imperative, and declarative (e.g. functional) styles.
JavaScript was first designed in 1995 by Netscape's Brendan Eich for the Netscape Navigator browser. Because Netscape was cooperating with Sun, Netscape's management wanted it to look like Java, and it was therefore named JavaScript; in practice, however, its grammatical style is closer to Self and Scheme.
The standard for JavaScript is ECMAScript. By 2012, all browsers fully supported ECMAScript 5.1, and older browsers supported at least the ECMAScript 3 standard. On 17 June 2015, ECMA International released the sixth edition of ECMAScript, officially named ECMAScript 2015 but commonly referred to as ECMAScript 6 or ES6.
2.5.1.2 Vue framework
Vue has a distinctive design concept, revolutionary innovations, excellent performance, and simple code logic. More and more developers are therefore paying attention to it and adopting it, and it is seen as a likely mainstream tool for future Web development.
The ecosystem itself keeps growing: what began as a UI engine has become a complete front-end and back-end Web-app solution. The derived Vue Native project is even more ambitious, hoping to write native apps the way Web apps are written. If that is realized it could reshape the whole internet industry, because the same team would write the UI once and run it simultaneously on the server, in the browser, and on the phone.
Vue is used primarily to construct UIs. Many kinds of parameters can be passed in Vue, such as declarative code that helps render the UI, and HTML DOM elements that can be static, bound to dynamic variables, or even interactive application components.
Its characteristics are:
First, declarative design: Vue uses a declarative paradigm that makes it easy to describe an application.
Second, efficiency: Vue minimizes interaction with the DOM through a virtual DOM.
Third, flexibility: Vue works well with existing libraries and frameworks.
2.5.1.3 processing multi-threaded Web services using uWSGI containers
uWSGI is one deployment option on servers such as nginx, lighttpd, and cherokee; see FastCGI and standalone WSGI containers for other options. To use your WSGI application with the uWSGI protocol you first need a uWSGI server. uWSGI is both a protocol and an application server; the server can serve the uWSGI, FastCGI, and HTTP protocols.
uWSGI enables multithreading by default, split between a main thread and worker threads. The main thread is enabled automatically during initialization; it designates specific worker threads to accept requests, and thereafter spawns further worker threads according to the users' connections.
While the uWSGI container runs, the executing part of the main process is an infinite loop that runs specific terminals and receives signals, i.e. it manages the worker processes and handles timed or event-triggered tasks. The worker processes execute their part of the terminals and loop to receive requests; the number of threads spawned can be set manually when a worker process is started, and one thread is started by default.
The function that accepts requests in a worker process should hold a lock. In actual operation the main process binds and listens on the socket and calls fork, and each child process then calls accept, so when a request tries to establish a connection every child process is awakened. The system serializes accept and places a lock before it, avoiding the enormous CPU waste this would otherwise cause.
2.5.1.4 interaction with front-end data
Interaction with the front end uses JavaScript-based Ajax. Because the system deploys the front end and back end separately, most browsers enforce the same-origin policy for security when the front end requests back-end data via Ajax: the request and response ends must share the same port, the same domain name, and the same protocol. In a system with separated front and back ends, cross-origin access must work around this policy; the usual solutions are converting the exchanged data to JSONP format or using CORS technology. Here CORS is used to solve cross-domain data access, with back-end data security in mind: because CORS can handle credentialed requests, server-side security can be preserved while protecting the client, and CORS tooling offers many ways to process custom request headers, reducing the development pressure on developers.
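A minimal sketch of enabling CORS on the Flask back end without extra dependencies follows: an `after_request` hook attaches the Access-Control headers. The allowed origin is a placeholder; production code would typically use a library such as flask-cors and restrict the origin list:

```python
from flask import Flask

app = Flask(__name__)

@app.after_request
def add_cors_headers(response):
    # Allow the (hypothetical) front-end origin to read responses, send a
    # Content-Type header, and include credentials such as cookies.
    response.headers["Access-Control-Allow-Origin"] = "https://frontend.example"
    response.headers["Access-Control-Allow-Headers"] = "Content-Type"
    response.headers["Access-Control-Allow-Credentials"] = "true"
    return response

@app.route("/api/ping")
def ping():
    return {"ok": True}
```

With these headers present, the browser relaxes the same-origin policy for the listed origin instead of blocking the cross-domain Ajax response.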
2.5.1.5 Facilitating front-end page beautification using Bootstrap-Flask
The page style of the front end depends mainly on the front end's HTML, CSS, and JavaScript. The back-end Flask tool chain, however, also offers helpers that let the front end adopt the Bootstrap framework quickly. Among them, Bootstrap-Flask is a front-end optimization library, an improvement on Flask-Bootstrap, referenced directly in the back end.
2.5.2 background development and data management
2.5.2.1 Node.js
Node.js, released in May 2009 and developed by Ryan Dahl, is a JavaScript runtime based on the Chrome V8 engine. Using an event-driven, non-blocking I/O model, it lets JavaScript run on a server-side development platform, making JavaScript a scripting language on a par with server-side languages such as PHP, Python, Perl, and Ruby.
Node.js optimizes some special cases and provides substitute APIs so that V8 runs better outside the browser. The V8 engine executes JavaScript very fast with very good performance, and a platform built on Chrome's JavaScript runtime makes it convenient to build fast-responding, easily scalable network applications.
The V8 engine itself uses some of the latest compilation techniques, which greatly speeds up code written in a scripting language like JavaScript and saves development cost. Performance is a key factor for Node. JavaScript is an event-driven language, and Node exploits this advantage to write highly scalable servers: it adopts an architecture called the "event loop" that makes writing such servers easy and safe. There are many techniques for improving server performance; Node chose one that improves performance without raising development complexity, which matters because concurrent programming is often complex and riddled with pitfalls. Node sidesteps these while still providing good performance.
Node uses a series of "non-blocking" libraries to support the event-loop style; essentially they are interfaces to resources such as the file system and databases. When a request is sent to the file system, the non-blocking interface notifies Node when the disk is ready, rather than waiting for the disk to seek and retrieve the file. This model simplifies access to slow resources in a scalable way and is intuitive and easy to understand; users familiar with DOM events such as onmouseover and onclick will find it especially familiar.
Although running JavaScript on the server is not unique to Node, it is a powerful capability. It must be acknowledged that the browser environment limits our freedom to choose programming languages, and the desire to share code between the server and increasingly complex browser client applications can only be satisfied with JavaScript. Other platforms also support server-side JavaScript, but thanks to the characteristics above, Node has developed rapidly into a genuine platform.
(1) MySQL database
MySQL is a relational database management system originally developed by the Swedish company MySQL AB and now an Oracle product. MySQL is among the most popular relational database management systems, and for Web applications it is one of the best RDBMS (Relational Database Management System) packages.
As a relational database management system, MySQL keeps data in separate tables instead of putting all the data into one large repository, which increases speed and flexibility.
(2) Cloud server deployment
Load balancing
Load balancing (BLB) balances an application's traffic, forwarding concurrent front-end accesses to multiple back-end cloud servers. It enables horizontal scaling of the service, eliminates single points of failure promptly through automatic failover, and improves service availability.
② relational database service
Relational Database Service (RDS) is a professional, high-performance, highly reliable cloud database service. RDS provides a Web interface for configuration and database operation, together with functional support for reliable data backup and recovery, complete security management, thorough monitoring, and easy scaling. Compared with a self-built database, RDS is more economical, more professional, more efficient, more reliable, and easier to use, letting users concentrate on their core business.
The embodiment can achieve the following technical effects:
1. The embodiment provides the overall architecture of a federated anti-counterfeiting traceability chain and establishes a triple mechanism of federated anti-counterfeiting, anomaly traceability, and risk prediction to realize the expected functions; while guarding against common Web security threats, it effectively solves the data-poisoning and single-point-failure problems of federated learning.
2. The embodiment uses federated learning to enrich the data available and improve the accuracy of the algorithm. Considering that, in face identity authentication, the data share the same feature space but differ in sample space, horizontal federated learning is chosen; the built-in secure multi-party computation, realized with an additively homomorphic encryption algorithm, guarantees privacy between nodes.
3. The embodiment combines federated learning with blockchain technology and proposes the concept of a federated anti-counterfeiting traceability chain. Using the blockchain as the underlying data platform, the system stores all users' stored video information and the videos' circulation-path information, guaranteeing that traced paths are authentic and reliable and that users' stored information is distributed. The system uses the characteristics of the blockchain to solve the data-poisoning and single-point-failure problems in the federated-learning AI system.
The foregoing shows and describes the general principles, essential features, and advantages of the invention. Those skilled in the art will understand that the invention is not limited to the embodiments described above: the embodiments and descriptions in the specification merely illustrate the principle of the invention, and various changes and modifications may be made without departing from its spirit and scope, all of which fall within the scope of the claimed invention. The scope of the invention is defined by the appended claims and their equivalents.

Claims (6)

1. A deep counterfeiting traceability system based on a big data federated learning architecture, characterized by comprising an application layer, an interface layer, a logic layer, a network layer and a storage layer which are connected in sequence;
the application layer is used for providing a deep forgery traceability service for a user through Web browser application and acquiring user login and upload data;
the interface layer is used for providing interface service and realizing communication between the server side and the web side;
the logic layer is used for dividing system functions and designing an algorithm construction model to realize system function logic;
the network layer is used for exchanging parameters between the logic layer and the storage layer and encrypting gradient information in the modeling process of the logic layer;
and the storage layer is used for receiving the parameter information and the encryption information transmitted by the network layer and respectively storing the parameter information and the encryption information in the local database and the block chain network.
2. The deep counterfeiting traceability system based on the big data federated learning architecture of claim 1, wherein the logic layer comprises a login registration module, an attack prevention module, a federated learning module, a blockchain traceability module, a deep counterfeiting detection module and a risk prediction module; wherein,
the login registration module is used for performing identity authentication and string-format pre-verification on information entered by a user, and returning the judgment result and a login session record to the server;
the attack prevention module is used for screening the model training data for sample poisoning attacks, and linking the data source accessed by each participant to the distributed ledger provided by the blockchain for data audit;
the deep counterfeiting detection module is used for designing a 3DRes deep counterfeiting detection algorithm, preprocessing the original data set, and performing deep counterfeiting detection on data uploaded by users by constructing a 3DRes deep counterfeiting detection model;
the federated learning module is used for carrying out horizontal federated learning training on the preprocessed original data set and the user information passed down by the application layer, and outputting the model obtained by training;
the blockchain traceability module is used for constructing a federated anti-counterfeiting traceability chain, calling a smart contract to take the user ID, the timestamp and the video ID automatically generated by the server, all passed by the application layer, as transaction information, packaging the transaction information into a block and uploading the block to the federated anti-counterfeiting traceability chain;
and the risk prediction module is used for building a risk prediction model from the desensitized profile data of user nodes with a machine learning algorithm, to predict the nodes' future risk behavior.
3. The deep forgery traceability system under the big data federated learning architecture of claim 1, wherein the interface layer comprises a blockchain traceability interface, a federated learning online-inference API interface, a foreground data interface and a back-end data interface.
4. The deep forgery traceability system based on the big data federated learning framework of claim 2, wherein the blockchain traceability module is further configured to, when the deep counterfeiting detection module detects that local data has been tampered with by deep forgery, find the first user node at which the error arose, delete and mark all tampered data on the traceability path, upload correct data anew, and call the login registration module again for identity authentication.
5. The deep counterfeiting traceability system based on the big data federated learning framework of claim 2, wherein the preprocessing performed on the original data set specifically comprises: constructing a RetinaFace model, using it to perform face detection and cropping on the original data set, retaining only the videos that contain faces, and optimizing the face detection process with feedback regulation using deformable convolution and a dense regression loss function, to obtain the preprocessed original data set.
6. The deep counterfeiting traceability system based on the big data federated learning architecture of claim 2, wherein the horizontal federated learning training performed on the preprocessed original data set and the user information passed by the application layer specifically comprises: distributing the preprocessed original data set and the user information passed by the application layer to different machines; each machine downloads the horizontal federated learning model from the server, trains it with local data, and returns the parameters to be updated to the server; the server aggregates the parameters returned by all the machines, updates the horizontal federated learning model on the server, and feeds the latest horizontal federated learning model back to each machine for model prediction.
CN202110919472.7A 2021-08-11 2021-08-11 Deep counterfeiting traceability system based on big data federation learning architecture Active CN113723220B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110919472.7A CN113723220B (en) 2021-08-11 2021-08-11 Deep counterfeiting traceability system based on big data federation learning architecture


Publications (2)

Publication Number Publication Date
CN113723220A true CN113723220A (en) 2021-11-30
CN113723220B CN113723220B (en) 2023-08-25

Family

ID=78675485

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110919472.7A Active CN113723220B (en) 2021-08-11 2021-08-11 Deep counterfeiting traceability system based on big data federation learning architecture

Country Status (1)

Country Link
CN (1) CN113723220B (en)


Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3961520A1 (en) * 2020-08-14 2022-03-02 Tata Consultancy Services Limited Method and system for secure online-learning against data poisoning attack

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108804928A (en) * 2018-07-09 2018-11-13 武汉工商学院 The secure and trusted block chain and management method of data in a kind of traceability system
CN109934598A (en) * 2019-03-07 2019-06-25 重庆市千将软件有限公司 Block chain tobacco traceability anti-fake system
CN109949057A (en) * 2019-03-28 2019-06-28 西南石油大学 Anti-fake transaction system and the method for commerce of tracing to the source of digital art works based on block chain
CN111259438A (en) * 2020-01-13 2020-06-09 北京工业大学 Internet of things data tracing method based on block chain
CN111400774A (en) * 2020-03-23 2020-07-10 北京安洁康生物科技有限公司 Block chain structure of secondary water supply anti-terrorism anti-poison safety system
CN111507709A (en) * 2020-03-25 2020-08-07 农业农村部农药检定所(国际食品法典农药残留委员会秘书处) Data traceability system
CN111698322A (en) * 2020-06-11 2020-09-22 福州数据技术研究院有限公司 Medical data safety sharing method based on block chain and federal learning
CN111967427A (en) * 2020-08-28 2020-11-20 广东工业大学 Fake face video identification method, system and readable storage medium
CN112183501A (en) * 2020-11-27 2021-01-05 北京智源人工智能研究院 Depth counterfeit image detection method and device
US20210067339A1 (en) * 2019-08-26 2021-03-04 Accenture Global Solutions Limited Decentralized federated learning system
CN112861671A (en) * 2021-01-27 2021-05-28 电子科技大学 Method for identifying deeply forged face image and video


Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
H. R. HASAN et al.: Combating Deepfake Videos Using Blockchain and Smart Contracts *
OSCAR DE LIMA et al.: Deepfake Detection using Spatiotemporal Convolutional Networks *
XU MINGHUI et al.: Research on smart city construction and big data security issues *
LI XURONG et al.: A survey of deepfake and detection technology *
ZHAO WENJUAN: Design and implementation of an agricultural product quality and safety traceability platform *

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114154645A (en) * 2021-12-03 2022-03-08 中国科学院空间应用工程与技术中心 Cross-center image joint learning method and system, storage medium and electronic equipment
CN114219098A (en) * 2021-12-03 2022-03-22 华融融通(北京)科技有限公司 Federal learning system based on parameter server
CN114154645B (en) * 2021-12-03 2022-05-17 中国科学院空间应用工程与技术中心 Cross-center image joint learning method and system, storage medium and electronic equipment
CN114048864A (en) * 2022-01-11 2022-02-15 中兴通讯股份有限公司 Method for managing federal learning data, electronic device and storage medium
CN114596102A (en) * 2022-03-08 2022-06-07 广州市汇算新代信息技术有限公司 Block chain-based anti-counterfeiting traceability federated learning training method and device
CN115148379A (en) * 2022-06-06 2022-10-04 电子科技大学 System and method for realizing intelligent health monitoring of solitary old people by utilizing edge calculation
CN115037618A (en) * 2022-06-06 2022-09-09 电子科技大学 Lightweight edge intelligent collaborative federal learning platform based on KubeEdge
CN115037618B (en) * 2022-06-06 2023-11-07 电子科技大学 Lightweight edge intelligent collaborative federal learning platform based on KubeEdge
CN114741734B (en) * 2022-06-09 2022-09-09 湖南大学 Drug anti-counterfeiting traceability cloud chain data multi-party safe computing method
CN114741734A (en) * 2022-06-09 2022-07-12 湖南大学 Drug anti-counterfeiting traceability cloud chain data multi-party safe computing method
CN115170424A (en) * 2022-07-07 2022-10-11 北京安德医智科技有限公司 Heart ultrasonic image artifact removing method and device
CN114979281A (en) * 2022-07-11 2022-08-30 成都信息工程大学 Data interaction method applied to industrial internet cloud service platform
CN114979281B (en) * 2022-07-11 2022-11-08 成都信息工程大学 Data interaction method applied to industrial internet cloud service platform
CN116185614A (en) * 2023-01-09 2023-05-30 明阳产业技术研究院(沈阳)有限公司 Multi-server load balancing method, system, medium and equipment
CN116185614B (en) * 2023-01-09 2024-04-05 明阳产业技术研究院(沈阳)有限公司 Multi-server load balancing method, system, medium and equipment

Also Published As

Publication number Publication date
CN113723220B (en) 2023-08-25

Similar Documents

Publication Publication Date Title
CN113723220B (en) Deep counterfeiting traceability system based on big data federation learning architecture
Nicholls et al. Financial cybercrime: A comprehensive survey of deep learning approaches to tackle the evolving financial crime landscape
Demetis Technology and anti-money laundering: A systems theory and risk-based approach
CN108229120A (en) Face unlock and its information registering method and device, equipment, program, medium
CN112215180B (en) Living body detection method and device
CN110084603A (en) Method, detection method and the corresponding intrument of training fraudulent trading detection model
Yu et al. Availability attacks create shortcuts
EP4085369A1 (en) Forgery detection of face image
WO2019016106A1 (en) Machine learning system for various computer applications
CN109919754A (en) A kind of data capture method, device, terminal and storage medium
CN113537027B (en) Face depth counterfeiting detection method and system based on face division
Shi et al. PR‐NET: progressively‐refined neural network for image manipulation localization
Heidari et al. Deepfake detection using deep learning methods: A systematic and comprehensive review
Nguyen et al. Backdoor attacks and defenses in federated learning: Survey, challenges and future research directions
Ara et al. A comparative review of AI-generated image detection across social media platforms
JJ et al. Digital forensic framework for smart contract vulnerabilities using ensemble models
Do et al. Potential threat of face swapping to ekyc with face registration and augmented solution with deepfake detection
Jain et al. Deep perceptual hashing algorithms with hidden dual purpose: when client-side scanning does facial recognition
Abi Din et al. Boxer: Preventing fraud by scanning credit cards
Chi et al. Toward robust deep learning systems against deepfake for digital forensics
Broadstock et al. Fintech unicorns
Xue et al. A hierarchical multi-modal cross-attention model for face anti-spoofing
Jiang et al. FakeFilter: A cross-distribution Deepfake detection system with domain adaptation
Camacho Initialization methods of convolutional neural networks for detection of image manipulations
Jun et al. A Framework for False Image Detection with Sample-Oriented Intelligent Adversarial

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant