CN116401685A - Data full-flow risk internal examination method, device and platform - Google Patents

Data full-flow risk internal examination method, device and platform

Info

Publication number
CN116401685A
Authority
CN
China
Prior art keywords
data
risk
flow
answer
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310277855.8A
Other languages
Chinese (zh)
Inventor
刘金飞
文龙
刘宁
朱鹏云
朱一丁
贺泉贵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU
Priority to CN202310277855.8A
Publication of CN116401685A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24147 Distances to closest patterns, e.g. nearest neighbour classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/64 Protecting data integrity, e.g. using checksums, certificates or signatures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Business, Economics & Management (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Security & Cryptography (AREA)
  • Computing Systems (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Strategic Management (AREA)
  • Molecular Biology (AREA)
  • Bioethics (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Hardware Design (AREA)
  • Mathematical Physics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Marketing (AREA)
  • Game Theory and Decision Science (AREA)
  • Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a data full-flow risk internal examination method, device and platform. The method comprises the following steps: establishing a data full-flow risk assessment questionnaire library; selecting modules and questions from the questionnaire library according to its attribute characteristics, user requirements and preset rules, for the user to answer; invoking a pre-trained model to score the user's answer results; determining, by a machine learning method, the violation item with the highest risk in the answer results and providing a risk solution for it; and generating a compliance certificate from the data risk analysis result and automatically storing it using blockchain technology.

Description

Data full-flow risk internal examination method, device and platform
Technical Field
The invention belongs to the technical field of data risk internal examination, and particularly relates to a data full-flow risk internal examination method, device and platform.
Background
Data security incidents are now frequent, and enterprises face a variety of risks when collecting and processing data in their daily operations. Periodic internal examination, which identifies and manages possible risk items, is an effective means for an enterprise to ensure data security. In existing risk assessment technology, data risk examination is mostly completed manually by filling in forms that are stored on a computer terminal; however, such examination requires considerable expertise, and risk assessment standards are hard to unify. A data full-flow risk internal examination method is therefore needed. In addition, the risk assessment of each data process is difficult to manage, and internal examination reports and certificates are not stored securely, so the certificates have low credibility when an enterprise is inspected or a security incident occurs. A data full-flow risk internal examination platform is therefore also needed.
Disclosure of Invention
In order to solve the problems in the prior art that data risk examination is difficult and examination reports have low credibility, the embodiments of the application provide a data full-flow risk internal examination method, device and platform.
According to a first aspect of the embodiments of the present application, there is provided a data full-flow risk internal examination method, including:
establishing a data full-flow risk assessment questionnaire library;
selecting modules and questions from the data full-flow risk assessment questionnaire library according to the attribute characteristics of the library, user requirements and preset rules, for the user to answer;
invoking a pre-trained model to score the user's answer results;
determining, by a machine learning method, the violation item with the highest risk in the answer results and providing a risk solution for the violation item;
and generating a compliance certificate according to the data risk analysis result and automatically storing the compliance certificate using blockchain technology.
Further, the data full-flow risk assessment questionnaire library comprises:
(1) a data full-flow questionnaire library, comprising questionnaires for each flow in full-flow data processing, wherein the data full-flow processing evaluation questionnaire is divided by flow into data acquisition, data transmission, data storage, data use, data disclosure, data destruction and entrusted processing;
(2) a legal questionnaire library, comprising domestic and foreign laws and normative standards relevant to data compliance, with a specific questionnaire formulated for each law and regulation;
(3) a custom questionnaire library, comprising custom questionnaires entered by users.
Further, in the model, an attention mechanism is used to convert the form vector obtained from the answer results into a feature matrix, and the answer results are scored through a residual network based on the feature matrix.
Further, using the attention mechanism to convert the form vector obtained from the answer results into a feature matrix, and scoring the answer results through a residual network based on the feature matrix, comprises the following steps:
obtaining, from the form vector derived from the answer results, a plurality of pieces of relation information among all the questions of the form through a multi-head attention mechanism;
concatenating the plurality of pieces of relation information and adjusting the dimension to obtain an association matrix of the form;
obtaining, based on the association matrix, deep features of the form through a residual network;
and obtaining the form score from the deep features using a fully connected layer with Dropout.
Further, determining the violation item with the highest risk in the answer results by a machine learning method includes:
performing factor analysis on the user's answer results to obtain a factor load matrix;
calculating the factor scores of the user's answer results by the Bartlett estimation method using the factor load matrix;
performing KNN-based cluster analysis on the factor scores to obtain the k classified forms closest to the user's answer results;
and classifying the user's answer results into the category that occurs most often among the k nearest classified forms, thereby obtaining the violation item with the highest risk.
Further, providing a risk solution for the violation item with the highest risk specifically includes:
matching the violation item with the highest risk against the violation items to which each regulation belongs to obtain the regulation corresponding to the risk, and providing a risk solution in combination with judicial cases and expert opinions, wherein the risk solution comprises the regulation corresponding to the risk and compliance advice.
Further, a natural language processing method is used to analyze the violation items of each regulation related to data compliance, so that the violation item with the highest risk is matched with the violation items to which each regulation belongs and the regulation corresponding to the risk is obtained.
Further, the process of storing the compliance certificate using blockchain technology includes:
delivering the compliance certificate to an end node of a blockchain, wherein each blockchain node corresponds to one data flow node;
evaluating the compliance certificate according to the evaluation standard set by the end node, and checking the integrity of the compliance certificate;
and determining whether the compliance certificate meets the data transmission compliance requirements, and thereby whether the data is transmitted.
According to a second aspect of the embodiments of the present application, there is provided a data full-flow risk internal examination device, including:
a building module, used for establishing a data full-flow risk assessment questionnaire library;
a selection module, used for selecting modules and questions from the data full-flow risk assessment questionnaire library according to the attribute characteristics of the library, user requirements and preset rules, for the user to answer;
a scoring module, used for invoking a pre-trained model to score the user's answer results;
a judging module, used for determining, by a machine learning method, the violation item with the highest risk in the answer results and providing a risk solution for the violation item;
and a storage module, used for generating a compliance certificate according to the data risk analysis result and automatically storing the compliance certificate using blockchain technology.
According to a third aspect of the embodiments of the present application, there is provided a data full-flow risk internal examination platform, including:
a receiving module, used for receiving the user's answer results;
a management module, used for controlling, managing and revoking the permissions of data processors;
an examination module, used for establishing a data full-flow risk assessment questionnaire library; selecting modules and questions from the questionnaire library according to the attribute characteristics of the library, user requirements and preset rules, for the user to answer; invoking a pre-trained model to score the form according to the user's answer results; and determining, by a machine learning method, the violation item with the highest risk in the form according to the user's answer results and providing a risk solution for the violation item;
and a certification module, used for generating and storing a compliance certificate after the examination module completes the examination.
The technical solution provided by the embodiments of the application can have the following beneficial effects:
in this embodiment, the user builds a data full-flow risk assessment form and answers its questions; the model then analyzes the data risk and yields a solution, and the compliance certificate is encrypted and stored. The user can thus carry out a professional, efficient and secure examination of enterprise data risk independently, avoiding the inconsistent standards of manual risk assessment. In addition, on the data full-flow internal examination platform the user can upload data, examine it, manage it and securely store the compliance certificate, and can therefore conduct full-flow risk internal examination of the data securely and professionally.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
Fig. 1 is a flowchart illustrating a data full-flow risk internal examination method according to an exemplary embodiment.
Fig. 2 is a flowchart illustrating blockchain storage according to an exemplary embodiment.
Fig. 3 is a block diagram of a data full-flow risk internal examination device according to an exemplary embodiment.
Fig. 4 is a schematic diagram of a data full-flow risk internal examination platform according to an exemplary embodiment.
Fig. 5 is a schematic diagram of an electronic device according to an exemplary embodiment.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present application.
The terminology used in the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the present application. As used in this application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used herein to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, a first message may also be referred to as a second message, and similarly, a second message may also be referred to as a first message, without departing from the scope of the present application. The word "if" as used herein may be interpreted as "when", "upon" or "in response to a determination", depending on the context.
Fig. 1 is a flowchart of a data full-flow risk internal examination method according to an exemplary embodiment. As shown in Fig. 1, the method is applied to a terminal and may include the following steps:
step S101: establishing a data full-flow risk assessment questionnaire library;
step S102: selecting modules and questions from the data full-flow risk assessment questionnaire library according to the attribute characteristics of the library, user requirements and preset rules, for the user to answer;
step S103: invoking a pre-trained model to score the user's answer results;
step S104: determining, by a machine learning method, the violation item with the highest risk in the answer results and providing a risk solution for the violation item;
step S105: generating a compliance certificate according to the data risk analysis result and automatically storing the compliance certificate using blockchain technology.
In this embodiment, the user builds a data full-flow risk assessment form and answers its questions; the model then analyzes the data risk and yields a solution, and the compliance certificate is encrypted and stored. The user can thus carry out a professional, efficient and secure examination of enterprise data risk independently, avoiding the inconsistent standards of manual risk assessment. In addition, on the data full-flow internal examination platform the user can upload data, examine it, manage it and securely store the compliance certificate, and can therefore conduct full-flow risk internal examination of the data securely and professionally.
Step S101: establishing a data full-flow risk assessment questionnaire library;
specifically, the data full-flow risk assessment questionnaire library includes:
(1) a data full-flow questionnaire library, in which the data full-flow processing evaluation questionnaire is divided by flow into modules such as data acquisition, data transmission, data storage, data use, data disclosure, data destruction and entrusted processing;
(2) a legal questionnaire library, comprising domestic and foreign laws and normative standards relevant to data compliance, with a specific questionnaire formulated for each law and regulation;
(3) a custom questionnaire library, comprising custom questionnaires entered by users.
The data full-flow questionnaire library and the legal questionnaire library each contain a plurality of questionnaire templates. Each template is composed of different questionnaire modules, which are distinguished according to the violation items of the data and specifically include data acquisition, data transmission, data storage, data use, data disclosure, data destruction, entrusted processing and the like.
In a specific implementation, the corresponding questionnaires and modules can be selected from the questionnaire library, optionally supplemented with custom modules, and questions are selected according to preset rules to generate a questionnaire form specific to the project. Each form is associated with one or more violation items, and each violation item is associated with one or more legal provisions.
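The form-generation step can be illustrated with a minimal sketch. The class names, the `build_form` helper, the per-module limit used as the "preset rule", and the placeholder legal provisions are all hypothetical and are not taken from the application; they only show one way a form could stay linked to violation items and legal provisions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Question:
    text: str
    module: str           # e.g. "data acquisition"
    violation_item: str   # violation item the question probes
    options: tuple = ("yes", "no")


@dataclass
class Form:
    project: str
    questions: list = field(default_factory=list)

    def violation_items(self):
        # Violation items associated with this form, in first-seen order.
        return list(dict.fromkeys(q.violation_item for q in self.questions))


# Hypothetical mapping from violation items to legal provisions (placeholders only).
LEGAL_PROVISIONS = {
    "data acquisition": ["Regulation A, provision 1"],
    "data storage": ["Regulation A, provision 2", "Regulation B, provision 1"],
}


def build_form(project, library, modules, max_per_module=2):
    """Assumed preset rule: take at most max_per_module questions per chosen module."""
    form = Form(project)
    for module in modules:
        picked = [q for q in library if q.module == module][:max_per_module]
        form.questions.extend(picked)
    return form


library = [
    Question("Is consent obtained before collection?", "data acquisition", "data acquisition"),
    Question("Is the collection purpose disclosed?", "data acquisition", "data acquisition"),
    Question("Is stored personal data encrypted?", "data storage", "data storage"),
]
form = build_form("demo project", library, ["data acquisition", "data storage"])
provisions = {item: LEGAL_PROVISIONS[item] for item in form.violation_items()}
```

Keeping the violation item on each question means the form-to-violation-item-to-provision chain can be derived rather than stored separately.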
Step S102: generating a questionnaire form by selecting modules and questions from the data full-flow risk assessment questionnaire library according to the attribute characteristics of the library, user requirements and preset rules, for the user to answer;
specifically, the data full-flow risk assessment questionnaire library supports custom templates: the user can add a custom questionnaire to the custom questionnaire library and combine it with the custom template to form a more targeted questionnaire form.
Specifically, the questions in the questionnaire form are mainly true/false questions and single-choice questions; multiple-choice questions and open-ended questions can also be supported. For each true/false question the user selects yes or no, and for each single-choice question the user selects the most suitable option, or none.
Step S103: invoking a pre-trained model to score the form according to the user's answer results;
specifically, the model first uses an attention mechanism to convert the form vector into a feature matrix, and then scores the form through a residual network.
In one embodiment, Scaled Dot-Product Attention is used (other attention mechanisms may be used in practice, without limitation). For any form, the answers form the vector $P = \{x_1, \ldots, x_i, \ldots, x_n\}$, where $n$ is the number of questions and $x_i$ represents the answer to the $i$-th question in form $P$. Let $q = k = v = P$; then $Q = qW^Q$, $K = kW^K$, $V = vW^V$, where $W^Q$, $W^K$, $W^V$ are three trainable parameter vectors used to encode the original form into three different feature matrices, thereby enhancing the fitting capability of the model.
In the attention mechanism, Q (Query), K (Key) and V (Value) are the three inputs used to calculate the attention weights. They typically come from the same input data, usually word embeddings or feature representations of the input, and can be regarded as representations of the input data in different spaces. The dot product of Q and K measures their similarity in the same space and can be regarded as a degree of matching; that is, $QK^T$ gives the weight of each input vector in the attention mechanism. The weighted feature matrix is then obtained by a weighted average (i.e., multiplying the normalized weights by the feature matrix V). Accordingly, Scaled Dot-Product Attention is calculated from the obtained Q, K and V to acquire the relation information among all questions of the form:
$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^T}{\sqrt{d_k}}\right)V$
where $\frac{1}{\sqrt{d_k}}$, with $d_k$ the dimension of K, is a scaling factor that prevents the weights from being too large.
Multiple attention heads are computed, concatenated and fed into a linear layer to obtain the association matrix $H_A$ of the form:
$H_A = \mathrm{concat}(h_1, \ldots, h_t)W^O$, $h_i = \mathrm{Attention}(Q_i, K_i, V_i) = \mathrm{Attention}(qW_i^Q, kW_i^K, vW_i^V)$
where $W^O$ is a dimension scaling matrix used to adjust the dimension of the concatenated attention outputs.
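The attention step can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not the applicant's implementation: the form vector is treated as an n x 1 column, each per-head parameter is a 1 x d row vector (so each product is an n x d feature matrix), and the weights are random stand-ins for trained parameters. The function names are hypothetical.

```python
import numpy as np


def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def scaled_dot_product_attention(Q, K, V):
    d_k = K.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d_k))  # (n, n) question-to-question weights
    return weights @ V                          # (n, d) weighted features


def association_matrix(p, heads, W_o):
    """p: (n,) answers; heads: list of (W_q, W_k, W_v), each (1, d); W_o: (t*d, d_out)."""
    col = p.reshape(-1, 1).astype(float)        # q = k = v = P as an n x 1 column (assumption)
    outputs = [
        scaled_dot_product_attention(col @ W_q, col @ W_k, col @ W_v)
        for W_q, W_k, W_v in heads
    ]
    return np.concatenate(outputs, axis=1) @ W_o  # concat(h_1..h_t) W^O


rng = np.random.default_rng(0)
n, d, t, d_out = 5, 4, 2, 6                     # questions, head width, heads, output width
p = np.array([1, 0, 2, 1, 3])                   # encoded answers of one form
heads = [tuple(rng.normal(size=(1, d)) for _ in range(3)) for _ in range(t)]
W_o = rng.normal(size=(t * d, d_out))
H_A = association_matrix(p, heads, W_o)         # shape (n, d_out)
```

Each row of `H_A` corresponds to one question and mixes information from all other questions through the attention weights.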
Through skip connections, the residual network propagates signals across layers, which effectively alleviates the degradation problem of deep networks and allows the deep features of the form to be extracted. Formally:
$T_i = \mathrm{Pool}(\mathrm{ReLU}(\mathrm{Conv}(\mathrm{ReLU}(\mathrm{Conv}(T_{i-1}))) + T_{i-1}))$
where ReLU denotes the ReLU activation function, Conv is a convolution layer, and Pool is a max-pooling layer.
Finally, the deep features acquired by the residual network are passed through a fully connected layer with Dropout to obtain the score $O = \mathrm{FC}(T_i)$.
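A single residual block followed by the fully connected scoring layer can be sketched as below. The single-channel 2-D convolution, the 2 x 2 pooling window, the kernel sizes and the helper names are assumptions made for illustration; the application does not specify them.

```python
import numpy as np


def conv2d_same(x, kernel):
    # Single-channel cross-correlation with zero padding (odd kernel sizes assumed).
    kh, kw = kernel.shape
    padded = np.pad(x, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros(x.shape, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out


def relu(x):
    return np.maximum(x, 0.0)


def max_pool2x2(x):
    # Non-overlapping 2 x 2 max pooling; odd trailing rows/columns are dropped.
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    return x[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))


def residual_block(t_prev, k1, k2):
    # T_i = Pool(ReLU(Conv(ReLU(Conv(T_{i-1}))) + T_{i-1}))
    inner = relu(conv2d_same(t_prev, k1))
    return max_pool2x2(relu(conv2d_same(inner, k2) + t_prev))


def fc_score(features, w, b, drop_prob=0.0, rng=None):
    flat = features.ravel()
    if drop_prob > 0.0 and rng is not None:     # Dropout is applied during training only
        mask = rng.random(flat.shape) >= drop_prob
        flat = flat * mask / (1.0 - drop_prob)
    return float(flat @ w + b)


t0 = np.arange(16.0).reshape(4, 4)              # stand-in for an association matrix
zero = np.zeros((3, 3))
t1 = residual_block(t0, zero, zero)             # zero kernels: only the skip path survives
score = fc_score(t1, np.full(t1.size, 0.1), 0.0)
```

With all-zero kernels the block reduces to pooling the input itself, which makes the role of the skip connection easy to see.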
The model's target value is the average of the scores given by several experts according to the relevant regulations. The model is trained with cross entropy and L1 regularization to obtain the pre-trained model, from which the score of a questionnaire form can be obtained.
Step S104: determining, by a machine learning method, the violation item with the highest risk in the form according to the user's answer results, and providing a risk solution for the violation item;
the machine learning method includes factor analysis and cluster analysis.
Specifically, factor analysis is first performed on all forms in the questionnaire library, attributing each question variable in a form to a violation item, such as data acquisition, data transmission, data storage or data disclosure. The factor loadings obtained by factor analysis are then used as input to cluster analysis to obtain the violation classification of the corresponding form.
First, an arbitrary questionnaire form is defined as $P = \{x_1, \ldots, x_i, \ldots, x_n\}$, where $x_i$ represents the observed value corresponding to the $i$-th question; $0 < |i| < c_r$, where $c_r$ is the number of options of the $i$-th question; and $n$ is the total number of questions in the form, i.e., the dimension of $P$.
Next, in the factor analysis of $P$, any $x_i$ ($1 \le i \le n$) can be expressed as the sum of a linear function of the common factors and a special factor:
$x_i = a_{i1}F_1 + a_{i2}F_2 + \cdots + a_{im}F_m + \varepsilon_i$
where $F_j$ ($1 \le j \le m$) denotes the common factors, $m$ is the chosen number of common factors, and $\varepsilon_i$ is called the special factor of $x_i$. In matrix form:
$P = AF + \varepsilon$
$A = (a_{ij})_{n \times m}$, $F = (F_1, \ldots, F_m)^T$, $\varepsilon = (\varepsilon_1, \ldots, \varepsilon_n)^T$
$A$ is called the factor load matrix, and $a_{ij}$ is the load of form question $x_i$ on factor $F_j$.
The factor load matrix can be solved by the principal component method. Let $R$ be the correlation coefficient matrix of the form, with eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ ($\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$) and corresponding eigenvectors $\eta_1, \eta_2, \ldots, \eta_n$. Writing $H = [\eta_1, \eta_2, \ldots, \eta_n]$, we have
$R = H\,\mathrm{diag}(\lambda_1, \ldots, \lambda_n)\,H^T = \sum_{i=1}^{n} \lambda_i \eta_i \eta_i^T$
The resulting factor load matrix is:
$A = [\sqrt{\lambda_1}\,\eta_1, \sqrt{\lambda_2}\,\eta_2, \ldots, \sqrt{\lambda_m}\,\eta_m]$
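The principal-component solution for the factor load matrix can be sketched in NumPy. The function name `factor_loadings` and the layout of `forms` (one answered form per row, one question per column) are assumptions for illustration; the sketch keeps the m largest eigenvalues of the correlation matrix and scales each eigenvector by the square root of its eigenvalue.

```python
import numpy as np


def factor_loadings(forms, m):
    """forms: (num_forms, n) matrix of encoded answers; returns the n x m load matrix A."""
    R = np.corrcoef(forms, rowvar=False)        # n x n correlation coefficient matrix
    eigvals, eigvecs = np.linalg.eigh(R)        # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:m]       # indices of the m largest eigenvalues
    lam = np.clip(eigvals[order], 0.0, None)    # guard against tiny negative round-off
    return eigvecs[:, order] * np.sqrt(lam)     # column j is sqrt(lambda_j) * eta_j


rng = np.random.default_rng(1)
forms = rng.integers(1, 5, size=(40, 4)).astype(float)  # synthetic answers, 4 questions
A = factor_loadings(forms, m=2)                 # loadings of 4 questions on 2 factors
```

When all n factors are kept, `A @ A.T` reproduces the correlation matrix, which is a convenient sanity check on the decomposition.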
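Then, the factor scores of the form are calculated using the Bartlett estimation method: $F = (A^T D^{-1} A)^{-1} A^T D^{-1} P$, where $D = \mathrm{diag}(\sigma_1^2, \sigma_2^2, \ldots, \sigma_n^2)$ is the diagonal matrix of special-factor variances.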
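The Bartlett estimate is a weighted least-squares solve and can be sketched as follows. The function name and the example values are hypothetical, and the form vector is assumed to be standardized in the same way as the data used to fit the loadings.

```python
import numpy as np


def bartlett_scores(A, specific_var, p):
    """F = (A^T D^-1 A)^-1 A^T D^-1 P with D = diag(specific_var)."""
    D_inv = np.diag(1.0 / np.asarray(specific_var, dtype=float))
    lhs = A.T @ D_inv @ A                       # m x m
    rhs = A.T @ D_inv @ np.asarray(p, dtype=float)
    return np.linalg.solve(lhs, rhs)            # solve instead of forming the inverse


A_demo = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [1.0, 1.0]])                 # 3 questions, 2 common factors
f_true = np.array([2.0, -1.0])
p_demo = A_demo @ f_true                        # a form with no special-factor noise
f_hat = bartlett_scores(A_demo, [0.5, 1.0, 2.0], p_demo)
```

For a noise-free form the estimate recovers the generating factors exactly, whatever the special-factor variances are; with noise, questions with smaller variance receive more weight.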
Cluster analysis is then performed by KNN using the calculated factor scores $F$. The form to be analyzed is denoted $P$ and its factor score $F_P$; the Euclidean distance is used as the measure of distance between forms.
Finally, form $P$ is assigned to the category that occurs most often among the k classified forms closest to it, giving the violation item with the highest risk in form $P$. This violation item is then matched against the violation items to which each regulation belongs to obtain the regulation corresponding to the risk, and a risk solution is provided in combination with judicial cases and expert opinions, wherein the risk solution comprises the regulation corresponding to the risk and compliance advice. Specifically, a natural language processing method is used to analyze the violation items of each regulation related to data compliance, so that the violation item with the highest risk in the form is matched with the violation items to which each regulation belongs, the regulation corresponding to the risk is obtained, and compliance advice is provided in combination with judicial cases and expert opinions.
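The nearest-neighbour vote described above can be sketched in a few lines of standard-library Python. The function name, the value of k and the labelled factor scores are hypothetical examples.

```python
import math
from collections import Counter


def knn_violation_item(f_p, classified, k=3):
    """classified: list of (factor_score, violation_item) pairs for already-classified forms."""
    # Sort by Euclidean distance to the form under analysis and keep the k closest.
    nearest = sorted(classified, key=lambda item: math.dist(f_p, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]           # most frequent category among the k


classified_forms = [
    ((0.0, 0.1), "data storage"),
    ((0.1, 0.0), "data storage"),
    ((1.0, 1.0), "data transmission"),
    ((0.9, 1.1), "data transmission"),
    ((0.2, 0.2), "data acquisition"),
]
highest_risk = knn_violation_item((0.05, 0.05), classified_forms, k=3)
```

Here the three nearest forms are two "data storage" forms and one "data acquisition" form, so the vote returns "data storage".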
Step S105: and generating a compliance certificate according to the data risk analysis result and automatically storing the compliance certificate by using a blockchain technology.
After the evaluation is finished, a compliance certificate is generated from the user's name, the data evaluated by the user, the score of the answer results, the violation item with the highest risk, the regulation corresponding to the violation item, and the risk solution.
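Certificate generation can be sketched as assembling the listed fields and attaching a digest. The field names, the JSON canonicalization and the choice of SHA-256 are assumptions for illustration; the application only states which items the certificate is generated from.

```python
import hashlib
import json


def _canonical(body):
    # Deterministic byte encoding so that equal content always yields the same digest.
    return json.dumps(body, sort_keys=True, ensure_ascii=False).encode("utf-8")


def make_certificate(user, evaluated_data, score, violation_item, regulation, solution):
    body = {
        "user": user,
        "evaluated_data": evaluated_data,
        "score": score,
        "violation_item": violation_item,
        "regulation": regulation,
        "solution": solution,
    }
    return {**body, "digest": hashlib.sha256(_canonical(body)).hexdigest()}


def digest_matches(certificate):
    body = {k: v for k, v in certificate.items() if k != "digest"}
    return hashlib.sha256(_canonical(body)).hexdigest() == certificate["digest"]


certificate = make_certificate(
    user="example enterprise",
    evaluated_data="customer records",
    score=82.5,
    violation_item="data storage",
    regulation="Regulation A, provision 2",     # placeholder, not a real citation
    solution="encrypt stored personal data",
)
```

Any later change to a field makes `digest_matches` return `False`, which gives downstream nodes a cheap tamper check.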
The blockchain technology involves cryptographic hash functions and asymmetric encryption. When the compliance certificate is passed along each flow of the data, the data and the compliance certificate are delivered to an end node of the blockchain. The end node is provided with an evaluation standard, and determines whether the compliance certificate meets the compliance requirements of data transmission and thus whether the data is transmitted.
As shown in Fig. 2, the process of storing the compliance certificate using blockchain technology may include the following sub-steps:
S201: delivering the compliance certificate to an end node of a blockchain, wherein each blockchain node corresponds to one data flow node.
For example, if the user selects three data processing flows of data acquisition, data transmission and data storage, there are three blocks, and each data processing flow corresponds to one block.
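The one-block-per-flow layout in this example can be sketched as a hash-linked list. The block fields, the all-zero genesis hash and the use of SHA-256 are assumptions for illustration and do not describe a particular blockchain platform.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64


def block_digest(index, flow, prev_hash, certificate_digest):
    payload = json.dumps([index, flow, prev_hash, certificate_digest]).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def build_chain(flows, certificate_digest):
    chain, prev_hash = [], GENESIS_HASH
    for index, flow in enumerate(flows):        # one block per selected data flow
        digest = block_digest(index, flow, prev_hash, certificate_digest)
        chain.append({"index": index, "flow": flow, "prev_hash": prev_hash,
                      "certificate_digest": certificate_digest, "hash": digest})
        prev_hash = digest
    return chain


def chain_is_intact(chain):
    prev_hash = GENESIS_HASH
    for block in chain:
        expected = block_digest(block["index"], block["flow"], prev_hash,
                                block["certificate_digest"])
        if block["prev_hash"] != prev_hash or block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True


chain = build_chain(["data acquisition", "data transmission", "data storage"], "ab" * 32)
```

Because each block's hash covers the previous block's hash, altering any block invalidates every check from that block onward.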
S202: The compliance certificate is evaluated according to the evaluation standard set by the end node, and the integrity of the compliance certificate is checked.
Specifically, the integrity check verifies that the compliance certificate contains the user name, the data evaluated by the user, the score of the answer result, the highest-risk violation item, the regulation corresponding to the violation item, and the risk solution.
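The integrity check amounts to confirming that every required field is present and non-empty. A minimal sketch follows; the field names and the dictionary representation of the certificate are assumptions made for illustration.

```python
# Hypothetical field names mirroring the six items the certificate must contain.
REQUIRED_FIELDS = (
    "user_name",
    "assessed_data",
    "score",
    "highest_risk_violation",
    "corresponding_regulation",
    "risk_solution",
)


def missing_fields(certificate: dict) -> list:
    """Return the required fields that are absent, None, or an empty string."""
    return [
        name for name in REQUIRED_FIELDS
        if name not in certificate or certificate[name] is None or certificate[name] == ""
    ]


complete = {
    "user_name": "example user",
    "assessed_data": "customer records",
    "score": 0,  # a score of zero is still a valid value
    "highest_risk_violation": "unencrypted transmission",
    "corresponding_regulation": "placeholder regulation",
    "risk_solution": "encrypt data in transit",
}
incomplete = {k: v for k, v in complete.items() if k != "risk_solution"}

print(missing_fields(complete), missing_fields(incomplete))
```

An end node could reject any certificate for which the returned list is not empty.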
S203: Judge whether the compliance certificate meets the data transmission compliance requirement and decide whether the data is transmitted.
Specifically, determining whether the compliance certificate meets the data transmission compliance requirement includes performing a data sensitivity evaluation and performing a data transmission check.
Specifically, the data sensitivity evaluation is implemented with an evaluation model (mainly a deep learning model and a natural language processing model). The evaluation standard can be established according to actual requirements; in this embodiment, personal information is set as sensitive information, and the user can also define custom sensitive data.
Specifically, the data transmission check examines whether the data is desensitized or whether the data transmission is encrypted. If the requirement is not met, an alarm is raised and the compliance certificate is prohibited from being transferred to the next node. If the requirement is met, the compliance certificate is transmitted to the next node. The data transmission check can prevent sensitive data from leaking.
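The decision made at each node can be sketched as a small gate function. The reading adopted here, that sensitive data may pass if it is either desensitized or encrypted, is an assumption, as are the class name `Transfer` and the function name `check_transfer`; the text does not state how the two conditions combine.

```python
from dataclasses import dataclass


@dataclass
class Transfer:
    contains_sensitive_data: bool  # outcome of the sensitivity evaluation
    desensitized: bool
    encrypted: bool


def check_transfer(transfer):
    """Return (allowed, alarm); alarm is None when the transfer may proceed."""
    if not transfer.contains_sensitive_data:
        return True, None
    # Assumed rule: either protective measure is sufficient.
    if transfer.desensitized or transfer.encrypted:
        return True, None
    return False, "sensitive data is neither desensitized nor encrypted"


print(check_transfer(Transfer(True, False, True)))
print(check_transfer(Transfer(True, False, False)))
```

When the first element of the result is False, the node would raise the alarm and withhold the compliance certificate from the next node.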
Specifically, the data, the compliance certificate, the transmission process and the links between blocks are all protected by measures against leakage and tampering.
Corresponding to the embodiment of the data whole-flow risk internal examination method, the application also provides an embodiment of the data whole-flow risk internal examination device.
Fig. 3 is a block diagram of a data full-flow risk internal examination device, according to an example embodiment. Referring to fig. 3, the device may include:
the establishing module 21 is used for establishing a data whole-flow risk assessment questionnaire library;
the selection module 22 is configured to select modules and questions from the data whole-flow risk assessment questionnaire, according to user requirements and preset rules and based on the attribute features of the data whole-flow risk assessment questionnaire, for the user to answer;
the scoring module 23 is configured to invoke a pre-trained model to score the answer result according to the answer result of the user;
the judging module 24 is configured to judge the violation item with the highest risk in the answer result by using a machine learning method and to provide a risk solution for the violation item;
the storage module 25 is configured to generate a compliance certificate according to the data risk analysis result and automatically store the compliance certificate by using a blockchain technology.
The specific manner in which the various modules of the device in the above embodiment perform their operations has been described in detail in connection with the embodiments of the method, and will not be repeated here.
Since the device embodiments essentially correspond to the method embodiments, reference is made to the description of the method embodiments for the relevant points. The device embodiments described above are merely illustrative: the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purposes of the present application. Those of ordinary skill in the art can understand and implement the present application without creative effort.
The application also provides a data whole-flow risk internal examination platform, as shown in fig. 4, which comprises a receiving module, a management module, an examination module and a certification module.
A receiving module 301, configured to receive an answer result of a user;
Specifically, the module supports processing multi-modal data, including text, audio, images and video. The platform intelligently extracts the content of the multi-modal data, distinguishes compliant data from non-compliant data, and classifies and labels data according to its risk level. The user uploads data on the main page of the data whole-flow risk internal examination platform and selects the link in which the data is located and the target evaluation items; the platform then analyzes the risk of the data in each dimension, the link in which the risk occurs and the risk solution, and generates an evaluation result and a compliance certificate.
The management module 302 mainly serves a data security manager, and the data security manager can control, manage and withdraw the authority of the data processor at any time.
Specifically, the platform is provided with four drop-down boxes: data flow, related responsible person, authority management and security score. Authority management comprises the browse right, processing right, management right and examination right. For example, the data security manager clicks the management module on the platform, selects a responsible person associated with data storage, selects that person's authority, such as "browse only", and views or changes that person's security score.
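The four rights and the ability to grant and withdraw them per responsible person and data flow can be sketched with a flag enumeration. The names `Permission` and `PermissionRegistry` and the in-memory dictionary are assumptions for illustration; the text does not describe how the platform stores authorities.

```python
from enum import Flag, auto


class Permission(Flag):
    NONE = 0
    BROWSE = auto()
    PROCESS = auto()
    MANAGE = auto()
    EXAMINE = auto()


class PermissionRegistry:
    """In-memory record of rights per (responsible person, data flow)."""

    def __init__(self):
        self._grants = {}

    def grant(self, person, flow, permission):
        key = (person, flow)
        self._grants[key] = self._grants.get(key, Permission.NONE) | permission

    def revoke(self, person, flow, permission):
        key = (person, flow)
        self._grants[key] = self._grants.get(key, Permission.NONE) & ~permission

    def allowed(self, person, flow, permission):
        return permission in self._grants.get((person, flow), Permission.NONE)


registry = PermissionRegistry()
registry.grant("responsible person A", "data storage", Permission.BROWSE)  # "browse only"
print(registry.allowed("responsible person A", "data storage", Permission.BROWSE))
print(registry.allowed("responsible person A", "data storage", Permission.PROCESS))
```

Calling `revoke` with the same arguments withdraws the right again, which corresponds to the manager withdrawing a data processor's authority at any time.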
The examination module 303 is configured to establish a data full-flow risk assessment questionnaire library; select modules and questions from the data whole-flow risk assessment questionnaire, according to user requirements and preset rules and based on the attribute features of the data whole-flow risk assessment questionnaire, for the user to answer; invoke a pre-trained model to score the form according to the answer result of the user; and judge the violation item with the highest risk in the form by a machine learning method according to the answer result of the user and provide a risk solution for the violation item;
The certification module 304 is configured to generate and store a compliance certificate after the examination module completes the examination.
Specifically, blockchain technology is used to store the content, process and evidence of the compliance examination.
Finally, after the data flows have been evaluated, examined and stored, a data full-flow risk internal examination report is generated and sent to the data security manager for a secondary examination. The data security manager can change the internal examination report according to the specific circumstances, and the change process is recorded.
Correspondingly, the application also provides an electronic device, comprising: one or more processors; and a memory for storing one or more programs; wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the data full-flow risk internal examination method described above. Fig. 5 is a hardware structure diagram of a device with data processing capability on which the data full-flow risk internal examination method provided by the embodiment of the present application runs. In addition to the processor, the memory and the network interface shown in fig. 5, the device with data processing capability in this embodiment may further include other hardware according to its actual function, which is not described here.
Accordingly, the present application further provides a computer readable storage medium having stored thereon computer instructions that, when executed by a processor, implement the data full-flow risk internal examination method described above. The computer readable storage medium may be an internal storage unit, such as a hard disk or a memory, of any device with data processing capability described in any of the previous embodiments. The computer readable storage medium may also be an external storage device provided on the device, such as a plug-in hard disk, a Smart Media Card (SMC), an SD Card, or a Flash memory Card (Flash Card). Further, the computer readable storage medium may include both an internal storage unit and an external storage device of any device with data processing capability. The computer readable storage medium is used for storing the computer program and other programs and data required by the device with data processing capability, and may also be used for temporarily storing data that has been output or is to be output.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains.
It is to be understood that the present application is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof.

Claims (10)

1. A data full-flow risk internal examination method, characterized by comprising the following steps:
establishing a data whole-flow risk assessment questionnaire library;
selecting modules and questions from the data whole-flow risk assessment questionnaire, according to user requirements and preset rules and based on attribute features of the data whole-flow risk assessment questionnaire, for a user to answer;
invoking a pre-trained model to score the answer result according to the answer result of the user;
judging the violation item with the highest risk in the answer result by a machine learning method and providing a risk solution for the violation item;
and generating a compliance certificate according to the data risk analysis result and automatically storing the compliance certificate by using a blockchain technology.
2. The method of claim 1, wherein the data full-flow risk assessment questionnaire library comprises:
(1) a data whole-flow questionnaire library, comprising questionnaires for each flow in whole-flow data processing, the questionnaires being divided by flow into data acquisition, data transmission, data storage, data use, data disclosure, data destruction and entrusted processing;
(2) a legal questionnaire library, comprising relevant domestic and foreign laws and normative standards on data compliance, with a specific questionnaire formulated for each law and regulation;
(3) a custom questionnaire library, comprising custom questionnaires input by users.
3. The method of claim 1, wherein in the model, a form vector derived from the answer result is converted to a feature matrix using Attention, and the answer result is scored through a residual network based on the feature matrix.
4. The method of claim 3, wherein converting the form vector derived from the answer result to a feature matrix using Attention, and scoring the answer result through a residual network based on the feature matrix, comprises:
obtaining, from the form vector derived from the answer result, a plurality of pieces of relation information among all the questions of the form through a multi-head attention mechanism;
concatenating the plurality of pieces of relation information and adjusting the dimension to obtain an incidence matrix of the form;
obtaining deep features of the form through a residual network based on the incidence matrix;
and obtaining the form score from the deep features by using a fully connected layer with Dropout.
5. The method of claim 1, wherein determining the highest-risk violation item in the answer result by a machine learning method comprises:
performing factor analysis on the answer result of the user to obtain a factor load matrix;
calculating the factors of the answer result of the user by the Bartlett estimation method using the factor load matrix;
performing KNN analysis on the factors to obtain the k classified forms nearest to the answer result of the user;
and classifying the answer result of the user into the category that occurs most often among the k nearest classified forms, thereby obtaining the violation item with the highest risk.
6. The method of claim 1, wherein providing a risk solution for the highest-risk violation item specifically comprises:
matching the violation item with the highest risk with the violation items to which each regulation belongs to obtain the regulation corresponding to the risk, and providing a risk solution by combining judicial cases and expert opinions, wherein the risk solution comprises: the regulation corresponding to the risk and compliance advice.
7. The method of claim 6, wherein the violation items of each regulation related to data compliance are analyzed using a natural language processing method, so that the highest-risk violation item is matched with the violation items to which each regulation belongs to obtain the regulation corresponding to the risk.
8. The method of claim 1, wherein storing the compliance certificate using blockchain technology comprises:
delivering the compliance certificate to an end node of a blockchain, wherein each blockchain node in the blockchain corresponds to one data flow node;
evaluating the compliance certificate according to the evaluation standard set by the end node, and checking the integrity of the compliance certificate;
judging whether the compliance certificate meets the data transmission compliance requirement and deciding whether the data is transmitted.
9. A data full-flow risk internal examination device, characterized by comprising:
the establishing module is used for establishing a data whole-flow risk assessment questionnaire library;
the selection module is used for selecting modules and questions from the data whole-flow risk assessment questionnaire, according to user requirements and preset rules and based on the attribute features of the data whole-flow risk assessment questionnaire, for the user to answer;
the scoring module is used for calling a pre-trained model to score the answer result according to the answer result of the user;
the judging module is used for judging the violation item with the highest risk in the answer result through a machine learning method and providing a risk solution for the violation item;
and the storage module is used for generating a compliance certificate according to the data risk analysis result and automatically storing the compliance certificate by using a blockchain technology.
10. A data full-flow risk internal examination platform, characterized by comprising:
the receiving module is used for receiving the answer result of the user;
the management module is used for controlling, managing and withdrawing the authority of the data processor;
the examination module is used for establishing a data whole-flow risk assessment questionnaire library; selecting modules and questions from the data whole-flow risk assessment questionnaire, according to user requirements and preset rules and based on attribute features of the data whole-flow risk assessment questionnaire, for the user to answer; invoking a pre-trained model to score the form according to the answer result of the user; and judging the violation item with the highest risk in the form by a machine learning method according to the answer result of the user and providing a risk solution for the violation item;
and the certification module is used for generating and storing a compliance certificate after the examination module completes the examination.
CN202310277855.8A 2023-03-17 2023-03-17 Data full-flow risk internal examination method, device and platform Pending CN116401685A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310277855.8A CN116401685A (en) 2023-03-17 2023-03-17 Data full-flow risk internal examination method, device and platform

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310277855.8A CN116401685A (en) 2023-03-17 2023-03-17 Data full-flow risk internal examination method, device and platform

Publications (1)

Publication Number Publication Date
CN116401685A true CN116401685A (en) 2023-07-07

Family

ID=87008340

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310277855.8A Pending CN116401685A (en) 2023-03-17 2023-03-17 Data full-flow risk internal examination method, device and platform

Country Status (1)

Country Link
CN (1) CN116401685A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination