CN115713249A - Government affair satisfaction evaluation system and method based on data security and privacy protection

Info

Publication number
CN115713249A
CN115713249A CN202211235305.1A CN202211235305A CN115713249A CN 115713249 A CN115713249 A CN 115713249A CN 202211235305 A CN202211235305 A CN 202211235305A CN 115713249 A CN115713249 A CN 115713249A
Authority
CN
China
Prior art keywords
data
module
satisfaction
scoring
government
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211235305.1A
Other languages
Chinese (zh)
Other versions
CN115713249B (en)
Inventor
卢清华
李梦园
李方伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Yitong College
Original Assignee
Chongqing Yitong College
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing Yitong College
Priority to CN202211235305.1A
Publication of CN115713249A
Application granted
Publication of CN115713249B
Legal status: Active

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30: Computing systems specially adapted for manufacturing

Abstract

The invention claims a government affair satisfaction scoring system and method based on machine learning, and belongs to the technical field of data scoring. The system comprises a database module, a data security module and a data scoring module. The database module comprises a data acquisition module and a data classification module; the data security module comprises an identity authentication module, a data desensitization module and a feedback early-warning module; the data scoring module comprises a preprocessing module, a model training module and a scoring module. The data security module uses identity authentication, data desensitization and a feedback early-warning mechanism to provide encryption protection for the sensitive information of surveyed members of the public, which strengthens data security and privacy protection; it also forms an accessible system, which increases data transparency and improves the fairness and openness of government affair evaluation. The data scoring module completes model training with the CatBoost machine learning algorithm, establishes a government affair satisfaction scoring model and scores the relevant data.

Description

Government affair satisfaction evaluation system and method based on data security and privacy protection
Technical Field
The invention belongs to the technical field of data scoring, and particularly belongs to a government affair satisfaction evaluation system and method based on data security and privacy protection.
Background
With the continued advance of digital government construction, data-driven service delivery is gradually becoming a new and efficient mode of government work. At the same time, governments at all levels pay increasing attention to public opinion and to public participation in government affair evaluation.
Existing government affair evaluation systems only collect satisfaction data through a simple scan-the-code questionnaire or a web questionnaire, and in data processing only take the average of the summed satisfaction scores as the final scoring result. The resulting problems are: (1) there is no complete government affair satisfaction scoring system, that is, no adaptable and easily promoted end-to-end evaluation system covering data collection, database establishment, data processing and scoring model construction; (2) data security management is lacking: sensitive information such as the attributes of respondents is not processed, so there is a risk of leaking private information; (3) the scoring model is outdated: no scientific and efficient algorithmic model has been formed, so it is difficult to mine the problems behind government affair data and to provide instructive suggestions; (4) government affair satisfaction data are not public, people cannot access or query them, and credibility is lacking.
Therefore, applying scientific evaluation standards, methods and procedures to government affair satisfaction evaluation is of great significance for the performance of duties by government personnel, for building harmonious relations between officials and the public, and for establishing a good government image.
CN111222753A discloses an electronic government performance evaluation system comprising: an evaluation model module, a data acquisition module, a data processing module, an index weight learning and generating module, an evaluation result generating module and an evaluation report natural language generating module. In that system, an index weight training model is established and the weight of each basic evaluation index is learned from data, which improves the certainty, objectivity and reasonableness of the weights used in the evaluation process. Through the weights, the final evaluation result becomes a better-grounded and more directed fusion of the scores on each basic evaluation index; and through the setting of a plurality of basic evaluation indexes, government affairs can be evaluated from all directions, so that the demands of the people are accurately understood and government work is improved.
1. The e-government performance evaluation system proposed in CN111222753A has no step that encrypts and protects the sensitive information of surveyed members of the public. The present patent proposes a data security module that uses identity authentication, data desensitization and a feedback early-warning mechanism to strengthen data security and privacy protection.
2. The data processing procedure of CN111222753A lacks a data access system. The present patent sets up a data access port in the data security module, forming an accessible system in which people can access the original government affair satisfaction survey data, which increases data transparency and improves the fairness and openness of government affair evaluation.
3. CN111222753A focuses on determining the weights of the evaluation indexes through an evidence calculation method and a differential purification algorithm, combining the weights to synthesize the overall evaluation information on the evaluation indexes into a final evaluation result, and generating a text evaluation report in natural language. The present patent, first, emphasizes processing government affair satisfaction data to obtain a scientific government affair satisfaction scoring result. Second, it obtains the contribution degree of each government affair satisfaction index with the import () (feature importance) function of the CatBoost algorithm and determines the index weights from it. This not only improves data processing speed but also ensures that the weights are assigned scientifically and fairly.
Disclosure of Invention
The present invention is directed to solving the above problems of the prior art. A government affair satisfaction evaluation system and method based on data security and privacy protection are provided. The technical scheme of the invention is as follows:
a government satisfaction evaluation system based on data security and privacy protection, comprising: a database module, a data security module and a data scoring module, wherein,
the database module is used for forming a database according to the collection and classification results of various government affair satisfaction evaluation data and providing the database to the data security module;
the data security module is used for performing access control, privacy protection and feedback early warning on government affair satisfaction evaluation data, completing security management work of the data and providing a processed data set for the data scoring module;
and the data scoring module is used for model training of the government affair satisfaction evaluation data, constructing a government affair satisfaction scoring model and outputting a scoring result.
Further, the database module comprises a data acquisition module and a data classification module, wherein the data acquisition module is used for collecting various data of government affair satisfaction; the data classification module is used for dividing the collected government affair satisfaction data into monthly, quarterly and annual data according to time and further dividing the collected government affair satisfaction data into subdata sets including safe construction, legal construction and service evaluation according to the government affair subject content.
Further, the data security module comprises an identity authentication module, a data desensitization module and a feedback early warning module, wherein the identity authentication module is used for performing identity authentication on an accessor and determining the inquiry authority of the accessor; the data desensitization module is used for desensitizing a data set containing sensitive information; the feedback early warning module generates a log by recording user behaviors; and when the number of times of identity authentication refusal of the same user exceeds a set threshold value, locking the access port in time and feeding back an early warning to the terminal.
Further, the data security module completes the specific steps of access control, privacy protection and feedback early warning work as follows:
(1) When a user applies for accessing government affair satisfaction data resources, identity authentication is firstly carried out on the user, and if the authentication is not passed, access is refused; if the authentication is passed, the operation can be further carried out;
(2) Judging the data resource requested by the user: when the data resource does not contain sensitive data, the required data set is obtained according to the authority; when the requested data resource contains sensitive data, sensitive information including attributes of the masses is encrypted to obtain a desensitized data set;
(3) Recording the access process of the user to generate a corresponding log;
(4) When the number of identity authentication refusals for the same user exceeds a set threshold value, locking the access port in time and feeding back an early warning to the terminal.
Further, the data desensitization module is used for desensitizing a data set containing sensitive information, and the specific steps are as follows:
(1) The sensitive data set enters a data desensitization module;
(2) Determining a desensitization scheme, and desensitizing sensitive data by using modes including truncation, encryption, hiding and replacement;
(3) Writing desensitization rules, writing desensitization rule tables, wherein different desensitization rules correspond to different data encryption methods;
(4) According to the sensitive data category (name, identification card number, mobile phone number, address) and the desensitization scheme, desensitization of the sensitive data is completed through primary-key association in accordance with the specified scheme;
(5) The desensitized data set is provided to a data scoring module.
Further, the data scoring module comprises a preprocessing module, a model training module and a scoring module, wherein the preprocessing module is used for preprocessing the accessed government affair satisfaction data set, including data cleaning, unbalanced data processing and data set segmentation, and providing the processed data set to the model training module; the model training module is used for training a model on the data set with the CatBoost machine learning algorithm, obtaining the government affair satisfaction index contribution degrees with the import () (feature importance) function of the CatBoost algorithm, determining the index weights from them and providing the index weights to the scoring module; and the scoring module establishes a government affair satisfaction scoring model from the index weights, completes the scoring of the relevant data and finally obtains the government affair satisfaction scoring result.
Further, the model training is completed with the CatBoost machine learning algorithm, and the government affair satisfaction index contribution degrees are obtained with the import () (feature importance) function of the CatBoost algorithm, so that the index weights are determined and provided to the scoring module, which specifically comprises the following steps:
A government affair satisfaction evaluation method based on data security and privacy protection, using the above system, comprises the following steps:
collecting and classifying various government affair satisfaction degree evaluation data by using a database module to form a database and provide the database for a data security module;
performing access control, privacy protection and feedback early warning on government affair satisfaction evaluation data by using the data security module, completing security management work of the data and providing a processed data set for the data scoring module;
and performing model training on the government affair satisfaction evaluation data by using the data scoring module, constructing a government affair satisfaction scoring model, and outputting a scoring result.
The invention has the following advantages and beneficial effects:
In view of the defects of the prior art, the invention establishes a complete government affair satisfaction evaluation system. It comprises a database module, a data security module and a data scoring module. The database module comprises a data acquisition module and a data classification module, and is used for forming a database from the collection and classification results of various government affair satisfaction evaluation data and providing the database to the data security module; the data security module comprises an identity authentication module, a data desensitization module and a feedback early-warning module, and is used for performing access control, privacy protection and feedback early warning on government affair satisfaction evaluation data, completing security management of the data and providing a processed data set to the data scoring module; the data scoring module comprises a preprocessing module, a model training module and a scoring module, and is used for model training on government affair satisfaction evaluation data, building a government affair satisfaction scoring model and outputting a scoring result.
The invention has the following advantages: 1. a complete data link for government affair satisfaction is formed, improving adaptability and ease of promotion; 2. a data security module is added, which uses access control, encryption and a feedback early-warning mechanism to protect the sensitive information of surveyed members of the public, strengthening data security and privacy protection; 3. an identity authentication module is added to form an accessible system, which increases data transparency and improves the fairness and openness of government affair evaluation; 4. a government affair satisfaction scoring model is established with the CatBoost machine learning algorithm, which improves data processing speed and ensures that the satisfaction index weights are assigned scientifically and fairly.
Drawings
FIG. 1 shows a government affair satisfaction evaluation system based on data security and privacy protection according to a preferred embodiment of the present invention;
FIG. 2 is a schematic flow chart of the database module in the government affair satisfaction evaluation system based on data security and privacy protection provided by the present application;
FIG. 3 is a schematic flow chart of the data security module in the government affair satisfaction evaluation system based on data security and privacy protection provided by the present application;
FIG. 4 is a schematic flow chart of the data scoring module in the government affair satisfaction evaluation system based on data security and privacy protection provided by the present application;
FIG. 5 is a flow chart of data scoring in an embodiment of the present application;
FIG. 6 shows the results of desensitization processing of a data set according to an embodiment of the present application;
FIG. 7 is a comparison of the evaluation results of the algorithm models on the data set in an embodiment of the present application;
FIG. 8 shows the constructed government affair satisfaction scoring model.
Detailed Description
The technical solutions in the embodiments of the present invention will be described in detail and clearly with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention.
The technical scheme for solving the technical problems is as follows:
As shown in FIGS. 1 to 6, a government affair satisfaction evaluation system based on data security and privacy protection comprises a database module, a data security module and a data scoring module, and the implementation method comprises the following steps:
collecting and classifying various government affair satisfaction evaluation data by using a database module to form a database and provide the database to a data security module;
performing access control, privacy protection and feedback early warning on government affair satisfaction evaluation data by using the data security module, completing security management work of the data and providing a processed data set for the data scoring module;
and performing model training on the government affair satisfaction evaluation data by using the data scoring module, constructing a government affair satisfaction scoring model, and outputting a scoring result.
Preferably, the database module comprises a data acquisition module and a data classification module, wherein the data acquisition module is used for collecting various types of data of government affair satisfaction; the data classification module is used for dividing collected government affair satisfaction data into monthly, quarterly and annual data according to time, and further dividing the collected government affair satisfaction data into subdata sets such as safe construction, legal construction and service evaluation according to government affair subject contents.
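The two-way classification described above (by time period and by government affair subject) can be illustrated with a short Python sketch. The record layout (`id`, `date`, `subject`) and the bucket keys are assumptions made for this example; the patent does not specify a schema.

```python
from collections import defaultdict
from datetime import date


def classify(records):
    """Group survey records by period (monthly, quarterly, annual) and by subject.

    Each record is a dict with the hypothetical keys 'date' and 'subject'.
    """
    buckets = defaultdict(list)
    for rec in records:
        d = rec["date"]
        quarter = (d.month - 1) // 3 + 1
        buckets[("monthly", f"{d.year}-{d.month:02d}")].append(rec)
        buckets[("quarterly", f"{d.year}-Q{quarter}")].append(rec)
        buckets[("annual", str(d.year))].append(rec)
        buckets[("subject", rec["subject"])].append(rec)
    return dict(buckets)


# Made-up records for illustration only.
records = [
    {"id": 1, "date": date(2021, 2, 14), "subject": "service evaluation"},
    {"id": 2, "date": date(2021, 5, 3), "subject": "legal construction"},
    {"id": 3, "date": date(2021, 5, 20), "subject": "service evaluation"},
]
groups = classify(records)
```

Keeping every record in one bucket per dimension means that a sub-dataset such as "2021 service evaluation data" can be formed by intersecting an annual bucket with a subject bucket.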
Preferably, the data security module comprises an identity authentication module, a data desensitization module and a feedback early warning module, wherein the identity authentication module is used for performing identity authentication on an accessor and determining the inquiry authority of the accessor; the data desensitization module is used for desensitizing a data set containing sensitive information to increase data security; the feedback early warning module generates a log by recording user behaviors. When the number of times of identity authentication refusal of the same user exceeds a set threshold value, the access port is locked in time and early warning is fed back to the terminal, so that the access safety is improved.
Preferably, the data scoring module comprises a preprocessing module, a model training module and a scoring module, wherein the preprocessing module is used for cleaning the accessed government affair satisfaction data set, processing unbalanced data, segmenting the data set and providing the processed data set to the model training module; the model training module is used for training a model on the data set with the CatBoost machine learning algorithm, obtaining the government affair satisfaction index contribution degrees with the import () (feature importance) function of the CatBoost algorithm, determining the index weights from them and providing the index weights to the scoring module; and the scoring module establishes a government affair satisfaction scoring model from the index weights, completes the scoring of the relevant data and finally obtains the government affair satisfaction scoring result.
Preferably, the specific steps of the data security module for completing access control, privacy protection and feedback early warning work are as follows:
(1) When a user applies for accessing government affair satisfaction data resources, firstly, performing identity authentication on the user, and if the authentication is not passed, refusing access; if the authentication is passed, further operation can be carried out;
(2) Judging the data resource requested by the user: when the data resource does not contain sensitive data, the required data set is obtained according to the authority; when the requested data resource contains sensitive data, sensitive information such as attributes of the masses is encrypted to obtain a desensitized data set;
(3) Recording the access process of the user and generating a corresponding log;
(4) When the number of identity authentication refusals for the same user exceeds a set threshold value, locking the access port in time and feeding back an early warning to the terminal.
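The four steps above can be sketched as a small Python class. This is only an illustration under stated assumptions: the class name `AccessGate`, the plain-text credential dictionary and the per-user lock (standing in for "locking the access port") are all hypothetical and are not taken from the patent.

```python
class AccessGate:
    """Authenticate users, log every attempt, and lock after repeated refusals."""

    def __init__(self, credentials, max_refusals=3):
        self.credentials = credentials    # user -> password (illustrative only)
        self.max_refusals = max_refusals  # the "set threshold value"
        self.refusals = {}                # user -> number of refused attempts
        self.locked = set()               # users whose access is locked
        self.log = []                     # access log entries
        self.alerts = []                  # early warnings fed back to the terminal

    def request(self, user, password, dataset, is_sensitive):
        if user in self.locked:
            self.log.append((user, dataset, "locked"))
            return "locked"
        if self.credentials.get(user) != password:
            self.refusals[user] = self.refusals.get(user, 0) + 1
            self.log.append((user, dataset, "refused"))
            # Lock only once the refusal count exceeds the threshold.
            if self.refusals[user] > self.max_refusals:
                self.locked.add(user)
                self.alerts.append(f"early warning: access locked for {user}")
            return "refused"
        # Authenticated: sensitive data must pass through desensitization first.
        outcome = "desensitized" if is_sensitive else "raw"
        self.log.append((user, dataset, outcome))
        return outcome


gate = AccessGate({"R": "secret"}, max_refusals=2)
results = [gate.request("R", "wrong", "D3", True) for _ in range(4)]
ok_gate = AccessGate({"R": "secret"})
granted = ok_gate.request("R", "secret", "D3", True)
```

With a threshold of 2, the third refusal triggers the lock and the warning, and the fourth attempt is rejected as locked without re-checking the password.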
Preferably, when the sensitive data set is accessed, the data desensitization module is automatically entered to complete the dynamic data desensitization, and the specific steps are as follows:
(1) The sensitive data set enters a data desensitization module;
(2) Determining a desensitization scheme: sensitive data are desensitized by means such as truncation, encryption, hiding and replacement, for example by replacing real values with special characters;
(3) Writing desensitization rules into a desensitization rule table, wherein different desensitization rules correspond to different data encryption methods;
(4) According to the sensitive data category (sensitive information including name, identification card number, mobile phone number and address) and the desensitization scheme, desensitization of the sensitive data is completed through primary-key association in accordance with the specified scheme;
(5) The desensitized data set is provided to a data scoring module.
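One way to read steps (2) to (4) is as a rule table that maps each sensitive category to one of the four methods, applied after joining on a primary key. The sketch below assumes a hypothetical key named `pk`, a hypothetical category-to-method mapping, and a salted SHA-256 hash as a stand-in for the unspecified encryption method; the sample values are made up.

```python
import hashlib


def truncate(value, keep=6):
    """Truncation: keep only the leading characters."""
    return value[:keep]


def hide(value, keep_head=1, mask="*"):
    """Hiding: keep the head and mask the rest."""
    return value[:keep_head] + mask * (len(value) - keep_head)


def replace(value, token="<redacted>"):
    """Replacement: substitute a fixed token for the real value."""
    return token


def encrypt(value, salt="demo-salt"):
    """One-way salted hash, used here in place of the unspecified encryption step."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:12]


# Desensitization rule table: sensitive category -> method (hypothetical mapping).
RULES = {"name": hide, "id_number": encrypt, "phone": replace, "address": truncate}


def desensitize(survey_rows, sensitive_rows):
    """Join the two tables on the primary key 'pk' and mask the sensitive fields."""
    by_pk = {row["pk"]: row for row in sensitive_rows}
    out = []
    for row in survey_rows:
        masked = {field: RULES[field](value)
                  for field, value in by_pk[row["pk"]].items() if field in RULES}
        out.append({**row, **masked})
    return out


survey = [{"pk": 1, "score": 4}]
sensitive = [{"pk": 1, "name": "Alice Wong", "id_number": "500101199001011234",
              "phone": "13800001111", "address": "J Area, Example Road 1"}]
result = desensitize(survey, sensitive)
```

Keeping the rules in a table rather than in code paths means a category can be switched to a different method by editing one entry.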
Preferably, the government affair satisfaction degree data set enters a data scoring module to complete relevant data scoring operation, and the specific steps are as follows:
(1) Completing data preprocessing: through the preprocessing module, data cleaning is completed, that is, the data set is inspected and described; unbalanced data are processed by oversampling or undersampling; finally, the data set is divided into a training set and a test set and provided to the model training module;
(2) Completing model training: through the model training module, the training set data enter the data scoring module for model training; model training is completed with the CatBoost machine learning algorithm and the model is evaluated; the government affair satisfaction index contribution degrees are obtained with the import () (feature importance) function of the CatBoost algorithm, and the index weights are determined and provided to the scoring module;
(3) Completing data scoring: through the scoring module, a government affair satisfaction scoring model is established from the index weights, the scoring of the relevant data is completed, and the government affair satisfaction scoring result is finally obtained.
Preferably, the data scoring module completes data preprocessing, model training and scoring model establishment to obtain a scoring result; the detailed Python-based pseudo-code is given in the original publication as four image listings (BDA0003882526990000081 to BDA0003882526990000111), which are not reproduced here.
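A minimal Python sketch of the preprocessing and model training steps described above follows; it is not the patent's own listing. The helper names, the row layout and the hyper-parameters are assumptions. The training helper relies on the third-party `catboost` package and its `CatBoostClassifier.get_feature_importance()` method, which is taken here to be what the text calls the import () function.

```python
import random


def oversample(rows, label_key="label", seed=0):
    """Random oversampling: duplicate minority-class rows until classes balance."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)
    target = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    return balanced


def train_test_split(rows, test_ratio=0.2, seed=0):
    """Shuffle the rows and return (training set, test set)."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = max(1, int(len(rows) * test_ratio))
    return rows[n_test:], rows[:n_test]


def train_and_rank(train_rows, feature_names, label_key="label"):
    """Fit CatBoost and return {feature: importance}; needs `pip install catboost`."""
    from catboost import CatBoostClassifier  # third-party, imported lazily

    X = [[row[name] for name in feature_names] for row in train_rows]
    y = [row[label_key] for row in train_rows]
    model = CatBoostClassifier(iterations=200, verbose=0, random_seed=0)
    model.fit(X, y)
    return dict(zip(feature_names, model.get_feature_importance()))


# Made-up, imbalanced toy data: eight satisfied rows, two dissatisfied rows.
rows = ([{"x": i, "label": 1} for i in range(8)]
        + [{"x": i, "label": 0} for i in range(2)])
train, test = train_test_split(rows)
balanced = oversample(rows)
```

In practice it is usually safer to split first and oversample only the training set, so that duplicated rows cannot leak into the test set; the text lists imbalance handling before the split, so the order is a design choice left open here.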
Assume that satisfaction evaluation data of the residents of area J regarding the government's food safety work in 2021 are collected through the data acquisition module A1 in the database module of FIG. 2, and that the data set contains sensitive information such as the respondents' names, identification numbers, mobile phone numbers and addresses. The data classification module A2 divides the data into 3 sub-datasets according to time and government affair subject: 2021 data D1, food safety work data D2, and 2021 food safety work data D3.
Now, as shown in FIG. 3, user R accesses the 2021 food safety work data D3 through the data security module B. After the user obtains authority through the identity authentication module B1, the data automatically enter the data desensitization module B2 to complete desensitization of the sensitive information. The specific steps are as follows:
(1) According to the sensitive data category (name, identification number, mobile phone number, address) and the desensitization scheme, desensitization of the sensitive data is completed through primary-key association in accordance with the specified scheme.
The result of desensitizing data set D3 is shown in FIG. 6: names keep the surname and hide the rest; identification numbers keep the first six digits and the last four digits, so that they can still be matched with regional information while information security is improved; mobile phone numbers hide four digits starting from the fifth digit; address information is truncated so that only the region is retained, which makes it convenient to check that the government affair satisfaction survey was conducted in the corresponding region while preventing information leakage.
(2) The resulting desensitized data set D3' is provided to the data scoring module C.
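The four masking rules described for FIG. 6 can be written out as a short Python sketch. The mask character `*`, the single-character surname, the comma-separated address format and the sample values are assumptions made for illustration; the figure itself is not reproduced in the text.

```python
def mask_name(name):
    """Keep the surname (assumed to be the first character) and hide the rest."""
    return name[:1] + "*" * (len(name) - 1)


def mask_id_number(id_number):
    """Keep the first six and the last four digits."""
    return id_number[:6] + "*" * (len(id_number) - 10) + id_number[-4:]


def mask_phone(phone):
    """Hide four digits starting from the fifth digit."""
    return phone[:4] + "****" + phone[8:]


def mask_address(address):
    """Keep only the region, assumed to be the part before the first comma."""
    return address.split(",")[0].strip()


def desensitize_row(row):
    """Apply all four rules to one record and leave other fields unchanged."""
    return {
        **row,
        "name": mask_name(row["name"]),
        "id_number": mask_id_number(row["id_number"]),
        "phone": mask_phone(row["phone"]),
        "address": mask_address(row["address"]),
    }


# Made-up record for illustration only.
row = {"name": "张三", "id_number": "500101199001011234",
       "phone": "13812345678", "address": "J Area, 12 Example Road", "score": 5}
masked = desensitize_row(row)
```

Because the first six digits of the identification number survive, the masked record can still be matched to a region, which is the property the text relies on.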
In order to further determine the satisfaction degree scoring result of the residents in the J area on the government food safety work, the data set D3' utilizes the data scoring module C to complete the final scoring work, and the specific steps are as follows with reference to FIG. 4:
(1) The government affair satisfaction degree data set D3' enters a preprocessing module C1 to finish basic data preprocessing work, including data cleaning work, data checking, null value characteristic filling, data set segmentation and the like;
(2) Completing model training through the model training module C2. Model training is carried out on data set D3' in Python using the CatBoost algorithm. The government affair satisfaction data set is assumed to be
$$D_3' = \{(X_k, y_k)\}_{k=1,\dots,n},$$
where
$$X_k = (x_k^1, x_k^2, \dots, x_k^m)$$
is an index vector of $m$ government affair satisfaction features and
$$y_k \in \mathbb{R}$$
is the label value corresponding to the government affair satisfaction indexes. The CatBoost algorithm uses the mean label value of the samples that share the same category feature value, $\bar{x}_k^i$, namely
$$\bar{x}_k^i = \frac{\sum_{j=1}^{n} [x_j^i = x_k^i]\, y_j}{\sum_{j=1}^{n} [x_j^i = x_k^i]},$$
to derive the frequency with which each category feature appears, and the encoding forms a new numerical variable $\hat{x}_k^i$, namely
$$\hat{x}_k^i = \frac{\sum_{j=1}^{n} [x_j^i = x_k^i]\, y_j + \alpha P}{\sum_{j=1}^{n} [x_j^i = x_k^i] + \alpha},$$
where $[\cdot]$ denotes the indicator function, which returns 1 if $x_j^i = x_k^i$ is satisfied and 0 otherwise; $P$ is the prior value, a hyper-parameter; the parameter $\alpha$ ($\alpha > 0$) is the weight of the prior value; and $x_j^i$ and $y_j$ denote the $i$-th category variable index of the $j$-th sample and its corresponding label value, respectively.
After the automatic encoding is finished, the CatBoost algorithm replaces the conventional gradient estimation method with its own ordered boosting method: for each government affair satisfaction sample $D_k$ ($D_k \in D_3'$), a separate model $M_k$ is trained, finally yielding $M_n$; that is, an unbiased gradient estimate is obtained for each sample, and the final model is trained on that basis.
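The category encoding described above, a label mean smoothed by the prior value P with weight alpha, can be illustrated with a small pure-Python sketch. The function name and the toy data are assumptions. With `ordered=True` the statistic for a sample uses only the samples that precede it, which mirrors the ordered idea; CatBoost itself applies this over random permutations of the data, whereas the sketch simply uses the given order.

```python
def target_statistic(categories, labels, prior, alpha, ordered=False):
    """Encode a categorical column as (sum of matching labels + alpha*prior) / (count + alpha)."""
    encoded = []
    for k, cat in enumerate(categories):
        # ordered=True restricts the pool to the samples that precede sample k.
        pool = range(k) if ordered else range(len(categories))
        matches = [labels[j] for j in pool if categories[j] == cat]
        encoded.append((sum(matches) + alpha * prior) / (len(matches) + alpha))
    return encoded


# Made-up toy column: category values and binary satisfaction labels.
cats = ["a", "b", "a", "a"]
ys = [1, 0, 0, 1]
greedy = target_statistic(cats, ys, prior=0.5, alpha=1.0)
ordered = target_statistic(cats, ys, prior=0.5, alpha=1.0, ordered=True)
```

In the ordered variant the first occurrence of a category falls back to the prior, since no earlier sample shares its value; this is what keeps a sample's own label out of its encoding.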
(3) Completing model evaluation. The model trained with CatBoost in the government affair satisfaction scoring module is evaluated with four indexes, and the calculation result of each index is shown in FIG. 7. The four indexes are model training speed, precision, F1 value and AUC value; their measurement content and the calculation of precision, recall and the F1 value are as follows:
Model training speed refers to the time different algorithms take to train a model on the same data set in the same computing environment;
Precision refers to the proportion of the samples predicted as positive by the model whose government affair satisfaction label is truly positive, and is calculated as
$$\text{Precision} = \frac{TP}{TP + FP};$$
Recall refers to the proportion of the truly positive samples that the model identifies as positive, and is calculated as
$$\text{Recall} = \frac{TP}{TP + FN},$$
where $TP$, $FP$ and $FN$ are the numbers of true positives, false positives and false negatives, respectively;
The F1 value is the weighted harmonic mean of precision and recall; assuming equal weights,
$$F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}.$$
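The three classification metrics can be computed directly from predicted and true labels, as in the following sketch. The function name and the toy labels are made up for illustration; the AUC value and the training-time measurement are left out.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # Equal-weight harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Made-up labels: 1 = satisfied, 0 = not satisfied.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 1, 0, 1]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

Here there are three true positives, one false positive and one false negative, so precision and recall are both 3/4.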
(4) Constructing the government affair satisfaction scoring model. As shown in FIG. 8, the relative importance of the $n$ feature variables in data set D3' is obtained with the import () (feature importance) function of the CatBoost algorithm, the index contribution rates are determined, and the indexes are then weighted accordingly to obtain the scoring model. The specific steps are as follows:
The index contribution rates $f_i$ of the $n$ different indexes are obtained from the CatBoost model, and the weight of each index is obtained as $F_i = f_i / \sum_{j=1}^{n} f_j$. The resident food safety satisfaction scoring model is then constructed: letting the resident food safety satisfaction score be $W$ and the score of each index be $W_i$, the final government affair satisfaction scoring model is
$$W = F_1 W_1 + F_2 W_2 + \dots + F_i W_i + \dots + F_n W_n.$$
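The weighting and scoring step is a normalization followed by a weighted sum, as the sketch below shows. The index names and all numbers are made up for illustration; in the described system the contribution rates would come from the trained model.

```python
def index_weights(contributions):
    """Normalize contribution rates f_i into weights F_i = f_i / sum(f)."""
    total = sum(contributions.values())
    return {name: value / total for name, value in contributions.items()}


def satisfaction_score(weights, index_scores):
    """Weighted sum W = F_1*W_1 + ... + F_n*W_n."""
    return sum(weights[name] * index_scores[name] for name in weights)


# Hypothetical index names and contribution rates.
contributions = {"inspection": 50.0, "response": 30.0, "publicity": 20.0}
weights = index_weights(contributions)
score = satisfaction_score(
    weights, {"inspection": 80, "response": 90, "publicity": 70}
)
```

Because the weights sum to 1, the overall score stays on the same scale as the individual index scores.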
The systems, apparatuses, modules or units described in the above embodiments may be specifically implemented by a computer chip or an entity, or implemented by a product with certain functions. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of another identical element in a process, method, article, or apparatus that comprises the element.
The above examples are to be construed as merely illustrative and not limitative of the remainder of the disclosure. After reading the description of the invention, the skilled person can make various changes or modifications to the invention, and these equivalent changes and modifications also fall into the scope of the invention defined by the claims.

Claims (8)

1. A system for evaluating government satisfaction based on data security and privacy protection, comprising: a database module, a data security module and a data scoring module, wherein,
the database module is used for forming a database according to the collection and classification results of various government affair satisfaction evaluation data and providing the database to the data security module;
the data security module is used for performing access control, privacy protection and feedback early warning on government affair satisfaction evaluation data, completing security management work of the data and providing a processed data set for the data scoring module;
and the data scoring module is used for model training of the government affair satisfaction evaluation data, constructing a government affair satisfaction scoring model and outputting a scoring result.
2. The system for evaluating the degree of satisfaction of government affairs based on data security and privacy protection as claimed in claim 1, wherein the database module comprises a data acquisition module and a data classification module, wherein the data acquisition module is used for collecting various types of data of the degree of satisfaction of government affairs; the data classification module is used for dividing the collected government affair satisfaction data into monthly, quarterly and annual data according to time and further dividing the collected government affair satisfaction data into subdata sets including safe construction, legal construction and service evaluation according to the government affair subject content.
3. The government affair satisfaction evaluation system based on data security and privacy protection according to claim 1, wherein the data security module comprises an identity authentication module, a data desensitization module and a feedback early warning module, wherein the identity authentication module is used for performing identity authentication on a visitor and determining inquiry authority of the visitor; the data desensitization module is used for desensitizing a data set containing sensitive information; the feedback early warning module generates a log by recording user behaviors; and when the number of times of identity authentication refusal of the same user exceeds a set threshold value, locking the access port in time and feeding back an early warning to the terminal.
4. The system for evaluating the satisfaction degree of government affairs based on data security and privacy protection as claimed in claim 3, wherein the data security module performs the specific steps of access control, privacy protection and feedback early warning work as follows:
(1) When a user applies for accessing government affair satisfaction data resources, identity authentication is firstly carried out on the user, and if the authentication is not passed, access is refused; if the authentication is passed, the operation can be further carried out;
(2) Judging the data resource requested by the user: when the data resource does not contain sensitive data, the required data set is obtained according to the user's authority; when the requested data resource contains sensitive data, the sensitive information, including personal attributes of members of the public, is encrypted to obtain a desensitized data set;
(3) Recording the access process of the user to generate a corresponding log;
(4) And when the number of times of identity authentication refusal of the same user exceeds a set threshold value, locking the access port in time and feeding back an early warning to the terminal.
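The four steps of claim 4 can be sketched as a small access controller. This is a minimal illustration under stated assumptions, not the claimed implementation: the class `AccessController`, its in-memory credential store, the `protect` callback and the field names are all hypothetical, and permission-based filtering of the data set is omitted for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class AccessController:
    credentials: dict        # user -> secret; hypothetical in-memory store
    sensitive_fields: set    # names of fields treated as sensitive
    max_failures: int = 3    # the "set threshold" of rejected authentications
    failures: dict = field(default_factory=dict)
    locked: set = field(default_factory=set)
    log: list = field(default_factory=list)
    alerts: list = field(default_factory=list)

    def request(self, user, secret, records, protect):
        """Return the records the user may see, or None if access is refused."""
        if user in self.locked:
            self.log.append((user, "refused: locked"))
            return None
        # Step (1): identity authentication.
        if user not in self.credentials or self.credentials[user] != secret:
            self.failures[user] = self.failures.get(user, 0) + 1
            self.log.append((user, "refused: authentication failed"))
            # Step (4): lock and raise an early warning once the threshold is exceeded.
            if self.failures[user] > self.max_failures:
                self.locked.add(user)
                self.alerts.append(f"access locked for user {user!r}")
            return None
        # Step (2): protect sensitive fields; other fields pass through unchanged.
        result = [
            {k: protect(v) if k in self.sensitive_fields else v for k, v in r.items()}
            for r in records
        ]
        # Step (3): record the access in the log.
        self.log.append((user, "granted"))
        return result
```

A real deployment would verify credentials through a proper authentication service rather than comparing stored secrets, and would persist the log; both are simplified here.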
5. The government satisfaction evaluation system based on data security and privacy protection according to claim 3, wherein the data desensitization module is used for performing desensitization processing on a data set containing sensitive information, and comprises the following steps:
(1) The sensitive data set enters a data desensitization module;
(2) Determining a desensitization scheme, and desensitizing sensitive data by using modes including truncation, encryption, hiding and replacement;
(3) Writing desensitization rules, writing desensitization rule tables, wherein different desensitization rules correspond to different data encryption methods;
(4) According to the sensitive data category (name, identification card number, mobile phone number or address) and the desensitization scheme, the sensitive data are matched to the corresponding rule through primary-key association and desensitized according to the specified desensitization scheme;
(5) The desensitized data set is provided to a data scoring module.
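The desensitization steps above can be sketched as a rule table that maps each sensitive field to one of the named modes (truncation, encryption, hiding, replacement). The helper names, the rule assignments and the record layout are assumptions for illustration; in particular, a SHA-256 digest is used only as a stand-in for the "encryption" mode, since the source does not name a cipher.

```python
import hashlib


def truncate(value, keep=9):
    """Truncation: keep only the first `keep` characters."""
    return value[:keep]


def hide(value, head=3, tail=4):
    """Hiding: mask the middle of the value with asterisks."""
    if len(value) <= head + tail:
        return "*" * len(value)
    return value[:head] + "*" * (len(value) - head - tail) + value[-tail:]


def replace(value, placeholder="[REDACTED]"):
    """Replacement: substitute a fixed placeholder."""
    return placeholder


def digest(value):
    """Stand-in for encryption: a one-way SHA-256 digest (assumption)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


# Hypothetical desensitization rule table: field name -> rule.
RULES = {"name": replace, "id_number": digest, "phone": hide, "address": truncate}


def desensitize(records, rules=RULES, key="id"):
    """Apply the rule table to each record, indexed by its primary key."""
    out = {}
    for record in records:
        out[record[key]] = {
            f: rules[f](v) if f in rules else v for f, v in record.items()
        }
    return out
```

Keeping the rules in a table means a new sensitive category can be handled by adding one entry rather than changing the processing loop.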
6. The system for evaluating government satisfaction based on data security and privacy protection according to claim 1, wherein the data scoring module comprises a preprocessing module, a model training module and a scoring module, wherein the preprocessing module is used for preprocessing the accessed government satisfaction data set, including data cleaning, unbalanced data processing and data set segmentation, and providing the processed data set to the model training module; the model training module is used for performing model training on the data set, completing the model training by using a machine learning algorithm, namely the CatBoost algorithm, obtaining the government affair satisfaction index contribution degrees by using the feature-importance function of the CatBoost algorithm, further determining the index weights and providing the index weights to the scoring module; and the scoring module establishes a government affair satisfaction scoring model by using the index weights, completes the scoring of the related data and finally obtains a government affair satisfaction scoring result.
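The preprocessing stage named in claim 6 (data cleaning, unbalanced data processing, data set segmentation) can be sketched as follows. The concrete techniques are assumptions, since the claim does not name them: cleaning drops records with missing values, imbalance is handled by random oversampling of minority classes, and segmentation is a shuffled train/test split. All function names are hypothetical.

```python
import random


def clean(records):
    """Data cleaning: drop records that contain missing (None) values."""
    return [r for r in records if all(v is not None for v in r.values())]


def oversample(records, label_key, rng):
    """Unbalanced data processing: randomly duplicate minority-class records."""
    if not records:
        return []
    by_label = {}
    for r in records:
        by_label.setdefault(r[label_key], []).append(r)
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    return balanced


def split(records, test_ratio, rng):
    """Data set segmentation: shuffled split into training and test sets."""
    shuffled = records[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_ratio))
    return shuffled[:cut], shuffled[cut:]
```

The sketch follows the order given in the claim. In practice, oversampling is often applied to the training split only, so that duplicated records do not leak into the test set.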
7. The system for evaluating government satisfaction based on data security and privacy protection according to claim 6, wherein completing the model training by using a machine learning algorithm, namely the CatBoost algorithm, obtaining the government satisfaction index contribution degrees by using the feature-importance function of the CatBoost algorithm, determining the index weights and providing the index weights to the scoring module specifically comprises:
assume a government satisfaction data set $D = \{(X_k, Y_k)\}_{k=1,2,\dots,n}$, where $X_k = (x_k^1, x_k^2, \dots, x_k^m)$ is an index vector containing m government affair satisfaction features, $Y_k = (y_1, y_2, \dots, y_k)$, and $y_k \in \mathbb{R}$ is the label value of the corresponding government satisfaction index. The CatBoost algorithm uses the mean of the label values of feature data of the same class, $\hat{x}_k^i$ (that is, $\hat{x}_k^i \approx E(y \mid x^i = x_k^i)$), to infer the frequency with which each class feature appears, and encodes it as a new numerical variable $\hat{x}_k^i$, namely
$$\hat{x}_k^i = \frac{\sum_{j=1}^{n} [x_j^i = x_k^i] \cdot y_j + \alpha P}{\sum_{j=1}^{n} [x_j^i = x_k^i] + \alpha}$$
where $[\cdot]$ denotes the indicator function, which returns 1 if $x_j^i = x_k^i$ is satisfied and 0 otherwise; $P$ is the prior value, a hyperparameter; the parameter $\alpha$ ($\alpha > 0$) is the weight of the prior value; $x_j^i$ and $y_j$ denote the j-th category variable index and its corresponding label value, respectively;
after the automatic encoding is finished, the CatBoost algorithm replaces the conventional gradient estimation method with its own ordered boosting method: for each government affair satisfaction sample $D_k$ ($D_k \in D$), a separate model $M_i$ is trained, finally obtaining $M_n$; in this way an unbiased gradient estimate of the samples is found, and the final model is obtained through training.
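The categorical encoding described in this claim can be illustrated with a short sketch of target-statistic encoding with a prior value P and prior weight α. Two variants are shown: one that sums over all n samples, and an ordered variant that uses only the samples preceding the current one, in the spirit of the ordered scheme CatBoost is known for. The function names are hypothetical, the ordered variant uses the given sample order rather than a random permutation, and neither function is the CatBoost library's own code.

```python
def target_statistic(categories, labels, prior, alpha):
    """Encode each category as (sum of its labels + alpha*prior) / (count + alpha)."""
    sums, counts = {}, {}
    for c, y in zip(categories, labels):
        sums[c] = sums.get(c, 0.0) + y
        counts[c] = counts.get(c, 0) + 1
    return [(sums[c] + alpha * prior) / (counts[c] + alpha) for c in categories]


def ordered_target_statistic(categories, labels, prior, alpha):
    """Same statistic, but each sample only sees the samples that precede it."""
    sums, counts, encoded = {}, {}, []
    for c, y in zip(categories, labels):
        encoded.append((sums.get(c, 0.0) + alpha * prior) / (counts.get(c, 0) + alpha))
        # Update the running totals only after encoding the current sample,
        # so a sample's own label never contributes to its encoding.
        sums[c] = sums.get(c, 0.0) + y
        counts[c] = counts.get(c, 0) + 1
    return encoded
```

With α > 0 the denominator is never zero, so a category seen for the first time is simply encoded as the prior value P.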
8. A government satisfaction evaluation method based on data security and privacy protection according to the system of any one of claims 1 to 7, comprising the steps of:
collecting and classifying various government affair satisfaction degree evaluation data by using a database module to form a database and provide the database for a data security module;
performing access control, privacy protection and feedback early warning on government affair satisfaction evaluation data by using the data security module, completing security management work of the data and providing a processed data set for the data scoring module;
and performing model training on the government affair satisfaction evaluation data by using the data scoring module, constructing a government affair satisfaction scoring model, and outputting a scoring result.
CN202211235305.1A 2022-10-10 2022-10-10 Government satisfaction evaluation system and method based on data security and privacy protection Active CN115713249B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211235305.1A CN115713249B (en) 2022-10-10 2022-10-10 Government satisfaction evaluation system and method based on data security and privacy protection

Publications (2)

Publication Number Publication Date
CN115713249A true CN115713249A (en) 2023-02-24
CN115713249B CN115713249B (en) 2023-06-13

Family

ID=85230949

Country Status (1)

Country Link
CN (1) CN115713249B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108304725A (en) * 2018-02-09 2018-07-20 山东汇贸电子口岸有限公司 A kind of method and system to the desensitization of government data resource
CN109189826A (en) * 2018-08-14 2019-01-11 北京新广视通科技有限公司 A kind of government affairs service system based on big data
CN111222753A (en) * 2019-12-17 2020-06-02 合肥工业大学 E-government performance evaluation system and method
CN111582653A (en) * 2020-04-14 2020-08-25 五邑大学 Government affair service evaluation processing method, system, device and storage medium
CN111603161A (en) * 2020-05-28 2020-09-01 苏州小蓝医疗科技有限公司 Electroencephalogram classification method
CN113850483A (en) * 2021-09-10 2021-12-28 百维金科(上海)信息科技有限公司 Enterprise credit risk rating system
CN114219688A (en) * 2021-12-06 2022-03-22 安徽长泰科技有限公司 Government affair data supervision system for ensuring information safety
CN115130122A (en) * 2022-06-12 2022-09-30 四川云云旺软件技术有限公司 Big data security protection method and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
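Shao Yiming; Zhong Ying; Wu Wenwen; Hu Guangxue: "Comprehensive performance evaluation of short-term traffic flow prediction models based on the entropy-weight TOPSIS method", Journal of Chongqing University of Technology (Natural Science), no. 07, pages 213 - 219 *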

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant