CN115757384A - Government affair data processing method based on big data

Info

Publication number: CN115757384A
Application number: CN202211520308.XA
Authority: CN
Inventors: 李娟
Original assignee: Anhui Changzheng Think Tank Management Consulting Co ltd
Current assignee: Anhui Changzheng Think Tank Management Consulting Co ltd
Priority date: 2022-11-30
Filing date: 2022-11-30
Publication date: 2023-03-07

Abstract

The invention relates to data processing, and in particular to a big data-based government affair data processing method. The method comprises: constructing a government affair data sharing model for multi-place joint office, using the government affair data sharing model to detect whether abnormal data exist in the government affair data, and removing the abnormal data; determining a data optimization model for optimizing the government affair data from which the abnormal data have been removed; performing optimization training on the data optimization model until an optimal data optimization model is obtained; and optimizing the government affair data with the optimal data optimization model to obtain high-quality government affair data. The technical scheme provided by the invention addresses the defects of the prior art that abnormal data in government affair data cannot be effectively removed and that high-quality government affair data cannot be obtained from the government affair data through optimization.

Description

Government affair data processing method based on big data

Technical Field

The invention relates to data processing, in particular to a government affair data processing method based on big data.

Background

Government affairs are the administrative work of the government. Once a government affair has been deliberated and settled, it must be published so that the public and government staff can learn of it in good time. With the development of internet technology, the publication of government affairs has expanded from paper documents to the network. Over a long period, government departments have produced and recorded a large amount of government affair data, which is an important basis for their daily management. Government affair data are large in volume, varied in type, wide in source and complex in format, and with the development of big data and the internet, the government has an ever greater demand for mining the value of each department's government affair data.

In recent years, with technical progress and policy guidance, digital government reform has called for breaking down data islands; improving work efficiency requires data docking among government departments; increasingly severe security problems require that the data of all government departments be shared effectively; and the optimization and upgrading of convenience services requires the departments to cooperate with one another. Governments at all levels are therefore actively promoting cross-department government affair data sharing.

However, cross-department government affair data sharing is a very complex undertaking. How to effectively remove abnormal data from government affair data, and how to fully mine the value of the data by obtaining high-quality government affair data through optimization, are problems that urgently need to be solved in government affair data sharing.

Disclosure of Invention

(I) Technical problem to be solved

In view of the defects of the prior art, the invention provides a big data-based government affair data processing method, which addresses the defects that abnormal data in government affair data cannot be effectively removed and that high-quality government affair data cannot be obtained from the government affair data through optimization.

(II) Technical scheme

In order to achieve the purpose, the invention is realized by the following technical scheme:

A big data-based government affair data processing method comprises the following steps:

S1, constructing a government affair data sharing model for multi-place joint office, using the government affair data sharing model to detect whether abnormal data exist in the government affair data, and removing the abnormal data;

S2, determining a data optimization model for optimizing the government affair data from which the abnormal data have been removed;

S3, performing optimization training on the data optimization model until an optimal data optimization model is obtained;

S4, optimizing the government affair data with the optimal data optimization model to obtain high-quality government affair data.

Preferably, constructing the government affair data sharing model for multi-place joint office in S1 comprises:

acquiring government affair nodes, and using a consensus algorithm in the blockchain to randomly select one of the government affair nodes as the aggregation node of the current round;

after data dimension reduction is performed, constructing an isolation forest based on the aggregation node, and completing node aggregation after the isolated parameter vectors are rejected;

uploading the hash values of the vectors that remain after the isolated parameter vectors are rejected to the blockchain, and sending the vector source data to the participating nodes of the next round, until construction of the government affair data sharing model is completed.
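One round of the aggregation procedure above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: a seeded `random.Random` stands in for the blockchain consensus algorithm, a plain Python list stands in for the blockchain, SHA-256 over a JSON serialization is an assumed hashing scheme, and the names `vector_hash`, `run_round` and the node identifiers are hypothetical.

```python
import hashlib
import json
import random


def vector_hash(vector):
    """SHA-256 digest of a parameter vector, serialized deterministically."""
    payload = json.dumps(vector, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def run_round(nodes, vectors, reject, ledger, rng):
    """Run one round: pick an aggregation node, drop isolated vectors, record hashes.

    nodes   -- list of node identifiers (hypothetical names)
    vectors -- dict mapping node id -> parameter vector (list of floats)
    reject  -- callable returning the set of node ids whose vectors are isolated
    ledger  -- list standing in for the blockchain (assumption)
    rng     -- random.Random standing in for the consensus algorithm (assumption)
    """
    aggregator = rng.choice(nodes)
    isolated = reject(vectors)
    kept = {n: v for n, v in vectors.items() if n not in isolated}
    for node_id, vec in sorted(kept.items()):
        # Only the hash goes on the ledger; the source vectors travel separately.
        ledger.append({"aggregator": aggregator, "node": node_id,
                       "hash": vector_hash(vec)})
    return aggregator, kept  # kept vectors are forwarded to the next round's nodes


# Usage with three hypothetical nodes, one of which is flagged as isolated.
ledger = []
vectors = {"node_a": [0.1, 0.2], "node_b": [0.1, 0.3], "node_c": [9.0, 9.0]}
aggregator, kept = run_round(list(vectors), vectors,
                             lambda v: {"node_c"}, ledger, random.Random(0))
```

Storing only the digest on the ledger lets later rounds check that forwarded source vectors were not altered, without publishing the vectors themselves.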

Preferably, performing the data dimension reduction comprises:

for each dimension of the data vector, acquiring, by the aggregation government affair node, the corresponding values of that dimension, and performing data dimension reduction by using a dimension reduction graph method.

Preferably, constructing the isolation forest based on the aggregation node comprises:

aggregating the government affair nodes based on the aggregation node, and constructing an isolation forest with k isolation trees from the dimension-reduced data set to obtain the isolated parameter vectors.
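The patent does not describe how the isolation forest is built or scored. The following pure-Python sketch follows the widely used isolation forest formulation (random splits, with a score derived from the average path length) and is offered only as an assumption about what "an isolation forest with k isolation trees" could look like. The function names, the default subsample size of 256 and any score threshold chosen by the caller are illustrative.

```python
import math
import random


def build_tree(points, depth, max_depth, rng):
    """Recursively split points on a random dimension at a random threshold."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {"dim": dim, "split": split,
            "left": build_tree(left, depth + 1, max_depth, rng),
            "right": build_tree(right, depth + 1, max_depth, rng)}


def avg_path(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def path_length(point, tree, depth=0):
    """Depth at which the point lands, plus the expected depth of the unsplit leaf."""
    if "dim" not in tree:
        return depth + avg_path(tree["size"])
    branch = "left" if point[tree["dim"]] < tree["split"] else "right"
    return path_length(point, tree[branch], depth + 1)


def anomaly_scores(points, k=100, sample_size=256, seed=0):
    """Score each point in (0, 1); higher scores mean the point is easier to isolate."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    rng = random.Random(seed)
    psi = min(sample_size, len(points))
    max_depth = math.ceil(math.log2(psi))
    trees = [build_tree(rng.sample(points, psi), 0, max_depth, rng)
             for _ in range(k)]
    scores = []
    for p in points:
        mean_path = sum(path_length(p, t) for t in trees) / k
        scores.append(2.0 ** (-mean_path / avg_path(psi)))
    return scores


# Usage: a tight cluster of vectors plus one distant vector.
cluster = [(x / 10.0, y / 10.0) for x in range(5) for y in range(4)]
scores = anomaly_scores(cluster + [(10.0, 10.0)], k=50, seed=1)
```

Vectors whose score exceeds a caller-chosen threshold would be treated as the isolated parameter vectors; the threshold itself is not specified in the source.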

Preferably, rejecting the isolated parameter vectors comprises:

calculating, by the government affair node, a box plot function for each dimension of the data vector;

if all the function values in a dimension fall within the set range, or more than half of the function values fall outside the set range, rejecting the data vector in that dimension.
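The rejection rule above can be illustrated with a short sketch. Several points here are assumptions rather than statements from the source: the "set range" is supplied by the caller as a (low, high) pair per dimension, "rejecting the data vector in that dimension" is read as dropping that dimension, and `tukey_fences` shows just one conventional way to derive a range from a box plot (quartiles plus or minus 1.5 times the interquartile range).

```python
import statistics


def tukey_fences(values, whisker=1.5):
    """Box-plot fences: (Q1 - whisker * IQR, Q3 + whisker * IQR)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - whisker * iqr, q3 + whisker * iqr


def dimensions_to_reject(vectors, ranges):
    """Return the indices of dimensions that meet the stated rejection rule.

    vectors -- list of equal-length numeric tuples, one per government affair node
    ranges  -- list of (low, high) pairs, the "set range" for each dimension
    """
    rejected = []
    for dim, (low, high) in enumerate(ranges):
        column = [v[dim] for v in vectors]
        outside = sum(1 for x in column if not (low <= x <= high))
        # Rule from the text: all values inside, or more than half outside.
        if outside == 0 or outside > len(column) / 2:
            rejected.append(dim)
    return rejected


# Usage with four hypothetical node vectors and preset ranges.
vectors = [(1, 10, 100), (2, 11, 500), (3, 12, 600), (4, 50, 700)]
ranges = [(0, 5), (0, 20), (0, 150)]
rejected = dimensions_to_reject(vectors, ranges)
```

In this example dimension 0 is rejected because every value is in range, dimension 2 because three of four values are out of range, and dimension 1 is kept because only one value is out of range.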

Preferably, determining in S2 the data optimization model for optimizing the government affair data from which the abnormal data have been removed comprises:

selecting, according to the government affair data, a model file containing a default network structure and hyperparameters, and determining an algorithm file of an iterative algorithm according to a preset expected loss value.

Preferably, performing optimization training on the data optimization model in S3 until the optimal data optimization model is obtained comprises:

training the data optimization model with a tuner to obtain a target data optimization model;

evaluating the target data optimization model with an evaluator, based on the model parameters of the target data optimization model, to obtain a model evaluation result;

initializing the target data optimization model with the tuner based on the model evaluation result, and cyclically optimizing the target data optimization model with the tuner and the evaluator until a preset convergence condition is reached, to obtain the optimal data optimization model.
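The tuner and evaluator loop just described can be sketched generically. The source does not define the tuner, the evaluator or the convergence condition, so everything concrete below is an assumption: the model is reduced to a dictionary of parameters, the evaluator returns a loss where lower is better, and convergence means the improvement between rounds drops below a tolerance. The names `optimize`, `tune` and `evaluate` are hypothetical.

```python
def optimize(params, tune, evaluate, tol=1e-6, max_rounds=100):
    """Alternate tuner and evaluator until the score stops improving.

    params   -- initial model parameters (here a plain dict)
    tune     -- callable: parameters -> new candidate parameters (the tuner)
    evaluate -- callable: parameters -> loss, lower is better (the evaluator)
    """
    best_params, best_score = params, evaluate(params)
    for _ in range(max_rounds):
        candidate = tune(best_params)   # tuner trains from the current starting point
        score = evaluate(candidate)     # evaluator scores the trained candidate
        improvement = best_score - score
        if score < best_score:
            # Re-initialize the next round from the better-scoring parameters.
            best_params, best_score = candidate, score
        if 0 <= improvement < tol:
            break                       # assumed convergence condition
    return best_params, best_score


# Usage on a toy problem: minimize (w - 3)^2 with a gradient-step tuner.
def toy_evaluate(p):
    return (p["w"] - 3.0) ** 2


def toy_tune(p):
    gradient = 2.0 * (p["w"] - 3.0)
    return {"w": p["w"] - 0.2 * gradient}


best, score = optimize({"w": 0.0}, toy_tune, toy_evaluate)
```

Passing the tuner and evaluator in as callables keeps the loop independent of how either one is implemented, which matches the text's separation of the two roles.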

Preferably, training the data optimization model with the tuner to obtain the target data optimization model comprises:

training the data optimization model with the tuner according to a preset optimization mode to obtain the target data optimization model;

the preset optimization mode comprises a reinforcement learning mode, a non-number-of-candidates optimization mode and a heuristic search mode.
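Of the optimization modes listed, heuristic search is the simplest to illustrate. The sketch below uses random-perturbation hill climbing, which is one common heuristic search and is chosen here as an assumption; the source does not say which heuristic is meant. The function name `hill_climb`, the step size and the iteration count are illustrative.

```python
import random


def hill_climb(loss, start, step=0.5, iterations=200, seed=0):
    """Random-perturbation hill climbing over a dict of numeric parameters."""
    rng = random.Random(seed)
    best = dict(start)
    best_loss = loss(best)
    for _ in range(iterations):
        # Propose a nearby candidate by nudging every parameter at random.
        candidate = {k: v + rng.uniform(-step, step) for k, v in best.items()}
        candidate_loss = loss(candidate)
        if candidate_loss < best_loss:
            best, best_loss = candidate, candidate_loss  # keep only improvements
    return best, best_loss


# Usage on a toy loss whose minimum lies at x = 1, y = -2.
def toy_loss(p):
    return (p["x"] - 1.0) ** 2 + (p["y"] + 2.0) ** 2


best, best_loss = hill_climb(toy_loss, {"x": 0.0, "y": 0.0})
```

Because only improving candidates are accepted, the returned loss never exceeds the loss at the starting point, although the search can still stall in a local minimum on less regular problems.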

Preferably, evaluating the target data optimization model with the evaluator, based on the model parameters of the target data optimization model, to obtain the model evaluation result comprises:

evaluating the target data optimization model with the evaluator according to a preset evaluation mode, based on the model parameters of the target data optimization model, to obtain the model evaluation result.

Preferably, initializing the target data optimization model with the tuner based on the model evaluation result comprises:

determining the optimal model parameters corresponding to the model evaluation result by an empirical learning algorithm, and initializing the target data optimization model with the tuner based on the optimal model parameters.

(III) Advantageous effects

Compared with the prior art, the government affair data processing method based on big data has the following beneficial effects:

1) A government affair data sharing model for multi-place joint office is constructed and used to detect whether abnormal data exist in the government affair data, and the abnormal data are removed. The abnormal data in the government affair data can thus be removed effectively, which supports the subsequent optimization of the government affair data and the accuracy of the high-quality government affair data obtained;

2) A data optimization model for optimizing the government affair data from which the abnormal data have been removed is determined, the data optimization model is trained until an optimal data optimization model is obtained, and the government affair data are optimized with the optimal data optimization model to obtain high-quality government affair data. High-quality government affair data can thus be obtained from the government affair data through optimization, which provides a data guarantee for fully mining the value of the government affair data.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. It is obvious that the drawings in the following description are only some embodiments of the invention, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.

FIG. 1 is a schematic flow diagram of the present invention;

FIG. 2 is a schematic flow diagram of removing abnormal data from government affair data according to the present invention;

FIG. 3 is a schematic flow diagram of obtaining high-quality government affair data through optimization according to the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention. It is to be understood that the embodiments described are only a few embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

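As shown in FIG. 1 and FIG. 2, a big data-based government affair data processing method comprises the following steps. (1) A government affair data sharing model for multi-place joint office is constructed, the government affair data sharing model is used to detect whether abnormal data exist in the government affair data, and the abnormal data are removed.

Constructing the government affair data sharing model for multi-place joint office comprises:

acquiring government affair nodes, and using a consensus algorithm in the blockchain to randomly select one of the government affair nodes as the aggregation node of the current round;

after data dimension reduction is performed, constructing an isolation forest based on the aggregation node, and completing node aggregation after the isolated parameter vectors are rejected;

uploading the hash values of the vectors that remain after the isolated parameter vectors are rejected to the blockchain, and sending the vector source data to the participating nodes of the next round, until construction of the government affair data sharing model is completed.

1) Performing data dimension reduction comprises:

for each dimension of the data vector, acquiring, by the aggregation government affair node, the corresponding values of that dimension, and performing data dimension reduction by using a dimension reduction graph method.

2) Constructing the isolation forest based on the aggregation node comprises:

aggregating the government affair nodes based on the aggregation node, and constructing an isolation forest with k isolation trees from the dimension-reduced data set to obtain the isolated parameter vectors.

3) Rejecting the isolated parameter vectors comprises:

calculating, by the government affair node, a box plot function for each dimension of the data vector;

if all the function values in a dimension fall within the set range, or more than half of the function values fall outside the set range, rejecting the data vector in that dimension.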

In this technical scheme, a government affair data sharing model for multi-place joint office is constructed and used to detect whether abnormal data exist in the government affair data, and the abnormal data are removed. The abnormal data in the government affair data can thus be removed effectively, which supports the subsequent optimization of the government affair data and the accuracy of the high-quality government affair data obtained.

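As shown in FIG. 1 and FIG. 3, (2) determining a data optimization model for optimizing the government affair data from which the abnormal data have been removed specifically comprises:

selecting, according to the government affair data, a model file containing a default network structure and hyperparameters, and determining an algorithm file of an iterative algorithm according to a preset expected loss value.

(3) Performing optimization training on the data optimization model until the optimal data optimization model is obtained specifically comprises:

1) Training the data optimization model with the tuner to obtain a target data optimization model comprises:

training the data optimization model with the tuner according to a preset optimization mode to obtain the target data optimization model;

the preset optimization mode comprises a reinforcement learning mode, a non-number-of-candidates optimization mode and a heuristic search mode.

2) Evaluating the target data optimization model with the evaluator, based on the model parameters of the target data optimization model, to obtain a model evaluation result comprises:

evaluating the target data optimization model with the evaluator according to a preset evaluation mode, based on the model parameters of the target data optimization model, to obtain the model evaluation result.

3) Initializing the target data optimization model with the tuner based on the model evaluation result comprises:

determining the optimal model parameters corresponding to the model evaluation result by an empirical learning algorithm, and initializing the target data optimization model with the tuner based on the optimal model parameters.

The tuner and the evaluator then cyclically optimize the target data optimization model until a preset convergence condition is reached, to obtain the optimal data optimization model.

(4) The government affair data are optimized with the optimal data optimization model to obtain high-quality government affair data.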

In this technical scheme, a data optimization model for optimizing the government affair data from which the abnormal data have been removed is determined, the data optimization model is trained until an optimal data optimization model is obtained, and the government affair data are optimized with the optimal data optimization model to obtain high-quality government affair data. High-quality government affair data can thus be obtained from the government affair data through optimization, which provides a data guarantee for fully mining the value of the government affair data.

The above examples are only intended to illustrate the technical solution of the present invention, and not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims

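1. A big data-based government affair data processing method, characterized in that the method comprises the following steps:

S1, constructing a government affair data sharing model for multi-place joint office, using the government affair data sharing model to detect whether abnormal data exist in the government affair data, and removing the abnormal data;

S2, determining a data optimization model for optimizing the government affair data from which the abnormal data have been removed;

S3, performing optimization training on the data optimization model until an optimal data optimization model is obtained;

S4, optimizing the government affair data with the optimal data optimization model to obtain high-quality government affair data.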

2. The big data-based government affair data processing method according to claim 1, wherein constructing the government affair data sharing model for multi-place joint office in S1 comprises:

acquiring government affair nodes, and using a consensus algorithm in the blockchain to randomly select one of the government affair nodes as the aggregation node of the current round;

after data dimension reduction is performed, constructing an isolation forest based on the aggregation node, and completing node aggregation after the isolated parameter vectors are rejected;

uploading the hash values of the vectors that remain after the isolated parameter vectors are rejected to the blockchain, and sending the vector source data to the participating nodes of the next round, until construction of the government affair data sharing model is completed.

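3. The big data-based government affair data processing method according to claim 2, wherein performing the data dimension reduction comprises:

for each dimension of the data vector, acquiring, by the aggregation government affair node, the corresponding values of that dimension, and performing data dimension reduction by using a dimension reduction graph method.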

4. The big data-based government affair data processing method according to claim 3, wherein constructing the isolation forest based on the aggregation node comprises:

aggregating the government affair nodes based on the aggregation node, and constructing an isolation forest with k isolation trees from the dimension-reduced data set to obtain the isolated parameter vectors.

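5. The big data-based government affair data processing method according to claim 4, wherein rejecting the isolated parameter vectors comprises:

calculating, by the government affair node, a box plot function for each dimension of the data vector;

if all the function values in a dimension fall within the set range, or more than half of the function values fall outside the set range, rejecting the data vector in that dimension.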

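6. The big data-based government affair data processing method according to claim 1, wherein determining in S2 the data optimization model for optimizing the government affair data from which the abnormal data have been removed comprises:

selecting, according to the government affair data, a model file containing a default network structure and hyperparameters, and determining an algorithm file of an iterative algorithm according to a preset expected loss value.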

7. The big data-based government affair data processing method according to claim 1, wherein performing optimization training on the data optimization model in S3 until the optimal data optimization model is obtained comprises:

training the data optimization model with a tuner to obtain a target data optimization model;

evaluating the target data optimization model with an evaluator, based on the model parameters of the target data optimization model, to obtain a model evaluation result;

initializing the target data optimization model with the tuner based on the model evaluation result, and cyclically optimizing the target data optimization model with the tuner and the evaluator until a preset convergence condition is reached, to obtain the optimal data optimization model.

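8. The big data-based government affair data processing method according to claim 7, wherein training the data optimization model with the tuner to obtain the target data optimization model comprises:

training the data optimization model with the tuner according to a preset optimization mode to obtain the target data optimization model;

the preset optimization mode comprises a reinforcement learning mode, a non-number-of-candidates optimization mode and a heuristic search mode.

9. The big data-based government affair data processing method according to claim 7, wherein evaluating the target data optimization model with the evaluator, based on the model parameters of the target data optimization model, to obtain the model evaluation result comprises:

evaluating the target data optimization model with the evaluator according to a preset evaluation mode, based on the model parameters of the target data optimization model, to obtain the model evaluation result.

10. The big data-based government affair data processing method according to claim 7, wherein initializing the target data optimization model with the tuner based on the model evaluation result comprises:

determining the optimal model parameters corresponding to the model evaluation result by an empirical learning algorithm, and initializing the target data optimization model with the tuner based on the optimal model parameters.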