CN115757384A - Government affair data processing method based on big data - Google Patents
- Publication number
- CN115757384A (application CN202211520308.XA)
- Authority
- CN
- China
- Prior art keywords
- data
- model
- government affair
- optimization model
- government
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention relates to data processing, and in particular to a big data-based government affair data processing method. The method comprises: constructing a government affair data sharing model for multi-site joint office, using the government affair data sharing model to detect whether the government affair data contain abnormal data, and removing the abnormal data; determining a data optimization model for selecting from the government affair data from which the abnormal data have been removed; training the data optimization model until an optimal data optimization model is obtained; and applying the optimal data optimization model to the government affair data to obtain high-quality government affair data. The technical scheme provided by the invention can effectively overcome the defects of the prior art, in which abnormal data cannot be effectively removed from government affair data and high-quality government affair data cannot be selected from the government affair data.
Description
Technical Field
The invention relates to data processing, in particular to a government affair data processing method based on big data.
Background
Government affairs are the work related to the government. Once a government affair has been deliberated and settled, it must be published so that the public and government staff learn of it promptly. With the development of internet technology, the publication of government affairs has expanded from paper documents to the network. Over time, government departments have generated and recorded a large amount of government affair data, which is an important basis for their daily management. Government affair data is large in volume, varied in type, wide in source and complex in format, and with the development of big data and the internet, governments have a growing demand for mining the value of each department's government affair data.
In recent years, with technological progress and policy guidance, digital government reform has called for breaking down data silos. Improving work efficiency requires data interfacing between government departments; increasingly severe security problems require that departmental data be shared effectively; and upgrading convenience services requires the departments to cooperate with one another. Governments at all levels are therefore actively promoting cross-department government affair data sharing.
However, cross-department government affair data sharing is a very complex undertaking. How to effectively remove abnormal data from government affair data, and how to fully mine the value of the data by selecting high-quality government affair data, are problems that urgently need to be solved in current government affair data sharing.
Disclosure of Invention
(I) Technical problem to be solved
In view of the defects of the prior art, the invention provides a big data-based government affair data processing method, which can effectively overcome the defects that abnormal data cannot be effectively removed from government affair data and that high-quality government affair data cannot be selected from the government affair data.
(II) Technical scheme
To achieve this purpose, the invention is realized by the following technical scheme:
A big data-based government affair data processing method comprises the following steps:
S1, constructing a government affair data sharing model for multi-site joint office, detecting whether the government affair data contain abnormal data by using the government affair data sharing model, and removing the abnormal data;
S2, determining a data optimization model for selecting from the government affair data from which the abnormal data have been removed;
S3, training the data optimization model until an optimal data optimization model is obtained;
S4, applying the optimal data optimization model to the government affair data to obtain high-quality government affair data.
Preferably, constructing the government affair data sharing model for multi-site joint office in S1 comprises:
acquiring the government affair nodes, and randomly selecting one of them as the aggregation node of the current round by using a consensus algorithm of a blockchain;
after data dimension reduction, constructing an isolation forest based on the aggregation node, and completing node aggregation after the isolated parameter vectors have been rejected;
uploading the hash values of the vectors that remain after the isolated parameter vectors have been rejected to the blockchain, and sending the vector source data to the participating nodes of the next round, until the construction of the government affair data sharing model is completed.
Preferably, the data dimension reduction comprises:
for each dimension of the data vector, the aggregation government affair node acquires the corresponding value of that dimension, and performs data dimension reduction by using a dimension-reduction graph method.
Preferably, constructing the isolation forest based on the aggregation node comprises:
aggregating the government affair nodes based on the aggregation node, and constructing an isolation forest with k isolation trees from the dimension-reduced data set to obtain the isolated parameter vectors.
Preferably, rejecting the isolated parameter vectors comprises:
the government affair node calculating a box plot function for each dimension of the data vector;
if, in a dimension, all function values fall within the set range, or more than half of the function values fall outside the set range, rejecting the data vector in that dimension.
Preferably, determining in S2 the data optimization model for selecting from the government affair data from which the abnormal data have been removed comprises:
selecting, according to the government affair data, a model file containing a default network structure and hyperparameters, and determining an algorithm file of an iterative algorithm according to a preset expected loss value.
Preferably, training the data optimization model in S3 until the optimal data optimization model is obtained comprises:
training the data optimization model by using a tuner to obtain a target data optimization model;
evaluating the target data optimization model by using an evaluator, based on the model parameters of the target data optimization model, to obtain a model evaluation result;
initializing the target data optimization model by using the tuner based on the model evaluation result, and cyclically optimizing the target data optimization model by using the tuner and the evaluator until a preset convergence condition is reached, to obtain the optimal data optimization model.
Preferably, training the data optimization model by using the tuner to obtain the target data optimization model comprises:
training the data optimization model by using the tuner according to a preset optimization mode to obtain the target data optimization model;
the preset optimization mode comprises a reinforcement learning mode, a non-number-of-candidates optimization mode and a heuristic search mode.
Preferably, evaluating the target data optimization model by using the evaluator, based on the model parameters of the target data optimization model, to obtain the model evaluation result comprises:
evaluating the target data optimization model by using the evaluator according to a preset evaluation mode, based on the model parameters of the target data optimization model, to obtain the model evaluation result.
Preferably, initializing the target data optimization model by using the tuner based on the model evaluation result comprises:
determining the optimal model parameters corresponding to the model evaluation result by using an empirical learning algorithm, and initializing the target data optimization model by using the tuner based on the optimal model parameters.
(III) Advantageous effects
Compared with the prior art, the big data-based government affair data processing method has the following beneficial effects:
1) A government affair data sharing model for multi-site joint office is constructed and used to detect whether the government affair data contain abnormal data and to remove them. Abnormal data can thus be effectively removed from the government affair data, which supports the subsequent selection step and the accuracy of the high-quality government affair data obtained;
2) A data optimization model for selecting from the government affair data from which the abnormal data have been removed is determined and trained until an optimal data optimization model is obtained, and the optimal data optimization model is applied to the government affair data to obtain high-quality government affair data. High-quality government affair data can thus be selected from the government affair data, which provides a data guarantee for fully mining the value of the government affair data.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. It is obvious that the drawings in the following description are only some embodiments of the invention, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.
FIG. 1 is a schematic flow diagram of the present invention;
FIG. 2 is a schematic flow diagram of removing abnormal data from the government affair data according to the present invention;
FIG. 3 is a schematic flow diagram of selecting high-quality government affair data according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention. It is to be understood that the embodiments described are only a few embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
A big data-based government affair data processing method is disclosed. As shown in FIG. 1 and FIG. 2, (1) a government affair data sharing model for multi-site joint office is constructed, the government affair data sharing model is used to detect whether the government affair data contain abnormal data, and the abnormal data are removed.
Constructing the government affair data sharing model for multi-site joint office comprises:
acquiring the government affair nodes, and randomly selecting one of them as the aggregation node of the current round by using a consensus algorithm of a blockchain;
after data dimension reduction, constructing an isolation forest based on the aggregation node, and completing node aggregation after the isolated parameter vectors have been rejected;
uploading the hash values of the vectors that remain after the isolated parameter vectors have been rejected to the blockchain, and sending the vector source data to the participating nodes of the next round, until the construction of the government affair data sharing model is completed.
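The round structure described above (pick one aggregation node, then record hashes of the retained vectors) can be illustrated with a minimal sketch. It is not the patented implementation: the functions `select_aggregation_node`, `vector_hash` and `run_round`, the shared `round_seed` that stands in for the blockchain consensus algorithm, and the in-memory `ledger` list that stands in for the blockchain are all assumptions introduced for illustration.

```python
import hashlib
import json
import random


def select_aggregation_node(node_ids, round_seed):
    """Pick one node as the aggregation node of this round.

    Assumption: a seed agreed on by all nodes replaces the consensus
    algorithm, so every node derives the same choice independently.
    """
    rng = random.Random(round_seed)
    return rng.choice(sorted(node_ids))


def vector_hash(vector):
    """Return the SHA-256 hex digest of a canonical JSON encoding of the vector."""
    payload = json.dumps([float(x) for x in vector], separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def run_round(node_ids, kept_vectors, round_seed, ledger):
    """Select the aggregation node and append the hashes of the kept vectors.

    `ledger` is a plain list standing in for the blockchain (assumption).
    """
    aggregator = select_aggregation_node(node_ids, round_seed)
    ledger.append({
        "round": round_seed,
        "aggregator": aggregator,
        "hashes": [vector_hash(v) for v in kept_vectors],
    })
    return aggregator
```

Only the hashes go on the ledger in this sketch; the vector source data would be passed to the next round's nodes separately, which mirrors the split between on-chain hashes and off-chain source data in the text.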
1) Performing data dimension reduction comprises:
for each dimension of the data vector, the aggregation government affair node acquires the corresponding value of that dimension, and performs data dimension reduction by using a dimension-reduction graph method.
2) Constructing the isolation forest based on the aggregation node comprises:
aggregating the government affair nodes based on the aggregation node, and constructing an isolation forest with k isolation trees from the dimension-reduced data set to obtain the isolated parameter vectors.
3) Rejecting the isolated parameter vectors comprises:
the government affair node calculating a box plot function for each dimension of the data vector;
if, in a dimension, all function values fall within the set range, or more than half of the function values fall outside the set range, rejecting the data vector in that dimension.
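The text does not define the box plot function or the set range. One common reading is Tukey's box-plot fences, where the set range of a dimension is [Q1 - 1.5·IQR, Q3 + 1.5·IQR]. The sketch below follows that reading as an assumption; the names `tukey_fences`, `out_of_range_mask` and `reject_vectors`, the factor `k=1.5`, and the rule of dropping any vector with at least one out-of-range dimension are illustrative and are not taken from the source.

```python
import statistics


def tukey_fences(values, k=1.5):
    """Return the (lower, upper) box-plot fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def out_of_range_mask(vectors, k=1.5):
    """For each vector, flag the dimensions that fall outside that dimension's fences."""
    dims = len(vectors[0])
    fences = [tukey_fences([v[d] for v in vectors], k) for d in range(dims)]
    return [
        [not (fences[d][0] <= v[d] <= fences[d][1]) for d in range(dims)]
        for v in vectors
    ]


def reject_vectors(vectors, k=1.5):
    """Keep only vectors with no out-of-range dimension (hypothetical rejection rule)."""
    mask = out_of_range_mask(vectors, k)
    return [v for v, flags in zip(vectors, mask) if not any(flags)]
```

For example, among the vectors `[1, 10]`, `[2, 11]`, `[3, 12]`, `[4, 13]` and `[100, 12]`, the first dimension has fences of -1 and 7, so only `[100, 12]` is dropped.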
In this technical scheme, a government affair data sharing model for multi-site joint office is constructed and used to detect whether the government affair data contain abnormal data and to remove them. Abnormal data can thus be effectively removed from the government affair data, which supports the subsequent selection step and the accuracy of the high-quality government affair data obtained.
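The isolation forest with k isolation trees used for detecting abnormal data can be sketched in plain Python. This is a compact version of the standard isolation forest scoring scheme, offered as an illustration rather than as the patent's own procedure; the function names, the default `sample_size=256`, and the dictionary-based tree layout are assumptions. Scores close to 1 mark vectors that are isolated after few random splits, which makes them candidates for the isolated parameter vectors.

```python
import math
import random


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # approximation of H(n-1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Split recursively on a random dimension at a random threshold."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(tree, point, depth=0):
    """Depth at which the point reaches a leaf, plus the leaf-size correction."""
    if "size" in tree:
        return depth + c_factor(tree["size"])
    branch = "left" if point[tree["dim"]] < tree["split"] else "right"
    return path_length(tree[branch], point, depth + 1)


def isolation_scores(points, k=100, sample_size=256, seed=0):
    """Return one anomaly score in (0, 1] per point, using k isolation trees."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    rng = random.Random(seed)
    n = min(sample_size, len(points))
    max_depth = math.ceil(math.log2(n))
    trees = [build_tree(rng.sample(points, n), 0, max_depth, rng) for _ in range(k)]
    scores = []
    for p in points:
        mean_path = sum(path_length(t, p) for t in trees) / k
        scores.append(2.0 ** (-mean_path / c_factor(n)))
    return scores
```

A threshold on the score (or a fixed fraction of the highest-scoring vectors) would then decide which vectors count as isolated; the source does not state which criterion is used.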
As shown in FIG. 1 and FIG. 3, (2) determining a data optimization model for selecting from the government affair data from which the abnormal data have been removed specifically comprises:
selecting, according to the government affair data, a model file containing a default network structure and hyperparameters, and determining an algorithm file of an iterative algorithm according to a preset expected loss value.
(3) Training the data optimization model until the optimal data optimization model is obtained specifically comprises:
training the data optimization model by using a tuner to obtain a target data optimization model;
evaluating the target data optimization model by using an evaluator, based on the model parameters of the target data optimization model, to obtain a model evaluation result;
initializing the target data optimization model by using the tuner based on the model evaluation result, and cyclically optimizing the target data optimization model by using the tuner and the evaluator until a preset convergence condition is reached, to obtain the optimal data optimization model.
1) Training the data optimization model by using the tuner to obtain the target data optimization model comprises:
training the data optimization model by using the tuner according to a preset optimization mode to obtain the target data optimization model;
the preset optimization mode comprises a reinforcement learning mode, a non-number-of-candidates optimization mode and a heuristic search mode.
2) Evaluating the target data optimization model by using the evaluator, based on the model parameters of the target data optimization model, to obtain the model evaluation result comprises:
evaluating the target data optimization model by using the evaluator according to a preset evaluation mode, based on the model parameters of the target data optimization model, to obtain the model evaluation result.
3) Initializing the target data optimization model by using the tuner based on the model evaluation result comprises:
determining the optimal model parameters corresponding to the model evaluation result by using an empirical learning algorithm, and initializing the target data optimization model by using the tuner based on the optimal model parameters.
(4) The optimal data optimization model is applied to the government affair data to obtain high-quality government affair data.
In this technical scheme, a data optimization model for selecting from the government affair data from which the abnormal data have been removed is determined and trained until an optimal data optimization model is obtained, and the optimal data optimization model is applied to the government affair data to obtain high-quality government affair data. High-quality government affair data can thus be selected from the government affair data, which provides a data guarantee for fully mining the value of the government affair data.
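The tuner and evaluator cycle of step (3) can be sketched as a small loop. Everything concrete here is an assumption made for illustration: the model is a toy line `y = w*x + b`, the tuner is a heuristic local search (one of the optimization modes the text lists), the evaluator is a mean squared error on held-out data, and the convergence condition is "no improvement larger than `tol` for `patience` consecutive rounds". The source does not specify any of these choices.

```python
import random


def mse(params, data):
    """Mean squared error of the toy model y = w*x + b on (x, y) pairs."""
    w, b = params
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def tuner(start, train, rng, trials=20, step=0.5):
    """Hypothetical tuner: heuristic local search starting from `start`."""
    best, best_loss = start, mse(start, train)
    for _ in range(trials):
        cand = tuple(p + rng.uniform(-step, step) for p in best)
        loss = mse(cand, train)
        if loss < best_loss:
            best, best_loss = cand, loss
    return best


def evaluator(params, validation):
    """Hypothetical evaluator: score the trained parameters on held-out data."""
    return mse(params, validation)


def optimize(train, validation, init=(0.0, 0.0), tol=1e-6,
             patience=5, max_rounds=500, seed=0):
    """Alternate tuner and evaluator until the assumed convergence condition holds."""
    rng = random.Random(seed)
    best, best_score = init, evaluator(init, validation)
    stale = 0
    for _ in range(max_rounds):
        candidate = tuner(best, train, rng)        # train from the current best
        score = evaluator(candidate, validation)   # model evaluation result
        if best_score - score > tol:
            # Re-initialise the next round from the better parameters.
            best, best_score, stale = candidate, score, 0
        else:
            stale += 1
        if stale >= patience:
            break
    return best, best_score
```

Keeping the evaluator on separate validation data means a round is accepted only when the improvement generalizes beyond the data the tuner searched on, which is one plausible reason for separating the two roles.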
The above examples are only intended to illustrate the technical solution of the present invention, and not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
Claims (10)
1. A big data-based government affair data processing method, characterized in that the method comprises the following steps:
S1, constructing a government affair data sharing model for multi-site joint office, detecting whether the government affair data contain abnormal data by using the government affair data sharing model, and removing the abnormal data;
S2, determining a data optimization model for selecting from the government affair data from which the abnormal data have been removed;
S3, training the data optimization model until an optimal data optimization model is obtained;
S4, applying the optimal data optimization model to the government affair data to obtain high-quality government affair data.
2. The big data-based government affair data processing method according to claim 1, wherein constructing the government affair data sharing model for multi-site joint office in S1 comprises:
acquiring the government affair nodes, and randomly selecting one of them as the aggregation node of the current round by using a consensus algorithm of a blockchain;
after data dimension reduction, constructing an isolation forest based on the aggregation node, and completing node aggregation after the isolated parameter vectors have been rejected;
uploading the hash values of the vectors that remain after the isolated parameter vectors have been rejected to the blockchain, and sending the vector source data to the participating nodes of the next round, until the construction of the government affair data sharing model is completed.
3. The big data-based government affair data processing method according to claim 2, wherein the data dimension reduction comprises:
for each dimension of the data vector, the aggregation government affair node acquires the corresponding value of that dimension, and performs data dimension reduction by using a dimension-reduction graph method.
4. The big data-based government affair data processing method according to claim 3, wherein constructing the isolation forest based on the aggregation node comprises:
aggregating the government affair nodes based on the aggregation node, and constructing an isolation forest with k isolation trees from the dimension-reduced data set to obtain the isolated parameter vectors.
5. The big data-based government affair data processing method according to claim 4, wherein rejecting the isolated parameter vectors comprises:
the government affair node calculating a box plot function for each dimension of the data vector;
if, in a dimension, all function values fall within the set range, or more than half of the function values fall outside the set range, rejecting the data vector in that dimension.
6. The big data-based government affair data processing method according to claim 1, wherein determining in S2 the data optimization model for selecting from the government affair data from which the abnormal data have been removed comprises:
selecting, according to the government affair data, a model file containing a default network structure and hyperparameters, and determining an algorithm file of an iterative algorithm according to a preset expected loss value.
7. The big data-based government affair data processing method according to claim 1, wherein training the data optimization model in S3 until the optimal data optimization model is obtained comprises:
training the data optimization model by using a tuner to obtain a target data optimization model;
evaluating the target data optimization model by using an evaluator, based on the model parameters of the target data optimization model, to obtain a model evaluation result;
initializing the target data optimization model by using the tuner based on the model evaluation result, and cyclically optimizing the target data optimization model by using the tuner and the evaluator until a preset convergence condition is reached, to obtain the optimal data optimization model.
8. The big data-based government affair data processing method according to claim 7, wherein training the data optimization model by using the tuner to obtain the target data optimization model comprises:
training the data optimization model by using the tuner according to a preset optimization mode to obtain the target data optimization model;
the preset optimization mode comprises a reinforcement learning mode, a non-number-of-candidates optimization mode and a heuristic search mode.
9. The big data-based government affair data processing method according to claim 7, wherein evaluating the target data optimization model by using the evaluator, based on the model parameters of the target data optimization model, to obtain the model evaluation result comprises:
evaluating the target data optimization model by using the evaluator according to a preset evaluation mode, based on the model parameters of the target data optimization model, to obtain the model evaluation result.
10. The big data-based government affair data processing method according to claim 7, wherein initializing the target data optimization model by using the tuner based on the model evaluation result comprises:
determining the optimal model parameters corresponding to the model evaluation result by using an empirical learning algorithm, and initializing the target data optimization model by using the tuner based on the optimal model parameters.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211520308.XA CN115757384A (en) | 2022-11-30 | 2022-11-30 | Government affair data processing method based on big data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211520308.XA CN115757384A (en) | 2022-11-30 | 2022-11-30 | Government affair data processing method based on big data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115757384A (en) | 2023-03-07 |
Family
ID=85341177
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211520308.XA Pending CN115757384A (en) | 2022-11-30 | 2022-11-30 | Government affair data processing method based on big data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115757384A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116308963A (en) * | 2023-05-19 | 2023-06-23 | 北京十环信息有限公司 | Government affair data analysis method and system |
CN116308963B (en) * | 2023-05-19 | 2023-07-18 | 北京十环信息有限公司 | Government affair data analysis method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||