CN115757384A - Government affair data processing method based on big data - Google Patents

Government affair data processing method based on big data

Info

Publication number
CN115757384A
Authority
CN
China
Prior art keywords
data
model
government affair
optimization model
government
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211520308.XA
Other languages
Chinese (zh)
Inventor
李娟�
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui Changzheng Think Tank Management Consulting Co ltd
Original Assignee
Anhui Changzheng Think Tank Management Consulting Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui Changzheng Think Tank Management Consulting Co ltd
Priority to CN202211520308.XA
Publication of CN115757384A
Legal status: Pending

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to data processing, and in particular to a big data-based government affair data processing method. The method comprises: constructing a government affair data sharing model for multi-place joint office, using the government affair data sharing model to detect whether abnormal data exist in the government affair data, and removing the abnormal data; determining a data optimization model for optimizing the government affair data from which the abnormal data have been removed; performing optimization training on the data optimization model until an optimal data optimization model is obtained; and optimizing the government affair data with the optimal data optimization model to obtain high-quality government affair data. The technical scheme provided by the invention addresses two defects of the prior art: abnormal data in government affair data cannot be effectively removed, and high-quality government affair data cannot be selected from the government affair data.

Description

Government affair data processing method based on big data
Technical Field
The invention relates to data processing, in particular to a government affair data processing method based on big data.
Background
Government affairs are the administrative work of the government. Once a government affair has been deliberated and settled, it must be published so that the public and government staff learn of it promptly. With the development of internet technology, the publication of government affairs has expanded from paper documents to the network. Government departments have long generated and recorded large amounts of government affair data, which are an important basis for their daily management. Government affair data are large in volume, varied in type, wide in source and complex in format, and with the development of big data and the internet, governments increasingly need to mine the value of the data held by each department.
In recent years, driven by technical progress and policy guidance, digital government reform has called for breaking down data islands and improving work efficiency, which requires all government departments to interconnect their data; increasingly severe security problems require that data be shared effectively among departments; and the optimization and upgrading of convenience services requires departments to cooperate with one another. Governments at all levels are therefore actively promoting cross-department government affair data sharing.
However, cross-department government affair data sharing is a very complex undertaking. How to effectively remove abnormal data from government affair data, and how to fully mine the value of the data by selecting high-quality government affair data, are problems that urgently need to be solved in current government affair data sharing.
Disclosure of Invention
(I) Technical problem to be solved
In view of the above defects in the prior art, the invention provides a big data-based government affair data processing method that addresses two defects of the prior art: abnormal data in government affair data cannot be effectively removed, and high-quality government affair data cannot be selected from the government affair data.
(II) Technical scheme
To achieve the above purpose, the invention is realized by the following technical scheme:
A big data-based government affair data processing method comprises the following steps:
S1, constructing a government affair data sharing model for multi-place joint office, using the government affair data sharing model to detect whether abnormal data exist in the government affair data, and removing the abnormal data;
S2, determining a data optimization model for optimizing the government affair data from which the abnormal data have been removed;
S3, performing optimization training on the data optimization model until an optimal data optimization model is obtained;
S4, optimizing the government affair data with the optimal data optimization model to obtain high-quality government affair data.
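The four steps can be read as a simple pipeline. The following is a minimal sketch under stated assumptions, not the patented implementation: the function name `process_government_data`, the record type, the caller-supplied `is_abnormal` and `fit_optimization_model` callables, and the `quality_threshold` parameter are all hypothetical and do not appear in the source.

```python
from typing import Callable, List, Sequence

Record = Sequence[float]
Scorer = Callable[[Record], float]


def process_government_data(
    records: List[Record],
    is_abnormal: Callable[[Record], bool],
    fit_optimization_model: Callable[[List[Record]], Scorer],
    quality_threshold: float,
) -> List[Record]:
    """Hypothetical end-to-end skeleton of steps S1 to S4."""
    # S1: detect and remove abnormal records.
    cleaned = [r for r in records if not is_abnormal(r)]
    # S2 + S3: determine the optimization model and train it on the cleaned
    # data; both steps are collapsed into one caller-supplied function here.
    score = fit_optimization_model(cleaned)
    # S4: keep only the records the trained model rates as high quality.
    return [r for r in cleaned if score(r) >= quality_threshold]


# Toy usage: values above 50 count as abnormal, and the "model" simply
# scores a record by its first field.
records = [[1.0], [2.0], [100.0], [3.0]]
selected = process_government_data(
    records,
    is_abnormal=lambda r: r[0] > 50,
    fit_optimization_model=lambda data: (lambda r: r[0]),
    quality_threshold=2.0,
)
print(selected)  # [[2.0], [3.0]]
```

Keeping the abnormal-data detector and the model trainer as injected callables mirrors the way the method treats S1 and S2 to S3 as separable stages.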
Preferably, constructing the government affair data sharing model for multi-place joint office in S1 comprises:
acquiring government affair nodes, and using a consensus algorithm of a blockchain to randomly select one of the government affair nodes as the aggregation node of the current round;
performing data dimension reduction, constructing an isolation forest based on the aggregation node, and completing node aggregation after the isolated parameter vectors are removed;
uploading to the blockchain the hash value of the vectors that remain after the isolated parameter vectors are removed, and sending the vector source data to the participating nodes of the next round, until construction of the government affair data sharing model is complete.
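The per-round bookkeeping described above (choose an aggregation node, then record a hash of the cleaned vector on chain) can be sketched as follows. This is an illustration only: the source does not name a consensus algorithm or a hash function, so the sketch assumes a seed derived from the previous block hash as a stand-in for consensus-based random selection, SHA-256 as the hash, and an in-memory `Ledger` class as a stand-in for a real blockchain. All names are hypothetical.

```python
import hashlib
import json
from typing import Dict, List, Sequence


def select_aggregation_node(node_ids: Sequence[str], prev_block_hash: str, round_no: int) -> str:
    """Pick this round's aggregation node from a seed every node can recompute."""
    seed = hashlib.sha256(f"{prev_block_hash}:{round_no}".encode()).digest()
    ordered = sorted(node_ids)  # canonical order so all nodes reach the same choice
    return ordered[int.from_bytes(seed, "big") % len(ordered)]


def vector_hash(vector: Sequence[float]) -> str:
    """SHA-256 digest of a vector, serialized as compact JSON."""
    payload = json.dumps(list(vector), separators=(",", ":")).encode()
    return hashlib.sha256(payload).hexdigest()


class Ledger:
    """In-memory stand-in for a blockchain: each block links to the previous one."""

    def __init__(self) -> None:
        self.blocks: List[Dict[str, object]] = []

    def append(self, round_no: int, aggregator: str, digest: str) -> Dict[str, object]:
        prev_hash = str(self.blocks[-1]["block_hash"]) if self.blocks else "genesis"
        body = f"{prev_hash}:{round_no}:{aggregator}:{digest}"
        block = {
            "round": round_no,
            "aggregator": aggregator,
            "vector_hash": digest,
            "prev_hash": prev_hash,
            "block_hash": hashlib.sha256(body.encode()).hexdigest(),
        }
        self.blocks.append(block)
        return block


# Two rounds: only the hash goes on the ledger; the source data itself would
# be sent directly to the next round's participating nodes.
nodes = ["city-a", "city-b", "city-c"]
ledger = Ledger()
prev = "genesis"
for round_no in (1, 2):
    aggregator = select_aggregation_node(nodes, prev, round_no)
    block = ledger.append(round_no, aggregator, vector_hash([0.2, 0.7]))
    prev = str(block["block_hash"])
```

Storing only the digest on chain lets later participants verify that the vector data they receive matches what the aggregation node committed to, without placing the data itself on the ledger.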
Preferably, performing data dimension reduction comprises:
for each dimension of the data vector, the aggregation government affair node acquires the corresponding value, and data dimension reduction is performed using a dimension-reduction graph method.
Preferably, constructing the isolation forest based on the aggregation node comprises:
aggregating the government affair nodes based on the aggregation node, and constructing an isolation forest with k isolation trees from the dimension-reduced data set to obtain the isolated parameter vectors.
Preferably, removing the isolated parameter vectors comprises:
calculating, by the government affair node, a box-plot function for each dimension of the data vector;
if all function values in a dimension are within the set range, or more than half of the function values are outside the set range, rejecting the data vector in that dimension.
Preferably, determining in S2 the data optimization model for optimizing the government affair data from which the abnormal data have been removed comprises:
selecting, according to the government affair data, a model file containing a default network structure and hyperparameters, and determining an algorithm file of an iterative algorithm according to a preset expected loss value.
Preferably, performing optimization training on the data optimization model in S3 until the optimal data optimization model is obtained comprises:
training the data optimization model with a tuner to obtain a target data optimization model;
evaluating the target data optimization model with an evaluator, based on the model parameters of the target data optimization model, to obtain a model evaluation result;
initializing the target data optimization model with the tuner based on the model evaluation result, and cyclically optimizing the target data optimization model with the tuner and the evaluator until a preset convergence condition is reached, to obtain the optimal data optimization model.
Preferably, training the data optimization model with the tuner to obtain the target data optimization model comprises:
training the data optimization model with the tuner according to a preset optimization mode to obtain the target data optimization model;
the preset optimization mode comprises a reinforcement learning mode, a non-number-of-candidates optimization mode and a heuristic search mode.
Preferably, evaluating the target data optimization model with the evaluator, based on the model parameters of the target data optimization model, to obtain the model evaluation result comprises:
evaluating, according to a preset evaluation mode, the target data optimization model with the evaluator based on the model parameters of the target data optimization model to obtain the model evaluation result.
Preferably, initializing the target data optimization model with the tuner based on the model evaluation result comprises:
determining, with an empirical learning algorithm, the optimal model parameters corresponding to the model evaluation result, and initializing the target data optimization model with the tuner based on the optimal model parameters.
(III) Advantageous effects
Compared with the prior art, the big data-based government affair data processing method has the following beneficial effects:
1) By constructing a government affair data sharing model for multi-place joint office, using it to detect whether abnormal data exist in the government affair data, and removing the abnormal data, the method can effectively remove abnormal data from the government affair data, which safeguards the accuracy of the subsequent optimization of the government affair data into high-quality government affair data;
2) By determining a data optimization model for optimizing the government affair data from which the abnormal data have been removed, performing optimization training on the data optimization model until the optimal data optimization model is obtained, and optimizing the government affair data with the optimal data optimization model to obtain high-quality government affair data, the method can select high-quality government affair data from the government affair data, providing a data guarantee for fully mining the value of the government affair data.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. It is obvious that the drawings in the following description are only some embodiments of the invention, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.
FIG. 1 is a schematic flow diagram of the present invention;
FIG. 2 is a schematic flow diagram of removing abnormal data from government affair data in the present invention;
FIG. 3 is a schematic flow diagram of selecting high-quality government affair data in the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention. It is to be understood that the embodiments described are only a few embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
A big data-based government affair data processing method, as shown in FIG. 1 and FIG. 2, comprises: (1) constructing a government affair data sharing model for multi-place joint office, using the government affair data sharing model to detect whether abnormal data exist in the government affair data, and removing the abnormal data.
Constructing the government affair data sharing model for multi-place joint office comprises:
acquiring government affair nodes, and using a consensus algorithm of a blockchain to randomly select one of the government affair nodes as the aggregation node of the current round;
performing data dimension reduction, constructing an isolation forest based on the aggregation node, and completing node aggregation after the isolated parameter vectors are removed;
uploading to the blockchain the hash value of the vectors that remain after the isolated parameter vectors are removed, and sending the vector source data to the participating nodes of the next round, until construction of the government affair data sharing model is complete.
1) Performing data dimension reduction comprises:
for each dimension of the data vector, the aggregation government affair node acquires the corresponding value, and data dimension reduction is performed using a dimension-reduction graph method.
2) Constructing the isolation forest based on the aggregation node comprises:
aggregating the government affair nodes based on the aggregation node, and constructing an isolation forest with k isolation trees from the dimension-reduced data set to obtain the isolated parameter vectors.
3) Removing the isolated parameter vectors comprises:
calculating, by the government affair node, a box-plot function for each dimension of the data vector;
if all function values in a dimension are within the set range, or more than half of the function values are outside the set range, rejecting the data vector in that dimension.
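The box-plot check in step 3) can be illustrated with a short sketch. The source does not define the "set range" or say exactly how the per-dimension results combine, so everything specific here is an assumption: the range is taken to be the usual Tukey fences (Q1 - 1.5·IQR, Q3 + 1.5·IQR), and a vector is rejected when more than half of its dimensions fall outside the fences, which is only one possible reading of the rule. The function names are hypothetical.

```python
import statistics
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def tukey_fences(values: Sequence[float], whisker: float = 1.5) -> Tuple[float, float]:
    """Lower and upper box-plot fences for one dimension."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - whisker * iqr, q3 + whisker * iqr


def out_of_range_flags(vectors: List[Vector]) -> List[List[bool]]:
    """For every vector, mark each dimension whose value lies outside the fences."""
    dims = len(vectors[0])
    fences = [tukey_fences([v[d] for v in vectors]) for d in range(dims)]
    return [[not (lo <= v[d] <= hi) for d, (lo, hi) in enumerate(fences)]
            for v in vectors]


def reject_isolated(vectors: List[Vector]) -> List[Vector]:
    """Keep vectors with at most half of their dimensions out of range (assumed rule)."""
    flags = out_of_range_flags(vectors)
    return [v for v, f in zip(vectors, flags) if sum(f) * 2 <= len(f)]


# Toy usage: nine well-behaved vectors and one far outside the box-plot range.
vectors = [[float(i), float(i)] for i in range(1, 10)] + [[100.0, 100.0]]
kept = reject_isolated(vectors)
print(len(kept))  # 9
```

A different reading of the rule (for example, applying the "more than half" test across nodes within one dimension rather than across dimensions within one vector) would change only `reject_isolated`; the fence computation stays the same.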
In the technical scheme of the present application, a government affair data sharing model for multi-place joint office is constructed, the model is used to detect whether abnormal data exist in the government affair data, and the abnormal data are removed. Abnormal data can thus be effectively removed from the government affair data, which safeguards the accuracy of the subsequent optimization of the government affair data into high-quality government affair data.
As shown in FIG. 1 and FIG. 3, (2) determining a data optimization model for optimizing the government affair data from which the abnormal data have been removed specifically comprises:
selecting, according to the government affair data, a model file containing a default network structure and hyperparameters, and determining an algorithm file of an iterative algorithm according to a preset expected loss value.
(3) Performing optimization training on the data optimization model until the optimal data optimization model is obtained specifically comprises:
training the data optimization model with a tuner to obtain a target data optimization model;
evaluating the target data optimization model with an evaluator, based on the model parameters of the target data optimization model, to obtain a model evaluation result;
initializing the target data optimization model with the tuner based on the model evaluation result, and cyclically optimizing the target data optimization model with the tuner and the evaluator until a preset convergence condition is reached, to obtain the optimal data optimization model.
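The tuner and evaluator cycle can be pictured as the loop below. This is a sketch under assumptions, not the method of the patent: the model is reduced to a dictionary of numeric parameters, the tuner is a simple hill-climbing search (standing in for the heuristic search mode), the evaluator is any function that returns a score where higher is better, and the convergence condition is "no improvement larger than `tol`". All names are hypothetical.

```python
from typing import Callable, Dict, Tuple

Params = Dict[str, float]
Evaluator = Callable[[Params], float]
Tuner = Callable[[Params], Params]


def make_hill_climb_tuner(evaluator: Evaluator, step: float = 0.5) -> Tuner:
    """Build a tuner that tries one step up or down on each parameter."""
    def tuner(params: Params) -> Params:
        candidates = [params] + [
            {**params, name: value + delta}
            for name, value in params.items()
            for delta in (-step, step)
        ]
        return max(candidates, key=evaluator)
    return tuner


def tune_until_converged(init_params: Params, tuner: Tuner, evaluator: Evaluator,
                         tol: float = 1e-9, max_rounds: int = 100) -> Tuple[Params, float]:
    """Alternate tuner and evaluator until the score stops improving."""
    best_params = dict(init_params)
    best_score = evaluator(best_params)
    for _ in range(max_rounds):
        candidate = tuner(best_params)   # train starting from the current best
        score = evaluator(candidate)     # evaluate the resulting target model
        if score > best_score + tol:
            # Re-initialize the next round from the better parameters.
            best_params, best_score = candidate, score
        else:
            break                        # preset convergence condition reached
    return best_params, best_score


# Toy usage: the score peaks at x = 2.
evaluate = lambda p: -(p["x"] - 2.0) ** 2
best, best_score = tune_until_converged({"x": 0.0}, make_hill_climb_tuner(evaluate), evaluate)
print(best)  # {'x': 2.0}
```

Swapping in a different tuner (for example, one driven by reinforcement learning) would leave `tune_until_converged` unchanged, which matches the way the description treats the optimization mode as a preset choice.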
1) Training the data optimization model with the tuner to obtain the target data optimization model comprises:
training the data optimization model with the tuner according to a preset optimization mode to obtain the target data optimization model;
the preset optimization mode comprises a reinforcement learning mode, a non-number-of-candidates optimization mode and a heuristic search mode.
2) Evaluating the target data optimization model with the evaluator, based on the model parameters of the target data optimization model, to obtain the model evaluation result comprises:
evaluating, according to a preset evaluation mode, the target data optimization model with the evaluator based on the model parameters of the target data optimization model to obtain the model evaluation result.
3) Initializing the target data optimization model with the tuner based on the model evaluation result comprises:
determining, with an empirical learning algorithm, the optimal model parameters corresponding to the model evaluation result, and initializing the target data optimization model with the tuner based on the optimal model parameters.
(4) Optimizing the government affair data with the optimal data optimization model to obtain high-quality government affair data.
In the technical scheme of the present application, a data optimization model for optimizing the government affair data from which the abnormal data have been removed is determined, the data optimization model is trained until the optimal data optimization model is obtained, and the government affair data are optimized with the optimal data optimization model to obtain high-quality government affair data. High-quality government affair data can thus be selected from the government affair data, providing a data guarantee for fully mining the value of the government affair data.
The above examples are only intended to illustrate the technical solution of the present invention, and not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A big data-based government affair data processing method, characterized in that the method comprises the following steps:
S1, constructing a government affair data sharing model for multi-place joint office, using the government affair data sharing model to detect whether abnormal data exist in the government affair data, and removing the abnormal data;
S2, determining a data optimization model for optimizing the government affair data from which the abnormal data have been removed;
S3, performing optimization training on the data optimization model until an optimal data optimization model is obtained;
S4, optimizing the government affair data with the optimal data optimization model to obtain high-quality government affair data.
2. The big data-based government affair data processing method according to claim 1, characterized in that constructing the government affair data sharing model for multi-place joint office in S1 comprises:
acquiring government affair nodes, and using a consensus algorithm of a blockchain to randomly select one of the government affair nodes as the aggregation node of the current round;
performing data dimension reduction, constructing an isolation forest based on the aggregation node, and completing node aggregation after the isolated parameter vectors are removed;
uploading to the blockchain the hash value of the vectors that remain after the isolated parameter vectors are removed, and sending the vector source data to the participating nodes of the next round, until construction of the government affair data sharing model is complete.
3. The big data-based government affair data processing method according to claim 2, characterized in that performing data dimension reduction comprises:
for each dimension of the data vector, the aggregation government affair node acquires the corresponding value, and data dimension reduction is performed using a dimension-reduction graph method.
4. The big data-based government affair data processing method according to claim 3, characterized in that constructing the isolation forest based on the aggregation node comprises:
aggregating the government affair nodes based on the aggregation node, and constructing an isolation forest with k isolation trees from the dimension-reduced data set to obtain the isolated parameter vectors.
5. The big data-based government affair data processing method according to claim 4, characterized in that removing the isolated parameter vectors comprises:
calculating, by the government affair node, a box-plot function for each dimension of the data vector;
if all function values in a dimension are within the set range, or more than half of the function values are outside the set range, rejecting the data vector in that dimension.
6. The big data-based government affair data processing method according to claim 1, characterized in that determining in S2 the data optimization model for optimizing the government affair data from which the abnormal data have been removed comprises:
selecting, according to the government affair data, a model file containing a default network structure and hyperparameters, and determining an algorithm file of an iterative algorithm according to a preset expected loss value.
7. The big data-based government affair data processing method according to claim 1, characterized in that performing optimization training on the data optimization model in S3 until the optimal data optimization model is obtained comprises:
training the data optimization model with a tuner to obtain a target data optimization model;
evaluating the target data optimization model with an evaluator, based on the model parameters of the target data optimization model, to obtain a model evaluation result;
initializing the target data optimization model with the tuner based on the model evaluation result, and cyclically optimizing the target data optimization model with the tuner and the evaluator until a preset convergence condition is reached, to obtain the optimal data optimization model.
8. The big data-based government affair data processing method according to claim 7, characterized in that training the data optimization model with the tuner to obtain the target data optimization model comprises:
training the data optimization model with the tuner according to a preset optimization mode to obtain the target data optimization model;
the preset optimization mode comprises a reinforcement learning mode, a non-number-of-candidates optimization mode and a heuristic search mode.
9. The big data-based government affair data processing method according to claim 7, characterized in that evaluating the target data optimization model with the evaluator, based on the model parameters of the target data optimization model, to obtain the model evaluation result comprises:
evaluating, according to a preset evaluation mode, the target data optimization model with the evaluator based on the model parameters of the target data optimization model to obtain the model evaluation result.
10. The big data-based government affair data processing method according to claim 7, characterized in that initializing the target data optimization model with the tuner based on the model evaluation result comprises:
determining, with an empirical learning algorithm, the optimal model parameters corresponding to the model evaluation result, and initializing the target data optimization model with the tuner based on the optimal model parameters.
CN202211520308.XA 2022-11-30 2022-11-30 Government affair data processing method based on big data Pending CN115757384A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211520308.XA CN115757384A (en) 2022-11-30 2022-11-30 Government affair data processing method based on big data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211520308.XA CN115757384A (en) 2022-11-30 2022-11-30 Government affair data processing method based on big data

Publications (1)

Publication Number Publication Date
CN115757384A true CN115757384A (en) 2023-03-07

Family

ID=85341177

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211520308.XA Pending CN115757384A (en) 2022-11-30 2022-11-30 Government affair data processing method based on big data

Country Status (1)

Country Link
CN (1) CN115757384A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116308963A (en) * 2023-05-19 2023-06-23 北京十环信息有限公司 Government affair data analysis method and system
CN116308963B (en) * 2023-05-19 2023-07-18 北京十环信息有限公司 Government affair data analysis method and system

Similar Documents

Publication Publication Date Title
US11348570B2 (en) Method for generating style statement, method and apparatus for training model, and computer device
US11132602B1 (en) Efficient online training for machine learning
US10846052B2 (en) Community discovery method, device, server and computer storage medium
US10489363B2 (en) Distributed FP-growth with node table for large-scale association rule mining
US20190294975A1 (en) Predicting using digital twins
US11107007B2 (en) Classification model generation method and apparatus, and data identification method and apparatus
US20230385034A1 (en) Automated decision making using staged machine learning
US11494638B2 (en) Learning support device and learning support method
CN112566093B (en) Terminal relation identification method and device, computer equipment and storage medium
US10642805B1 (en) System for determining queries to locate data objects
CN110719106B (en) Social network graph compression method and system based on node classification and sorting
CN115757384A (en) Government affair data processing method based on big data
US10810458B2 (en) Incremental automatic update of ranked neighbor lists based on k-th nearest neighbors
US11361195B2 (en) Incremental update of a neighbor graph via an orthogonal transform based indexing
US20240095529A1 (en) Neural Network Optimization Method and Apparatus
CN114281932A (en) Method, device and equipment for training work order quality inspection model and storage medium
WO2022170853A1 (en) Information generation method and apparatus, electronic device, and computer readable medium
CN114897666A (en) Graph data storage, access, processing method, training method, device and medium
CN110570093B (en) Method and device for automatically managing business expansion channel
CN114024912A (en) Network traffic application identification analysis method and system based on improved CHAMELEON algorithm
CN111727108B (en) Method, device and system for controlling robot and storage medium
WO2023019427A1 (en) Method and apparatus for graph-based recommendation
CN116629388B (en) Differential privacy federal learning training method, device and computer readable storage medium
KR20150107252A (en) Method for data classification based on manifold learning
CN108833429B (en) Method, device and storage medium for acquiring virus immunity strategy of power communication network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination