CN108959616A - Quasi-real-time monitoring system and method for production-domain data quality based on big data technology - Google Patents
- Publication number
- CN108959616A (application number CN201810789021.4A)
- Authority
- CN
- China
- Prior art keywords
- data
- real time
- numeric field
- time monitoring
- big data
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- General Factory Administration (AREA)
Abstract
The invention discloses a quasi-real-time monitoring system and method for production-domain data quality based on big data technology. The system comprises: a data access layer for accessing various types of source data; a data storage and processing layer, which stores streaming data in a distributed manner and automatically runs the offline batch-processing tasks in the workflow on schedule; and a data application layer for analyzing data quality in real time and obtaining incremental data. The invention can rapidly verify and respond to incremental problem data, reduce the generation of incremental problem data at the source, and accelerate improvement of the data quality of the production management information system.
Description
Technical field
The present invention relates to the technical field of computer software design, and in particular to a quasi-real-time monitoring system and method for production-domain data quality.
Background art
Big data now pervades every aspect of production safety. With big data support, production safety management is becoming intelligent: the operating status of each area of production safety is monitored by integrating all kinds of data, so that production safety management work can be improved and optimized. Big-data-based management of production safety data offers efficient capture, discovery, and analysis capabilities. It can economically extract valuable information from data that is varied in type and huge in volume, and it provides real-time acquisition, storage, analysis, and integrated querying of data, giving data support for the integrated management, scheduling, coordination, and command of safe production operations. However, the volume of data generated during production is huge and each data object has dozens of fields, so the workload of processing and analysis is heavy and manual monitoring is not feasible. High-quality data is therefore hard to obtain. In particular, incremental problem data cannot be verified and responded to quickly, which impairs the timeliness with which data problems are discovered.
Big data processing technology generally comprises batch processing and stream processing. Batch-processing systems, represented by the open-source Hadoop community technologies, must first collect data in batches and load it into an analytical data warehouse after batch preprocessing before high-performance queries can be run. In contrast to such non-real-time processing, stream-processing technologies, represented by components such as Spark Streaming, Storm, and Flink, pass real-time data through a stream processor and load it record by record into a high-performance in-memory database for querying. Such systems can efficiently run preset analysis and processing models against the latest real-time data with low latency. Applying big data processing technology to a production data management system would therefore improve the quasi-real-time capability of an enterprise's production data management.
Summary of the invention
The technical problem to be solved by the invention is to provide a quasi-real-time monitoring system and method for production-domain data quality based on big data technology that can rapidly verify and respond to incremental problem data, reduce the generation of incremental problem data at the source, and accelerate improvement of the data quality of the production management information system.
To solve the above technical problem, the invention adopts the following technical solution.
A quasi-real-time monitoring system for production-domain data quality based on big data technology comprises:
a data access layer for accessing various types of source data;
a data storage and processing layer, which stores streaming data in a distributed manner and automatically runs the offline batch-processing tasks in the workflow on schedule; and
a data application layer for analyzing data quality in real time and obtaining incremental data.
A quasi-real-time monitoring method for production-domain data quality based on big data technology specifically comprises the following steps:
A. Install a Linux operating system on a bare-metal physical server and deploy a container-based Hadoop big data platform; then create in the database the data tables corresponding to the data verification rules and quality characteristics.
B. Use a database log parsing tool to monitor changes to the database log, parse the incremental log in real time, convert it into database operation records, and write these to the big data platform.
C. After the log information describing data changes in the source database has been written to the big data platform, verify data quality problem data by either time triggering or event triggering, depending on the real-time requirement.
D. Summarize the problem data obtained from verification to produce dashboard data for object scores, system scores, and indicator scores for query and analysis. The score data can also be synchronized through an internal/external network exchange platform to a mobile application platform for query and analysis by mobile phone clients.
In the above method, the operating system in step A is CentOS or Red Hat.
In the above method, the big data platform in step A includes installation of the Docker container component and the Kubernetes container orchestration component, and deployment of an image registry and an image marketplace.
In the above method, the data verification rules and quality characteristics in step A include verification rule information, system information, object information, attribute information, organization information, evaluation task information, scheduling logs, problem data information, indicator attribute scores, indicator object scores, and indicator system scores.
In the above method, the time-triggering approach in step C is as follows: data acquisition middleware parses the log files of the source database into files that the target data platform can use and updates the data into the target data platform, completing quasi-real-time synchronization of the data; a workflow scheduling tool then fires on a timer and calls a stored procedure to verify data quality problem data.
In the above method, the event-triggering approach in step C is as follows: data acquisition middleware captures data changes in the source database and sends the change log to a Kafka message queue; the stream processing engine of the target data platform consumes the data in the Kafka message queue and verifies it with a dedicated data verification program.
By adopting the above technical solution, the invention achieves the following technical progress.
The invention can rapidly verify and respond to incremental problem data and reduce the generation of incremental problem data at the source. By applying a distributed computing system to production-domain data quality indicators, it achieves quasi-real-time monitoring of data quality, shortens the time needed to check existing (backlog) problem data, and allows data problems to be discovered in near real time. All kinds of enterprise applications can then use high-quality data on the big data platform and analyze it flexibly and efficiently. This ensures the completeness, standardization, accuracy, consistency, and timeliness of equipment ledgers and business data in the production management information system and accelerates improvement of that system's data quality. It also supports the efficient operation of the indicator system and of asset management decision support by adequately ensuring the accuracy and completeness of the basic information they rely on, such as equipment information and operation and maintenance information.
Detailed description
The invention is described in further detail below with reference to a specific embodiment.
A quasi-real-time monitoring system for production-domain data quality based on big data technology comprises a data access layer, a data storage and processing layer, and a data application layer.
The data access layer accesses various types of source data, such as source data transmitted from the 4A system, human resources system, financial system, materials system, marketing system, and production system.
The data storage and processing layer stores streaming data in a distributed manner and automatically runs the offline batch-processing tasks in the workflow on schedule. It includes:
- a Kafka message queue for real-time streaming data, which decouples data writing from data reading and provides real-time operation, high availability, and high concurrency;
- a stream processing engine that performs real-time computation on streaming data;
- data acquisition and data synchronization middleware, which supports obtaining data from different data sources, performing complex transformations on the data, and finally landing the data in different formats;
- a Hadoop cluster that provides distributed data storage and offline computation over large data volumes;
- a workflow scheduling engine that automatically runs the offline batch-processing tasks in the workflow on a timer.
The data application layer analyzes data quality in real time and obtains incremental data. It includes a data quality analysis platform and a mobile application platform, which can obtain incremental data in real time from the Kafka message queue or query data from the Hadoop cluster.
A quasi-real-time monitoring method for production-domain data quality based on big data technology specifically comprises the following steps.
A. Install a Linux operating system on a bare-metal physical server and deploy a container-based Hadoop big data platform; then create in the database the data tables corresponding to the data verification rules and quality characteristics.
In the invention, the operating system is a Linux operating system such as CentOS or Red Hat. The big data platform includes installation of the Docker container component and the Kubernetes container orchestration component, and deployment of an image registry and an image marketplace.
The data verification rules and quality characteristics include verification rule information, system information, object information, attribute information, organization information, evaluation task information, scheduling logs, problem data information, indicator attribute scores, indicator object scores, indicator system scores, and so on.
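As a rough illustration of the tables listed above, the following sketch creates a few of them in an in-memory SQLite database. The table names, column names, and the sample rule (`check_rule`, `problem_data`, `attribute_score`, `device_ledger`) are hypothetical; the patent names the categories of information but does not specify a schema or a database product.

```python
import sqlite3

# Hypothetical schema: the patent lists the kinds of tables, not their columns.
SCHEMA = """
CREATE TABLE check_rule (
    rule_id     INTEGER PRIMARY KEY,
    system_name TEXT NOT NULL,   -- system information
    object_name TEXT NOT NULL,   -- object information (a business table)
    attribute   TEXT NOT NULL,   -- attribute information (the field checked)
    rule_type   TEXT NOT NULL    -- e.g. 'completeness' or 'accuracy'
);
CREATE TABLE problem_data (
    problem_id  INTEGER PRIMARY KEY,
    rule_id     INTEGER NOT NULL REFERENCES check_rule(rule_id),
    record_key  TEXT NOT NULL,   -- key of the offending source record
    detected_at TEXT NOT NULL
);
CREATE TABLE attribute_score (
    rule_id INTEGER NOT NULL REFERENCES check_rule(rule_id),
    score   REAL NOT NULL
);
"""


def create_quality_db() -> sqlite3.Connection:
    """Create the rule, problem, and score tables in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


conn = create_quality_db()
# Register one made-up completeness rule on a made-up object.
conn.execute(
    "INSERT INTO check_rule (system_name, object_name, attribute, rule_type) "
    "VALUES (?, ?, ?, ?)",
    ("production", "device_ledger", "device_name", "completeness"),
)
tables = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
rule_count = conn.execute("SELECT COUNT(*) FROM check_rule").fetchone()[0]
```

Keeping rules as rows rather than hard-coded logic is one plausible reading of why the method builds these tables first: later verification steps can then look up which attribute of which object to check.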
B. Use a database log parsing tool to monitor changes to the database log, parse the incremental log in real time, convert it into database operation records, and write these to the big data platform.
The incremental log refers to the new log entries produced, since the last synchronization operation, by any operation performed on the corresponding source database, that is, the portion that the next synchronization needs to process.
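The definition of the incremental log can be sketched as a reader that remembers the position of the last synchronized entry and returns only what was written after it. This is a simplified illustration, not the parsing tool the patent refers to; `LogEntry`, `IncrementalLogReader`, and the integer `position` field are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class LogEntry:
    position: int   # monotonically increasing log position (assumed)
    operation: str  # 'INSERT', 'UPDATE' or 'DELETE'
    table: str
    row: dict


class IncrementalLogReader:
    """Returns only the entries written since the previous synchronization."""

    def __init__(self) -> None:
        self.last_synced = 0  # position of the last entry already synchronized

    def read_increment(self, log: list[LogEntry]) -> list[LogEntry]:
        increment = [entry for entry in log if entry.position > self.last_synced]
        if increment:
            # Advance the checkpoint so the next call skips these entries.
            self.last_synced = max(entry.position for entry in increment)
        return increment


log = [
    LogEntry(1, "INSERT", "device", {"id": 1}),
    LogEntry(2, "UPDATE", "device", {"id": 1}),
]
reader = IncrementalLogReader()
first = reader.read_increment(log)   # everything so far
log.append(LogEntry(3, "DELETE", "device", {"id": 1}))
second = reader.read_increment(log)  # only the entry added after the first sync
third = reader.read_increment(log)   # nothing new
```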
C. After the log information describing data changes in the source database has been written to the big data platform, verify data quality problem data by either time triggering or event triggering, depending on the real-time requirement.
Time triggering works as follows: data acquisition middleware parses the log files of the source database into files that the target data platform can use and updates the data into the target data platform, completing quasi-real-time synchronization of the data; a workflow scheduling tool then fires on a timer and calls a stored procedure to verify data quality problem data.
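A minimal sketch of the time-triggered path follows, under stated assumptions: SQLite stands in for the target data platform, a Python function running one SQL statement stands in for the stored procedure, and the standard-library `sched` module stands in for the workflow scheduling tool. The table names and the `name_not_empty` rule are made up for the example.

```python
import sched
import sqlite3
import time


def verify_missing_names(conn: sqlite3.Connection) -> None:
    """Stand-in for a stored procedure: record devices whose name is empty."""
    conn.execute(
        "INSERT INTO problem_data (record_key, rule) "
        "SELECT id, 'name_not_empty' FROM device "
        "WHERE (name IS NULL OR name = '') "
        "AND id NOT IN (SELECT record_key FROM problem_data)"
    )
    conn.commit()


def schedule_checks(conn: sqlite3.Connection, interval_s: float, runs: int) -> None:
    """Run the check `runs` times, `interval_s` seconds apart."""
    scheduler = sched.scheduler(time.monotonic, time.sleep)
    for i in range(1, runs + 1):
        scheduler.enter(interval_s * i, 1, verify_missing_names, argument=(conn,))
    scheduler.run()  # blocks until every scheduled check has run


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE device (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE problem_data (record_key INTEGER, rule TEXT);
    INSERT INTO device (id, name) VALUES (1, 'pump-A'), (2, ''), (3, NULL);
    """
)
schedule_checks(conn, interval_s=0.01, runs=2)
problems = [
    row[0]
    for row in conn.execute("SELECT record_key FROM problem_data ORDER BY record_key")
]
```

The `NOT IN` clause keeps repeated runs from recording the same problem twice, which matters for any check that fires on a timer over data that may not have changed.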
Event triggering works as follows: data acquisition middleware captures data changes in the source database and sends the change log to a Kafka message queue; the stream processing engine consumes the data in the Kafka message queue and verifies it with a dedicated data verification program.
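The event-triggered path can be sketched in the same spirit. Here a standard-library `queue.Queue` stands in for the Kafka message queue and a plain loop stands in for the stream processing engine, so no real Kafka client API is shown. The event layout (`op`, `table`, `row`) and both rules are assumptions made for illustration.

```python
import queue
from typing import Callable, Optional

# A rule inspects one changed row and returns a problem description or None.
Rule = Callable[[dict], Optional[str]]


def name_not_empty(row: dict) -> Optional[str]:
    return None if row.get("name") else "name is empty"


def capacity_positive(row: dict) -> Optional[str]:
    value = row.get("capacity")
    if isinstance(value, (int, float)) and value > 0:
        return None
    return "capacity must be a positive number"


def consume_and_verify(events: "queue.Queue", rules: list[Rule]) -> list[tuple]:
    """Consume change events one by one and verify each changed row."""
    problems = []
    while True:
        event = events.get()
        if event is None:  # sentinel: no more events in this sketch
            break
        if event["op"] == "DELETE":
            continue  # a deleted row has nothing left to verify
        for rule in rules:
            message = rule(event["row"])
            if message is not None:
                problems.append((event["row"].get("id"), message))
    return problems


events = queue.Queue()
events.put({"op": "INSERT", "table": "device",
            "row": {"id": 1, "name": "pump-A", "capacity": 5}})
events.put({"op": "UPDATE", "table": "device",
            "row": {"id": 2, "name": "", "capacity": -1}})
events.put({"op": "DELETE", "table": "device", "row": {"id": 3}})
events.put(None)
problems = consume_and_verify(events, [name_not_empty, capacity_positive])
```

Compared with the time-triggered path, each change is checked as soon as it is consumed, so the delay between a bad write and its detection depends on queue latency rather than on the scheduling interval.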
D. Summarize the problem data obtained from verification to produce dashboard data for object scores, system scores, and indicator scores for query and analysis. The score data can also be synchronized through an internal/external network exchange platform to a mobile application platform for query and analysis by mobile phone clients.
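The patent does not give a scoring formula, so the roll-up below is only one plausible sketch. It assumes an attribute score is the percentage of checked records without problems, an object score is the mean of its attribute scores, and a system score is the mean of its object scores. The function names and the sample figures are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def attribute_score(checked: int, problems: int) -> float:
    """Percentage of checked records that passed (assumed formula)."""
    if checked == 0:
        return 100.0
    return round(100.0 * (checked - problems) / checked, 2)


def summarize(results):
    """results: iterable of (system, object, attribute, checked, problems)."""
    per_object = defaultdict(list)
    for system, obj, _attribute, checked, problems in results:
        per_object[(system, obj)].append(attribute_score(checked, problems))
    object_scores = {key: round(mean(vals), 2) for key, vals in per_object.items()}

    per_system = defaultdict(list)
    for (system, _obj), score in object_scores.items():
        per_system[system].append(score)
    system_scores = {name: round(mean(vals), 2) for name, vals in per_system.items()}
    return object_scores, system_scores


results = [
    ("production", "device", "name", 100, 10),
    ("production", "device", "capacity", 100, 30),
    ("production", "line", "length", 50, 0),
]
object_scores, system_scores = summarize(results)
```

With these sample figures the two `device` attributes score 90 and 70, so the object averages 80; the `line` object scores 100, and the `production` system averages 90.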
The embodiment described above is only a preferred embodiment of the invention and does not limit the invention in any form. Any simple modification, change, or equivalent variation made to the above embodiment according to the technical essence of the invention still falls within the protection scope of the technical solution of the invention.
Claims (7)
1. A quasi-real-time monitoring system for production-domain data quality based on big data technology, characterized by comprising:
a data access layer for accessing various types of source data;
a data storage and processing layer, which stores streaming data in a distributed manner and automatically runs the offline batch-processing tasks in the workflow on schedule; and
a data application layer for analyzing data quality in real time and obtaining incremental data.
2. A quasi-real-time monitoring method for production-domain data quality based on big data technology, characterized by specifically comprising the following steps:
A. installing a Linux operating system on a bare-metal physical server and deploying a container-based Hadoop big data platform; and creating in the database the data tables corresponding to the data verification rules and quality characteristics;
B. using a database log parsing tool to monitor changes to the database log, parsing the incremental log in real time, converting it into database operation records, and writing these to the big data platform;
C. after the log information describing data changes in the source database has been written to the big data platform, verifying data quality problem data by either time triggering or event triggering, depending on the real-time requirement;
D. summarizing the problem data obtained from verification to produce dashboard data for object scores, system scores, and indicator scores for query and analysis, wherein the score data can also be synchronized through an internal/external network exchange platform to a mobile application platform for query and analysis by mobile phone clients.
3. The quasi-real-time monitoring method for production-domain data quality based on big data technology according to claim 2, characterized in that the operating system in step A is CentOS or Red Hat.
4. The quasi-real-time monitoring method for production-domain data quality based on big data technology according to claim 2, characterized in that the big data platform in step A includes installation of the Docker container component and the Kubernetes container orchestration component, and deployment of an image registry and an image marketplace.
5. The quasi-real-time monitoring method for production-domain data quality based on big data technology according to claim 2, characterized in that the data verification rules and quality characteristics in step A include verification rule information, system information, object information, attribute information, organization information, evaluation task information, scheduling logs, problem data information, indicator attribute scores, indicator object scores, and indicator system scores.
6. The quasi-real-time monitoring method for production-domain data quality based on big data technology according to claim 2, characterized in that the time-triggering approach in step C is as follows: data acquisition middleware parses the log files of the source database platform into files that the target data platform can use and updates the data into the target database, completing quasi-real-time synchronization of the data; a task scheduling tool then fires on a timer and calls a stored procedure to verify data quality problem data.
7. The quasi-real-time monitoring method for production-domain data quality based on big data technology according to claim 2, characterized in that the event-triggering approach in step C is as follows: data acquisition middleware captures data changes in the source database and sends the change log to a Kafka message queue; the stream processing engine of the target data platform consumes the data in the Kafka message queue and verifies it with a dedicated data verification program.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810789021.4A CN108959616A (en) | 2018-07-18 | 2018-07-18 | Quasi-real-time monitoring system and method for production-domain data quality based on big data technology |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810789021.4A CN108959616A (en) | 2018-07-18 | 2018-07-18 | Quasi-real-time monitoring system and method for production-domain data quality based on big data technology |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108959616A (en) | 2018-12-07 |
Family
ID=64497832
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810789021.4A Pending CN108959616A (en) | Quasi-real-time monitoring system and method for production-domain data quality based on big data technology | | |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108959616A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101990208A (en) * | 2009-07-31 | 2011-03-23 | 中国移动通信集团公司 | Automatic data checking method, system and equipment |
CN105761010A (en) * | 2016-02-24 | 2016-07-13 | 国网山东省电力公司 | Method and system for real-time monitoring of group enterprise audit based on real-time data acquisition |
CN106557991A (en) * | 2016-11-04 | 2017-04-05 | 广东电网有限责任公司电力科学研究院 | Voltage monitoring data platform |
CN107895017A (en) * | 2017-11-14 | 2018-04-10 | 国网江苏省电力公司电力科学研究院 | A kind of electric energy quality monitoring system construction method based on big data technology |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109634937A (en) * | 2018-12-20 | 2019-04-16 | 成都四方伟业软件股份有限公司 | Incremental data acquisition method, apparatus and system |
CN109669936A (en) * | 2018-12-25 | 2019-04-23 | 福建南威软件有限公司 | A kind of mass data quality report generation method based on polymerization model |
CN109960629A (en) * | 2019-03-14 | 2019-07-02 | 银清科技(北京)有限公司 | To the method and apparatus of payment system portfolio real time monitoring |
CN109960629B (en) * | 2019-03-14 | 2023-06-16 | 银清科技有限公司 | Method and device for monitoring service volume of payment system in real time |
CN111078496A (en) * | 2019-11-29 | 2020-04-28 | 联想(北京)有限公司 | Data monitoring method, platform and storage medium |
CN111339194A (en) * | 2020-02-24 | 2020-06-26 | 平安科技(深圳)有限公司 | Automatic scheduling method and device for middleware of database access layer |
CN111625583A (en) * | 2020-05-21 | 2020-09-04 | 广西电网有限责任公司 | Service data processing method and device, computer equipment and storage medium |
CN111625583B (en) * | 2020-05-21 | 2022-07-29 | 广西电网有限责任公司 | Business data processing method and device, computer equipment and storage medium |
CN111737242A (en) * | 2020-06-19 | 2020-10-02 | 福建南威软件有限公司 | Method for monitoring mass data processing process |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108959616A (en) | Quasi-real-time monitoring system and method for production-domain data quality based on big data technology | |
US20190317944A1 (en) | Methods and apparatus for integrated management of structured data from various sources and having various formats | |
US10540335B2 (en) | Solution to generate a scriptset for an automated database migration | |
US10685359B2 (en) | Identifying clusters for service management operations | |
US11062324B2 (en) | Identifying clusters for service management operations | |
CN112100219B (en) | Report generation method, device, equipment and medium based on database query processing | |
JP2019207687A (en) | Generating project deliverables using objects of data model | |
CN110032594B (en) | Customizable data extraction method and device for multi-source database and storage medium | |
CN112651218A (en) | Automatic generation method and management method of bidding document, medium and computer | |
WO2023130978A1 (en) | System and method for calling resource service application from digital middle office of enterprise | |
CN116302829A (en) | Data monitoring method, device, equipment and storage medium | |
CN104182829A (en) | Instrument development reliability management and support system | |
CN115167785B (en) | Label-based network disk file management method and device, network disk and storage medium | |
CN113407980B (en) | Data annotation system | |
CN114218100A (en) | Business model testing method, device, system, equipment, medium and program product | |
CN112862264A (en) | Enterprise operation condition analysis method, computer device and computer storage medium | |
Ribeiro et al. | Improving productive processes using a process mining approach | |
Grylitska | The Digital Audit as a Key Element of Ukraine's Way out from COVID-19 | |
US11093876B2 (en) | System and methods employed for accountability of an asset | |
US11580580B2 (en) | Customer review and ticket management system | |
CN115032957B (en) | Production scheduling method and device, storage medium and electronic equipment | |
US11816621B2 (en) | Multi-computer tool for tracking and analysis of bot performance | |
CN114240397A (en) | Task management system, method, device, medium, and program product | |
CN114240236A (en) | Enterprise application management processing system and management method | |
CN114926099A (en) | Automatic defect filling method for power grid dispatching EMS data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20181207 |