CN110659276A - Computer data statistical system and statistical classification method thereof - Google Patents

Computer data statistical system and statistical classification method thereof

Publication number
CN110659276A
Authority
CN
China
Prior art keywords
data
module
unit
classification
processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910910589.1A
Other languages
Chinese (zh)
Inventor
张琪
宋仪轩
刘苗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Healthcare Big Data Protection And Development Co Ltd
Original Assignee
Jiangsu Healthcare Big Data Protection And Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Healthcare Big Data Protection And Development Co Ltd filed Critical Jiangsu Healthcare Big Data Protection And Development Co Ltd
Priority to CN201910910589.1A priority Critical patent/CN110659276A/en
Publication of CN110659276A publication Critical patent/CN110659276A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21Design, administration or maintenance of databases
    • G06F16/215Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25Integrating or interfacing systems involving database management systems
    • G06F16/258Data format conversion from or to a database

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of data statistics, and in particular to a computer data statistics system and a statistical classification method thereof. The system comprises a data acquisition unit, a data cleaning unit, a data classifying unit and a data storage unit, wherein the data acquisition unit acquires front-end data, the data cleaning unit performs data cleaning on the acquired data, the data classifying unit classifies the data according to data type, and the data storage unit stores the classified data. In the computer data statistical system and the statistical classification method thereof, the data acquisition unit ensures the integrity of data acquisition and prevents data from being lost during acquisition, the data classifying unit classifies the data according to data type, and the data storage unit stores the classified data by class, which makes the data easier to call and search.

Description

Computer data statistical system and statistical classification method thereof
Technical Field
The invention relates to the technical field of data statistics, in particular to a computer data statistics system and a statistical classification method thereof.
Background
With the arrival of the big data era, the quality of data statistics has become increasingly important, and data statistics are commonly implemented on a distributed architecture. However, existing data statistics systems cannot preprocess data when it is collected at the front end, so data types are inconsistent and data may even be lost, which compromises the integrity of data collection. In addition, when the data is stored, it cannot be classified and stored according to the relevance between data items, which makes later searching inconvenient.
Disclosure of Invention
The present invention is directed to a computer data statistics system and a statistical classification method thereof, so as to solve the problems in the background art.
In order to achieve the above object, in one aspect, the present invention provides a computer data statistics system, including a data acquisition unit, a data cleaning unit, a data classification unit and a data storage unit, where the data acquisition unit is configured to acquire front-end data, the data cleaning unit is configured to perform data cleaning operation on the acquired data, the data classification unit is configured to classify the data according to data type, and the data storage unit is configured to store the classified data, and the data statistics system has the following processes:
S1, collecting front-end data through a collection node;
S2, carrying out data cleaning processing on the acquired data;
S3, classifying the cleaned data;
S4, storing the classified data.
Preferably, the data acquisition unit acquires data according to the following flow:
S11, front-end data acquisition, wherein the front-end data is acquired through an acquisition node;
S12, data signal conditioning, wherein the analog output of each acquisition node is respectively subjected to signal conversion so as to meet the requirement of the input end of the analog/digital converter on the input signal;
S13, storing the sampling signals, converting the continuous signals into discontinuous sampling signals, and converting the discontinuous sampling signals into continuous signals;
S14, converting the analog quantity signal into a digital quantity signal;
S15, processing the sampled digital signals by digital signal processing.
Preferably, the sampled signal is stored using a unit pulse sequence function, which is expressed by the following formula:
[Formula image BDA0002214584120000021 not reproduced: unit pulse sequence function]
Preferably, the data cleansing unit comprises the following modules:
Module 1: the error correcting module, which corrects the various forms of data error;
Module 2: the repeated item deleting module, which deletes repeated records or repeated fields present in the data;
Module 3: the unified specification module, which unifies data specifications and abstracts out consistent content;
Module 4: the correction logic module, which determines the logic, conditions and caliber of each source system and corrects the acquisition logic of abnormal source systems;
Module 5: the conversion construction module, which standardizes the data;
Module 6: the data compression module, which maintains the integrity and accuracy of the original data set and reorganizes the data according to a certain algorithm and method without losing useful information;
Module 7: the data supplementing module, which fills in incomplete data;
Module 8: the data discarding module, which deletes abnormal data.
Preferably, the data storage unit flow is as follows:
S41, establishing a cloud environment storage system, and establishing a large-scale cloud environment data storage system according to related storage nodes;
S42, decomposing the data processing tasks in the cloud environment data storage system into small tasks, and decomposing a large set area of data into small areas;
S43, data parallel processing, in which a plurality of processing tasks are processed in parallel.
Preferably, the data parallel processing formula is as follows:
[Formula image BDA0002214584120000031 not reproduced: data parallel processing formula]
Suppose R is a large amount of data to be stored, with k attributes, where A1, A2, ..., Ai, ..., Ak represent the attributes of the data and Ai is the data stored on the m-th node;
wherein the large amount of data R is represented as:
[Formula image BDA0002214584120000032 not reproduced: representation of R]
In another aspect, the present invention provides a method for statistical classification of computer data, which uses any one of the above computer data statistical systems and comprises the following steps:
S31, preprocessing source data, and providing management of algorithm learning samples and management of selecting an optimal algorithm;
S32, data distribution processing and analysis, in which resources are reasonably distributed according to the processing capacity of different processors;
S33, integrating the classification results, in which the results processed by different processors are integrated using the following classification integration formula:
[Formula image BDA0002214584120000033 not reproduced: classification integration formula]
where Pc is the accuracy and N is the number of processors.
Preferably, in S31, the source data preprocessing specifically includes the following steps:
S311, filtering and extracting source data, namely filtering and extracting source data information;
S312, learning sample selection, wherein data are randomly sampled so that the learning samples fully reflect the overall distribution of the data to be classified, and the set number of learning samples is extracted from the different source data according to the distribution of the source data;
S313, comparing sample results, wherein the samples are classified by processors running different algorithms in the distributed system, the classification results are compared, and the accuracy of the different algorithms on the same sample is counted and recorded as result data;
S314, selecting the optimal algorithm, wherein the accuracy of the different algorithms is compared in detail and the optimal algorithm is selected as the main algorithm for classifying the data.
Compared with the prior art, the invention has the beneficial effects that:
1. In the computer data statistical system and the statistical classification method thereof, a data acquisition unit is provided to acquire front-end data through acquisition nodes and to perform signal conditioning, sampled-signal storage, analog-to-digital conversion and digital signal processing on the acquired data, which ensures the integrity of data acquisition and prevents data from being lost during acquisition.
2. In the computer data statistical system and the statistical classification method thereof, a data cleaning unit is arranged, and data is cleaned through an error correcting module, a repeated item deleting module, a unified specification module, a correction logic module, a conversion construction module, a data compression module, a data supplementing module and a data discarding module, so that the data error rate is reduced, and meanwhile, the data occupation amount is reduced.
3. In the computer data statistical system and the statistical classification method thereof, the data classification unit is arranged to classify data according to data types, and the classified data is stored through the data storage unit, so that the data is classified and stored, and the data is convenient to call and search.
Drawings
FIG. 1 is a block diagram of a data statistics system unit of the present invention;
FIG. 2 is a flow chart of a data statistics system of the present invention;
FIG. 3 is a flow chart of a data acquisition unit of the present invention;
FIG. 4 is a schematic diagram of the data signal conditioning operation of the present invention;
FIG. 5 is a block diagram of a data cleansing unit according to the present invention;
FIG. 6 is a flow chart of a data storage unit of the present invention;
FIG. 7 is a flow chart of a data sorting unit according to the present invention;
FIG. 8 is a flow chart of source data preprocessing of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example 1
Referring to fig. 1 to 6, the present invention provides a computer data statistics system, which includes a data acquisition unit, a data cleaning unit, a data classification unit, and a data storage unit, wherein the data acquisition unit is configured to acquire front-end data, the data cleaning unit is configured to perform a data cleaning operation on the acquired data, the data classification unit is configured to classify the data according to data types, the data storage unit is configured to store the classified data, and a flow of the data statistics system is as follows:
S1, collecting front-end data through a collection node;
S2, carrying out data cleaning processing on the acquired data;
S3, classifying the cleaned data;
S4, storing the classified data.
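The four-step flow S1 to S4 can be sketched as a chain of small functions. This is only an illustrative sketch, not the patent's implementation: the function names, the in-memory dictionary used as the store, the use of Python type names as the "data types", and the cleaning rule (drop missing and duplicate readings) are all assumptions made for the example.

```python
from collections import defaultdict


def collect(nodes):
    """S1: gather readings from every collection node (each node is an iterable)."""
    return [reading for node in nodes for reading in node]


def clean(records):
    """S2: drop missing readings and exact duplicates, keeping first-seen order."""
    seen = set()
    cleaned = []
    for record in records:
        key = (type(record), record)  # keep 3 and 3.0 distinct
        if record is None or key in seen:
            continue
        seen.add(key)
        cleaned.append(record)
    return cleaned


def classify(records):
    """S3: group records by data type (here, the Python type name)."""
    groups = defaultdict(list)
    for record in records:
        groups[type(record).__name__].append(record)
    return dict(groups)


def store(groups, storage):
    """S4: append each class of data to its own bucket in the store."""
    for kind, items in groups.items():
        storage.setdefault(kind, []).extend(items)
    return storage


def run_pipeline(nodes, storage=None):
    """Run S1 -> S2 -> S3 -> S4 and return the updated store."""
    storage = {} if storage is None else storage
    return store(classify(clean(collect(nodes))), storage)
```

Keeping each step as a separate function mirrors the unit structure described above, so any one step could be swapped out without touching the others.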
In this embodiment, the data acquisition process of the data acquisition unit is as follows:
S11, front-end data acquisition, wherein the front-end data is acquired through an acquisition node;
S12, data signal conditioning, wherein the analog output of each acquisition node is respectively subjected to signal conversion so as to meet the requirement of the input end of the analog/digital converter on the input signal;
S13, storing the sampling signals, converting the continuous signals into discontinuous sampling signals, and converting the discontinuous sampling signals into continuous signals;
S14, converting the analog quantity signal into a digital quantity signal;
S15, processing the sampled digital signals by digital signal processing.
The collection nodes are sensors, including temperature sensors, humidity sensors, image sensors, sound sensors and the like, so that a suitable collection node can be selected according to the type of front end to gather the data the front end needs to collect.
Further, data signal conditioning performs signal conversion on the analog output of each sensor so that the output meets the requirement of the input end of the analog/digital converter on the input signal. The functions of the signal conditioning module generally include static processing such as signal switching, signal conversion, signal amplification, calibration, linearization and compensation. The principle is shown in FIG. 4: a sensor signal enters at the J-IN port, switches S1-S5 are then selected according to the signal type, and the conditioned signal is taken from the J-OUT port and sent to the A/D conversion module. In the figure, DGND and VDD denote the sensor-side digital power supply; AGND, V+5 and V-5 denote the sensor-side analog power supply. The equivalent circuit equations in the standard signal mode are as follows:
[Formula image BDA0002214584120000051 not reproduced: equation (1)]
[Formula image BDA0002214584120000052 not reproduced: equation (2)]
$$V_{O-} = 0 \tag{3}$$
In the formulas, $R_x$ is the on-resistance of the MAX383 and $R_w$ is that of the AQW21X. When $R_7 = R_8$ and $R_w < R_2$, formulas (1) and (2) give:
$$V_{O+} - V_{O-} = \frac{1}{2} \cdot \frac{R_2}{R_1 + R_2} \cdot V_{A+} \tag{4}$$
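As a quick numeric check of equation (4), the differential output can be computed directly from R1, R2 and VA+. The function name and the resistor and voltage values below are hypothetical, chosen only to illustrate the arithmetic; the source gives no component values.

```python
def differential_output(r1, r2, v_a_plus):
    """Equation (4): VO+ - VO- = (1/2) * R2 / (R1 + R2) * VA+."""
    if r1 + r2 == 0:
        raise ValueError("R1 + R2 must be non-zero")
    return 0.5 * r2 / (r1 + r2) * v_a_plus


# Hypothetical values: R1 = R2 = 10 kOhm, VA+ = 5 V.
print(differential_output(10e3, 10e3, 5.0))  # 1.25
```

With equal resistors the divider halves VA+, and the leading factor of 1/2 halves it again, giving a quarter of the input.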
Specifically, the sampled signal is stored by using a unit pulse sequence function, and the formula is as follows:
[Formula image BDA0002214584120000061 not reproduced: unit pulse sequence function]
In the step of converting the sampled signal back into a continuous signal, a zero-order hold converts the sampled signal into a signal that keeps a constant value between two consecutive sampling instants; that is, in the interval t ∈ [nT, (n+1)T), the output of the zero-order hold stays at x(nT).
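The sampling and zero-order hold behaviour described above can be sketched in a few lines. This is a minimal sketch under stated assumptions: the function names are hypothetical, the signal is given as a Python callable, and sampling starts at n = 0.

```python
import math


def sample(signal, period, count):
    """Sample a continuous signal at t = nT for n = 0 .. count-1."""
    return [signal(n * period) for n in range(count)]


def zero_order_hold(samples, period, t):
    """Return x(nT) for the n that satisfies nT <= t < (n+1)T."""
    if t < 0:
        raise ValueError("t must be non-negative")
    n = math.floor(t / period)
    if n >= len(samples):
        raise ValueError("t lies beyond the last stored sample")
    return samples[n]


# Hypothetical signal x(t) = t^2 sampled with T = 0.5.
samples = sample(lambda t: t * t, 0.5, 4)
print(samples)                              # [0.0, 0.25, 1.0, 2.25]
print(zero_order_hold(samples, 0.5, 0.75))  # 0.25
```

The period 0.5 is used because it is exactly representable in binary floating point; with a period such as 0.1, `t / period` can land just below an integer, so a production version would need a tolerance near the sampling instants.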
In a further aspect, the data cleansing unit includes the following modules:
Module 1: the error correcting module, which corrects data value errors, data type errors, data coding errors, data format errors, data anomalies, dependency conflicts and multi-value errors;
Module 2: the repeated item deleting module, which deletes repeated records or repeated fields in the data. The basic idea for detecting repeated items is "sort and merge": the records in the database are first sorted according to a certain rule, and adjacent records are then compared for similarity to detect whether they are repeated;
Module 3: the unified specification module, which unifies data specifications and abstracts out consistent content;
Module 4: the correction logic module, which determines the logic, conditions and caliber of each source system and corrects the acquisition logic of abnormal source systems;
Module 5: the conversion construction module, which standardizes the data, including data type conversion, data semantic conversion, data granularity conversion, table/data splitting, row-column conversion, data discretization, data standardization, new field refinement and attribute construction;
Module 6: the data compression module, which maintains the integrity and accuracy of the original data set and reorganizes the data according to a certain algorithm and method without losing useful information. Complex analysis and computation over large-scale data usually take a long time, so the data must be reduced and compressed beforehand to shrink the data scale; this enables interactive data mining, with feedback based on comparing the data before and after mining. Mining on the reduced data set is clearly more efficient, and the mined results are essentially the same as those obtained from the original data set;
Module 7: the data supplementing module, which fills in incomplete data. Data supplementation covers missing values and null values: a missing value refers to a case where the data must exist but is actually absent, and a null value refers to a case where the data may actually exist;
Module 8: the data discarding module, which deletes abnormal data. Discarding takes two forms, whole-case deletion and variable deletion. Whole-case deletion removes any sample that contains a missing value. Variable deletion may be considered when a variable has many invalid or missing values and is not particularly important to the problem under study; it reduces the number of variables available for analysis but leaves the sample size unchanged.
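The "sort and merge" idea used by the repeated item deleting module can be sketched as follows. This is an illustrative sketch only: the source does not define the sorting rule or the similarity test, so the example assumes records are dictionaries and treats two records as repeated when their field values match after trimming whitespace and lower-casing. The function names are hypothetical.

```python
def normalize(record):
    """Sort key: trimmed, lower-cased field values taken in field-name order."""
    return tuple(str(record[name]).strip().lower() for name in sorted(record))


def deduplicate(records):
    """Sort by normalized key, then keep one record per run of equal neighbours."""
    ordered = sorted(records, key=normalize)
    unique = []
    previous_key = None
    for record in ordered:
        key = normalize(record)
        if key != previous_key:  # only the adjacent record needs comparing
            unique.append(record)
            previous_key = key
    return unique


records = [
    {"name": "Ann ", "city": "Nanjing"},
    {"name": "bob", "city": "Suzhou"},
    {"name": "ann", "city": "nanjing"},
]
print(len(deduplicate(records)))  # 2
```

Sorting first means each record is compared only with its neighbour, so the cost is dominated by the sort rather than by comparing every pair of records.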
It should be noted that the data storage unit flow is as follows:
S41, establishing a cloud environment storage system, and establishing a large-scale cloud environment data storage system according to related storage nodes;
S42, decomposing the data processing tasks in the cloud environment data storage system into small tasks, and decomposing a large set area of data into small areas;
S43, data parallel processing, in which a plurality of processing tasks are processed in parallel.
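Steps S42 and S43 (split a large task into small ones, then run them in parallel) can be sketched with the Python standard library. The chunk size, the per-chunk work (a simple sum) and the use of a thread pool are assumptions made for illustration; the source does not specify how tasks are decomposed or scheduled.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, chunk_size):
    """S42: decompose one large data set into small regions of at most chunk_size items."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def process_chunk(chunk):
    """One small task; summing stands in for whatever processing is required."""
    return sum(chunk)


def process_in_parallel(data, chunk_size, workers=4):
    """S43: run the small tasks concurrently and combine their partial results."""
    chunks = split(data, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(process_chunk, chunks))
    return sum(partials)


print(process_in_parallel(list(range(10)), chunk_size=3))  # 45
```

A thread pool keeps the example self-contained; for CPU-bound work in CPython, or for storage nodes on separate machines, a process pool or a distributed scheduler would be the more likely choice.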
The data parallel processing formula is as follows:
[Formula image BDA0002214584120000071 not reproduced: data parallel processing formula]
Suppose R is a large amount of data to be stored, with k attributes, where A1, A2, ..., Ai, ..., Ak represent the attributes of the data and Ai is the data stored on the m-th node;
wherein the large amount of data R is represented as:
[Formula image BDA0002214584120000072 not reproduced: representation of R]
Example 2
Referring to fig. 7-8, the present invention provides a computer data statistical classification method, including any one of the above computer data statistical systems, including the following steps:
S31, preprocessing source data, and providing management of algorithm learning samples and management of selecting an optimal algorithm;
S32, data distribution processing and analysis, in which resources are reasonably distributed according to the processing capacity of different processors;
S33, integrating the classification results, in which the results processed by different processors are integrated using the following classification integration formula:
[Formula image BDA0002214584120000081 not reproduced: classification integration formula]
where Pc is the accuracy and N is the number of processors.
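Step S32 distributes work according to the processing capacity of each processor, but the source does not say how the shares are computed. One possible sketch, assuming each processor has a numeric capacity score and using the largest-remainder method to keep the shares integral, is shown below; the function name and the capacity values are hypothetical.

```python
def allocate(task_count, capacities):
    """Split task_count tasks across processors in proportion to their capacity."""
    total = sum(capacities)
    if task_count < 0 or total <= 0 or any(c < 0 for c in capacities):
        raise ValueError("task_count and capacities must be non-negative, with total capacity > 0")
    exact = [task_count * c / total for c in capacities]
    shares = [int(x) for x in exact]  # round every share down first
    leftover = task_count - sum(shares)
    # Hand the remaining tasks to the processors with the largest fractional parts.
    by_remainder = sorted(range(len(capacities)),
                          key=lambda i: exact[i] - shares[i],
                          reverse=True)
    for i in by_remainder[:leftover]:
        shares[i] += 1
    return shares


print(allocate(10, [1, 2, 2]))  # [2, 4, 4]
```

The shares always sum to the task count, so no task is dropped or assigned twice.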
In S31, the source data preprocessing specifically includes the following steps:
S311, filtering and extracting source data, namely filtering and extracting source data information;
S312, learning sample selection, wherein data are randomly sampled so that the learning samples fully reflect the overall distribution of the data to be classified, and the set number of learning samples is extracted from the different source data according to the distribution of the source data;
S313, comparing sample results, wherein the samples are classified by processors running different algorithms in the distributed system, the classification results are compared, and the accuracy of the different algorithms on the same sample is counted and recorded as result data;
S314, selecting the optimal algorithm, wherein the accuracy of the different algorithms is compared in detail and the optimal algorithm is selected as the main algorithm for classifying the data.
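Steps S313 and S314 amount to scoring several candidate algorithms on the same labelled sample and keeping the most accurate one. A minimal sketch follows, assuming each algorithm is a plain Python callable that maps an input to a label; the function names, the threshold classifiers and the sample data are hypothetical.

```python
def accuracy(classifier, samples, labels):
    """Fraction of samples for which the classifier returns the expected label."""
    if not samples or len(samples) != len(labels):
        raise ValueError("samples and labels must be non-empty and of equal length")
    correct = sum(1 for x, y in zip(samples, labels) if classifier(x) == y)
    return correct / len(samples)


def select_best(classifiers, samples, labels):
    """S313: score every algorithm on the same sample. S314: pick the most accurate."""
    scores = {name: accuracy(fn, samples, labels) for name, fn in classifiers.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical sample and two hypothetical threshold classifiers.
samples = [1.0, 2.0, 3.0, 4.0]
labels = ["low", "low", "high", "high"]
classifiers = {
    "threshold_2.5": lambda x: "high" if x > 2.5 else "low",
    "threshold_3.5": lambda x: "high" if x > 3.5 else "low",
}
best, scores = select_best(classifiers, samples, labels)
print(best, scores)  # threshold_2.5 {'threshold_2.5': 1.0, 'threshold_3.5': 0.75}
```

Returning the full score table alongside the winner keeps the "result data" of S313 available for later inspection.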
In the learning sample selection of S312, assume that the total number of samples is N and that the K sub-sample sets have sizes {N1, N2, …}. Given a set total number M of learning samples, M × Nk/N items are randomly selected from the k-th sub-sample set, and the selections are recombined to obtain the sample set required for machine learning.
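The proportional selection rule above (take M × Nk/N items from the k-th subset) is a form of stratified sampling and can be sketched as follows. The rounding of M × Nk/N to an integer and the optional seed are assumptions; the source does not say how fractional quotas are handled.

```python
import random


def stratified_sample(subsets, m, seed=None):
    """Draw about m items in total, taking round(m * Nk / N) from the k-th subset."""
    n = sum(len(subset) for subset in subsets)
    if n == 0 or not 0 <= m <= n:
        raise ValueError("m must lie between 0 and the total number of samples")
    rng = random.Random(seed)
    selected = []
    for subset in subsets:
        quota = min(len(subset), round(m * len(subset) / n))
        selected.extend(rng.sample(subset, quota))  # sampling without replacement
    return selected


# Hypothetical source data: N = 100 samples split 60 / 30 / 10, with M = 10.
subsets = [list(range(60)), list(range(100, 130)), list(range(200, 210))]
picked = stratified_sample(subsets, 10, seed=0)
print(len(picked))  # 10
```

Because of rounding, the total can differ slightly from M when the quotas are not whole numbers; in this example the quotas are exactly 6, 3 and 1.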
The foregoing shows and describes the general principles, essential features and advantages of the invention. Those skilled in the art will understand that the present invention is not limited to the embodiments described above; the embodiments and the description merely illustrate preferred embodiments of the invention and are not intended to limit it. The scope of the invention is defined by the appended claims and their equivalents.

Claims (8)

1. A computer data statistical system, comprising a data acquisition unit, a data cleaning unit, a data classification unit and a data storage unit, characterized in that: the data acquisition unit is used for collecting front-end data, the data cleaning unit is used for carrying out data cleaning operation on the collected data, the data classification unit is used for classifying the data according to data types, the data storage unit is used for storing the classified data, and the data statistical system has the following flow:
S1, collecting front-end data through a collection node;
S2, carrying out data cleaning processing on the acquired data;
S3, classifying the cleaned data;
S4, storing the classified data.
2. The computer data statistics system of claim 1, wherein: the data acquisition unit acquires data in the following flow:
S11, front-end data acquisition, wherein the front-end data is acquired through an acquisition node;
S12, data signal conditioning, wherein the analog output of each acquisition node is respectively subjected to signal conversion so as to meet the requirement of the input end of the analog/digital converter on the input signal;
S13, storing the sampling signals, converting the continuous signals into discontinuous sampling signals, and converting the discontinuous sampling signals into continuous signals;
S14, converting the analog quantity signal into a digital quantity signal;
S15, processing the sampled digital signals by digital signal processing.
3. The computer data statistics system of claim 2, wherein: the sampled signal is stored and described by a unit pulse sequence function, and the formula is as follows:
[Formula image FDA0002214584110000011 not reproduced: unit pulse sequence function]
4. The computer data statistics system of claim 1, wherein: the data cleaning unit comprises the following modules:
Module 1: the error correcting module, which corrects the various forms of data error;
Module 2: the repeated item deleting module, which deletes repeated records or repeated fields present in the data;
Module 3: the unified specification module, which unifies data specifications and abstracts out consistent content;
Module 4: the correction logic module, which determines the logic, conditions and caliber of each source system and corrects the acquisition logic of abnormal source systems;
Module 5: the conversion construction module, which standardizes the data;
Module 6: the data compression module, which maintains the integrity and accuracy of the original data set and reorganizes the data according to a certain algorithm and method without losing useful information;
Module 7: the data supplementing module, which fills in incomplete data;
Module 8: the data discarding module, which deletes abnormal data.
5. The computer data statistics system of claim 1, wherein: the data storage unit flow is as follows:
S41, establishing a cloud environment storage system, and establishing a large-scale cloud environment data storage system according to related storage nodes;
S42, decomposing the data processing tasks in the cloud environment data storage system into small tasks, and decomposing a large set area of data into small areas;
S43, data parallel processing, in which a plurality of processing tasks are processed in parallel.
6. The computer data statistics system of claim 5, wherein: the data parallel processing formula is as follows:
[Formula image FDA0002214584110000021 not reproduced: data parallel processing formula]
Suppose R is a large amount of data to be stored, with k attributes, where A1, A2, ..., Ai, ..., Ak represent the attributes of the data and Ai is the data stored on the m-th node;
wherein the large amount of data R is represented as:
[Formula image FDA0002214584110000022 not reproduced: representation of R]
7. A method of statistical classification of computer data comprising the computer data statistics system of any of claims 1-6, comprising the steps of:
S31, preprocessing source data, and providing management of algorithm learning samples and management of selecting an optimal algorithm;
S32, data distribution processing and analysis, in which resources are reasonably distributed according to the processing capacity of different processors;
S33, integrating the classification results, in which the results processed by different processors are integrated using the following classification integration formula:
[Formula image FDA0002214584110000031 not reproduced: classification integration formula]
where Pc is the accuracy and N is the number of processors.
8. The statistical classification method of computer data according to claim 7, characterized in that: in S31, the source data preprocessing specifically includes the following steps:
S311, filtering and extracting source data, namely filtering and extracting source data information;
S312, learning sample selection, wherein data are randomly sampled so that the learning samples fully reflect the overall distribution of the data to be classified, and the set number of learning samples is extracted from the different source data according to the distribution of the source data;
S313, comparing sample results, wherein the samples are classified by processors running different algorithms in the distributed system, the classification results are compared, and the accuracy of the different algorithms on the same sample is counted and recorded as result data;
S314, selecting the optimal algorithm, wherein the accuracy of the different algorithms is compared in detail and the optimal algorithm is selected as the main algorithm for classifying the data.
CN201910910589.1A 2019-09-25 2019-09-25 Computer data statistical system and statistical classification method thereof Pending CN110659276A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910910589.1A CN110659276A (en) 2019-09-25 2019-09-25 Computer data statistical system and statistical classification method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910910589.1A CN110659276A (en) 2019-09-25 2019-09-25 Computer data statistical system and statistical classification method thereof

Publications (1)

Publication Number Publication Date
CN110659276A true CN110659276A (en) 2020-01-07

Family

ID=69039120

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910910589.1A Pending CN110659276A (en) 2019-09-25 2019-09-25 Computer data statistical system and statistical classification method thereof

Country Status (1)

Country Link
CN (1) CN110659276A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110807449A (en) * 2020-01-08 2020-02-18 杭州皓智天诚信息科技有限公司 Science and technology project application on-line service terminal
CN112559742A (en) * 2020-12-08 2021-03-26 北京伟杰东博信息科技有限公司 Classified storage method and system thereof
CN113407522A (en) * 2021-06-18 2021-09-17 上海市第十人民医院 Data processing method and device, computer equipment and computer readable storage medium
CN114124300A (en) * 2021-11-11 2022-03-01 广东电网有限责任公司广州供电局 Converter valve system, data processing method, electronic equipment and storage medium
CN115982503A (en) * 2023-02-07 2023-04-18 梁礼津 Website information acquisition method and system based on cloud platform

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140280173A1 (en) * 2013-03-13 2014-09-18 Msc Intellectual Properties B.V. System and method for real-time dynamic measurement of best-estimate quality levels while reviewing classified or enriched data
CN106056164A (en) * 2016-06-13 2016-10-26 北京邮电大学 Classification forecasting method based on Bayesian network
CN106874927A (en) * 2016-12-27 2017-06-20 合肥阿巴赛信息科技有限公司 Construction method and system for a random strong classifier
CN107391390A (en) * 2017-07-04 2017-11-24 深圳齐心集团股份有限公司 Computer big data storage system
CN108052665A (en) * 2017-12-29 2018-05-18 深圳市中易科技有限责任公司 Data cleaning method and device based on a distributed platform
CN109918458A (en) * 2019-01-24 2019-06-21 杭州志远科技有限公司 Comprehensive geographic information data processing system
CN110134727A (en) * 2019-04-03 2019-08-16 清华大学天津高端装备研究院 Data collection and transmission serving the manufacturing execution level

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
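王广君 et al.: "Sensor Technology and Experiments" (《传感器技术及实验》), 30 June 2013, China University of Geosciences Press (中国地质大学出版社) *
陈盛荣, 刘广钟: "Research on Optimization Strategies for ETL Systems in Distributed Environments" (分布式环境下ETL系统的优化策略研究), Modern Computer (Professional Edition) (《现代计算机(专业版)》) *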

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110807449A (en) * 2020-01-08 2020-02-18 杭州皓智天诚信息科技有限公司 Science and technology project application on-line service terminal
CN112559742A (en) * 2020-12-08 2021-03-26 北京伟杰东博信息科技有限公司 Classified storage method and system thereof
CN113407522A (en) * 2021-06-18 2021-09-17 上海市第十人民医院 Data processing method and device, computer equipment and computer readable storage medium
CN114124300A (en) * 2021-11-11 2022-03-01 广东电网有限责任公司广州供电局 Converter valve system, data processing method, electronic equipment and storage medium
CN114124300B (en) * 2021-11-11 2023-10-20 广东电网有限责任公司广州供电局 Converter valve system, data processing method, electronic equipment and storage medium
CN115982503A (en) * 2023-02-07 2023-04-18 梁礼津 Website information acquisition method and system based on cloud platform
CN115982503B (en) * 2023-02-07 2023-10-13 深圳慧梧科技有限公司 Website information acquisition method and system based on cloud platform

Similar Documents

Publication Publication Date Title
CN110659276A (en) Computer data statistical system and statistical classification method thereof
Liu et al. Accumulating regional density dissimilarity for concept drift detection in data streams
CN109582551B (en) Log data analysis method and device, computer equipment and storage medium
CN111177276B (en) Spark computing framework-based kinetic energy data processing system and method
EP3709127A1 (en) Novel olap precomputation model and precomputation result generation method
EP3432520A1 (en) Efficient storage and querying of time series metrics
Vyawahare et al. A hybrid database approach using graph and relational database
CN110389950B (en) Rapid running big data cleaning method
CN109408383B (en) Java memory leak analysis method and device
CN115080565A (en) Multi-source data unified processing system based on big data engine
CN113010484A (en) Log file management method and device
CN110598042A (en) Incremental update-based video structured real-time updating method and system
CN110879805A (en) Data anomaly discovery method and device, server and storage medium
CN109800221A (en) Mass data association relationship analysis method, apparatus and system
CN113569879B (en) Training method of abnormal recognition model, abnormal account recognition method and related device
CN109308293B (en) Database and table dividing method for large concurrent database
CN113778996A (en) Large data stream data processing method and device, electronic equipment and storage medium
CN107818177B (en) Business intelligent model building method and building device
CN113360564A (en) ETL-based data stream processing method, system, device and readable storage medium
US8180982B2 (en) Archival and retrieval of data using linked pages and value compression
CN117251532B (en) Large-scale literature mechanism disambiguation method based on dynamic multistage matching
CN115599843A (en) Big data processing method and system based on data analysis
CN117131251B (en) Multidimensional data analysis processing system and method based on cloud computing
CN109635023B (en) Lightweight custom source data decomposition reading system and method based on ETL
CN116431618A (en) Data missing prediction and exception correction method and device based on multidimensional associated fields

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 2020-01-07)