CN109977271B - Big data processing system and processing method thereof

Info

Publication number: CN109977271B
Application number: CN201910354112.XA
Authority: CN
Inventors: 宋顶利; 张昕; 周建新
Original assignee: Chongqing Hanniu Technology Innovation Service Co ltd
Current assignee: Chongqing Kailin Jianguan Technology Co.,Ltd.
Priority date: 2019-04-29
Filing date: 2019-04-29
Publication date: 2022-12-20
Anticipated expiration: 2039-04-29
Also published as: CN109977271A

Abstract

The invention discloses a big data processing system comprising a data format conversion module, a data mapping module, a mapping data processing module and an original data processing module. The data format conversion module converts the data format of original data to form data to be processed; the data mapping module establishes a mapping set of the data to be processed; the mapping data processing module processes the data of the mapping set; and the original data processing module processes the original data according to the processing result of the mapping set data. The invention overcomes shortcomings of the prior art and improves data processing efficiency.

Description

Big data processing system and processing method thereof

Technical Field

The invention relates to the technical field of big data, in particular to a big data processing system and a big data processing method.

Background

In recent years, big data technology has developed rapidly and has been widely applied in many fields. Although big data applications do not demand high data processing precision, the data come from numerous sources and in large volumes, so the hardware requirements for data processing remain high, which limits the further adoption of big data technology.

Disclosure of Invention

The technical problem to be solved by the invention is to provide a big data processing system and a processing method thereof that overcome the shortcomings of the prior art and improve data processing efficiency.

In order to solve the technical problems, the technical scheme adopted by the invention is as follows.

A big data processing system comprises:

a data format conversion module, used for converting the data format of the original data to form data to be processed;

a data mapping module, used for establishing a mapping set of the data to be processed;

a mapping data processing module, used for processing the data of the mapping set;

and an original data processing module, used for processing the original data according to the processing result of the mapping set data.

A processing method of the big data processing system comprises the following steps:

A. the data format conversion module converts the data format of the original data, and the resulting data to be processed comprises a data table header section, a data feature section, a mapping rule section and a data content section;

B. the data mapping module establishes a mapping set of the data to be processed and records the mapping rule in the mapping rule section;

C. the mapping data processing module processes the data of the mapping set;

D. the original data processing module processes the original data according to the processing result of the mapping set data.
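The four-section layout named in step A can be pictured with a small sketch. The patent does not specify field types or how features are derived, so the `Record` class, the `convert` function and the choice of mean and range as features are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Record:
    """Hypothetical container for the four sections named in step A."""
    header: dict                                  # data table header section
    features: List[float]                         # data feature section
    mapping_rule: Optional[Tuple[float, float]]   # mapping rule section, filled in step B
    content: List[float]                          # data content section


def convert(raw: List[float], source: str) -> Record:
    """Step A (illustrative): wrap raw values in the four-section layout.

    The features used here (mean and range) are an assumption, not part
    of the patent text.
    """
    mean = sum(raw) / len(raw) if raw else 0.0
    spread = (max(raw) - min(raw)) if raw else 0.0
    return Record(
        header={"source": source},
        features=[mean, spread],
        mapping_rule=None,
        content=list(raw),
    )
```

Leaving `mapping_rule` empty after conversion mirrors the order of the steps: the rule is only recorded once the data mapping module has run in step B.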

Preferably, in step B, the data in the mapping set are linearly related; the similarity between mapping rules is calculated, and the data to be processed whose mapping rules have a similarity greater than a set value are placed in the same data cluster.
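A minimal sketch of the clustering in step B follows. The patent states only that the mapped data are linearly related and that rules with similarity above a set value share a cluster; the representation of a rule as a `(scale, offset)` pair, the similarity measure `1 / (1 + distance)` and the greedy assignment strategy are all assumptions.

```python
import math
from typing import List, Tuple

# Hypothetical linear mapping rule: y = scale * x + offset
Rule = Tuple[float, float]


def apply_rule(rule: Rule, values: List[float]) -> List[float]:
    """Map original values into the mapping set with a linear rule."""
    scale, offset = rule
    return [scale * v + offset for v in values]


def rule_similarity(r1: Rule, r2: Rule) -> float:
    """Assumed similarity measure in (0, 1]: 1 / (1 + Euclidean distance)."""
    return 1.0 / (1.0 + math.dist(r1, r2))


def cluster_by_rule(rules: List[Rule], threshold: float) -> List[List[int]]:
    """Greedily group record indices whose rules are similar enough.

    Each rule is compared with the first member of every existing cluster
    and joins the first cluster whose similarity exceeds the threshold.
    """
    clusters: List[List[int]] = []
    for i, rule in enumerate(rules):
        for members in clusters:
            if rule_similarity(rule, rules[members[0]]) > threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters
```

With rules `[(2.0, 1.0), (2.0, 1.1), (10.0, -5.0)]` and a threshold of `0.8`, the first two rules fall into one cluster and the third forms its own. A greedy pass is order-dependent; any other clustering method that honours the threshold would fit the text equally well.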

Preferably, in step C, processing the data of the mapping set comprises the following steps:

C1, establishing a data tree for the data in each data cluster, with data sharing the same mapping rule taken as one node;

C2, processing the data with each node as the starting point of data processing, wherein the data between two nodes are processed in the same way, and that processing mode is determined by a linear combination of the data processing modes of the nodes at the two ends;

C3, establishing a correlation matrix among the different data clusters, and establishing a multi-dimensional correlation tree of the data according to the correlation matrix.
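Parts of steps C1 to C3 can be sketched as follows. The patent does not define what a "processing mode" is or how the correlation matrix is computed, so this sketch assumes a processing mode is a vector of numeric parameters, uses a single blending weight for the linear combination, and uses Pearson correlation between per-cluster summary vectors. All function names are hypothetical, and the construction of the multi-dimensional correlation tree itself is not attempted here.

```python
import math
from typing import Dict, List, Sequence, Tuple

Rule = Tuple[float, float]


def group_into_nodes(rules: Sequence[Rule]) -> Dict[Rule, List[int]]:
    """C1 (illustrative): records with an identical mapping rule form one node."""
    nodes: Dict[Rule, List[int]] = {}
    for index, rule in enumerate(rules):
        nodes.setdefault(rule, []).append(index)
    return nodes


def blend_modes(mode_a: Sequence[float], mode_b: Sequence[float],
                weight: float) -> List[float]:
    """C2 (illustrative): linear combination of the two end-node modes.

    weight = 0 reproduces mode_a, weight = 1 reproduces mode_b.
    """
    return [(1.0 - weight) * a + weight * b for a, b in zip(mode_a, mode_b)]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y) if norm_x and norm_y else 0.0


def correlation_matrix(summaries: Sequence[Sequence[float]]) -> List[List[float]]:
    """C3 (illustrative): pairwise correlation between cluster summaries."""
    return [[pearson(a, b) for b in summaries] for a in summaries]
```

One plausible way to turn such a matrix into a tree would be to link each cluster to its most strongly correlated neighbour, but the source does not say which construction is intended.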

Preferably, in step D, processing the original data comprises the following steps:

D1, marking the original address of the data in the data table header section;

D2, traversing the nodes of the multi-dimensional correlation tree and comparing them with the data features in the data feature section, and taking the node with the highest similarity to the data features as the mapping node;

D3, applying the inverse of the mapping rule recorded in the mapping rule section to the mapping node to obtain the processing result of the original data.
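Steps D1 to D3 can be illustrated with a short sketch. It assumes, beyond what the patent states, that tree nodes are stored as named feature vectors, that similarity is cosine similarity, and that a mapping rule is a linear pair `(scale, offset)` whose inverse is `x = (y - offset) / scale`. The function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

# Hypothetical linear mapping rule: y = scale * x + offset
Rule = Tuple[float, float]


def mark_origin(header: dict, address: str) -> dict:
    """D1 (illustrative): return a copy of the header with the original address added."""
    marked = dict(header)
    marked["original_address"] = address
    return marked


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def find_mapping_node(features: Sequence[float],
                      nodes: Dict[str, Sequence[float]]) -> str:
    """D2 (illustrative): traverse all nodes and pick the most similar one."""
    return max(nodes, key=lambda name: cosine_similarity(features, nodes[name]))


def invert_rule(rule: Rule, mapped_values: Sequence[float]) -> List[float]:
    """D3 (illustrative): undo the linear rule to recover original-scale values."""
    scale, offset = rule
    if scale == 0:
        raise ValueError("a rule with zero scale cannot be inverted")
    return [(v - offset) / scale for v in mapped_values]
```

For example, inverting the rule `(2.0, 1.0)` on the mapped values `[3.0, 5.0]` recovers `[1.0, 2.0]`. The inverse exists only because the assumed rule is linear with a non-zero scale, which matches the requirement in step B that the mapped data be linearly related.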

The beneficial effects of the above technical scheme are as follows: the method establishes a mapping data set and exploits the fact that the mapping data are easier to process, so that the original data are processed indirectly; the mapping relation is finally fed back to the original data, which allows the original data to be processed quickly. The processing result of the mapping data is represented as a multi-dimensional correlation tree, which strengthens the link between the processing of the mapping data and the processing of the original data and ensures the accuracy of the original data processing.

Drawings

FIG. 1 is a block diagram of one embodiment of the present invention.

In the figure: 1. data format conversion module; 2. data mapping module; 3. mapping data processing module; 4. original data processing module.

Detailed Description

Referring to FIG. 1, one embodiment of the present invention comprises:

a data format conversion module 1, used for converting the data format of the original data to form data to be processed;

a data mapping module 2, used for establishing a mapping set of the data to be processed;

a mapping data processing module 3, used for processing the data of the mapping set;

and an original data processing module 4, used for processing the original data according to the processing result of the mapping set data.

The processing method of this embodiment comprises the following steps:

A. the data format conversion module 1 converts the data format of the original data, and the resulting data to be processed comprises a data table header section, a data feature section, a mapping rule section and a data content section;

B. the data mapping module 2 establishes a mapping set of the data to be processed and records the mapping rule in the mapping rule section;

C. the mapping data processing module 3 processes the data of the mapping set;

D. the original data processing module 4 processes the original data according to the processing result of the mapping set data.

In step B, the data in the mapping set are linearly related; the similarity between mapping rules is calculated, and the data to be processed whose mapping rules have a similarity greater than a set value are placed in the same data cluster.

In step C, processing the data of the mapping set includes the following steps,

C2, processing the data with each node as the starting point of data processing, wherein the data between two nodes are processed in the same way, and that processing mode is determined by a linear combination of the data processing modes of the nodes at the two ends;

In step D, processing the original data comprises the following steps,

D1, marking the original address of the data in the data table header section;

In the description of the present invention, it is to be understood that the terms "longitudinal", "lateral", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", and the like, indicate orientations or positional relationships based on those shown in the drawings, are merely for convenience of description of the present invention, and do not indicate or imply that the referenced devices or elements must have a particular orientation, be constructed and operated in a particular orientation, and thus, are not to be construed as limiting the present invention.

The foregoing shows and describes the general principles and broad features of the present invention and advantages thereof. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, which are described in the specification and illustrated only to illustrate the principle of the present invention, but that various changes and modifications may be made therein without departing from the spirit and scope of the present invention, which fall within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims

1. A big data processing system, characterized by comprising:

the data mapping module is used for establishing a mapping set of the data to be processed; the data in the mapping set are linearly related; calculating the similarity of the mapping rules, and setting the data to be processed corresponding to the mapping rules with the similarity larger than a set value as the same data cluster;

the mapping data processing module is used for processing the data of the mapping set; processing the data of the mapping set includes the steps of,

C3, establishing a correlation matrix among the different data clusters, and establishing a multi-dimensional correlation tree of the data according to the correlation matrix;

an original data processing module for processing the original data according to the processing result of the mapping set data, wherein processing the original data comprises the following steps,

D1, marking the original address of the data in the data table header section;

2. A processing method of the big data processing system according to claim 1, comprising the steps of:

A. the data format conversion module converts the data format of the original data, and the converted original data comprises a data table header section, a data feature section, a mapping rule section and a data content section;

B. the data mapping module establishes a mapping set of the data to be processed and records the mapping rule in the mapping rule section; the data in the mapping set are linearly related; the similarity between mapping rules is calculated, and the data to be processed whose mapping rules have a similarity greater than a set value are placed in the same data cluster;

C. the mapping data processing module processes the data of the mapping set, wherein processing the data of the mapping set comprises the following steps,

D. the original data processing module processes the original data according to the processing result of the mapping set data, wherein processing the original data comprises the following steps,

D1, marking the original address of the data in the data table header section;