Big data processing system and processing method thereof
Technical Field
The invention relates to the technical field of big data, in particular to a big data processing system and a big data processing method.
Background
In recent years, big data technology has developed rapidly and has been widely applied in many fields. Although big data technology does not demand high data processing precision, big data comes from numerous sources and in large volumes, so the hardware requirements for processing it remain high, which limits the further adoption of big data technology.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a big data processing system and a processing method thereof that overcome the shortcomings of the prior art and improve data processing efficiency.
To solve the above technical problem, the invention adopts the following technical solution.
A big data processing system, comprising:
a data format conversion module, configured to convert the data format of original data to form data to be processed;
a data mapping module, configured to establish a mapping set of the data to be processed;
a mapping data processing module, configured to process the data of the mapping set; and
an original data processing module, configured to process the original data according to the processing result of the mapping set data.
A processing method of the big data processing system, comprising the following steps:
A. the data format conversion module converts the data format of the data to be processed, the converted data to be processed comprising a data header segment, a data feature segment, a mapping rule segment, and a data content segment;
B. the data mapping module establishes a mapping set of the data to be processed and records the mapping rule in the mapping rule segment;
C. the mapping data processing module processes the data of the mapping set; and
D. the original data processing module processes the original data according to the processing result of the mapping set data.
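The four-segment layout named in step A can be pictured as a simple record. The sketch below is illustrative only: the field names, the field types, and the `to_record` helper are assumptions, since the source does not specify how each segment is encoded or which features are extracted.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Record:
    """Hypothetical container for one converted data item (step A)."""
    header: dict = field(default_factory=dict)    # data header segment
    features: list = field(default_factory=list)  # data feature segment
    mapping_rule: Optional[tuple] = None          # mapping rule segment, filled in step B
    content: list = field(default_factory=list)   # data content segment


def to_record(raw_values, source):
    """Convert a raw sequence of numbers into the four-segment layout.

    The features chosen here (count, min, max, mean) are an assumption
    made for illustration; the source does not define the feature set.
    """
    values = [float(v) for v in raw_values]
    features = [len(values), min(values), max(values), sum(values) / len(values)]
    return Record(header={"source": source}, features=features, content=values)


record = to_record([3, 1, 2], source="sensor-a")
print(record.features)  # [3, 1.0, 3.0, 2.0]
```

The mapping rule segment starts empty because, per step B, the rule is only recorded once the mapping set has been established.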
Preferably, in step B, the data in the mapping set are linearly correlated; the similarity between mapping rules is calculated, and the data to be processed whose mapping rules have a similarity greater than a set value are placed in the same data cluster.
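The grouping by rule similarity might look like the following sketch. It assumes that each mapping rule is a linear map y = a * x + b stored as a coefficient pair (a, b), that similarity is measured by cosine similarity, and that clusters are formed greedily; none of these choices is stated in the source, which only requires a similarity measure and a set value.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two coefficient vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def cluster_by_rule(rules, threshold):
    """Greedily group rule indices; a rule joins the first cluster whose
    first member is more similar to it than `threshold`."""
    clusters = []
    for index, rule in enumerate(rules):
        for cluster in clusters:
            if cosine_similarity(rules[cluster[0]], rule) > threshold:
                cluster.append(index)
                break
        else:
            # No existing cluster is similar enough, so start a new one.
            clusters.append([index])
    return clusters


# Each rule is a hypothetical linear map y = a * x + b, stored as (a, b).
rules = [(2.0, 1.0), (4.0, 2.0), (-1.0, 3.0), (2.1, 0.9)]
print(cluster_by_rule(rules, threshold=0.95))  # [[0, 1, 3], [2]]
```

A greedy pass is order-dependent; it is used here only because it keeps the example short.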
Preferably, in step C, processing the data of the mapping set comprises the following steps:
C1. establishing a data tree for the data in each data cluster, with data sharing the same mapping rule taken as a node;
C2. processing the data with each node as the starting point of data processing, wherein the data between two nodes are processed in the same processing mode, and that processing mode is determined by a linear combination of the data processing modes of the nodes at the two ends; and
C3. establishing a correlation matrix among the different data clusters, and establishing a multi-dimensional correlation tree of the data according to the correlation matrix.
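The linear combination in step C2 can be sketched as follows. The example assumes that a processing mode can be written as a numeric parameter pair (scale, offset) and that the two end nodes are blended with weights that sum to one; the source does not say how a processing mode is parameterised or how the weights are chosen, so both are hypothetical.

```python
def combine_modes(mode_left, mode_right, weight=0.5):
    """Linear combination weight * left + (1 - weight) * right."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return tuple(weight * a + (1.0 - weight) * b
                 for a, b in zip(mode_left, mode_right))


def process_segment(values, mode):
    """Apply one (scale, offset) mode to every value between two nodes."""
    scale, offset = mode
    return [scale * v + offset for v in values]


left_mode = (2.0, 0.0)    # hypothetical mode of the node at one end
right_mode = (4.0, 2.0)   # hypothetical mode of the node at the other end
shared = combine_modes(left_mode, right_mode)
print(shared)                               # (3.0, 1.0)
print(process_segment([1.0, 2.0], shared))  # [4.0, 7.0]
```

Every value lying between the two nodes is processed with the single blended mode, which matches the requirement that data between two nodes share one processing mode.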
Preferably, in step D, processing the original data comprises the following steps:
D1. marking the original address of the data in the data header segment;
D2. traversing the nodes of the multi-dimensional correlation tree and comparing them with the data features in the data feature segment, and taking the node with the highest similarity to the data features as the mapping node; and
D3. performing an inverse operation on the mapping node using the mapping rule recorded in the mapping rule segment, to obtain the processing result of the original data.
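Steps D2 and D3 amount to a nearest-node lookup followed by an inverse mapping. The sketch below makes several assumptions that the source leaves open: the tree nodes are flattened into a list, similarity is taken as closeness in Euclidean distance, and the recorded rule is a linear map y = a * x + b whose inverse is x = (y - b) / a.

```python
import math


def most_similar_node(features, nodes):
    """Return the node whose feature vector is closest to `features`."""
    return min(nodes, key=lambda node: math.dist(features, node["features"]))


def invert_rule(mapped_value, rule):
    """Undo a linear mapping y = a * x + b, giving x = (y - b) / a."""
    a, b = rule
    if a == 0:
        raise ValueError("a rule with a == 0 cannot be inverted")
    return (mapped_value - b) / a


# Hypothetical nodes, each holding a feature vector and a processed result.
nodes = [
    {"features": (0.0, 1.0), "result": 5.0},
    {"features": (3.0, 4.0), "result": 11.0},
]
node = most_similar_node((2.5, 4.2), nodes)
print(invert_rule(node["result"], rule=(2.0, 1.0)))  # 5.0
```

The guard on `a == 0` reflects a general point: the inverse operation in D3 is only defined when the recorded mapping rule is invertible.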
The beneficial effects of the above technical solution are as follows: the invention establishes a mapping data set and exploits the fact that mapping data are easy to process, so that the original data are processed indirectly; the result is finally fed back to the original data through the mapping relation, which achieves rapid processing of the original data. The processing result of the mapping data is expressed as a multi-dimensional correlation tree, which strengthens the link between mapping data processing and original data processing and ensures the accuracy of original data processing.
Drawings
FIG. 1 is a block diagram of one embodiment of the present invention.
In the figure: 1, data format conversion module; 2, data mapping module; 3, mapping data processing module; 4, original data processing module.
Detailed Description
Referring to FIG. 1, one embodiment of the present invention comprises:
a data format conversion module 1, configured to convert the data format of original data to form data to be processed;
a data mapping module 2, configured to establish a mapping set of the data to be processed;
a mapping data processing module 3, configured to process the data of the mapping set; and
an original data processing module 4, configured to process the original data according to the processing result of the mapping set data.
A processing method of the big data processing system comprises the following steps:
A. the data format conversion module 1 converts the data format of the data to be processed, the converted data to be processed comprising a data header segment, a data feature segment, a mapping rule segment, and a data content segment;
B. the data mapping module 2 establishes a mapping set of the data to be processed and records the mapping rule in the mapping rule segment;
C. the mapping data processing module 3 processes the data of the mapping set; and
D. the original data processing module 4 processes the original data according to the processing result of the mapping set data.
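To show how modules 1 to 4 hand data to one another, steps A to D can be wired together as in the minimal sketch below. The function names, the dictionary layout, the fixed rule (0.5, 0.0), and the use of a maximum as the "processing" are all placeholders chosen for illustration; the source does not prescribe any of them.

```python
def convert_format(raw):
    """Module 1: wrap the raw values in a four-segment record."""
    values = [float(v) for v in raw]
    return {"header": {}, "features": [sum(values) / len(values)],
            "rule": None, "content": values}


def build_mapping(record, rule=(0.5, 0.0)):
    """Module 2: map the content with y = a * x + b and record the rule."""
    a, b = rule
    record["rule"] = rule
    return [a * v + b for v in record["content"]]


def process_mapping(mapped):
    """Module 3: placeholder processing; here, the maximum mapped value."""
    return max(mapped)


def process_original(record, mapped_result):
    """Module 4: carry the mapped result back through the inverse rule."""
    a, b = record["rule"]
    return (mapped_result - b) / a


def run_pipeline(raw):
    record = convert_format(raw)                    # step A
    mapped = build_mapping(record)                  # step B
    mapped_result = process_mapping(mapped)         # step C
    return process_original(record, mapped_result)  # step D


print(run_pipeline([4, 10, 6]))  # 10.0
```

Because the example rule is an increasing linear map, the maximum found on the mapped data corresponds exactly to the maximum of the original data once the rule is inverted.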
In step B, the data in the mapping set are linearly correlated; the similarity between mapping rules is calculated, and the data to be processed whose mapping rules have a similarity greater than a set value are placed in the same data cluster.
In step C, processing the data of the mapping set comprises the following steps:
C1. establishing a data tree for the data in each data cluster, with data sharing the same mapping rule taken as a node;
C2. processing the data with each node as the starting point of data processing, wherein the data between two nodes are processed in the same processing mode, and that processing mode is determined by a linear combination of the data processing modes of the nodes at the two ends; and
C3. establishing a correlation matrix among the different data clusters, and establishing a multi-dimensional correlation tree of the data according to the correlation matrix.
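One possible reading of step C3 is sketched below. It assumes that each data cluster is summarised by an equal-length numeric series, that the correlation matrix holds Pearson coefficients, and that the tree is obtained as a maximum spanning tree over absolute correlations (Prim's algorithm). The source does not say how the tree is derived from the matrix, so the spanning-tree construction is a stand-in rather than the method of the invention.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(series):
    """Symmetric matrix of pairwise correlations between clusters."""
    return [[pearson(a, b) for b in series] for a in series]


def strongest_link_tree(matrix):
    """Prim-style maximum spanning tree over absolute correlations.

    Returns (parent, child) edges; cluster 0 is taken as the root.
    """
    in_tree, edges = {0}, []
    while len(in_tree) < len(matrix):
        parent, child = max(
            ((i, j) for i in in_tree for j in range(len(matrix)) if j not in in_tree),
            key=lambda edge: abs(matrix[edge[0]][edge[1]]),
        )
        in_tree.add(child)
        edges.append((parent, child))
    return edges


# Hypothetical per-cluster summary series.
clusters = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.5], [4.0, 1.0, 3.0, 2.0]]
matrix = correlation_matrix(clusters)
print(strongest_link_tree(matrix))  # [(0, 1), (0, 2)]
```

Each edge attaches a cluster to the already-connected cluster it correlates with most strongly, so closely related clusters end up adjacent in the tree.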
In step D, processing the original data comprises the following steps:
D1. marking the original address of the data in the data header segment;
D2. traversing the nodes of the multi-dimensional correlation tree and comparing them with the data features in the data feature segment, and taking the node with the highest similarity to the data features as the mapping node; and
D3. performing an inverse operation on the mapping node using the mapping rule recorded in the mapping rule segment, to obtain the processing result of the original data.
In the description of the present invention, it is to be understood that the terms "longitudinal", "lateral", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", and the like, indicate orientations or positional relationships based on those shown in the drawings, are merely for convenience of description of the present invention, and do not indicate or imply that the referenced devices or elements must have a particular orientation, be constructed and operated in a particular orientation, and thus, are not to be construed as limiting the present invention.
The foregoing shows and describes the basic principles, main features, and advantages of the present invention. Those skilled in the art will understand that the present invention is not limited to the embodiments described above; the embodiments and the specification merely illustrate the principle of the invention, and various changes and modifications may be made without departing from the spirit and scope of the invention, all of which fall within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and their equivalents.