CN111914003A - Big data analysis system based on cloud platform

Info

Publication number: CN111914003A
Application number: CN202010815987.8A
Authority: CN
Inventors: 王纯柏
Original assignee: Zhi Xiao 2 Guangzhou Technology Co ltd
Current assignee: Oriental Fortune Information Co.,Ltd.
Priority date: 2020-08-14
Filing date: 2020-08-14
Publication date: 2020-11-10
Anticipated expiration: 2040-08-14
Also published as: CN111914003B

Abstract

The invention discloses a cloud platform-based big data analysis system comprising a target entry unit, a data selection unit, a data acquisition unit, a redundancy analysis unit, a computing power acquisition unit and a redundancy rule base. The target entry unit is used for entering data. The data acquisition unit acquires target information and reference information for all users. The target information comprises target data and a target time: the target data is the content falling within the data range required by the corresponding user, and the target time is the time at which that target data was generated. The reference information comprises the reference times and the number of references: the number of references is the number of times the target data has been retrieved for analysis, and each reference time is the time point of one such reference. The data selection unit removes data that does not meet the user's requirements, so that suitable data can be obtained automatically by applying the corresponding rules and algorithms.

Description

Big data analysis system based on cloud platform

Technical Field

The invention belongs to the field of data analysis, relates to a cloud processing technology, and particularly relates to a cloud platform-based big data analysis system.

Background

The patent with publication number CN109392196A discloses a big data analysis method and system based on a mobile terminal. The method comprises the following steps: the mobile terminal collects user information of a specified type; the mobile terminal sends a transmission channel establishment request message to a big data analysis center; if the mobile terminal receives neither a transmission channel establishment response message nor a transmission channel establishment negative message before a first timer expires, it immediately resends the request message to the big data analysis center; if the mobile terminal receives a transmission channel establishment negative message before the first timer expires, it waits for a first duration and then resends the request message to the big data analysis center; and if the mobile terminal receives a transmission channel establishment response message before the first timer expires, it sends its identity identifier to the big data analysis center.

However, when analyzing big data, this method cannot filter the data source well, nor can it process a large volume of concurrent data effectively and in a defined order so that the data is processed in the shortest time and, as far as possible, in the same batch. A solution to this drawback is therefore provided.

Disclosure of Invention

The invention aims to provide a cloud platform-based big data analysis system.

The purpose of the invention can be realized by the following technical scheme:

a big data analysis system based on a cloud platform comprises a target entry unit, a data selection unit, a data acquisition unit, a redundancy analysis unit, a computing power acquisition unit and a redundancy rule base;

the data acquisition unit is used for acquiring target information and reference information of all users, wherein the target information comprises target data and a target time; the target data is the content falling within the data range required by the corresponding user, and the target time is the time at which the corresponding target data was generated; the reference information comprises the reference times and the number of references, wherein the number of references is the number of times the target data has been retrieved for analysis, and each reference time is the time point of one such reference;

the target entry unit is used by a user to enter data requirement information, and the data requirement information is a threshold value entered by the corresponding user;

the target entry unit is used for transmitting data requirement information to the data selection unit, the data acquisition unit is used for transmitting target information and reference information to the data selection unit, the data selection unit receives the data requirement information transmitted by the target entry unit, and the data selection unit receives the target information and the reference information transmitted by the data acquisition unit;

the data selection unit is used for performing selection analysis on the target information, the reference information and the data requirement information to obtain selected analysis data;

the data selection unit is used for marking selected analysis data as an analysis task, and the data selection unit transmits the processed analysis task to the redundancy analysis unit at intervals of Td, wherein Td is a preset time value;

the computing power acquisition unit is used for acquiring the residual computing power of the current cloud processor, the residual computing power being the capacity that the cloud processor can currently still handle, and the computing power acquisition unit transmits the residual computing power to the redundancy analysis unit;

the redundancy rule base stores redundancy analysis rules; the redundancy analysis unit receives all analysis tasks transmitted by the data selection unit and performs redundancy analysis on the analysis tasks by combining a redundancy rule base, and the redundancy analysis comprises the following specific steps:

s1: first, acquiring the residual computing power;

s2: then acquiring all analysis tasks, and marking them as Fxj, j = 1, ..., n;

s3: acquiring the time stamps of all analysis tasks, obtaining the current time span of each analysis task from its time stamp, and marking it as the span value Kdj, j = 1, ..., n;

s4: acquiring the processing time required by each analysis task and the corresponding computing power demand value; marking the processing time as Ctj and the computing power demand value as Xqj, j = 1, ..., n;

s5: for each analysis task Fxj, obtaining its span value Kdj, processing time Ctj and computing power demand value Xqj;

s6: acquiring the residual computing power, and marking it as Ls;

s7: when Ls is greater than or equal to the sum of all computing power demand values Xqj, no processing is performed; otherwise, proceeding to step S8;

s8: acquiring the analysis tasks Fxj, performing batch arrangement analysis on them, and transmitting the data of each processing frame to the cloud processor for processing according to the analysis result;

the cloud processor receives the data of the processing frame transmitted by the redundancy analysis unit and stores the data in real time;

the cloud processor is used for transmitting the processing frames to the display unit, and the display unit receives the processing frames transmitted by the cloud processor and displays 'currently processing + processing frames';

the management unit is in communication connection with the cloud processor.
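
As an illustration of steps S1 to S7, the following minimal Python sketch shows the per-task quantities and the capacity check that decides whether batch arrangement (step S8) is needed. The class `AnalysisTask`, its field names and both function names are hypothetical, and the span value is assumed here to be the time elapsed since the task's time stamp, which the text does not state explicitly.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AnalysisTask:
    """Hypothetical record for one analysis task Fxj."""
    name: str
    timestamp: float        # time stamp of the task (seconds)
    processing_time: float  # Ctj
    demand: float           # Xqj, computing power demand value


def span_value(task: AnalysisTask, now: float) -> float:
    """Kdj, assumed to be the time elapsed since the task's time stamp."""
    return now - task.timestamp


def needs_batch_arrangement(tasks: List[AnalysisTask], ls: float) -> bool:
    """S7: batch arrangement is needed only when the residual computing
    power Ls is smaller than the sum of all demand values Xqj."""
    return ls < sum(t.demand for t in tasks)


tasks = [
    AnalysisTask("Fx1", timestamp=100.0, processing_time=9.0, demand=4.0),
    AnalysisTask("Fx2", timestamp=130.0, processing_time=7.0, demand=3.0),
]
print(span_value(tasks[0], now=160.0))
print(needs_batch_arrangement(tasks, ls=8.0))
print(needs_batch_arrangement(tasks, ls=5.0))
```

With a residual computing power of 8 the total demand of 7 fits and no batching is required; with 5 it does not fit, so the tasks would go on to step S8.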

Further, the specific process of the selection analysis is as follows:

step one: acquiring the target data and target time from the target information; at the same time acquiring the reference information of the corresponding target data, and automatically obtaining the reference times and the number of references from it;

step two: marking the target data as Mi, i = 1, ..., n;

step three: selecting the corresponding target data and obtaining its target time; computing the length of time from the target time to the current time, and marking it as the time value T1;

step four: acquiring the reference times and the number of references from the corresponding reference information, and marking the number of references as Y1;

step five: taking a designated duration Ts as a time frame, obtaining the maximum number of references occurring within one continuous time frame, and marking it as the frame reference value K1;

step six: acquiring all reference times in the corresponding reference information, computing the time elapsed from the most recent reference to the present, and marking it as the cooling time L1;

step seven: calculating the check value Hx1 from the time value T1, the number of references Y1, the frame reference value K1 and the cooling time L1, using the formula:

Hx1 = (0.456*Y1 + 0.544*K1) / (0.382*T1 + 0.618*L1);

step eight: setting i = i + 1 and repeating steps three to eight until the check values of all target data have been obtained, and marking them as Hxi, i = 1, ..., n;

step nine: comparing each check value Hxi with the threshold value in the data requirement information, and marking all target data whose check value Hxi is greater than the threshold value as selected analysis data.
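
The selection analysis above can be sketched in Python as follows. This is only an illustrative reading of the steps: the names `TargetRecord`, `frame_reference_value`, `check_value` and `select` are hypothetical, the frame reference value is assumed to be the largest number of references inside any closed window of length Ts that starts at a reference, and every record is assumed to have at least one reference and a target time earlier than the current time.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List


@dataclass
class TargetRecord:
    """Hypothetical container for one target datum Mi and its reference history."""
    name: str
    target_time: float            # generation time of the target data
    reference_times: List[float]  # one time point per reference (assumed non-empty)


def frame_reference_value(reference_times: List[float], ts: float) -> int:
    """K: largest number of references inside any window [start, start + ts]."""
    times = sorted(reference_times)
    best = 0
    for i, start in enumerate(times):
        end = bisect_right(times, start + ts)  # index just past the window
        best = max(best, end - i)
    return best


def check_value(rec: TargetRecord, now: float, ts: float) -> float:
    """Hx = (0.456*Y + 0.544*K) / (0.382*T + 0.618*L), weights as given in the text."""
    t = now - rec.target_time                            # time value T
    y = len(rec.reference_times)                         # number of references Y
    k = frame_reference_value(rec.reference_times, ts)   # frame reference value K
    cooling = now - max(rec.reference_times)             # cooling time L
    return (0.456 * y + 0.544 * k) / (0.382 * t + 0.618 * cooling)


def select(records: List[TargetRecord], now: float, ts: float,
           threshold: float) -> List[TargetRecord]:
    """Step nine: keep the records whose check value exceeds the user's threshold."""
    return [r for r in records if check_value(r, now, ts) > threshold]


a = TargetRecord("M1", target_time=0.0, reference_times=[2.0, 3.0, 4.0, 9.0])
b = TargetRecord("M2", target_time=0.0, reference_times=[1.0])
selected = select([a, b], now=10.0, ts=2.0, threshold=0.5)
print([r.name for r in selected])
```

In this example M1 is referenced often and recently, so its check value is about 0.78 and it passes a threshold of 0.5, while M2 was referenced once long ago and scores about 0.11.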

Further, the processing time in step S4 is acquired in the following manner:

s401: taking content of the same data volume, and acquiring the times taken by the processor to process such content on the ten most recent occasions;

s402: calculating the average of these ten times, computing the difference between each time and the average, and removing every time whose difference exceeds a preset value C1;

s403: recalculating the average of the remaining times, and marking it as the corresponding processing time;

s404: performing the same processing for all analysis tasks to obtain their corresponding processing times.
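
Steps S401 to S403 amount to an outlier-trimmed average, which might be sketched in Python as below. The function name `estimate_processing_time` is hypothetical, the difference is assumed to be an absolute difference, and the fallback to the untrimmed average when every sample is rejected is an assumption the text does not cover.

```python
from statistics import mean
from typing import Sequence


def estimate_processing_time(recent_times: Sequence[float], c1: float) -> float:
    """Average the ten most recent processing times after dropping outliers."""
    samples = list(recent_times)[-10:]        # S401: ten most recent runs
    first_mean = mean(samples)                # S402: initial average
    kept = [t for t in samples if abs(t - first_mean) <= c1]  # drop outliers
    # S403: re-average; fall back to the first average if nothing survives
    # (assumed behaviour, not specified in the text).
    return mean(kept) if kept else first_mean


times = [10.0] * 9 + [30.0]
print(estimate_processing_time(times, c1=5.0))
```

Here the initial average is 12, the single 30-second run deviates by 18 and is discarded, and the remaining nine runs average to 10.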

Further, the specific analysis steps of the batch arrangement analysis in step S8 are as follows:

s801: calculating the sum of the computing power demand values Xqj, and marking it as the total demand value;

s802: dividing the total demand value by the residual computing power; when it divides exactly, taking the quotient as the number of processing frames;

when it does not divide exactly, taking the integer part of the quotient, adding one, and marking the result as the number of processing frames;

s803: dividing the analysis tasks into frames according to the processing time Ctj and the computing power demand value Xqj of each analysis task Fxj;

s804: sorting the analysis tasks Fxj in descending order of processing time Ctj, and selecting them one by one in that order until the sum of the selected computing power demand values Xqj exceeds the residual computing power; then removing the last selected task and placing all previously selected tasks in one processing frame;

s805: continuing in the manner of step S804 until all analysis tasks Fxj have been selected, obtaining a number of processing frames corresponding to the number of processing frames;

s806: calculating the sum of the span values Kdj of all data in each processing frame, and transmitting the data of the processing frames to the cloud processor for processing in descending order of that sum.
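
A minimal Python sketch of the batch arrangement in steps S801 to S806 is given below. The names `Task`, `frame_count` and `arrange_batches` are hypothetical. Two points are assumptions beyond the text: a task whose own demand exceeds the residual computing power is rejected with an error, and the number from S802 is treated as an estimate, since the greedy filling of S804 can in principle produce more frames than that.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Task:
    """Hypothetical record for one analysis task Fxj."""
    name: str
    span: float      # Kdj
    duration: float  # Ctj
    demand: float    # Xqj


def frame_count(tasks: List[Task], ls: float) -> int:
    """S801-S802: total demand divided by residual power Ls, rounded up."""
    return math.ceil(sum(t.demand for t in tasks) / ls)


def arrange_batches(tasks: List[Task], ls: float) -> List[List[Task]]:
    """S803-S806: fill frames greedily, then order them by total span value."""
    if any(t.demand > ls for t in tasks):
        # Assumed handling: such a task can never fit into any frame.
        raise ValueError("a task demands more than the residual computing power")
    ordered = sorted(tasks, key=lambda t: t.duration, reverse=True)  # S804
    frames: List[List[Task]] = []
    current: List[Task] = []
    used = 0.0
    for task in ordered:
        if used + task.demand > ls:   # task would overflow: close the frame
            frames.append(current)
            current, used = [], 0.0
        current.append(task)
        used += task.demand
    if current:
        frames.append(current)        # S805: last, possibly partial frame
    # S806: frames with the largest sum of span values are dispatched first.
    frames.sort(key=lambda f: sum(t.span for t in f), reverse=True)
    return frames


tasks = [
    Task("A", span=5, duration=9, demand=4),
    Task("B", span=1, duration=7, demand=3),
    Task("C", span=8, duration=5, demand=4),
    Task("D", span=2, duration=3, demand=2),
]
frames = arrange_batches(tasks, ls=8)
print(frame_count(tasks, ls=8), [[t.name for t in f] for f in frames])
```

With a residual computing power of 8, tasks A and B fill the first frame and C and D the second; the frame holding C and D has the larger span sum (10 versus 6) and is therefore sent to the cloud processor first.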

Further, the management unit is used for entering all preset values.

The invention has the beneficial effects that:

according to the invention, the data selection unit removes data that does not meet the user's requirements, so that suitable data can be obtained automatically by applying the corresponding rules and algorithms; meanwhile, the redundancy analysis unit and its related units are used to obtain each analysis task Fxj together with its span value Kdj, processing time Ctj and computing power demand value Xqj;

the analysis tasks are then divided into frames according to the processing time Ctj and computing power demand value Xqj of each analysis task Fxj: the analysis tasks Fxj are sorted in descending order of processing time Ctj and selected one by one in that order until the sum of the selected computing power demand values Xqj exceeds the residual computing power, whereupon the last selected task is removed and all previously selected tasks are placed in one processing frame; this continues until all analysis tasks Fxj have been selected, yielding a number of processing frames corresponding to the number of processing frames; finally, the data of the processing frames is transmitted to the cloud processor for processing in descending order of the sum of the span values Kdj; in this way data processing is completed under reasonable conditions and time is saved.

Drawings

In order to facilitate understanding for those skilled in the art, the present invention will be further described with reference to the accompanying drawings.

FIG. 1 is a block diagram of the system of the present invention.

Detailed Description

As shown in FIG. 1, a cloud platform-based big data analysis system includes a target entry unit, a data selection unit, a data acquisition unit, a redundancy analysis unit, a computing power acquisition unit, and a redundancy rule base;

the data selection unit is used for selecting and analyzing target information, reference information and data requirement information, and the specific selection analysis process is as follows:

step two: marking the target data as Mi, i = 1, ..., n;

Hx1 = (0.456*Y1 + 0.544*K1) / (0.382*T1 + 0.618*L1), where 0.456, 0.544, 0.382 and 0.618 are preset weights;

step nine: comparing the check value Hxi with a threshold value in the data requirement information, and marking all corresponding target data with the check value Hxi larger than the threshold value as selected analysis data;

s1: firstly, acquiring residual computing power;

s4: acquiring the processing time required by each analysis task and the corresponding computing power demand value; marking the processing time as Ctj and the computing power demand value as Xqj, j = 1, ..., n; the processing time is acquired as follows:

s404: performing the same processing on all the analysis tasks to obtain corresponding processing time;

s8: obtaining an analysis task Fxj, and performing batch arrangement analysis on the analysis task Fxj, wherein the specific analysis steps are as follows:

s806: calculating the sum of the span values Kdj of all data in each processing frame, and transmitting the data of the processing frames to the cloud processor for processing in descending order of that sum;

the management unit is in communication connection with the cloud processor and is used for entering all preset values.

The foregoing is merely exemplary and illustrative of the present invention and various modifications, additions and substitutions may be made by those skilled in the art to the specific embodiments described without departing from the scope of the invention as defined in the following claims.

Claims

1. A big data analysis system based on a cloud platform, characterized by comprising a target entry unit, a data selection unit, a data acquisition unit, a redundancy analysis unit, a computing power acquisition unit and a redundancy rule base;

s1: firstly, acquiring residual computing power;

the management unit is in communication connection with the cloud processor.

2. The cloud platform-based big data analysis system according to claim 1, wherein the specific process of the selection analysis is as follows:

step two: marking the target data as Mi, i = 1, ..., n;

Hx1 = (0.456*Y1 + 0.544*K1) / (0.382*T1 + 0.618*L1);

3. The cloud platform-based big data analysis system according to claim 1, wherein the processing time in step S4 is obtained by:

4. The cloud platform-based big data analysis system according to claim 1, wherein the specific analysis steps of the batch arrangement analysis in step S8 are as follows:

5. The cloud platform-based big data analysis system according to claim 1, wherein the management unit is configured to enter all preset values.