CN115640370A - Data analysis method and related equipment - Google Patents
- Publication number
- CN115640370A (CN202211567870.8A)
- Authority
- CN
- China
- Prior art keywords
- data
- processed
- processing mode
- cold
- target
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The embodiment of the invention discloses a data analysis method and related equipment. The method comprises the following steps: acquiring data to be processed, and classifying the data to be processed according to its access frequency to obtain sub-data of different types, wherein the sub-data comprises cold data, warm data and hot data; selecting a target data processing mode based on the type of the sub-data to obtain a first target data processing mode corresponding to the cold data, a second target data processing mode corresponding to the warm data and a third target data processing mode corresponding to the hot data; and analyzing the cold data, the warm data and the hot data according to the first, second and third target data processing modes to obtain an analysis result of the data to be processed, wherein the analysis result comprises historical information, business statistical information and real-time data.
Description
Technical Field
The present invention relates to the field of data management technologies, and in particular, to a data analysis method and related devices.
Background
In the field of the Internet of Things, certain collected data must be stored permanently, and this permanent storage must not degrade the experience of upper-layer applications or the data query speed. At the same time, as massive data accumulates, the collected data must be statistically analyzed in real time by set time periods (year, month, day, hour and minute), and the per-period analysis results must be stored. However, because of historical accumulation, the data volume is large, and a single data table may hold hundreds of billions of records. At that scale, even processing by time partition takes a long time; the longer the system runs, the more data accumulates and the harder it is to guarantee query speed.
In the prior art, big data analysis framework components are applied, for example Hadoop, Spark, stream and Flink framework components. To apply a big data analysis framework, a technician must acquire a considerable amount of specialized knowledge, that is, the technician needs specific professional skills. This raises the professional threshold for data processing personnel and limits the range of application of the data analysis process. In addition, existing big data analysis frameworks occupy a large amount of server resources, which increases the cost of data analysis.
Disclosure of Invention
In view of this, the present invention provides a data analysis method and related devices to solve the problems in the prior art that the use of big data analysis framework components limits the data analysis process and makes data analysis costly.
To achieve one, some, or all of the above objects or other objects, the present invention provides a data analysis method, including: acquiring data to be processed, and classifying the data to be processed according to its access frequency to obtain sub-data of different types, wherein the sub-data comprises cold data, warm data and hot data;
selecting a target data processing mode based on the type of the sub-data to obtain a first target data processing mode corresponding to the cold data, a second target data processing mode corresponding to the warm data and a third target data processing mode corresponding to the hot data;
and analyzing the cold data, the warm data and the hot data according to the first target data processing mode, the second target data processing mode and the third target data processing mode to obtain an analysis result of the data to be processed, wherein the analysis result comprises historical information, statistical information and real-time information.
Optionally, the step of classifying the data to be processed according to the accessed frequency of the data to be processed to obtain sub-data of different types includes:
acquiring a data field in the data to be processed, and calculating a heat value of the data field according to the accessed frequency of the data field;
obtaining different types of sub-data based on the heat value, specifically comprising:
if the heat value is greater than or equal to a first heat threshold, the data to be processed is hot data;
if the heat value is smaller than the first heat threshold and larger than a second heat threshold, the data to be processed is warm data;
and if the heat value is smaller than the second heat threshold, the data to be processed is cold data.
Optionally, before the step of obtaining the sub-data of different types based on the heat value, the method further includes:
acquiring user request information, wherein the user request information comprises service scene information;
determining the first heat threshold and the second heat threshold based on the service scenario information.
Optionally, the step of analyzing the cold data according to the first target data processing manner to obtain an analysis result of the data to be processed includes:
determining dimension data of the service scenario based on the service scenario information;
performing data classification on the cold data based on the dimension data to obtain first data and second data, wherein the first data is stored in ClickHouse, and the second data is stored in MySQL;
and obtaining the historical information and the statistical information of the data to be processed based on the first data stored in ClickHouse, and storing the statistical information in MySQL.
Optionally, the step of performing data classification on the cold data based on the dimension data includes:
setting cold data topics according to the dimension data;
performing data classification on the cold data based on the cold data topics to obtain target data corresponding to each cold data topic;
and taking the target data which accords with the analysis request information as the first data, and taking the target data which does not accord with the analysis request information as the second data.
Optionally, the step of analyzing the warm data according to the second target data processing mode to obtain an analysis result of the data to be processed includes:
performing real-time cumulative analysis on the warm data to obtain an analysis result of the warm data in the data to be processed, and storing the analysis result in MySQL.
Optionally, the method further includes:
and performing data visualization processing on the historical information, the statistical information and the real-time information to obtain display information.
In another aspect, the present application provides a data analysis apparatus, the apparatus comprising:
the data classification module is used for acquiring data to be processed and classifying the data to be processed according to its access frequency to obtain sub-data of different types, wherein the sub-data comprises cold data, warm data and hot data;
the selection module is used for selecting a target data processing mode based on the type of the sub-data to obtain a first target data processing mode corresponding to the cold data, a second target data processing mode corresponding to the warm data and a third target data processing mode corresponding to the hot data;
and the analysis module is used for analyzing the cold data, the warm data and the hot data according to the first target data processing mode, the second target data processing mode and the third target data processing mode to obtain an analysis result of the data to be processed, wherein the analysis result comprises historical information, statistical information and real-time information.
In a third aspect, the present application provides an electronic device, comprising: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating over the bus when the electronic device is operating, the machine-readable instructions, when executed by the processor, performing the steps of the data analysis method as described above.
In a fourth aspect, the present application provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the data analysis method as described above.
The embodiment of the invention has the following beneficial effects:
Data to be processed is acquired and classified according to its access frequency to obtain sub-data of different types, the sub-data comprising cold data, warm data and hot data. A target data processing mode is selected based on the type of the sub-data to obtain a first target data processing mode corresponding to the cold data, a second target data processing mode corresponding to the warm data and a third target data processing mode corresponding to the hot data. The cold data, the warm data and the hot data are analyzed according to the first, second and third target data processing modes to obtain an analysis result of the data to be processed, the analysis result comprising historical information, statistical information and real-time information. A large number of big data analysis components such as Spark, Flink, Storm, HBase and MapReduce need not be invoked, so technicians do not need to acquire a considerable amount of specialized knowledge, people without experience in designing data analysis frameworks can use the method conveniently, and the server resources occupied by big data analysis components, and hence the cost, are reduced.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings in the following description show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
Wherein:
FIG. 1 is a flowchart of a data analysis method provided in an embodiment of the present application;
FIG. 2 is a flowchart of another data analysis method provided in an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a data analysis apparatus provided in an embodiment of the present application;
FIG. 4 is a schematic structural diagram of an electronic device provided in an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a storage medium provided in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As shown in fig. 1, an embodiment of the present application provides a data analysis method, including:
S101, acquiring data to be processed, and classifying the data to be processed according to its access frequency to obtain sub-data of different types, wherein the sub-data comprises cold data, warm data and hot data;
Illustratively, the data to be processed is classified into: cold data, which is accessed infrequently but cannot be discarded; warm data, which may be accessed within a preset period but has no meaning or value after that period; and hot data, which may be accessed at any time. For example, cold data: engine data from one year ago, which may be needed for auditing; warm data: engine data from the last month, which may be needed for analysis; hot data: the data at XX hours XX minutes XX seconds, which may be needed for real-time display.
S102, selecting a target data processing mode based on the type of the sub-data to obtain a first target data processing mode corresponding to the cold data, a second target data processing mode corresponding to the warm data and a third target data processing mode corresponding to the hot data;
Illustratively, a specific analysis method is set for each of the cold data, the warm data and the hot data, and the three are analyzed in parallel, which improves analysis efficiency;
S103, analyzing the cold data, the warm data and the hot data according to the first target data processing mode, the second target data processing mode and the third target data processing mode to obtain an analysis result of the data to be processed, wherein the analysis result comprises historical information, statistical information and real-time information.
Illustratively, the hot data is analyzed to obtain real-time information for display, and the cold data and the warm data are analyzed to obtain historical information and statistical information.
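Steps S102 and S103 can be sketched as a lookup table of processing modes that are run in parallel. This is a minimal illustration only: the three processing functions, their return shapes, and the use of a thread pool are assumptions made for the example, and the simplification that cold data yields only historical information and warm data only statistical information is not taken from the source.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical processing modes; the source does not define their internals.
def process_cold(records):
    # Cold data yields historical information (returned unchanged here).
    return {"historical": list(records)}

def process_warm(records):
    # Warm data yields statistical information (a count and a sum here).
    return {"statistical": {"count": len(records), "total": sum(records)}}

def process_hot(records):
    # Hot data yields real-time information (the most recent value here).
    return {"real_time": records[-1] if records else None}

# S102: one target processing mode per sub-data type.
PROCESSING_MODES = {
    "cold": process_cold,
    "warm": process_warm,
    "hot": process_hot,
}

def analyze(sub_data):
    """S103: run each type's processing mode in parallel and merge the results."""
    result = {}
    with ThreadPoolExecutor(max_workers=3) as pool:
        futures = {
            kind: pool.submit(PROCESSING_MODES[kind], records)
            for kind, records in sub_data.items()
        }
        for future in futures.values():
            result.update(future.result())
    return result
```

Calling `analyze({"cold": [...], "warm": [...], "hot": [...]})` returns one dictionary holding the historical, statistical and real-time parts of the analysis result.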
Data to be processed is acquired and classified according to its access frequency to obtain sub-data of different types, the sub-data comprising cold data, warm data and hot data. A target data processing mode is selected based on the type of the sub-data to obtain a first target data processing mode corresponding to the cold data, a second target data processing mode corresponding to the warm data and a third target data processing mode corresponding to the hot data. The cold data, the warm data and the hot data are analyzed according to the first, second and third target data processing modes to obtain an analysis result of the data to be processed, the analysis result comprising historical information, statistical information and real-time information. A large number of big data analysis components such as Spark, Flink, Storm, HBase and MapReduce need not be invoked, so technicians do not need to acquire a considerable amount of specialized knowledge, people without experience in designing data analysis frameworks can use the method conveniently, and the server resources occupied by big data analysis components, and hence the cost, are reduced.
In a possible implementation manner, the step of classifying the data to be processed according to the accessed frequency of the data to be processed to obtain sub-data of different types includes:
acquiring a data field in the data to be processed, and calculating a heat value of the data field according to the accessed frequency of the data field;
obtaining different types of sub-data based on the heat value, specifically comprising:
if the heat value is greater than or equal to a first heat threshold, the data to be processed is hot data;
if the heat value is smaller than the first heat threshold and larger than a second heat threshold, the data to be processed is warm data;
and if the heat value is smaller than the second heat threshold, the data to be processed is cold data.
Illustratively, a heat threshold is first set for each data field, and the heat value of each data field is then calculated. An in-memory database such as memcached records the access count and the access start time S of each field, and the access count is incremented by one on every access. At time E, the access count Q of each field per unit time T (for example, every 15 minutes) is extracted, and the heat value is P = Q(E-S)/T. The heat value of the data is compared with the heat thresholds to obtain the sub-data of different types. Introducing heat thresholds and data fields means that data is distinguished per data field, which makes the distinction between cold and hot data finer-grained and purifies the hot data. Adjusting the heat thresholds makes it easy to match the volume of hot data to the memory capacity, so that currently available resources are used to the fullest.
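The heat value calculation and the threshold comparison can be sketched as follows. The formula is implemented exactly as printed in the text, P = Q(E-S)/T; the function names, the argument units, and the treatment of a heat value exactly equal to the second threshold (which the text leaves unspecified) are assumptions of this sketch.

```python
def heat_value(q, s, e, t):
    """Heat value as printed in the text: P = Q(E - S)/T.

    q: access count Q extracted per unit time T at time E
    s: access start time S
    e: extraction time E
    t: unit time T, in the same units as s and e
    """
    if t <= 0:
        raise ValueError("unit time T must be positive")
    return q * (e - s) / t

def classify(p, first_threshold, second_threshold):
    """Map a heat value to hot, warm, or cold using the two thresholds."""
    if second_threshold > first_threshold:
        raise ValueError("second threshold must not exceed the first")
    if p >= first_threshold:
        return "hot"
    if p > second_threshold:
        return "warm"
    # Assumption: the text does not cover p == second_threshold; it is cold here.
    return "cold"
```

For instance, with S = 0, E = 900 and T = 900 (seconds) and Q = 10, `heat_value` returns 10.0, which `classify(10.0, 8, 2)` maps to hot data.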
In a possible implementation manner, before the step of obtaining the sub-data of different types based on the heat value, the method further includes:
acquiring user request information, wherein the user request information comprises service scene information;
determining the first heat threshold and the second heat threshold based on the service scenario information.
Illustratively, different first and second heat thresholds are set for different service scenario information. For example, a hot data mark value and a cold data mark value are calculated from the historical processing modes of the data; the two mark values are weighted, and the cold/hot level of the data is determined from the correspondence between the weighted result and the cold/hot levels. When the historical processing mode of the data is a host write, the hot data mark value is increased; when it is a garbage-collection write or an error-handling recovery write, the cold data mark value is increased. The basis for determining the data attribute is thus the historical processing mode of the data (for example, data written by the host or data written by garbage collection) rather than time. This basis is unaffected by a power failure and remains accurate, so the cold/hot level of the data can be determined without multiple determination passes, and the cold/hot attribute of the data can be determined accurately and comprehensively.
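The mark-value approach can be sketched as below. The source gives no numeric weights, no combination rule and no level boundaries, so the weights, the signed weighted score, the cutoffs and the mode names (`host_write`, `gc_write`, `error_recovery_write`) are all hypothetical values chosen for illustration.

```python
# Hypothetical weights; the source gives no numeric values.
HOT_WEIGHT = 1.0
COLD_WEIGHT = 1.0

def mark_values(history):
    """Count hot and cold marks from a list of historical processing modes."""
    hot = cold = 0
    for mode in history:
        if mode == "host_write":
            hot += 1      # host writes raise the hot data mark value
        elif mode in ("gc_write", "error_recovery_write"):
            cold += 1     # garbage-collection and recovery writes raise the cold mark
    return hot, cold

def temperature_level(history, hot_cutoff=2.0, cold_cutoff=-2.0):
    """Weight the two mark values and map the score to a cold/hot level."""
    hot, cold = mark_values(history)
    score = HOT_WEIGHT * hot - COLD_WEIGHT * cold  # assumed combination rule
    if score >= hot_cutoff:
        return "hot"
    if score <= cold_cutoff:
        return "cold"
    return "warm"
```

Because the score depends only on the recorded processing modes and not on timestamps, it can be recomputed after a restart without loss of accuracy, which matches the motivation given in the text.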
In a possible implementation manner, the step of analyzing the cold data according to the first target data processing manner to obtain an analysis result of the to-be-processed data includes:
determining dimension data of the service scenario based on the service scenario information;
performing data classification on the cold data based on the dimension data to obtain first data and second data, wherein the first data is stored in ClickHouse, and the second data is stored in MySQL;
and obtaining the historical information and the statistical information of the data to be processed based on the first data stored in ClickHouse, and storing the statistical information in MySQL.
Illustratively, ClickHouse pulls the first data from Kafka in real time through a real-time synchronization engine, and the second data is pulled by a Kafka consumer service and then saved to MySQL.
In a possible embodiment, the step of performing data classification on the cold data based on the dimension data includes:
setting cold data topics according to the dimension data;
performing data classification on the cold data based on the cold data topics to obtain target data corresponding to each cold data topic;
and taking the target data which accords with the analysis request information as the first data, and taking the target data which does not accord with the analysis request information as the second data.
For example, the data collected from a moving vehicle includes position-related, engine-parameter-related, alarm-related, mileage-related and exhaust-emission-related data. Cold data for a service scenario, such as cold data related to vehicle engine parameters, is screened out according to the service scenario and then combined with the user's analysis request information. If the user requests an analysis of engine oil consumption, for instance, the oil-quantity cold data within the engine-parameter cold data is determined.
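The topic-based classification and the split into first and second data can be sketched as follows. The record layout, the topic-to-field mapping, and the rule that a record "accords with the analysis request" when its field name is among the requested fields are assumptions made for illustration; topics are assumed to have disjoint field sets.

```python
def split_cold_data(records, topic_fields, requested_fields):
    """Group cold records by topic, then split them by the analysis request.

    records: list of dicts with a "field" key (hypothetical layout)
    topic_fields: dict mapping a topic name to the set of field names it covers
    requested_fields: set of field names named in the user's analysis request
    """
    # Classify each record under the cold data topic that covers its field.
    by_topic = {topic: [] for topic in topic_fields}
    for rec in records:
        for topic, fields in topic_fields.items():
            if rec["field"] in fields:
                by_topic[topic].append(rec)

    # First data matches the analysis request; second data does not.
    first, second = [], []
    for recs in by_topic.values():
        for rec in recs:
            if rec["field"] in requested_fields:
                first.append(rec)
            else:
                second.append(rec)
    return by_topic, first, second
```

With an `engine` topic covering `oil_level` and `rpm` and a request for `oil_level`, the oil-level records become first data (destined for ClickHouse in the text) and the remaining records become second data (destined for MySQL).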
In a possible embodiment, the step of analyzing the warm data according to the second target data processing mode to obtain an analysis result of the data to be processed includes:
performing real-time cumulative analysis on the warm data to obtain an analysis result of the warm data in the data to be processed, and storing the analysis result in MySQL.
Illustratively, because warm data lies between cold data and hot data, it is analyzed cumulatively in real time to ensure that no data is missed.
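One way to picture real-time cumulative analysis is a set of running aggregates that are updated as each warm record arrives, keyed by the time periods named in the Background (year, month, day, hour, minute). The class below is a sketch under that assumption; the class name, the count-and-sum aggregates and the in-memory dictionary (standing in for the MySQL result table) are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

class CumulativeStats:
    """Running count and total per time bucket, updated record by record."""

    # One bucket key format per time period.
    FORMATS = {
        "year": "%Y",
        "month": "%Y-%m",
        "day": "%Y-%m-%d",
        "hour": "%Y-%m-%d %H",
        "minute": "%Y-%m-%d %H:%M",
    }

    def __init__(self):
        self.buckets = defaultdict(lambda: {"count": 0, "total": 0.0})

    def add(self, timestamp, value):
        # Each incoming record updates every period it falls into.
        for period, fmt in self.FORMATS.items():
            bucket = self.buckets[(period, timestamp.strftime(fmt))]
            bucket["count"] += 1
            bucket["total"] += value

    def get(self, period, key):
        # Return a copy so callers cannot mutate the running aggregates.
        return dict(self.buckets[(period, key)])
```

Because every record is folded into the aggregates on arrival, no record is skipped between batch runs, which is the property the text asks for.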
For example, after the OBD data of a moving vehicle is sent to the platform (reported every 10 seconds), the front-end page needs to display three types of data. First, data that must be displayed on the page in real time (refreshed every 10 seconds) can be defined as hot data. This data is stored in a MySQL relational table, and each incoming report overwrites the previous record for the same device ID every 10 seconds; the front-end page then reads the data directly from MySQL and displays it as real-time hot data. Second: [...]. Third, pure historical information data (unprocessed) must be displayed on the page. This data is rarely read under normal conditions, but the historically accumulated collected information must not be lost, and it is usually used for auditing or for displaying historical data. This data is cold data.
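The overwrite-by-device-ID update for hot data can be sketched with a table that holds exactly one row per device. To keep the example self-contained, Python's built-in `sqlite3` module stands in for MySQL here; the table name, the column names and the SQLite-specific `INSERT OR REPLACE` statement are assumptions of this sketch and are not taken from the source.

```python
import sqlite3

def open_store():
    """Create an in-memory table with one row per device (SQLite stand-in)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE hot_data ("
        "device_id TEXT PRIMARY KEY, payload TEXT, reported_at INTEGER)"
    )
    return conn

def overwrite_latest(conn, device_id, payload, reported_at):
    # The primary key on device_id makes each new report replace the old row.
    conn.execute(
        "INSERT OR REPLACE INTO hot_data (device_id, payload, reported_at) "
        "VALUES (?, ?, ?)",
        (device_id, payload, reported_at),
    )
    conn.commit()

def read_latest(conn, device_id):
    # What the front-end page would read for real-time display.
    return conn.execute(
        "SELECT payload, reported_at FROM hot_data WHERE device_id = ?",
        (device_id,),
    ).fetchone()
```

The table size stays bounded by the number of devices rather than the number of reports, so reads for the real-time display remain fast however long the system has been running.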
In one possible embodiment, the method further comprises:
and performing data visualization processing on the historical information, the statistical information and the real-time information to obtain display information.
Illustratively, the analysis result of the cold data and the analysis result of the warm data are combined to obtain complete statistical information. The historical information obtained by analyzing the cold data, the real-time information obtained by analyzing the hot data, and the statistical information obtained by combining the cold data and warm data analysis results are then visualized, so that the user can conveniently inspect the analysis results.
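Combining the cold data and warm data analysis results into complete statistical information can be sketched as a per-period merge. The result layout (a dictionary from a period key to a count and a total) is a hypothetical shape chosen for the example; the source does not specify how the two results are represented.

```python
def merge_statistics(cold_stats, warm_stats):
    """Add up per-period counts and totals from the cold and warm analyses."""
    merged = {}
    for source in (cold_stats, warm_stats):
        for period, stats in source.items():
            entry = merged.setdefault(period, {"count": 0, "total": 0.0})
            entry["count"] += stats["count"]
            entry["total"] += stats["total"]
    return merged
```

Periods that appear in only one of the two inputs are carried over unchanged, so the merged result covers the full time range of both analyses.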
As shown in FIG. 2, in a possible embodiment, the data to be processed is extracted from a Nacos database and divided, through a device gateway and a multi-protocol packet parser, into cold data (Kafka), warm data (RocketMQ) and hot data (Redis). For the cold data, when big data analysis is performed, the data is classified along multiple dimensions of the service scenario, which yields two types of data: first data and second data. ClickHouse pulls the first data from Kafka through a real-time synchronization engine; the second data is pulled by a Kafka consumer service and then stored in MySQL. The first data is used for historical data query and acquisition and is connected to the web and/or the Nacos database through a historical data acquisition interface; it is also used for statistical analysis of historical data, and the analysis results are stored in MySQL. For the warm data, an analysis result is obtained by real-time cumulative analysis and stored in MySQL. The hot data is connected to the web and/or the Nacos database through a real-time data acquisition interface, and MySQL is connected to the web and/or the Nacos database through a business data acquisition interface.
Illustratively, when data is classified along multiple dimensions of the service scenario, if the service scenario is primary service processing, the information is fed back to the cold data.
In one possible embodiment, as shown in fig. 3, the present application provides a data analysis apparatus comprising:
the data classification module 201, configured to acquire data to be processed and classify the data to be processed according to its access frequency to obtain sub-data of different types, where the sub-data includes cold data, warm data and hot data;
a selection module 202, configured to select a target data processing mode based on the type of the sub-data, so as to obtain a first target data processing mode corresponding to the cold data, a second target data processing mode corresponding to the warm data, and a third target data processing mode corresponding to the hot data;
an analysis module 203, configured to analyze the cold data, the warm data and the hot data according to the first target data processing mode, the second target data processing mode and the third target data processing mode to obtain an analysis result of the data to be processed, where the analysis result includes historical information, statistical information and real-time information.
In one possible implementation, as shown in FIG. 4, an embodiment of the present application provides an electronic device 300 comprising a memory 310, a processor 320, and a computer program 311 stored on the memory 310 and executable on the processor 320. When executing the computer program 311, the processor 320 implements: acquiring data to be processed, and classifying the data to be processed according to its access frequency to obtain sub-data of different types, wherein the sub-data comprises cold data, warm data and hot data; selecting a target data processing mode based on the type of the sub-data to obtain a first target data processing mode corresponding to the cold data, a second target data processing mode corresponding to the warm data and a third target data processing mode corresponding to the hot data; and analyzing the cold data, the warm data and the hot data according to the first target data processing mode, the second target data processing mode and the third target data processing mode to obtain an analysis result of the data to be processed, wherein the analysis result comprises historical information, statistical information and real-time information.
In one possible implementation, as shown in FIG. 5, the present application provides a computer-readable storage medium 400 on which a computer program 411 is stored. When executed by a processor, the computer program 411 implements: acquiring data to be processed, and classifying the data to be processed according to its access frequency to obtain sub-data of different types, wherein the sub-data comprises cold data, warm data and hot data; selecting a target data processing mode based on the type of the sub-data to obtain a first target data processing mode corresponding to the cold data, a second target data processing mode corresponding to the warm data and a third target data processing mode corresponding to the hot data; and analyzing the cold data, the warm data and the hot data according to the first target data processing mode, the second target data processing mode and the third target data processing mode to obtain an analysis result of the data to be processed, wherein the analysis result comprises historical information, statistical information and real-time information.
Computer storage media for embodiments of the invention may employ any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. The computer-readable storage medium may be, for example but not limited to: an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, Smalltalk, C++ or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
Those skilled in the art will understand that the modules or steps of the present invention described above can be implemented by a general-purpose computing device. They can be centralized in a single computing device or distributed over a network of multiple computing devices. Alternatively, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; or they can each be fabricated as a separate integrated circuit module, or several of the modules or steps can be fabricated as a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.
While the invention has been described in connection with what is presently considered to be the most practical and preferred embodiment, it is to be understood that the invention is not to be limited to the disclosed embodiment, but on the contrary, is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.
Claims (10)
1. A method of data analysis, comprising:
acquiring data to be processed, and classifying the data to be processed according to the accessed frequency of the data to be processed to obtain subdata of different types, wherein the subdata comprises cold data, warm data and hot data;
selecting a target data processing mode based on the type of the subdata to obtain a first target data processing mode corresponding to the cold data, a second target data processing mode corresponding to the warm data and a third target data processing mode corresponding to the hot data;
and analyzing the cold data, the warm data and the hot data according to the first target data processing mode, the second target data processing mode and the third target data processing mode to obtain an analysis result of the data to be processed, wherein the analysis result comprises historical information, statistical information and real-time information.
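The selection and analysis steps of claim 1 can be illustrated with a minimal Python sketch. It assumes the subdata has already been classified, and the three processing functions (`process_cold`, `process_warm`, `process_hot`) and what they compute are hypothetical placeholders; the claim only requires that each data type has its own target processing mode.

```python
from typing import Callable, Dict, List

Record = Dict[str, float]

# Hypothetical processing modes. The claim only requires that each data
# type has its own mode; what each mode computes here is an assumption.
def process_cold(records: List[Record]) -> dict:
    # Batch-style summary over rarely accessed records.
    return {"record_count": len(records)}

def process_warm(records: List[Record]) -> dict:
    # Accumulated total over moderately accessed records.
    return {"total": sum(r["value"] for r in records)}

def process_hot(records: List[Record]) -> dict:
    # Latest value of frequently accessed records.
    return {"latest": records[-1]["value"] if records else None}

# Selection step: one target processing mode per subdata type.
TARGET_MODES: Dict[str, Callable[[List[Record]], dict]] = {
    "cold": process_cold,
    "warm": process_warm,
    "hot": process_hot,
}

def analyze(subdata: Dict[str, List[Record]]) -> Dict[str, dict]:
    """Analysis step: apply the mode selected for each type and collect results."""
    return {kind: TARGET_MODES[kind](records) for kind, records in subdata.items()}
```

Keeping the type-to-mode mapping in a dictionary means a mode can be swapped without touching the dispatch logic.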
2. The data analysis method of claim 1, wherein the step of classifying the data to be processed according to the accessed frequency of the data to be processed to obtain subdata of different types comprises:
acquiring a data field in the data to be processed, and calculating a heat value of the data field according to the accessed frequency of the data field;
obtaining different types of subdata based on the heat value, specifically comprising:
if the heat value is greater than or equal to a first heat threshold, the data to be processed is hot data;
if the heat value is smaller than the first heat threshold and larger than a second heat threshold, the data to be processed is warm data;
and if the heat value is smaller than the second heat threshold, the data to be processed is cold data.
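The threshold rule of claim 2 can be sketched in Python as follows. The claim does not give a formula for the heat value, so `heat_value` (accesses per second over an observation window) is a hypothetical choice. The claim also leaves the case where the heat value equals the second threshold unspecified; this sketch assumes it is treated as cold data.

```python
def heat_value(access_count: int, window_seconds: float) -> float:
    """Hypothetical heat value: accesses per second over an observation window."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return access_count / window_seconds

def classify(heat: float, first_threshold: float, second_threshold: float) -> str:
    """Map a heat value to 'hot', 'warm' or 'cold' using two thresholds."""
    if second_threshold >= first_threshold:
        raise ValueError("second_threshold must be below first_threshold")
    if heat >= first_threshold:
        return "hot"
    if heat > second_threshold:
        return "warm"
    # Covers heat < second_threshold as claimed, and (by assumption)
    # the boundary case heat == second_threshold.
    return "cold"
```

For example, with a first threshold of 10 and a second threshold of 2, a field accessed 300 times in 60 seconds has a heat value of 5 and is classified as warm.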
3. The data analysis method of claim 2, further comprising, prior to the step of obtaining different types of subdata based on the heat value:
acquiring user request information, wherein the user request information comprises service scene information;
determining the first heat threshold and the second heat threshold based on the service scene information.
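One simple way to realize claim 3 is a lookup from service scene to a pair of thresholds. The sketch below is an assumption-laden illustration: the scene names, the threshold values, the `service_scene` request key and the default fallback are all hypothetical and do not come from the application.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class HeatThresholds:
    first: float   # at or above this value the data is hot
    second: float  # below this value the data is cold

# Hypothetical scenes and values, for illustration only.
SCENE_THRESHOLDS = {
    "dashboard": HeatThresholds(first=100.0, second=10.0),
    "audit": HeatThresholds(first=20.0, second=2.0),
}
DEFAULT_THRESHOLDS = HeatThresholds(first=50.0, second=5.0)

def thresholds_for(request: dict) -> HeatThresholds:
    """Pick the threshold pair for the service scene named in the user request."""
    scene = request.get("service_scene")
    return SCENE_THRESHOLDS.get(scene, DEFAULT_THRESHOLDS)
```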
4. The data analysis method of claim 3, wherein the step of analyzing the cold data according to the first target data processing manner to obtain an analysis result of the data to be processed comprises:
determining dimension data and analysis request information of the service scene based on the service scene information;
performing data classification on the cold data based on the dimension data to obtain first data and second data, wherein the first data are stored in ClickHouse, and the second data are stored in MySQL;
and obtaining the historical information and the statistical information of the data to be processed based on the first data stored in ClickHouse, and storing the statistical information in MySQL.
5. The data analysis method of claim 4, wherein the step of performing data classification on the cold data based on the dimension data comprises:
setting cold data topics according to the dimension data;
performing data classification on the cold data based on the cold data topics to obtain target data corresponding to each cold data topic;
and taking the target data which conforms to the analysis request information as the first data, and taking the target data which does not conform to the analysis request information as the second data.
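The split described in claims 4 and 5 can be sketched in Python as a grouping step followed by a partition. This is a minimal illustration under stated assumptions: the topic of a record is taken to be its value for a single dimension field, and "conforms to the analysis request information" is modeled as membership in a set of requested topics. Both are hypothetical simplifications, and the two returned lists stand in for the data destined for ClickHouse and MySQL; no database client is used.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

def split_cold_data(
    cold_records: List[dict],
    dimension: str,
    requested_topics: Set[str],
) -> Tuple[List[dict], List[dict]]:
    """Group cold records by topic, then split them into first and second data."""
    # Classify the cold data by topic (one topic per distinct dimension value).
    by_topic: Dict[str, List[dict]] = defaultdict(list)
    for record in cold_records:
        by_topic[record[dimension]].append(record)

    first_data: List[dict] = []   # conforms to the analysis request
    second_data: List[dict] = []  # does not conform to the analysis request
    for topic, records in by_topic.items():
        target = first_data if topic in requested_topics else second_data
        target.extend(records)
    return first_data, second_data
```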
6. The data analysis method according to claim 4, wherein the step of analyzing the warm data according to the second target data processing mode to obtain an analysis result of the data to be processed comprises:
and performing real-time accumulative analysis on the warm data to obtain an analysis result of the warm data in the data to be processed, and storing the analysis result in MySQL.
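Real-time accumulative analysis is commonly implemented as a running aggregate that is updated as each record arrives, instead of recomputing over the full history. The Python sketch below shows that general technique; the `RunningTotals` class, its per-key count, sum and mean, and the in-memory storage are assumptions made for illustration, and persisting the snapshot to MySQL is left out.

```python
from collections import defaultdict
from typing import Dict

class RunningTotals:
    """Keeps per-key count and sum so each new record is an O(1) update."""

    def __init__(self) -> None:
        self._count: Dict[str, int] = defaultdict(int)
        self._sum: Dict[str, float] = defaultdict(float)

    def update(self, key: str, value: float) -> None:
        # Fold one incoming warm-data record into the running aggregates.
        self._count[key] += 1
        self._sum[key] += value

    def snapshot(self) -> Dict[str, dict]:
        # Current accumulated result; this is what would be written to storage.
        return {
            key: {
                "count": self._count[key],
                "sum": self._sum[key],
                "mean": self._sum[key] / self._count[key],
            }
            for key in self._count
        }
```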
7. The data analysis method of claim 1, wherein the method further comprises:
and performing data visualization processing on the historical information, the statistical information and the real-time information to obtain display information.
8. A data analysis apparatus, characterized in that the apparatus comprises:
the data classification module is used for acquiring data to be processed and classifying the data to be processed according to the accessed frequency of the data to be processed to obtain different types of subdata, wherein the subdata comprises cold data, warm data and hot data;
the selection module is used for selecting a target data processing mode based on the type of the subdata to obtain a first target data processing mode corresponding to the cold data, a second target data processing mode corresponding to the warm data and a third target data processing mode corresponding to the hot data;
and the analysis module is used for analyzing the cold data, the warm data and the hot data according to the first target data processing mode, the second target data processing mode and the third target data processing mode to obtain an analysis result of the data to be processed, wherein the analysis result comprises historical information, statistical information and real-time information.
9. An electronic device, comprising: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating over the bus when the electronic device is operating, the machine-readable instructions when executed by the processor performing the steps of the data analysis method of any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, which computer program, when being executed by a processor, performs the steps of the data analysis method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211567870.8A CN115640370A (en) | 2022-12-08 | 2022-12-08 | Data analysis method and related equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211567870.8A CN115640370A (en) | 2022-12-08 | 2022-12-08 | Data analysis method and related equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115640370A (en) | 2023-01-24 |
Family
ID=84948639
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211567870.8A Pending CN115640370A (en) | 2022-12-08 | 2022-12-08 | Data analysis method and related equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115640370A (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109240611A (en) * | 2018-08-28 | 2019-01-18 | 郑州云海信息技术有限公司 | The cold and hot data hierarchy method of small documents, small documents data access method and its device |
CN109886859A (en) * | 2019-01-30 | 2019-06-14 | 上海赜睿信息科技有限公司 | Data processing method, system, electronic equipment and computer readable storage medium |
US20190332298A1 (en) * | 2018-04-27 | 2019-10-31 | Western Digital Technologies, Inc. | Methods and apparatus for configuring storage tiers within ssds |
CN110858210A (en) * | 2018-08-17 | 2020-03-03 | 阿里巴巴集团控股有限公司 | Data query method and device |
CN111444249A (en) * | 2020-03-03 | 2020-07-24 | 中国平安人寿保险股份有限公司 | User portrait generation method, device and equipment based on thermal data and storage medium |
CN112445970A (en) * | 2019-09-05 | 2021-03-05 | 北京达佳互联信息技术有限公司 | Information recommendation method and device, electronic equipment and storage medium |
CN114839889A (en) * | 2022-05-05 | 2022-08-02 | 罗剑云 | Big data analysis-based mode switching method and system |
CN115437997A (en) * | 2022-07-25 | 2022-12-06 | 杭州数澜科技有限公司 | Intelligent identification optimization system for data life cycle |
Non-Patent Citations (1)
Title |
---|
高健 等: "基于层次化体系的武器系统大数据管理研究" * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107818150B (en) | Log auditing method and device | |
US10592666B2 (en) | Detecting anomalous entities | |
WO2017113677A1 (en) | User behavior data processing method and system | |
CN104091276B (en) | The method of on-line analysis clickstream data and relevant apparatus and system | |
US11494409B2 (en) | Asynchronously processing sequential data blocks | |
CN110147470B (en) | Cross-machine-room data comparison system and method | |
CN112035534A (en) | Real-time big data processing method and device and electronic equipment | |
CN111949850A (en) | Multi-source data acquisition method, device, equipment and storage medium | |
US8140671B2 (en) | Apparatus and method for sampling security events based on contents of the security events | |
CN114116872A (en) | Data processing method and device, electronic equipment and computer readable storage medium | |
CN113453076B (en) | User video service quality evaluation method, device, computing equipment and storage medium | |
CN115640370A (en) | Data analysis method and related equipment | |
CN116016288A (en) | Flow monitoring method, device, equipment and storage medium of industrial equipment | |
CN113965408B (en) | Method, device, medium and equipment for extracting HTTP (hyper text transport protocol) message | |
CN110708361A (en) | System, method and device for determining grade of digital content publishing user and server | |
CN111881170B (en) | Method, device, equipment and storage medium for mining timeliness query content field | |
CN112396236B (en) | Traffic flow prediction method, system, server and storage medium | |
CN113919446A (en) | Method and device for model training and similarity determination of multimedia resources | |
CN110633430B (en) | Event discovery method, apparatus, device, and computer-readable storage medium | |
CN116939669B (en) | Network element identification method, system, equipment and readable medium based on IP learning table | |
CN117349388B (en) | Data timeliness determination method and electronic equipment | |
CN113626684B (en) | Method and device for analyzing advancing mode of object | |
CN112738718B (en) | Space-time big data track matching method based on LSA | |
CN105912736A (en) | URL classifying method and device | |
CN116866213A (en) | Flow distribution acquisition method, device, equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20230124 |