CN116340297A - Data analysis method, device, equipment and storage medium - Google Patents

Data analysis method, device, equipment and storage medium

Info

Publication number
CN116340297A
Authority
CN
China
Prior art keywords
data
dependent
data analysis
storage space
analyzed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111584635.7A
Other languages
Chinese (zh)
Inventor
张旭华
叶邦宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd filed Critical Beijing Dajia Internet Information Technology Co Ltd
Priority to CN202111584635.7A priority Critical patent/CN116340297A/en
Publication of CN116340297A publication Critical patent/CN116340297A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The present disclosure relates to a data analysis method, apparatus, device, and storage medium, the method comprising: acquiring original data in real time, preprocessing the original data, and taking the preprocessed original data as data to be analyzed; after receiving a data analysis request, acquiring dependent data corresponding to the data analysis request, wherein the dependent data comprises offline data and data to be analyzed; and analyzing the dependent data to obtain a data analysis result. Therefore, after the data analysis request is received, the acquired dependent data comprises the offline data and the data to be analyzed acquired in real time, and the data can be timely used for data analysis no matter how fast the data arrives, so that the execution time of a data analysis task is shortened, and the data analysis efficiency is improved.

Description

Data analysis method, device, equipment and storage medium
Technical Field
The present invention relates to the field of big data, and in particular, to a data analysis method, apparatus, device, and storage medium.
Background
In some scenarios, application software needs to push information to users through different platforms in order to acquire new users by pushing and displaying that information. To refine and optimize subsequent information pushing, it is generally necessary to analyze the fingerprint data of each new user by technical means and determine which platform's information push brought about the new user's activation. This data analysis process may be referred to as data attribution.
In the prior art, data attribution adopts a unified stream-batch architecture, in which operations such as join (data association), map (data mapping), and reduce (data reduction) are performed on multiple pieces of data through SQL (Structured Query Language) statements. However, these operations must be performed on both real-time and offline data, and because some data arrives quickly and some arrives slowly, the task execution time is difficult to predict and cannot be measured.
Disclosure of Invention
In order to solve the problem that task execution time is difficult to predict and cannot be measured in the related art, the disclosure provides a data analysis method, a device, equipment and a storage medium, and the technical scheme of the disclosure is as follows:
according to a first aspect of embodiments of the present disclosure, there is provided a data analysis method, including:
acquiring original data in real time, preprocessing the original data, and taking the preprocessed original data as data to be analyzed;
after receiving a data analysis request, acquiring dependent data corresponding to the data analysis request, wherein the dependent data comprises offline data and data to be analyzed;
and analyzing the dependent data to obtain a data analysis result.
Optionally, after the obtaining the dependent data corresponding to the data analysis request, the method further includes:
acquiring marking information of the dependent data, wherein the marking information is used for indicating the use state of the dependent data;
screening unused dependent data according to the marking information to obtain target data;
the step of analyzing the dependent data to obtain a data analysis result comprises the following steps:
and analyzing the target data to obtain a data analysis result.
Optionally, the obtaining the marking information of the dependent data includes:
and acquiring the marking information of the dependent data from a first preset storage space, wherein the first preset storage space is a storage space outside the local storage space.
Optionally, after preprocessing the raw data and taking the preprocessed raw data as the data to be analyzed, the method further includes:
storing the data to be analyzed in buckets in different storage nodes of a second preset storage space, wherein the second preset storage space is used for storing offline data and the data to be analyzed;
the obtaining the dependent data corresponding to the data analysis request includes:
and acquiring the dependent data corresponding to the data analysis request from different storage nodes of the second preset storage space.
Optionally, after the data to be analyzed is stored in buckets in the different storage nodes of the second preset storage space, the method further includes:
taking, in advance, offline data and data to be analyzed whose download amount meets a preset condition as candidate data, and storing the candidate data in a local storage space;
before the obtaining the dependent data corresponding to the data analysis request from the different storage nodes in the second preset storage space, the method further includes:
obtaining dependent data corresponding to the data analysis request from candidate data stored in the local storage space;
and executing the step of acquiring the dependent data corresponding to the data analysis request from different storage nodes in the second preset storage space under the condition that the dependent data corresponding to the data analysis request is not acquired.
According to a second aspect of embodiments of the present disclosure, there is also provided a data analysis apparatus, the apparatus comprising:
the processing unit is configured to acquire original data in real time, preprocess the original data and take the preprocessed original data as data to be analyzed;
the acquisition unit is configured to acquire dependent data corresponding to the data analysis request after receiving the data analysis request, wherein the dependent data comprises offline data and data to be analyzed;
and the analysis unit is configured to perform analysis on the dependent data to obtain a data analysis result.
Optionally, the acquiring unit is configured to perform:
acquiring marking information of the dependent data, wherein the marking information is used for indicating the use state of the dependent data;
screening unused dependent data according to the marking information to obtain target data;
the analysis unit is configured to perform:
and analyzing the target data to obtain a data analysis result.
Optionally, the acquiring unit is configured to perform:
and acquiring the marking information of the dependent data from a first preset storage space, wherein the first preset storage space is a storage space outside the local storage space.
Optionally, the processing unit is configured to perform:
storing the data to be analyzed in buckets in different storage nodes of a second preset storage space, wherein the second preset storage space is used for storing offline data and the data to be analyzed;
the acquisition unit is configured to perform:
and acquiring the dependent data corresponding to the data analysis request from different storage nodes of the second preset storage space.
Optionally, the processing unit is configured to perform:
taking, in advance, offline data and data to be analyzed whose download amount meets a preset condition as candidate data, and storing the candidate data in a local storage space;
the acquisition unit is configured to perform:
obtaining dependent data corresponding to the data analysis request from candidate data stored in the local storage space; and executing the step of acquiring the dependent data corresponding to the data analysis request from different storage nodes in the second preset storage space under the condition that the dependent data corresponding to the data analysis request is not acquired.
According to a third aspect of embodiments of the present disclosure, there is also provided an electronic device, including:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the method of the first aspect.
According to a fourth aspect of embodiments of the present disclosure, there is also provided a computer readable storage medium storing instructions which, when executed by a processor of an electronic device, cause the electronic device to perform the method of the first aspect.
According to a fifth aspect of embodiments of the present disclosure, there is provided a computer program product containing instructions which, when run on a computer, cause the computer to carry out the method of the first aspect.
In the technical solution provided by the embodiments of the present disclosure, original data is acquired in real time and preprocessed, and the preprocessed original data is taken as data to be analyzed; after a data analysis request is received, dependent data corresponding to the data analysis request is acquired, wherein the dependent data comprises offline data and the data to be analyzed; and the dependent data is analyzed to obtain a data analysis result.
Therefore, after the data analysis request is received, the acquired dependent data comprises the offline data and the data to be analyzed acquired in real time, and the data can be timely used for data analysis no matter how fast the data arrives, so that the execution time of a data analysis task is shortened, and the data analysis efficiency is improved.
Drawings
FIG. 1 is a flow chart illustrating a method of data analysis according to an exemplary embodiment;
FIG. 2 is a flow chart illustrating another data analysis method according to an exemplary embodiment;
FIG. 3 is a block diagram of a data analysis device, according to an example embodiment;
FIG. 4 is a block diagram of an apparatus according to an exemplary embodiment.
Detailed Description
In order to enable those skilled in the art to better understand the technical solutions of the present disclosure, the technical solutions of the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the foregoing figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the disclosure described herein may be capable of operation in sequences other than those illustrated or described herein. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present disclosure as detailed in the accompanying claims. In addition, the user information (including but not limited to user equipment information, user personal information, etc.) related to the present disclosure is information authorized by the user or sufficiently authorized by each party.
In order to solve the problem that task execution time is difficult to predict and cannot be measured in the related art, the embodiments of the present disclosure provide a data analysis method, a device, equipment and a storage medium.
In a first aspect, a data analysis method provided by an embodiment of the present disclosure will be described in detail.
As shown in fig. 1, a flowchart of a data analysis method according to an embodiment of the present disclosure specifically includes the following steps.
In S11, the original data is acquired in real time, and is preprocessed, and the preprocessed original data is used as the data to be analyzed.
The original data is the data that is to be analyzed. For example, in order to acquire more new users, an application program can place advertisements on other platforms, record the download and installation actions that users of those platforms take in response to the advertisements, and then determine through data analysis which platform brought about a given user activation. The raw data may be fingerprint data of the device, such as IMEI (International Mobile Equipment Identity) data, IP (Internet Protocol) data, etc.
The original data may contain some invalid data, or the data formats of the original data may differ from each other, so that the original data needs to be preprocessed to obtain the data to be analyzed.
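The preprocessing described above can be illustrated with a minimal sketch. The disclosure does not specify the preprocessing rules, so everything here is an assumption made for illustration: the field names `imei`, `ip`, and `ts`, the rule that a malformed IMEI (not 15 digits) is blanked, and the rule that a record with no usable fingerprint field is treated as invalid.

```python
from typing import Optional


def preprocess(record: dict) -> Optional[dict]:
    """Return a normalized record, or None if the record is invalid.

    Field names and validity rules are hypothetical, not taken from the disclosure.
    """
    imei = str(record.get("imei", "")).strip()
    ip = str(record.get("ip", "")).strip().lower()
    # Assumption: a usable IMEI is exactly 15 digits; otherwise it is blanked.
    if not (imei.isdigit() and len(imei) == 15):
        imei = ""
    # Assumption: a record with no usable fingerprint field is invalid data.
    if not imei and not ip:
        return None
    # Unify differing formats: the timestamp is always stored as an int.
    return {"imei": imei, "ip": ip, "ts": int(record.get("ts", 0))}


raw = [
    {"imei": " 490154203237518 ", "ip": "10.0.0.1", "ts": "1700000000"},
    {"imei": "", "ip": ""},
    {"imei": "bad", "ip": "10.0.0.2", "ts": 1700000001},
]
# The preprocessed records play the role of the "data to be analyzed".
to_analyze = [r for r in (preprocess(x) for x in raw) if r is not None]
```

In this sketch the second raw record is dropped as invalid, and the third is kept with its malformed IMEI blanked because its IP field is still usable.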
In one implementation, after the original data is preprocessed and the preprocessed original data is taken as the data to be analyzed, the data to be analyzed can be stored in buckets in different storage nodes of a second preset storage space, where the second preset storage space is used for storing the offline data and the data to be analyzed. Subsequently, the dependent data corresponding to the data analysis request can be acquired from the different storage nodes in the second preset storage space.
It can be understood that, because the dependent data is stored in an external database, content with a large data volume is stored in buckets, and at use time the bucket to read is selected according to the data volume. In this way, the resources required by the computing framework remain fixed, while the required size of the external database can be estimated and its capacity expanded relatively easily.
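One possible way to picture bucketed storage across storage nodes is sketched below. The disclosure does not state how a record is assigned to a bucket; hashing the record key is an assumption made here, and `NUM_BUCKETS`, `bucket_for`, and `BucketedStore` are hypothetical names. In-memory dictionaries stand in for the storage nodes of the external database.

```python
import hashlib
from collections import defaultdict

NUM_BUCKETS = 4  # hypothetical bucket count; would be sized from the data volume


def bucket_for(key: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Map a key to a bucket index with a stable hash (an assumed scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_buckets


class BucketedStore:
    """Stand-in for the second preset storage space: one dict per storage node."""

    def __init__(self, num_buckets: int = NUM_BUCKETS):
        self.num_buckets = num_buckets
        self.nodes = defaultdict(dict)

    def put(self, key: str, value) -> None:
        # Write the record to the node that owns the key's bucket.
        self.nodes[bucket_for(key, self.num_buckets)][key] = value

    def get(self, key: str):
        # Only the owning node is consulted, so per-request work stays bounded.
        return self.nodes[bucket_for(key, self.num_buckets)].get(key)


store = BucketedStore()
store.put("abc", {"platform": "A"})
found = store.get("abc")
missing = store.get("no-such-key")
```

Because the bucket index depends only on the key, a reader can locate a record without scanning every node, and raising the bucket count is one way capacity could be expanded.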
In S12, after receiving the data analysis request, the dependent data corresponding to the data analysis request is obtained, where the dependent data includes offline data and data to be analyzed.
Here, the offline data is previously acquired historical data that has been persistently stored. In the present disclosure, the offline data may be generated from real-time data. For example, today's real-time data may be stored on disk, thereby forming offline data, and the offline data may be used for generating data in the early morning.
In one implementation manner, after the dependent data corresponding to the data analysis request is acquired, the tag information of the dependent data may be acquired, where the tag information is used to indicate a usage state of the dependent data; and screening unused dependent data according to the marking information to obtain target data. Further, data analysis may be performed subsequently based on the target data.
The marking information is obtained by marking the dependent data. Marking refers to recording the use state of the dependent data in an external storage space, where the use state is either used or unused. For example, if the ID (identifier) of the dependent data is abc, marking information of abc_used indicates that the dependent data abc has been used, whereas marking information of abc_new indicates that the dependent data abc is unused.
It will be appreciated that when data analysis is performed, the dependent data is generally globally unique, that is, the used dependent data cannot be subjected to data analysis for the second time, so that unused dependent data can be screened out through the tag information, and accuracy and efficiency of data analysis are improved.
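The screening step can be sketched as follows, using the `abc_used` / `abc_new` convention from the example above. A plain dictionary stands in for the external storage space that holds the marks; the function names and the choice to treat an unmarked ID as unused are assumptions for illustration only.

```python
def is_unused(dep_id: str, marks: dict) -> bool:
    """True if the mark for dep_id is '<id>_new'.

    Assumption: an ID with no recorded mark is treated as unused.
    """
    return marks.get(dep_id, f"{dep_id}_new") == f"{dep_id}_new"


def screen_unused(dependent: list, marks: dict) -> list:
    """Keep only dependent data whose use state is unused (the target data)."""
    return [d for d in dependent if is_unused(d["id"], marks)]


def mark_used(dep_id: str, marks: dict) -> None:
    """Record that dep_id has been consumed so it is not analyzed a second time."""
    marks[dep_id] = f"{dep_id}_used"


marks = {"abc": "abc_used", "def": "def_new"}  # stand-in for external storage
dependent = [{"id": "abc"}, {"id": "def"}, {"id": "ghi"}]

target = screen_unused(dependent, marks)
mark_used("def", marks)
target_after = screen_unused(dependent, marks)
```

Here `abc` is filtered out on the first pass because it is already marked as used, and `def` is filtered out on the second pass once it has been marked.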
In one implementation, the tag information of the dependent data may be obtained from a first preset storage space, where the first preset storage space is a storage space other than the local storage space.
That is, the tag information of the dependent data is stored outside the local storage space, which reduces occupation of the local storage space and reduces the occurrence of data skew or overload.
In one implementation, the dependent data may be preloaded, specifically, offline data and data to be analyzed whose download amounts meet a preset condition are used as candidate data in advance, and the candidate data is stored in a local storage space; and after receiving the data analysis request, acquiring the dependent data corresponding to the data analysis request from the candidate data stored in the local storage space, and executing the step of acquiring the dependent data corresponding to the data analysis request from different storage nodes in the second preset storage space under the condition that the dependent data corresponding to the data analysis request is not acquired.
That is, the N pieces of dependent data with the highest download amounts, or the dependent data whose download amount exceeds a preset download amount threshold, may be preloaded on each computing node, where N may be determined according to the configuration of the computing node and the amount of data. Dependent data with a higher download amount is dependent data that is used more frequently, so preloading it avoids repeated data requests to the second preset storage space.
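A minimal sketch of preloading with a fallback lookup is given below. It assumes the top-N variant of the preset condition; the function names, the `downloads` mapping, and the use of plain dictionaries for both the local storage space and the second preset storage space are hypothetical.

```python
import heapq


def preload_top_n(remote: dict, downloads: dict, n: int) -> dict:
    """Copy the n entries with the highest download amount into a local cache."""
    top_ids = heapq.nlargest(n, downloads, key=downloads.get)
    return {i: remote[i] for i in top_ids if i in remote}


def get_dependent(dep_id: str, local: dict, remote: dict):
    """Look in the local candidates first, then fall back to remote storage.

    Returns (value, source) so the caller can see where the data came from.
    """
    if dep_id in local:
        return local[dep_id], "local"
    return remote.get(dep_id), "remote"


remote = {"a": 1, "b": 2, "c": 3}          # stand-in for the second preset storage space
downloads = {"a": 10, "b": 500, "c": 90}   # hypothetical download amounts
local = preload_top_n(remote, downloads, n=2)

hit = get_dependent("b", local, remote)
miss = get_dependent("a", local, remote)
```

With these sample values, `b` and `c` are preloaded, so a request for `b` is served locally while a request for `a` falls through to the remote store.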
In S13, the dependent data is analyzed to obtain a data analysis result.
In this method, the data analysis result can be pushed in real time to the corresponding advertising media party, which can optimize its advertising in real time based on the data and thereby bring in more new user downloads. In addition, the result can be used for financial settlement and for checking the company's advertising effect on each advertising medium, which affects cost and even DAU (Daily Active Users, the number of daily active users).
For example, as shown in fig. 2, a flowchart of another data analysis method according to an embodiment of the disclosure includes the following steps:
In S21, the raw data is acquired in real time and preprocessed, and the preprocessed raw data is used as the data to be analyzed. The original data may include some invalid data, or the data formats of the original data may differ from each other, so the original data needs to be preprocessed to obtain the data to be analyzed.
In S22, the data to be analyzed is stored in buckets in different storage nodes of a second preset storage space, where the second preset storage space is used for storing offline data and the data to be analyzed. It can be understood that, because the dependent data is stored in an external database, content with a large data volume is stored in buckets, and at use time the bucket to read is selected according to the data volume. In this way, the resources required by the computing framework remain fixed, while the required size of the external database can be estimated and its capacity expanded relatively easily.
In S23, offline data and data to be analyzed whose download amount meets the preset condition are taken as candidate data in advance, and the candidate data is stored in the local storage space. The N pieces of dependent data with the highest download amounts, or the dependent data whose download amount exceeds the preset download amount threshold, can be preloaded on each computing node, where N can be determined according to the configuration of the computing node and the amount of data. Dependent data with a higher download amount is dependent data that is used more frequently, so preloading it avoids repeated data requests to the second preset storage space.
In S24, after receiving the data analysis request, the dependent data corresponding to the data analysis request is obtained from the candidate data stored in the local storage space, where the dependent data includes offline data and data to be analyzed.
In S25, in the case where the dependent data corresponding to the data analysis request is not acquired, the dependent data corresponding to the data analysis request is acquired from a different storage node in the second preset storage space.
In S26, the tag information of the dependent data is obtained from a first preset storage space, where the first preset storage space is a storage space other than the local storage space, and the tag information is used to indicate a usage state of the dependent data. The marking information is obtained by marking the dependent data, and the marking process refers to recording the use state of the dependent data in an external storage space, wherein the use state of the dependent data can be divided into used and unused.
In S27, the unused dependent data is filtered based on the flag information to obtain target data. It can be appreciated that when data analysis is performed, the dependent data generally has global uniqueness, that is, the used dependent data cannot be subjected to data analysis for the second time, and unused dependent data can be screened out through the marking information, so that accuracy and efficiency of data analysis are improved.
In S28, the target data is analyzed to obtain a data analysis result.
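Steps S24 to S28 can be strung together in a short sketch for a single request. The disclosure does not define the analysis performed in S28; choosing the most recent unused record ("last click wins") is an assumption used only to make the example concrete. The record fields, the simplified `"used"` / `"new"` mark values, and the dictionaries standing in for the storage spaces are likewise hypothetical.

```python
from typing import Optional


def analyze(request: dict, local: dict, remote: dict, marks: dict) -> Optional[str]:
    """Run S24-S28 for one request and return the attributed platform, if any."""
    fingerprint = request["fingerprint"]
    # S24: try the preloaded candidate data in local storage first.
    dependent = local.get(fingerprint)
    # S25: fall back to the second preset storage space on a miss.
    if dependent is None:
        dependent = remote.get(fingerprint, [])
    # S26/S27: read the marks and keep only unused dependent data.
    target = [d for d in dependent if marks.get(d["id"], "new") == "new"]
    if not target:
        return None
    # S28 (assumed rule): attribute the activation to the most recent record.
    winner = max(target, key=lambda d: d["ts"])
    marks[winner["id"]] = "used"  # prevent a second analysis of the same record
    return winner["platform"]


local = {}
remote = {
    "fp1": [
        {"id": "c1", "platform": "A", "ts": 1},
        {"id": "c2", "platform": "B", "ts": 5},
    ]
}
marks = {"c2": "used"}

first = analyze({"fingerprint": "fp1"}, local, remote, marks)
second = analyze({"fingerprint": "fp1"}, local, remote, marks)
```

In this run the newer record `c2` is skipped because it is already marked as used, so the first call attributes the activation to platform `A` and marks `c1`; the second call finds no unused record and returns `None`.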
In view of the above, in the technical solution provided in the embodiments of the present disclosure, after receiving a data analysis request, the acquired dependent data includes offline data and data to be analyzed acquired in real time, which can be used for performing data analysis in time no matter how fast the data arrives, so as to shorten the execution time of the data analysis task and improve the data analysis efficiency.
In a second aspect, a data analysis device provided in an embodiment of the present disclosure will be described in detail.
As shown in fig. 3, a data analysis device provided in an embodiment of the present disclosure includes:
a processing unit 301 configured to perform real-time acquisition of original data, perform preprocessing on the original data, and take the preprocessed original data as data to be analyzed;
an obtaining unit 302, configured to obtain, after receiving a data analysis request, dependency data corresponding to the data analysis request, where the dependency data includes offline data and the data to be analyzed;
and an analysis unit 303 configured to perform analysis on the dependent data to obtain a data analysis result.
In an implementation, the obtaining unit 302 is configured to perform:
acquiring marking information of the dependent data, wherein the marking information is used for indicating the use state of the dependent data;
screening unused dependent data according to the marking information to obtain target data;
the analysis unit 303 is configured to perform:
and analyzing the target data to obtain a data analysis result.
In an implementation, the obtaining unit 302 is configured to perform:
and acquiring the marking information of the dependent data from a first preset storage space, wherein the first preset storage space is a storage space outside the local storage space.
In an implementation, the processing unit 301 is configured to perform:
storing the data to be analyzed in buckets in different storage nodes of a second preset storage space, wherein the second preset storage space is used for storing offline data and the data to be analyzed;
the obtaining unit 302 is configured to perform:
and acquiring the dependent data corresponding to the data analysis request from different storage nodes of the second preset storage space.
In an implementation, the processing unit 301 is configured to perform:
taking, in advance, offline data and data to be analyzed whose download amount meets a preset condition as candidate data, and storing the candidate data in a local storage space;
the obtaining unit 302 is configured to perform:
obtaining dependent data corresponding to the data analysis request from candidate data stored in the local storage space; and executing the step of acquiring the dependent data corresponding to the data analysis request from different storage nodes in the second preset storage space under the condition that the dependent data corresponding to the data analysis request is not acquired.
In view of the above, in the technical solution provided in the embodiments of the present disclosure, after receiving a data analysis request, the acquired dependent data includes offline data and data to be analyzed acquired in real time, which can be used for performing data analysis in time no matter how fast the data arrives, so as to shorten the execution time of the data analysis task and improve the data analysis efficiency.
In a third aspect, an electronic device provided in an embodiment of the present disclosure will be described in detail.
In an exemplary embodiment, a computer-readable storage medium is also provided, such as a memory, comprising instructions executable by a processor of an electronic device to perform the above-described method. For example, the computer readable storage medium may be ROM, random access memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, etc.
In an exemplary embodiment, a computer program product is also provided which, when run on a computer, causes the computer to carry out the method of data analysis described above.
Fig. 4 is a block diagram of another apparatus 800, shown in accordance with an exemplary embodiment. For example, apparatus 800 may be a mobile phone, computer, digital broadcast electronic device, messaging device, game console, tablet device, medical device, exercise device, personal digital assistant, or the like.
Referring to fig. 4, apparatus 800 may include one or more of the following components: a processing component 802, a memory 804, a power component 806, a multimedia component 808, an audio component 810, an input/output (I/O) interface 812, a sensor component 814, and a communication component 816.
The processing component 802 generally controls overall operation of the apparatus 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 802 may include one or more processors 820 to execute instructions to perform all or part of the steps of the methods described above. Further, the processing component 802 can include one or more modules that facilitate interactions between the processing component 802 and other components. For example, the processing component 802 can include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operations at the device 800. Examples of such data include instructions for any application or method operating on the device 800, contact data, phonebook data, messages, pictures, videos, and the like. The memory 804 may be implemented by any type or combination of volatile or nonvolatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.
The power component 806 provides power to the various components of the device 800. The power component 806 can include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device 800.
The multimedia component 808 includes a screen that provides an output interface between the device 800 and the user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from a user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensor may sense not only the boundary of a touch or slide action, but also the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front camera and/or a rear camera. The front camera and/or the rear camera may receive external multimedia data when the device 800 is in an operational mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have focal length and optical zoom capabilities.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a Microphone (MIC) configured to receive external audio signals when the device 800 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may be further stored in the memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 further includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be a keyboard, click wheel, buttons, etc. These buttons may include, but are not limited to: homepage button, volume button, start button, and lock button.
The sensor assembly 814 includes one or more sensors for providing status assessments of various aspects of the apparatus 800. For example, the sensor assembly 814 may detect an on/off state of the apparatus 800 and the relative positioning of components, such as the display and keypad of the apparatus 800. The sensor assembly 814 may also detect a change in position of the apparatus 800 or of one of its components, the presence or absence of user contact with the apparatus 800, an orientation or acceleration/deceleration of the apparatus 800, and a change in temperature of the apparatus 800. The sensor assembly 814 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor assembly 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 814 may also include an acceleration sensor, a gyroscopic sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate communication between the apparatus 800 and other devices, either in a wired or wireless manner. The apparatus 800 may access a wireless network based on a communication standard, such as WiFi, an operator network (e.g., 2G, 3G, 4G, or 5G), or a combination thereof. In one exemplary embodiment, the communication component 816 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra-Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, apparatus 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic elements for performing any of the data analysis methods described above.
In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions is also provided, such as the memory 804 including instructions executable by the processor 820 of the apparatus 800 to perform the above-described method. For example, the non-transitory computer-readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
According to a fourth aspect of embodiments of the present disclosure, there is provided a storage medium storing instructions which, when executed by a processor of an electronic device, enable the electronic device to perform any one of the data analysis methods described above.
According to a fifth aspect of embodiments of the present disclosure, there is provided a computer program product containing instructions which, when run on a computer, cause the computer to implement any of the data analysis methods described above.
In view of the above, in the technical solution provided in the embodiments of the present disclosure, the dependent data acquired after a data analysis request is received includes both offline data and data to be analyzed that is acquired in real time. Data can therefore be analyzed promptly regardless of how quickly it arrives, which shortens the execution time of the data analysis task and improves data analysis efficiency.
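The flow summarized above can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the names `Record`, `preprocess`, `DependentDataStore`, and `handle_analysis_request` are hypothetical, the preprocessing and the "analysis" (a count and a sum) are placeholders, and the marking information of claim 2 is kept in an in-memory set for simplicity, whereas claim 3 places it in a storage space outside the local storage space.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Set


@dataclass(frozen=True)
class Record:
    record_id: str
    value: float
    source: str  # "offline" or "realtime" (labels assumed for illustration)


def preprocess(raw: Dict[str, object]) -> Optional[Record]:
    """Hypothetical preprocessing: drop malformed raw entries, coerce the value."""
    try:
        return Record(str(raw["id"]), float(raw["value"]), "realtime")
    except (KeyError, TypeError, ValueError):
        return None


class DependentDataStore:
    """Holds offline data plus data to be analyzed that arrives in real time."""

    def __init__(self, offline: Iterable[Record]) -> None:
        self._records: Dict[str, Record] = {r.record_id: r for r in offline}
        # Marking information: ids of dependent data that were already used.
        self._used: Set[str] = set()

    def ingest(self, raw: Dict[str, object]) -> None:
        """Preprocess one raw item as it arrives and keep it if it is valid."""
        record = preprocess(raw)
        if record is not None:
            self._records[record.record_id] = record

    def acquire_unused(self) -> List[Record]:
        """Return the dependent data that the marking shows as not yet used."""
        return [r for r in self._records.values() if r.record_id not in self._used]

    def mark_used(self, records: Iterable[Record]) -> None:
        self._used.update(r.record_id for r in records)


def handle_analysis_request(store: DependentDataStore) -> Dict[str, float]:
    """Analyze the unused dependent data; the analysis here is a count and a sum."""
    target = store.acquire_unused()
    store.mark_used(target)
    return {"count": float(len(target)), "total": sum(r.value for r in target)}
```

Because real-time items are preprocessed on arrival, a request only has to read what is already stored, and the used/unused marking keeps a later request from analyzing the same data twice.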
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It is to be understood that the present disclosure is not limited to the precise arrangements and instrumentalities shown in the drawings, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (10)

1. A method of data analysis, the method comprising:
acquiring original data in real time, preprocessing the original data, and taking the preprocessed original data as data to be analyzed;
after receiving a data analysis request, acquiring dependent data corresponding to the data analysis request, wherein the dependent data comprises offline data and data to be analyzed;
and analyzing the dependent data to obtain a data analysis result.
2. The method of claim 1, wherein after the obtaining the dependent data corresponding to the data analysis request, the method further comprises:
acquiring marking information of the dependent data, wherein the marking information is used for indicating the use state of the dependent data;
screening unused dependent data according to the marking information to obtain target data;
the step of analyzing the dependent data to obtain a data analysis result comprises the following steps:
and analyzing the target data to obtain a data analysis result.
3. The method of claim 2, wherein the acquiring marking information of the dependent data comprises:
and acquiring the marking information of the dependent data from a first preset storage space, wherein the first preset storage space is a storage space outside the local storage space.
4. The method according to claim 1, wherein after the preprocessing the original data and taking the preprocessed original data as the data to be analyzed, the method further comprises:
storing the data to be analyzed, divided into buckets, in different storage nodes of a second preset storage space, wherein the second preset storage space is used for storing the offline data and the data to be analyzed;
the obtaining the dependent data corresponding to the data analysis request includes:
and acquiring the dependent data corresponding to the data analysis request from different storage nodes of the second preset storage space.
5. The method of claim 4, wherein after said storing the data to be analyzed in buckets in different storage nodes of a second preset storage space, the method further comprises:
taking in advance, as candidate data, offline data and data to be analyzed whose download count meets a preset condition, and storing the candidate data in a local storage space;
before the obtaining the dependent data corresponding to the data analysis request from the different storage nodes in the second preset storage space, the method further includes:
obtaining dependent data corresponding to the data analysis request from candidate data stored in the local storage space;
and executing the step of acquiring the dependent data corresponding to the data analysis request from different storage nodes in the second preset storage space under the condition that the dependent data corresponding to the data analysis request is not acquired.
6. A data analysis device, the device comprising:
the processing unit is configured to acquire original data in real time, preprocess the original data and take the preprocessed original data as data to be analyzed;
the acquisition unit is configured to acquire dependent data corresponding to the data analysis request after receiving the data analysis request, wherein the dependent data comprises offline data and data to be analyzed;
and the analysis unit is configured to perform analysis on the dependent data to obtain a data analysis result.
7. The apparatus of claim 6, wherein the acquisition unit is configured to perform:
acquiring marking information of the dependent data, wherein the marking information is used for indicating the use state of the dependent data;
screening unused dependent data according to the marking information to obtain target data;
the analysis unit is configured to perform:
and analyzing the target data to obtain a data analysis result.
8. An electronic device, comprising:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the data analysis method of any one of claims 1 to 5.
9. A computer readable storage medium, characterized in that instructions in the computer readable storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the data analysis method of any one of claims 1 to 5.
10. A computer program product comprising a computer program, characterized in that the computer program, when executed by a processor, implements the data analysis method of any of claims 1 to 5.
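The bucketed storage and local-first lookup recited in claims 4 and 5 can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the class names `BucketedStore` and `LocalFirstReader`, the use of CRC32 to choose a bucket, the in-memory dictionaries standing in for storage nodes, and the reading of the "preset condition" as a download count reaching a threshold. Claim 5 stores candidate data locally in advance; this sketch promotes an item when its count reaches the threshold, which is a simplification.

```python
import zlib
from collections import Counter
from typing import Dict, List, Optional


class BucketedStore:
    """Stand-in for the second preset storage space: each node is a dict."""

    def __init__(self, num_nodes: int) -> None:
        self.nodes: List[Dict[str, bytes]] = [{} for _ in range(num_nodes)]

    def _bucket(self, key: str) -> int:
        # crc32 is stable across runs, unlike Python's salted built-in hash().
        return zlib.crc32(key.encode("utf-8")) % len(self.nodes)

    def put(self, key: str, value: bytes) -> None:
        self.nodes[self._bucket(key)][key] = value

    def get(self, key: str) -> Optional[bytes]:
        return self.nodes[self._bucket(key)].get(key)


class LocalFirstReader:
    """Look in the local storage space first, then fall back to the nodes."""

    def __init__(self, remote: BucketedStore, hot_threshold: int) -> None:
        self.remote = remote
        self.hot_threshold = hot_threshold  # assumed form of the preset condition
        self.downloads: Counter = Counter()
        self.local: Dict[str, bytes] = {}

    def get(self, key: str) -> Optional[bytes]:
        if key in self.local:
            return self.local[key]
        # Not found locally: read from the storage node that owns the bucket.
        value = self.remote.get(key)
        if value is not None:
            self.downloads[key] += 1
            if self.downloads[key] >= self.hot_threshold:
                # Frequently downloaded data becomes locally stored candidate data.
                self.local[key] = value
        return value
```

Spreading keys across nodes avoids concentrating reads on one node, and serving frequently downloaded data from the local storage space avoids repeated remote reads.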
CN202111584635.7A 2021-12-22 2021-12-22 Data analysis method, device, equipment and storage medium Pending CN116340297A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111584635.7A CN116340297A (en) 2021-12-22 2021-12-22 Data analysis method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111584635.7A CN116340297A (en) 2021-12-22 2021-12-22 Data analysis method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116340297A true CN116340297A (en) 2023-06-27

Family

ID=86889984

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111584635.7A Pending CN116340297A (en) 2021-12-22 2021-12-22 Data analysis method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116340297A (en)

Similar Documents

Publication Publication Date Title
RU2640632C2 (en) Method and device for delivery of information
CN109447125B (en) Processing method and device of classification model, electronic equipment and storage medium
CN106547547B (en) data acquisition method and device
CN117390330A (en) Webpage access method and device
CN112131466A (en) Group display method, device, system and storage medium
CN107402767B (en) Method and device for displaying push message
EP3057006A1 (en) Method and device of filtering address
CN110968492B (en) Information processing method and device and storage medium
CN111695064B (en) Buried point loading method and device
CN109842688B (en) Content recommendation method and device, electronic equipment and storage medium
CN111221593A (en) Dynamic loading method and device
CN108012258B (en) Data traffic management method and device for virtual SIM card, terminal and server
CN112667852B (en) Video-based searching method and device, electronic equipment and storage medium
CN110213062B (en) Method and device for processing message
CN116340297A (en) Data analysis method, device, equipment and storage medium
CN108804181B (en) Control content obtaining method and device and storage medium
CN107257384B (en) Service state monitoring method and device
CN107526683B (en) Method and device for detecting functional redundancy of application program and storage medium
CN113946228A (en) Statement recommendation method and device, electronic equipment and readable storage medium
CN112333233A (en) Event information reporting method and device, electronic equipment and storage medium
CN111597106A (en) Point burying management method and device
CN113378022A (en) In-station search platform, search method and related device
CN110989987A (en) Portal webpage generation method, portal webpage generation device, client, server and storage medium
CN105892832B (en) Method and device for displaying display information
CN110119471B (en) Method and device for checking consistency of search results

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination