CN114969191A - Data analysis method, system and device based on big data and storage medium

Publication number
CN114969191A
Authority
CN
China
Legal status
Pending
Application number
CN202111401235.8A
Other languages
Chinese (zh)
Inventor
郄彬
郑伯涛
Current Assignee
Guangzhou City Construction College
Original Assignee
Guangzhou City Construction College
Application filed by Guangzhou City Construction College
Priority to CN202111401235.8A
Publication of CN114969191A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/26 Visual data mining; Browsing structured data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0641 Shopping interfaces
    • G06Q30/0643 Graphical representation of items or shoppers

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a data analysis method, system, device and storage medium based on big data. The method comprises the following steps: acquiring target data generated by an online shopping mall; storing the target data in a data lake; preprocessing the target data in the data lake; training a plurality of data analysis models on the preprocessed target data; and visualizing the data analysis models to generate a visual analysis result. By generating data analysis models from the target data of the online shopping mall and producing the corresponding visual analysis result, the method analyzes massive target data effectively and presents the analysis to users in a clear, intuitive form, which helps them adjust their marketing strategies and arrive at more scientific marketing schemes.

Description

Data analysis method, system and device based on big data and storage medium
Technical Field
The present application relates to the field of big data, and in particular, to a method, a system, an apparatus, and a storage medium for data analysis based on big data.
Background
With the rapid development of electronic commerce, online shopping has become one of the main ways people shop. Unlike the traditional pattern of browsing and then buying directly, online shopping has given rise to marketing modes such as flash sales and livestream selling, in which many people buy commodities within the same short period. A large amount of data that needs processing is therefore generated in a short time, and merchants find it difficult to analyze such massive data effectively.
Disclosure of Invention
The present application is directed to solving, at least in part, one of the technical problems in the related art. Therefore, the application provides a data analysis method, a system, a device and a storage medium based on big data.
In a first aspect, an embodiment of the present application provides a data analysis method based on big data, where the method includes: acquiring target data of an online shopping mall; storing the target data into a data lake; preprocessing the target data in the data lake; training to generate a data analysis model according to the preprocessed target data; and carrying out visualization processing on the data analysis model and generating a visualization analysis result.
Optionally, the acquiring target data of the online shopping mall includes: acquiring activity data of the online shopping mall from a preset data port; crawling network data from designated websites; and taking the activity data and the network data as the target data and storing the target data into the data lake.
Optionally, the method further comprises: storing the collected target data into a message queue; and acquiring the target data from the message queue according to a preset timing task, and storing the target data into a data lake.
Optionally, the preprocessing the target data includes: cleaning the target data; performing calculation on the cleaned target data; and normalizing the calculated target data to complete the preprocessing of the target data.
Optionally, the training and generating a data analysis model according to the target data after the preprocessing is completed includes: calculating the preprocessed target data according to a preset model data algorithm to obtain model data; and generating the data analysis model according to the model data.
Optionally, the method further comprises: evaluating the data analysis model according to a preset evaluation standard; and evaluating the model data algorithm and generating an algorithm evaluation report.
Optionally, the visualizing the data analysis model and generating a visualized analysis result includes: carrying out visualization processing on the data analysis model through a preset visualization library to obtain a visualization model; generating a visualization analysis service according to the visualization model; and generating a visual analysis result according to the visual analysis service.
In a second aspect, an embodiment of the present application provides a big data based data analysis system, where the system includes: the data acquisition module is used for acquiring target data of the online shopping mall; the data storage module is used for storing the target data into a data lake; the data processing module is used for preprocessing the target data in the data lake; the model building module is used for training and generating a data analysis model according to the preprocessed target data; and the visualization module is used for performing visualization processing on the data analysis model and generating a visualization analysis result.
In a third aspect, an embodiment of the present application provides an apparatus, including: at least one processor; at least one memory for storing at least one program; when the at least one program is executed by the at least one processor, the at least one program causes the at least one processor to implement the big data based data analysis method according to the first aspect.
In a fourth aspect, embodiments of the present application provide a computer storage medium, in which a program executable by a processor is stored, and the program executable by the processor is used for implementing the data analysis method according to the first aspect when executed by the processor.
The embodiments of the application have the following beneficial effects: target data generated by an online shopping mall is acquired and stored in a data lake; the target data in the data lake is preprocessed; a plurality of data analysis models are trained on the preprocessed target data; and the data analysis models are visualized to generate a visual analysis result. By generating data analysis models from the target data of the online shopping mall and producing the corresponding visual analysis result, the application analyzes massive target data effectively and presents the analysis to users in a clear, intuitive form, which helps them adjust their marketing strategies and arrive at more scientific marketing schemes.
Drawings
The accompanying drawings are included to provide a further understanding of the claimed subject matter and are incorporated in and constitute a part of this specification, illustrate embodiments of the subject matter and together with the description serve to explain the principles of the subject matter and not to limit the subject matter.
FIG. 1 is a schematic diagram of an online mall platform provided in an embodiment of the present application;
FIG. 2 is a flowchart illustrating the steps of a big data based data analysis method according to an embodiment of the present application;
FIG. 3 is a schematic diagram of an apparatus provided in an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
It should be noted that although functional block divisions are provided in the system drawings and logical orders are shown in the flowcharts, in some cases, the steps shown and described may be performed in different orders than the block divisions in the systems or in the flowcharts. The terms first, second and the like in the description and in the claims, and the drawings described above, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
With the rapid development of electronic commerce, online shopping has become one of the main ways people shop. Unlike the traditional pattern of browsing and then buying directly, online shopping has given rise to marketing modes such as flash sales and livestream selling, in which many people buy commodities within the same short period, so a large amount of data that needs processing is generated in a short time. Compared with data generated by traditional shopping, the data generated by these marketing activities is large in volume and highly concurrent, so merchants find it difficult to analyze the massive data promptly and effectively and to derive useful guidance from it.
Based on this, the embodiments of the present application provide a data analysis method, system, device and storage medium based on big data. The method includes: acquiring target data generated by an online shopping mall; storing the target data in a data lake; preprocessing the target data in the data lake; training a plurality of data analysis models on the preprocessed target data; and visualizing the data analysis models to generate a visual analysis result. By generating data analysis models from the target data of the online shopping mall and producing the corresponding visual analysis result, the application analyzes massive target data effectively and presents the analysis to users in a clear, intuitive form, which helps them adjust their marketing strategies and arrive at more scientific marketing schemes.
The embodiments of the present application will be further explained with reference to the drawings.
Referring to FIG. 1, FIG. 1 is a schematic diagram of an online mall platform provided in an embodiment of the present application. The platform 100 includes a mall system 110 and the big data based data analysis system 120 provided in an embodiment of the present application. As shown in FIG. 1, the mall system includes an online mall 111 and a mall background management module 112. Users take part in marketing activities such as flash sales and livestreams in the online mall and generate massive marketing activity data, which belongs to the target data in the embodiments of the present application. Merchants use the mall background management module to manage commodities and marketing activities and to customize the visual analysis service.
It can be understood that, to handle the massive, highly concurrent data traffic of marketing activities such as flash sales, the mall system is built on a distributed microservice framework, which provides high availability and high concurrency.
In addition, as shown in FIG. 1, the big data based data analysis system includes a data acquisition module 121, a data storage module 122, a data processing module 123, a model construction module 124, and a visualization module 125. The data acquisition module is used for acquiring target data of the online shopping mall; the data storage module is used for storing the target data into a data lake; the data processing module is used for preprocessing the target data in the data lake; the model construction module is used for training and generating a data analysis model according to the preprocessed target data; the visualization module is used for performing visualization processing on the data analysis model and generating a visual analysis result.
In some embodiments, as shown in FIG. 1, the data acquisition module 121 further includes a mall data acquisition unit, a crawler unit and a custom data source unit. The custom data source unit sets the data sources of the data acquisition module, such as the websites crawled by the crawler unit and the data ports read by the mall data acquisition unit. The mall data acquisition unit collects the activity data of the online mall, and the crawler unit crawls network data from the internet. When the data acquisition module is updated, the custom data source unit can change the data port read by the mall data acquisition unit so that flash-sale data from other platforms is collected as well. Data sources can thus be extended easily, which shortens model construction time and makes model training and optimization more objective and more generally applicable.
In some embodiments, as shown in FIG. 1, the data storage module 122 includes a database cluster unit and a data lake. Compared with the data lake, a database has a fixed structure and retrieval logic, is more secure, and suits workloads with modest read/write volume and less demanding read/write speed, such as a message middleware store or a system information store. The database cluster unit is therefore used to store the system data of each module or unit, including preset data, cache data and the like. Compared with the database cluster unit, the data lake reads and writes data faster and allows more flexible changes to the data schema, which suits model training and workloads that read and write large volumes of data. The data lake is therefore used to store the massive target data collected by the data acquisition module.
In some embodiments, as shown in FIG. 1, the data processing module 123 includes a data cleaning unit, a data calculating unit and a data sorting unit. The data cleaning unit is configured to clean the target data, the data calculating unit is configured to perform calculation on the cleaned target data according to a preset algorithm, and the data sorting unit is configured to normalize the target data.
In some embodiments, as shown in FIG. 1, the model construction module 124 includes an algorithm implementation unit, a model generation unit, a model training unit, a model evaluation unit and an algorithm reporting unit. The algorithm implementation unit is used for calculating the preprocessed target data according to a preset model data algorithm to obtain model data; the model generation unit is used for generating a data analysis model according to the model data; the model evaluation unit is used for evaluating the data analysis model according to a preset evaluation standard; the model training unit is used for retraining a data analysis model that does not meet the evaluation standard; and the algorithm reporting unit is used for evaluating the model data algorithm and generating an algorithm evaluation report.
In some embodiments, as shown in FIG. 1, the visualization module 125 includes a legend engine unit and a service providing unit. The legend engine unit is configured to visualize the data analysis model through a preset visualization library to obtain a visualization model, and the service providing unit is configured to generate a visual analysis service according to the visualization model and to generate a visual analysis result according to the visual analysis service. The visual analysis result is easier for merchants to understand and better matches market needs; it avoids the situation in which an overly specialized analysis result cannot be understood by the merchant and therefore cannot be used to adjust marketing activity parameters.
The data analysis method based on big data provided by the embodiment of the present application is explained below with reference to the online mall platform shown in fig. 1.
Referring to fig. 2, fig. 2 is a flowchart illustrating steps of a big data based data analysis method according to an embodiment of the present application, where the method is applied to the big data based data analysis system shown in fig. 1, and the method includes, but is not limited to, steps S200 to S240:
S200, acquiring target data of the online shopping mall;
Specifically, the merchant can manage the commodities of the online shopping mall in the mall background management module and set up marketing modes such as flash sales and livestreams, and consumers take part in these activities in the online shopping mall and purchase commodities. Because marketing activities such as flash sales and livestreams run in real time, the mall system generates massive data in a short time while users shop. The data include, but are not limited to, user data, activity data, commodity data, user click rate, user retention rate, user purchase rate, user behavior paths, commodity search counts, category search counts, and twenty-four-hour website activity parameters. In the embodiments of the present application these data are called activity data. The mall data acquisition unit can acquire the activity data of the online shopping mall from the preset data port, and the activity data is used as target data for subsequent analysis.
In some embodiments, the data acquisition module proposed in the embodiments of the present application may further collect, through the internet, data related to the marketing activities of the online shopping mall, such as comments on those activities in social networks and summary data of marketing activities from past years. These data are referred to as network data, and the crawler unit can obtain them by crawling designated websites. The network data and the activity data described above together serve as the target data in the embodiments of the present application, and data analysis is performed on the target data.
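The combination of activity data and network data into target data can be pictured with a minimal sketch. The two fetcher functions, the field names and the `source` tag are hypothetical stand-ins for the mall data acquisition unit and the crawler unit; the application does not specify any of them.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]


def collect_target_data(
    fetch_activity: Callable[[], Iterable[Record]],
    fetch_network: Callable[[], Iterable[Record]],
) -> List[Record]:
    """Merge activity data and crawled network data into one target data set."""
    target: List[Record] = []
    for source, fetch in (("activity", fetch_activity), ("network", fetch_network)):
        for record in fetch():
            # Tag each record with its origin so later steps can tell them apart.
            target.append({**record, "source": source})
    return target


def sample_activity() -> List[Record]:
    # Hypothetical record as it might arrive from a preset data port.
    return [{"user_id": 1, "event": "click", "item_id": 42}]


def sample_network() -> List[Record]:
    # Hypothetical record as a crawler might return it.
    return [{"url": "https://example.com/review/1", "text": "great flash sale"}]
```

Passing the fetchers in as arguments mirrors the idea that data sources can be swapped without changing the rest of the pipeline.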
As described above, the data acquisition module collects the activity data required by the corresponding modules on its own and, through the crawler unit, obtains more comprehensive and objective network data from the internet, so the data sources are more diverse. Given a period of time, the data acquisition module can collect enough data to build a model. When the model is later applied to flash sales, the data output by the mall system is collected again by the data acquisition module and reused, so the model can be optimized through repeated training after it is put into use.
S210, storing the target data into a data lake;
Specifically, the target data collected in step S200 is stored in the data lake of the data storage module so that other modules can call it.
In some embodiments, because the data produced by activities such as flash sales is large in volume and highly concurrent, a message queue can be set up. The message queue caches the massive target data in a database built into the framework; the data acquisition module then retrieves the target data from it at a steady rate and stores the data in the data lake. This prevents errors such as data loss and duplication in the data acquisition module under highly concurrent load.
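A minimal sketch of this buffering step is given below. It uses Python's in-process `queue.Queue` as a stand-in for the message queue and a dictionary as a stand-in for the data lake; the `record_id` key used to avoid duplicates is an assumption, not something the application specifies.

```python
import queue
from typing import Dict


def drain_to_lake(buffer: "queue.Queue[dict]", lake: Dict[str, dict], batch_size: int) -> int:
    """Move up to batch_size records from the buffer into the lake.

    Records are keyed by a hypothetical "record_id" field, so a record that
    is delivered twice overwrites itself instead of being stored twice.
    """
    moved = 0
    while moved < batch_size:
        try:
            record = buffer.get_nowait()
        except queue.Empty:
            break  # buffer is empty, nothing more to move in this round
        lake[record["record_id"]] = record
        moved += 1
    return moved
```

Producers can keep calling `buffer.put(...)` at burst rates while the consumer drains at a fixed batch size, which is the decoupling a message queue is meant to provide.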
In other embodiments, a timing task may also be set. The timing task periodically checks the remaining memory of the framework's built-in database described above. When the remaining memory falls below a danger value set by the program, the threshold of the data acquisition interface is relaxed so that data is moved to the data lake faster, and data that has already been moved is deleted from the built-in database, which releases memory and prevents data overflow.
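The check performed by the timing task might look like the following sketch. The capacity, danger value and batch sizes are hypothetical parameters, lists stand in for the built-in database and the data lake, and the function is assumed to be invoked by whatever scheduler the system uses.

```python
from typing import List


def next_batch_size(remaining: int, danger_value: int, normal_batch: int, relaxed_batch: int) -> int:
    """Relax the transfer limit when remaining capacity drops below the danger value."""
    return relaxed_batch if remaining < danger_value else normal_batch


def timed_check(
    buffer: List[dict],
    lake: List[dict],
    capacity: int,
    danger_value: int,
    normal_batch: int = 2,
    relaxed_batch: int = 10,
) -> int:
    """One run of the timing task: move a batch to the lake and free the buffer."""
    remaining = capacity - len(buffer)
    batch = next_batch_size(remaining, danger_value, normal_batch, relaxed_batch)
    moved = buffer[:batch]
    lake.extend(moved)
    # Delete the transferred records from the buffer to release space.
    del buffer[:batch]
    return len(moved)
```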
S220, preprocessing target data in the data lake;
Specifically, the target data in the data lake is preprocessed by the data processing module. The data cleaning unit cleans the target data, that is, the target data collected by the data acquisition module or prestored by the data storage module, removing redundant dirty data and invalid data. After cleaning, the data calculating unit applies the preset algorithm to the cleaned target data to compute the target data required by the subsequent analysis. The target data that needs no calculation and the calculated target data are then input into the data sorting unit, which normalizes them, completing the preprocessing of the target data. It is to be understood that normalization may unify the data length, format and the like of the target data.
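The three preprocessing stages can be sketched as follows. The order fields (`order_id`, `price`, `quantity`), the derived `amount` and the chosen output format are illustrative assumptions; the application leaves the preset algorithm and the normalized format open.

```python
from typing import Dict, List

Record = Dict[str, object]


def clean(records: List[Record]) -> List[Record]:
    """Drop records with missing or non-positive values, and duplicate orders."""
    seen = set()
    kept: List[Record] = []
    for r in records:
        if r.get("order_id") is None or r.get("price") is None or r.get("quantity") is None:
            continue  # invalid: a required field is missing
        if r["price"] <= 0 or r["quantity"] <= 0:
            continue  # invalid: values out of range
        if r["order_id"] in seen:
            continue  # redundant: same order seen before
        seen.add(r["order_id"])
        kept.append(r)
    return kept


def calculate(records: List[Record]) -> List[Record]:
    """Derive the order amount needed by later analysis."""
    return [{**r, "amount": r["price"] * r["quantity"]} for r in records]


def normalize(records: List[Record]) -> List[Record]:
    """Unify formats: order_id as a string, amount as a float with two decimals."""
    return [
        {"order_id": str(r["order_id"]), "amount": round(float(r["amount"]), 2)}
        for r in records
    ]


def preprocess(records: List[Record]) -> List[Record]:
    return normalize(calculate(clean(records)))
```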
S230, training to generate a data analysis model according to the preprocessed target data;
Specifically, in this step the target data is processed by the model construction module. First, the algorithm implementation unit calculates the preprocessed target data according to a preset model data algorithm to obtain model data. It should be noted that different model data algorithms yield different model data, which are used to generate different data analysis models. The model generation unit then generates a data analysis model from the model data; the data analysis models include, but are not limited to, a model analyzing activity price intervals against user dependency and a user behavior path analysis model. These data analysis models make it possible to analyze the massive target data of the online shopping mall effectively.
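As one possible illustration of a model data algorithm, the sketch below groups orders into price intervals and computes, for each interval, the share of buyers who ordered more than once. Using the repeat-purchase share as a proxy for user dependency, the interval width and the field names are all assumptions made for this example; the application does not define the algorithm.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Interval = Tuple[float, float]


def price_interval(price: float, width: float = 50.0) -> Interval:
    """Return the [low, high) interval of the given width that contains price."""
    low = (price // width) * width
    return (low, low + width)


def repeat_rate_by_interval(orders: List[dict], width: float = 50.0) -> Dict[Interval, float]:
    """Share of buyers in each price interval who ordered more than once there."""
    counts: Dict[Interval, Dict[object, int]] = defaultdict(lambda: defaultdict(int))
    for order in orders:
        counts[price_interval(order["price"], width)][order["user_id"]] += 1
    return {
        interval: sum(1 for n in users.values() if n > 1) / len(users)
        for interval, users in counts.items()
    }
```

The resulting mapping plays the role of model data: a table from which a chart or a simple decision rule could be built.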
In some embodiments, the data analysis model may also be evaluated by the model evaluation unit according to a preset evaluation standard, which may be a predetermined model parameter. A model that meets the evaluation standard is stored; a model that does not is returned to the model training unit for further training until a reference model that meets the evaluation standard and has a high degree of completion is obtained.
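The evaluate-and-retrain cycle can be sketched as a simple loop. The `train_step` callable, the numeric score and the round limit are hypothetical; the application says only that the standard may be a predetermined model parameter.

```python
from typing import Callable, Tuple


def train_until_accepted(
    train_step: Callable[[int], float],
    standard: float,
    max_rounds: int = 10,
) -> Tuple[int, float, bool]:
    """Retrain until the score meets the preset standard or the rounds run out.

    train_step(round_no) is assumed to run one training round and return the
    model's evaluation score. Returns (rounds used, last score, accepted).
    """
    score = float("-inf")
    for round_no in range(1, max_rounds + 1):
        score = train_step(round_no)
        if score >= standard:
            return round_no, score, True  # model meets the standard, store it
    return max_rounds, score, False  # standard never met within the limit
```

A round limit is added here so the loop always terminates; the source describes retraining until the standard is met without stating a bound.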
In some embodiments, the algorithm reporting unit may also evaluate the preset model data algorithm. Specifically, it periodically evaluates the effect of the algorithm after it has been put in place, collating indicators such as execution time, correctness and robustness, and generates an algorithm evaluation report. The report gives managers an effective basis for adjusting the algorithm in different seasons and under different service environments.
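A minimal sketch of how such a report could be assembled is shown below. Measuring correctness as the share of test cases with the expected output and robustness as the share of cases that raise an exception is an assumption for illustration; the report fields are likewise hypothetical.

```python
import time
from typing import Callable, Dict, List, Tuple


def evaluate_algorithm(
    algorithm: Callable[[object], object],
    cases: List[Tuple[object, object]],
) -> Dict[str, float]:
    """Run the algorithm on (input, expected) cases and summarize the outcome."""
    if not cases:
        raise ValueError("at least one test case is required")
    correct = 0
    failures = 0
    start = time.perf_counter()
    for arg, expected in cases:
        try:
            if algorithm(arg) == expected:
                correct += 1
        except Exception:
            failures += 1  # counted against robustness
    elapsed = time.perf_counter() - start
    return {
        "cases": len(cases),
        "correct_ratio": correct / len(cases),
        "error_ratio": failures / len(cases),
        "elapsed_seconds": elapsed,
    }
```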
S240, performing visualization processing on the data analysis model and generating a visualization analysis result;
Specifically, in the embodiment of the present application, the data analysis model is visualized by the visualization module, which includes a legend engine unit and a service providing unit. The legend engine unit visualizes the data analysis model through a visualization library such as ECharts and provides functions such as retrieval of visualized data. The service providing unit offers the visualization model to merchants in the form of interfaces and the like; a merchant customizes the visual analysis service according to its own needs, and the service providing unit then generates the visual analysis result according to that service. In the mall background management module, the merchant can view the analysis results for the data produced by marketing activities such as flash sales and livestreams, and can evaluate those activities scientifically and effectively through the clear, intuitive visual analysis results, so as to adjust the marketing scheme and run more scientific marketing activities.
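To illustrate how model output could reach a visualization library, the sketch below builds an ECharts-style option object for a bar chart on the server side and serializes it to JSON. The function names and the sample data are hypothetical; the front end is assumed to pass the decoded object to the chart's `setOption` call.

```python
import json
from typing import Dict, List


def bar_chart_option(title: str, categories: List[str], values: List[float]) -> Dict[str, object]:
    """Build a bar chart option with one category axis and one value axis."""
    if len(categories) != len(values):
        raise ValueError("categories and values must have the same length")
    return {
        "title": {"text": title},
        "tooltip": {},
        "xAxis": {"type": "category", "data": categories},
        "yAxis": {"type": "value"},
        "series": [{"type": "bar", "data": values}],
    }


def option_as_json(option: Dict[str, object]) -> str:
    """Serialize the option so an interface can return it to the merchant's page."""
    return json.dumps(option, ensure_ascii=False)
```

Returning the option as data rather than a rendered image keeps the service interface independent of how the merchant's page chooses to display it.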
Through steps S200 to S240, the big data based data analysis method provided in the embodiment of the present application acquires target data generated by an online mall, stores the target data in a data lake, preprocesses the target data in the data lake, trains a plurality of data analysis models on the preprocessed target data, and visualizes the data analysis models to generate a visual analysis result. The application builds a mall system on a distributed microservice framework; changes data sources flexibly through the mall data acquisition unit and the crawler unit; enlarges data storage capacity through the data lake, the message queue and the like, which supports reading and writing of highly concurrent target data; continuously adjusts the data analysis model through the model evaluation unit; and periodically evaluates the model data algorithm through the algorithm reporting unit, so as to meet data analysis needs under different service environments. Finally, the visual analysis result presents the analysis to users in a clear, intuitive form, which helps them adjust their marketing strategies and arrive at more scientific marketing schemes.
Referring to fig. 3, fig. 3 is a schematic diagram of an apparatus 300 provided in an embodiment of the present application, where the apparatus 300 includes at least one processor 310 and at least one memory 320 for storing at least one program; in fig. 3, a processor and a memory are taken as an example.
The processor and memory may be connected by a bus or other means, such as by a bus in FIG. 3.
The memory, which is a non-transitory computer readable storage medium, may be used to store non-transitory software programs as well as non-transitory computer executable programs. Further, the memory may include high speed random access memory, and may also include non-transitory memory, such as at least one disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory optionally includes memory located remotely from the processor, and these remote memories may be connected to the device over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
Another embodiment of the present application also provides an apparatus that may be used to perform the big data based data analysis method of any of the above embodiments, for example, to perform the method steps of FIG. 2 described above.
The above-described embodiments of the apparatus are merely illustrative, wherein the units illustrated as separate components may or may not be physically separate, i.e. may be located in one place, or may also be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
The embodiment of the application also discloses a computer storage medium in which a program executable by a processor is stored, and the program, when executed by the processor, is used for implementing the big data based data analysis method described above.
One of ordinary skill in the art will appreciate that all or some of the steps, systems, and methods disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data, as is well known to those of ordinary skill in the art. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. In addition, communication media typically embody computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism, and include any information delivery media, as is known to those skilled in the art.
While the preferred embodiments of the present invention have been described, the present invention is not limited to the above embodiments, and those skilled in the art can make various equivalent modifications or substitutions without departing from the spirit of the present invention, and such equivalent modifications or substitutions are included in the scope of the present invention defined by the claims.

Claims (10)

1. A method for big data based data analysis, the method comprising:
acquiring target data of an online shopping mall;
storing the target data into a data lake;
preprocessing the target data in the data lake;
training to generate a data analysis model according to the preprocessed target data;
and carrying out visualization processing on the data analysis model and generating a visualization analysis result.
2. The big data based data analysis method according to claim 1, wherein the obtaining target data of the online shopping mall comprises:
acquiring activity data of the online shopping mall from a preset data port;
obtaining network data from a designated website crawler;
and taking the activity data and the network data as the target data and storing the target data into the data lake.
3. The big-data based data analysis method according to any of claims 1-2, wherein the method further comprises:
storing the collected target data into a message queue;
and acquiring the target data from the message queue according to a preset timing task, and storing the target data into a data lake.
4. The big-data-based data analysis method according to claim 1, wherein the preprocessing the target data comprises:
cleaning the target data;
calculating according to the target data which is cleaned;
and carrying out normalized processing on the calculated target data to finish the preprocessing of the target data.
5. The big data based data analysis method according to claim 1, wherein training and generating a data analysis model according to the target data after preprocessing comprises:
calculating the preprocessed target data according to a preset model data algorithm to obtain model data;
and generating the data analysis model according to the model data.
6. The big-data based data analysis method of claim 5, further comprising:
evaluating the data analysis model according to a preset evaluation standard;
and evaluating the model data algorithm and generating an algorithm evaluation report.
7. The big data based data analysis method according to claim 1, wherein the visualizing the data analysis model and generating a visualized analysis result comprises:
performing visualization processing on the data analysis model through a preset visualization library to obtain a visualization model;
generating visualization analysis service according to the visualization model;
and generating a visual analysis result according to the visual analysis service.
8. A big-data based data analysis system, the system comprising:
the data acquisition module is used for acquiring target data of the online shopping mall;
the data storage module is used for storing the target data into a data lake;
the data processing module is used for preprocessing the target data in the data lake;
the model building module is used for training and generating a data analysis model according to the preprocessed target data;
and the visualization module is used for performing visualization processing on the data analysis model and generating a visualization analysis result.
9. An apparatus, comprising:
at least one processor;
at least one memory for storing at least one program;
wherein the at least one program, when executed by the at least one processor, causes the at least one processor to implement the big data based data analysis method as claimed in any one of claims 1 to 7.
10. A computer storage medium in which a processor-executable program is stored, wherein the processor-executable program, when executed by the processor, is configured to implement a data analysis method as claimed in any one of claims 1 to 7.
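
For illustration only, the three preprocessing stages recited in claim 4 (cleaning, calculating, normalizing) might be sketched as below. The claims do not fix any concrete rule for these stages, so everything specific here is an assumption: the field names `quantity`, `unit_price` and `total`, the cleaning rule, the derived order total, and the choice of min-max scaling as the normalization are all hypothetical.

```python
def clean(records):
    """Cleaning (assumed rule): drop records with a missing or non-positive quantity or price."""
    return [
        r for r in records
        if r.get("quantity") and r.get("unit_price")
        and r["quantity"] > 0 and r["unit_price"] > 0
    ]


def calculate(records):
    """Calculating (assumed metric): derive the order total from quantity and unit price."""
    return [{**r, "total": r["quantity"] * r["unit_price"]} for r in records]


def normalize(records, field="total"):
    """Normalizing (assumed method): min-max scale `field` into [0, 1]."""
    if not records:
        return []
    values = [r[field] for r in records]
    low, high = min(values), max(values)
    span = high - low
    # A constant column has no spread, so every value maps to 0.0.
    return [
        {**r, field + "_norm": (r[field] - low) / span if span else 0.0}
        for r in records
    ]


raw = [
    {"order_id": 1, "quantity": 2, "unit_price": 10.0},
    {"order_id": 2, "quantity": None, "unit_price": 5.0},  # removed by cleaning
    {"order_id": 3, "quantity": 1, "unit_price": 50.0},
    {"order_id": 4, "quantity": 3, "unit_price": 10.0},
]
prepared = normalize(calculate(clean(raw)))
```

Keeping the three stages as separate functions mirrors the order given in the claim and lets each rule be swapped (for example, a z-score in place of min-max scaling) without touching the others.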
CN202111401235.8A 2021-11-24 2021-11-24 Data analysis method, system and device based on big data and storage medium Pending CN114969191A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111401235.8A CN114969191A (en) 2021-11-24 2021-11-24 Data analysis method, system and device based on big data and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111401235.8A CN114969191A (en) 2021-11-24 2021-11-24 Data analysis method, system and device based on big data and storage medium

Publications (1)

Publication Number Publication Date
CN114969191A true CN114969191A (en) 2022-08-30

Family

ID=82975031

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111401235.8A Pending CN114969191A (en) 2021-11-24 2021-11-24 Data analysis method, system and device based on big data and storage medium

Country Status (1)

Country Link
CN (1) CN114969191A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115858633A (en) * 2023-02-27 2023-03-28 广州汇通国信科技有限公司 Time sequence data analysis method and device based on data lake
CN116108086A (en) * 2023-02-27 2023-05-12 广州汇通国信科技有限公司 Time sequence data evaluation method and device, electronic equipment and storage medium
CN116108086B (en) * 2023-02-27 2023-09-26 广州汇通国信科技有限公司 Time sequence data evaluation method and device, electronic equipment and storage medium
CN115858633B (en) * 2023-02-27 2023-10-20 广州汇通国信科技有限公司 Time sequence data analysis method and device based on data lake

Similar Documents

Publication Publication Date Title
US10846643B2 (en) Method and system for predicting task completion of a time period based on task completion rates and data trend of prior time periods in view of attributes of tasks using machine learning models
US11281969B1 (en) Artificial intelligence system combining state space models and neural networks for time series forecasting
CN110033314B (en) Advertisement data processing method and device
Tekouabou Intelligent management of bike sharing in smart cities using machine learning and Internet of Things
US9177326B2 (en) Method and system for determining overall content values for content elements in a web network and for optimizing internet traffic flow through the web network
CN114969191A (en) Data analysis method, system and device based on big data and storage medium
US20150310358A1 (en) Modeling consumer activity
CN112148973B (en) Data processing method and device for information push
US20090177612A1 (en) Method and Apparatus for Analyzing Data to Provide Decision Making Information
WO2020024718A1 (en) Method and device for predicting foreign exchange transaction volume
US10832262B2 (en) Modeling consumer activity
CN111080417A (en) Processing method for improving booking smoothness rate, model training method and system
US9324026B2 (en) Hierarchical latent variable model estimation device, hierarchical latent variable model estimation method, supply amount prediction device, supply amount prediction method, and recording medium
CN111695938A (en) Product pushing method and system
CN111861605A (en) Business object recommendation method
Utomo et al. Eliciting agents’ behaviour using scenario-based questionnaire in agent-based dairy supply chain simulation
US20190197578A1 (en) Generating significant performance insights on campaigns data
CN109711849B (en) Ether house address portrait generation method and device, electronic equipment and storage medium
CN117076770A (en) Data recommendation method and device based on graph calculation, storage value and electronic equipment
CN117312657A (en) Abnormal function positioning method and device for financial application, computer equipment and medium
CN114443671A (en) Recommendation model updating method and device, computer equipment and storage medium
CN114708110A (en) Joint training method and device for continuous guarantee behavior prediction model and electronic equipment
CN113469819A (en) Recommendation method of fund product, related device and computer storage medium
Bhandari et al. An Ensemble Regression Model to Predict a Rent of House
CN111241382A (en) Data processing method and device, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination