CN112000636A - User behavior statistical analysis method based on Flink streaming processing - Google Patents

Info

Publication number
CN112000636A
Authority
CN
China
Prior art keywords
data
user behavior
real
time
processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010898539.9A
Other languages
Chinese (zh)
Inventor
李振
鲁宾宾
曹书凯
张晨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Minsheng Science And Technology Co ltd
Original Assignee
Minsheng Science And Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
2020-08-31
Publication date
2020-11-27
Application filed by Minsheng Science And Technology Co ltd
Priority to CN202010898539.9A
Publication of CN112000636A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/214 Database migration support
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24568 Data stream processing; Continuous queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Human Resources & Organizations (AREA)
  • Development Economics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Data Mining & Analysis (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Engineering & Computer Science (AREA)
  • Economics (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Game Theory and Decision Science (AREA)
  • Educational Administration (AREA)
  • Tourism & Hospitality (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of statistical data analysis and provides a user behavior statistical analysis method based on Flink streaming processing. The method comprises: collecting user behavior data and sending it to Kafka; having Flink consume the Kafka data, remove dirty data, and classify the remainder, after which the data is fed to a distributed stream processing module and stored in a distributed file system; aggregating and organizing the data to obtain index (metric) data; constructing a data warehouse model and migrating the index data to a real-time database; and extracting index data from the real-time database to provide a real-time data service for the analysis module. By adopting the Flink real-time stream processing engine, the invention provides high-concurrency, millisecond-level, low-latency processing, which effectively addresses the timeliness of data processing in big data scenarios. Supplemented by distributed offline processing, it builds multi-dimensional user profiles so that an enterprise can understand user characteristics and product usage in real time, adjust product marketing strategies promptly, optimize interface layout, and improve user experience.

Description

User behavior statistical analysis method based on Flink streaming processing
Technical Field
The invention relates to the technical field of data statistical analysis, in particular to a user behavior statistical analysis method based on Flink streaming processing.
Background
In the digital age, the scope and boundaries of internet applications keep expanding. Many internet companies and traditional enterprises are accelerating the iteration of their own application systems, on both desktop and mobile, and are using internet technology to strengthen their services and products. For upgrading and optimizing an application system, statistical analysis of user behavior is the most important technical support. Such analysis reveals users' intentions, characteristics, and needs and, combined with big data technology, helps the enterprise design products, optimize interfaces, and market precisely, thereby improving the user experience.
A conventional user behavior analysis system generally uses Spark or Hadoop MapReduce (MR) to transform and process large-scale data. It can only provide second-level feedback and cannot serve business risk prevention and control scenarios or instant online analysis scenarios.
By adopting the Flink real-time stream processing engine, supplemented by distributed offline processing, the present application effectively solves the timeliness problem in big data scenarios.
Disclosure of Invention
The invention aims to overcome the shortcomings of the prior art by providing a user behavior statistical analysis method based on Flink streaming processing. The method offers good timeliness and builds multi-dimensional user profiles, so that an enterprise can understand user characteristics and product usage in real time.
The invention adopts the following technical solution:
A user behavior statistical analysis method based on Flink streaming processing comprises the following steps:
S1, collecting user behavior data from different application systems and uploading the user behavior data to Kafka;
S2, consuming the Kafka data with Flink, removing dirty data, and classifying the remainder to obtain a plurality of data streams, each data stream representing one type of data; feeding the different types of data into a distributed stream processing module for distributed stream processing, and storing the data in a big data distributed file system;
S3, aggregating and organizing the different types of data processed in step S2 to obtain index data; constructing a data warehouse, and migrating the index data to a real-time database;
S4, extracting the index data of user behavior from the real-time database to provide a real-time data service for the analysis module.
Further, in step S1, user behavior data is collected from different application systems as follows:
in the single-page mode of an application system, switching page fragments or triggering click events does not initiate a request to the back-end service, so user behavior data is collected through SDK (software development kit) event tracking;
in the multi-page mode of an application system, page switches and click events initiate requests to the back-end service, so log data is obtained from the back-end server log or a transit server log.
Further, step S2 specifically comprises:
S2.1, parsing, assembling, and converting the server log data in Kafka into a standard message format;
S2.2, filtering out empty data, abnormal data, and erroneous data;
S2.3, splitting the data by event type to obtain a plurality of data streams, each representing one type of data. The streams are divided according to business requirements and the type of index calculation. For example, indexes such as real-time click count and real-time page views involve only simple accumulation, so the click or browsing data streams can be fed to a real-time computing module for real-time calculation. Indexes computed over a period, such as average access duration and average online duration, require summarizing the data within that period, so the access or online data streams can be fed to an offline computing module for offline calculation.
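Steps S2.2 and S2.3 can be illustrated with a minimal plain-Python sketch. It is not Flink code and not the patent's implementation. The field names (`user_id`, `event`, `ts`), the event-type names, and the routing sets are assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical routing sets: which event types feed which module.
REALTIME_EVENTS = {"click", "view"}    # simple accumulation indexes
OFFLINE_EVENTS = {"visit", "online"}   # period-based summary indexes

def is_dirty(msg):
    """Return True for empty, abnormal, or erroneous messages (step S2.2)."""
    if not msg:
        return True
    required = ("user_id", "event", "ts")
    if any(msg.get(key) in (None, "") for key in required):
        return True
    # Abnormal data: the timestamp must be a non-negative number.
    ts = msg["ts"]
    return isinstance(ts, bool) or not isinstance(ts, (int, float)) or ts < 0

def split_streams(messages):
    """Group clean messages by event type, one list per stream (step S2.3)."""
    streams = defaultdict(list)
    for msg in messages:
        if not is_dirty(msg):
            streams[msg["event"]].append(msg)
    return dict(streams)

def route(event_type):
    """Pick the downstream module for a stream."""
    if event_type in REALTIME_EVENTS:
        return "realtime"
    if event_type in OFFLINE_EVENTS:
        return "offline"
    return "unrouted"

sample = [
    {"user_id": "u1", "event": "click", "ts": 1},
    {"user_id": "u2", "event": "view", "ts": 2},
    {},                                              # empty data
    {"user_id": "", "event": "click", "ts": 3},      # missing user
    {"user_id": "u3", "event": "online", "ts": -5},  # abnormal timestamp
    {"user_id": "u1", "event": "visit", "ts": 4},
]
streams = split_streams(sample)
for event_type, msgs in sorted(streams.items()):
    print(event_type, route(event_type), len(msgs))
```

In a real deployment the same filter and split would presumably run as operators on the stream rather than over an in-memory list; the sketch only shows the decision logic.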
Further, in step S3, the index data includes real-time indexes and offline indexes;
the real-time indexes comprise user page views, number of visitors, and number of online users; the real-time module transforms, filters, and deduplicates the Kafka DataSource (data source) data through a Flink DataStream (data stream), integrates the data into multi-dimensional data tuples, and constructs an HDFS DataSink (data sink) to output the processing results;
the offline module loads the distributed file system data into Hive, divides databases by business type, and sets date partitions; according to the complexity of the business analysis requirements, it constructs a source data layer (ODS), a data subject layer (DW), and a data mart layer (DM), and runs scheduled batches (hourly/daily);
the real-time indexes and the offline indexes are migrated to a real-time database, which provides real-time querying of user behavior analysis results.
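The three real-time indexes named above can be maintained incrementally, as the following plain-Python sketch suggests. It is a stand-in for stream state, not Flink code. The `event_id` field used for deduplication, the 300-second online timeout, and the definition of "online" as "seen within the timeout" are all assumptions, since the source does not define them.

```python
class RealtimeIndexes:
    """Incrementally updated page views, visitors, and online users."""

    def __init__(self, online_timeout=300):
        self.page_views = 0
        self.visitors = set()        # distinct user ids seen so far
        self.last_seen = {}          # user id -> latest event timestamp
        self.seen_event_ids = set()  # used to drop duplicate deliveries
        self.online_timeout = online_timeout

    def update(self, event):
        # Deduplicate on a hypothetical unique event id.
        if event["event_id"] in self.seen_event_ids:
            return
        self.seen_event_ids.add(event["event_id"])
        if event["event"] == "view":
            self.page_views += 1
        user = event["user_id"]
        self.visitors.add(user)
        self.last_seen[user] = max(event["ts"], self.last_seen.get(user, event["ts"]))

    def snapshot(self, now):
        # A user counts as online if seen within the timeout (assumed rule).
        online = sum(1 for ts in self.last_seen.values()
                     if now - ts <= self.online_timeout)
        return {"page_views": self.page_views,
                "visitors": len(self.visitors),
                "online_users": online}

events = [
    {"event_id": "e1", "user_id": "u1", "event": "view", "ts": 0},
    {"event_id": "e2", "user_id": "u2", "event": "view", "ts": 100},
    {"event_id": "e2", "user_id": "u2", "event": "view", "ts": 100},  # duplicate
    {"event_id": "e3", "user_id": "u1", "event": "click", "ts": 400},
]
indexes = RealtimeIndexes()
for event in events:
    indexes.update(event)
snap = indexes.snapshot(now=450)
print(snap)
```

Keeping every event id and user id in memory is only workable for a sketch; a production job would need bounded state, for example time-windowed or approximate counting.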
Further, the scheduled batch processing specifically comprises:
index calculation for user group behavior runs in hourly batches, which still meets the timeliness requirement of data analysis;
user behavior analysis indexes broken down by business type are calculated in T+1 batches.
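The difference between the two batch cadences comes down to which time window each run covers. The sketch below assumes the common reading of T+1 (data generated on day T is processed on day T+1) and assumes half-open windows aligned to hour and day boundaries; the function names are hypothetical.

```python
from datetime import datetime, timedelta

def hourly_window(run_time):
    """Half-open [start, end) window for the last complete hour before run_time."""
    end = run_time.replace(minute=0, second=0, microsecond=0)
    return end - timedelta(hours=1), end

def t_plus_1_window(run_time):
    """Half-open [start, end) window for the last complete day before run_time."""
    end = run_time.replace(hour=0, minute=0, second=0, microsecond=0)
    return end - timedelta(days=1), end

run = datetime(2020, 8, 31, 10, 17)
print(hourly_window(run))    # covers 09:00 to 10:00 on 2020-08-31
print(t_plus_1_window(run))  # covers all of 2020-08-30
```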
Further, in step S4, the index data of user behavior is extracted from the real-time database, and HBase is selected as the real-time database.
The invention also provides a computer program implementing the user behavior statistical analysis method based on Flink streaming processing.
An information data processing terminal implements the user behavior statistical analysis method based on Flink streaming processing.
A computer-readable storage medium comprises instructions which, when run on a computer, cause the computer to execute the above user behavior statistical analysis method based on Flink streaming processing.
The beneficial effects of the invention are as follows: by building on the Flink real-time processing engine, supplemented by quantitative analysis, the invention builds multi-dimensional user profiles so that enterprises can understand user characteristics and product usage in real time, adjust product marketing strategies promptly, optimize interface layout, and improve user experience.
Drawings
Fig. 1 is a schematic flow chart of a method for statistical analysis of user behavior based on Flink streaming processing according to an embodiment of the present invention.
Detailed Description
Specific embodiments of the present invention will be described in detail below with reference to the accompanying drawings. It should be noted that technical features or combinations of technical features described in the following embodiments should not be considered as being isolated, and they may be combined with each other to achieve better technical effects.
As shown in Fig. 1, the user behavior statistical analysis method based on Flink streaming processing according to the embodiment of the present invention uses Flink real-time processing to process and analyze user behavior data online, ensuring the timeliness and accuracy of the analysis system's data. The specific steps are as follows:
S1, collecting user behavior data from different application systems and uploading the user behavior data to Kafka;
preferably, in the user behavior data reporting stage, step S1 chooses the collection method according to the characteristics and data requirements of each system. In the single-page mode of an application system, switching page fragments and some click operations do not initiate a request to the back-end service, so in this scenario data must be collected through SDK event tracking. In the multi-page mode, page switches and click events initiate requests to the back-end service, so log data can be obtained from the back-end server log or a transit server log. The user behavior data from both scenarios is uploaded to Kafka.
S2, consuming the Kafka data with Flink, removing dirty data (abnormal or valueless data), and classifying the remainder to obtain a plurality of data streams, each representing one type of data; feeding the different types of data into a distributed stream processing module and storing the data in a big data distributed file system;
preferably, this step specifically comprises the following:
Flink consumes the Kafka data, parses the service request log data, and filters out empty data, abnormal data, erroneous data, and the like. The data in Kafka arrives in different formats: data reported by the SDK is already in the standard message format, whereas log text data must be parsed, assembled, and converted into the standard message format. Standard messages can then be processed further. The data is split into streams by Event (user behavior event) type; some streams enter the real-time computing module for real-time calculation, and the others are written to the distributed file storage system to support offline calculation by the offline computing module for user behavior data;
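The conversion of the two input formats into one standard message can be sketched as follows. The source does not specify either format, so everything concrete here is an assumption: SDK reports are taken to be JSON objects, the log line layout (`<ISO time> <METHOD> <path> uid=<id>`) is invented for illustration, and mapping GET requests to a `view` event is a placeholder rule.

```python
import json
import re
from datetime import datetime, timezone

# Hypothetical log layout: "2020-08-31T10:00:00 GET /page/home uid=u2"
LOG_PATTERN = re.compile(
    r"^(?P<time>\S+) (?P<method>[A-Z]+) (?P<path>\S+) uid=(?P<uid>\S+)$"
)

def from_sdk(raw):
    """SDK reports are assumed to already be JSON in the standard format."""
    try:
        msg = json.loads(raw)
    except json.JSONDecodeError:
        return None
    return msg if isinstance(msg, dict) else None

def from_log(line):
    """Parse one log text line and assemble a standard message."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    try:
        # Timestamps in the log are assumed to be UTC.
        when = datetime.fromisoformat(match["time"]).replace(tzinfo=timezone.utc)
    except ValueError:
        return None
    return {
        "user_id": match["uid"],
        "event": "view" if match["method"] == "GET" else "click",
        "page": match["path"],
        "ts": int(when.timestamp()),
    }

def normalize(source, payload):
    """Return a standard message, or None if the payload cannot be parsed."""
    return from_sdk(payload) if source == "sdk" else from_log(payload)

print(normalize("sdk", '{"user_id": "u1", "event": "click", "ts": 1}'))
print(normalize("log", "2020-08-31T10:00:00 GET /page/home uid=u2"))
print(normalize("log", "not a log line"))
```

Returning `None` for unparseable input lets the later filtering step treat parse failures as erroneous data.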
S3, aggregating and organizing the different types of data processed in step S2 to obtain index data; constructing a data warehouse, and migrating the index data to a real-time database;
this step involves the calculation of two types of indexes: real-time indexes and offline indexes;
the real-time module calculates indexes such as user page views, number of visitors, and number of online users. It transforms, filters, and deduplicates the Kafka DataSource data through a Flink DataStream, integrates the data into multi-dimensional data tuples, and constructs an HDFS DataSink (data sink) to output the processing results.
The offline module loads the distributed file system data into Hive, divides databases by business type, and sets date partitions. According to the complexity of the business analysis requirements, it constructs a source data layer (ODS), a data subject layer (DW), and a data mart layer (DM), and runs scheduled batches (hourly/daily).
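The layered offline calculation can be pictured with a small in-memory sketch that computes one offline index, average access duration, per date partition and business type. In practice this would presumably be Hive SQL over partitioned tables; the row fields (`dt`, `service`, `enter_ts`, `leave_ts`) and the rule for discarding invalid rows are assumptions for illustration.

```python
from collections import defaultdict

# ODS layer: raw visit rows as loaded from the distributed file system.
ods = [
    {"dt": "2020-08-30", "service": "loan", "user_id": "u1", "enter_ts": 0, "leave_ts": 30},
    {"dt": "2020-08-30", "service": "loan", "user_id": "u2", "enter_ts": 10, "leave_ts": 100},
    {"dt": "2020-08-30", "service": "pay", "user_id": "u1", "enter_ts": 50, "leave_ts": 70},
    {"dt": "2020-08-30", "service": "pay", "user_id": "u3", "enter_ts": 90, "leave_ts": 80},  # invalid
]

def build_dw(ods_rows):
    """DW layer: cleaned rows with a derived duration per visit."""
    return [
        {"dt": row["dt"], "service": row["service"],
         "duration": row["leave_ts"] - row["enter_ts"]}
        for row in ods_rows
        if row["leave_ts"] >= row["enter_ts"]
    ]

def build_dm(dw_rows):
    """DM layer: average access duration per (date partition, business type)."""
    totals = defaultdict(lambda: [0, 0])  # key -> [sum of durations, visit count]
    for row in dw_rows:
        entry = totals[(row["dt"], row["service"])]
        entry[0] += row["duration"]
        entry[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}

dm = build_dm(build_dw(ods))
print(dm)
```

Because the average needs every visit in the period, it cannot be finalized until the period closes, which is why such indexes are routed to the offline path.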
The index calculation results are migrated to a real-time database, which provides real-time querying of user behavior analysis results;
preferably, the scheduled batch processing of the offline data warehouse specifically comprises: index calculation for user group behavior runs in hourly batches, which still meets the timeliness requirement of data analysis; indexes that involve user behavior analysis broken down by business type are calculated in T+1 batches.
S4, extracting the index data of user behavior from the real-time database to provide a real-time data service for the analysis module;
the index data of user behavior is extracted from a real-time database, preferably HBase, which meets the real-time requirements of different analysis dimensions while still supporting large-volume data storage.
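One way to serve different analysis dimensions from a sorted key-value store is to encode the index name, dimension, and time bucket into the row key and read by key prefix. HBase keeps rows sorted by row key, so such prefix reads are contiguous. The sketch below is an in-memory stand-in, not HBase client code, and the `index|dimension|bucket` key layout is an assumption; the source does not describe a key design.

```python
import bisect

class IndexStore:
    """In-memory stand-in for a store whose rows are sorted by row key."""

    def __init__(self):
        self._keys = []  # row keys kept in sorted order
        self._rows = {}  # row key -> value

    def put(self, index_name, dimension, bucket, value):
        key = f"{index_name}|{dimension}|{bucket}"  # hypothetical key layout
        if key not in self._rows:
            bisect.insort(self._keys, key)
        self._rows[key] = value

    def scan(self, prefix):
        """Return (key, value) pairs whose key starts with prefix, in key order."""
        start = bisect.bisect_left(self._keys, prefix)
        result = []
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break  # sorted keys: no later key can match
            result.append((key, self._rows[key]))
        return result

store = IndexStore()
store.put("pv", "home", "2020083110", 12)
store.put("uv", "home", "2020083110", 5)
store.put("pv", "home", "2020083109", 10)
rows = store.scan("pv|home|")
print(rows)
```

Putting the time bucket last means one prefix read returns an index's history for a dimension in time order, which suits trend queries from the analysis module.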
The invention uses the Flink real-time stream processing engine to process client user behavior data in real time. It provides high-concurrency, millisecond-level, low-latency processing and effectively solves the timeliness problem of data processing in big data scenarios. Supplemented by distributed offline processing, it builds multi-dimensional user profiles so that an enterprise can understand user characteristics and product usage in real time, adjust product marketing strategies promptly, optimize interface layout, and improve user experience.
While several embodiments of the present invention have been presented herein, it will be appreciated by those skilled in the art that changes may be made to the embodiments herein without departing from the spirit of the invention. The above examples are merely illustrative and should not be taken as limiting the scope of the invention.

Claims (9)

1. A user behavior statistical analysis method based on Flink streaming processing, characterized by comprising the following steps:
S1, collecting user behavior data from different application systems and uploading the user behavior data to Kafka;
S2, consuming the Kafka data with Flink, removing dirty data, and classifying the remainder to obtain a plurality of data streams, each data stream representing one type of data; performing distributed stream processing on the different types of data, and storing the data in a big data distributed file system;
S3, aggregating and organizing the different types of data processed in step S2 to obtain index data; constructing a data warehouse, and migrating the index data to a real-time database;
S4, extracting the index data of user behavior from the real-time database to provide a real-time data service for analysis.
2. The user behavior statistical analysis method based on Flink streaming processing according to claim 1, wherein in step S1 user behavior data is collected from different application systems as follows:
in the single-page mode of an application system, switching page fragments or triggering click events does not initiate a request to the back-end service, so user behavior data is collected through SDK (software development kit) event tracking;
in the multi-page mode of an application system, page switches and click events initiate requests to the back-end service, so log data is obtained from the back-end server log or a transit server log.
3. The user behavior statistical analysis method based on Flink streaming processing according to claim 2, wherein step S2 specifically comprises:
S2.1, parsing, assembling, and converting the server log data in Kafka into a standard message format;
S2.2, filtering out empty data, abnormal data, and erroneous data;
S2.3, splitting the data by event type to obtain a plurality of data streams, each data stream representing one type of data; dividing the plurality of data streams according to business requirements and the type of index calculation; calculating in real time the index data that involves only simple accumulation, using the click or browsing data streams; and calculating offline the index data that requires statistical calculation over a period.
4. The user behavior statistical analysis method based on Flink streaming processing according to claim 2, wherein in step S3 the index data comprises real-time indexes and offline indexes; indexes are divided into real-time indexes and offline indexes according to whether their calculation requires summary statistics over a period or complex calculation of multi-dimensional indexes;
the real-time indexes comprise user page views, number of visitors, and number of online users; the real-time calculation comprises transforming, filtering, and deduplicating the Kafka DataSource data through a Flink DataStream, integrating the data into multi-dimensional data tuples, and constructing an HDFS DataSink to output the processing results;
the offline indexes comprise average access duration, average online duration, user region distribution statistics, and user level statistics; the offline calculation comprises loading distributed file system data into Hive, dividing databases by business type, and setting date partitions; constructing a source data layer (ODS), a data subject layer (DW), and a data mart layer (DM) according to the business analysis requirements, and performing scheduled batch processing;
the real-time indexes and the offline indexes are migrated to a real-time database, which provides real-time querying of user behavior analysis results.
5. The user behavior statistical analysis method based on Flink streaming processing according to claim 4, wherein the scheduled batch processing specifically comprises:
index calculation for user group behavior runs in hourly batches, which still meets the timeliness requirement of data analysis; and user behavior analysis indexes broken down by business type are calculated in T+1 batches.
6. The user behavior statistical analysis method based on Flink streaming processing according to claim 2, wherein in step S4 the index data of user behavior is extracted from the real-time database, and HBase is selected as the real-time database.
7. A computer program for implementing the Flink streaming processing-based statistical analysis method for user behavior according to any one of claims 1 to 6.
8. An information data processing terminal for implementing the Flink streaming processing-based user behavior statistical analysis method according to any one of claims 1 to 6.
9. A computer-readable storage medium comprising instructions which, when run on a computer, cause the computer to perform the method for statistical analysis of user behavior based on Flink streaming as claimed in any of claims 1 to 6.
CN202010898539.9A 2020-08-31 2020-08-31 User behavior statistical analysis method based on Flink streaming processing Pending CN112000636A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010898539.9A CN112000636A (en) 2020-08-31 2020-08-31 User behavior statistical analysis method based on Flink streaming processing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010898539.9A CN112000636A (en) 2020-08-31 2020-08-31 User behavior statistical analysis method based on Flink streaming processing

Publications (1)

Publication Number Publication Date
CN112000636A true CN112000636A (en) 2020-11-27

Family

ID=73464935

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010898539.9A Pending CN112000636A (en) 2020-08-31 2020-08-31 User behavior statistical analysis method based on Flink streaming processing

Country Status (1)

Country Link
CN (1) CN112000636A (en)

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112365355A (en) * 2020-12-10 2021-02-12 深圳迅策科技有限公司 Method, device and readable medium for calculating fund valuation and risk index in real time
CN112463868A (en) * 2020-12-04 2021-03-09 车智互联(北京)科技有限公司 Data processing method, data processing system and computing device
CN112507029A (en) * 2020-12-18 2021-03-16 上海哔哩哔哩科技有限公司 Data processing system and data real-time processing method
CN112559445A (en) * 2020-12-11 2021-03-26 上海哔哩哔哩科技有限公司 Data writing method and device
CN112633761A (en) * 2020-12-31 2021-04-09 中国平安财产保险股份有限公司 Index data query method, device, equipment and storage medium
CN112650889A (en) * 2020-12-28 2021-04-13 中国兵器装备集团自动化研究所 Method and system for constructing enterprise safety, environmental protection and security protection monitoring data warehouse
CN112699118A (en) * 2020-12-25 2021-04-23 京东方科技集团股份有限公司 Data synchronization method, corresponding device, system and storage medium
CN112711594A (en) * 2021-01-15 2021-04-27 科技谷(厦门)信息技术有限公司 Rail transit data integration method
CN112948090A (en) * 2021-03-23 2021-06-11 耿赛 Big data analysis and sorting method applied to network service processing and server
CN113157747A (en) * 2021-04-30 2021-07-23 中国银行股份有限公司 Data service method and device
CN113421131A (en) * 2021-07-21 2021-09-21 赛诺数据科技(南京)有限公司 Intelligent marketing system based on big data content
CN113468019A (en) * 2021-06-28 2021-10-01 康键信息技术(深圳)有限公司 Hbase-based index monitoring method, device, equipment and storage medium
CN113783931A (en) * 2021-08-02 2021-12-10 中企云链(北京)金融信息服务有限公司 Internet of things data aggregation and analysis method
CN113779094A (en) * 2021-11-09 2021-12-10 通号通信信息集团有限公司 Batch-flow-integration-based data processing method and device, computer equipment and medium
CN113806416A (en) * 2021-03-12 2021-12-17 京东科技控股股份有限公司 Method and device for realizing real-time data service and electronic equipment
CN114116901A (en) * 2021-11-24 2022-03-01 上海金仕达软件科技有限公司 Method, system and storage medium based on flink data summarization
CN114363435A (en) * 2021-12-31 2022-04-15 广东柯内特环境科技有限公司 Environmental data monitoring and processing method
CN114996300A (en) * 2022-05-20 2022-09-02 上海浦东发展银行股份有限公司 Real-time big data visual analysis method for bank credit card center
CN115017201A (en) * 2022-08-09 2022-09-06 中企云链(北京)金融信息服务有限公司 FLINK processing engine-based user behavior analysis method and system
CN115080156A (en) * 2022-08-23 2022-09-20 卓望数码技术(深圳)有限公司 Flow-batch-integration-based optimized calculation method and device for big data batch calculation
CN117892727A (en) * 2024-03-14 2024-04-16 中国电子科技集团公司第三十研究所 Real-time text data stream deduplication system and method

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170103133A1 (en) * 2015-10-09 2017-04-13 Alibaba Group Holding Limited Recommendation method and device
CN109271412A (en) * 2018-09-28 2019-01-25 中国-东盟信息港股份有限公司 The real-time streaming data processing method and system of smart city
CN109684352A (en) * 2018-12-29 2019-04-26 江苏满运软件科技有限公司 Data analysis system, method, storage medium and electronic equipment
CN110245158A (en) * 2019-06-10 2019-09-17 上海理想信息产业(集团)有限公司 A kind of multi-source heterogeneous generating date system and method based on Flink stream calculation technology
CN110908883A (en) * 2019-11-15 2020-03-24 江苏满运软件科技有限公司 User portrait data monitoring method, system, equipment and storage medium
CN111241078A (en) * 2020-01-07 2020-06-05 网易(杭州)网络有限公司 Data analysis system, data analysis method and device
CN111339073A (en) * 2020-02-24 2020-06-26 天津满运软件科技有限公司 Real-time data processing method and device, electronic equipment and readable storage medium

Cited By (33)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112463868A (en) * 2020-12-04 2021-03-09 车智互联(北京)科技有限公司 Data processing method, data processing system and computing device
CN112365355B (en) * 2020-12-10 2023-12-26 深圳迅策科技有限公司 Method, device and readable medium for calculating foundation valuation and risk index in real time
CN112365355A (en) * 2020-12-10 2021-02-12 深圳迅策科技有限公司 Method, device and readable medium for calculating fund valuation and risk index in real time
CN112559445A (en) * 2020-12-11 2021-03-26 上海哔哩哔哩科技有限公司 Data writing method and device
CN112559445B (en) * 2020-12-11 2022-12-27 上海哔哩哔哩科技有限公司 Data writing method and device
CN112507029A (en) * 2020-12-18 2021-03-16 上海哔哩哔哩科技有限公司 Data processing system and data real-time processing method
CN112699118A (en) * 2020-12-25 2021-04-23 京东方科技集团股份有限公司 Data synchronization method, corresponding device, system and storage medium
CN112650889A (en) * 2020-12-28 2021-04-13 中国兵器装备集团自动化研究所 Method and system for constructing enterprise safety, environmental protection and security protection monitoring data warehouse
CN112633761A (en) * 2020-12-31 2021-04-09 中国平安财产保险股份有限公司 Index data query method, device, equipment and storage medium
CN112633761B (en) * 2020-12-31 2023-09-19 中国平安财产保险股份有限公司 Index data query method, device, equipment and storage medium
CN112711594A (en) * 2021-01-15 2021-04-27 科技谷(厦门)信息技术有限公司 Rail transit data integration method
CN113806416A (en) * 2021-03-12 2021-12-17 京东科技控股股份有限公司 Method and device for realizing real-time data service and electronic equipment
CN113806416B (en) * 2021-03-12 2023-11-03 京东科技控股股份有限公司 Method and device for realizing real-time data service and electronic equipment
CN112948090B (en) * 2021-03-23 2021-10-26 国网江苏省电力有限公司信息通信分公司 Big data analysis and sorting method applied to network service processing and server
CN112948090A (en) * 2021-03-23 2021-06-11 耿赛 Big data analysis and sorting method applied to network service processing and server
CN113157747A (en) * 2021-04-30 2021-07-23 中国银行股份有限公司 Data service method and device
CN113468019A (en) * 2021-06-28 2021-10-01 康键信息技术(深圳)有限公司 Hbase-based index monitoring method, device, equipment and storage medium
CN113421131A (en) * 2021-07-21 2021-09-21 赛诺数据科技(南京)有限公司 Intelligent marketing system based on big data content
CN113421131B (en) * 2021-07-21 2023-11-28 赛诺数据科技(南京)有限公司 Intelligent marketing system based on big data content
CN113783931A (en) * 2021-08-02 2021-12-10 中企云链(北京)金融信息服务有限公司 Internet of things data aggregation and analysis method
CN113783931B (en) * 2021-08-02 2023-07-25 中企云链(北京)金融信息服务有限公司 Data aggregation and analysis method for Internet of things
WO2023082681A1 (en) * 2021-11-09 2023-05-19 通号通信信息集团有限公司 Data processing method and apparatus based on batch-stream integration, computer device, and medium
CN113779094B (en) * 2021-11-09 2022-03-22 通号通信信息集团有限公司 Batch-flow-integration-based data processing method and device, computer equipment and medium
CN113779094A (en) * 2021-11-09 2021-12-10 通号通信信息集团有限公司 Batch-flow-integration-based data processing method and device, computer equipment and medium
CN114116901A (en) * 2021-11-24 2022-03-01 上海金仕达软件科技有限公司 Method, system and storage medium based on Flink data summarization
CN114363435A (en) * 2021-12-31 2022-04-15 广东柯内特环境科技有限公司 Environmental data monitoring and processing method
CN114363435B (en) * 2021-12-31 2023-12-12 广东柯内特环境科技有限公司 Environment data monitoring and processing method
CN114996300A (en) * 2022-05-20 2022-09-02 上海浦东发展银行股份有限公司 Real-time big data visual analysis method for bank credit card center
CN115017201A (en) * 2022-08-09 2022-09-06 中企云链(北京)金融信息服务有限公司 Flink processing engine-based user behavior analysis method and system
CN115080156B (en) * 2022-08-23 2022-11-11 卓望数码技术(深圳)有限公司 Flow-batch-integration-based optimized calculation method and device for big data batch calculation
CN115080156A (en) * 2022-08-23 2022-09-20 卓望数码技术(深圳)有限公司 Flow-batch-integration-based optimized calculation method and device for big data batch calculation
CN117892727A (en) * 2024-03-14 2024-04-16 中国电子科技集团公司第三十研究所 Real-time text data stream deduplication system and method
CN117892727B (en) * 2024-03-14 2024-05-17 中国电子科技集团公司第三十研究所 Real-time text data stream deduplication system and method

Similar Documents

Publication Publication Date Title
CN112000636A (en) User behavior statistical analysis method based on Flink streaming processing
US10409650B2 (en) Efficient access scheduling for super scaled stream processing systems
CN111209352B (en) Data processing method and device, electronic equipment and storage medium
US20170242889A1 (en) Cache Based Efficient Access Scheduling for Super Scaled Stream Processing Systems
CN113360554B (en) Method and equipment for extracting, converting and loading ETL (extract transform load) data
CN111274256A (en) Resource control method, device, equipment and storage medium based on time sequence database
CN113779094B (en) Batch-flow-integration-based data processing method and device, computer equipment and medium
CN112231296B (en) Distributed log processing method, device, system, equipment and medium
CN113312376B (en) Method and terminal for real-time processing and analysis of Nginx logs
CN111221791A (en) Method for importing multi-source heterogeneous data into data lake
CN112948492A (en) Data processing system, method and device, electronic equipment and storage medium
CN113420009B (en) Electromagnetic data analysis device, system and method based on big data
CN107291770A (en) Method and device for querying mass data in a distributed system
CN108108445A (en) Intelligent data processing method and system
CN112084190A (en) Big data based acquired data real-time storage and management system and method
CN115086301B (en) Data analysis system and method for compression uploading equalization
CN111343269B (en) Data downloading method, device, computer equipment and storage medium
CN117131059A (en) Report data processing method, device, equipment and storage medium
CN106919566A (en) Query statistics method and system based on mass data
CN115599871A (en) Lake and bin integrated data processing system and method
CN115510139A (en) Data query method and device
CN110647448A (en) Mobile application operation log data real-time analysis method, server and system
CN113612832A (en) Streaming data distribution method and system
Wu et al. RIVA: A Real-Time Information Visualization and analysis platform for social media sentiment trend
CN205754379U (en) Log processing system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination