CN113761018A - Data processing method, device, equipment and storage medium - Google Patents

Data processing method, device, equipment and storage medium

Publication number
CN113761018A
Authority
CN
China
Prior art keywords
data
dimensions
real
target object
multiple dimensions
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110206002.6A
Other languages
Chinese (zh)
Inventor
马向攀
冯志恒
周海阳
高亚新
李源
尹翔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd, Beijing Wodong Tianjun Information Technology Co Ltd
Priority to CN202110206002.6A
Publication of CN113761018A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/283 Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462 Approximate or statistical queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2471 Distributed queries

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a data processing method, device, equipment and storage medium. The method comprises: acquiring real-time data streams of a target object in multiple dimensions; performing data statistics on the real-time data streams of the multiple dimensions within a preset time period to obtain statistical data of the multiple dimensions, wherein the statistical data of each dimension indicates an attribute value of the target object; and storing the statistical data of the multiple dimensions in a first database. This process realizes acquisition, processing and storage of the target object's data in multiple dimensions, provides data support for monitoring and analyzing the multi-dimensional data of the target object, and meets users' needs for real-time queries of that data.

Description

Data processing method, device, equipment and storage medium
Technical Field
The present application relates to the field of computers, and in particular, to a data processing method, apparatus, device, and storage medium.
Background
With the continuous development of Internet technology, e-commerce is used more and more widely. Online shopping has become an integral part of people's lives, and people can browse or choose various goods, such as electronic products and daily necessities, through channels such as large e-commerce platforms and applications.
Taking an e-commerce platform as an example, enterprise users that have set up stores on the platform need to obtain various types of analysis data, such as the number of views of their stores, the number of consultations, and the quality of online customer service.
However, because the e-commerce platform generates a huge amount of data of various types every day, the platform's data processing capability is insufficient to monitor and analyze the various types of statistical data in real time.
Disclosure of Invention
The embodiment of the application provides a data processing method, a data processing device, data processing equipment and a storage medium, and realizes statistics and monitoring of multi-dimensional data.
A first aspect of an embodiment of the present application provides a data processing method, including:
acquiring real-time data streams of a target object in multiple dimensions, wherein each dimension corresponds to one real-time data stream;
respectively carrying out data statistics on real-time data streams of multiple dimensions in a preset time period to obtain statistical data of the multiple dimensions, wherein the statistical data of each dimension is used for indicating an attribute value of the target object;
storing the statistical data of the plurality of dimensions to a first database.
In an embodiment of the present application, the acquiring real-time data of a target object in different dimensions includes:
and acquiring real-time data streams of the target object in multiple dimensions from different message queues.
In an embodiment of the application, if at least two of the real-time data streams with multiple dimensions are used to indicate the same attribute of the target object, the method further includes:
performing data merging on the real-time data streams of the at least two dimensions to obtain merged data streams;
correspondingly, respectively carrying out data statistics on the real-time data streams of multiple dimensions within a preset time period to obtain statistical data of multiple dimensions, including:
and performing data statistics on the merged data stream within a preset time period to obtain statistical data of the merged data stream.
In an embodiment of the application, the performing data merging on the real-time data streams of the at least two dimensions to obtain a merged data stream includes:
and performing data merging on the real-time data streams of the at least two dimensions by adopting Union operation to obtain the merged data stream.
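The merging described above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the Flink API: the two streams, the `(timestamp, store_id)` record layout and the function names are assumptions made for the example, and ordering the merged stream by timestamp is a choice of this sketch rather than something the application specifies.

```python
import heapq
from collections import Counter

# Hypothetical records: (timestamp_seconds, store_id). Both streams report
# consultations for the same attribute, e.g. from a web and an app channel.
web_consults = [(1, "store-1"), (4, "store-1"), (9, "store-2")]
app_consults = [(2, "store-1"), (3, "store-2"), (8, "store-1")]

def union_streams(*streams):
    """Merge already time-ordered streams into one time-ordered stream."""
    return heapq.merge(*streams, key=lambda record: record[0])

def count_per_store(stream, window_start, window_end):
    """Count records per store with a timestamp in [window_start, window_end)."""
    counts = Counter()
    for ts, store_id in stream:
        if window_start <= ts < window_end:
            counts[store_id] += 1
    return counts

merged = list(union_streams(web_consults, app_consults))
consult_counts = count_per_store(merged, 0, 10)
```

Because the statistics run once over the merged stream, the two channels do not need separate counting jobs for the same attribute.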
In an embodiment of the present application, before performing data statistics on real-time data streams of multiple dimensions respectively within a preset time period to obtain statistical data of the multiple dimensions, the method further includes:
acquiring a preset data dimension table from a second database, wherein a plurality of preset target objects are recorded in the preset data dimension table;
determining whether the target object is a preset target object or not according to the preset data dimension table;
and if the target object is the preset target object, executing the step of data statistics.
In an embodiment of the present application, if the target object is a preset target object, the method further includes:
and respectively carrying out data filtering on the real-time data streams of the multiple dimensions to obtain filtered data streams of the multiple dimensions, wherein the data stream of each dimension comprises effective data in the data stream.
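As a hedged sketch of the two checks above, the snippet below keeps only valid records that belong to preset target objects. The dimension table is modelled as an in-memory set instead of a read from the second database, and the validity rule (a record needs a user ID and a timestamp) is an assumption chosen only for illustration.

```python
# Hypothetical dimension table: in the described platform it would be read
# from the second database; here it is a plain in-memory set of store IDs.
PRESET_STORE_IDS = {"store-1", "store-7"}

def is_preset_target(record, dimension_table):
    """Return True if the record belongs to a preset target object."""
    return record.get("store_id") in dimension_table

def is_valid(record):
    """Assumed validity rule: a record needs a user ID and a timestamp."""
    return bool(record.get("user_id")) and record.get("ts") is not None

def preprocess(stream, dimension_table):
    """Keep only valid records of preset target objects."""
    return [r for r in stream
            if is_preset_target(r, dimension_table) and is_valid(r)]

raw_stream = [
    {"store_id": "store-1", "user_id": "u1", "ts": 10},
    {"store_id": "store-2", "user_id": "u2", "ts": 11},  # not a preset store
    {"store_id": "store-7", "user_id": "", "ts": 12},    # missing user ID
    {"store_id": "store-7", "user_id": "u3", "ts": 13},
]
filtered = preprocess(raw_stream, PRESET_STORE_IDS)
```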
In an embodiment of the present application, before performing data statistics on real-time data streams of multiple dimensions respectively within a preset time period to obtain statistical data of the multiple dimensions, the method further includes:
and respectively carrying out deduplication processing on the real-time data streams of the multiple dimensions to obtain deduplicated data streams of the multiple dimensions, wherein the data streams of each dimension do not include repeated data.
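One possible way to picture the deduplication step is shown below. The `event_id` field used as the deduplication key is a hypothetical name; the application does not say which fields identify a repeated record.

```python
def deduplicate(stream, key_fields=("event_id",)):
    """Yield each record once, keeping the first record seen for every key."""
    seen = set()
    for record in stream:
        key = tuple(record[field] for field in key_fields)
        if key not in seen:
            seen.add(key)
            yield record

order_stream = [
    {"event_id": "e1", "store_id": "store-1"},
    {"event_id": "e2", "store_id": "store-1"},
    {"event_id": "e1", "store_id": "store-1"},  # redelivered message
]
deduplicated = list(deduplicate(order_stream))
```

In a long-running job the set of seen keys would have to be bounded, for example by clearing it at the end of each preset time period.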
In an embodiment of the present application, the performing data statistics on the real-time data streams of multiple dimensions within a preset time period to obtain statistical data of multiple dimensions respectively includes:
and respectively carrying out data statistics on the real-time data streams of multiple dimensions in a preset time period by adopting a Flink architecture to obtain statistical data of multiple dimensions.
In one embodiment of the present application, the statistics of the plurality of dimensions comprise at least a first statistic and a second statistic; the storing the statistical data of the plurality of dimensions to a first database comprises:
storing the first statistical data in a first field of a first record of the first database and the second statistical data in a second field of the first record of the first database.
In one embodiment of the present application, the first database is a distributed document database; the storing the statistical data of the plurality of dimensions to a first database comprises:
storing the statistical data of the multiple dimensions to the first database according to a preset writing period; or
After the statistical data of all dimensions are acquired, the statistical data of the plurality of dimensions are stored in a first database.
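The two write strategies above can be contrasted with a small sketch. The class name `StatsWriter`, its parameters and the list used as a stand-in for the first database are all hypothetical.

```python
class StatsWriter:
    """Buffers per-dimension statistics and flushes them to a sink (a list)."""

    def __init__(self, sink, expected_dimensions=None, write_period=None):
        self.sink = sink
        self.expected = set(expected_dimensions or [])
        self.write_period = write_period  # seconds, or None to wait for all
        self.pending = {}
        self.last_flush = 0

    def add(self, dimension, value, now):
        """Buffer one statistic; flush if the chosen strategy says so."""
        self.pending[dimension] = value
        if self.write_period is not None:
            # Strategy 1: write once per preset writing period.
            due = now - self.last_flush >= self.write_period
        else:
            # Strategy 2: write once every expected dimension has arrived.
            due = self.expected.issubset(self.pending)
        if due:
            self.sink.append(dict(self.pending))
            self.pending.clear()
            self.last_flush = now
        return due

periodic_sink = []
periodic = StatsWriter(periodic_sink, write_period=60)
periodic.add("views", 20, now=10)
periodic.add("consults", 5, now=60)

complete_sink = []
complete = StatsWriter(complete_sink, expected_dimensions=["views", "consults"])
complete.add("views", 20, now=10)
complete.add("consults", 5, now=12)
```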
In one embodiment of the present application, the method further comprises:
receiving a data query request, wherein the data query request is used for requesting statistical data of at least two dimensions of the target object;
in response to the data query request, acquiring statistical data of at least two dimensions of the target object from the first database;
returning a data query response, the data query response including statistics of at least two dimensions of the target object.
A second aspect of an embodiment of the present application provides a data processing apparatus, including an acquisition module, a processing module and a storage module.
The acquisition module is used for acquiring real-time data streams of the target object in multiple dimensions, wherein each dimension corresponds to one real-time data stream;
the processing module is used for respectively carrying out data statistics on the real-time data streams of multiple dimensions in a preset time period to obtain statistical data of the multiple dimensions, wherein the statistical data of each dimension is used for indicating an attribute value of the target object;
and the storage module is used for storing the statistical data of the multiple dimensions to a first database.
In an embodiment of the present application, the acquisition module is specifically configured to:
and acquiring real-time data streams of the target object in multiple dimensions from different message queues.
In an embodiment of the application, if at least two of the real-time data streams of the multiple dimensions are used to indicate the same attribute of the target object, the processing module is further configured to:
performing data merging on the real-time data streams of the at least two dimensions to obtain merged data streams;
correspondingly, the processing module is specifically configured to perform data statistics on the merged data stream within a preset time period to obtain statistical data of the merged data stream.
In an embodiment of the present application, the processing module is specifically configured to:
and performing data merging on the real-time data streams of the at least two dimensions by adopting Union operation to obtain the merged data stream.
In an embodiment of the present application, before performing data statistics on the real-time data streams of multiple dimensions respectively within a preset time period to obtain statistical data of the multiple dimensions, the processing module is further configured for:
acquiring a preset data dimension table from a second database, wherein a plurality of preset target objects are recorded in the preset data dimension table;
determining whether the target object is a preset target object or not according to the preset data dimension table;
and if the target object is the preset target object, executing the step of data statistics.
In an embodiment of the application, if the target object is a preset target object, the processing module is further configured to:
and respectively carrying out data filtering on the real-time data streams of the multiple dimensions to obtain filtered data streams of the multiple dimensions, wherein the data stream of each dimension comprises effective data in the data stream.
In an embodiment of the present application, before performing data statistics on the real-time data streams of multiple dimensions respectively within a preset time period to obtain statistical data of the multiple dimensions, the processing module is further configured for:
and respectively carrying out deduplication processing on the real-time data streams of the multiple dimensions to obtain deduplicated data streams of the multiple dimensions, wherein the data streams of each dimension do not include repeated data.
In an embodiment of the present application, the processing module is specifically configured to:
and respectively carrying out data statistics on the real-time data streams of multiple dimensions in a preset time period by adopting a Flink architecture to obtain statistical data of multiple dimensions.
In one embodiment of the present application, the statistics of the plurality of dimensions comprise at least a first statistic and a second statistic; the storage module is specifically configured to:
storing the first statistical data in a first field of a first record of the first database and the second statistical data in a second field of the first record of the first database.
In one embodiment of the present application, the first database is a distributed document database; the storage module is specifically configured to:
storing the statistical data of the multiple dimensions to the first database according to a preset writing period; or
After the statistical data of all dimensions are acquired, the statistical data of the plurality of dimensions are stored in a first database.
In one embodiment of the present application, the apparatus further comprises a receiving module and a sending module.
The receiving module is configured to receive a data query request, where the data query request is used to request statistical data of at least two dimensions of the target object;
the acquisition module is further configured to acquire, in response to the data query request, statistical data of at least two dimensions of the target object from the first database;
and the sending module is configured to return a data query response, where the data query response includes the statistical data of at least two dimensions of the target object.
A third aspect of embodiments of the present application provides an electronic device, including:
a memory;
a processor; and
a computer program;
wherein the computer program is stored in the memory and configured to be executed by the processor to implement the method of any one of the first aspects of the present application.
A fourth aspect of embodiments of the present application provides a computer-readable storage medium having stored thereon a computer program for execution by a processor to perform the method according to any one of the first aspect of the present application.
A fifth aspect of embodiments of the present application provides a computer program product comprising a computer program that, when executed by a processor, performs the method of any one of the first aspects of the present application.
The embodiments of the present application provide a data processing method, device, equipment and storage medium. The method comprises: acquiring real-time data streams of a target object in multiple dimensions; performing data statistics on the real-time data streams of the multiple dimensions within a preset time period to obtain statistical data of the multiple dimensions, wherein the statistical data of each dimension indicates an attribute value of the target object; and storing the statistical data of the multiple dimensions in a first database. This process realizes acquisition, processing and storage of the target object's data in multiple dimensions, provides data support for monitoring and analyzing the multi-dimensional data of the target object, and meets users' needs for real-time queries of that data.
Drawings
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings in the following description show some embodiments of the present application, and those skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a schematic structural diagram of a data processing platform according to an embodiment of the present disclosure;
Fig. 2 is a first schematic flowchart of a data processing method according to an embodiment of the present application;
Fig. 3 is a second schematic flowchart of a data processing method according to an embodiment of the present application;
Fig. 4 is a first schematic structural diagram of a data processing apparatus according to an embodiment of the present application;
Fig. 5 is a second schematic structural diagram of a data processing apparatus according to an embodiment of the present application;
Fig. 6 is a hardware structure diagram of an electronic device according to an embodiment of the present application.
With the above figures, there are shown specific embodiments of the present application, which will be described in more detail below. These drawings and written description are not intended to limit the scope of the inventive concepts in any manner, but rather to illustrate the inventive concepts to those skilled in the art by reference to specific embodiments.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The terms "first," "second," "third," and the like in the description and in the claims of the present application and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are, for example, capable of operation in sequences other than those illustrated or otherwise described herein.
It will be understood that the terms "comprises" and "comprising," and any variations thereof, as used herein, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
First, words used in the technical fields related to the embodiments of the present application will be briefly described.
Kafka: a high-throughput distributed messaging system written in Scala.
Flink: a distributed processing engine for stateful computations over bounded and unbounded data streams.
Trigger: a window trigger in Flink that decides when a window computation is fired.
Redis: an open-source, in-memory data structure store that is often used as a database, cache, message broker, etc.
ES (Elasticsearch): an open-source distributed search engine with a RESTful web interface, built on Apache Lucene. It is also a distributed document database in which every field can be indexed and searched, and it can scale horizontally to hundreds of servers to store and process petabyte (PB) scale data.
At present, e-commerce platforms lack tools for monitoring and analyzing the states of various data in real time. To provide better service for enterprise users that have joined the e-commerce platform and to promote their healthy development, the embodiments of the present application provide a data processing platform that adopts an overall architecture of Kafka + Flink + Redis + ES to realize real-time acquisition, processing and storage of multi-dimensional data streams. Compared with existing data processing platforms, the data processing platform of the embodiments of the present application offers faster data processing, faster data writing and updating, and high data throughput.
The data processing platform provided by the embodiments of the present application can be used to monitor and analyze various data states in real time in a variety of application scenarios, for example: the number of views, the number of consultations and the number of online customer service agents in a store scenario; the number of online players, the ranking of popular equipment and the number of online customer service agents in a game scenario; and the monitoring data of various sensors and the number of early warnings in an Internet of Things scenario.
Exemplarily, fig. 1 is a schematic structural diagram of a data processing platform provided in an embodiment of the present application, and as shown in fig. 1, the data processing platform provided in the embodiment includes a plurality of message queues, a plurality of computing modules, a Redis database, and an ES database.
As an example, data stream 1 enters message queue 1, data stream 2 enters message queue 2, data stream 3 enters message queue 3, …, and data stream n enters message queue n. Computing module 1 obtains real-time data stream 1 and real-time data stream 2 from message queue 1 and message queue 2 and, after data analysis, writes the data analysis result and the valid data in data streams 1 and 2 into the ES database. Computing module 2 obtains real-time data stream 3 from message queue 3 and, after data analysis, writes the data analysis result and the valid data in data stream 3 into the ES database. Computing module n obtains real-time data stream n from message queue n and, after data analysis, writes the data analysis result and the valid data in data stream n into the ES database. A user initiates a query request to the data processing platform through a terminal device and obtains the analysis results and original data of the various types of data from the ES database of the data processing platform.
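The data flow of fig. 1 can be pictured with a small, single-process sketch. Standard-library queues stand in for the message queues and a dictionary stands in for the shared ES database; the dimension names and the `compute_module` function are assumptions for illustration, not components named in the application.

```python
from queue import Queue

# One queue per dimension, standing in for message queues 1 to 3.
message_queues = {"views": Queue(), "consults": Queue(), "orders": Queue()}
shared_store = {}  # stands in for the ES database shared by all modules

def compute_module(dimensions, queues, store):
    """Drain the given queues and add one count per record to the store."""
    for dimension in dimensions:
        q = queues[dimension]
        while not q.empty():
            record = q.get()
            stats = store.setdefault(record["store_id"], {})
            stats[dimension] = stats.get(dimension, 0) + 1

# Each incoming record enters the queue of its own dimension.
incoming = [("views", "store-1"), ("views", "store-1"),
            ("consults", "store-1"), ("orders", "store-1")]
for dimension, store_id in incoming:
    message_queues[dimension].put({"store_id": store_id})

compute_module(["views", "consults"], message_queues, shared_store)  # module 1
compute_module(["orders"], message_queues, shared_store)             # module 2
```

Both modules write into the same record of the shared store, so a later query for the store can read all dimensions from one place.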
As an example, the Redis database stores preconfigured information for data analysis, and the computing modules 1 to n may perform data preprocessing on the acquired real-time data stream, such as data filtering, by reading the preconfigured information in the Redis database, so as to improve the computing efficiency of the computing modules.
With the data processing platform provided by the embodiments of the present application, joint processing of multiple parallel data streams is realized on the real-time computing engine Flink, which offers high throughput, low latency, high performance and an easy-to-use application programming interface (API). A user can monitor the state of various data in real time through the data processing platform and, by setting rules, can trigger message notifications, learn of abnormal data in time and adjust the corresponding execution strategies.
The technical solution of the present application will be described in detail below with specific examples. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments.
Fig. 2 is a schematic flowchart of a data processing method according to an embodiment of the present disclosure, where the data processing method according to the present disclosure may be applied to the data processing platform shown in fig. 1 or other devices or apparatuses having a data processing function. As shown in fig. 2, the data processing method provided in this embodiment includes the following steps:
Step 101, acquiring real-time data streams of a target object in multiple dimensions, wherein each dimension corresponds to one real-time data stream.
In one embodiment of the present application, real-time data streams of a target object in multiple dimensions can be obtained from different message queues.
A message queue is an important component of a distributed system. It can be used to address problems such as decoupling and asynchronous messaging, and helps realize an architecture with high performance, high availability, scalability and eventual consistency. Commonly used message queues include ActiveMQ, RabbitMQ, ZeroMQ, Kafka, MetaMQ, RocketMQ, etc.; the embodiments of the present application place no limitation on this.
In this embodiment, the target objects monitored by the data processing platform differ between application scenarios. For example, in a store scenario, the data processing platform may be used to monitor trends in various data of a target store, such as its number of views, order volume, consultation volume and number of online customer service agents within a preset time period. In a game scenario, the data processing platform may be used to monitor trends in various data of a game lobby, such as the number of players in the lobby, the number of players in a game, and the number of times each type of equipment is used.
For ease of understanding, the following embodiments are all described using the store scenario as an example. In the store scenario, the target object is a target store, and a store can be uniquely identified by its store identifier (ID). The data processing platform can acquire real-time data streams of the target store in multiple dimensions through multiple APIs. A dimension here is understood as a dimension of data collection.
Illustratively, when user A enters the target store through a search on the platform, the platform generates a piece of data indicating that user A browsed the store home page or a product display page of the store at a certain time. When user B orders an item from the target store through the platform, the platform generates a piece of data indicating that user B placed an order at a certain time. When user C initiates an online consultation with the customer service of the target store through the platform, the platform generates a piece of data indicating that user C made a customer service consultation at a certain time.
As can be seen from the above examples, for a target store, the platform may record data of multiple dimensions in real time, and with reference to fig. 1, the data of different dimensions will enter different message queues to wait for data processing, for example, the message queue 1 records browsing data, the message queue 2 records online consultation data, and the message queue 3 records order data.
It should be noted that the real-time data stream of the target object in different dimensions is only used as an example, and besides the above example, the data processing method provided by the present embodiment may also be applied to other target objects as long as the target object has data in multiple dimensions.
Step 102, performing data statistics on the real-time data streams of multiple dimensions within a preset time period to obtain statistical data of multiple dimensions.
The statistical data of each dimension is used to indicate an attribute value of the target object.
In this embodiment, the statistical data of each dimension is obtained by processing the real-time data stream of that dimension and indicates the data volume of the dimension within the preset time period.
For example, suppose that message queue 1 corresponds to computing module 1 of the data processing platform and the preset time period is 5 minutes. Browsing data of the target store, for example 40 pieces of browsing data, is recorded in message queue 1. Computing module 1 acquires from message queue 1 the browsing data within the preset 5 minutes, for example the first 25 of the 40 pieces, and determines through data analysis that it contains browsing records of 20 different users, 3 of whom browsed the target store multiple times within the 5 minutes. The final statistical result is therefore that the number of views of the target store within the 5 minutes is 20.
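The browsing example above can be reproduced with a short sketch. The record layout `(timestamp_seconds, user_id)` and the concrete timestamps are made up so that the numbers match the example: 40 records in total, 25 of them inside the first 5-minute window, from 20 distinct users, 3 of whom appear more than once.

```python
WINDOW_SECONDS = 5 * 60  # the preset time period of the example

def unique_visitors(records, window_start, window_seconds=WINDOW_SECONDS):
    """Count distinct users with a record in [window_start, window_start + window_seconds)."""
    window_end = window_start + window_seconds
    users = {user_id for ts, user_id in records
             if window_start <= ts < window_end}
    return len(users)

# 20 distinct users inside the window (timestamps 0 to 190 seconds).
first_visits = [(i * 10, f"user-{i}") for i in range(20)]
# 5 repeat visits inside the window, made by 3 of those users.
repeat_visits = [(200, "user-0"), (210, "user-0"), (220, "user-1"),
                 (230, "user-1"), (240, "user-2")]
# 15 records after the window has closed (timestamps 300 to 440 seconds).
later_visits = [(300 + i * 10, f"user-{i}") for i in range(15)]

browsing_records = first_visits + repeat_visits + later_visits
views = unique_visitors(browsing_records, window_start=0)
```

Counting distinct users with a set is what turns the 25 records of the window into a result of 20.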
For example, assuming that the message queue 2 corresponds to the computing module 2 of the data processing platform, the preset time period is also 5 minutes. The consulting data of the target store, for example, 20 pieces of consulting data, are recorded in the message queue 2, the computing module 2 of the data processing platform obtains the consulting data within 5 minutes preset from the message queue 2, for example, 20 pieces of consulting data, and it is determined that the consulting data corresponds to 5 users through data analysis. Therefore, the final statistical result is that the consulting amount of the target shop is 5 in 5 minutes.
For example, suppose that message queue 3 corresponds to computing module 3 of the data processing platform, and the preset time period is 30 minutes. Message queue 3 records the online and offline status data of the customer service of the target store, for example, 5 pieces of status data. Computing module 3 acquires from message queue 3 the online and offline status data of the customer service within the preset 30 minutes, and determines the number of online customer service agents of the store through data analysis. The final statistical result is therefore the number of customer service agents of the target store that are online within the 30 minutes.
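The per-window counts in the examples above can be illustrated with a minimal Python sketch. It is not the Flink job of the embodiment; the `Event` type, the `count_distinct_users` function and the sample timestamps are assumptions introduced only for illustration. The sample data reproduces the browsing example: 40 records in the queue, 25 of them inside the 5-minute window, generated by 20 distinct users.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One browsing record (hypothetical shape, not defined by the source)."""
    user_id: str
    timestamp: float  # seconds relative to an arbitrary reference point


def count_distinct_users(events, window_start, window_seconds):
    """Count the distinct users whose events fall in [start, start + length)."""
    window_end = window_start + window_seconds
    users = {e.user_id for e in events if window_start <= e.timestamp < window_end}
    return len(users)


# 20 users browse once each inside the window (timestamps 0 .. 190 s).
events = [Event(f"u{i}", 10.0 * i) for i in range(20)]
# 3 of those users browse again inside the window (5 extra records, 25 in total).
events += [
    Event("u0", 200.0), Event("u0", 210.0),
    Event("u1", 220.0), Event("u1", 230.0),
    Event("u2", 240.0),
]
# 15 further records arrive after the 5-minute window has closed.
events += [Event(f"late{i}", 300.0 + i) for i in range(15)]

browse_count = count_distinct_users(events, window_start=0.0, window_seconds=300.0)
```

Counting distinct users rather than raw records is what turns the 25 records in the window into a browsing amount of 20.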
In an embodiment of the application, a Flink framework is adopted to perform data statistics on the real-time data streams of multiple dimensions within a preset time period, so as to obtain statistical data of multiple dimensions. The computing module in the Flink framework can merge the real-time data streams of at least two of the multiple dimensions, which reduces the statistics that the computing module performs on redundant data and improves data processing efficiency.
As an example, before data statistics is performed on the real-time data streams of multiple dimensions respectively within the preset time period, the real-time data streams of multiple dimensions may be preprocessed, including merging and filtering of the data streams. The specific process is described in the following embodiments and is not elaborated here.
Step 103, storing the statistical data of the multiple dimensions into a first database.
Optionally, the first database of this embodiment may be a distributed document database, for example, an ES database. The multiple computing modules of the data processing platform share one ES database. Each field in the ES database can be indexed, so the data of each field can be searched. By using the ES capability of updating or inserting (upsert) by field, some of the fields of a record in the ES database can be updated, which reduces the complexity of data storage.
In one embodiment of the present application, the statistical data of the plurality of dimensions includes at least a first statistical data and a second statistical data, and storing the statistical data of the plurality of dimensions in a first database includes: the first statistical data is stored in a first field of a first record of the first database and the second statistical data is stored in a second field of the first record of the first database.
Illustratively, in combination with the example in step 102, computing module 1, which corresponds to message queue 1, counts the browsing amount of the target store within the preset time period, and computing module 2, which corresponds to message queue 2, counts the consulting amount of the target store within the preset time period, so that statistical data of two dimensions is obtained. The statistical data of the two dimensions is stored in different fields of the same record in the first database. In this way, multiple Flink computing modules update the fields of the same record, so the data is loosely coupled and easy to maintain. The same record can be understood as one row of a data table, with the statistical data of different dimensions recorded in different columns of that row.
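The idea of several computing modules updating different fields of one record can be sketched in Python as follows. An in-memory dictionary stands in for the first database; the `upsert_fields` function, the record key format and the field names are hypothetical and are not the API of any real database client.

```python
def upsert_fields(table, record_id, fields):
    """Insert the record if it is missing, otherwise update only the given fields."""
    record = table.setdefault(record_id, {})
    record.update(fields)
    return record


table = {}  # stand-in for the first database
record_id = "store-1|period-0"  # hypothetical key: one record per store and period

# Computing module 1 writes the browsing amount.
upsert_fields(table, record_id, {"browse_count": 20})
# Computing module 2 later writes the consulting amount into the same record.
upsert_fields(table, record_id, {"consult_count": 5})
```

Because each module touches only its own field, neither write overwrites the other, and the record ends up holding both statistics.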
It should be understood that the data processing platform continuously writes the statistical data of the target object in multiple dimensions into the first database according to a preset time period, one preset time period corresponds to one record in the database, and the data processing platform can also perform data statistics of different time dimensions, such as various types of statistical data of the target object every day, every week, and every month, according to the multiple statistical data records in the database.
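As a rough illustration of deriving coarser time dimensions from the per-period records, the sketch below sums records by calendar day. The record layout (`period_start`, `stats`) and the function name `roll_up_daily` are assumptions. Plain summation is also an assumption: it suits additive statistics, while a count of distinct users summed over periods may count the same user more than once.

```python
from collections import defaultdict
from datetime import datetime


def roll_up_daily(records):
    """Sum every statistic of the per-period records by calendar day."""
    daily = defaultdict(lambda: defaultdict(int))
    for rec in records:
        day = datetime.fromisoformat(rec["period_start"]).date().isoformat()
        for field, value in rec["stats"].items():
            daily[day][field] += value
    return {day: dict(fields) for day, fields in daily.items()}


records = [
    {"period_start": "2021-02-24T10:00:00", "stats": {"browse_count": 20, "consult_count": 5}},
    {"period_start": "2021-02-24T10:05:00", "stats": {"browse_count": 12, "consult_count": 1}},
    {"period_start": "2021-02-25T09:00:00", "stats": {"browse_count": 7}},
]
daily_totals = roll_up_daily(records)
```

Weekly or monthly figures could be produced the same way by changing how the grouping key is derived from `period_start`.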
In an embodiment of the present application, the statistical data of multiple dimensions may be stored in the first database according to a preset write cycle.
The preset writing period is larger than or equal to the calculation period of the calculation module for data statistics. For example, if the preset writing period is 10 minutes and the preset calculation period is 5 minutes (corresponding to the preset time period of the embodiment), two records are written at a time, and each record includes statistical data of multiple dimensions.
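The relation between the writing period and the calculation period can be checked with a short calculation. The function name `records_per_write` is hypothetical, and the sketch assumes that the writing period is a whole multiple of the calculation period, which the text does not state explicitly.

```python
def records_per_write(write_period_min, calc_period_min):
    """Number of per-period records accumulated between two writes."""
    if calc_period_min <= 0:
        raise ValueError("calculation period must be positive")
    if write_period_min < calc_period_min:
        raise ValueError("writing period must not be shorter than the calculation period")
    if write_period_min % calc_period_min != 0:
        raise ValueError("this sketch assumes a whole multiple of the calculation period")
    return write_period_min // calc_period_min


batch_size = records_per_write(write_period_min=10, calc_period_min=5)
```

With a 10-minute writing period and a 5-minute calculation period the result is 2, which matches the two records written at a time in the example.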
For a store with low data traffic, a large amount of useless statistical data could otherwise be stored in the first database, for example, records in which every statistic is 0. To avoid this, a reasonable writing period can be set according to the type of the store (popular store, ordinary store, small store, etc.), which improves the validity of the data stored in the first database.
In an embodiment of the application, after the statistical data of all dimensions is acquired, the statistical data of multiple dimensions may be stored in the first database. In the embodiment, after all the calculation modules of the data processing platform obtain the statistical data, the statistical data of multiple dimensions are written into the first database together to form a record in the first database.
In an embodiment of the application, after the statistical data of a dimension is acquired, the statistical data of the dimension may be stored in the first database. In the embodiment, after one calculation module of the data processing platform obtains the statistical data, the statistical data of the dimension is directly written into the first database without waiting for other calculation modules, that is, the calculation modules can be independent from each other.
The data processing method provided by the embodiment of the application comprises the steps of firstly obtaining real-time data streams of a target object in multiple dimensions, secondly carrying out data statistics on the real-time data streams of the multiple dimensions within a preset time period to obtain statistical data of the multiple dimensions, wherein the statistical data of each dimension is used for indicating an attribute value of the target object, and finally storing the statistical data of the multiple dimensions into a first database. The data acquisition, processing and storage of the target object in multiple dimensions are realized through the processing process, data support is provided for monitoring and analysis of multi-dimensional data of the target object, and meanwhile the real-time query requirement of a user on the multi-dimensional data of the target object can be met.
On the basis of the above embodiments, in some embodiments, before performing data statistics on the real-time data streams of multiple dimensions respectively within a preset time period, data preprocessing may also be performed on the real-time data streams of multiple dimensions. The preprocessing process of the data comprises the following several embodiments:
In a possible embodiment, if the target object is a preset target object, the data processing method further includes: filtering the real-time data streams of multiple dimensions respectively to obtain filtered data streams of multiple dimensions, wherein the filtered data stream of each dimension includes the valid data in that data stream.
Specifically, whether the target object is a preset target object may be determined as follows:
acquiring a preset data dimension table from a second database, wherein a plurality of preset target objects are recorded in the preset data dimension table; determining whether the target object is a preset target object or not according to a preset data dimension table; if the target object is a preset target object, the step of data statistics of the above embodiment is executed.
The preset target object may be a target object that has subscribed to the data monitoring function of the data processing platform. Taking a store as an example, if the store has not subscribed to the data monitoring function on the data processing platform, the data processing platform directly discards the real-time data stream of the store, or does not collect it, and the store cannot obtain real-time monitoring data through the data processing platform. If the store has subscribed to the data monitoring function on the data processing platform, the data processing platform records the store identifier (ID) in a store whitelist (corresponding to the preset data dimension table) in the second database. By querying the store whitelist in the second database, the data processing platform determines that the store has subscribed to the data monitoring function and then executes the subsequent data statistics step. It should be noted that a corresponding preset data dimension table may also be set according to the actual application scenario, which is not limited in this embodiment of the present application.
The above embodiments are used to remove invalid data in a real-time data stream of multiple dimensions. The invalid data includes error data (for example, time information corresponding to the generated data is incorrect), data with a field being empty, or other invalid data formulated according to different business rules or application scenarios, which is not limited in this embodiment of the present application.
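The whitelist check and the removal of invalid data described above might look like the following Python sketch. The field names, the `REQUIRED_FIELDS` tuple and the rule that a timestamp must be a non-negative number are assumptions chosen for illustration; the actual validity rules depend on the business scenario.

```python
REQUIRED_FIELDS = ("store_id", "user_id", "timestamp")  # hypothetical record schema


def is_valid(record):
    """Reject records with an empty required field or an implausible timestamp."""
    for name in REQUIRED_FIELDS:
        if record.get(name) in (None, ""):
            return False
    ts = record["timestamp"]
    return isinstance(ts, (int, float)) and ts >= 0


def filter_stream(records, subscribed_store_ids):
    """Yield only valid records that belong to stores on the whitelist."""
    for record in records:
        if record.get("store_id") not in subscribed_store_ids:
            continue  # store has not subscribed: discard its data
        if is_valid(record):
            yield record


whitelist = {"s1"}  # stand-in for the store whitelist in the second database
stream = [
    {"store_id": "s1", "user_id": "a", "timestamp": 10},
    {"store_id": "s2", "user_id": "b", "timestamp": 11},  # not subscribed
    {"store_id": "s1", "user_id": "", "timestamp": 12},   # empty field
    {"store_id": "s1", "user_id": "c", "timestamp": -5},  # incorrect time
    {"store_id": "s1", "user_id": "d", "timestamp": 20},
]
filtered = list(filter_stream(stream, whitelist))
```

Only the first and last records survive, so the later statistics step never sees data from unsubscribed stores or malformed records.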
In a possible embodiment, if at least two dimensions of the real-time data streams in the plurality of dimensions are used to indicate the same attribute of the target object, the data processing method further includes: and carrying out data merging on the real-time data streams of at least two dimensions to obtain merged data streams.
Correspondingly, performing data statistics on the real-time data streams of multiple dimensions respectively within the preset time period to obtain statistical data of multiple dimensions includes: performing data statistics on the merged data stream within the preset time period to obtain statistical data of the merged data stream.
For example, a store is taken as an example, it is assumed that message queue 1 records message data of a target store, corresponding to data stream 1, and message queue 2 records online consulting data of the target store, corresponding to data stream 2. The message leaving data is data in a message board of the store, the online consultation data is data in a service dialog box of the store, and the two data record consultation or feedback of the user to the target store, so that data streams of the two dimensions can be combined.
Specifically, if user A both leaves a message on the message board of the store and initiates an online consultation with the customer service within the preset time period, the same type of data of user A in data stream 1 and data stream 2 may be merged into one piece of data, that is, only one of the message data and the online consulting data of the same user within the preset time period is retained. After data stream 1 and data stream 2 are merged, data statistics is performed on the message data or the online consulting data in the merged data stream to obtain the consulting amount of the target store within the preset time period.
As an example, Union operations may be used to merge real-time data streams of at least two dimensions to obtain a merged data stream. The Union operation is to pull the rows of the associated data in the two data tables together. For example, two rows of data are obtained from table a, two rows of data are obtained from table B, and finally 4 rows of data are formed, where table a may correspond to the message data table, and table B may correspond to the online consulting data table.
This embodiment merges the real-time data streams of at least two dimensions and performs data statistics only after the merge. It avoids performing data statistics multiple times on data streams of the same type, thereby improving the accuracy and efficiency of the data statistics.
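The merge described in this embodiment can be sketched in plain Python as follows. This is not the Flink Union operator itself; `union_streams`, `consult_count` and the sample records are hypothetical. The sample mirrors the table example: two rows from the message data and two rows from the online consulting data form four rows, and a user who appears in both is counted once.

```python
def union_streams(*streams):
    """Concatenate the rows of several streams without changing them."""
    merged = []
    for stream in streams:
        merged.extend(stream)
    return merged


def consult_count(merged):
    """Count each user once, whichever stream the record came from."""
    return len({record["user_id"] for record in merged})


message_stream = [  # data stream 1: message board records
    {"user_id": "A", "source": "message"},
    {"user_id": "B", "source": "message"},
]
consult_stream = [  # data stream 2: online consulting records
    {"user_id": "A", "source": "consult"},
    {"user_id": "C", "source": "consult"},
]
merged = union_streams(message_stream, consult_stream)
count = consult_count(merged)
```

Here user A appears in both streams, so the four merged rows yield a consulting amount of 3 rather than 4.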
In a possible embodiment, before performing data statistics on real-time data streams of multiple dimensions respectively within a preset time period to obtain statistical data of the multiple dimensions, the data processing method further includes: and respectively carrying out deduplication processing on the real-time data streams of the multiple dimensions to obtain deduplicated data streams of the multiple dimensions, wherein the data streams of each dimension do not include repeated data.
Specifically, a custom Trigger can be used to retain only the first piece of data in the session data, thereby removing the repeated data. For example, if user A browses the store homepage multiple times within the preset time period, only the data of user A browsing the store homepage for the first time within the preset time period is retained, and one browse is recorded. For another example, if user A initiates a query to the customer service multiple times within the preset time period, only the data of user A initiating the query for the first time within the preset time period is retained, and one consultation is recorded.
The embodiment is used for carrying out deduplication processing on the data stream of each dimension, avoids carrying out data statistics on repeated data in the same data stream for multiple times, and improves accuracy and efficiency of the data statistics.
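The keep-the-first-record rule can be illustrated with the Python sketch below. It is a stand-in for the effect of the custom Trigger, not Flink code; the function name `keep_first_per_user` and the record fields are assumptions, and the input is assumed to contain only the records of one preset time period.

```python
def keep_first_per_user(records):
    """Keep only the earliest record of each user within one time period."""
    seen = set()
    kept = []
    for record in sorted(records, key=lambda r: r["timestamp"]):
        if record["user_id"] not in seen:
            seen.add(record["user_id"])
            kept.append(record)
    return kept


browsing = [
    {"user_id": "A", "timestamp": 5},
    {"user_id": "B", "timestamp": 3},
    {"user_id": "A", "timestamp": 1},  # first browse by user A
]
deduplicated = keep_first_per_user(browsing)
```

After deduplication each user contributes exactly one record, so counting the remaining records gives the browsing amount directly.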
The embodiments described above mainly relate to the processes of data acquisition, processing and storage by the data processing platform, in which various types of statistical data of a target object are stored in a first database. The first database may be an ES database, so that a subsequent multidimensional data query can return the corresponding query result based on the query field (which may be one field or multiple fields). The following describes the multidimensional data query process of the data processing platform in detail with reference to fig. 3.
Exemplarily, fig. 3 is a schematic flowchart of a second data processing method provided in the embodiment of the present application. As shown in fig. 3, the data processing method provided in this embodiment includes the following steps:
Step 201, receiving a data query request, where the data query request is used to request statistical data of at least two dimensions of a target object.
In this embodiment, a user accesses the data processing platform through a terminal device. As shown in fig. 1, the user may specify a combined query of multidimensional statistical data, for example, a query of the consultation amount, order amount, browsing amount, and the like of the target store within a preset time period.
Step 202, in response to the data query request, obtaining statistical data of at least two dimensions of the target object from the first database.
According to the embodiment, the data processing platform monitors the statistical data of the target object in multiple dimensions in real time, and the statistical data of the target object in different dimensions can be written into the first database in a timed or batch mode. The first database can write the statistics of multiple dimensions of the target object into one record in fields, and each field in the first database can be indexed, that is, the statistics of each field can be searched. In practical application, more dimensionality statistical data can be transversely expanded according to requirements, and data throughput is improved.
In one embodiment of the application, the data query request includes index numbers of the statistical data of the at least two dimensions of the target object in the first database, and the statistical data of the at least two dimensions of the target object is obtained from the first database according to the index numbers in response to the data query request.
Step 203, returning a data query response, wherein the data query response includes the statistical data of at least two dimensions of the target object.
In some embodiments, before the step of obtaining the statistics of at least two dimensions of the target object from the first database in response to the data query request, user authentication is further included. And under the condition that the user is determined to have the query authority, acquiring the multi-dimensional statistical data from the first database, and displaying the multi-dimensional statistical data to the user.
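Steps 201 to 203, together with the permission check, can be sketched in Python as follows. The in-memory `table`, the `permissions` mapping, the key layout and the function name `handle_query` are all hypothetical; a real deployment would query the first database and use its own authentication mechanism.

```python
def handle_query(table, permissions, user, store_id, period, dimensions):
    """Return the requested dimensions of one record if the user may see them."""
    if store_id not in permissions.get(user, set()):
        raise PermissionError(f"{user} may not query store {store_id}")
    record = table.get((store_id, period), {})
    # Only the requested fields are returned; missing fields are omitted.
    return {name: record[name] for name in dimensions if name in record}


table = {  # stand-in for the first database: one record per store and period
    ("store-1", "period-0"): {"browse_count": 20, "consult_count": 5, "order_count": 2},
}
permissions = {"alice": {"store-1"}}  # hypothetical user-to-store permissions

response = handle_query(
    table, permissions, "alice", "store-1", "period-0",
    dimensions=["browse_count", "consult_count"],
)
```

A user without permission for the store gets a `PermissionError` instead of a response, which corresponds to skipping the database read when authentication fails.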
Optionally, in some embodiments, the data processing platform may trigger a message notification according to a preset rule, in addition to writing the multidimensional statistical data of the target object in a timed or batch manner, for example, setting a threshold for the statistical data of different dimensions, and if the statistical data of a certain dimension exceeds the threshold, sending a prompt message (including a telephone call, a short message, an instant messaging software notification, etc.) to the user.
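A threshold rule of this kind could be expressed as in the Python sketch below. The function name `check_thresholds`, the field names and the message wording are assumptions, and the sketch only builds the notification text; how it is delivered (telephone, short message, instant messaging) is outside its scope.

```python
def check_thresholds(stats, thresholds):
    """Return one prompt message for every dimension that exceeds its threshold."""
    messages = []
    for dimension, limit in thresholds.items():
        value = stats.get(dimension)
        if value is not None and value > limit:
            messages.append(f"{dimension} = {value} exceeds threshold {limit}")
    return messages


stats = {"browse_count": 20, "consult_count": 5}
thresholds = {"browse_count": 15, "consult_count": 10}
alerts = check_thresholds(stats, thresholds)
```

Only the browsing amount exceeds its threshold here, so a single prompt message is produced.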
In the embodiment of the present application, the data processing apparatus may be divided into the functional modules according to the method embodiment, for example, each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module. The integrated module can be realized in a form of hardware or a form of a software functional module. It should be noted that, in the embodiment of the present application, the division of the module is schematic, and is only one logic function division, and there may be another division manner in actual implementation. The following description will be given by taking an example in which each functional module is divided by using a corresponding function.
Fig. 4 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application. As shown in fig. 4, the data processing apparatus 300 according to the present embodiment includes: an obtaining module 301, a processing module 302 and a storage module 303.
An obtaining module 301, configured to obtain real-time data streams of a target object in multiple dimensions, where each dimension corresponds to one real-time data stream;
the processing module 302 is configured to perform data statistics on the real-time data streams of multiple dimensions within a preset time period, so as to obtain statistical data of multiple dimensions, where the statistical data of each dimension is used to indicate an attribute value of the target object;
a storage module 303, configured to store the statistical data of the multiple dimensions in a first database.
In an embodiment of the present application, the obtaining module 301 is specifically configured to:
and acquiring real-time data streams of the target object in multiple dimensions from different message queues.
In an embodiment of the application, if at least two of the real-time data streams of the multiple dimensions are used to indicate the same attribute of the target object, the processing module 302 is further configured to:
performing data merging on the real-time data streams of the at least two dimensions to obtain merged data streams;
correspondingly, the processing module 302 is specifically configured to perform data statistics on the merged data stream within a preset time period to obtain statistical data of the merged data stream.
In an embodiment of the present application, the processing module 302 is specifically configured to:
and performing data merging on the real-time data streams of the at least two dimensions by adopting Union operation to obtain the merged data stream.
In an embodiment of the present application, before the processing module 302 performs data statistics on the real-time data streams of multiple dimensions respectively within a preset time period to obtain statistical data of multiple dimensions, the processing module is further configured to:
acquiring a preset data dimension table from a second database, wherein a plurality of preset target objects are recorded in the preset data dimension table;
determining whether the target object is a preset target object or not according to the preset data dimension table;
and if the target object is the preset target object, executing the step of data statistics.
In an embodiment of the application, if the target object is a preset target object, the processing module 302 is further configured to:
and respectively carrying out data filtering on the real-time data streams of the multiple dimensions to obtain filtered data streams of the multiple dimensions, wherein the data stream of each dimension comprises effective data in the data stream.
In an embodiment of the present application, before the processing module 302 performs data statistics on the real-time data streams of multiple dimensions respectively within a preset time period to obtain statistical data of multiple dimensions, the processing module is further configured to:
and respectively carrying out deduplication processing on the real-time data streams of the multiple dimensions to obtain deduplicated data streams of the multiple dimensions, wherein the data streams of each dimension do not include repeated data.
In an embodiment of the present application, the processing module 302 is specifically configured to:
and respectively carrying out data statistics on the real-time data streams of multiple dimensions in a preset time period by adopting a Flink architecture to obtain statistical data of multiple dimensions.
In one embodiment of the present application, the statistics of the plurality of dimensions comprise at least a first statistic and a second statistic; the storage module 303 is specifically configured to:
storing the first statistical data in a first field of a first record of the first database and the second statistical data in a second field of the first record of the first database.
In one embodiment of the present application, the first database is a distributed document database; the storage module 303 is specifically configured to:
storing the statistical data of the multiple dimensions to the first database according to a preset writing period; or
After the statistical data of all dimensions are acquired, the statistical data of the plurality of dimensions are stored in a first database.
Fig. 5 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application. Based on the apparatus shown in fig. 4, as shown in fig. 5, the data processing apparatus 300 according to the embodiment further includes: a receiving module 304 and a sending module 305.
A receiving module 304, configured to receive a data query request, where the data query request is used to request statistical data of at least two dimensions of the target object;
an obtaining module 301, configured to, in response to the data query request, obtain statistical data of at least two dimensions of the target object from the first database;
a sending module 305, configured to return a data query response, where the data query response includes statistics of at least two dimensions of the target object.
The data processing apparatus provided in this embodiment may execute the technical solutions of any of the above method embodiments, and the implementation principles and technical effects are similar, which are not described herein again.
Exemplarily, fig. 6 is a hardware structure diagram of an electronic device provided in the embodiment of the present application, and as shown in fig. 6, an electronic device 400 provided in the embodiment includes:
a memory 401;
a processor 402; and
a computer program;
the computer program is stored in the memory 401 and configured to be executed by the processor 402 to implement the technical solution of any one of the above method embodiments, and the implementation principle and the technical effect are similar, and are not described herein again.
Optionally, the memory 401 may be separate or integrated with the processor 402. When the memory 401 is a separate device from the processor 402, the electronic device 400 further comprises: a bus 403 for connecting the memory 401 and the processor 402.
The embodiment of the present application further provides a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by the processor 402 to implement the technical solution in any of the foregoing method embodiments.
The present application provides a computer program product, including a computer program, which when executed by a processor implements the technical solutions in any of the foregoing method embodiments.
An embodiment of the present application further provides a chip, including: a processing module and a communication interface, wherein the processing module can execute the technical scheme in the method embodiment.
Further, the chip further includes a storage module (e.g., a memory), where the storage module is configured to store instructions, and the processing module is configured to execute the instructions stored in the storage module, and the execution of the instructions stored in the storage module causes the processing module to execute the technical solution in the foregoing method embodiment.
It should be understood that the Processor may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. The steps of a method disclosed in connection with the present invention may be embodied directly in a hardware processor, or in a combination of the hardware and software modules within the processor.
The memory may comprise a high-speed RAM memory, and may further comprise a non-volatile storage NVM, such as at least one disk memory, and may also be a usb disk, a removable hard disk, a read-only memory, a magnetic or optical disk, etc.
The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended ISA (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, the buses in the figures of the present application are not limited to only one bus or one type of bus.
The storage medium may be implemented by any type or combination of volatile or non-volatile memory devices, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks. A storage medium may be any available medium that can be accessed by a general purpose or special purpose computer.
An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. Of course, the storage medium may also be integral to the processor. The processor and the storage medium may reside in an Application Specific Integrated Circuit (ASIC). Of course, the processor and the storage medium may also reside as discrete components in an electronic device.
Finally, it should be noted that: the above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; such modifications or substitutions do not depart from the spirit and scope of the present disclosure as defined by the appended claims.

Claims (15)

1. A data processing method, comprising:
acquiring real-time data streams of a target object in multiple dimensions, wherein each dimension corresponds to one real-time data stream;
respectively carrying out data statistics on real-time data streams of multiple dimensions in a preset time period to obtain statistical data of the multiple dimensions, wherein the statistical data of each dimension is used for indicating an attribute value of the target object;
storing the statistical data of the plurality of dimensions to a first database.
2. The method of claim 1, wherein the obtaining real-time data of the target object in different dimensions comprises:
and acquiring real-time data streams of the target object in multiple dimensions from different message queues.
3. The method of claim 1, wherein if at least two of the plurality of real-time data streams are used to indicate the same property of the target object, the method further comprises:
performing data merging on the real-time data streams of the at least two dimensions to obtain merged data streams;
correspondingly, respectively carrying out data statistics on the real-time data streams of multiple dimensions within a preset time period to obtain statistical data of multiple dimensions, including:
and performing data statistics on the merged data stream within a preset time period to obtain statistical data of the merged data stream.
4. The method of claim 3, wherein the data merging the real-time data streams of the at least two dimensions to obtain a merged data stream comprises:
and performing data merging on the real-time data streams of the at least two dimensions by adopting Union operation to obtain the merged data stream.
5. The method according to claim 1, before performing data statistics on the real-time data streams of the multiple dimensions respectively within a preset time period to obtain statistical data of the multiple dimensions, the method further comprises:
acquiring a preset data dimension table from a second database, wherein a plurality of preset target objects are recorded in the preset data dimension table;
determining whether the target object is a preset target object or not according to the preset data dimension table;
and if the target object is the preset target object, executing the step of data statistics.
6. The method of claim 1, wherein if the target object is a predetermined target object, the method further comprises:
and respectively carrying out data filtering on the real-time data streams of the multiple dimensions to obtain filtered data streams of the multiple dimensions, wherein the data stream of each dimension comprises effective data in the data stream.
7. The method according to claim 1, before performing data statistics on the real-time data streams of the multiple dimensions respectively within a preset time period to obtain statistical data of the multiple dimensions, the method further comprises:
and respectively carrying out deduplication processing on the real-time data streams of the multiple dimensions to obtain deduplicated data streams of the multiple dimensions, wherein the data streams of each dimension do not include repeated data.
8. The method according to claim 1, wherein performing data statistics on the real-time data streams of multiple dimensions within a preset time period to obtain statistical data of multiple dimensions respectively comprises:
and respectively carrying out data statistics on the real-time data streams of multiple dimensions in a preset time period by adopting a Flink architecture to obtain statistical data of multiple dimensions.
9. The method of claim 1, wherein the statistical data of the multiple dimensions comprises at least first statistical data and second statistical data; and storing the statistical data of the multiple dimensions to the first database comprises:
storing the first statistical data in a first field of a first record of the first database and the second statistical data in a second field of the first record of the first database.
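The storage layout in claim 9, one record per target object with one field per dimension, can be sketched with a dictionary standing in for the first database. The field names `click_count` and `order_count` and the helper `upsert_field` are hypothetical.

```python
from typing import Dict

Record = Dict[str, int]

def upsert_field(store: Dict[str, Record], object_id: str,
                 field: str, value: int) -> None:
    """Write one dimension's statistic into its own field of the object's record."""
    record = store.setdefault(object_id, {})  # create the record on first write
    record[field] = value

store: Dict[str, Record] = {}
upsert_field(store, "sku-1", "click_count", 120)  # first statistical data
upsert_field(store, "sku-1", "order_count", 7)    # second statistical data
```

Because both statistics live in the same record, a single lookup by object identifier returns every dimension at once.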
10. The method of claim 1, wherein the first database is a distributed document database; and storing the statistical data of the multiple dimensions to the first database comprises:
storing the statistical data of the multiple dimensions to the first database according to a preset writing period; or
storing the statistical data of the multiple dimensions to the first database after the statistical data of all dimensions has been acquired.
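The two write strategies in claim 10 can be contrasted in a short sketch. The in-memory buffer, the helper names, and the period value are all assumptions. A real deployment would write to the document database rather than to a dictionary.

```python
from typing import Dict, Set

Record = Dict[str, int]

def flush(pending: Dict[str, Record], store: Dict[str, Record]) -> None:
    """Move all buffered statistics into the store and clear the buffer."""
    for object_id, fields in pending.items():
        store.setdefault(object_id, {}).update(fields)
    pending.clear()

def flush_if_due(pending: Dict[str, Record], store: Dict[str, Record],
                 last_write: float, now: float, period: float) -> float:
    """Option 1: write once per preset period; returns the time of the last write."""
    if now - last_write >= period:
        flush(pending, store)
        return now
    return last_write

def flush_if_complete(pending: Dict[str, Record], store: Dict[str, Record],
                      expected: Set[str]) -> bool:
    """Option 2: write only when every buffered object has all expected dimensions."""
    if pending and all(expected <= set(f) for f in pending.values()):
        flush(pending, store)
        return True
    return False

store: Dict[str, Record] = {}
expected = {"click_count", "order_count"}
pending: Dict[str, Record] = {"sku-1": {"click_count": 3}}

wrote_early = flush_if_complete(pending, store, expected)  # one dimension missing
pending["sku-1"]["order_count"] = 1
wrote_late = flush_if_complete(pending, store, expected)   # now complete

last = flush_if_due({"sku-2": {"click_count": 5}}, store,
                    last_write=0.0, now=30.0, period=60.0)  # period not elapsed
```

Periodic writing bounds the write rate, while waiting for all dimensions keeps each stored record internally consistent.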
11. The method according to any one of claims 1-9, further comprising:
receiving a data query request, wherein the data query request is used for requesting statistical data of at least two dimensions of the target object;
in response to the data query request, acquiring statistical data of at least two dimensions of the target object from the first database;
and returning a data query response, wherein the data query response includes the statistical data of the at least two dimensions of the target object.
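The query flow in claim 11 can be sketched as a single lookup against the stored record. The request and response shapes and the name `handle_query` are assumptions, and a dictionary again stands in for the first database.

```python
from typing import Dict, List

Record = Dict[str, int]

def handle_query(store: Dict[str, Record],
                 request: Dict[str, object]) -> Dict[str, object]:
    """Return the requested dimensions of one target object from its record."""
    object_id = str(request["object_id"])
    wanted: List[str] = list(request["dimensions"])  # dimensions asked for
    record = store.get(object_id, {})
    statistics = {d: record[d] for d in wanted if d in record}
    return {"object_id": object_id, "statistics": statistics}

store = {"sku-1": {"click_count": 120, "order_count": 7, "cart_count": 15}}
request = {"object_id": "sku-1", "dimensions": ["click_count", "order_count"]}

response = handle_query(store, request)
```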
12. A data processing apparatus, comprising:
an acquisition module, configured to acquire real-time data streams of a target object in multiple dimensions, wherein each dimension corresponds to one real-time data stream;
a processing module, configured to perform data statistics on the real-time data streams of the multiple dimensions respectively within a preset time period to obtain statistical data of the multiple dimensions, wherein the statistical data of each dimension indicates an attribute value of the target object;
and a storage module, configured to store the statistical data of the multiple dimensions to a first database.
13. An electronic device, comprising:
a memory;
a processor; and
a computer program;
wherein the computer program is stored in the memory and configured to be executed by the processor to implement the method of any one of claims 1 to 11.
14. A computer-readable storage medium, on which a computer program is stored, wherein the computer program, when executed by a processor, implements the method according to any one of claims 1 to 11.
15. A computer program product, characterized in that it comprises a computer program which, when executed by a processor, implements the method of any one of claims 1 to 11.
CN202110206002.6A 2021-02-24 2021-02-24 Data processing method, device, equipment and storage medium Pending CN113761018A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110206002.6A CN113761018A (en) 2021-02-24 2021-02-24 Data processing method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113761018A true CN113761018A (en) 2021-12-07

Family

ID=78786626

Country Status (1)

Country Link
CN (1) CN113761018A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106202081A (en) * 2015-04-30 2016-12-07 阿里巴巴集团控股有限公司 Real-time data processing method and device
WO2017016423A1 (en) * 2015-07-29 2017-02-02 阿里巴巴集团控股有限公司 Real-time new data update method and device
CN107665241A (en) * 2017-09-07 2018-02-06 北京京东尚科信息技术有限公司 A kind of real time data various dimensions De-weight method and device
CN109726074A (en) * 2018-08-31 2019-05-07 网联清算有限公司 Log processing method, device, computer equipment and storage medium
CN111125121A (en) * 2020-03-30 2020-05-08 四川新网银行股份有限公司 Real-time data display method based on HBase table
CN111311326A (en) * 2020-02-18 2020-06-19 平安科技(深圳)有限公司 User behavior real-time multidimensional analysis method and device and storage medium
CN112100253A (en) * 2020-09-15 2020-12-18 李博 Method for constructing operator real-time business report

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116126872A (en) * 2023-04-18 2023-05-16 紫金诚征信有限公司 Correlation method, device and computer readable medium for real-time dimension table
CN116126872B (en) * 2023-04-18 2023-06-23 紫金诚征信有限公司 Correlation method, device and computer readable medium for real-time dimension table

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination