CN106951552A - A kind of user behavior data processing method based on Hadoop - Google Patents
- Publication number
- CN106951552A (application CN201710191813.7A)
- Authority
- CN
- China
- Prior art keywords
- data
- user
- real
- real time
- time
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/252—Integrating or interfacing systems involving database management systems between a Database Management System and a front-end application
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/1734—Details of monitoring file system events, e.g. by the use of hooks, filter drivers, logs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2462—Approximate or statistical queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
Abstract
The present invention relates to a Hadoop-based user behavior data processing method. The method includes: importing user historical data sources into the distributed file system HDFS; generating the user's historical behavior data table from the user historical data sources; collecting the user's real-time behavior data stream with Flume; recording in Kafka, in real time, the data collected from Flume; processing the real-time data generated by user behavior with the real-time computing framework Spark, according to the different service types of the real-time behavior data stream, to generate the user's real-time data table; associating the user's real-time data table and historical behavior data table by the IMSI number in the IMSI database to obtain the user's behavior data wide table; exporting the user's behavior data wide table and saving it to an HBase database according to a preset configuration file; and integrating the query system Impala with the HBase database to provide an external query entry point for user behavior data. The technical solution provided by the invention makes it possible to build an efficient, fine-grained user behavior data service system.
Description
Technical field
The invention belongs to the field of communication technology and relates to a Hadoop-based user behavior data processing method.
Background
With the commercialization and widespread deployment of 4G networks, mobile communication services have fully entered the mobile Internet era. Rapidly growing mobile network bandwidth has directly brought a wide variety of applications and user behaviors, and the complexity and volume of data in communication networks have grown rapidly with it. Data processing complexity and computational demands have risen accordingly, and the processing capability of traditional database systems faces a severe challenge. Faced with massive data processing demands and tighter latency requirements, traditional data systems are severely tested in CPU computing power, memory response and throughput, and network bandwidth, and they run into many bottlenecks under the trend toward high security and multi-center deployment. With the arrival of the big data era, single-node computing can no longer meet data processing demands; distributed data processing and storage systems have gradually become the preferred architecture for big data platforms, and big data technology has become a research hotspot in many fields. The Hadoop big data platform, however, is based mainly on parallel processing of static data files. Although it is efficient at high-volume throughput, computation, and storage, its real-time performance is poor: it is a high-throughput, high-concurrency, high-latency architecture, and its performance on small files has always been an unavoidable problem. It is therefore ill-suited to data processing and usage scenarios with stricter real-time requirements.
At present there is no method that integrates the processing of Internet users' real-time data and historical (offline) data, and in particular no refined operation method suited to operators' big data development.
Summary of the invention
In view of this, the object of the invention is to provide a Hadoop-based user behavior data processing method that makes it possible to build an efficient, fine-grained user behavior data service system.
To achieve this object, the invention provides the following technical solution:
A Hadoop-based user behavior data processing method, the method including:
importing user historical data sources into the distributed file system HDFS, so that HDFS provides the data access interface; wherein the user historical data sources include at least one of an International Mobile Subscriber Identity (IMSI) database, an International Mobile Equipment Identity (IMEI) database, and a crawler database;
generating the user's historical behavior data table from the user historical data sources;
collecting the user's real-time behavior data stream with the data acquisition tool Flume, the real-time behavior data stream including the user's real-time Internet access logs and real-time parsed data of the user's Internet behavior;
recording in real time, in the distributed publish-subscribe messaging system Kafka, the data collected from Flume, with Kafka acting as a message buffering component that supplies data to the real-time computing framework;
processing in real time, with the real-time computing framework Spark and according to the different service types of the real-time behavior data stream, the real-time data generated by user behavior, to generate the user's real-time data table;
associating the user's real-time data table and historical behavior data table by the IMSI number in the IMSI database, to obtain the user's behavior data wide table;
exporting the user's behavior data wide table and saving it to an HBase database according to a preset configuration file;
integrating the query system Impala with the HBase database, to provide an external query entry point for user behavior data.
Further, generating the user's historical behavior data table from the user historical data sources includes:
associating all of the user's historical behavior data by the IMSI number in the IMSI database, and mapping all of that historical behavior data into the data warehouse tool Hive to form the user's historical behavior data table.
Further, after the distributed publish-subscribe messaging system Kafka records in real time the data collected from Flume, the method also includes:
determining whether the data to be processed has been buffered in the Kafka configuration file; if so, sending the data to be processed to the real-time computing framework Spark; if not, feeding the data to be processed back to the distributed publish-subscribe messaging system Kafka.
Further, the IMSI database, the IMEI database, and the crawler database are imported from a relational database into HDFS by Sqoop.
Further, the user's real-time behavior data stream includes real-time data corresponding to the user's access characteristics, search information, and traffic consumption on the mobile terminal.
Further, obtaining the user's behavior data wide table includes:
based on different service logic, using the Map/Reduce framework to obtain, for all input users, the output values of the real-time data table and the historical behavior data table, so as to form the behavior data wide table; wherein one IMSI number identifies one user.
Further, the structure of a table in the HBase database includes a combination of the IMSI number and a service number, and a column for storing the user's specific service information.
The beneficial effects of the invention are as follows:
(1) The invention stores the user's massive historical raw data on HDFS, which gives the raw data fault-tolerant, high-throughput, low-cost storage and supports streaming access to the data in the file system. The data acquisition tool Flume collects the real-time data of user behavior, which includes the user's real-time Internet access logs and real-time parsed data of the user's Internet behavior. Kafka records in real time the data collected from Flume and, as a message buffering component, provides reliable data support to the upper-layer real-time computing framework. The in-memory computing framework Spark then processes the real-time data generated by user behavior in real time. Associating the user's real-time data and historical data by IMSI number yields a unified user behavior data wide table, which is stored in the distributed database HBase. This provides a feasible solution for storing massive user behavior data and relieves the pressure that conventional methods place on a single machine storing customer data.
(2) The invention is based on the Hadoop platform and distributes the task of building a fine-grained user behavior system across a cluster of low-spec computers. Integrating Impala with HBase provides an efficient query engine for user behavior data, which reduces query latency and executes much faster than native MapReduce and Hive.
(3) Compared with traditional single-source user data, the user behavior data generation method of the invention establishes an efficient, fine-grained user behavior data service system that is also highly scalable, and it effectively improves the operator's refined operation capability.
Brief description of the drawings
To make the object, technical solution, and beneficial effects of the invention clearer, the invention provides the following drawings for illustration:
Fig. 1 is a flowchart of the Hadoop-based user behavior data generation method provided by the invention;
Fig. 2 is a design diagram of the user historical behavior data table in the invention;
Fig. 3 is a model design diagram of the user's real-time behavior data in the invention;
Fig. 4 is a design diagram of the user behavior data wide table in the invention;
Fig. 5 is a diagram of the HBase storage structure in the invention.
Detailed description of the embodiments
The preferred embodiments of the invention are described in detail below with reference to the drawings.
Referring to Fig. 1, an embodiment of the application provides a Hadoop-based user behavior data processing method, the method including:
importing user historical data sources into the distributed file system HDFS, so that HDFS provides the data access interface; wherein the user historical data sources include at least one of an International Mobile Subscriber Identity (IMSI) database, an International Mobile Equipment Identity (IMEI) database, and a crawler database;
generating the user's historical behavior data table from the user historical data sources;
collecting the user's real-time behavior data stream with the data acquisition tool Flume, the real-time behavior data stream including the user's real-time Internet access logs and real-time parsed data of the user's Internet behavior;
recording in real time, in the distributed publish-subscribe messaging system Kafka, the data collected from Flume, with Kafka acting as a message buffering component that supplies data to the real-time computing framework;
processing in real time, with the real-time computing framework Spark and according to the different service types of the real-time behavior data stream, the real-time data generated by user behavior, to generate the user's real-time data table;
associating the user's real-time data table and historical behavior data table by the IMSI number in the IMSI database, to obtain the user's behavior data wide table;
exporting the user's behavior data wide table and saving it to an HBase database according to a preset configuration file;
integrating the query system Impala with the HBase database, to provide an external query entry point for user behavior data.
In this embodiment, generating the user's historical behavior data table from the user historical data sources includes:
associating all of the user's historical behavior data by the IMSI number in the IMSI database, and mapping all of that historical behavior data into the data warehouse tool Hive to form the user's historical behavior data table.
In this embodiment, after the distributed publish-subscribe messaging system Kafka records in real time the data collected from Flume, the method also includes:
determining whether the data to be processed has been buffered in the Kafka configuration file; if so, sending the data to be processed to the real-time computing framework Spark; if not, feeding the data to be processed back to the distributed publish-subscribe messaging system Kafka.
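The buffering check described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function `route_pending`, the two in-memory queues, and the choice to mark a record as buffered once it has been fed back are all hypothetical stand-ins, and no Kafka or Spark API is used.

```python
from collections import deque

def route_pending(record_id, buffered_ids, spark_queue, kafka_queue):
    """Send a buffered record to the compute stage; otherwise feed it back to the buffer."""
    if record_id in buffered_ids:
        spark_queue.append(record_id)   # stands in for handing the record to Spark
        return "spark"
    kafka_queue.append(record_id)       # stands in for feeding the record back to Kafka
    buffered_ids.add(record_id)         # assumption: a fed-back record counts as buffered afterwards
    return "kafka"

buffered = {"r1"}
to_spark, to_kafka = deque(), deque()
first = route_pending("r2", buffered, to_spark, to_kafka)   # not buffered yet
second = route_pending("r2", buffered, to_spark, to_kafka)  # buffered by the previous call
```

With these assumptions, a record is processed only after it has passed through the buffer at least once.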
In this embodiment, the IMSI database, the IMEI database, and the crawler database are imported from a relational database into HDFS by Sqoop.
In this embodiment, the user's real-time behavior data stream includes real-time data corresponding to the user's access characteristics, search information, and traffic consumption on the mobile terminal.
In this embodiment, obtaining the user's behavior data wide table includes:
based on different service logic, using the Map/Reduce framework to obtain, for all input users, the output values of the real-time data table and the historical behavior data table, so as to form the behavior data wide table; wherein one IMSI number identifies one user.
In this embodiment, the structure of a table in the HBase database includes a combination of the IMSI number and a service number, and a column for storing the user's specific service information.
In this embodiment, Hadoop is an open-source project managed by the Apache organization that is now widely used. Hadoop has grown to include ten sub-projects: Hadoop Common, HDFS, MapReduce, ZooKeeper, Avro, Chukwa, HBase, Hive, Mahout, and Pig. The Hadoop core consists of three subsystems: Hadoop Common, HDFS (Hadoop Distributed File System), and MapReduce. Hadoop Common provides basic supporting functions for the overall Hadoop architecture, mainly the file system, the remote procedure call protocol, and the data serialization library. HDFS is a distributed file system characterized by high fault tolerance and relatively low cost of use. MapReduce is a programming model and software framework used mainly to write parallel programs that quickly process massive data on large computer clusters.
Spark is a distributed in-memory computing framework characterized by its ability to process large-scale data at high speed. In this scheme Spark is integrated with Hadoop and relies on its distributed file system to operate, and it continues Hadoop's MapReduce computing model. By contrast, Spark keeps its computation in memory, which reduces disk reads and writes, and it can merge multiple operations before computing, which improves computing speed. Spark here is deployed on the Hadoop cluster with HDFS as its data source; in essence it is a computing framework running on YARN, like MapReduce. The core of Spark is the RDD. Core components such as Spark SQL, Spark Streaming, MLlib, GraphX, and SparkR solve many big data problems, and its mature framework is increasingly popular. Its ecosystem, including visualization tools such as Zeppelin, is also growing steadily. Unlike Hadoop, Spark does not spill to disk during its read and write process but works in memory, so it is fast. In addition, the wide and narrow dependencies used by the DAG job scheduler further improve Spark's speed.
Sqoop is a tool for efficiently transferring data between relational databases and HDFS. It can import data from a relational database into Hadoop's HDFS, and it can also export HDFS data into a relational database.
Flume is a highly available, highly reliable, distributed system provided by Cloudera for collecting, aggregating, and transmitting massive volumes of logs. Flume supports customizing various data senders in a logging system in order to collect data; it also provides the ability to perform simple processing on the data and to write it to various (customizable) data receivers.
Kafka is a high-throughput distributed publish-subscribe messaging system that can handle all the action stream data of a consumer-scale website. Kafka records in real time the data collected from the data acquisition tool Flume and, as a message buffering component, provides reliable data support to the upper-layer real-time computing framework.
HBase is a highly reliable, high-performance, column-oriented, scalable distributed storage system. With HBase, a large-scale structured storage cluster can be built on inexpensive PC servers. Unlike a general relational database, HBase is a database suited to storing unstructured data.
Impala is a real-time big data query and analysis tool whose development is led by Cloudera. Its query speed is 3 to 90 times that of the original MapReduce-based Hive SQL, and it is more flexible and easier to use. It provides SQL-like query statements and can query petabyte-scale data stored in Hadoop's HDFS and HBase. Fast querying is its greatest advantage; as a real-time big data query and analysis tool, Impala is also highly flexible, easy to integrate, and highly scalable.
Taking the user's historical behavior data (region attribute, service plan subscribed by the user) and the usage duration of the apps the user has used on the current day as an example, the technical solution provided by the invention comprises the following steps:
Step 1: Import the user historical data sources into the distributed file system HDFS, which provides high-throughput data access. The data sources include the IMSI database, the IMEI database, and the crawler database, which are imported from a relational database into HDFS by Sqoop. The table format of the user historical data sources is shown in Fig. 2.
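As a rough illustration of the Sqoop import in Step 1, the sketch below only assembles the argument list for a `sqoop import` invocation and does not run it. The JDBC URL, user name, password file, table name, and target directory are hypothetical placeholders, not values from the patent; the flags used are standard Sqoop 1 import options.

```python
def sqoop_import_args(jdbc_url, username, password_file, table, target_dir, mappers=4):
    """Build the command line for importing one relational table into HDFS."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,             # JDBC connection string of the source database
        "--username", username,
        "--password-file", password_file,  # file that holds the database password
        "--table", table,                  # source table, e.g. the IMSI database table
        "--target-dir", target_dir,        # destination directory on HDFS
        "--num-mappers", str(mappers),     # number of parallel import tasks
    ]

# Hypothetical values for illustration only.
args = sqoop_import_args(
    "jdbc:mysql://db.example.com/userdb",
    "etl",
    "/user/etl/.password",
    "imsi_db",
    "/data/history/imsi_db",
)
```

The same helper could be called once per source (IMSI, IMEI, crawler), and the resulting list passed to a process launcher on a machine where Sqoop is installed.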
Step 2: Collect the real-time data of user behavior with the data acquisition tool Flume, taking the usage duration of the apps the user has used on the current day as the example real-time data. Kafka records in real time the data collected from Flume and, as a message buffering component, provides reliable data support to the upper-layer real-time computing framework.
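One way to picture the buffering in Step 2 is to treat each usage event as a keyed message, with the IMSI as the key so that all events of one user land in the same partition. The sketch below is an assumption-laden illustration: the event fields, the JSON encoding, and the CRC32-based `pick_partition` are hypothetical, and Kafka's own default partitioner uses a different hash function.

```python
import json
import zlib

def to_message(event):
    """Encode one usage event as a (key, value) pair of bytes, keyed by IMSI."""
    key = event["imsi"].encode("utf-8")
    value = json.dumps(event, sort_keys=True).encode("utf-8")
    return key, value

def pick_partition(key, num_partitions):
    """Stand-in partitioner: a stable hash of the key modulo the partition count."""
    return zlib.crc32(key) % num_partitions

# Hypothetical events: same user, two different apps.
events = [
    {"imsi": "460001234567890", "app": "video", "start": 0, "end": 120},
    {"imsi": "460001234567890", "app": "chat", "start": 200, "end": 260},
]
partitions = [pick_partition(to_message(e)[0], 8) for e in events]
```

Keying by IMSI keeps one user's events ordered within a partition, which simplifies the per-user aggregation in the next step.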
Step 3: Following the example in Step 2, the Spark real-time data processing tool processes the user's app usage durations, computing in real time the usage duration of each app the current user has used that day and outputting the result. Similarly, once a full day, week, or month has been computed, the data can be exported. The real-time data structure is shown in Fig. 3.
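The per-app duration computed in Step 3 amounts to summing session lengths per (IMSI, app) key. The plain-Python sketch below stands in for what a Spark job would compute with a keyed reduction; the event tuple layout and the sample values are assumptions made for illustration.

```python
from collections import defaultdict

def app_usage_seconds(events):
    """Sum usage seconds per (imsi, app) from (imsi, app, start, end) tuples."""
    totals = defaultdict(int)
    for imsi, app, start, end in events:
        totals[(imsi, app)] += end - start
    return dict(totals)

# Hypothetical same-day events; times are seconds since midnight.
events = [
    ("460001234567890", "video", 0, 120),
    ("460001234567890", "chat", 200, 260),
    ("460001234567890", "video", 300, 330),
    ("460009876543210", "chat", 10, 40),
]
usage = app_usage_seconds(events)
```

Daily, weekly, or monthly exports would apply the same reduction over a longer window of events.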
Step 4: Apply the service logic, which in this example covers the service plan subscribed by the user and the app usage duration. The Map/Reduce framework is used to obtain the output values for all input users (one IMSI represents one user) and form the user behavior table, whose format is shown in Fig. 4.
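The association in Step 4 can be read as a reduce-side join keyed on IMSI. The following is a minimal single-process sketch of that idea, not the patent's actual job: the field names (`region`, `plan`, `app`, `seconds`) and the sample rows are hypothetical, and the map, shuffle, and reduce phases are simulated in plain Python.

```python
from collections import defaultdict

def map_phase(history_rows, realtime_rows):
    """Emit (imsi, tagged row) pairs from both input tables."""
    for row in history_rows:
        yield row["imsi"], ("history", row)
    for row in realtime_rows:
        yield row["imsi"], ("realtime", row)

def reduce_phase(imsi, tagged_rows):
    """Merge all rows of one IMSI into a single wide-table row."""
    wide = {"imsi": imsi, "region": None, "plan": None, "app_seconds": {}}
    for tag, row in tagged_rows:
        if tag == "history":
            wide["region"], wide["plan"] = row["region"], row["plan"]
        else:
            apps = wide["app_seconds"]
            apps[row["app"]] = apps.get(row["app"], 0) + row["seconds"]
    return wide

def build_wide_table(history_rows, realtime_rows):
    groups = defaultdict(list)
    for key, value in map_phase(history_rows, realtime_rows):
        groups[key].append(value)  # simulated shuffle: group values by IMSI
    return {imsi: reduce_phase(imsi, rows) for imsi, rows in groups.items()}

# Hypothetical sample rows.
history = [{"imsi": "460001234567890", "region": "region-A", "plan": "plan-1"}]
realtime = [
    {"imsi": "460001234567890", "app": "video", "seconds": 150},
    {"imsi": "460001234567890", "app": "chat", "seconds": 60},
]
wide_table = build_wide_table(history, realtime)
```

Because the IMSI is the only join key, a user with real-time activity but no historical record still gets a row, with the historical fields left empty.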
Step 5: Save the user behavior data to HBase according to the configuration file, and integrate Impala with HBase to provide the query entry point for user behavior data. Compared with the execution speed of native MapReduce and Hive, this greatly increases the speed of statistical analysis on the user behavior data wide table. The storage structure in HBase is shown in Fig. 5: the RowKey is the IMSI plus the service number, and the column family contains one column, Data:Label, which stores the user's specific service information.
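The storage layout in Step 5 can be sketched as a function that turns one service record into a row key plus a single cell. This is an illustration under assumptions: the text only says the RowKey is the IMSI plus the service number, so the 4-digit zero padding and the JSON encoding of the cell value are hypothetical choices, and no HBase client is used.

```python
import json

def row_key(imsi, service_no):
    """IMSI followed by the service number (zero padding is an assumed convention)."""
    return f"{imsi}{service_no:04d}".encode("utf-8")

def to_hbase_row(imsi, service_no, info):
    """Return (row key, cells) with the service information in column Data:Label."""
    cells = {b"Data:Label": json.dumps(info, sort_keys=True).encode("utf-8")}
    return row_key(imsi, service_no), cells

# Hypothetical record: service number 7 holds this user's video usage.
key, cells = to_hbase_row("460001234567890", 7, {"app": "video", "seconds": 150})
```

Putting the IMSI first keeps all rows of one user adjacent in HBase's sorted key space, so a prefix scan on the IMSI returns every service record for that user.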
Finally, it should be noted that the preferred embodiments above are intended only to illustrate the technical solution of the invention, not to limit it. Although the invention has been described in detail through the preferred embodiments above, those skilled in the art will understand that various changes in form and detail may be made to it without departing from the scope defined by the claims of the invention.
Claims (7)
1. A Hadoop-based user behavior data processing method, characterized in that the method includes:
importing user historical data sources into the distributed file system HDFS, so that HDFS provides the data access interface; wherein the user historical data sources include at least one of an International Mobile Subscriber Identity (IMSI) database, an International Mobile Equipment Identity (IMEI) database, and a crawler database;
generating the user's historical behavior data table from the user historical data sources;
collecting the user's real-time behavior data stream with the data acquisition tool Flume, the real-time behavior data stream including the user's real-time Internet access logs and real-time parsed data of the user's Internet behavior;
recording in real time, in the distributed publish-subscribe messaging system Kafka, the data collected from Flume, with Kafka acting as a message buffering component that supplies data to the real-time computing framework;
processing in real time, with the real-time computing framework Spark and according to the different service types of the real-time behavior data stream, the real-time data generated by user behavior, to generate the user's real-time data table;
associating the user's real-time data table and historical behavior data table by the IMSI number in the IMSI database, to obtain the user's behavior data wide table;
exporting the user's behavior data wide table and saving it to an HBase database according to a preset configuration file;
integrating the query system Impala with the HBase database, to provide an external query entry point for user behavior data.
2. The method according to claim 1, characterized in that generating the user's historical behavior data table from the user historical data sources includes:
associating all of the user's historical behavior data by the IMSI number in the IMSI database, and mapping all of that historical behavior data into the data warehouse tool Hive to form the user's historical behavior data table.
3. The method according to claim 1, characterized in that, after the distributed publish-subscribe messaging system Kafka records in real time the data collected from Flume, the method also includes:
determining whether the data to be processed has been buffered in the Kafka configuration file; if so, sending the data to be processed to the real-time computing framework Spark; if not, feeding the data to be processed back to the distributed publish-subscribe messaging system Kafka.
4. The method according to claim 1, characterized in that the IMSI database, the IMEI database, and the crawler database are imported from a relational database into HDFS by Sqoop.
5. The method according to claim 1, characterized in that the user's real-time behavior data stream includes real-time data corresponding to the user's access characteristics, search information, and traffic consumption on the mobile terminal.
6. The method according to claim 1, characterized in that obtaining the user's behavior data wide table includes:
based on different service logic, using the Map/Reduce framework to obtain, for all input users, the output values of the real-time data table and the historical behavior data table, so as to form the behavior data wide table; wherein one IMSI number identifies one user.
7. The method according to claim 1, characterized in that the structure of a table in the HBase database includes a combination of the IMSI number and a service number, and a column for storing the user's specific service information.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710191813.7A CN106951552A (en) | 2017-03-27 | 2017-03-27 | A kind of user behavior data processing method based on Hadoop |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710191813.7A CN106951552A (en) | 2017-03-27 | 2017-03-27 | A kind of user behavior data processing method based on Hadoop |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106951552A true CN106951552A (en) | 2017-07-14 |
Family
ID=59474151
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710191813.7A Pending CN106951552A (en) | 2017-03-27 | 2017-03-27 | A kind of user behavior data processing method based on Hadoop |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106951552A (en) |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107748800A (en) * | 2017-11-15 | 2018-03-02 | 北京易讯通信息技术股份有限公司 | A kind of fusion of distributed real-time data processing government affairs service data and sharing method |
CN108133041A (en) * | 2018-01-11 | 2018-06-08 | 四川九洲电器集团有限责任公司 | Data collecting system and method based on web crawlers and data transfer technology |
CN108153836A (en) * | 2017-12-14 | 2018-06-12 | 浙江航天恒嘉数据科技有限公司 | A kind of time series data accesses system and method |
CN109388637A (en) * | 2018-09-21 | 2019-02-26 | 北京京东金融科技控股有限公司 | Data warehouse information processing method, device, system, medium |
CN110162563A (en) * | 2019-05-28 | 2019-08-23 | 深圳市网心科技有限公司 | A kind of data storage method, system and electronic equipment and storage medium |
CN110209422A (en) * | 2018-05-09 | 2019-09-06 | 腾讯科技(深圳)有限公司 | A kind of method for processing business, computer equipment and client |
CN111694891A (en) * | 2019-03-12 | 2020-09-22 | 马上消费金融股份有限公司 | Data table processing method and device |
CN112416982A (en) * | 2021-01-25 | 2021-02-26 | 北京轻松筹信息技术有限公司 | Method and device for calculating real-time user characteristics |
CN113177049A (en) * | 2021-05-13 | 2021-07-27 | 中移智行网络科技有限公司 | Data processing method, device and system |
CN113434376A (en) * | 2021-06-24 | 2021-09-24 | 山东浪潮科学研究院有限公司 | Web log analysis method and device based on NoSQL |
CN114490525A (en) * | 2022-02-22 | 2022-05-13 | 北京科杰科技有限公司 | System and method for analyzing and putting out and putting in storage of super-large unstructured text files remotely based on hadoop |
CN115801353A (en) * | 2022-11-03 | 2023-03-14 | 智网安云(武汉)信息技术有限公司 | Linkage script processing method after real-time aggregation of safety event logs based on big data level |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103761309A (en) * | 2014-01-23 | 2014-04-30 | 中国移动(深圳)有限公司 | Operation data processing method and system |
CN105893628A (en) * | 2016-05-17 | 2016-08-24 | 中国农业银行股份有限公司 | Real-time data collection system and method |
CN105930446A (en) * | 2016-04-20 | 2016-09-07 | 重庆重邮汇测通信技术有限公司 | Telecommunication customer tag generation method based on Hadoop distributed technology |
US20170032384A1 (en) * | 2015-07-29 | 2017-02-02 | Geofeedia, Inc. | System and Method for Analyzing Social Media Users Based on User Content Posted from Monitored Locations |
-
2017
- 2017-03-27 CN CN201710191813.7A patent/CN106951552A/en active Pending
Cited By (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107748800A (en) * | 2017-11-15 | 2018-03-02 | 北京易讯通信息技术股份有限公司 | A kind of fusion of distributed real-time data processing government affairs service data and sharing method |
CN108153836A (en) * | 2017-12-14 | 2018-06-12 | 浙江航天恒嘉数据科技有限公司 | A kind of time series data accesses system and method |
CN108133041A (en) * | 2018-01-11 | 2018-06-08 | 四川九洲电器集团有限责任公司 | Data collecting system and method based on web crawlers and data transfer technology |
CN110209422B (en) * | 2018-05-09 | 2021-08-27 | 腾讯科技(深圳)有限公司 | Service processing method, computer equipment and client |
CN110209422A (en) * | 2018-05-09 | 2019-09-06 | 腾讯科技(深圳)有限公司 | A kind of method for processing business, computer equipment and client |
CN109388637A (en) * | 2018-09-21 | 2019-02-26 | 北京京东金融科技控股有限公司 | Data warehouse information processing method, device, system, medium |
CN109388637B (en) * | 2018-09-21 | 2020-09-01 | 京东数字科技控股有限公司 | Data warehouse information processing method, device, system and medium |
CN111694891A (en) * | 2019-03-12 | 2020-09-22 | 马上消费金融股份有限公司 | Data table processing method and device |
CN111694891B (en) * | 2019-03-12 | 2021-01-12 | 马上消费金融股份有限公司 | Data table processing method and device |
CN110162563A (en) * | 2019-05-28 | 2019-08-23 | 深圳市网心科技有限公司 | A kind of data storage method, system and electronic equipment and storage medium |
CN110162563B (en) * | 2019-05-28 | 2023-11-17 | 深圳市网心科技有限公司 | Data warehousing method and system, electronic equipment and storage medium |
CN112416982A (en) * | 2021-01-25 | 2021-02-26 | 北京轻松筹信息技术有限公司 | Method and device for calculating real-time user characteristics |
CN112416982B (en) * | 2021-01-25 | 2021-09-21 | 北京轻松筹信息技术有限公司 | Method and device for calculating real-time user characteristics |
CN113177049A (en) * | 2021-05-13 | 2021-07-27 | 中移智行网络科技有限公司 | Data processing method, device and system |
CN113434376A (en) * | 2021-06-24 | 2021-09-24 | 山东浪潮科学研究院有限公司 | Web log analysis method and device based on NoSQL |
CN113434376B (en) * | 2021-06-24 | 2023-04-11 | 山东浪潮科学研究院有限公司 | Web log analysis method and device based on NoSQL |
CN114490525A (en) * | 2022-02-22 | 2022-05-13 | 北京科杰科技有限公司 | System and method for analyzing and putting out and putting in storage of super-large unstructured text files remotely based on hadoop |
CN114490525B (en) * | 2022-02-22 | 2022-08-02 | 北京科杰科技有限公司 | System and method for analyzing and warehousing of ultra-large unstructured text files based on hadoop remote |
CN115801353A (en) * | 2022-11-03 | 2023-03-14 | 智网安云(武汉)信息技术有限公司 | Linkage script processing method after real-time aggregation of safety event logs based on big data level |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106951552A (en) | | A kind of user behavior data processing method based on Hadoop |
CN103491187B (en) | | A kind of big data united analysis processing method based on cloud computing |
US8260826B2 (en) | | Data processing system and method |
CN105930446B (en) | | A kind of telecom client label generating method based on Hadoop distributed computing technology |
CN105989129B (en) | | Real time data statistical method and device |
CN103246749B (en) | | The matrix database system and its querying method that Based on Distributed calculates |
CN109272155A (en) | | A kind of corporate behavior analysis system based on big data |
CN107577805A (en) | | A kind of business service system towards the analysis of daily record big data |
CN106815338A (en) | | A kind of real-time storage of big data, treatment and inquiry system |
CN107038162A (en) | | Real time data querying method and system based on database journal |
CN107704545A (en) | | Railway distribution net magnanimity information method for stream processing based on Storm Yu Kafka message communicatings |
CN105512167A (en) | | Multi-business user data managing system based on mixed database and method for same |
CN107895046B (en) | | Heterogeneous data integration platform |
CN106126641A (en) | | A kind of real-time recommendation system and method based on Spark |
CN104820670A (en) | | Method for acquiring and storing big data of power information |
CN108021809A (en) | | A kind of data processing method and system |
CN103390038A (en) | | HBase-based incremental index creation and retrieval method |
CN107247799A (en) | | Data processing method, system and its modeling method of compatible a variety of big data storages |
CN110688399A (en) | | Stream type calculation real-time report system and method |
CN106850258A (en) | | A kind of Log Administration System, method and device |
CN107103064A (en) | | Data statistical approach and device |
CN111221791A (en) | | Method for importing multi-source heterogeneous data into data lake |
CN107067322A (en) | | A kind of system and method applied to P2P network loan business data access models |
CN103646051A (en) | | Big-data parallel processing system and method based on column storage |
CN107025298A (en) | | A kind of big data calculates processing system and method in real time |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20170714 |