CN108182213A - A data processing optimization device and method based on a distributed system - Google Patents

A data processing optimization device and method based on a distributed system

Info

Publication number
CN108182213A
CN108182213A (application CN201711382011.0A)
Authority
CN
China
Prior art keywords
local cache
data
caching
cluster
distributed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711382011.0A
Other languages
Chinese (zh)
Inventor
黄晓伟
肖万明
余涵
叶承坤
高建国
Current Assignee
FUJIAN NEW LAND SOFTWARE ENGINEERING Co Ltd
Original Assignee
FUJIAN NEW LAND SOFTWARE ENGINEERING Co Ltd
Priority date
Filing date
Publication date
Application filed by FUJIAN NEW LAND SOFTWARE ENGINEERING Co Ltd
Priority to CN201711382011.0A
Publication of CN108182213A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 - Information retrieval of structured data, e.g. relational data
    • G06F16/24 - Querying
    • G06F16/245 - Query processing
    • G06F16/2455 - Query execution
    • G06F16/24552 - Database cache management
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 - Information retrieval of structured data, e.g. relational data
    • G06F16/27 - Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 - Network arrangements or protocols for supporting network services or applications
    • H04L67/50 - Network services
    • H04L67/56 - Provisioning of proxy services
    • H04L67/568 - Storing data temporarily at an intermediate stage, e.g. caching

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention discloses a data processing optimization device based on a distributed system, comprising a distributed cache cluster, a computing cluster, and a local cache master controller. The distributed cache cluster stores the full set of cached information and is deployed separately from the computing cluster. The computing cluster comprises two or more compute nodes, each containing a local cache, a caching agent, and computing units. The local cache abstracts API operations, has built-in capacity-extension and quota-management functions, is split by business need into multiple in-memory shards whose capacities can be dynamically expanded and quota-managed individually, and is supplied to the computing units as a jar package. The caching agent module monitors the local cache's in-memory shards on its server and synchronizes cached data online. The local cache master controller centrally manages the caches of all server nodes. By merging a distributed cluster cache with local caches, the device makes microsecond-level processing of operations such as association matching and data filtering over massive data possible.

Description

A data processing optimization device and method based on a distributed system
Technical field
The present invention relates to the field of big data processing in computer information technology, and in particular to a data processing optimization device and method based on a distributed system.
Background technology
With the rapid development of the Internet era, people's lives have changed enormously. People work, study, and live on the Internet, and the speed at which data is generated and shared has grown exponentially, causing a sharp increase in data volume. Because the sources and types of this data have become complex and varied, and the volume is enormous, it differs greatly from what traditional data processing methods were designed for.
In traditional data processing, the volume of data to be stored, processed, and analyzed is relatively small, and a relational database can process it efficiently. For massive data, however, traditional techniques can no longer meet current processing demands, so the industry generally uses distributed computing technologies (such as Hadoop, Storm, or Spark) and performs data preprocessing, also called Data Preparation, before in-depth analysis and mining of the massive data.
Data Preparation typically exhibits the following characteristics: (1) The source data volume is large, consisting mainly of signaling (sensor signaling, network-element signaling, etc.) or log information (e-commerce access records, consumption records, etc.). (2) The computing cluster's data throughput is high: the average preprocessing time per record is usually required to reach the tens-of-microseconds level (more than 50 MB of data throughput per second per server). (3) In stream processing, the timeliness requirements on the whole pipeline of data processing and analysis are high; responses must usually be at the second or even millisecond level. This mainly applies to time-sensitive application fields such as real-time marketing, market analysis, and location tracking. (4) Data types are numerous and the information is incomplete, requiring preprocessing such as association-based completion and format standardization. Data for the same analysis subject (a user, user group, etc.) is generated sequentially, so the associated cache is used many times. (5) Some applications need to filter the data to obtain the records relevant to the analysis subjects; the subject information is on the order of millions to tens of millions of entries and requires a dedicated cache. (6) The cached data consists mainly of dimension tables and analysis-subject information; it is relatively stable, does not need real-time updates, and typically has a life cycle of a day or an hour.
Given the features above, Data Preparation over massive data inevitably combines a computing framework with caching technology to complete work such as information completion and transformation. The following schemes are generally used:
Scheme one: each computing unit loads the cache directly, as shown in Figure 1. In a distributed data-processing framework these computing units are individual workers or containers. In batch processing, the computing units reload the cached information for every batch of data, and the cache cannot be shared between computing units. In stream processing the cache only needs to be loaded once, but it still cannot be shared between computing units. This scheme is therefore mainly suitable for small caches; otherwise it wastes computing and memory resources.
Scheme two: a distributed cache is co-deployed with the computing framework, as shown in Figure 2. Introducing a distributed cache (such as Redis or Memcached) solves the cache-capacity limit, but the distributed cache framework consumes computing resources in addition to memory, competing with the computing framework and, in severe cases, degrading preprocessing performance. Moreover, part of the cache still requires cross-node access, which has the same performance problems as in scheme three. Some commercial distributed caches (such as Coherence) automatically migrate cached information to the compute node itself or a neighboring node, which alleviates the cross-node access problem to some extent, but the resource-competition problem remains. This scheme is therefore mainly suitable when the data being processed and the cached data can reside on the same node, i.e., when the processed data can be partitioned by modulus or by service area.
Scheme three: the distributed cache is deployed independently of the computing framework, as shown in Figure 3. This solves the cache-capacity and resource-competition problems, and when the throughput requirement is modest it is a fairly suitable approach. When throughput is high, however, a single query takes milliseconds under current hardware, which cannot meet the required per-second processing throughput. Concurrent multithreaded queries can reduce the average access time of a single query, but the overall CPU load becomes high. Meanwhile, a distributed cache (taking Redis as an example) uses a single-threaded, multiplexed I/O model, so queuing easily occurs when hot data is accessed, causing query performance to drop sharply; it is common for one logical CPU core to run at 100% load while the others sit relatively idle.
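To make the throughput argument above concrete, the following sketch computes the per-record time budget implied by the stated target (over 50 MB of data per second per server) and compares it with a millisecond-level remote query. The 500-byte record size and the 1 ms remote latency are illustrative assumptions, not figures from the patent.

```java
// Illustrative arithmetic only: per-record time budget vs. remote-lookup latency.
public class ThroughputBudget {
    /** Per-record time budget in microseconds for a given byte throughput and record size. */
    public static double budgetMicros(double bytesPerSecond, double recordBytes) {
        return recordBytes / bytesPerSecond * 1e6;
    }

    public static void main(String[] args) {
        // 50 MB/s target, assumed 500-byte records -> 10 us budget per record.
        double budget = budgetMicros(50e6, 500);
        double remoteQueryMicros = 1000.0; // assumed ~1 ms single networked cache query
        System.out.printf("budget: %.1f us/record%n", budget);
        System.out.printf("a 1 ms remote query is %.0fx over budget%n", remoteQueryMicros / budget);
    }
}
```

Even before hot-key queuing is considered, a millisecond-level remote lookup per record overshoots this budget by two orders of magnitude, which is why the invention moves the hot working set into a local cache.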
Invention content
The object of the present invention is to provide a high-throughput, highly real-time data processing optimization device based on a distributed system that solves the problem of large memory waste in existing approaches.
To achieve this goal, the present invention adopts the following technical solution:
A data processing optimization device based on a distributed system comprises a distributed cache cluster, a computing cluster, and a local cache master controller. The distributed cache cluster comprises two or more cache nodes, stores the full set of information, and is separated from the computing cluster;
The computing cluster comprises two or more compute nodes, each containing a local cache, a caching agent, and computing units. The local cache abstracts API operations, has built-in capacity-extension and quota-management functions, is split by business need into multiple in-memory shards with dynamic capacity expansion and quota management per shard, and is supplied to the computing units as a jar package. The caching agent module monitors the local cache's in-memory shards on its server and synchronizes cached data online;
The local cache master controller centrally manages the caches of all server nodes, provides a unified external interface for service operations and memory monitoring, and implements the life-cycle management of the local cache.
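As a rough illustration of the local-cache behavior just described (named in-memory shards created per business need, each with a quota that can be expanded at runtime), the sketch below models that interface in plain Java. All class and method names are invented for illustration and are not the patent's actual jar-package API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the sharded, quota-managed local cache.
public class LocalCacheSketch {
    static final class Shard {
        final Map<String, String> entries = new HashMap<>();
        int quota; // maximum number of entries allowed in this shard
        Shard(int quota) { this.quota = quota; }
    }

    private final Map<String, Shard> shards = new HashMap<>();

    /** Create a named shard with an initial quota (one shard per business domain). */
    public void createShard(String name, int quota) {
        shards.put(name, new Shard(quota));
    }

    /** Grow a shard's quota at runtime (the "dynamic expansion" in the text). */
    public void expandShard(String name, int newQuota) {
        Shard s = shards.get(name);
        s.quota = Math.max(s.quota, newQuota);
    }

    /** Insert a key; returns false when the shard's quota is exhausted. */
    public boolean put(String shard, String key, String value) {
        Shard s = shards.get(shard);
        if (s.entries.size() >= s.quota && !s.entries.containsKey(key)) return false;
        s.entries.put(key, value);
        return true;
    }

    /** Query; null means the caller must fall back to the distributed cache cluster. */
    public String get(String shard, String key) {
        Shard s = shards.get(shard);
        return s == null ? null : s.entries.get(key);
    }

    /** Drop a whole shard (used by the caching agent during cleanup). */
    public void deleteShard(String name) { shards.remove(name); }
}
```

A failed `put` is the signal that, in the method described later, triggers either eviction of old entries or expansion of the shard.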
Wherein, the distributed cache cluster is built on Redis or Memcached and can be scaled linearly.
Wherein, the local cache is Java off-heap memory holding kv data structures and supports cross-JVM access.
Wherein, the API operations include querying, creating caches, and adding, deleting, and modifying caches.
Wherein, the monitoring work includes cleaning and deletion.
Wherein, the life-cycle management of the local cache comprises the following steps:
S01: an external application periodically processes the cached information in the distributed cache cluster;
S02: after processing completes, the external application notifies the local cache master controller, which maps the processed cache information in the distributed cache cluster to the corresponding in-memory shard name in the local cache;
S03: the local cache master controller notifies the caching agent on each compute node to perform the same processing operation;
S04: each caching agent completes the shard cleaning, deletion, or online synchronization of cached data in its local cache and feeds the completion information back to the local cache master controller;
S05: the local cache master controller records the processed state and generates an alarm log if any anomaly occurred.
Wherein, the processing includes updating and status monitoring.
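The steps S01-S05 can be sketched as a small coordination loop: after an external refresh of the distributed cache, a master controller maps the cache name to a local shard name, fans the operation out to every node's caching agent, and logs an alarm for any agent that reports failure. Everything here, including the shard-naming convention, is an assumed minimal model, not the patent's implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the master controller's fan-out in steps S02-S05.
public class MasterControlSketch {
    /** One caching agent per compute node; true means the operation succeeded. */
    interface CachingAgent {
        boolean apply(String shardName, String operation);
    }

    private final List<CachingAgent> agents = new ArrayList<>();
    private final List<String> alarmLog = new ArrayList<>();

    public void register(CachingAgent agent) { agents.add(agent); }

    /** S02: map the distributed-cache name to a shard name; S03-S05: fan out and record. */
    public void onDistributedCacheProcessed(String cacheName, String operation) {
        String shardName = "local_" + cacheName;            // assumed naming convention
        for (CachingAgent agent : agents) {                  // S03: notify every node
            boolean ok = agent.apply(shardName, operation);  // S04: agent does the work
            if (!ok) alarmLog.add("failed: " + shardName + " " + operation); // S05
        }
    }

    public List<String> alarms() { return alarmLog; }
}
```

In this model the agents do the actual cleaning, deletion, or synchronization; the controller only tracks outcomes, matching the division of labor in S04-S05.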
The invention additionally discloses a data processing optimization method based on a distributed system, comprising the following steps:
S100: read the raw data in parallel; the raw data comprises data streams or data files;
S200: according to a data identification code, a computing unit calls the API provided by the local cache to query whether the corresponding in-memory shard contains data for that identification code;
S300: when the local cache contains a pairing code matching the data identification code, complete the raw data with the data corresponding to that identification code, finishing the data preprocessing;
S400: when the local cache contains no pairing code, obtain the pairing code matching the data identification code from the distributed cache cluster;
S500: judge whether the pairing code was successfully written into the corresponding in-memory shard of the local cache; if so, perform step S300; if not, perform step S600;
S600: judge whether the in-memory shard space of the local cache has reached a preset threshold;
S700: if so, clean old cache information out of the shard according to preset parameters and write the pairing code into the local cache again;
S800: if not, expand the shard size according to preset parameters and re-execute step S500.
Wherein, the data includes users' mobile-phone position signaling, web logs, and consumption records.
Wherein, the data identification code is an international mobile subscriber identity.
Beneficial effects of the present invention are:
The device establishes a system combining a mature distributed cluster cache with local caches. The distributed cluster cache stores the full set of cached information, its storage space can be scaled linearly, and its cache nodes are deployed in a mutual-standby manner, ensuring high availability of the cluster while improving concurrency. The local cache stores the valid cache information needed by the data computed on its own machine, so that the average performance of preprocessing operations such as association completion reaches the tens-of-microseconds level per record, while solving problems such as cross-JVM sharing of the local cache, cache-capacity limits, computing-resource competition, and network delay, providing high-throughput, highly real-time data preparation for subsequent links such as real-time analysis and data mining.
By merging the distributed cluster cache with local caches, microsecond-level processing of operations such as association matching and data filtering over massive data becomes possible; combined with a distributed computing framework, the parallel computing power of the cluster is exploited to the greatest extent, providing high-throughput, highly real-time data preparation for subsequent real-time analysis and data mining. The duration of each query during preprocessing drops from the millisecond level to the microsecond level, cache management itself consumes essentially no computing resources, the average performance of preprocessing operations such as association completion reaches the tens-of-microseconds level per record, and the preprocessing potential of each compute node exceeds 50 MB/s.
Description of the drawings
Fig. 1 is a schematic diagram of the prior-art system in which each computing unit loads the cache directly;
Fig. 2 is a structural diagram of the prior-art scheme in which a distributed cache is co-deployed with the computing framework;
Fig. 3 is a structural diagram of the prior-art scheme in which a distributed cache is deployed independently of the computing framework;
Fig. 4 is a structural diagram of the data processing optimization device based on a distributed system according to the present invention;
Fig. 5 is a flow chart of the data processing optimization method based on a distributed system according to the present invention.
Specific embodiment
The present invention is described in detail below with reference to the specific embodiments shown in the drawings. These embodiments do not limit the invention; structural, methodological, or functional transformations made by those of ordinary skill in the art according to these embodiments are all contained within the protection scope of the present invention.
As shown in Fig. 4, an embodiment of the present invention discloses a data processing optimization device based on a distributed system, comprising a distributed cache cluster, a computing cluster, and a local cache master controller;
The distributed cache cluster comprises two or more cache nodes, stores the full set of information, and is separated from the computing cluster. The distributed cache cluster can be scaled linearly according to storage requirements, is built on Redis or Memcached, and high-availability schemes such as pairwise mutual standby can be introduced as required.
The computing cluster comprises two or more compute nodes, each containing a local cache, a caching agent, and computing units. The local cache abstracts API operations, which include querying, creating caches, and adding, deleting, and modifying caches, and has built-in capacity-extension and quota-management functions; it is split by business need into multiple in-memory shards with dynamic capacity expansion and quota management per shard, and is supplied to the computing units as a jar package. The caching agent module monitors the local cache's in-memory shards on its server; the monitoring work includes cleaning, deletion, and online synchronization of cached data. The local cache uses Java off-heap memory, which has the following characteristics: it supports cross-JVM access, so once a piece of data is cached on a server, multiple computing units can access it; off-heap memory is a lightweight design that does not occupy excessive computing resources; it uses a kv data structure, consistent with the cache cluster; and it avoids the performance impact of garbage collection.
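A toy version of the off-heap kv idea described above can be written with a direct ByteBuffer: the values live outside the Java heap (so they add no garbage-collection pressure), while a small on-heap index records each value's offset and length. A real cross-JVM cache would instead back the region with a memory-mapped file so several processes can share it; that part is omitted here, and all names are illustrative.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Toy off-heap kv store: values in a direct buffer, index on heap.
public class OffHeapKv {
    private final ByteBuffer store;                             // off-heap region
    private final Map<String, long[]> index = new HashMap<>();  // key -> {offset, length}

    public OffHeapKv(int capacityBytes) {
        store = ByteBuffer.allocateDirect(capacityBytes);
    }

    /** Append the value off-heap; returns false when capacity is exhausted. */
    public boolean put(String key, String value) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        if (store.remaining() < bytes.length) return false;
        int offset = store.position();
        store.put(bytes);
        index.put(key, new long[] { offset, bytes.length });
        return true;
    }

    /** Copy the value back out of the off-heap region; null on a miss. */
    public String get(String key) {
        long[] loc = index.get(key);
        if (loc == null) return null;
        byte[] out = new byte[(int) loc[1]];
        ByteBuffer view = store.duplicate(); // independent position, shared storage
        view.position((int) loc[0]);
        view.get(out);
        return new String(out, StandardCharsets.UTF_8);
    }
}
```

The append-only layout also hints at why the patent's shards are cleaned or rebuilt wholesale during life-cycle operations rather than updated entry by entry.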
The local cache master controller centrally manages the caches of all server nodes, provides a unified external interface for service operations and memory monitoring, and implements the life-cycle management of the local cache.
The local cache master controller provides the life-cycle management of the local cache as well as the interface for service operations and memory monitoring. The life-cycle management comprises the following steps:
S01: an external application periodically processes the cached information in the distributed cache cluster; the period can be set at the day or hour level according to usage requirements;
S02: after processing completes, the external application notifies the local cache master controller, which maps the processed cache information in the distributed cache cluster to the corresponding in-memory shard name in the local cache;
S03: the local cache master controller notifies the caching agent on each compute node to perform the same processing operation;
S04: each caching agent completes the shard cleaning, deletion, or online synchronization of cached data in its local cache and feeds the completion information back to the local cache master controller;
S05: the local cache master controller records the processed state and generates an alarm log if any anomaly occurred.
Specifically, the local cache life-cycle management includes a local cache update operation, as follows:
S01: an external application periodically updates the cached information in the distributed cache cluster; the period can be set at the day or hour level according to usage requirements;
S02: after the update completes, the external application notifies the local cache master controller, which maps the updated cache information in the distributed cache cluster to the corresponding in-memory shard name in the local cache;
S03: the local cache master controller notifies the caching agent on each compute node to perform the same update operation;
S04: each caching agent completes the shard cleaning, deletion, or online synchronization of cached data in its local cache and feeds the completion information back to the local cache master controller;
S05: the local cache master controller records the updated state and generates an alarm log if any anomaly occurred.
Besides update operations, the local cache life-cycle management can also cover other operations such as status monitoring, which are not repeated here.
Fig. 4 shows a schematic diagram of the two-level-cache massive-data processing system of the data processing optimization device based on a distributed system described in the above embodiment. The whole system comprises the distributed cache cluster, the computing cluster, and the local cache master controller independent of both clusters. The distributed cache cluster and the computing cluster each consist of several nodes; each compute node or cache node on which a module is deployed is a PC server or similar equipment with computing or storage capability. A compute node consists of several computing units, a local cache, and a caching agent, and the local cache can be divided into several in-memory shards according to the service attributes of the derived data, each managing its cache space independently.
Fig. 5 shows a data processing optimization method based on a distributed system disclosed in an embodiment of the present invention, comprising the following steps:
S100: read the raw data in parallel; the raw data comprises data streams or data files;
S200: according to a data identification code, a computing unit calls the API provided by the local cache to query whether the corresponding in-memory shard contains data for that identification code;
S300: when the local cache contains a pairing code matching the data identification code, complete the raw data with the data corresponding to that identification code, finishing the data preprocessing;
S400: when the local cache contains no pairing code, obtain the pairing code matching the data identification code from the distributed cache cluster;
S500: judge whether the pairing code was successfully written into the corresponding in-memory shard of the local cache; if so, perform step S300; if not, perform step S600;
S600: judge whether the in-memory shard space of the local cache has reached a preset threshold;
S700: if so, clean old cache information out of the shard according to preset parameters and write the pairing code into the local cache again;
S800: if not, expand the shard size according to preset parameters and re-execute step S500.
As shown in Fig. 5, the following takes the position signaling of users' mobile phones as an example; source data such as web logs and consumption records also suit this embodiment. In mobile-phone position signaling, the raw position data contains only the user's IMSI number (International Mobile Subscriber Identification Number) and no phone number, which must be completed by association. The cache cluster stores the full set (tens of millions) of correspondences between IMSI numbers and phone numbers. Specifically, an embodiment discloses a data processing optimization method based on a distributed system, comprising the following steps:
S100: read the raw data in parallel; the raw data comprises data streams or data files;
S200: using the IMSI number as the key, a computing unit calls the API provided by the local cache to query whether the corresponding in-memory shard contains data for that IMSI number;
S300: when the local cache contains the key-value pair for the IMSI number, complete the raw data with the phone number corresponding to that IMSI number, finishing the data preprocessing;
S400: when the local cache contains no such pair, obtain the key-value pair for the IMSI number from the distributed cache cluster;
S500: judge whether the IMSI key-value pair was successfully written into the corresponding in-memory shard of the local cache; if so, perform step S300; if not, perform step S600;
S600: judge whether the in-memory shard space of the local cache has reached a preset threshold;
S700: if so, clean old cache information out of the shard according to preset parameters and write the IMSI key-value pair into the local cache again;
S800: if not, expand the shard size according to preset parameters and re-execute step S500.
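The IMSI lookup flow S200-S800 can be sketched as a two-level resolve function: consult the local shard first, fall back to the distributed cluster (modeled here as a plain function) on a miss, and evict when the shard reaches its threshold. For brevity this sketch implements only the eviction branch (S700) and skips the shard-expansion branch (S800); it is an assumed model, not the patent's code, and all names are illustrative.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical two-level IMSI -> phone-number resolver.
public class TwoLevelLookup {
    private final Map<String, String> localShard = new HashMap<>();
    private final int threshold;                        // shard-size threshold (S600)
    private final Function<String, String> distributed; // distributed-cluster fetch (S400)

    public TwoLevelLookup(int threshold, Function<String, String> distributed) {
        this.threshold = threshold;
        this.distributed = distributed;
    }

    /** Returns the phone number for an IMSI, filling the local shard on a miss. */
    public String resolve(String imsi) {
        String phone = localShard.get(imsi);            // S200: query the local shard
        if (phone != null) return phone;                // S300: local hit, complete record
        phone = distributed.apply(imsi);                // S400: fetch from the cluster
        if (phone == null) return null;                 // unknown IMSI
        if (localShard.size() >= threshold) {           // S600: shard at threshold?
            localShard.clear();                         // S700: simplistic full eviction
        }
        localShard.put(imsi, phone);                    // retried write now succeeds
        return phone;
    }

    public int localSize() { return localShard.size(); }
}
```

A production version would evict selectively by age rather than clearing the shard, and would grow the shard (S800) before resorting to eviction; the control flow, not the policy, is the point here.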
The device establishes a system combining a mature distributed cluster cache with local caches. The distributed cluster cache stores the full set of cached information, its storage space can be scaled linearly, and its cache nodes are deployed in a mutual-standby manner, ensuring high availability of the cluster while improving concurrency. The local cache stores the valid cache information needed by the data computed on its own machine, so that the average performance of preprocessing operations such as association completion reaches the tens-of-microseconds level per record, while solving problems such as cross-JVM sharing of the local cache, cache-capacity limits, computing-resource competition, and network delay, providing high-throughput, highly real-time data preparation for subsequent links such as real-time analysis and data mining.
By merging the distributed cluster cache with local caches, microsecond-level processing of operations such as association matching and data filtering over massive data becomes possible; combined with a distributed computing framework, the parallel computing power of the cluster is exploited to the greatest extent, providing high-throughput, highly real-time data preparation for subsequent real-time analysis and data mining. The duration of each query during preprocessing drops from the millisecond level to the microsecond level, cache management itself consumes essentially no computing resources, the average performance of preprocessing operations such as association completion reaches the tens-of-microseconds level per record, and the preprocessing potential of each compute node exceeds 50 MB/s.
It should be appreciated that although this specification is described in terms of embodiments, not every embodiment contains only one independent technical solution. This manner of description is merely for clarity; those skilled in the art should take the specification as a whole, and the technical solutions in the embodiments may be suitably combined to form other embodiments understandable to those skilled in the art.
The detailed descriptions listed above are only specific illustrations of feasible embodiments of the invention and are not intended to limit its protection scope; all equivalent implementations or changes made without departing from the technical spirit of the invention shall be included in its protection scope.

Claims (10)

1. A data processing optimization device based on a distributed system, characterized in that: it comprises a distributed cache cluster, a computing cluster, and a local cache master controller; the distributed cache cluster comprises two or more cache nodes, stores the full set of information, and is separated from the computing cluster;
the computing cluster comprises two or more compute nodes, each comprising a local cache, a caching agent, and computing units; the local cache abstracts API operations, has built-in capacity-extension and quota-management functions, is split by business need into multiple in-memory shards with dynamic capacity expansion and quota management per shard, and is supplied to the computing units as a jar package; the caching agent module monitors the local cache's in-memory shards on its server and synchronizes cached data online;
the local cache master controller centrally manages the caches of all server nodes, provides a unified external interface for service operations and memory monitoring, and implements the life-cycle management of the local cache.
2. The data processing optimization device based on a distributed system according to claim 1, characterized in that: the distributed cache cluster is built on Redis or Memcached and can be scaled linearly.
3. The data processing optimization device based on a distributed system according to claim 1, characterized in that: the local cache is Java off-heap memory holding kv data structures and supports cross-JVM access.
4. The data processing optimization device based on a distributed system according to claim 1, characterized in that: the API operations include querying, creating caches, and adding, deleting, and modifying caches.
5. The data processing optimization device based on a distributed system according to claim 1, characterized in that: the monitoring work includes cleaning and deletion.
6. The data processing optimization device based on a distributed system according to claim 1, wherein the life-cycle management of the local cache comprises the following steps:
S01: the cache information in the distributed cache cluster is periodically processed by an external application;
S02: after the processing is finished, the local cache master controller is notified, and the master controller maps the processed cache information in the distributed cache cluster to the corresponding internal fragment name in the local cache;
S03: the local cache master controller notifies the caching agent on each compute node to perform the same processing operation;
S04: the caching agent completes internal-fragment cleaning, deletion or online synchronization of cached data in the corresponding local cache, and feeds the completion information back to the master controller;
S05: the local cache master controller records the processed state and generates an alarm log if any abnormality exists.
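The S01–S05 life-cycle flow above can be sketched as follows: the master controller maps a processed distributed-cache name to a local shard name, fans the operation out to the agent on every compute node, and records an alarm when a node reports an abnormal result. All names (MasterControl, CacheAgent, the "shard:" prefix) are illustrative assumptions, not from the patent.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the local cache life-cycle notification flow.
public class CacheLifecycle {
    interface CacheAgent {
        // S04: clean/delete/sync the named internal fragment; true on success
        boolean process(String shardName);
    }

    public static final class MasterControl {
        private final List<CacheAgent> agents = new ArrayList<>();
        private final List<String> alarmLog = new ArrayList<>();

        void register(CacheAgent agent) { agents.add(agent); }

        // S02: map the processed distributed-cache entry to a shard name,
        // then S03: notify the agent on every compute node
        void onDistributedCacheProcessed(String cacheName) {
            String shardName = "shard:" + cacheName; // assumed naming scheme
            for (CacheAgent agent : agents) {
                boolean ok = agent.process(shardName);
                if (!ok) alarmLog.add("abnormal: " + shardName); // S05
            }
        }

        List<String> alarms() { return alarmLog; }
    }
}
```

One failed agent produces one alarm entry while the other nodes still complete, matching S05's per-node abnormality recording.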
7. The data processing optimization device based on a distributed system according to claim 6, wherein the processing includes updating and status monitoring.
8. A data processing optimization method based on a distributed system, comprising the following steps:
S100: raw data are read in parallel, the raw data comprising data streams or data files;
S200: according to a data identity code, the computing unit calls the API provided by the local cache to query whether data corresponding to the identity code exist in the corresponding internal fragment of the local cache;
S300: when a pairing code matching the identity code exists in the local cache, the data corresponding to the identity code are added to the raw data, completing the data preprocessing;
S400: when no pairing code exists in the local cache, the pairing code matching the identity code is obtained from the distributed cache cluster;
S500: whether the pairing code is successfully written into the corresponding internal fragment of the local cache is judged; if yes, step S300 is performed; if not, step S600 is performed;
S600: whether the internal-fragment space of the local cache has reached a preset threshold is judged;
S700: if yes, old cache information in the internal fragment is cleaned according to preset parameters, and the pairing code is written into the local cache again;
S800: if not, the internal-fragment size is expanded according to preset parameters, and step S500 is re-executed.
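The lookup path of steps S200–S800 can be sketched as follows: try the local shard first, fall back to the distributed cache cluster on a miss, then write the fetched pairing code back locally, cleaning the shard when it reaches a size threshold. This is a simplified single-shard sketch; the class name LookupFlow and the whole-shard `clear()` cleanup policy are illustrative assumptions, and `putRemote` merely stands in for data already loaded into the Redis/Memcached cluster.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the S200-S800 local/distributed lookup flow.
public class LookupFlow {
    private final Map<String, String> localShard = new HashMap<>();
    private final Map<String, String> distributedCache = new HashMap<>();
    private final int threshold;

    public LookupFlow(int threshold) { this.threshold = threshold; }

    // stand-in for data residing in the distributed cache cluster
    public void putRemote(String identCode, String pairingCode) {
        distributedCache.put(identCode, pairingCode);
    }

    public String lookup(String identCode) {
        String code = localShard.get(identCode);   // S200: query local shard
        if (code != null) return code;             // S300: local hit
        code = distributedCache.get(identCode);    // S400: fetch from cluster
        if (code == null) return null;             // no pairing code anywhere
        if (localShard.size() >= threshold) {      // S600: threshold reached?
            localShard.clear();                    // S700: simplified cleanup
        }
        localShard.put(identCode, code);           // S500: write back locally
        return code;
    }
}
```

The second lookup of the same identity code is served from the local shard without touching the distributed cluster, which is the latency saving the method is built around.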
9. The data processing optimization method based on a distributed system according to claim 8, wherein the data include the user's mobile-phone position signaling, web logs and consumption records.
10. The data processing optimization method based on a distributed system according to claim 7, 8 or 9, wherein the data identity code is an international mobile subscriber identity.
CN201711382011.0A 2017-12-20 2017-12-20 A kind of data processing optimization device and method based on distributed system Pending CN108182213A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711382011.0A CN108182213A (en) 2017-12-20 2017-12-20 A kind of data processing optimization device and method based on distributed system

Publications (1)

Publication Number Publication Date
CN108182213A true CN108182213A (en) 2018-06-19

Family

ID=62546538

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711382011.0A Pending CN108182213A (en) 2017-12-20 2017-12-20 A kind of data processing optimization device and method based on distributed system

Country Status (1)

Country Link
CN (1) CN108182213A (en)


Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101146127A (en) * 2007-10-30 2008-03-19 金蝶软件(中国)有限公司 A client buffer update method and device in distributed system
US8195610B1 (en) * 2007-05-08 2012-06-05 IdeaBlade, Inc. Method and apparatus for cache management of distributed objects
CN103297485A (en) * 2012-03-05 2013-09-11 日电(中国)有限公司 Distributed cache automatic management system and distributed cache automatic management method
CN104361030A (en) * 2014-10-24 2015-02-18 西安未来国际信息股份有限公司 Distributed cache architecture with task distribution function and cache method
CN106021468A (en) * 2016-05-17 2016-10-12 上海携程商务有限公司 Updating method and system for distributed caches and local caches
CN106790705A (en) * 2017-02-27 2017-05-31 郑州云海信息技术有限公司 A kind of Distributed Application local cache realizes system and implementation method
CN107479829A (en) * 2017-08-03 2017-12-15 杭州铭师堂教育科技发展有限公司 A kind of Redis cluster mass datas based on message queue quickly clear up system and method


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
YOO R.M.: "Scalable MapReduce on a large-scale shared-memory system", IEEE *
宋杰 (Song Jie): "Research Progress of MapReduce Big-Data Processing Platforms and Algorithms", 《软件学报》 (Journal of Software) *

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110858199A (en) * 2018-08-08 2020-03-03 北京京东尚科信息技术有限公司 Document data distributed computing method and device
CN109189829A (en) * 2018-08-20 2019-01-11 广州知弘科技有限公司 Information safety system and method based on big data
CN109669737B (en) * 2018-12-19 2023-04-18 百度在线网络技术(北京)有限公司 Application processing method, device, equipment and medium
CN109669737A (en) * 2018-12-19 2019-04-23 百度在线网络技术(北京)有限公司 Application processing method, device, equipment and medium
CN112148202A (en) * 2019-06-26 2020-12-29 杭州海康威视数字技术股份有限公司 Training sample reading method and device
CN112148202B (en) * 2019-06-26 2023-05-26 杭州海康威视数字技术股份有限公司 Training sample reading method and device
CN112693502A (en) * 2019-10-23 2021-04-23 上海宝信软件股份有限公司 Urban rail transit monitoring system and method based on big data architecture
CN110795632A (en) * 2019-10-30 2020-02-14 北京达佳互联信息技术有限公司 State query method and device and electronic equipment
CN111324670A (en) * 2020-02-27 2020-06-23 中国邮政储蓄银行股份有限公司 Method and system for separate deployment of computing storage based on HDFS (Hadoop distributed File System) and Vertica
CN111538739A (en) * 2020-04-28 2020-08-14 北京思特奇信息技术股份有限公司 WSG-based automatic synchronization method and system for service gateway
CN111538739B (en) * 2020-04-28 2023-11-17 北京思特奇信息技术股份有限公司 Method and system for automatically synchronizing service gateway based on WSG
CN113032437A (en) * 2021-04-16 2021-06-25 建信金融科技有限责任公司 Caching method, caching device, caching medium and electronic equipment based on distributed database
CN113032437B (en) * 2021-04-16 2023-06-02 建信金融科技有限责任公司 Caching method and device based on distributed database, medium and electronic equipment
CN113220722A (en) * 2021-04-26 2021-08-06 深圳市云网万店科技有限公司 Data query method and device, computer equipment and storage medium
CN113127741A (en) * 2021-04-29 2021-07-16 杭州弧途科技有限公司 Cache method for reading and writing data of mass users and posts in part-time post recommendation system
CN113672583A (en) * 2021-08-20 2021-11-19 浩鲸云计算科技股份有限公司 Big data multi-data source analysis method and system based on storage and calculation separation
CN113934759A (en) * 2021-10-15 2022-01-14 东北大学 Data caching device and system for fusion calculation in Gaia system
CN113934759B (en) * 2021-10-15 2024-05-17 东北大学 Data caching device and system for fusion calculation in Gaia system
CN116028525A (en) * 2023-03-31 2023-04-28 成都四方伟业软件股份有限公司 Intelligent management method for data slicing

Similar Documents

Publication Publication Date Title
CN108182213A (en) A kind of data processing optimization device and method based on distributed system
Ju et al. iGraph: an incremental data processing system for dynamic graph
CN107315776A (en) A kind of data management system based on cloud computing
CN110727727B (en) Statistical method and device for database
CN110414771A (en) Update method, device, server and the client of enterprise organization structure data
Jeong et al. Anomaly teletraffic intrusion detection systems on hadoop-based platforms: A survey of some problems and solutions
Narkhede et al. HMR log analyzer: Analyze web application logs over Hadoop MapReduce
CN104317957B (en) A kind of open platform of report form processing, system and report processing method
CN110784498B (en) Personalized data disaster tolerance method and device
CN106815254A (en) A kind of data processing method and device
CN104850627A (en) Method and apparatus for performing paging display
CN104636286A (en) Data access method and equipment
CN105069151A (en) HBase secondary index construction apparatus and method
CN104270412A (en) Three-level caching method based on Hadoop distributed file system
CN112131305A (en) Account processing system
CN111104406A (en) Hierarchical service data storage method and device, computer equipment and storage medium
Elagib et al. Big data analysis solutions using MapReduce framework
Mukherjee Synthesis of non-replicated dynamic fragment allocation algorithm in distributed database systems
CN100485640C (en) Cache for an enterprise software system
Ravindra et al. Latency aware elastic switching-based stream processing over compressed data streams
US11609910B1 (en) Automatically refreshing materialized views according to performance benefit
CN110134698A (en) Data managing method and Related product
US20230094293A1 (en) Method and apparatus for constructing recommendation model and neural network model, electronic device, and storage medium
Mukherjee Non-replicated dynamic fragment allocation in distributed database systems
CN115587147A (en) Data processing method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20180619