CN104331421A - High-efficiency processing method and system for big data - Google Patents

High-efficiency processing method and system for big data

Publication number
CN104331421A
Authority
CN
China
Prior art keywords
data
task
index
module
resource
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410540392.0A
Other languages
Chinese (zh)
Inventor
王佐成
任子晖
马韵洁
张凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui Sun Create Electronic Co Ltd
Original Assignee
Anhui Sun Create Electronic Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui Sun Create Electronic Co Ltd
Priority to CN201410540392.0A
Publication of CN104331421A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/248 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2272 Management thereof

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a high-efficiency processing method for big data, which comprises the following steps that a data node receives data to be stored; the data node stores the data, an index is simultaneously created according to a business scenario and is stored in a memory, and the data is gradually stored in a disk by index curing; a user inputs a task request, and an SQL (Structured Query Language) engine implements rapid retrieval of the data according to the created index and outputs the data to a computational node; a task processing module of a management node executes task scheduling, applies for resources to a resource management module and determines a spare computational node, and the spare computational node processes the data; the finally processed data is shown for the user. The invention also discloses the high-efficiency processing system for the big data. According to the invention, all processing is executed concurrently; hardware equipment of a computer is utilized to the greatest extent; processing efficiency is greatly improved; the user can more rapidly obtain a processing result when a task is executed.

Description

A high-efficiency processing method and system for big data
Technical field
The present invention relates to the field of big data application processing in computing, and in particular to a high-efficiency processing method and system for big data.
Background art
With the widespread rollout of large-scale projects such as safe cities and smart cities, data collection and data fusion have developed further, and the volume of data to be processed has reached the terabyte and petabyte level. Processing data at this scale raises a series of practical problems: the technical architecture, processing capacity and processing model of conventional relational databases are increasingly unable to meet user needs at such volumes.
The development of cloud computing and big data technology offers a good route to processing massive data. The Hadoop framework in particular achieves storage and computation of large data volumes by combining parallel computing (MapReduce) with distributed storage (HDFS). However, because HDFS does not support direct processing with Structured Query Language (SQL), data stored in HDFS is difficult to process directly, and every computing task must ultimately be converted to run on the MapReduce framework. Its management node (JobTracker) carries a heavy task load, runs inefficiently, and is prone to becoming a single point of failure. How to process massive data quickly and conveniently, and how to increase system availability while improving task processing efficiency, are problems in urgent need of a solution.
Summary of the invention
The primary object of the present invention is to provide a high-efficiency processing method for big data that achieves fast, efficient processing during the storage, retrieval and computation of big data.
To achieve this object, the present invention adopts the following technical solution: a high-efficiency processing method for big data, comprising the following steps in order:
(1) a data node receives the data to be stored;
(2) the data node stores the data and, at the same time, creates an index according to the business scenario and keeps it in memory; through index curing, the index is gradually persisted to disk;
(3) a user submits a task request, and the SQL engine retrieves data quickly using the created index and outputs the data to a computing node;
(4) the task processing module of the management node performs task scheduling, requests resources from the resource management module, and determines an idle computing node, which processes the data;
(5) the final processed data is presented to the user.
The data types received by the data node include structured, semi-structured and unstructured data.
When data is stored and the index is created, index rules are first created according to the business scenario. The received data is then stored on hard disk while, on top of the distributed file system, the index is built with the blur+lencense components. Index fields are set up for the business application scenario, and index data that was formed more recently and is used more heavily is selected and kept in the memory module.
For retrieval, a query request is submitted and the control module analyzes the submitted request. The control module uses the SQL engine to perform automatic semantic recognition on the query conditions and first searches for the target in the index held in the memory module. The index entry that is found is used to fetch the raw data from disk, and the data is returned and presented to the user. If the target is not found in memory, the search continues in the disk index storage area.
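The memory-first lookup with a disk fallback described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the class name `TieredIndex` and the use of plain dictionaries for the memory index, the disk index storage area and the raw data store are all assumptions made for the example.

```python
from typing import Optional


class TieredIndex:
    """Hypothetical two-tier index: a hot in-memory tier in front of a disk tier."""

    def __init__(self, memory_index: dict, disk_index: dict, raw_store: dict):
        self.memory_index = memory_index  # key -> record id (entries kept in memory)
        self.disk_index = disk_index      # key -> record id (entries cured to disk)
        self.raw_store = raw_store        # record id -> raw data (stands in for disk)

    def lookup(self, key: str) -> Optional[dict]:
        # 1. Search the index held in memory first.
        record_id = self.memory_index.get(key)
        if record_id is None:
            # 2. Not found in memory: search the disk index storage area.
            record_id = self.disk_index.get(key)
        if record_id is None:
            return None
        # 3. Use the index entry to fetch the raw data.
        return self.raw_store.get(record_id)
```

A caller only sees `lookup`; whether the hit came from memory or from disk is hidden, which matches the text's description of a single query path with two storage tiers.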
The task processing module requests resources from the resource management module according to the priority and complexity of the task. The resource management module uses a scheduling algorithm to select concrete resources for the task and returns them to the task processing module, which then dispatches the task to the corresponding computing node.
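The interaction between the two modules can be sketched as below. The source does not specify the scheduling algorithm, so the "pick the node with the most free capacity" rule, the integer resource units, and the class names `ResourceManager` and `TaskProcessor` are assumptions chosen only to make the request/grant/dispatch flow concrete.

```python
import heapq
import itertools


class ResourceManager:
    """Tracks free capacity per computing node and grants it on request."""

    def __init__(self, free_units: dict):
        self.free_units = dict(free_units)  # node name -> free resource units

    def allocate(self, units_needed: int):
        # Assumed scheduling rule: choose the node with the most free capacity.
        node = max(self.free_units, key=self.free_units.get, default=None)
        if node is None or self.free_units[node] < units_needed:
            return None
        self.free_units[node] -= units_needed
        return node


class TaskProcessor:
    """Queues tasks by priority and asks the resource manager for a node."""

    def __init__(self, resource_manager: ResourceManager):
        self.rm = resource_manager
        self._queue = []
        self._seq = itertools.count()  # tie-breaker: submission order

    def submit(self, name: str, priority: int, complexity: int):
        # Higher priority is served first; complexity is the units requested.
        heapq.heappush(self._queue, (-priority, next(self._seq), name, complexity))

    def dispatch(self):
        """Return (task, node) assignments; tasks that do not fit stay queued."""
        assignments, deferred = [], []
        while self._queue:
            item = heapq.heappop(self._queue)
            _, _, name, complexity = item
            node = self.rm.allocate(complexity)
            if node is None:
                deferred.append(item)
            else:
                assignments.append((name, node))
        for item in deferred:
            heapq.heappush(self._queue, item)
        return assignments
```

Keeping allocation in one class and queueing in another mirrors the separation the text describes: the task side never inspects node state directly, it only asks for resources and receives a node.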
The index is first stored in the memory module. The memory working mechanism cures index files that exceed the memory capacity to disk, where they are stored as distributed files with multiple replicas. The mechanism for writing index files to disk decides what to cure based on the size of the memory storage area, the order in which the indexes were formed, and an index usage parameter: the indexes that were formed earliest and are used least are cured to disk first, and index files cured to disk are stored in distributed form.
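A minimal sketch of such a curing policy is shown below. It assumes a fixed entry-count capacity, a per-entry use counter, and a victim rule of "lowest use count, then earliest creation"; the source names the criteria but not how they are combined, so that ordering is an assumption. The `disk` dictionary merely stands in for distributed, multi-replica index files.

```python
import itertools


class IndexBuffer:
    """Hypothetical in-memory index buffer that cures cold entries to disk."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.memory = {}  # key -> [value, use_count, creation_order]
        self.disk = {}    # stand-in for distributed index files on disk
        self._seq = itertools.count()

    def put(self, key, value):
        self.memory[key] = [value, 0, next(self._seq)]
        # Cure entries until the buffer fits in its memory budget again.
        while len(self.memory) > self.capacity:
            self._cure_one()

    def get(self, key):
        entry = self.memory.get(key)
        if entry is not None:
            entry[1] += 1  # record usage so hot entries stay in memory
            return entry[0]
        return self.disk.get(key)

    def _cure_one(self):
        # Assumed rule: least used first, ties broken by earliest creation.
        victim = min(self.memory, key=lambda k: (self.memory[k][1], self.memory[k][2]))
        self.disk[victim] = self.memory.pop(victim)[0]
```

With this rule a brand-new entry can be cured immediately if every older entry has already been used; a real system might protect recent entries for a grace period, which the source does not discuss.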
Another object of the present invention is to provide a high-efficiency processing system for big data, comprising:
a storage and index creation module, in which the data node stores the received data and, at the same time, creates an index according to the business scenario; the index file is first kept in the memory module and then gradually persisted to disk;
a retrieval module, in which the SQL engine retrieves data quickly using the created index and outputs the data to a computing node;
a processing module, which performs task scheduling, requests and manages resources, and is also responsible for splitting tasks, processing them, merging results and restarting failed tasks, until the task is completed.
The processing module comprises:
a resource management module, which manages the resources of the computing module; through a client on each computing node it senses the node's resource usage in a timely manner and is ready at any time to allocate resources to tasks dynamically;
a task processing module, which receives tasks and requests resources from the resource management module according to the priority and complexity of each task; the resource management module uses a scheduling algorithm to select concrete resources for the task and returns them to the task processing module; the task processing module passes the task to the designated computing module and is also responsible for splitting tasks, processing them, merging results and restarting failed tasks, until the task is completed;
a computing module, which is the physical or virtual resource node that actually executes the task.
As the above technical solution shows, the present invention creates indexes on each data node using multiple threads. Each data node has a memory buffer that stores the created indexes. When the indexes reach a certain volume, historical index data and infrequently used index records are cured to disk by a transfer mechanism and stored in distributed form, which ensures availability and improves the high availability of the data. An SQL engine provides real-time, fast queries over the index. The management node separates the resource management module from the task processing module: resource management handles the management and scheduling of resources in the cluster, while the task processing module handles resource requests, task splitting, result merging, task status maintenance and result output for all tasks. All processing in the present invention is executed concurrently, which makes the fullest use of the computer hardware, greatly improves processing efficiency, and lets users obtain results faster when executing tasks.
Brief description of the drawings
Fig. 1 is a flowchart of the method of the present invention.
Fig. 2 is a flowchart of data storage and index creation in the present invention.
Fig. 3 is a flowchart of retrieval in the present invention.
Fig. 4 is a flowchart of task processing in the present invention.
Detailed description
A high-efficiency processing method for big data proceeds as follows. First, a data node receives the data to be stored. Second, the data node stores the data and, at the same time, creates an index according to the business scenario and keeps it in memory; through index curing, the index is gradually persisted to disk. Third, a user submits a task request, and the SQL engine retrieves data quickly using the created index and outputs the data to a computing node. Next, the task processing module of the management node performs task scheduling, requests resources from the resource management module, and determines an idle computing node, which processes the data. Finally, the final processed data is presented to the user. The data types received by the data node include structured, semi-structured and unstructured data, as shown in Figure 1.
As shown in Figure 1, the data node stores the data to be stored and, on top of HDFS, builds the index with the blur+lencense components. Index fields are set up for the business application scenario, and the index is built by selecting valuable data in chronological order. Once the index has been created, it can be searched; the Squirre-SQL component is used to perform SQL operations and to display the data in structured form. The processing module handles tasks quickly and efficiently. The management node separates its two main functions, resource management and task processing, into a resource management module and a task processing module. The resource management module handles resource allocation, resource status monitoring and resource reclamation; the task processing module handles requesting and using resources. This solves the problem that the original management node was heavily loaded, inefficient and prone to crashing.
As shown in Figure 2, when data is stored and the index is created, index rules are first created according to the business scenario. The received data is then stored on hard disk while, on top of the distributed file system, the index is built with the blur+lencense components. Index fields are set up for the business application scenario, and index data that was formed more recently and is used more heavily is selected and kept in the memory module. The index is first stored in the memory module. The memory working mechanism cures index files that exceed the memory capacity to disk, where they are stored as distributed files with multiple replicas. The mechanism for writing index files to disk decides what to cure based on the size of the memory storage area, the order in which the indexes were formed, and an index usage parameter: the indexes that were formed earliest and are used least are cured to disk first, and index files cured to disk are stored in distributed form. In this way the index for the most heavily used business data always stays in memory, ready for fast use.
As shown in Figure 2, the index is built according to business rules: it is created for a concrete business and serves the business application directly. While the data node stores the data, it builds the index using the existing rules; the data is stored on disk, and the generated metadata is stored on the management node. On top of the distributed file system HDFS, the data node starts a process that creates the data index using blur+lencense. The index is first stored in the memory module, which maintains a certain memory capacity. To ensure high availability of the data, the index is also cured to hard disk and stored as distributed files.
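Building the index at the same moment the data is stored can be illustrated with a small inverted index. This sketch does not use the components named in the text; the function `store_and_index`, the list standing in for disk storage, and the idea that a business rule is simply a list of field names to index are all assumptions for the example.

```python
from collections import defaultdict


def store_and_index(records, indexed_fields, data_store, index):
    """Store each record and index the fields selected by the business rule.

    data_store: list standing in for on-disk storage; position = record id.
    index: maps (field name, field value) -> list of record ids.
    """
    for record in records:
        record_id = len(data_store)
        data_store.append(record)  # stands in for the write to disk
        for field_name in indexed_fields:
            if field_name in record:
                index[(field_name, record[field_name])].append(record_id)
    return index


# Hypothetical business rule: a traffic scenario indexes plate and camera.
store = []
idx = store_and_index(
    [
        {"plate": "WA12345", "camera": "c1", "speed": 62},
        {"plate": "WB12345", "camera": "c1", "speed": 48},
    ],
    indexed_fields=["plate", "camera"],
    data_store=store,
    index=defaultdict(list),
)
```

Only the fields named by the rule are indexed (here `speed` is stored but not indexed), which is one way to read the text's point that the index serves a concrete business scenario rather than every attribute.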
As shown in Figure 3, for retrieval a query request is submitted, for example a partially known vehicle license plate. The control module analyzes the submitted request. It uses the SQL engine to perform automatic semantic recognition on the query conditions and first searches for the target, such as the vehicle's license plate information, in the index held in the memory module. The index entry that is found is used to fetch the raw data from disk, and the data is returned and presented to the user. If the target is not found in memory, the search continues in the disk index storage area. In a simple business environment the retrieved data can be returned to the user directly; in a complex business environment it can instead be returned after processing by the task processing module and resource management module of the management node. The disk index storage module in Fig. 3 is the disk storage in Fig. 1.
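A partially known license plate maps naturally onto an SQL pattern match. The sketch below uses Python's built-in `sqlite3` purely as a stand-in for the SQL engine in the text; the `vehicles` table, its columns, the sample rows and the `?` placeholder for an unreadable character are all made up for illustration.

```python
import sqlite3


def find_plates(conn, partial_plate, unknown="?"):
    """Return (plate, region) rows matching a partially known plate.

    Each `unknown` character stands for exactly one unreadable character and
    is translated to the SQL LIKE single-character wildcard `_`.
    """
    pattern = partial_plate.replace(unknown, "_")
    cursor = conn.execute(
        "SELECT plate, region FROM vehicles WHERE plate LIKE ? ORDER BY plate",
        (pattern,),
    )
    return cursor.fetchall()


def demo_connection():
    """Build a tiny in-memory table with made-up rows for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE vehicles (plate TEXT, region TEXT)")
    conn.executemany(
        "INSERT INTO vehicles VALUES (?, ?)",
        [("WA12345", "north"), ("WA12845", "south"), ("WB12345", "east")],
    )
    return conn
```

For example, `find_plates(demo_connection(), "WA12?45")` returns the two rows whose plates differ only in the unreadable position. The pattern is passed as a bound parameter rather than formatted into the SQL string, so user input cannot alter the statement.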
As shown in Figure 4, the task processing module requests resources from the resource management module according to the priority and complexity of the task. The resource management module uses a scheduling algorithm to select concrete resources for the task and returns them to the task processing module, which then dispatches the task to the corresponding computing node.
As shown in Figure 1, the system comprises: a storage and index creation module, in which the data node stores the received data and, at the same time, creates an index according to the business scenario, with the index file first kept in the memory module and then gradually persisted to disk; a retrieval module, in which the SQL engine retrieves data quickly using the created index and outputs the data to a computing node; and a processing module, which performs task scheduling, requests and manages resources, and is also responsible for splitting tasks, processing them, merging results and restarting failed tasks, until the task is completed.
The processing module comprises: a resource management module, which manages the resources of the computing module and, through a client on each computing node, senses the node's resource usage in a timely manner and is ready at any time to allocate resources to tasks dynamically; a task processing module, which receives tasks and requests resources from the resource management module according to the priority and complexity of each task, whereupon the resource management module uses a scheduling algorithm to select concrete resources and returns them to the task processing module, which passes the task to the designated computing module and is also responsible for splitting tasks, processing them, merging results and restarting failed tasks, until the task is completed; and a computing module, which is the physical or virtual resource node that actually executes the task.
The resource management module manages the resources of the computing module. Through the client on each machine, the resource management node can sense the resource usage of the computing nodes in a timely manner. The resources covered include memory, CPU, disk and network, so the node has a clear, real-time picture of which resources can be scheduled and is ready at any time to allocate resources to tasks dynamically. A task is a concrete application, for example matching an incomplete license plate entered at the front end against the vehicle records in the big database; such a task is first picked up by the task processing module. The computing module is the physical or virtual resource node that actually executes the task. The resource management module obtains each computing node's load information, health status and similar data through the client deployed on the computing module, and the task processing module dispatches tasks to the computing nodes. Every task is first received by the task processing module, which requests resources from the resource management module according to the priority and complexity of the task. The resource management module uses a scheduling algorithm to select concrete resources for the task and returns them to the task processing module. The task processing module passes the task to the designated computing module and is also responsible for splitting tasks, processing them, merging results and restarting failed tasks, until the task is completed.
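The split, process, merge and restart responsibilities of the task processing module can be sketched as follows. The thread pool stands in for the computing nodes, and the function names, the chunking rule and the fixed retry limit are assumptions for illustration; the source does not say how tasks are split or how many restarts are allowed.

```python
from concurrent.futures import ThreadPoolExecutor


def split(items, n_parts):
    """Cut the input into at most n_parts contiguous chunks."""
    size = max(1, -(-len(items) // n_parts))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def run_with_restart(worker, chunk, max_attempts=3):
    """Run one sub-task, restarting it if it raises, up to max_attempts."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return worker(chunk)
        except Exception as exc:  # a failed sub-task is simply retried
            last_error = exc
    raise RuntimeError("sub-task failed after retries") from last_error


def process_task(items, worker, merge, n_parts=4):
    """Split a task, process the chunks concurrently, then merge the results."""
    chunks = split(items, n_parts)
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        partials = list(pool.map(lambda c: run_with_restart(worker, c), chunks))
    return merge(partials)
```

For instance, `process_task(list(range(10)), sum, sum)` sums each chunk on a worker thread and then sums the partial results. Restarting only the failed chunk, rather than the whole task, is the design choice that keeps a single failure from wasting the work already completed on other nodes.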
In summary, all processing in the present invention is executed concurrently, which makes the fullest use of the computer hardware, greatly improves processing efficiency, and lets users obtain results faster when executing tasks.

Claims (8)

1. A high-efficiency processing method for big data, the method comprising the following steps in order:
(1) a data node receives the data to be stored;
(2) the data node stores the data and, at the same time, creates an index according to the business scenario and keeps it in memory; through index curing, the index is gradually persisted to disk;
(3) a user submits a task request, and the SQL engine retrieves data quickly using the created index and outputs the data to a computing node;
(4) the task processing module of the management node performs task scheduling, requests resources from the resource management module, and determines an idle computing node, which processes the data;
(5) the final processed data is presented to the user.
2. The high-efficiency processing method for big data according to claim 1, characterized in that: the data types received by the data node include structured, semi-structured and unstructured data.
3. The high-efficiency processing method for big data according to claim 1, characterized in that: when data is stored and the index is created, index rules are first created according to the business scenario; the received data is then stored on hard disk while, on top of the distributed file system, the index is built with the blur+lencense components; index fields are set up for the business application scenario, and index data that was formed more recently and is used more heavily is selected and kept in the memory module.
4. The high-efficiency processing method for big data according to claim 1, characterized in that: for retrieval, a query request is submitted and the control module analyzes the submitted request; the control module uses the SQL engine to perform automatic semantic recognition on the query conditions and first searches for the target in the index held in the memory module; the index entry that is found is used to fetch the raw data from disk, and the data is returned and presented to the user; if the target is not found, the search continues in the disk index storage area.
5. The high-efficiency processing method for big data according to claim 1, characterized in that: the task processing module requests resources from the resource management module according to the priority and complexity of the task; the resource management module uses a scheduling algorithm to select concrete resources for the task and returns them to the task processing module; and the task processing module dispatches the task to the corresponding computing node.
6. The high-efficiency processing method for big data according to claim 3, characterized in that: the index is first stored in the memory module; the memory working mechanism cures index files that exceed the memory capacity to disk, where they are stored as distributed files with multiple replicas; the mechanism for writing index files to disk decides what to cure based on the size of the memory storage area, the order in which the indexes were formed, and an index usage parameter; the indexes that were formed earliest and are used least are cured to disk first; and index files cured to disk are stored in distributed form.
7. A high-efficiency processing system for big data, characterized in that it comprises:
a storage and index creation module, in which the data node stores the received data and, at the same time, creates an index according to the business scenario; the index file is first kept in the memory module and then gradually persisted to disk;
a retrieval module, in which the SQL engine retrieves data quickly using the created index and outputs the data to a computing node;
a processing module, which performs task scheduling, requests and manages resources, and is also responsible for splitting tasks, processing them, merging results and restarting failed tasks, until the task is completed.
8. The high-efficiency processing system for big data according to claim 7, characterized in that the processing module comprises:
a resource management module, which manages the resources of the computing module; through a client on each computing node it senses the node's resource usage in a timely manner and is ready at any time to allocate resources to tasks dynamically;
a task processing module, which receives tasks and requests resources from the resource management module according to the priority and complexity of each task; the resource management module uses a scheduling algorithm to select concrete resources for the task and returns them to the task processing module; the task processing module passes the task to the designated computing module and is also responsible for splitting tasks, processing them, merging results and restarting failed tasks, until the task is completed;
a computing module, which is the physical or virtual resource node that actually executes the task.
CN201410540392.0A 2014-10-14 2014-10-14 High-efficiency processing method and system for big data Pending CN104331421A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410540392.0A CN104331421A (en) 2014-10-14 2014-10-14 High-efficiency processing method and system for big data

Publications (1)

Publication Number Publication Date
CN104331421A true CN104331421A (en) 2015-02-04

Family

ID=52406148

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410540392.0A Pending CN104331421A (en) 2014-10-14 2014-10-14 High-efficiency processing method and system for big data

Country Status (1)

Country Link
CN (1) CN104331421A (en)

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104765851A (en) * 2015-04-21 2015-07-08 成都博元时代软件有限公司 Big data analysis and extraction method
CN104778259A (en) * 2015-04-21 2015-07-15 成都博元时代软件有限公司 High-efficiency data analyzing and processing method
CN105005622A (en) * 2015-07-24 2015-10-28 肖华 Method for high-speed storage of high-fidelity continuous-frame queries and image output method thereof
CN105550025A (en) * 2015-12-08 2016-05-04 北京航空航天大学 Distributed IaaS (Infrastructure as a Service) scheduling method and system
CN105740068A (en) * 2016-01-27 2016-07-06 中国科学院计算技术研究所 Big data platform oriented and memory data locality based scheduling method and system
CN106682167A (en) * 2016-12-26 2017-05-17 努比亚技术有限公司 User behavior data statistics device and method
WO2017127976A1 (en) * 2016-01-25 2017-08-03 华为技术有限公司 Method for training and scheduling incremental learning cloud system and related device
CN107015946A (en) * 2016-01-27 2017-08-04 常州普适信息科技有限公司 Distributed high-order SVD and its incremental computations a kind of method
CN108153642A (en) * 2016-12-02 2018-06-12 航天星图科技(北京)有限公司 A kind of method that selection calculate node is loaded according to operation
CN109213743A (en) * 2017-06-30 2019-01-15 北京京东尚科信息技术有限公司 A kind of data query method and apparatus
CN109522053A (en) * 2017-09-20 2019-03-26 阿里巴巴集团控股有限公司 A kind of massive parallel processing and data processing method
CN111221865A (en) * 2020-01-09 2020-06-02 上海合阔信息技术有限公司 Recipe query method and device, electronic equipment and storage medium
CN111277900A (en) * 2018-12-05 2020-06-12 深圳市茁壮网络股份有限公司 Starting method and device of set top box
CN111338768A (en) * 2020-02-03 2020-06-26 重庆特斯联智慧科技股份有限公司 Public security resource scheduling system utilizing urban brain
CN112100146A (en) * 2020-09-21 2020-12-18 重庆紫光华山智安科技有限公司 Efficient erasure correction distributed storage writing method, system, medium and terminal

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008152502A (en) * 2006-12-18 2008-07-03 Sharp Corp Document image retrieval device and program
CN101241469A (en) * 2007-02-05 2008-08-13 力博特公司 Method and device for storing and reading data in embedded system
CN101340331A (en) * 2007-07-06 2009-01-07 中国电信股份有限公司 Method for executing system task by idle terminal in P2P network
CN101853287A (en) * 2010-05-24 2010-10-06 南京高普科技有限公司 Data compression quick retrieval file system and method thereof
CN103412933A (en) * 2013-08-20 2013-11-27 南京物联网应用研究院有限公司 Cloud search platform

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104778259A (en) * 2015-04-21 2015-07-15 成都博元时代软件有限公司 High-efficiency data analyzing and processing method
CN104765851A (en) * 2015-04-21 2015-07-08 成都博元时代软件有限公司 Big data analysis and extraction method
CN105005622A (en) * 2015-07-24 2015-10-28 肖华 Method for high-speed storage of high-fidelity continuous-frame queries and image output method thereof
CN105005622B (en) * 2015-07-24 2018-12-07 肖华 A kind of method and its image output method of high speed storing Gao Zhenlian frame inquiry number
CN105550025B (en) * 2015-12-08 2019-04-16 北京航空航天大学 Distributed infrastructure services (IaaS) dispatching method and system
CN105550025A (en) * 2015-12-08 2016-05-04 北京航空航天大学 Distributed IaaS (Infrastructure as a Service) scheduling method and system
WO2017127976A1 (en) * 2016-01-25 2017-08-03 华为技术有限公司 Method for training and scheduling incremental learning cloud system and related device
CN108027889A (en) * 2016-01-25 2018-05-11 华为技术有限公司 A kind of training, dispatching method and relevant device for incremental learning cloud system
CN108027889B (en) * 2016-01-25 2020-07-28 华为技术有限公司 Training and scheduling method for incremental learning cloud system and related equipment
CN105740068A (en) * 2016-01-27 2016-07-06 中国科学院计算技术研究所 Big data platform oriented and memory data locality based scheduling method and system
CN107015946A (en) * 2016-01-27 2017-08-04 常州普适信息科技有限公司 Distributed high-order SVD and its incremental computations a kind of method
CN108153642A (en) * 2016-12-02 2018-06-12 航天星图科技(北京)有限公司 A kind of method that selection calculate node is loaded according to operation
CN106682167A (en) * 2016-12-26 2017-05-17 努比亚技术有限公司 User behavior data statistics device and method
CN106682167B (en) * 2016-12-26 2020-08-14 山东昆仲信息科技有限公司 Statistical device and method for user behavior data
CN109213743A (en) * 2017-06-30 2019-01-15 北京京东尚科信息技术有限公司 A kind of data query method and apparatus
CN109213743B (en) * 2017-06-30 2021-10-15 北京京东尚科信息技术有限公司 Data query method and device
CN109522053A (en) * 2017-09-20 2019-03-26 阿里巴巴集团控股有限公司 A kind of massive parallel processing and data processing method
CN111277900A (en) * 2018-12-05 2020-06-12 深圳市茁壮网络股份有限公司 Starting method and device of set top box
CN111277900B (en) * 2018-12-05 2022-12-23 深圳市茁壮网络股份有限公司 Starting method and device of set top box
CN111221865A (en) * 2020-01-09 2020-06-02 上海合阔信息技术有限公司 Recipe query method and device, electronic equipment and storage medium
CN111338768A (en) * 2020-02-03 2020-06-26 重庆特斯联智慧科技股份有限公司 Public security resource scheduling system utilizing urban brain
CN112100146A (en) * 2020-09-21 2020-12-18 重庆紫光华山智安科技有限公司 Efficient erasure correction distributed storage writing method, system, medium and terminal
CN112100146B (en) * 2020-09-21 2021-06-29 重庆紫光华山智安科技有限公司 Efficient erasure correction distributed storage writing method, system, medium and terminal

Similar Documents

Publication Publication Date Title
CN104331421A (en) High-efficiency processing method and system for big data
US9053067B2 (en) Distributed data scalable adaptive map-reduce framework
JP5744707B2 (en) Computer-implemented method, computer program, and system for memory usage query governor (memory usage query governor)
CN106919675B (en) Data storage method and device
US20170364540A1 (en) Normalized searchable cloud layer
CN103399887A (en) Query and statistical analysis system for mass logs
US8694486B2 (en) Deadline-driven parallel execution of queries
CN103440288A (en) Big data storage method and device
WO2015094269A1 (en) Hybrid flows containing a continuous flow
CN103198097A (en) Massive geoscientific data parallel processing method based on distributed file system
CN103559300A (en) Data query method and device
CN104035938A (en) Performance continuous integration data processing method and device
CN103177035A (en) Data query device and data query method in data base
CN115335821B (en) Offloading statistics collection
US20230359647A1 (en) Read-Write Separation and Automatic Scaling-Based Cloud Arrangement System and Method
CN104461710A (en) Method and device for processing tasks
Al-Khasawneh et al. MapReduce a comprehensive review
CN104516985A (en) Rapid mass data importing method based on HBase database
Dai et al. Research and implementation of big data preprocessing system based on Hadoop
CN113722600B (en) Data query method, device, equipment and product applied to big data
CN109597826B (en) Data processing method and device, electronic equipment and computer readable storage medium
EP3475852A1 (en) Method and system for flexible, high performance structured data processing
Althebyan et al. A scalable Map Reduce tasks scheduling: a threading-based approach
US20150149498A1 (en) Method and System for Performing an Operation Using Map Reduce
KR20160081231A (en) Method and system for extracting image feature based on map-reduce for searching image

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20150204