CN111427887A - Method, device and system for rapidly scanning HBase partition table - Google Patents

Method, device and system for rapidly scanning HBase partition table

Info

Publication number
CN111427887A
Authority
CN
China
Prior art keywords
physical
partition
partitions
data table
hbase
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010188346.4A
Other languages
Chinese (zh)
Inventor
刘智鑫
蔡苗
陈震宇
刘国华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Postal Savings Bank of China Ltd
Original Assignee
Postal Savings Bank of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Postal Savings Bank of China Ltd
Priority to CN202010188346.4A
Publication of CN111427887A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • G06F16/2282Tablespace storage structures; Management thereof
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2455Query execution
    • G06F16/24553Query execution of query operations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638Organizing or formatting or addressing of data
    • G06F3/0644Management of space entities, e.g. partitions, extents, pools

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the application discloses a method, a device and a system for rapidly scanning an HBase partition table, wherein the method comprises the following steps: pre-partitioning the HBase data table to obtain a plurality of physical partitions; partitioning the RDD of the Spark according to the number of the physical partitions to obtain the logical partitions with the same number as the physical partitions, and establishing a mapping relation between the logical partitions and the physical partitions so as to map each logical partition to the corresponding physical partition; and when Spark runs, allocating a SCAN scanning object to each physical partition to realize parallel scanning of the HBase data table. The method and the device have the advantages that the HBase data tables are pre-partitioned, and the SCAN scanning object is established for the pre-partition of each HBase data table, so that the data of each partition can be read in parallel, and the partition tables of the HBase can be rapidly scanned.

Description

Method, device and system for rapidly scanning HBase partition table
Technical Field
The invention belongs to the technical field of computers, and particularly relates to a method, a device and a system for rapidly scanning an HBase partition table.
Background
In practice, besides looking up a single record, a user may need to scan an entire HBase data table, that is, perform a full table scan. An index, however, does not help accelerate a full table scan.
In general, a table scan operation has to serve filtering queries over the full table from beginning to end. It may involve specific aggregation operations on metrics, such as Count and Sum, and may also involve reading the full table data row by row. HBase currently supports two main operations for obtaining data from a data table, GET and SCAN: a GET object is used to obtain a single record, and a SCAN object is used to scan the data in a specified range.
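The difference between the two operations can be illustrated with a minimal sketch. The `ToyTable` class below is a hypothetical in-memory stand-in, not the HBase client API, and it assumes integer row keys and a half-open scan range (start row inclusive, stop row exclusive).

```python
from bisect import bisect_left


class ToyTable:
    """Toy stand-in for a table sorted by row key; not the HBase client API."""

    def __init__(self, rows):
        # Keep the row keys sorted, as a row-key-ordered store would.
        self._keys = sorted(rows)
        self._rows = dict(rows)

    def get(self, key):
        """GET: fetch a single record by row key, or None if absent."""
        return self._rows.get(key)

    def scan(self, start, stop):
        """SCAN: yield (key, value) for start <= key < stop, in key order."""
        lo = bisect_left(self._keys, start)
        hi = bisect_left(self._keys, stop)
        for key in self._keys[lo:hi]:
            yield key, self._rows[key]


table = ToyTable({k: k * 10 for k in range(20)})
single = table.get(7)            # one record
ranged = list(table.scan(5, 8))  # all records in a row-key range
```

A GET touches exactly one row key, while a SCAN walks a contiguous range of row keys, which is why a range can later be divided among several SCAN objects.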
At present, when reading data from an HBase data table, a Spark client mainly obtains the data by scanning, and usually generates only one SCAN object. That single SCAN object has to scan all Region partitions of the HBase data table one after another, which makes the scan slow. The current approach therefore neither makes good use of the distributed processing capability of Spark nor takes full advantage of the Region partitioning of data tables in HBase.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides a method, a device and a system for rapidly scanning an HBase partition table.
The embodiment of the invention provides the following specific technical scheme:
in a first aspect, the present invention provides a method for rapidly scanning an HBase partition table, where the method includes:
pre-partitioning the HBase data table to obtain a plurality of physical partitions;
partitioning the RDD of the Spark according to the number of the physical partitions to obtain the logical partitions with the same number as the physical partitions, and establishing a mapping relation between the logical partitions and the physical partitions so as to map each logical partition to a corresponding physical partition;
and when Spark is operated, allocating a SCAN scanning object to each physical partition to realize parallel scanning of the HBase data table.
Preferably, the method further comprises:
creating a task for each of said logical partitions;
and running the task to process the scanning result of the corresponding SCAN scanning object so as to realize the parallel processing of the HBase data table.
Preferably, the pre-partitioning the HBase data table to obtain a plurality of physical partitions specifically includes:
calculating the data volume to be processed;
and equally dividing the HBase data table according to the data volume to obtain a plurality of continuous physical partitions.
Preferably, equally dividing the HBase data table according to the data size to obtain a plurality of continuous physical partitions specifically includes:
and equally dividing the HBase data table according to the range identified by the row key of the HBase data table to obtain a plurality of continuous physical partitions.
Preferably, the pre-partitioning the HBase data table to obtain a plurality of physical partitions specifically includes:
and dividing the HBase data table according to the historical data change trend to obtain a plurality of continuous physical partitions.
Preferably, the allocating one SCAN object to each physical partition specifically includes:
acquiring a starting primary key and an ending primary key of each physical partition;
generating a SCAN object having the same start primary key and end primary key according to the start primary key and end primary key of each physical partition.
In a second aspect, the present invention provides an apparatus for rapidly scanning an HBase partition table, where the apparatus includes:
the first partitioning module is used for pre-partitioning the HBase data table to obtain a plurality of physical partitions;
the second partitioning module is used for partitioning the RDD of the Spark according to the number of the physical partitions to obtain the logical partitions with the same number as the physical partitions;
the mapping module is used for establishing a mapping relation between the logical partitions and the physical partitions so that each logical partition is mapped to the corresponding physical partition;
and the first allocation module is used for allocating a SCAN scanning object to each physical partition when Spark runs so as to realize parallel scanning of the HBase data table.
Preferably, the apparatus further comprises:
a second allocation module for creating a task for each of said logical partitions;
and the operation module is used for operating the task to process the scanning result of the corresponding SCAN scanning object so as to realize the parallel processing of the HBase data table.
Preferably, the first partitioning module specifically includes:
the calculation module is used for calculating the data volume to be processed;
and the dividing module is used for equally dividing the HBase data table according to the data volume to obtain a plurality of continuous physical partitions.
Preferably, the dividing module is specifically configured to equally divide the HBase data table according to a range identified by a row key of the HBase data table to obtain a plurality of continuous physical partitions.
Preferably, the first partitioning module is specifically configured to:
and dividing the HBase data table according to the historical data change trend to obtain a plurality of continuous physical partitions.
Preferably, the first allocation module specifically includes:
the acquisition module is used for acquiring a starting primary key and an ending primary key of each physical partition;
and the generation module is used for generating a SCAN object having the same starting primary key and ending primary key according to the starting primary key and ending primary key of each physical partition.
In a third aspect, the present invention provides a computer system comprising:
one or more processors; and
a memory associated with the one or more processors for storing program instructions that, when read and executed by the one or more processors, perform operations comprising:
pre-partitioning the HBase data table to obtain a plurality of physical partitions;
partitioning the RDD of the Spark according to the number of the physical partitions to obtain the logical partitions with the same number as the physical partitions, and establishing a mapping relation between the logical partitions and the physical partitions so as to map each logical partition to a corresponding physical partition;
and when Spark is operated, allocating a SCAN scanning object to each physical partition to realize parallel scanning of the HBase data table.
The embodiment of the invention has the following beneficial effects:
the method is used for pre-partitioning the HBase data table, and an SCAN scanning object is established for the pre-partitioning of each HBase data table, so that the data of each partition can be read in parallel, and the partition table of the HBase can be scanned quickly.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings based on these drawings without creative effort.
FIG. 1 is a schematic diagram illustrating a correspondence relationship between Partition in Spark and Region in HBase in the prior art according to an embodiment of the present application;
FIG. 2 is a flowchart of a method for fast scanning an HBase partition table according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a correspondence relationship between a Partition in Spark and a Region in HBase according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of an apparatus for rapidly scanning an HBase partition table according to a second embodiment of the present application;
FIG. 5 is a schematic structural diagram of a computer system according to a third embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example one
As shown in FIG. 1, when reading data from an HBase data table, a Spark client mainly obtains the data by scanning, and usually generates only one SCAN object. That single SCAN object has to scan all Region partitions of the HBase data table one after another, which makes the scan slow. The current approach therefore neither makes good use of the distributed processing capability of Spark nor takes full advantage of the Region partitioning of data tables in HBase. Based on this, the present application provides a method for rapidly scanning an HBase partition table, shown in FIG. 2, with which a Spark client can read the HBase partition table in parallel, thereby achieving rapid scanning of the HBase data table.
The method for rapidly scanning the HBase partition table specifically comprises the following steps:
S1, pre-partitioning the HBase data table to obtain a plurality of physical partitions.
Step S1 specifically comprises the following steps:
S11, calculating the data volume to be processed;
S12, equally dividing the HBase data table according to the data volume to obtain a plurality of continuous physical partitions.
Specifically, step S12 may be:
equally dividing the HBase data table according to the range identified by the row key of the HBase data table to obtain a plurality of continuous physical partitions.
The physical partition refers to a Region of the HBase data table.
For example, if the range identified by the row key Rowkey is (0 ~ 19), the HBase data table can be divided into (0 ~ 9), (10 ~ 19) two physical partitions. The number and the range of the physical partitions can be defined according to actual requirements.
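The equal division by row-key range can be sketched as follows. This is a minimal illustration that assumes integer row keys and half-open ranges (stop key exclusive); the function name `equal_split` is hypothetical and not part of the application.

```python
def equal_split(first_key, last_key, num_partitions):
    """Split the inclusive key range [first_key, last_key] into contiguous,
    near-equal half-open ranges (start, stop), where stop is exclusive."""
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    total = last_key - first_key + 1
    if total < num_partitions:
        raise ValueError("more partitions than keys")
    base, extra = divmod(total, num_partitions)
    ranges, start = [], first_key
    for i in range(num_partitions):
        # The first `extra` partitions take one additional key.
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


# Row keys 0 to 19 split into two partitions: keys 0 to 9 and keys 10 to 19.
partitions = equal_split(0, 19, 2)
```

Because each range starts where the previous one stops, the partitions are continuous and together cover the whole row-key range exactly once.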
Alternatively, the above step S1 can be implemented as follows:
dividing the HBase data table according to the historical data change trend to obtain a plurality of continuous physical partitions.
For example, suppose that at a certain bank the business volume is small from 9 am to 11 am and large from 1 pm to 3 pm. When partitioning, the range identified by the Rowkey for 1 pm to 3 pm can then be set, in light of the historical data change trend, to be twice as large as that for 9 am to 11 am, thereby increasing the processing speed.
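One possible reading of this trend-based division is sketched below. It assumes that the historical trend has already been summarized as one relative weight per partition and that row keys are integers; the function `weighted_split` and the weights are hypothetical and only illustrate setting one range to twice the size of another.

```python
def weighted_split(first_key, last_key, weights):
    """Split [first_key, last_key] into contiguous half-open ranges whose
    widths are proportional to the given positive weights."""
    if not weights or any(w <= 0 for w in weights):
        raise ValueError("weights must be a non-empty list of positive numbers")
    total_keys = last_key - first_key + 1
    total_weight = sum(weights)
    ranges, start, cumulative = [], first_key, 0
    for w in weights:
        cumulative += w
        # Place each boundary at its cumulative share of the key range.
        stop = first_key + round(total_keys * cumulative / total_weight)
        ranges.append((start, stop))
        start = stop
    return ranges


# Weights 1 and 2: the second range is twice as large as the first.
trend_partitions = weighted_split(0, 29, [1, 2])
```

With very small weights a range could round to zero width, so a real implementation would need to validate the resulting boundaries.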
S2, partitioning the RDDs of the Spark according to the number of the physical partitions to obtain the logical partitions with the same number as the physical partitions, and establishing a mapping relation between the logical partitions and the physical partitions so that each logical partition is mapped to the corresponding physical partition.
The RDD (Resilient Distributed Dataset) is the most basic data abstraction in Spark, and represents an immutable, partitionable collection whose elements can be computed in parallel. An RDD has the characteristics of a data flow model: automatic fault tolerance, location-aware scheduling, and scalability. An RDD allows a user to explicitly cache a working set in memory when executing multiple queries, so that subsequent queries can reuse the working set, which greatly improves query speed.
The RDD data structure in Spark logically supports the concept of a Partition. The RDD is therefore partitioned on the basis of the pre-partitioned HBase data table, to obtain the same number of logical partitions (Partitions) as there are physical partitions (Regions) in the HBase data table.
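The one-to-one mapping between logical and physical partitions can be sketched in plain Python, without Spark. The `Region` descriptor and `map_logical_to_physical` function are hypothetical names, and integer row keys with an exclusive stop key are assumed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Hypothetical descriptor of one physical partition (Region)."""
    start_key: int  # inclusive
    stop_key: int   # exclusive


def map_logical_to_physical(regions):
    """Create one logical partition index per Region, in row-key order."""
    ordered = sorted(regions, key=lambda r: r.start_key)
    return {index: region for index, region in enumerate(ordered)}


regions = [Region(10, 20), Region(0, 10)]
mapping = map_logical_to_physical(regions)
```

The mapping has exactly as many entries as there are Regions, so logical partition `i` always knows which row-key range it is responsible for.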
S3, when Spark runs, allocating a SCAN object to each physical partition to realize parallel scanning of the HBase data table.
In this embodiment, the correspondence between Partition in Spark and Region in HBase is shown in FIG. 3.
The allocating a SCAN object to each physical partition specifically includes:
1. acquiring a starting primary key and an ending primary key of each physical partition;
2. generating a SCAN object having the same start primary key and end primary key according to the start primary key and end primary key of each physical partition.
In this manner, the physical partitions (Regions) of the HBase data table can be scanned in parallel. When machine resources are sufficient, the more partitions there are, the faster the parallel scan. If the size of each partition is set reasonably, the time consumed by the scan can be kept within a roughly constant range.
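The two sub-steps above, followed by the parallel scan, can be sketched as below. This is a simulation under stated assumptions: the table is an in-memory dictionary with integer row keys, a "scan" is a plain dictionary holding a start row and an exclusive stop row, and a thread pool stands in for Spark executors. None of the names come from the HBase or Spark APIs.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy table: 20 rows keyed 0..19, standing in for an HBase data table.
TABLE = {key: f"row-{key}" for key in range(20)}

# (start, stop) row-key boundaries of each physical partition; stop is exclusive.
REGIONS = [(0, 5), (5, 10), (10, 15), (15, 20)]


def make_scan(start, stop):
    """Build a scan descriptor with the same boundaries as the partition."""
    return {"start_row": start, "stop_row": stop}


def run_scan(scan):
    """Return the rows with start_row <= key < stop_row, in key order."""
    return [(k, TABLE[k]) for k in sorted(TABLE)
            if scan["start_row"] <= k < scan["stop_row"]]


scans = [make_scan(start, stop) for start, stop in REGIONS]
with ThreadPoolExecutor(max_workers=len(scans)) as pool:
    # map() preserves input order, so results stay in row-key order.
    per_partition = list(pool.map(run_scan, scans))

all_rows = [row for part in per_partition for row in part]
```

Each scan reads only its own partition, and concatenating the per-partition results reproduces the full table, which is the property the parallel scan relies on.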
S4, creating a task for each logical partition.
S5, running the task to process the scanning result of the corresponding SCAN object so as to realize parallel processing of the HBase data table.
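Steps S4 and S5 can be illustrated with a small sketch in which each task turns the scan result of one partition into a partial aggregate, and the partial aggregates are then combined. The data, the `task` function, and the use of a thread pool instead of Spark tasks are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy scan results, one list of (row_key, amount) pairs per logical partition.
SCAN_RESULTS = {
    0: [(0, 5), (1, 7), (2, 3)],
    1: [(10, 4), (11, 6)],
}


def task(rows):
    """Process one partition's scan result into a partial (count, sum)."""
    return len(rows), sum(amount for _, amount in rows)


with ThreadPoolExecutor() as pool:
    partials = list(pool.map(task, SCAN_RESULTS.values()))

# Combine the partial aggregates into whole-table Count and Sum.
total_count = sum(count for count, _ in partials)
total_sum = sum(subtotal for _, subtotal in partials)
```

Count and Sum combine cleanly because they are associative, so each partition can be processed independently before the final merge.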
The scheme can be applied to the training of a calculation model to shorten the training time of the calculation model.
The method specifically comprises the following steps:
1. storing the newly added features of a calculation model in an HBase data table, wherein each column in the data table represents one feature of the calculation model;
2. scanning the HBase data table according to the above method for rapidly scanning the HBase partition table to obtain all features of the calculation model;
3. retraining the calculation model according to all the acquired features.
Thus, the training capability of the calculation model can be improved.
Example two
Corresponding to the first embodiment, as shown in FIG. 4, the present application further provides an apparatus for rapidly scanning an HBase partition table, including:
a first partitioning module 21, configured to pre-partition the HBase data table to obtain a plurality of physical partitions;
the second partitioning module 22 is configured to partition the RDDs of the Spark according to the number of the physical partitions, so as to obtain logical partitions whose number is the same as that of the physical partitions;
a mapping module 23, configured to establish a mapping relationship between the logical partitions and the physical partitions so that each logical partition is mapped to a corresponding physical partition;
and the first allocation module 24 is configured to allocate one SCAN object to each physical partition when Spark is running, so as to implement parallel scanning on the HBase data table.
Preferably, the above apparatus further comprises:
a second allocating module 25, configured to create a task for each logical partition;
and the running module 26 is configured to run a task to process a scanning result of the corresponding SCAN object so as to implement parallel processing on the HBase data table.
Preferably, the first partitioning module 21 specifically includes:
a calculating module 211, configured to calculate a data amount to be processed;
and the dividing module 212 is configured to equally divide the HBase data table according to the data amount to obtain a plurality of continuous physical partitions.
Preferably, the dividing module 212 is specifically configured to equally divide the HBase data table according to a range identified by a row key of the HBase data table, so as to obtain a plurality of continuous physical partitions.
Preferably, the first partitioning module 21 is specifically configured to:
and dividing the HBase data table according to the historical data change trend to obtain a plurality of continuous physical partitions.
Preferably, the first allocation module 24 specifically includes:
an obtaining module 241, configured to obtain a start primary key and an end primary key of each physical partition;
a generating module 242, configured to generate a SCAN object having the same start primary key and end primary key according to the start primary key and end primary key of each physical partition.
Example three
The present application further provides a computer system comprising:
one or more processors; and
a memory associated with the one or more processors for storing program instructions that, when read and executed by the one or more processors, perform operations comprising:
pre-partitioning the HBase data table to obtain a plurality of physical partitions;
partitioning the RDD of the Spark according to the number of the physical partitions to obtain the logical partitions with the same number as the physical partitions, and establishing a mapping relation between the logical partitions and the physical partitions so as to map each logical partition to the corresponding physical partition;
and when Spark runs, allocating a SCAN scanning object to each physical partition to realize parallel scanning of the HBase data table.
FIG. 5 illustrates an architecture of a computer system that may include, in particular, a processor 32, a video display adapter 34, a disk drive 36, an input/output interface 38, a network interface 310, and a memory 312. The processor 32, video display adapter 34, disk drive 36, input/output interface 38, network interface 310, and memory 312 may be communicatively coupled via a communication bus 314.
The processor 32 may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits, and is configured to execute related programs to implement the technical solution provided in the present application.
The memory 312 may be implemented in the form of a ROM (Read Only Memory), a RAM (Random Access Memory), a static storage device, a dynamic storage device, or the like. The memory 312 may store an operating system 316 for controlling the operation of the computer system 30 and a Basic Input Output System (BIOS) 318 for controlling low-level operations of the computer system. In addition, a web browser 320, a data storage management system 322, and the like may also be stored. In short, when the technical solution provided by the present application is implemented by software or firmware, the relevant program code is stored in the memory 312 and invoked by the processor 32 for execution.
The input/output interface 38 is used for connecting an input/output module to input and output information. The i/o module may be configured as a component in a device (not shown) or may be external to the device to provide a corresponding function. The input devices may include a keyboard, a mouse, a touch screen, a microphone, various sensors, etc., and the output devices may include a display, a speaker, a vibrator, an indicator light, etc.
The network interface 310 is used for connecting a communication module (not shown in the figure) to realize communication interaction between the device and other devices. The communication module can realize communication in a wired mode (such as USB, network cable and the like) and also can realize communication in a wireless mode (such as mobile network, WIFI, Bluetooth and the like).
Communication bus 314 includes a path to transfer information between the various components of the device, such as processor 32, video display adapter 34, disk drive 36, input/output interface 38, network interface 310, and memory 312.
In addition, the computer system can also obtain the information of specific receiving conditions from the virtual resource object receiving condition information database for condition judgment and the like.
It should be noted that although the above-described device only shows the processor 32, the video display adapter 34, the disk drive 36, the input/output interface 38, the network interface 310, the memory 312, the communication bus 314, etc., in a specific implementation, the device may also include other components necessary for normal operation.
From the above description of the embodiments, it is clear to those skilled in the art that the present application can be implemented by software plus necessary general hardware platform. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, or the like, and includes several instructions for enabling a computer device (which may be a personal computer, a cloud server, or a network device) to execute the method according to the embodiments or some parts of the embodiments of the present application.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the embodiments of the invention. In addition, the computer system, the apparatus for rapidly scanning the HBase partition table, and the method for rapidly scanning the HBase partition table provided in the foregoing embodiments belong to the same concept, and specific implementation processes thereof are described in detail in the method embodiments and are not described herein again.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (10)

1. A method for rapidly scanning an HBase partition table is characterized by comprising the following steps:
pre-partitioning the HBase data table to obtain a plurality of physical partitions;
partitioning the RDD of the Spark according to the number of the physical partitions to obtain the logical partitions with the same number as the physical partitions, and establishing a mapping relation between the logical partitions and the physical partitions so as to map each logical partition to a corresponding physical partition;
and when Spark is operated, allocating a SCAN scanning object to each physical partition to realize parallel scanning of the HBase data table.
2. The method of claim 1, further comprising:
creating a task for each of said logical partitions;
and running the task to process the scanning result of the corresponding SCAN scanning object so as to realize the parallel processing of the HBase data table.
3. The method according to claim 1, wherein pre-partitioning the HBase data table to obtain a plurality of physical partitions specifically comprises:
calculating the data volume to be processed;
and equally dividing the HBase data table according to the data volume to obtain a plurality of continuous physical partitions.
4. The method according to claim 3, wherein equally dividing the HBase data table according to the data volume to obtain a plurality of continuous physical partitions specifically comprises:
and equally dividing the HBase data table according to the range identified by the row key of the HBase data table to obtain a plurality of continuous physical partitions.
5. The method according to claim 1, wherein pre-partitioning the HBase data table to obtain a plurality of physical partitions specifically comprises:
and dividing the HBase data table according to the historical data change trend to obtain a plurality of continuous physical partitions.
6. The method according to any one of claims 1 to 5, wherein the allocating one SCAN object to each physical partition specifically comprises:
acquiring a starting primary key and an ending primary key of each physical partition;
generating a SCAN object having the same start primary key and end primary key according to the start primary key and end primary key of each physical partition.
7. An apparatus for rapidly scanning HBase partition table, comprising:
the first partitioning module is used for pre-partitioning the HBase data table to obtain a plurality of physical partitions;
the second partitioning module is used for partitioning the RDD of the Spark according to the number of the physical partitions to obtain the logical partitions with the same number as the physical partitions;
the mapping module is used for establishing a mapping relation between the logical partitions and the physical partitions so that each logical partition is mapped to the corresponding physical partition;
and the first allocation module is used for allocating a SCAN scanning object to each physical partition when Spark runs so as to realize parallel scanning of the HBase data table.
8. The apparatus of claim 7, further comprising:
a second allocation module for creating a task for each of said logical partitions;
and the operation module is used for operating the task to process the scanning result of the corresponding SCAN scanning object so as to realize the parallel processing of the HBase data table.
9. The apparatus of claim 7, wherein the first partition module specifically comprises:
the calculation module is used for calculating the data volume to be processed;
and the dividing module is used for equally dividing the HBase data table according to the data volume to obtain a plurality of continuous physical partitions.
10. A computer system, comprising:
one or more processors; and
a memory associated with the one or more processors for storing program instructions that, when read and executed by the one or more processors, perform operations comprising:
pre-partitioning the HBase data table to obtain a plurality of physical partitions;
partitioning the Spark RDD according to the number of the physical partitions to obtain the same number of logical partitions as physical partitions, and establishing a mapping relation between the logical partitions and the physical partitions so as to map each logical partition to its corresponding physical partition;
and when Spark runs, allocating a SCAN scanning object to each physical partition to realize parallel scanning of the HBase data table.
CN202010188346.4A 2020-03-17 2020-03-17 Method, device and system for rapidly scanning HBase partition table Pending CN111427887A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010188346.4A CN111427887A (en) 2020-03-17 2020-03-17 Method, device and system for rapidly scanning HBase partition table

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010188346.4A CN111427887A (en) 2020-03-17 2020-03-17 Method, device and system for rapidly scanning HBase partition table

Publications (1)

Publication Number Publication Date
CN111427887A (en) 2020-07-17

Family

ID=71548273

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010188346.4A Pending CN111427887A (en) 2020-03-17 2020-03-17 Method, device and system for rapidly scanning HBase partition table

Country Status (1)

Country Link
CN (1) CN111427887A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114490674A (en) * 2022-04-18 2022-05-13 北京奥星贝斯科技有限公司 Partition table establishing method, partition table data writing method, partition table data reading method, partition table data writing device, partition table data reading device and partition table data reading device
CN114969110A (en) * 2022-07-21 2022-08-30 阿里巴巴(中国)有限公司 Query method and device

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104462161A (en) * 2013-10-18 2015-03-25 上海宝信软件股份有限公司 Structural data query method based on distributed database
CN105487925A (en) * 2015-12-08 2016-04-13 浙江宇视科技有限公司 Data scanning method and device
CN105550293A (en) * 2015-12-11 2016-05-04 深圳市华讯方舟软件技术有限公司 Background refreshing method based on Spark-SQL big data processing platform
CN105956043A (en) * 2016-04-26 2016-09-21 海尔优家智能科技(北京)有限公司 Method and device for allocating Map task for MapReduce running on Hbase database
CN107741961A (en) * 2017-09-25 2018-02-27 阿里巴巴集团控股有限公司 Full table scan method and device based on Hbase
CN109165222A (en) * 2018-08-20 2019-01-08 福州大学 A kind of HBase secondary index creation method and system based on coprocessor
CN109582696A (en) * 2018-10-09 2019-04-05 阿里巴巴集团控股有限公司 The generation method and device of scan task, electronic equipment
CN110287038A (en) * 2019-06-10 2019-09-27 天翼电子商务有限公司 Promote the method and system of the data-handling efficiency of Spark Streaming frame
CN110442602A (en) * 2019-07-02 2019-11-12 新华三大数据技术有限公司 Data query method, apparatus, server and storage medium

Similar Documents

Publication Publication Date Title
CN109783237B (en) Resource allocation method and device
US11734271B2 (en) Data query method, apparatus and device
US20170364697A1 (en) Data interworking method and data interworking device
US9183151B2 (en) Thread cache allocation
CN112527848B (en) Report data query method, device and system based on multiple data sources and storage medium
CN108074210B (en) Object acquisition system and method for cloud rendering
CN108073350B (en) Object storage system and method for cloud rendering
CN103677759A (en) Objectification parallel computing method and system for information system performance improvement
CN113590508B (en) Dynamic reconfigurable memory address mapping method and device
CN112115160B (en) Query request scheduling method and device and computer system
CN111427887A (en) Method, device and system for rapidly scanning HBase partition table
CN113392863A (en) Method and device for acquiring machine learning training data set and terminal
CN112307062A (en) Database aggregation query method, device and system
CN116302363B (en) Virtual machine creation method, system, computer equipment and storage medium
CN109871260B (en) Multi-dimensional service current limiting method and system based on shared memory between containers
EP4390646A1 (en) Data processing method in distributed system, and related system
CN110908783A (en) Management and control method, system and equipment for virtual machine of cloud data center
CN116185545A (en) Page rendering method and device
CN115756756A (en) Video memory resource allocation method, device and equipment based on GPU virtualization technology
CN115545639A (en) Financial business processing method and device, electronic equipment and storage medium
CN114143590A (en) Video playing method, server and storage medium
WO2019218677A1 (en) Data storage method for power grid simulation analysis, device, and electronic apparatus
CN111090633A (en) Small file aggregation method, device and equipment of distributed file system
US20240220334A1 (en) Data processing method in distributed system, and related system
CN115391042B (en) Resource allocation method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination