CN105574093A - Method for establishing index in HDFS based spark-sql big data processing system - Google Patents
- Publication number
- CN105574093A
- Authority
- CN
- China
- Prior art keywords
- index
- data
- sql
- spark
- file
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS > G06—COMPUTING; CALCULATING OR COUNTING > G06F—ELECTRIC DIGITAL DATA PROCESSING > G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G—PHYSICS > G06—COMPUTING; CALCULATING OR COUNTING > G06F—ELECTRIC DIGITAL DATA PROCESSING > G06F16/00—Information retrieval; Database structures therefor; File system structures therefor > G06F16/10—File systems; File servers > G06F16/13—File access structures, e.g. distributed indices > G06F16/134—Distributed indices
- G—PHYSICS > G06—COMPUTING; CALCULATING OR COUNTING > G06F—ELECTRIC DIGITAL DATA PROCESSING > G06F16/00—Information retrieval; Database structures therefor; File system structures therefor > G06F16/10—File systems; File servers > G06F16/13—File access structures, e.g. distributed indices
- G—PHYSICS > G06—COMPUTING; CALCULATING OR COUNTING > G06F—ELECTRIC DIGITAL DATA PROCESSING > G06F16/00—Information retrieval; Database structures therefor; File system structures therefor > G06F16/10—File systems; File servers > G06F16/14—Details of searching files based on file metadata
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Library & Information Science (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
Claims (9)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510918956.4A CN105574093B (en) | 2015-12-10 | 2015-12-10 | Method for establishing index in HDFS based spark-sql big data processing system |
PCT/CN2016/094925 WO2017096939A1 (en) | 2015-12-10 | 2016-08-12 | Method for establishing index on hdfs-based spark-sql big-data processing system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510918956.4A CN105574093B (en) | 2015-12-10 | 2015-12-10 | Method for establishing index in HDFS based spark-sql big data processing system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105574093A (en) | 2016-05-11 |
CN105574093B (en) | 2019-09-10 |
Family
ID=55884224
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN201510918956.4A | Expired - Fee Related | CN105574093B (en) | 2015-12-10 | 2015-12-10 | Method for establishing index in HDFS based spark-sql big data processing system |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN105574093B (en) |
WO (1) | WO2017096939A1 (en) |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106599062A (en) * | 2016-11-18 | 2017-04-26 | 北京奇虎科技有限公司 | Data processing method and device in SparkSQL system |
CN106777278A (en) * | 2016-12-29 | 2017-05-31 | 海尔优家智能科技(北京)有限公司 | Data processing method and device based on Spark |
CN106844415A (en) * | 2016-11-18 | 2017-06-13 | 北京奇虎科技有限公司 | Data processing method and device in SparkSQL system |
WO2017096939A1 (en) * | 2015-12-10 | 2017-06-15 | 深圳市华讯方舟软件技术有限公司 | Method for establishing index on hdfs-based spark-sql big-data processing system |
CN107092685A (en) * | 2017-04-24 | 2017-08-25 | 广州新盛通科技有限公司 | Method for storing transaction data using a file system in combination with an RDBMS |
CN107391555A (en) * | 2017-06-07 | 2017-11-24 | 中国科学院信息工程研究所 | Spark-Sql retrieval-oriented metadata real-time updating method |
CN108132986A (en) * | 2017-12-14 | 2018-06-08 | 北京航天测控技术有限公司 | Immediate processing method for massive aircraft biosensor assay data |
CN107368517B (en) * | 2017-06-02 | 2018-07-13 | 上海恺英网络科技有限公司 | Method and apparatus for high-traffic queries |
CN108874897A (en) * | 2018-05-23 | 2018-11-23 | 新华三大数据技术有限公司 | Data query method and device |
CN110019497A (en) * | 2017-08-07 | 2019-07-16 | 北京国双科技有限公司 | Data reading method and device |
CN110046176A (en) * | 2019-04-28 | 2019-07-23 | 南京大学 | Spark-based large-scale distributed DataFrame query method |
CN111177102A (en) * | 2019-12-25 | 2020-05-19 | 苏州浪潮智能科技有限公司 | Optimization method and system for realizing HDFS (Hadoop distributed File System) starting acceleration |
CN112015729A (en) * | 2019-05-29 | 2020-12-01 | 核桃运算股份有限公司 | Data management apparatus, method and computer storage medium thereof |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110674154B (en) * | 2019-09-26 | 2023-04-07 | 浪潮软件股份有限公司 | Spark-based method for inserting, updating and deleting data in Hive |
CN110928835A (en) * | 2019-10-12 | 2020-03-27 | 虏克电梯有限公司 | Novel file storage system and method based on mass storage |
CN111125216B (en) * | 2019-12-10 | 2024-03-12 | 中盈优创资讯科技有限公司 | Method and device for importing data into Phoenix |
CN111752804B (en) * | 2020-06-29 | 2022-09-09 | 中国电子科技集团公司第二十八研究所 | Database cache system based on database log scanning |
CN113297204B (en) * | 2020-07-15 | 2024-03-08 | 阿里巴巴集团控股有限公司 | Index generation method and device |
CN112231321B (en) * | 2020-10-20 | 2022-09-20 | 中国电子科技集团公司第二十八研究所 | Oracle secondary index and index real-time synchronization method |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060041606A1 (en) * | 2004-08-19 | 2006-02-23 | Fujitsu Services Limited | Indexing system for a computer file store |
CN101344881A (en) * | 2007-07-09 | 2009-01-14 | 中国科学院大气物理研究所 | Index generation method and device and search system for mass file type data |
CN101727465A (en) * | 2008-11-03 | 2010-06-09 | 中国移动通信集团公司 | Methods for establishing and inquiring index of distributed column storage database, device and system thereof |
CN103631910A (en) * | 2013-11-26 | 2014-03-12 | 烽火通信科技股份有限公司 | Distributed database multi-column composite query system and method |
CN104462291A (en) * | 2014-11-27 | 2015-03-25 | 杭州华为数字技术有限公司 | Method and device for data processing |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104133867A (en) * | 2014-07-18 | 2014-11-05 | 中国科学院计算技术研究所 | DOT in-fragment secondary index method and DOT in-fragment secondary index system |
CN105574093B (en) * | 2015-12-10 | 2019-09-10 | 深圳市华讯方舟软件技术有限公司 | Method for establishing index in HDFS based spark-sql big data processing system |
- 2015-12-10: CN application CN201510918956.4A, patent CN105574093B (en), status: Expired - Fee Related (not active)
- 2016-08-12: WO application PCT/CN2016/094925, publication WO2017096939A1 (en), status: Application Filing (active)
Non-Patent Citations (2)
Title |
---|
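唐振坤: "Design and Implementation of a Spark-Based Machine Learning Platform", China Dissertation Full-Text Database (Wanfang Data) * |
晁平复 et al.: "A Distributed Computing System Supporting Query and Analysis of Communication Data", Journal of East China Normal University (Natural Science Edition) * |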
Cited By (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2017096939A1 (en) * | 2015-12-10 | 2017-06-15 | 深圳市华讯方舟软件技术有限公司 | Method for establishing index on hdfs-based spark-sql big-data processing system |
CN106844415A (en) * | 2016-11-18 | 2017-06-13 | 北京奇虎科技有限公司 | Data processing method and device in SparkSQL system |
CN106599062A (en) * | 2016-11-18 | 2017-04-26 | 北京奇虎科技有限公司 | Data processing method and device in SparkSQL system |
CN106844415B (en) * | 2016-11-18 | 2021-08-20 | 北京奇虎科技有限公司 | Data processing method and device in spark SQL system |
CN106777278A (en) * | 2016-12-29 | 2017-05-31 | 海尔优家智能科技(北京)有限公司 | Data processing method and device based on Spark |
CN107092685A (en) * | 2017-04-24 | 2017-08-25 | 广州新盛通科技有限公司 | Method for storing transaction data using a file system in combination with an RDBMS |
CN107368517B (en) * | 2017-06-02 | 2018-07-13 | 上海恺英网络科技有限公司 | Method and apparatus for high-traffic queries |
CN107391555B (en) * | 2017-06-07 | 2020-08-04 | 中国科学院信息工程研究所 | Spark-Sql retrieval-oriented metadata real-time updating method |
CN107391555A (en) * | 2017-06-07 | 2017-11-24 | 中国科学院信息工程研究所 | Spark-Sql retrieval-oriented metadata real-time updating method |
CN110019497A (en) * | 2017-08-07 | 2019-07-16 | 北京国双科技有限公司 | Data reading method and device |
CN108132986A (en) * | 2017-12-14 | 2018-06-08 | 北京航天测控技术有限公司 | Immediate processing method for massive aircraft biosensor assay data |
CN108874897A (en) * | 2018-05-23 | 2018-11-23 | 新华三大数据技术有限公司 | Data query method and device |
CN108874897B (en) * | 2018-05-23 | 2019-09-13 | 新华三大数据技术有限公司 | Data query method and device |
CN110046176A (en) * | 2019-04-28 | 2019-07-23 | 南京大学 | Spark-based large-scale distributed DataFrame query method |
CN110046176B (en) * | 2019-04-28 | 2023-03-31 | 南京大学 | Spark-based large-scale distributed DataFrame query method |
CN112015729A (en) * | 2019-05-29 | 2020-12-01 | 核桃运算股份有限公司 | Data management apparatus, method and computer storage medium thereof |
CN112015729B (en) * | 2019-05-29 | 2024-04-02 | 核桃运算股份有限公司 | Data management device, method and computer storage medium thereof |
CN111177102A (en) * | 2019-12-25 | 2020-05-19 | 苏州浪潮智能科技有限公司 | Optimization method and system for realizing HDFS (Hadoop distributed File System) starting acceleration |
Also Published As
Publication number | Publication date |
---|---|
CN105574093B (en) | 2019-09-10 |
WO2017096939A1 (en) | 2017-06-15 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN105574093A (en) | Method for establishing index in HDFS based spark-sql big data processing system | |
CN109299102B (en) | HBase secondary index system and method based on Elasticsearch | |
US9672235B2 (en) | Method and system for dynamically partitioning very large database indices on write-once tables | |
US20150310129A1 (en) | Method of managing database, management computer and storage medium | |
US11269954B2 (en) | Data searching method of database, apparatus and computer program for the same | |
US20140201192A1 (en) | Automatic data index establishment method | |
US11030196B2 (en) | Method and apparatus for processing join query | |
CN112231321B (en) | Oracle secondary index and index real-time synchronization method | |
US8880553B2 (en) | Redistribute native XML index key shipping | |
WO2021179782A1 (en) | Method, device and apparatus for improving execution efficiency of database appliance, and medium | |
US10990573B2 (en) | Fast index creation system for cloud big data database | |
CN109597829B (en) | Middleware method for realizing searchable encryption relational database cache | |
CN113704248B (en) | Block chain query optimization method based on external index | |
US10558636B2 (en) | Index page with latch-free access | |
JP5287071B2 (en) | Database management system and program | |
CN111125216B (en) | Method and device for importing data into Phoenix | |
KR101679011B1 (en) | Method and Apparatus for moving data in DBMS | |
EP3091447B1 (en) | Method for modifying root nodes and modifying apparatus | |
CN113961730A (en) | Graph data query method, system, computer device and readable storage medium | |
KR101642072B1 (en) | Method and Apparatus for Hybrid storage | |
CN112182028B (en) | Data line number query method and device based on table of distributed database | |
JP5936465B2 (en) | Multiple database automatic search device | |
CN111061721B (en) | Data processing method and device | |
CN115033547A (en) | Data processing method and device, electronic equipment and system | |
CN118069685A (en) | Hudi data lake index creation method, use method and related products |
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | |
PB01 | Publication | |
C10 | Entry into substantive examination | |
SE01 | Entry into force of request for substantive examination | |
CB02 | Change of applicant information | Address after: 518102 Guangdong Province, Baoan District Xixiang street Shenzhen City Tian Yi Lu Chen Tian Bao Industrial District thirty-seventh building 3 floor; Applicant after: SHENZHEN HUAXUN FANGZHOU SOFTWARE TECHNOLOGY Co.,Ltd.; Applicant after: CHINA COMMUNICATION TECHNOLOGY Co.,Ltd.; Address before: 518102 Guangdong Province, Baoan District Xixiang street Shenzhen City Tian Yi Lu Chen Tian Bao Industrial District thirty-seventh building 3 floor; Applicant before: SHENZHEN HUAXUN FANGZHOU SOFTWARE TECHNOLOGY Co.,Ltd.; Applicant before: CHINA COMMUNICATION TECHNOLOGY Co.,Ltd. |
COR | Change of bibliographic data | |
GR01 | Patent grant | |
PP01 | Preservation of patent right | Effective date of registration: 20210630; Granted publication date: 20190910 |
PD01 | Discharge of preservation of patent | Date of cancellation: 20230421; Granted publication date: 20190910 |
TR01 | Transfer of patent right | Effective date of registration: 20230606; Address after: 518102 room 404, building 37, chentian Industrial Zone, chentian community, Xixiang street, Bao'an District, Shenzhen City, Guangdong Province; Patentee after: Shenzhen Huaxun ark Photoelectric Technology Co.,Ltd.; Patentee after: SHENZHEN HUAXUN FANGZHOU SOFTWARE TECHNOLOGY Co.,Ltd.; Address before: 518102 3rd floor, building 37, chentian Industrial Zone, Baotian 1st Road, Xixiang street, Bao'an District, Shenzhen City, Guangdong Province; Patentee before: SHENZHEN HUAXUN FANGZHOU SOFTWARE TECHNOLOGY Co.,Ltd.; Patentee before: CHINA COMMUNICATION TECHNOLOGY Co.,Ltd. |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20190910 |