CN105574093B - A method of index is established in the spark-sql big data processing system based on HDFS - Google Patents
- Publication number: CN105574093B
- Application number: CN201510918956.4A
- Authority: CN (China)
- Prior art keywords: index, data, sql, spark, file
- Prior art date
- Legal status: Expired - Fee Related (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/13—File access structures, e.g. distributed indices
- G06F16/134—Distributed indices
- G06F16/14—Details of searching files based on file metadata
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Library & Information Science (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
Claims (7)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510918956.4A CN105574093B (en) | 2015-12-10 | 2015-12-10 | A method of index is established in the spark-sql big data processing system based on HDFS |
PCT/CN2016/094925 WO2017096939A1 (en) | 2015-12-10 | 2016-08-12 | Method for establishing index on hdfs-based spark-sql big-data processing system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510918956.4A CN105574093B (en) | 2015-12-10 | 2015-12-10 | A method of index is established in the spark-sql big data processing system based on HDFS |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105574093A (en) | 2016-05-11 |
CN105574093B (en) | 2019-09-10 |
Family
ID=55884224
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510918956.4A (CN105574093B, Expired - Fee Related) | A method of index is established in the spark-sql big data processing system based on HDFS | 2015-12-10 | 2015-12-10 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN105574093B (en) |
WO (1) | WO2017096939A1 (en) |
Families Citing this family (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105574093B (en) * | 2015-12-10 | 2019-09-10 | 深圳市华讯方舟软件技术有限公司 | A method of index is established in the spark-sql big data processing system based on HDFS |
CN106844415B (en) * | 2016-11-18 | 2021-08-20 | 北京奇虎科技有限公司 | Data processing method and device in spark SQL system |
CN106599062A (en) * | 2016-11-18 | 2017-04-26 | 北京奇虎科技有限公司 | Data processing method and device in SparkSQL system |
CN106777278B (en) * | 2016-12-29 | 2021-02-23 | 海尔优家智能科技(北京)有限公司 | Spark-based data processing method and device |
CN107092685A (en) * | 2017-04-24 | 2017-08-25 | 广州新盛通科技有限公司 | A kind of method that file system and RDBMS store transaction data are used in combination |
CN107368517B (en) * | 2017-06-02 | 2018-07-13 | 上海恺英网络科技有限公司 | A kind of method and apparatus of high amount of traffic inquiry |
CN107391555B (en) * | 2017-06-07 | 2020-08-04 | 中国科学院信息工程研究所 | Spark-Sql retrieval-oriented metadata real-time updating method |
CN110019497B (en) * | 2017-08-07 | 2021-06-08 | 北京国双科技有限公司 | Data reading method and device |
CN108132986B (en) * | 2017-12-14 | 2020-06-16 | 北京航天测控技术有限公司 | Rapid processing method for test data of mass sensors of aircraft |
CN108874897B (en) * | 2018-05-23 | 2019-09-13 | 新华三大数据技术有限公司 | Data query method and device |
CN110046176B (en) * | 2019-04-28 | 2023-03-31 | 南京大学 | Spark-based large-scale distributed DataFrame query method |
CN112015729B (en) * | 2019-05-29 | 2024-04-02 | 核桃运算股份有限公司 | Data management device, method and computer storage medium thereof |
CN110674154B (en) * | 2019-09-26 | 2023-04-07 | 浪潮软件股份有限公司 | Spark-based method for inserting, updating and deleting data in Hive |
CN110928835A (en) * | 2019-10-12 | 2020-03-27 | 虏克电梯有限公司 | Novel file storage system and method based on mass storage |
CN111125216B (en) * | 2019-12-10 | 2024-03-12 | 中盈优创资讯科技有限公司 | Method and device for importing data into Phoenix |
CN111177102B (en) * | 2019-12-25 | 2022-07-19 | 苏州浪潮智能科技有限公司 | Optimization method and system for realizing HDFS (Hadoop distributed File System) starting acceleration |
CN111752804B (en) * | 2020-06-29 | 2022-09-09 | 中国电子科技集团公司第二十八研究所 | Database cache system based on database log scanning |
CN113297204B (en) * | 2020-07-15 | 2024-03-08 | 阿里巴巴集团控股有限公司 | Index generation method and device |
CN112231321B (en) * | 2020-10-20 | 2022-09-20 | 中国电子科技集团公司第二十八研究所 | Oracle secondary index and index real-time synchronization method |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101344881A (en) * | 2007-07-09 | 2009-01-14 | 中国科学院大气物理研究所 | Index generation method and device and search system for mass file type data |
CN101727465A (en) * | 2008-11-03 | 2010-06-09 | 中国移动通信集团公司 | Methods for establishing and inquiring index of distributed column storage database, device and system thereof |
CN103631910A (en) * | 2013-11-26 | 2014-03-12 | 烽火通信科技股份有限公司 | Distributed database multi-column composite query system and method |
CN104462291A (en) * | 2014-11-27 | 2015-03-25 | 杭州华为数字技术有限公司 | Method and device for data processing |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB2417342A (en) * | 2004-08-19 | 2006-02-22 | Fujitsu Serv Ltd | Indexing system for a computer file store |
CN104133867A (en) * | 2014-07-18 | 2014-11-05 | 中国科学院计算技术研究所 | DOT in-fragment secondary index method and DOT in-fragment secondary index system |
CN105574093B (en) * | 2015-12-10 | 2019-09-10 | 深圳市华讯方舟软件技术有限公司 | A method of index is established in the spark-sql big data processing system based on HDFS |
- 2015-12-10: CN application CN201510918956.4A, patent CN105574093B (en), not active, Expired - Fee Related
- 2016-08-12: WO application PCT/CN2016/094925, patent WO2017096939A1 (en), active, Application Filing
Non-Patent Citations (2)
Title |
---|
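Design and Implementation of a Spark-Based Machine Learning Platform; Tang Zhenkun; China Dissertation Full-text Database (Wanfang Data); 2015-01-06; abstract, pp. 23-25 |
A Distributed Computing System Supporting Query and Analysis of Communication Data; Chao Pingfu et al.; Journal of East China Normal University (Natural Science); 2014-09-30 (No. 05); pp. 89, 91-92, 95-98, 101 |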
Also Published As
Publication number | Publication date |
---|---|
CN105574093A (en) | 2016-05-11 |
WO2017096939A1 (en) | 2017-06-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105574093B (en) | A method of index is established in the spark-sql big data processing system based on HDFS | |
CN109299102B (en) | HBase secondary index system and method based on Elasticsearch | |
US6801904B2 (en) | System for keyword based searching over relational databases | |
US6789094B2 (en) | Method and apparatus for providing extended file attributes in an extended attribute namespace | |
RU2427896C2 (en) | Annotation of documents in jointly operating applications by data in separated information systems | |
US8527556B2 (en) | Systems and methods to update a content store associated with a search index | |
US8924373B2 (en) | Query plans with parameter markers in place of object identifiers | |
US20120084291A1 (en) | Applying search queries to content sets | |
JP3914662B2 (en) | Database processing method and apparatus, and medium storing the processing program | |
CN103870588B (en) | A kind of method and device used in data base | |
CN1979469A (en) | Index and its extending and searching method | |
JP2009110260A (en) | File sharing system in cooperation with search engine | |
WO2018097846A1 (en) | Edge store designs for graph databases | |
MX2010012866A (en) | Paging hierarchical data. | |
Rozsnyai et al. | Large-scale distributed storage system for business provenance | |
CN112231321B (en) | Oracle secondary index and index real-time synchronization method | |
Yafooz et al. | Managing unstructured data in relational databases | |
CN104035993A (en) | Memory search method for e-books, e-book management system and reading system | |
CN110134335A (en) | A kind of RDF data management method, device and storage medium based on key-value pair | |
US7844596B2 (en) | System and method for aiding file searching and file serving by indexing historical filenames and locations | |
US20100082587A1 (en) | Apparatus, method, and computer program product for searching structured document | |
CN110110034A (en) | A kind of RDF data management method, device and storage medium based on figure | |
CN106649462B (en) | A kind of implementation method for mass data full-text search scene | |
KR101679011B1 (en) | Method and Apparatus for moving data in DBMS | |
KR101299555B1 (en) | Apparatus and method for text search using index based on hash function |
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | |
PB01 | Publication | |
C10 | Entry into substantive examination | |
SE01 | Entry into force of request for substantive examination | |
CB02 | Change of applicant information | Address after: 518102 Guangdong Province, Baoan District Xixiang street Shenzhen City Tian Yi Lu Chen Tian Bao Industrial District thirty-seventh building 3 floor; Applicant after: SHENZHEN HUAXUN FANGZHOU SOFTWARE TECHNOLOGY Co.,Ltd.; Applicant after: CHINA COMMUNICATION TECHNOLOGY Co.,Ltd.; Address before: 518102 Guangdong Province, Baoan District Xixiang street Shenzhen City Tian Yi Lu Chen Tian Bao Industrial District thirty-seventh building 3 floor; Applicant before: SHENZHEN HUAXUN FANGZHOU SOFTWARE TECHNOLOGY Co.,Ltd.; Applicant before: CHINA COMMUNICATION TECHNOLOGY Co.,Ltd. |
COR | Change of bibliographic data | |
GR01 | Patent grant | |
PP01 | Preservation of patent right | Effective date of registration: 2021-06-30; Granted publication date: 2019-09-10 |
PD01 | Discharge of preservation of patent | Date of cancellation: 2023-04-21; Granted publication date: 2019-09-10 |
TR01 | Transfer of patent right | Effective date of registration: 2023-06-06; Address after: 518102 room 404, building 37, chentian Industrial Zone, chentian community, Xixiang street, Bao'an District, Shenzhen City, Guangdong Province; Patentee after: Shenzhen Huaxun ark Photoelectric Technology Co.,Ltd.; Patentee after: SHENZHEN HUAXUN FANGZHOU SOFTWARE TECHNOLOGY Co.,Ltd.; Address before: 518102 3rd floor, building 37, chentian Industrial Zone, Baotian 1st Road, Xixiang street, Bao'an District, Shenzhen City, Guangdong Province; Patentee before: SHENZHEN HUAXUN FANGZHOU SOFTWARE TECHNOLOGY Co.,Ltd.; Patentee before: CHINA COMMUNICATION TECHNOLOGY Co.,Ltd. |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 2019-09-10 |