CN104937552A - Data analytics platform over parallel databases and distributed file systems - Google Patents

Publication number: CN104937552A
Authority: CN (China)
Prior art keywords: section, plan, data analysis, data, distributed treatment
Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Application number: CN201480006056.6A
Other languages: English (en)
Other versions: CN104937552B (zh)
Inventors: C. E. Welton, S. Yang
Current Assignee: EMC Inc, EMC Corp (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Original Assignee: EMC Inc
Priority date: 2013-02-25 (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date: 2014-02-14
Publication date: 2015-09-23
Application filed by EMC Inc; published as CN104937552A; granted and published as CN104937552B on 2019-09-20; current status: Active.

Classifications

    • G06F16/24542 Plan optimisation
    • G06F16/10 File systems; File servers
    • G06F16/11 File system administration, e.g. details of archiving or snapshots
    • G06F16/148 File search processing
    • G06F16/182 Distributed file systems
    • G06F16/1858 Parallel file systems, i.e. file systems supporting multiple processors
    • G06F16/24524 Access plan code generation and invalidation; Reuse of access plans
    • G06F16/2453 Query optimisation
    • G06F16/24532 Query optimisation of parallel queries
    • G06F16/2455 Query execution
    • G06F16/2471 Distributed queries
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F16/907 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • H04L65/60 Network streaming of media packets
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • G06F16/113 Details of archiving
    • G06F16/217 Database tuning
    • G06F16/245 Query processing
    • G06F16/43 Querying (multimedia data)

Abstract

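Data analytics processing in the context of a large-scale distributed system that includes a massively parallel processing (MPP) database and a distributed storage layer is disclosed. In various embodiments, a data analytics request is received. A plan is created to generate a response to the request. A corresponding portion of the plan is assigned to each of a plurality of distributed processing segments, including by invoking, as indicated in the assignment, one or more data analytics functions embedded in the processing segment.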

Description

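Data analytics platform over parallel databases and distributed file systems
Cross-reference to other applications
This application claims priority to U.S. Provisional Patent Application No. 61/769,043, entitled INTEGRATION OF MASSIVELY PARALLEL PROCESSING WITH A DATA INTENSIVE SOFTWARE FRAMEWORK, filed February 25, 2013, which is incorporated herein by reference for all purposes.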
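Background of the invention
Distributed storage systems can store databases, files, and other objects in a manner that distributes the data across large clusters of commodity hardware. For example, Hadoop® is an open-source software framework for distributing data and associated computing (e.g., execution of application tasks) across large clusters of commodity hardware.
EMC Greenplum® provides a massively parallel processing (MPP) architecture for data storage and analysis. Typically, data is stored in segment servers, each of which stores and manages a portion of the overall data set. Advanced MPP database systems such as EMC Greenplum® provide the ability to perform data analytics processing on huge data sets, including by enabling users to use familiar and/or industry-standard languages and protocols, such as SQL, to specify the data analytics and/or other processing to be performed. Examples of data analytics processing include, without limitation, logistic regression, multinomial logistic regression, K-means clustering, association-rule-based market basket analysis, latent-Dirichlet-based topic modeling, etc.
While distributed storage systems such as Hadoop® provide the ability to reliably store huge amounts of data on commodity hardware, such systems have not to date been optimized to support data mining and analytics processing with respect to the data stored in them.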
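Brief description of the drawings
Various embodiments of the invention are disclosed in the following detailed description and the accompanying drawings.
Figure 1 is a block diagram illustrating an embodiment of a large-scale distributed system.
Figure 2 is a block diagram illustrating an embodiment of a data analytics architecture of a large-scale distributed system.
Figure 3 is a flow chart illustrating an embodiment of a database query processing process.
Figure 4 is a block diagram illustrating an embodiment of a segment server.
Figure 5 is a flow chart illustrating an embodiment of a process to perform data analytics processing.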
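Detailed description
The invention can be implemented in numerous ways, including as a process; an apparatus; a system; a composition of matter; a computer program product embodied on a computer-readable storage medium; and/or a processor, such as a processor configured to execute instructions stored on and/or provided by a memory coupled to the processor. In this specification, these implementations, or any other form that the invention may take, may be referred to as techniques. In general, the order of the steps of disclosed processes may be altered within the scope of the invention. Unless stated otherwise, a component such as a processor or a memory described as being configured to perform a task may be implemented as a general component that is temporarily configured to perform the task at a given time or as a specific component that is manufactured to perform the task. As used herein, the term 'processor' refers to one or more devices, circuits, and/or processing cores configured to process data, such as computer program instructions.
A detailed description of one or more embodiments of the invention is provided below along with accompanying figures that illustrate the principles of the invention. The invention is described in connection with such embodiments, but the invention is not limited to any embodiment. The scope of the invention is limited only by the claims, and the invention encompasses numerous alternatives, modifications, and equivalents. Numerous specific details are set forth in the following description in order to provide a thorough understanding of the invention. These details are provided for the purpose of example, and the invention may be practiced according to the claims without some or all of these specific details. For the purpose of clarity, technical material that is known in the technical fields related to the invention has not been described in detail so that the invention is not unnecessarily obscured.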
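Providing advanced data analytics capabilities in the context of a large distributed data storage system is disclosed. In various embodiments, a massively parallel processing (MPP) database system is adapted to manage and provide data analytics with respect to data stored in a large distributed storage layer, e.g., an implementation of the Hadoop® distributed storage framework. Examples of data analytics processing include, without limitation, logistic regression, multinomial logistic regression, K-means clustering, association-rule-based market basket analysis, latent-Dirichlet-based topic modeling, etc. In some embodiments, advanced data analytics functions, such as statistical and other analytics functions, are embedded in each of a plurality of segment servers comprising the MPP database portion of the system. In some embodiments, to perform a data analytics task, such as computing statistics or performing an optimization, a master node selects a subset of segments to perform the associated processing and sends to each segment an indication of the data analytics processing to be performed by that segment. The indication includes, for example, an identification of the embedded data analytics function(s) to be used and the associated metadata required to locate and/or access the subset of data on which that segment is to perform the indicated processing.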
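Figure 1 is a block diagram illustrating an embodiment of a large-scale distributed system. In the example shown, the large-scale distributed system includes a large cluster of commodity servers. The master hosts include a primary master 102 and a standby master 104. The primary master 102 is responsible for accepting queries; planning queries, e.g., based at least in part on system metadata 106, which in various embodiments includes information indicating where data is stored within the system; dispatching queries to segments for execution; and collecting the results from the segments. The standby master 104 is a hot backup of the primary master 102. The network interconnect 108 is used to communicate tuples between execution processes. The compute unit of the database engine is called a "segment". Each of a large number of segment hosts, represented in Figure 1 by hosts 110, 112, and 114, can have multiple segments. The segments on segment hosts 110, 112, and 114 are configured, for example, to execute tasks assigned by the primary master 102, such as performing assigned portions of a query plan with respect to data stored in the distributed storage layer 116, e.g., a Hadoop® or other storage layer.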
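When the master node 102 accepts a query, the query is parsed and planned according to the statistics of the tables it references, e.g., based on metadata 106. The planning phase produces a query plan, which is sliced into many slices. In the query execution phase, a group of segments is selected to execute each slice; the group typically comprises a subset of the segments hosted on segment hosts 1 through s. In various embodiments, the size of the group may be determined dynamically using knowledge of the data distribution and the available resources, e.g., the workload on the respective segments.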
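The dynamic sizing of the segment group described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `Segment` and `choose_gang` and the `bytes_per_segment` sizing heuristic are hypothetical and introduced here only for illustration; the source says only that the group size may depend on data distribution and available resources.

```python
import math
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical view of one segment as seen by the master."""
    segment_id: int
    active_tasks: int  # current workload on this segment


def choose_gang(segments, slice_bytes, bytes_per_segment=64 * 2**20):
    """Pick the segments that will execute one slice of a query plan.

    Assumed heuristic: use one segment per `bytes_per_segment` of input,
    capped by the number of segments, and prefer the least-loaded ones.
    """
    if not segments:
        raise ValueError("no segments available")
    wanted = max(1, math.ceil(slice_bytes / bytes_per_segment))
    size = min(wanted, len(segments))
    # Least-loaded first; ties are broken by segment id for determinism.
    ranked = sorted(segments, key=lambda s: (s.active_tasks, s.segment_id))
    return ranked[:size]


segments = [Segment(0, 5), Segment(1, 0), Segment(2, 2), Segment(3, 1)]
gang = choose_gang(segments, slice_bytes=150 * 2**20)
print([s.segment_id for s in gang])
```

A slice with little input gets a single segment, while a very large slice is capped at the number of available segments, so the group grows with the data but never beyond the cluster.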
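In various embodiments, a data analytics job or other query may be expressed in whole or in part using SQL and/or any other specified language or syntax. A master node, such as the primary master 102, parses the SQL or other input and invokes scripts or other code available on the master to perform the top-level processing needed to carry out the request. In various embodiments, a query plan generated by the master 102 may, for example, identify for each of a plurality of segments a corresponding portion of the global data set to be processed by that segment. The master 102 sends to each segment metadata identifying the location of the data that segment is to process, e.g., within the distributed storage layer 116. In various embodiments, the distributed storage layer 116 comprises data stored in an instance of the Hadoop Distributed File System (HDFS), and the metadata indicates the location within the HDFS of the data to be processed by that segment. The master 102 additionally indicates to the segment the specific processing to be performed. In various embodiments, the indication from the master may identify, directly or indirectly, one or more analytics functions embedded at each segment that the segment is to use to perform the required processing.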
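One way to picture the information the master sends to each segment is a small message carrying the function to invoke and the data locations. The sketch below is an assumption made for illustration: the `Assignment` fields, the round-robin split, the function name `kmeans_step`, and the `hdfs://namenode/...` paths are all hypothetical and do not come from the source.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Assignment:
    """Hypothetical message the master sends to one segment."""
    segment_id: int
    slice_id: int
    function: str  # name of the embedded analytics function to invoke
    data_locations: list = field(default_factory=list)  # e.g. HDFS paths


def build_assignments(slice_id, function, segment_ids, hdfs_paths):
    """Spread the input files round-robin over the chosen segments."""
    assignments = {
        sid: Assignment(sid, slice_id, function) for sid in segment_ids
    }
    for i, path in enumerate(hdfs_paths):
        sid = segment_ids[i % len(segment_ids)]
        assignments[sid].data_locations.append(path)
    return list(assignments.values())


# Made-up file locations, used only to exercise the sketch.
paths = [f"hdfs://namenode/warehouse/sales/part-{i:05d}" for i in range(5)]
result = build_assignments(1, "kmeans_step", [10, 11], paths)
for a in result:
    print(json.dumps(asdict(a)))
```

Because each message names both the function and the data locations, a segment needs no prior knowledge of where the data lives, which matches the role the text gives to the metadata.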
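Figure 2 is a block diagram illustrating an embodiment of a data analytics architecture of a large-scale distributed system. In various embodiments, the data analytics architecture 200 of Figure 2 is implemented in a large-scale distributed system, such as that of Figure 1. In the example shown, the data analytics architecture 200 includes a user interface 202 that enables data analytics requests to be expressed using SQL, e.g., as indicated by a specification. Various driver functions 204, in this example python or other scripts with templated SQL, may be invoked to perform, for example, the outer loop of iterative algorithms, optimizer invocations, etc. A high-level abstraction layer 206, which in this example also comprises python scripts, provides functionality such as an iteration controller and a convex optimizer. The upper layers 202, 204, and 206 interact with RDBMS built-in functions 208 and/or with inner loops 210 and/or a low-level abstraction layer 212, which in this example comprise compiled C++, to perform the low-level tasks required to carry out a task received via the user interface 202. Data is accessed for analytics computations and/or other processing by interacting with an underlying RDBMS query processing layer 214. In various embodiments, one or more of the components shown in Figure 2 may be implemented across the nodes of the system, such as across the segments or other processing units that make up the MPP database portion of a large-scale distributed system such as the one shown in Figure 1. In various embodiments, core data analytics processing is performed at least in part using functions embedded in each of the segments (or other processing units) included in the system. In some embodiments, the functions comprise a "shared object" or library of functions comprising compiled C++ or other compiled code, such as Java or Fortran. When a segment is assigned a portion of a broader task, the segment uses the embedded functions implicated by the assignment to perform, at least in part, the data analytics and/or other processing that has been assigned to it.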
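The split between a scripted outer loop and SQL-level inner work can be illustrated with a small driver. The sketch below is hedged throughout: it uses Python's built-in `sqlite3` module as a stand-in for the MPP database, and the two-cluster, one-dimensional K-means, the `points` table, and the names `KMEANS_STEP_SQL` and `kmeans_1d` are assumptions chosen only to keep the example short.

```python
import sqlite3

# Templated SQL: the driver substitutes the table name, and the current
# centroids are bound as parameters. One pass assigns every point to its
# nearest centroid and averages each cluster (the "inner loop").
KMEANS_STEP_SQL = """
SELECT CASE WHEN ABS(x - :c0) <= ABS(x - :c1) THEN 0 ELSE 1 END AS cluster_id,
       AVG(x) AS centroid
FROM {table}
GROUP BY cluster_id
"""


def kmeans_1d(conn, table, c0, c1, max_iter=20, tol=1e-9):
    """Outer loop of a two-cluster, one-dimensional K-means."""
    sql = KMEANS_STEP_SQL.format(table=table)  # table name must be trusted
    for _ in range(max_iter):
        rows = dict(conn.execute(sql, {"c0": c0, "c1": c1}).fetchall())
        # An empty cluster keeps its previous centroid.
        new_c0, new_c1 = rows.get(0, c0), rows.get(1, c1)
        if abs(new_c0 - c0) <= tol and abs(new_c1 - c1) <= tol:
            break
        c0, c1 = new_c0, new_c1
    return c0, c1


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE points (x REAL)")
conn.executemany("INSERT INTO points VALUES (?)",
                 [(v,) for v in (1, 2, 3, 10, 11, 12)])
print(kmeans_1d(conn, "points", c0=0.0, c1=5.0))
```

The driver only controls iteration and convergence; all per-row work happens inside the SQL statement, which is the part a parallel engine could distribute.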
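Figure 3 is a flow chart illustrating an embodiment of a database query processing process. In some embodiments, a master node, such as the primary master 102 of Figure 1, implements the process of Figure 3. In the example shown, a query is received (302). Examples of a query include, without limitation, an advanced data analytics request expressed in whole or in part as a set of SQL statements. A query plan is generated (304). The plan is divided into a plurality of slices, and for each slice a corresponding set of segments (a "gang") is identified to participate in executing that slice of the query plan (306). For each slice of the query plan, the segments selected to perform the processing required by that slice are sent a communication that includes both the applicable portion of the plan to be performed by that segment and metadata that the receiving segment may need to perform the tasks assigned to it (308). In some embodiments, the metadata included in the query plan and/or in other communications sent to the respective segments selected to participate in executing that slice includes metadata from a central metadata store, e.g., metadata 106 of Figure 1, and includes information indicating to the segment the location of the data on which that segment is to perform processing related to the query plan slice. In past approaches, a segment typically would store and manage a corresponding portion of the overall data, so sending metadata to perform query plan related tasks typically would not have been necessary. In some embodiments, metadata and/or other data included in the assignments sent to the selected segments may indicate data analytics processing to be performed, in whole or in part, by the segment using one or more data analytics functions that have been embedded in each of the segments in the distributed system. Query results are received from the respective segments to which query tasks were dispatched and are processed, e.g., at the master node, to generate a master or overall response to the query (310).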
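The dispatch-and-gather portion of this process (steps 308 and 310) can be sketched with threads standing in for segments. Everything named here is an assumption for illustration: `segment_count_items`, `run_on_master`, and the item-counting task, which is a simplified first step of the market basket analysis the text mentions, are not taken from the source.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def segment_count_items(baskets):
    """Work done on one segment: count item occurrences in its baskets."""
    counts = Counter()
    for basket in baskets:
        counts.update(set(basket))  # count each item once per basket
    return counts


def run_on_master(partitions):
    """Dispatch one task per partition, then merge the partial results."""
    workers = max(1, len(partitions))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(segment_count_items, partitions))
    total = Counter()
    for partial in partials:
        total += partial
    return total


partitions = [
    [["milk", "bread"], ["milk", "eggs"]],   # data read by one segment
    [["bread", "milk", "milk"], ["eggs"]],   # data read by another segment
]
print(run_on_master(partitions).most_common())
```

The merge works because per-partition counts can be added in any order; tasks whose partial results cannot be combined this way would need a different final step on the master.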
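Figure 4 is a block diagram illustrating an embodiment of a segment server. In various embodiments, one or more segment servers, such as segment server 402, may be deployed in each of a plurality of segment hosts, such as segment hosts 110, 112, and 114 of Figure 1. In the example shown, the segment server 402 includes a communication interface 404 configured to receive network communications, e.g., via a network interconnect such as interconnect 108 of Figure 1, including assignments sent by a master node such as the primary master 102 of Figure 1. A query executor 406 performs the processing required to complete tasks assigned by the master node, in this example using a storage layer interface 408 to access data stored in a distributed storage layer, such as the distributed storage layer 116 of Figure 1. One or more data analytics functions included in a shared data analytics library 410, which is embedded in each segment server in the distributed system, may be invoked to perform data analytics processing as required to carry out the assigned task. Examples of functions that may be embedded in the segment server include, in various embodiments and without limitation: user-defined functions (e.g., a UDF that randomly initializes an array with values within a specified range, a UDF that transposes a matrix, a UDF that un-nests a 2-dimensional array into a set of 1-dimensional arrays, etc.), and the step functions and final functions of various user-defined aggregators.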
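The kinds of embedded functions listed above can be sketched in a few lines. In the described system such functions would be compiled code inside the segment's shared library; the pure-Python versions below, and the names `random_array`, `transpose`, `unnest_2d`, `avg_step`, and `avg_final`, are assumptions used only to show the shape of each function, with an average chosen as a simple example of a step/final aggregate.

```python
import random


def random_array(n, low, high, seed=None):
    """UDF: array of n values drawn uniformly from the range low..high."""
    rng = random.Random(seed)
    return [rng.uniform(low, high) for _ in range(n)]


def transpose(matrix):
    """UDF: transpose a rectangular matrix given as a list of rows."""
    return [list(col) for col in zip(*matrix)]


def unnest_2d(matrix):
    """UDF: yield each row of a 2-D array as its own 1-D array."""
    for row in matrix:
        yield list(row)


# A user-defined aggregate split into a step function and a final function.
def avg_step(state, value):
    """Fold one input value into the running (sum, count) state."""
    total, count = state
    return (total + value, count + 1)


def avg_final(state):
    """Turn the accumulated state into the aggregate result."""
    total, count = state
    return total / count if count else None


state = (0.0, 0)
for v in [2.0, 4.0, 9.0]:
    state = avg_step(state, v)
print(transpose([[1, 2, 3], [4, 5, 6]]))
print(list(unnest_2d([[1, 2], [3, 4]])))
print(avg_final(state))
```

Separating the step from the final function lets each segment accumulate state over its own rows and defer the last computation until all inputs have been seen.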
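Figure 5 is a flow chart illustrating an embodiment of a process to perform data analytics processing. In various embodiments, the process of Figure 5 is performed by a segment server in response to receiving an assignment, e.g., from a master node, to perform an assigned portion of a data analytics query plan. In the example shown, an assigned task is received (502). Metadata embedded in the assignment is used to access data as needed to perform the assigned task (504). Data analytics functions embedded at the segment server or other processing unit are invoked as needed to perform the assigned task (506). Examples of functions that may be embedded in the segment server include, in various embodiments and without limitation: a function that performs Gibbs sampling for inference of latent Dirichlet allocation, or a function that generates association rules. Once processing is complete, the result is returned, e.g., to the master node from which the assignment was received (508).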
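The segment-side steps 502 through 508 can be sketched as a lookup in a function registry followed by a call. The sketch makes several assumptions that are not in the source: the assignment is modeled as a plain dictionary, `ANALYTICS_LIBRARY` and `FAKE_STORAGE` are in-memory stand-ins for the embedded library and the distributed storage layer, and the data locations are made up.

```python
# Stand-in for the shared analytics library embedded in every segment.
ANALYTICS_LIBRARY = {
    "sum": lambda rows: sum(rows),
    "count": lambda rows: len(rows),
}

# Stand-in for the distributed storage layer, keyed by data location.
FAKE_STORAGE = {
    "hdfs://namenode/data/part-00000": [1.0, 2.0, 3.0],
    "hdfs://namenode/data/part-00001": [4.0, 5.0],
}


def handle_assignment(assignment, storage=FAKE_STORAGE,
                      library=ANALYTICS_LIBRARY):
    """Segment-side handling of one assigned task (steps 502 to 508)."""
    # 504: use the metadata in the assignment to read the assigned data.
    rows = []
    for location in assignment["data_locations"]:
        rows.extend(storage[location])
    # 506: invoke the embedded function named by the assignment.
    name = assignment["function"]
    if name not in library:
        raise ValueError(f"unknown analytics function: {name!r}")
    result = library[name](rows)
    # 508: return the result to the master that sent the assignment.
    return {"slice_id": assignment["slice_id"], "result": result}


assignment = {
    "slice_id": 7,
    "function": "sum",
    "data_locations": [
        "hdfs://namenode/data/part-00000",
        "hdfs://namenode/data/part-00001",
    ],
}
print(handle_assignment(assignment))
```

Because the functions are already present on the segment, the assignment only has to carry a name and data locations rather than any code.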
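Using techniques disclosed herein, a scalable and high-performance data analytics platform can be provided over a high-performance parallel database system built on top of a scalable distributed file system. The advantages of parallel databases and distributed file systems are combined to overcome the challenges of big data analytics. Finally, in various embodiments, users are able to use familiar SQL queries to run analytics tasks, and the underlying parallel database engine translates these SQL queries into a set of execution plans that are optimized according to data locality and load balance.
Although the foregoing embodiments have been described in some detail for purposes of clarity of understanding, the invention is not limited to the details provided. There are many alternative ways of implementing the invention. The disclosed embodiments are illustrative and not restrictive.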

Claims (22)

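1. A method, comprising:
receiving a data analytics request;
creating a plan to generate a response to the request; and
assigning to each of a plurality of distributed processing segments a corresponding portion of the plan to be performed by that segment, including by invoking, as indicated in the assignment, one or more data analytics functions embedded in the processing segment.
2. The method of claim 1, wherein the data analytics request comprises one or more SQL statements.
3. The method of claim 1, wherein the data analytics request comprises one or more SQL statements to compute one or more of the following: logistic regression, multinomial logistic regression, K-means clustering, association-rule-based market basket analysis, and latent-Dirichlet-based topic modeling.
4. The method of claim 1, wherein the data analytics request is received at a master node of a large-scale distributed system.
5. The method of claim 1, wherein creating a plan to generate a response to the request includes creating a query plan, slicing the query plan into a plurality of slices, and identifying for each slice a group of processing segments to perform the tasks comprising that slice of the query plan.
6. The method of claim 1, wherein assigning to each of a plurality of distributed processing segments a corresponding portion of the plan to be performed by that segment includes embedding, in an assignment communication to be sent to one or more of the plurality of distributed processing segments, metadata indicating a location within a distributed data storage layer of data to be processed by that segment.
7. The method of claim 6, wherein each of the distributed processing segments is configured to use the metadata to access the data to be processed by that segment.
8. The method of claim 6, wherein the distributed data storage layer comprises data stored in an instance of the Hadoop Distributed File System (HDFS) and the metadata indicates a location within the HDFS of the data to be processed by that segment.
9. The method of claim 1, further comprising embedding in each of the plurality of distributed processing segments a library or other shared object comprising the one or more data analytics functions.
10. The method of claim 9, wherein the library or other shared object is included in the processing segment as deployed.
11. The method of claim 9, wherein the library or other shared object contains the one or more data analytics functions in the form of one or more of the following: compiled C++ code, compiled Java, compiled Fortran, or other compiled code.
12. The method of claim 1, wherein the plurality of distributed processing segments comprises a subset of parallel processing segments comprising a massively parallel processing (MPP) database system.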
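13. A system, comprising:
a communication interface; and
a processor coupled to the communication interface and configured to:
  receive a data analytics request;
  create a plan to generate a response to the request; and
  assign, via a communication, to each of a plurality of distributed processing segments a corresponding portion of the plan to be performed by that segment, including by invoking, as indicated in the assignment, one or more data analytics functions embedded in the processing segment, the communication being sent via the communication interface.
14. The system of claim 13, wherein the data analytics request comprises one or more SQL statements.
15. The system of claim 13, wherein the data analytics request is received at a master node of a large-scale distributed system.
16. The system of claim 13, wherein the processor is configured to create the plan to generate a response to the request at least in part by creating a query plan, slicing the query plan into a plurality of slices, and identifying for each slice a group of processing segments to perform the tasks comprising that slice of the query plan.
17. The system of claim 13, wherein the processor is configured to assign to each of the plurality of distributed processing segments the corresponding portion of the plan to be performed by that segment at least in part by embedding, in a communication to be sent via the communication interface to one or more of the plurality of distributed processing segments, metadata indicating a location within a distributed data storage layer of data to be processed by that segment.
18. The system of claim 17, wherein each of the plurality of distributed processing segments is configured to use the metadata to access the data to be processed by that segment.
19. The system of claim 13, wherein each of the plurality of distributed processing segments has embedded therein a library or other shared object comprising one or more data analytics functions.
20. The system of claim 13, wherein the plurality of distributed processing segments comprises a subset of parallel processing segments comprising a massively parallel processing (MPP) database system.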
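21. A computer program product embodied in a tangible, non-transitory computer-readable storage medium, comprising computer instructions for:
receiving a data analytics request;
creating a plan to generate a response to the request; and
assigning to each of a plurality of distributed processing segments a corresponding portion of the plan to be performed by that segment, including by invoking, as indicated in the assignment, one or more data analytics functions embedded in the processing segment.
22. The computer program product of claim 21, wherein assigning to each of a plurality of distributed processing segments a corresponding portion of the plan to be performed by that segment includes embedding, in an assignment communication to be sent to one or more of the plurality of distributed processing segments, metadata indicating a location within a distributed data storage layer of data to be processed by that segment.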
CN201480006056.6A 2013-02-25 2014-02-14 Data analytics platform over parallel databases and distributed file systems Active CN104937552B (zh)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US201361769043P 2013-02-25 2013-02-25
US61/769043 2013-02-25
US13/840,912 US9563648B2 (en) 2013-02-25 2013-03-15 Data analytics platform over parallel databases and distributed file systems
US13/840912 2013-03-15
PCT/US2014/016596 WO2014130371A1 (en) 2013-02-25 2014-02-14 Data analytics platform over parallel databases and distributed file systems

Publications (2)

Publication Number Publication Date
CN104937552A (zh) 2015-09-23
CN104937552B (zh) 2019-09-20

Family

ID=51389310

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201480006056.6A Active CN104937552B (zh) 2013-02-25 2014-02-14 Data analytics platform over parallel databases and distributed file systems

Country Status (4)

Country Link
US (31) US9753980B1 (zh)
EP (1) EP2959384B1 (zh)
CN (1) CN104937552B (zh)
WO (1) WO2014130371A1 (zh)

Family Cites Families (379)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5191611A (en) * 1989-04-03 1993-03-02 Lang Gerald S Method and apparatus for protecting material on storage media and for transferring material on storage media to various recipients
JPH0362257A (ja) * 1989-07-31 1991-03-18 Toshiba Corp ネットワークモニタリングシステム
US5454102A (en) * 1993-01-19 1995-09-26 Canon Information Systems, Inc. Method and apparatus for transferring structured data using a self-generating node network
WO1995009395A1 (en) 1993-09-27 1995-04-06 Oracle Corporation Method and apparatus for parallel processing in a database system
US5495607A (en) 1993-11-15 1996-02-27 Conner Peripherals, Inc. Network management system having virtual catalog overview of files distributively stored across network domain
US5655116A (en) 1994-02-28 1997-08-05 Lucent Technologies Inc. Apparatus and methods for retrieving information
US5922030A (en) * 1995-12-20 1999-07-13 Nartron Corporation Method and system for controlling a solid product release mechanism
US5706514A (en) 1996-03-04 1998-01-06 Compaq Computer Corporation Distributed execution of mode mismatched commands in multiprocessor computer systems
JPH104431A (ja) * 1996-06-17 1998-01-06 Fujitsu Ltd スケジューリング装置およびスケジューリング方法
JP2933021B2 (ja) 1996-08-20 1999-08-09 日本電気株式会社 通信網障害回復方式
US6219692B1 (en) * 1997-03-21 2001-04-17 Stiles Invention, L.L.C. Method and system for efficiently disbursing requests among a tiered hierarchy of service providers
US6266682B1 (en) 1998-08-31 2001-07-24 Xerox Corporation Tagging related files in a document management system
US6269380B1 (en) 1998-08-31 2001-07-31 Xerox Corporation Property based mechanism for flexibility supporting front-end and back-end components having different communication protocols
EP0996071A3 (en) 1998-09-30 2005-10-05 Nippon Telegraph and Telephone Corporation Classification tree based information retrieval scheme
US7111290B1 (en) * 1999-01-28 2006-09-19 Ati International Srl Profiling program execution to identify frequently-executed portions and to assist binary translation
US6922708B1 (en) 1999-02-18 2005-07-26 Oracle International Corporation File system that supports transactions
US7010554B2 (en) 2002-04-04 2006-03-07 Emc Corporation Delegation of metadata management in a storage system by leasing of free file system blocks and i-nodes from a file system owner
US6564252B1 (en) 1999-03-11 2003-05-13 Microsoft Corporation Scalable storage system with unique client assignment to storage server partitions
US6654806B2 (en) * 1999-04-09 2003-11-25 Sun Microsystems, Inc. Method and apparatus for adaptably providing data to a network environment
US6700871B1 (en) * 1999-05-04 2004-03-02 3Com Corporation Increased throughput across data network interface by dropping redundant packets
US7181419B1 (en) * 2001-09-13 2007-02-20 Ewinwin, Inc. Demand aggregation system
US6745385B1 (en) 1999-09-01 2004-06-01 Microsoft Corporation Fixing incompatible applications by providing stubs for APIs
US6741983B1 (en) 1999-09-28 2004-05-25 John D. Birdwell Method of indexed storage and retrieval of multidimensional information
US6947953B2 (en) 1999-11-05 2005-09-20 The Board Of Trustees Of The Leland Stanford Junior University Internet-linked system for directory protocol based data storage, retrieval and analysis
US6594651B2 (en) 1999-12-22 2003-07-15 Ncr Corporation Method and apparatus for parallel execution of SQL-from within user defined functions
US6718372B1 (en) 2000-01-07 2004-04-06 Emc Corporation Methods and apparatus for providing access by a first computing system to data stored in a shared storage device managed by a second computing system
US7089583B2 (en) * 2000-01-14 2006-08-08 Saba Software, Inc. Method and apparatus for a business applications server
WO2001052056A2 (en) * 2000-01-14 2001-07-19 Saba Software, Inc. Method and apparatus for a business applications management system platform
US6742035B1 (en) 2000-02-28 2004-05-25 Novell, Inc. Directory-based volume location service for a distributed file system
US7225191B1 (en) 2000-06-27 2007-05-29 Emc Corporation Method and apparatus for verifying storage access requests in a computer storage system with multiple storage elements
JP2002014777A (ja) 2000-06-29 2002-01-18 Hitachi Ltd Data migration method, protocol conversion device, and switching device using the same
US8219681B1 (en) 2004-03-26 2012-07-10 Emc Corporation System and method for managing provisioning of storage resources in a network with virtualization of resources in such a network
US6907005B1 (en) * 2000-07-24 2005-06-14 Telefonaktiebolaget L M Ericsson (Publ) Flexible ARQ for packet data transmission
US20020108099A1 (en) * 2000-10-11 2002-08-08 Charles Paclat Method for developing business components
US7475199B1 (en) * 2000-10-19 2009-01-06 Emc Corporation Scalable network file system
US7865596B2 (en) * 2000-11-02 2011-01-04 Oracle America, Inc. Switching system for managing storage in digital networks
US6907414B1 (en) 2000-12-22 2005-06-14 Trilogy Development Group, Inc. Hierarchical interface to attribute based database
US7509322B2 (en) 2001-01-11 2009-03-24 F5 Networks, Inc. Aggregated lock management for locking aggregated files in a switched file system
US8195760B2 (en) 2001-01-11 2012-06-05 F5 Networks, Inc. File aggregation in a switched file system
US7512673B2 (en) * 2001-01-11 2009-03-31 Attune Systems, Inc. Rule based aggregation of files and transactions in a switched file system
US7054927B2 (en) * 2001-01-29 2006-05-30 Adaptec, Inc. File system metadata describing server directory information
IL141599A0 (en) * 2001-02-22 2002-03-10 Infocyclone Inc Information retrieval system
US6650656B2 (en) * 2001-02-28 2003-11-18 Crossroads Systems, Inc. Method and system for reconciling extended copy command target descriptor lengths
US6980946B2 (en) 2001-03-15 2005-12-27 Microsoft Corporation Method for hybrid processing of software instructions of an emulated computer system
US6691109B2 (en) * 2001-03-22 2004-02-10 Turbo Worx, Inc. Method and apparatus for high-performance sequence comparison
US7415038B2 (en) * 2001-03-29 2008-08-19 International Business Machines Corporation Method and system for network management providing access to application bandwidth usage calculations
US7013303B2 (en) 2001-05-04 2006-03-14 Sun Microsystems, Inc. System and method for multiple data sources to plug into a standardized interface for distributed deep search
US7797271B1 (en) * 2001-06-18 2010-09-14 Versata Development Group, Inc. Custom browse hierarchies for subsets of items in a primary hierarchy
US6678695B1 (en) 2001-06-29 2004-01-13 Trilogy Development Group, Inc. Master data maintenance tool for single source data
US7194513B2 (en) * 2001-07-08 2007-03-20 Imran Sharif System and method for using an internet appliance to send/receive digital content files as E-mail attachments
US7039669B1 (en) 2001-09-28 2006-05-02 Oracle Corporation Techniques for adding a master in a distributed database without suspending database operations at extant master sites
US20030093471A1 (en) 2001-10-18 2003-05-15 Mitch Upton System and method using asynchronous messaging for application integration
US7177823B2 (en) * 2001-11-06 2007-02-13 International Business Machines Corporation In-queue jobs information monitoring and filtering
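JP4162184B2 (ja) 2001-11-14 2008-10-08 Hitachi, Ltd. Storage device having means for acquiring execution information of a database management system
CN1297844C (zh) * 2001-11-27 2007-01-31 Sharp Corporation Liquid crystal panel, liquid crystal panel manufacturing method, liquid crystal panel manufacturing apparatus, and polarizing plate bonding apparatus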
US8195705B2 (en) * 2001-12-11 2012-06-05 International Business Machines Corporation Hybrid search memory for network processor and computer systems
US6957222B1 (en) 2001-12-31 2005-10-18 Ncr Corporation Optimizing an outer join operation using a bitmap index structure
US8914429B2 (en) * 2002-02-08 2014-12-16 Willaim Pitts Method for creating global distributed namespace
US20030172094A1 (en) 2002-03-06 2003-09-11 International Business Machines Corporation Automatic file system maintenance
US7672853B2 (en) 2002-03-29 2010-03-02 Siebel Systems, Inc. User interface for processing requests for approval
US6947925B2 (en) 2002-04-15 2005-09-20 International Business Machines Corporation System and method for performing lookups across namespace domains using universal resource locators
US6950815B2 (en) * 2002-04-23 2005-09-27 International Business Machines Corporation Content management system and methodology featuring query conversion capability for efficient searching
US6954748B2 (en) 2002-04-25 2005-10-11 International Business Machines Corporation Remote data access and integration of distributed data sources through data schema and query abstraction
US7487405B1 (en) * 2002-05-10 2009-02-03 Oracle International Corporation Method and mechanism for dynamically configuring logical paths of state machines
US7010521B2 (en) 2002-05-13 2006-03-07 Netezza Corporation Optimized database appliance
US8611919B2 (en) 2002-05-23 2013-12-17 Wounder Gmbh., Llc System, method, and computer program product for providing location based services and mobile e-commerce
US6999958B2 (en) 2002-06-07 2006-02-14 International Business Machines Corporation Runtime query optimization for dynamically selecting from multiple plans in a query based upon runtime-evaluated performance criterion
US6915291B2 (en) 2002-06-07 2005-07-05 International Business Machines Corporation Object-oriented query execution data structure
US6910032B2 (en) * 2002-06-07 2005-06-21 International Business Machines Corporation Parallel database query processing for non-uniform data sources via buffered access
US7174332B2 (en) 2002-06-11 2007-02-06 Ip. Com, Inc. Method and apparatus for safeguarding files
WO2004008348A1 (en) 2002-07-16 2004-01-22 Horn Bruce L Computer system for automatic organization, indexing and viewing of information from multiple sources
US8417678B2 (en) 2002-07-30 2013-04-09 Storediq, Inc. System, method and apparatus for enterprise policy management
AU2003265335A1 (en) 2002-07-30 2004-02-16 Deepfile Corporation Method and apparatus for managing file systems and file-based data storage
US7493311B1 (en) * 2002-08-01 2009-02-17 Microsoft Corporation Information server and pluggable data sources
US7370064B2 (en) 2002-08-06 2008-05-06 Yousefi Zadeh Homayoun Database remote replication for back-end tier of multi-tier computer systems
US6996556B2 (en) 2002-08-20 2006-02-07 International Business Machines Corporation Metadata manager for database query optimizer
US7284030B2 (en) 2002-09-16 2007-10-16 Network Appliance, Inc. Apparatus and method for processing data in a network
US6996582B2 (en) * 2002-10-03 2006-02-07 Hewlett-Packard Development Company, L.P. Virtual storage systems and virtual storage system operational methods
US7313512B1 (en) 2002-10-18 2007-12-25 Microsoft Corporation Software license enforcement mechanism for an emulated computing environment
US8176186B2 (en) 2002-10-30 2012-05-08 Riverbed Technology, Inc. Transaction accelerator for client-server communications systems
US7421433B2 (en) 2002-10-31 2008-09-02 Hewlett-Packard Development Company, L.P. Semantic-based system including semantic vectors
US8041735B1 (en) 2002-11-01 2011-10-18 Bluearc Uk Limited Distributed file system and method
US7373350B1 (en) * 2002-11-07 2008-05-13 Data Advantage Group Virtual metadata analytics and management platform
US7370007B2 (en) * 2002-11-18 2008-05-06 Sap Aktiengesellschaft Catalog search agent
AU2003282361A1 (en) 2002-11-20 2004-06-15 Filesx Ltd. Fast backup storage and fast recovery of data (fbsrd)
US20040103087A1 (en) 2002-11-25 2004-05-27 Rajat Mukherjee Method and apparatus for combining multiple search workers
US7051034B1 (en) 2002-12-18 2006-05-23 Oracle International Corporation Dynamic optimization for processing a restartable sub-tree of a query execution plan
US8543564B2 (en) * 2002-12-23 2013-09-24 West Publishing Company Information retrieval systems with database-selection aids
US6993516B2 (en) 2002-12-26 2006-01-31 International Business Machines Corporation Efficient sampling of a relational database
JP4237515B2 (ja) 2003-02-07 2009-03-11 Hitachi Global Storage Technologies Network storage virtualization method and network storage system
US7254636B1 (en) 2003-03-14 2007-08-07 Cisco Technology, Inc. Method and apparatus for transparent distributed network-attached storage with web cache communication protocol/anycast and file handle redundancy
US20040186842A1 (en) 2003-03-18 2004-09-23 Darren Wesemann Systems and methods for providing access to data stored in different types of data repositories
US7895191B2 (en) 2003-04-09 2011-02-22 International Business Machines Corporation Improving performance of database queries
US7447786B2 (en) 2003-05-09 2008-11-04 Oracle International Corporation Efficient locking of shared data that is accessed for reads in a cluster database
US7653699B1 (en) 2003-06-12 2010-01-26 Symantec Operating Corporation System and method for partitioning a file system for enhanced availability and scalability
CA2533167A1 (en) 2003-07-22 2005-01-27 Kinor Technologies Inc. Information access using ontologies
US7739316B2 (en) 2003-08-21 2010-06-15 Microsoft Corporation Systems and methods for the implementation of base schema for organizing units of information manageable by a hardware/software interface system
US8131739B2 (en) 2003-08-21 2012-03-06 Microsoft Corporation Systems and methods for interfacing application programs with an item-based storage platform
JP4335619B2 (ja) * 2003-09-04 2009-09-30 NTT DoCoMo, Inc. Packet priority control device and method thereof
US6912482B2 (en) * 2003-09-11 2005-06-28 Veritas Operating Corporation Data storage analysis mechanism
JP2005107928A (ja) * 2003-09-30 2005-04-21 Fujitsu Ltd Data file system, data access node, brain node, data access program, and brain program
US7702676B2 (en) * 2006-12-29 2010-04-20 Teradata Us, Inc. Parallel virtual optimization
JP2005182683A (ja) 2003-12-24 2005-07-07 Hitachi Ltd Data transfer method, system, and program
US7346613B2 (en) 2004-01-26 2008-03-18 Microsoft Corporation System and method for a unified and blended search
US20050198401A1 (en) 2004-01-29 2005-09-08 Chron Edward G. Efficiently virtualizing multiple network attached stores
US7334002B2 (en) 2004-02-27 2008-02-19 Microsoft Corporation System and method for recovery units in databases
US7606792B2 (en) * 2004-03-19 2009-10-20 Microsoft Corporation System and method for efficient evaluation of a query that invokes a table valued function
US7297530B2 (en) * 2004-04-01 2007-11-20 Biotrace Limited Device for use in monitoring a swab technique
US7343459B2 (en) * 2004-04-30 2008-03-11 Commvault Systems, Inc. Systems and methods for detecting & mitigating storage risks
GB0412655D0 (en) * 2004-06-07 2004-07-07 British Telecomm Distributed storage network
US7873650B1 (en) 2004-06-11 2011-01-18 Seisint, Inc. System and method for distributing data in a parallel processing system
US7693826B1 (en) * 2004-06-11 2010-04-06 Seisint, Inc. System and method for pre-compiling a query and pre-keying a database system
US7406461B1 (en) * 2004-06-11 2008-07-29 Seisint, Inc. System and method for processing a request to perform an activity associated with a precompiled query
US7707143B2 (en) 2004-06-14 2010-04-27 International Business Machines Corporation Systems, methods, and computer program products that automatically discover metadata objects and generate multidimensional models
US20050289098A1 (en) 2004-06-24 2005-12-29 International Business Machines Corporation Dynamically selecting alternative query access plans
US8271976B2 (en) 2004-06-30 2012-09-18 Microsoft Corporation Systems and methods for initializing multiple virtual processors within a single virtual machine
US8972977B2 (en) * 2004-06-30 2015-03-03 Microsoft Technology Licensing, Llc Systems and methods for providing seamless software compatibility using virtual machines
US7596571B2 (en) * 2004-06-30 2009-09-29 Technorati, Inc. Ecosystem method of aggregation and search and related techniques
US7660873B2 (en) 2004-08-16 2010-02-09 General Electric Company Systems and methods for communicating messages
GB2418108B (en) * 2004-09-09 2007-06-27 Surfcontrol Plc System, method and apparatus for use in monitoring or controlling internet access
US7761678B1 (en) * 2004-09-29 2010-07-20 Verisign, Inc. Method and apparatus for an improved file repository
US7725601B2 (en) 2004-10-12 2010-05-25 International Business Machines Corporation Apparatus, system, and method for presenting a mapping between a namespace and a set of computing resources
US9639554B2 (en) 2004-12-17 2017-05-02 Microsoft Technology Licensing, Llc Extensible file system
US8051052B2 (en) 2004-12-21 2011-11-01 Sandisk Technologies Inc. Method for creating control structure for versatile content control
US8621458B2 (en) 2004-12-21 2013-12-31 Microsoft Corporation Systems and methods for exposing processor topology for virtual machines
US8274518B2 (en) 2004-12-30 2012-09-25 Microsoft Corporation Systems and methods for virtualizing graphics subsystems
US8260753B2 (en) * 2004-12-31 2012-09-04 Emc Corporation Backup information management
US7702742B2 (en) * 2005-01-18 2010-04-20 Fortinet, Inc. Mechanism for enabling memory transactions to be conducted across a lossy network
US20060193316A1 (en) * 2005-02-25 2006-08-31 Allen Mark R Autonomous network topology and method of operating same
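WO2006095875A1 (ja) 2005-03-10 2006-09-14 Nippon Telegraph And Telephone Corporation Network system, method for controlling access to storage device, management server, storage device, login control method, network boot system, and method for accessing unit storage unit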
US7383274B2 (en) 2005-03-21 2008-06-03 Microsoft Corporation Systems and methods for efficiently storing and accessing data storage system paths
US7640230B2 (en) 2005-04-05 2009-12-29 Microsoft Corporation Query plan selection control using run-time association mechanism
US7908242B1 (en) 2005-04-11 2011-03-15 Experian Information Solutions, Inc. Systems and methods for optimizing database queries
US7689609B2 (en) * 2005-04-25 2010-03-30 Netapp, Inc. Architecture for supporting sparse volumes
US8635612B2 (en) * 2005-04-29 2014-01-21 Microsoft Corporation Systems and methods for hypervisor discovery and utilization
US8332526B2 (en) * 2005-05-25 2012-12-11 Microsoft Corporation Data communication protocol including negotiation and command compounding
JP4611830B2 (ja) * 2005-07-22 2011-01-12 Masaru Kitsuregawa Database management system and method
US8788464B1 (en) 2005-07-25 2014-07-22 Lockheed Martin Corporation Fast ingest, archive and retrieval systems, method and computer programs
WO2007014296A2 (en) 2005-07-25 2007-02-01 Parascale, Inc. Scalable distributed file storage access and management
US7831582B1 (en) * 2005-08-23 2010-11-09 Amazon Technologies, Inc. Method and system for associating keywords with online content sources
US7383247B2 (en) * 2005-08-29 2008-06-03 International Business Machines Corporation Query routing of federated information systems for fast response time, load balance, availability, and reliability
US7823170B2 (en) * 2005-08-31 2010-10-26 Sap Ag Queued asynchronous remote function call dependency management
US20100036840A1 (en) 2005-09-09 2010-02-11 Pitts William M Presentation of Search Results
US8364638B2 (en) 2005-09-15 2013-01-29 Ca, Inc. Automated filer technique for use in virtualized appliances and applications
US7877379B2 (en) 2005-09-30 2011-01-25 Oracle International Corporation Delaying evaluation of expensive expressions in a query
US7788303B2 (en) 2005-10-21 2010-08-31 Isilon Systems, Inc. Systems and methods for distributed system scanning
US7685109B1 (en) * 2005-12-29 2010-03-23 Amazon Technologies, Inc. Method and apparatus for data partitioning and replication in a searchable data service
US7743051B1 (en) * 2006-01-23 2010-06-22 Clearwell Systems, Inc. Methods, systems, and user interface for e-mail search and retrieval
US8769127B2 (en) 2006-02-10 2014-07-01 Northrop Grumman Systems Corporation Cross-domain solution (CDS) collaborate-access-browse (CAB) and assured file transfer (AFT)
US7567956B2 (en) * 2006-02-15 2009-07-28 Panasonic Corporation Distributed meta data management middleware
US20070203893A1 (en) 2006-02-27 2007-08-30 Business Objects, S.A. Apparatus and method for federated querying of unstructured data
US7702625B2 (en) 2006-03-03 2010-04-20 International Business Machines Corporation Building a unified query that spans heterogeneous environments
CN101438256B (zh) * 2006-03-07 2011-12-21 Sony Corporation Information processing device, information communication system, and information processing method
US9118697B1 (en) * 2006-03-20 2015-08-25 Netapp, Inc. System and method for integrating namespace management and storage management in a storage system environment
TWI444757B (zh) * 2006-04-21 2014-07-11 Asahi Glass Co Ltd Reflective mask blank for extreme ultraviolet (EUV) lithography
US8635247B1 (en) * 2006-04-28 2014-01-21 Netapp, Inc. Namespace and storage management application infrastructure for use in management of resources in a storage system environment
US7739296B2 (en) * 2006-07-12 2010-06-15 International Business Machines Corporation System and method for virtualization of relational stored procedures in non-native relational database systems
US7624118B2 (en) 2006-07-26 2009-11-24 Microsoft Corporation Data processing over very large databases
US7536383B2 (en) * 2006-08-04 2009-05-19 Apple Inc. Method and apparatus for searching metadata
US20080059489A1 (en) 2006-08-30 2008-03-06 International Business Machines Corporation Method for parallel query processing with non-dedicated, heterogeneous computers that is resilient to load bursts and node failures
US8028290B2 (en) 2006-08-30 2011-09-27 International Business Machines Corporation Multiple-core processor supporting multiple instruction set architectures
US20080082644A1 (en) 2006-09-29 2008-04-03 Microsoft Corporation Distributed parallel computing
US7720841B2 (en) 2006-10-04 2010-05-18 International Business Machines Corporation Model-based self-optimizing distributed information management
US8190610B2 (en) 2006-10-05 2012-05-29 Yahoo! Inc. MapReduce for distributed database processing
US8520673B2 (en) * 2006-10-23 2013-08-27 Telcordia Technologies, Inc. Method and communication device for routing unicast and multicast messages in an ad-hoc wireless network
GB0621433D0 (en) * 2006-10-27 2006-12-06 3G Scene Ltd Networking application
US20080109573A1 (en) * 2006-11-08 2008-05-08 Sicortex, Inc RDMA systems and methods for sending commands from a source node to a target node for local execution of commands at the target node
US7523123B2 (en) 2006-11-16 2009-04-21 Yahoo! Inc. Map-reduce with merge to process multiple relational datasets
US7613947B1 (en) 2006-11-30 2009-11-03 Netapp, Inc. System and method for storage takeover
US7849073B2 (en) * 2006-12-18 2010-12-07 Ianywhere Solutions, Inc. Load balancing for complex database query plans
US7599969B2 (en) 2006-12-20 2009-10-06 International Business Machines Corporation Method and system for scheduling workload in databases
US7593938B2 (en) 2006-12-22 2009-09-22 Isilon Systems, Inc. Systems and methods of directory entry encodings
US20080172281A1 (en) * 2007-01-12 2008-07-17 David Malthe Probst Scheduling Service Based on Usage Data
US7624131B2 (en) 2007-01-18 2009-11-24 Microsoft Corporation Type restriction and mapping for partial materialization
US20080195577A1 (en) 2007-02-09 2008-08-14 Wei Fan Automatically and adaptively determining execution plans for queries with parameter markers
JP5088668B2 (ja) 2007-03-08 2012-12-05 NEC Corporation Computer load estimation system, computer load estimation method, and computer load estimation program
US9684554B2 (en) 2007-03-27 2017-06-20 Teradata Us, Inc. System and method for using failure casting to manage failures in a computed system
US8359495B2 (en) 2007-03-27 2013-01-22 Teradata Us, Inc. System and method for using failure casting to manage failures in computer systems
JP4352079B2 (ja) 2007-03-28 2009-10-28 Toshiba Corporation System, apparatus, and method for retrieving information from a distributed database
US7958303B2 (en) 2007-04-27 2011-06-07 Gary Stephen Shuster Flexible data storage system
US7827201B1 (en) 2007-04-27 2010-11-02 Network Appliance, Inc. Merging containers in a multi-container system
US8875266B2 (en) * 2007-05-16 2014-10-28 Vmware, Inc. System and methods for enforcing software license compliance with virtual machines
US7689535B2 (en) 2007-05-30 2010-03-30 Red Hat, Inc. Method for providing a unified view of a domain model to a user
US20110113052A1 (en) * 2007-06-08 2011-05-12 Hoernkvist John Query result iteration for multiple queries
US20080313183A1 (en) 2007-06-14 2008-12-18 Charles Edward Cunningham Apparatus and method for mapping feature catalogs
US8205194B2 (en) * 2007-06-29 2012-06-19 Microsoft Corporation Updating offline virtual machines or VM images
US7886301B2 (en) * 2007-06-29 2011-02-08 Microsoft Corporation Namespace merger
US8452821B2 (en) * 2007-06-29 2013-05-28 Microsoft Corporation Efficient updates for distributed file systems
JP5011006B2 (ja) 2007-07-03 2012-08-29 Hitachi, Ltd. Resource allocation method, resource allocation program, and resource allocation device
US20090043745A1 (en) 2007-08-07 2009-02-12 Eric L Barsness Query Execution and Optimization with Autonomic Error Recovery from Network Failures in a Parallel Computer System with Multiple Networks
US7949693B1 (en) 2007-08-23 2011-05-24 Osr Open Systems Resources, Inc. Log-structured host data storage
JP4587053B2 (ja) * 2007-08-28 2010-11-24 NEC Corporation Communication device, communication system, packet loss detection method, and packet loss detection program
WO2009032712A2 (en) 2007-08-29 2009-03-12 Nirvanix, Inc. Method and system for moving requested files from one storage location to another
US8244781B2 (en) * 2007-09-28 2012-08-14 Emc Corporation Network accessed storage files system query/set proxy service for a storage virtualization system
US7885953B2 (en) * 2007-10-03 2011-02-08 International Business Machines Corporation Off-loading star join operations to a storage server
CN101849212A (zh) * 2007-11-08 2010-09-29 ASML Netherlands B.V. Radiation system and method, and spectral purity filter
US8180747B2 (en) 2007-11-12 2012-05-15 F5 Networks, Inc. Load sharing cluster file systems
US9167034B2 (en) * 2007-11-12 2015-10-20 International Business Machines Corporation Optimized peer-to-peer file transfers on a multi-node computer system
US7844620B2 (en) 2007-11-16 2010-11-30 International Business Machines Corporation Real time data replication for query execution in a massively parallel computer
US8266122B1 (en) 2007-12-19 2012-09-11 Amazon Technologies, Inc. System and method for versioning data in a distributed data store
US7917502B2 (en) * 2008-02-27 2011-03-29 International Business Machines Corporation Optimized collection of just-in-time statistics for database query optimization
US20090222569A1 (en) 2008-02-29 2009-09-03 Atrato, Inc. Storage system front end
US20120095992A1 (en) 2008-03-04 2012-04-19 Timothy Cutting Unified media search
US8019737B2 (en) 2008-03-13 2011-09-13 Harris Corporation Synchronization of metadata
US8103628B2 (en) * 2008-04-09 2012-01-24 Harmonic Inc. Directed placement of data in a redundant data storage system
US8185488B2 (en) * 2008-04-17 2012-05-22 Emc Corporation System and method for correlating events in a pluggable correlation architecture
US8386508B2 (en) 2008-04-28 2013-02-26 Infosys Technologies Limited System and method for parallel query evaluation
US8682853B2 (en) 2008-05-16 2014-03-25 Paraccel Llc System and method for enhancing storage performance in analytical database applications
US8131711B2 (en) 2008-05-22 2012-03-06 Teradata Corporation System, method, and computer-readable medium for partial redistribution, partial duplication of rows of parallel join operation on skewed data
US9069599B2 (en) 2008-06-19 2015-06-30 Servicemesh, Inc. System and method for a cloud computing abstraction layer with security zone facilities
US8010738B1 (en) * 2008-06-27 2011-08-30 Emc Corporation Techniques for obtaining a specified lifetime for a data storage device
US8775413B2 (en) 2008-06-30 2014-07-08 Teradata Us, Inc. Parallel, in-line, query capture database for real-time logging, monitoring and optimizer feedback
US8239417B2 (en) 2008-08-07 2012-08-07 Armanta, Inc. System, method, and computer program product for accessing and manipulating remote datasets
US20100042655A1 (en) 2008-08-18 2010-02-18 Xerox Corporation Method for selective compression for planned degradation and obsolence of files
US8255430B2 (en) 2008-08-26 2012-08-28 Caringo, Inc. Shared namespace for storage clusters
US8312037B1 (en) 2008-08-28 2012-11-13 Amazon Technologies, Inc. Dynamic tree determination for data processing
US8195644B2 (en) 2008-10-06 2012-06-05 Teradata Us, Inc. System, method, and computer-readable medium for optimization of multiple parallel join operations on skewed data
US20100094716A1 (en) 2008-10-15 2010-04-15 Ganesan Chandramouli Method and computer-readable storage media to determine and access provisioning services
SE533007C2 (sv) 2008-10-24 2010-06-08 Ilt Productions Ab Distributed data storage
US20100114970A1 (en) 2008-10-31 2010-05-06 Yahoo! Inc. Distributed index data structure
US8255550B1 (en) 2008-12-30 2012-08-28 Emc Corporation Multi-protocol global namespace mechanism for network attached storage
US8666966B2 (en) * 2009-01-30 2014-03-04 Hewlett-Packard Development Company, L.P. Providing parallel result streams for database queries
US20100198808A1 (en) * 2009-02-02 2010-08-05 Goetz Graefe Database system implementation prioritization using robustness maps
US8224811B2 (en) * 2009-02-02 2012-07-17 Hewlett-Packard Development Company, L.P. Workload management using robustness mapping
US10929399B2 (en) * 2009-02-02 2021-02-23 Micro Focus Llc Database system testing using robustness maps
US8572068B2 (en) * 2009-02-02 2013-10-29 Hewlett-Packard Development Company, L.P. Evaluation of set of representative query performance using robustness mapping
US9110706B2 (en) 2009-02-09 2015-08-18 Microsoft Technology Licensing, Llc General purpose distributed data parallel computing using a high level language
US8352517B2 (en) 2009-03-02 2013-01-08 Oracle International Corporation Infrastructure for spilling pages to a persistent store
DE112009004503T5 (de) 2009-03-10 2012-05-31 Hewlett-Packard Development Co., L.P. Optimizing access time of files stored on storages
US8209664B2 (en) 2009-03-18 2012-06-26 Microsoft Corporation High level programming extensions for distributed data parallel processing
US8239847B2 (en) 2009-03-18 2012-08-07 Microsoft Corporation General distributed reduction for data parallel computing
KR101335101B1 (ko) 2009-03-19 2013-12-03 Murakumo Corporation Method and system for managing replication of data
EP2411918B1 (en) 2009-03-23 2018-07-11 Riverbed Technology, Inc. Virtualized data storage system architecture
US8713038B2 (en) * 2009-04-02 2014-04-29 Pivotal Software, Inc. Integrating map-reduce into a distributed relational database
US8200723B1 (en) 2009-04-21 2012-06-12 Network Appliance, Inc. Metadata file system backed by database
US20100274772A1 (en) 2009-04-23 2010-10-28 Allen Samuels Compressed data objects referenced via address references and compression references
US8126875B2 (en) * 2009-05-08 2012-02-28 Microsoft Corporation Instant answers and integrated results of a browser
US8060645B1 (en) * 2009-05-26 2011-11-15 Google Inc. Semi reliable transport of multimedia content
US8510280B2 (en) * 2009-06-30 2013-08-13 Teradata Us, Inc. System, method, and computer-readable medium for dynamic detection and management of data skew in parallel join operations
US8370394B2 (en) * 2009-07-17 2013-02-05 International Business Machines Corporation Parallel processing of data organized in a tree structure
GB2472620B (en) 2009-08-12 2016-05-18 Cloudtran Inc Distributed transaction processing
US9268815B2 (en) * 2009-08-20 2016-02-23 Hewlett Packard Enterprise Development Lp Map-reduce and parallel processing in databases
EP2290562A1 (en) * 2009-08-24 2011-03-02 Amadeus S.A.S. Segmented main-memory stored relational database table system with improved collaborative scan algorithm
US8352429B1 (en) * 2009-08-31 2013-01-08 Symantec Corporation Systems and methods for managing portions of files in multi-tier storage systems
US8051113B1 (en) 2009-09-17 2011-11-01 Netapp, Inc. Method and system for managing clustered and non-clustered storage systems
US8301822B2 (en) * 2009-09-23 2012-10-30 Sandisk Il Ltd. Multi-protocol storage device bridge
US9165034B2 (en) 2009-10-15 2015-10-20 Hewlett-Packard Development Company, L.P. Heterogeneous data source management
US8751533B1 (en) * 2009-11-25 2014-06-10 Netapp, Inc. Method and system for transparently migrating storage objects between nodes in a clustered storage system
US20110131198A1 (en) 2009-11-30 2011-06-02 Theodore Johnson Method and apparatus for providing a filter join on data streams
US8868510B2 (en) * 2009-12-03 2014-10-21 Sybase, Inc. Managing data storage as an in-memory database in a database management system
US8180813B1 (en) 2009-12-08 2012-05-15 Netapp, Inc. Content repository implemented in a network storage server system
US8484259B1 (en) 2009-12-08 2013-07-09 Netapp, Inc. Metadata subsystem for a distributed object store in a network storage system
US20110137966A1 (en) * 2009-12-08 2011-06-09 Netapp, Inc. Methods and systems for providing a unified namespace for multiple network protocols
US8832154B1 (en) 2009-12-08 2014-09-09 Netapp, Inc. Object location service for network-based content repository
JP5375972B2 (ja) 2009-12-10 2013-12-25 NEC Corporation Distributed file system, data selection method thereof, and program
US8543596B1 (en) * 2009-12-17 2013-09-24 Teradata Us, Inc. Assigning blocks of a file of a distributed file system to processing units of a parallel database management system
US9323758B1 (en) 2009-12-22 2016-04-26 Emc Corporation Efficient migration of replicated files from a file server having a file de-duplication facility
EP2517124A1 (en) 2009-12-23 2012-10-31 Ab Initio Technology LLC Managing queries
US8281105B2 (en) 2010-01-20 2012-10-02 Hitachi, Ltd. I/O conversion method and apparatus for storage system
US8595237B1 (en) 2010-02-17 2013-11-26 Netapp, Inc. Method and system for managing metadata in a storage environment
US9229980B2 (en) 2010-02-23 2016-01-05 Yahoo! Inc. Composition model for cloud-hosted serving applications
US9311184B2 (en) * 2010-02-27 2016-04-12 Cleversafe, Inc. Storing raid data as encoded data slices in a dispersed storage network
US8548986B2 (en) 2010-03-19 2013-10-01 Microsoft Corporation Adaptive row-batch processing of database data
US8874961B2 (en) 2010-03-22 2014-10-28 Infosys Limited Method and system for automatic failover of distributed query processing using distributed shared memory
US8577911B1 (en) 2010-03-23 2013-11-05 Google Inc. Presenting search term refinements
US9727588B1 (en) 2010-03-29 2017-08-08 EMC IP Holding Company LLC Applying XAM processes
US8954265B2 (en) * 2010-04-09 2015-02-10 Tomtom North America, Inc. Method of resolving a location from data representative thereof
US8484243B2 (en) 2010-05-05 2013-07-09 Cisco Technology, Inc. Order-independent stream query processing
US9037615B2 (en) 2010-05-14 2015-05-19 International Business Machines Corporation Querying and integrating structured and unstructured data
US8935232B2 (en) 2010-06-04 2015-01-13 Yale University Query execution systems and methods
US9495427B2 (en) 2010-06-04 2016-11-15 Yale University Processing of data using a database system in communication with a data processing framework
US9336263B2 (en) * 2010-06-04 2016-05-10 Yale University Data loading systems and methods
US9323775B2 (en) 2010-06-19 2016-04-26 Mapr Technologies, Inc. Map-reduce ready distributed file system
US9449007B1 (en) 2010-06-29 2016-09-20 Emc Corporation Controlling access to XAM metadata
EP2410440B1 (en) 2010-07-20 2012-10-03 Siemens Aktiengesellschaft Distributed system
US20120023145A1 (en) 2010-07-23 2012-01-26 International Business Machines Corporation Policy-based computer file management based on content-based analytics
US8640137B1 (en) * 2010-08-30 2014-01-28 Adobe Systems Incorporated Methods and apparatus for resource management in cluster computing
US9674294B1 (en) * 2010-09-01 2017-06-06 The Mathworks, Inc. Integrated collaboration environment
US8620974B2 (en) 2010-09-09 2013-12-31 International Business Machines Corporation Persistent file replacement mechanism
US8589553B2 (en) * 2010-09-17 2013-11-19 Microsoft Corporation Directory leasing
SG179314A1 (en) * 2010-09-23 2012-04-27 Eutech Cybernetic Pte Ltd Computer implemented method and system for integrating multiple building systems and business applications
US9215275B2 (en) * 2010-09-30 2015-12-15 A10 Networks, Inc. System and method to balance servers based on server load status
CN101963947B (zh) * 2010-09-30 2013-10-02 VIA Technologies, Inc. Universal serial bus transaction translator and bulk transfer method
US8510257B2 (en) 2010-10-19 2013-08-13 Xerox Corporation Collapsed gibbs sampler for sparse topic models and discrete matrix factorization
US20120036146A1 (en) 2010-10-26 2012-02-09 ParElastic Corporation Apparatus for elastic database processing with heterogeneous data
US8396894B2 (en) 2010-11-05 2013-03-12 Apple Inc. Integrated repository of structured and unstructured data
US8483095B2 (en) * 2010-11-11 2013-07-09 International Business Machines Corporation Configurable network socket retransmission timeout parameters
US9218278B2 (en) * 2010-12-13 2015-12-22 SanDisk Technologies, Inc. Auto-commit memory
US8478743B2 (en) 2010-12-23 2013-07-02 Microsoft Corporation Asynchronous transfer of state information between continuous query plans
US8572031B2 (en) * 2010-12-23 2013-10-29 Mongodb, Inc. Method and apparatus for maintaining replica sets
US9589029B2 (en) 2010-12-28 2017-03-07 Citrix Systems, Inc. Systems and methods for database proxy request switching
US8538954B2 (en) 2011-01-25 2013-09-17 Hewlett-Packard Development Company, L.P. Aggregate function partitions for distributed processing
US20120203765A1 (en) 2011-02-04 2012-08-09 Microsoft Corporation Online catalog with integrated content
US8578096B2 (en) * 2011-04-08 2013-11-05 Symantec Corporation Policy for storing data objects in a multi-tier storage system
US9396242B2 (en) 2011-04-11 2016-07-19 Salesforce.Com, Inc. Multi-master data replication in a distributed multi-tenant system
US20120278471A1 (en) * 2011-04-26 2012-11-01 Motorola Mobility, Inc. Devices and Methods for Two Step Searches for Servers by a Communication Device
US8935301B2 (en) 2011-05-24 2015-01-13 International Business Machines Corporation Data context selection in business analytics reports
US8875132B2 (en) 2011-05-31 2014-10-28 Neverfail Group Limited Method and apparatus for implementing virtual proxy to support heterogeneous systems management
US9116634B2 (en) * 2011-06-10 2015-08-25 International Business Machines Corporation Configure storage class memory command
US9146766B2 (en) * 2011-06-22 2015-09-29 Vmware, Inc. Consistent unmapping of application data in presence of concurrent, unquiesced writers and readers
US20130006996A1 (en) * 2011-06-22 2013-01-03 Google Inc. Clustering E-Mails Using Collaborative Information
US20130007091A1 (en) * 2011-07-01 2013-01-03 Yahoo! Inc. Methods and apparatuses for storing shared data files in distributed file systems
WO2013009503A2 (en) * 2011-07-08 2013-01-17 Yale University Query execution systems and methods
US8805870B2 (en) * 2011-07-27 2014-08-12 Hewlett-Packard Development Company, L.P. Multi-input, multi-output-per-input user-defined-function-based database operations
US20130036272A1 (en) 2011-08-02 2013-02-07 Microsoft Corporation Storage engine node for cloud-based storage
US8533231B2 (en) 2011-08-12 2013-09-10 Nexenta Systems, Inc. Cloud storage system with distributed metadata
US8601016B2 (en) * 2011-08-30 2013-12-03 International Business Machines Corporation Pre-generation of structured query language (SQL) from application programming interface (API) defined query systems
US8868546B2 (en) 2011-09-15 2014-10-21 Oracle International Corporation Query explain plan in a distributed data management system
US8700875B1 (en) 2011-09-20 2014-04-15 Netapp, Inc. Cluster view for storage devices
US9386127B2 (en) * 2011-09-28 2016-07-05 Open Text S.A. System and method for data transfer, including protocols for use in data transfer
WO2013049715A1 (en) 2011-09-29 2013-04-04 Cirro, Inc. Federated query engine for federation of data queries across structure and unstructured data
US8719271B2 (en) * 2011-10-06 2014-05-06 International Business Machines Corporation Accelerating data profiling process
US8359305B1 (en) 2011-10-18 2013-01-22 International Business Machines Corporation Query metadata engine
US9058371B2 (en) 2011-11-07 2015-06-16 Sap Se Distributed database log recovery
US9122535B2 (en) * 2011-11-22 2015-09-01 Netapp, Inc. Optimizing distributed data analytics for shared storage
US9031909B2 (en) 2011-11-29 2015-05-12 Microsoft Technology Licensing, Llc Provisioning and/or synchronizing using common metadata
US9286414B2 (en) 2011-12-02 2016-03-15 Microsoft Technology Licensing, Llc Data discovery and description service
US8971916B1 (en) 2011-12-09 2015-03-03 Emc Corporation Locating a data storage system
TWI461929B (zh) * 2011-12-09 2014-11-21 Promise Technology Inc Cloud data storage system
US9235396B2 (en) 2011-12-13 2016-01-12 Microsoft Technology Licensing, Llc Optimizing data partitioning for data-parallel computing
US20130166523A1 (en) 2011-12-21 2013-06-27 Sybase, Inc. Parallel Execution In A Transaction Using Independent Queries
US9002813B2 (en) * 2011-12-22 2015-04-07 Sap Se Execution plan preparation in application server
US20130166543A1 (en) 2011-12-22 2013-06-27 Microsoft Corporation Client-based search over local and remote data sources for intent analysis, ranking, and relevance
US8868594B2 (en) * 2011-12-23 2014-10-21 Sap Ag Split processing paths for a database calculation engine
US8762378B2 (en) * 2011-12-23 2014-06-24 Sap Ag Independent table nodes in parallelized database environments
US9160697B2 (en) * 2012-01-01 2015-10-13 Qualcomm Incorporated Data delivery optimization
US9020979B2 (en) * 2012-01-05 2015-04-28 International Business Machines Corporation Rich database metadata model that captures application relationships, mappings, constraints, and complex data structures
US8850450B2 (en) 2012-01-18 2014-09-30 International Business Machines Corporation Warning track interruption facility
US9170827B2 (en) 2012-01-31 2015-10-27 Hewlett-Packard Development Company, L.P. Configuration file compatibility
US20130246347A1 (en) 2012-03-15 2013-09-19 Ellen L. Sorenson Database file groups
US8682922B2 (en) 2012-03-20 2014-03-25 Schlumberger Technology Corporation Method and system for accessing a virtual seismic cube
US8645356B2 (en) * 2012-03-28 2014-02-04 International Business Machines Corporation Adaptive query execution plan enhancement
US9639575B2 (en) 2012-03-30 2017-05-02 Khalifa University Of Science, Technology And Research Method and system for processing data queries
US9628438B2 (en) 2012-04-06 2017-04-18 Exablox Consistent ring namespaces facilitating data storage and organization in network infrastructures
US11347443B2 (en) 2012-04-13 2022-05-31 Veritas Technologies Llc Multi-tier storage using multiple file sets
US9501550B2 (en) 2012-04-18 2016-11-22 Renmin University Of China OLAP query processing method oriented to database and HADOOP hybrid platform
US9378246B2 (en) * 2012-05-03 2016-06-28 Hiromichi Watari Systems and methods of accessing distributed data
US20130311447A1 (en) 2012-05-15 2013-11-21 Microsoft Corporation Scenario based insights into structure data
US8825752B1 (en) 2012-05-18 2014-09-02 Netapp, Inc. Systems and methods for providing intelligent automated support capable of self rejuvenation with respect to storage systems
US9613052B2 (en) 2012-06-05 2017-04-04 International Business Machines Corporation Establishing trust within a cloud computing system
US9002824B1 (en) 2012-06-21 2015-04-07 Pivotal Software, Inc. Query plan management in shared distributed data stores
US9235446B2 (en) 2012-06-22 2016-01-12 Microsoft Technology Licensing, Llc Parallel computing execution plan optimization
US9177008B1 (en) 2012-06-29 2015-11-03 Pivotal Software, Inc. Positioned updates in a distributed shared-nothing data store
US10242052B2 (en) 2012-07-24 2019-03-26 Unisys Corporation Relational database tree engine implementing map-reduce query handling
US8572051B1 (en) 2012-08-08 2013-10-29 Oracle International Corporation Making parallel execution of structured query language statements fault-tolerant
US9582221B2 (en) * 2012-08-24 2017-02-28 Vmware, Inc. Virtualization-aware data locality in distributed data processing
US10579634B2 (en) 2012-08-30 2020-03-03 Citus Data Bilgi Islemleri Ticaret A.S. Apparatus and method for operating a distributed database with foreign tables
US8762330B1 (en) 2012-09-13 2014-06-24 Kip Cr P1 Lp System, method and computer program product for partially synchronous and partially asynchronous mounts/unmounts in a media library
US9262479B2 (en) 2012-09-28 2016-02-16 Oracle International Corporation Join operations for continuous queries over archived views
US8874602B2 (en) * 2012-09-29 2014-10-28 Pivotal Software, Inc. Random number generator in a MPP database
US9355127B2 (en) 2012-10-12 2016-05-31 International Business Machines Corporation Functionality of decomposition data skew in asymmetric massively parallel processing databases
WO2014062637A2 (en) 2012-10-15 2014-04-24 Hadapt, Inc. Systems and methods for fault tolerant, adaptive execution of arbitrary queries at low latency
US20140114952A1 (en) * 2012-10-23 2014-04-24 Microsoft Corporation Optimizing queries of parallel databases
US8892599B2 (en) 2012-10-24 2014-11-18 Marklogic Corporation Apparatus and method for securing preliminary information about database fragments for utilization in mapreduce processing
US9165006B2 (en) * 2012-10-25 2015-10-20 Blackberry Limited Method and system for managing data storage and access on a client device
US9442954B2 (en) * 2012-11-12 2016-09-13 Datawise Systems Method and apparatus for achieving optimal resource allocation dynamically in a distributed computing environment
US9185156B2 (en) 2012-11-13 2015-11-10 Google Inc. Network-independent programming model for online processing in distributed systems
US9449039B2 (en) 2012-11-26 2016-09-20 Amazon Technologies, Inc. Automatic repair of corrupted blocks in a database
US20140149392A1 (en) 2012-11-28 2014-05-29 Microsoft Corporation Unified search result service and cache update
US9460154B2 (en) 2012-12-04 2016-10-04 Oracle International Corporation Dynamic parallel aggregation with hybrid batch flushing
US9229979B2 (en) 2012-12-11 2016-01-05 Microsoft Technology Licensing, Llc Optimizing parallel queries using interesting distributions
US9311354B2 (en) * 2012-12-29 2016-04-12 Futurewei Technologies, Inc. Method for two-stage query optimization in massively parallel processing database clusters
US10366057B2 (en) 2012-12-31 2019-07-30 Teradata Us, Inc. Designated computing groups or pools of resources for storing and processing data based on its characteristics
US9268808B2 (en) 2012-12-31 2016-02-23 Facebook, Inc. Placement policy
US9275121B2 (en) 2013-01-03 2016-03-01 Sap Se Interoperable shared query based on heterogeneous data sources
US9081826B2 (en) 2013-01-07 2015-07-14 Facebook, Inc. System and method for distributed database query engines
US9130920B2 (en) 2013-01-07 2015-09-08 Zettaset, Inc. Monitoring of authorization-exceeding activity in distributed networks
EP2755148A1 (en) 2013-01-15 2014-07-16 Fujitsu Limited Data storage system, and program and method for execution in a data storage system
US20140214886A1 (en) 2013-01-29 2014-07-31 ParElastic Corporation Adaptive multi-client saas database
US9984083B1 (en) 2013-02-25 2018-05-29 EMC IP Holding Company LLC Pluggable storage system for parallel query engines across non-native file systems
US9753980B1 (en) 2013-02-25 2017-09-05 EMC IP Holding Company LLC M X N dispatching in large scale distributed system
US9342557B2 (en) * 2013-03-13 2016-05-17 Cloudera, Inc. Low latency query engine for Apache Hadoop
US9684571B2 (en) 2013-05-01 2017-06-20 Netapp, Inc. Namespace mirroring in an expandable storage volume
US20140337323A1 (en) 2013-05-08 2014-11-13 New Consumer Solutions LLC Methods and computing systems for generating and operating a searchable consumer market research knowledge repository
US9477731B2 (en) * 2013-10-01 2016-10-25 Cloudera, Inc. Background format optimization for enhanced SQL-like queries in Hadoop
JP6221717B2 (ja) 2013-12-12 2017-11-01 Fujitsu Limited Storage device, storage system, and data management program
US9069095B1 (en) * 2013-12-16 2015-06-30 Schlumberger Technology Corporation Monitoring the output of a radiation generator
US10095800B1 (en) * 2013-12-16 2018-10-09 Amazon Technologies, Inc. Multi-tenant data store management
JP6634722B2 (ja) * 2015-07-14 2020-01-22 Fuji Electric Co., Ltd. Insulated busbar and manufacturing method

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8171018B2 (en) * 2003-07-07 2012-05-01 Ibm International Group B.V. SQL code generation for heterogeneous environment
US7653665B1 (en) * 2004-09-13 2010-01-26 Microsoft Corporation Systems and methods for avoiding database anomalies when maintaining constraints and indexes in presence of snapshot isolation
US7984043B1 (en) * 2007-07-24 2011-07-19 Amazon Technologies, Inc. System and method for distributed query processing using configuration-independent query plans
US20090182792A1 (en) * 2008-01-14 2009-07-16 Shashidhar Bomma Method and apparatus to perform incremental truncates in a file system
US20090254916A1 (en) * 2008-04-08 2009-10-08 Infosys Technologies Ltd. Allocating resources for parallel execution of query plans
CN102033889A (zh) * 2009-09-29 2011-04-27 Xiong Fanfan Distributed database parallel processing system
US20110246511A1 (en) * 2010-04-06 2011-10-06 John Smith Method and system for defining and populating segments
WO2012050582A1 (en) * 2010-10-14 2012-04-19 Hewlett-Packard Development Company, L.P. Continuous querying of a data stream
WO2012124178A1 (ja) * 2011-03-16 2012-09-20 NEC Corporation Distributed storage system and distributed storage method

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109313587A (zh) * 2016-04-25 2019-02-05 Convida Wireless, LLC Method for enabling data analytics services at a service layer
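CN110618979A (zh) * 2019-08-14 2019-12-27 Ping An Technology (Shenzhen) Co., Ltd. Nested-loop data processing method, apparatus, and computer device
CN110618979B (zh) * 2019-08-14 2022-09-09 Ping An Technology (Shenzhen) Co., Ltd. Nested-loop data processing method, apparatus, and computer device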

Also Published As

Publication number Publication date
US20170177665A1 (en) 2017-06-22
US20200012646A1 (en) 2020-01-09
US9594803B2 (en) 2017-03-14
US11354314B2 (en) 2022-06-07
US9805053B1 (en) 2017-10-31
US20150379078A1 (en) 2015-12-31
US9858315B2 (en) 2018-01-02
US9898475B1 (en) 2018-02-20
EP2959384B1 (en) 2022-09-21
US10936588B2 (en) 2021-03-02
US10769146B1 (en) 2020-09-08
US20180129707A1 (en) 2018-05-10
US20140244701A1 (en) 2014-08-28
US9454573B1 (en) 2016-09-27
US20190005093A1 (en) 2019-01-03
US20160292181A1 (en) 2016-10-06
US20180025057A1 (en) 2018-01-25
US11514046B2 (en) 2022-11-29
US11436224B2 (en) 2022-09-06
US20200065295A1 (en) 2020-02-27
US11288267B2 (en) 2022-03-29
US20180373755A1 (en) 2018-12-27
EP2959384A4 (en) 2016-08-31
US20180025024A1 (en) 2018-01-25
US20170169074A1 (en) 2017-06-15
US9888048B1 (en) 2018-02-06
US10915528B2 (en) 2021-02-09
US10838960B2 (en) 2020-11-17
WO2014130371A1 (en) 2014-08-28
US9792327B2 (en) 2017-10-17
CN104937552B (zh) 2019-09-20
US11281669B2 (en) 2022-03-22
US10719510B2 (en) 2020-07-21
US20180075052A1 (en) 2018-03-15
US20200257690A1 (en) 2020-08-13
US20180276274A1 (en) 2018-09-27
US9582520B1 (en) 2017-02-28
US10013456B2 (en) 2018-07-03
US9805092B1 (en) 2017-10-31
US10572479B2 (en) 2020-02-25
US10459917B2 (en) 2019-10-29
US20200151179A1 (en) 2020-05-14
US11120022B2 (en) 2021-09-14
US20160342647A1 (en) 2016-11-24
EP2959384A1 (en) 2015-12-30
US9753980B1 (en) 2017-09-05
US9171042B1 (en) 2015-10-27
US10540330B1 (en) 2020-01-21
US10120900B1 (en) 2018-11-06
US9626411B1 (en) 2017-04-18
US10698891B2 (en) 2020-06-30
US20180011902A1 (en) 2018-01-11
US9454548B1 (en) 2016-09-27
US9563648B2 (en) 2017-02-07

Similar Documents

Publication Publication Date Title
CN104937552A (zh) Data analytics platform over parallel databases and distributed file systems
US20210342747A1 (en) Method and system for distributed deep machine learning
CN110168516B (zh) Dynamic compute node grouping method and system for massively parallel processing
US9720949B2 (en) Client-side partition-aware batching of records for insert operations
WO2019068002A1 (en) INFRASTRUCTURE OF INDEPENDENT AUTONOMOUS DATABASE BASED CLOUD SERVICES
Schelter et al. Distributed matrix factorization with mapreduce using a series of broadcast-joins
US10255347B2 (en) Smart tuple dynamic grouping of tuples
US20140189702A1 (en) System and method for automatic model identification and creation with high scalability
JP2014525640A (ja) Extension of a parallel processing development environment
Aljawarneh et al. Efficient spark-based framework for big geospatial data query processing and analysis
Chen et al. Cost-effective resource provisioning for spark workloads
CN106874067A (zh) Parallel computing method, apparatus, and system based on lightweight virtual machines
CN109032614A (zh) System and method for developing and maintaining Internet of Things applications
Meister et al. Maggy: Scalable asynchronous parallel hyperparameter search
WO2019032123A1 (en) SYSTEMS AND METHODS FOR GENERATING DISTRIBUTED SOFTWARE USING AN UNDISTRIBUTED SOURCE CODE
US10657135B2 (en) Smart tuple resource estimation
US10339037B1 (en) Recommendation engine for recommending prioritized performance test workloads based on release risk profiles
KR101621490B1 (ko) Query execution apparatus and method, and data processing system using the same
US10289447B1 (en) Parallel process scheduling for efficient data access
Li et al. Phronesis: Efficient performance modeling for high-dimensional configuration tuning
US11809981B1 (en) Performing hardware operator fusion
KR20190124512A (ko) Partitioning method and partitioning apparatus for real-time distributed storage of graph streams
US11907195B2 (en) Relationship analysis using vector representations of database tables
US11809849B1 (en) Global modulo allocation in neural network compilation
CN107766442A (zh) Method and system for mining association rules from massive data

Legal Events

Date Code Title Description
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant