US20220239728A1 - System and method for global data sharing - Google Patents
System and method for global data sharing
- Publication number
- US20220239728A1
- Authority
- US
- United States
- Prior art keywords
- data
- cloud computing
- listing
- exchange
- provider
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/23—Updating
- G06F16/2379—Updates performed during online database operations; commit processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/604—Tools and structures for managing or administering access control systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6272—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database by registering files or documents with a third party
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/50—Network service management, e.g. ensuring proper service fulfilment according to agreements
- H04L41/508—Network service management, e.g. ensuring proper service fulfilment according to agreements based on type of value added network service under agreement
- H04L41/5096—Network service management, e.g. ensuring proper service fulfilment according to agreements based on type of value added network service under agreement wherein the managed service relates to distributed or central networked applications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/02—Network architectures or network communication protocols for network security for separating internal from external traffic, e.g. firewalls
- H04L63/0272—Virtual private networks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/10—Network architectures or network communication protocols for network security for controlling access to devices or network resources
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/10—Network architectures or network communication protocols for network security for controlling access to devices or network resources
- H04L63/101—Access control lists [ACL]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/10—Network architectures or network communication protocols for network security for controlling access to devices or network resources
- H04L63/102—Entity profiles
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/10—Network architectures or network communication protocols for network security for controlling access to devices or network resources
- H04L63/104—Grouping of entities
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/10—Network architectures or network communication protocols for network security for controlling access to devices or network resources
- H04L63/107—Network architectures or network communication protocols for network security for controlling access to devices or network resources wherein the security policies are location-dependent, e.g. entities privileges depend on current location or allowing specific operations only from locally connected terminals
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/20—Network architectures or network communication protocols for network security for managing network security; network security policies in general
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1095—Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1097—Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
-
- H04L67/16—
-
- H04L67/20—
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/51—Discovery or management thereof, e.g. service location protocol [SLP] or web services
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/53—Network services using third party service providers
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/08—Configuration management of networks or network elements
- H04L41/0896—Bandwidth or capacity management, i.e. automatically increasing or decreasing capacities
- H04L41/0897—Bandwidth or capacity management, i.e. automatically increasing or decreasing capacities by horizontal or vertical scaling of resources, or by migrating entities, e.g. virtual resources or entities
Abstract
Sharing data in a data exchange across multiple cloud computing platforms and/or cloud computing platform regions is described. An example computer-implemented method can include receiving data sharing information from a data provider for sharing a data set in a data exchange from a first cloud computing entity to a set of second cloud computing entities. In response to receiving the data sharing information, the method may also include creating an account with each of the set of second cloud computing entities. The method may also further include sharing the data set from the first cloud computing entity with the set of second cloud computing entities using at least the corresponding account of that second cloud computing entity.
Description
- This application is a continuation of U.S. patent application Ser. No. 17/378,562, filed Jul. 16, 2021, which is a continuation of U.S. patent application Ser. No. 17/220,887, filed Apr. 1, 2021, now U.S. Pat. No. 11,082,483, issued Aug. 3, 2021, which is a continuation of U.S. patent application Ser. No. 16/814,875, filed Mar. 10, 2020, now U.S. Pat. No. 10,999,355, issued May 4, 2021, which claims the benefit of U.S. Provisional Application No. 62/966,977, entitled “Global Data Sharing,” filed Jan. 28, 2020, the disclosures of which are incorporated herein by reference in their entirety.
- The present disclosure relates to resource management systems and methods that manage data storage and computing resources.
- Databases are widely used for data storage and access in computing applications. Databases may include one or more tables that include or reference data that can be read, modified, or deleted using queries. Databases may be used for storing and/or accessing personal information or other sensitive information. Secure storage and access of database data may be provided by encrypting and/or storing data in an encrypted form to prevent unauthorized access. In some cases, data sharing may be desirable to let other parties perform queries against a set of data.
- The described embodiments and the advantages thereof may best be understood by reference to the following description taken in conjunction with the accompanying drawings. These drawings in no way limit any changes in form and detail that may be made to the described embodiments by one skilled in the art without departing from the spirit and scope of the described embodiments.
- FIG. 1A is a block diagram depicting an example computing environment in which the methods disclosed herein may be implemented.
- FIG. 1B is a block diagram illustrating an example virtual warehouse.
- FIG. 2 is a schematic block diagram of data that may be used to implement a public or private data exchange in accordance with an embodiment of the present invention.
- FIG. 3 is a schematic block diagram of components for implementing a data exchange in accordance with an embodiment of the present invention.
- FIG. 4A is a process flow diagram of a method for controlled sharing of data among entities in a data exchange in accordance with an embodiment of the present invention.
- FIG. 4B is a diagram illustrating data used for implementing private sharing of data in accordance with an embodiment of the present invention.
- FIG. 4C is a diagram illustrating a secure view for implementing private sharing of data in accordance with an embodiment of the present invention.
- FIG. 5 is a process flow diagram of a method for public sharing of data among entities in a data exchange in accordance with an embodiment of the present invention.
- FIG. 6 is a process flow diagram of a method for performing bi-directional shares in a data exchange in accordance with an embodiment of the present invention.
- FIG. 7 is a process flow diagram of a method for providing enriched data in a data exchange in accordance with an embodiment of the present invention.
- FIG. 8 is a block diagram illustrating a network environment in which a data provider may share data via a cloud computing service.
- FIG. 9 is an example private data exchange in accordance with an embodiment of the present invention.
- FIG. 10 is a diagram illustrating an example secure view of shared data from a private data exchange.
- FIG. 11 is a diagram illustrating an example tunneling of a data listing between two private data exchanges.
- FIG. 12 is a diagram illustrating an example data query and delivery service according to some embodiments of the invention.
- FIG. 13A is a block diagram of an example system of multiple cloud computing services sharing data with a data exchange.
- FIG. 13B is a block diagram of an example system of a cloud computing service sharing data with a data exchange across multiple regions of the cloud computing service.
- FIG. 14 is a process flow diagram of a method for sharing data across multiple cloud computing services and/or across multiple regions with a cloud computing service.
- FIG. 15 is a process flow diagram of a method for creating a listing within a data exchange, where the listing is available in different cloud computing services and/or in multiple regions with a cloud computing service.
- FIG. 16 is a process flow diagram of a method for creating a listing for personalized shares within a data exchange, where the listing is available in different cloud computing services and/or in multiple regions with a cloud computing service.
- FIG. 17 is a process flow diagram of a method for sharing data with a Virtual Private Cloud (VPC).
- FIG. 18 is a block diagram of an example computing device that may perform one or more of the operations described herein, in accordance with some embodiments.
- Data providers often have data assets that are cumbersome to share. A data asset may be data that is of interest to another entity. For example, a large online retail company may have a data set that includes the purchasing habits of millions of customers over the last ten years. This data set may be large. If the online retailer wishes to share all or a portion of this data with another entity, the online retailer may need to use old and slow methods to transfer the data, such as a file-transfer-protocol (FTP), or even copying the data onto physical media and mailing the physical media to the other entity. This has several disadvantages. First, it is slow. Copying terabytes or petabytes of data can take days. Second, once the data is delivered, the sharer cannot control what happens to the data. The recipient can alter the data, make copies, or share it with other parties. Third, the only entities that would be interested in accessing such a large data set in such a manner are large corporations that can afford the complex logistics of transferring and processing the data as well as the high price of such a cumbersome data transfer. Thus, smaller entities (e.g., “mom and pop” shops) or even smaller, nimbler cloud-focused startups are often priced out of accessing this data, even though the data may be valuable to their businesses. This may be because raw data assets are generally too unpolished and full of potentially sensitive data to just outright sell to other companies. Data cleaning, de-identification, aggregation, joining, and other forms of data enrichment need to be performed by the owner of data before it is shareable with another party. This is time-consuming and expensive.
Finally, it is difficult to share data assets with many entities because traditional data sharing methods do not allow scalable sharing for the reasons mentioned above. Traditional sharing methods also introduce latency and delays in terms of all parties having access to the most recently-updated data.
- A private data exchange may allow data providers to more easily and securely share their data assets with other entities. A private data exchange can be under the data provider's brand, and the data provider may control who can gain access to it. The private data exchange may be for internal use only, or may also be opened to customers, partners, suppliers, or others. The data provider may control what data assets are listed as well as control who has access to which sets of data. This allows for a seamless way to discover and share data both within a data provider's organization and with its business partners.
- The private data exchange may be facilitated by a cloud computing service such as SNOWFLAKE, and allows data providers to offer data assets directly from their own online domain (e.g., website) in a private online marketplace with their own branding. The private data exchange may provide a centralized, managed hub for an entity to list internally or externally-shared data assets, inspire data collaboration, and also to maintain data governance and to audit access. With the private data exchange, data providers may be able to share data without copying it between companies. Data providers may invite other entities to view their data listings, control which data listings appear in their private online marketplace, control who can access data listings and how others can interact with the data assets connected to the listings. This may be thought of as a “walled garden” marketplace, in which visitors to the garden must be approved and access to certain listings may be limited.
- As an example, Company A may be a consumer data company that has collected and analyzed the consumption habits of millions of individuals in several different categories. Their data sets may include data in the following categories: online shopping, video streaming, electricity consumption, automobile usage, internet usage, clothing purchases, mobile application purchases, club memberships, and online subscription services. Company A may desire to offer these data sets (or subsets or derived products of these data sets) to other entities. For example, a new clothing brand may wish to access data sets related to consumer clothing purchases and online shopping habits. Company A may support a page on its website that is or functions substantially similar to a private data exchange, where a data consumer (e.g., the new clothing brand) may browse, explore, discover, access and potentially purchase data sets directly from Company A. Further, Company A may control: who can enter the private data exchange, the entities that may view a particular listing, the actions that an entity may take with respect to a listing (e.g., view only), and any other suitable action. In addition, a data provider may combine its own data with other data sets from, e.g., a public data exchange, and create new listings using the combined data.
- A private data exchange may be an appropriate place to discover, assemble, clean, and enrich data to make it more monetizable. A large company on a private data exchange may assemble data from across its divisions and departments, which could become valuable to another company. In addition, participants in a private ecosystem data exchange may work together to join their datasets together to jointly create a useful data product that any one of them alone would not be able to produce. Once these joined datasets are created, they may be listed on a public or private data exchange.
- The systems and methods described herein provide a flexible and scalable data warehouse using a new data processing platform. In some embodiments, the described systems and methods leverage a cloud infrastructure that supports cloud-based storage resources, computing resources, and the like. Example cloud-based storage resources offer significant storage capacity available on-demand at a low cost. Further, these cloud-based storage resources may be fault-tolerant and highly scalable, which can be costly to achieve in private data storage systems. Example cloud-based computing resources are available on-demand and may be priced based on actual usage levels of the resources. Typically, the cloud infrastructure is dynamically deployed, reconfigured, and decommissioned in a rapid manner.
- In the described systems and methods, a data storage system utilizes an SQL (Structured Query Language)-based relational database. However, these systems and methods are applicable to any type of database, and any type of data storage and retrieval platform, using any data storage architecture and using any language to store and retrieve data within the data storage and retrieval platform. The systems and methods described herein further provide a multi-tenant system that supports isolation of computing resources and data between different customers/clients and between different users within the same customer/client.
- FIG. 1A is a block diagram of an example computing environment 100 in which the systems and methods disclosed herein may be implemented. In particular, a cloud computing platform 110 may be implemented, such as AMAZON WEB SERVICES™ (AWS), MICROSOFT AZURE™, GOOGLE CLOUD™, or the like. As known in the art, a cloud computing platform 110 provides computing resources and storage resources that may be acquired (purchased) or leased and configured to execute applications and store data.
- The cloud computing platform 110 may host a cloud computing service 112 that facilitates storage of data on the cloud computing platform 110 (e.g., data management and access) and analysis functions (e.g., SQL queries, analysis), as well as other computation capabilities (e.g., secure data sharing between users of the cloud computing platform 110). The cloud computing platform 110 may include a three-tier architecture: data storage 140, query processing 130, and cloud services 120.
- Data storage 140 may facilitate the storing of data on the cloud computing platform 110 in one or more cloud databases 141. Data storage 140 may use a storage service such as AMAZON S3 to store data and query results on the cloud computing platform 110. In particular embodiments, to load data into the cloud computing platform 110, data tables may be horizontally partitioned into large, immutable files which may be analogous to blocks or pages in a traditional database system. Within each file, the values of each attribute or column are grouped together and compressed using a scheme sometimes referred to as hybrid columnar. Each table has a header which, among other metadata, contains the offsets of each column within the file.
- In addition to storing table data, data storage 140 facilitates the storage of temp data generated by query operations (e.g., joins), as well as the data contained in large query results. This may allow the system to compute large queries without out-of-memory or out-of-disk errors. Storing query results this way may simplify query processing as it removes the need for server-side cursors found in traditional database systems.
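The hybrid columnar layout described above for data storage 140 can be illustrated with a minimal sketch. The source does not specify the on-disk format, so the header encoding (a JSON document preceded by a 4-byte length), the use of `zlib`, and the function names `write_columnar_file` and `read_column` are all assumptions made for illustration only.

```python
import json
import struct
import zlib


def write_columnar_file(rows, columns):
    """Serialize rows into one immutable blob: a header, then one compressed chunk per column.

    Hypothetical format: 4-byte big-endian header length, JSON header, column chunks.
    """
    chunks = []
    for name in columns:
        # Group all values of one column together, then compress them as a unit.
        values = [row[name] for row in rows]
        chunks.append(zlib.compress(json.dumps(values).encode("utf-8")))

    # Record where each column chunk starts (relative to the end of the header) and its length.
    offsets = {}
    position = 0
    for name, chunk in zip(columns, chunks):
        offsets[name] = [position, len(chunk)]
        position += len(chunk)

    header = json.dumps({"row_count": len(rows), "offsets": offsets}).encode("utf-8")
    return struct.pack(">I", len(header)) + header + b"".join(chunks)


def read_column(blob, name):
    """Read a single column by seeking to its offset, without decompressing other columns."""
    (header_len,) = struct.unpack(">I", blob[:4])
    header = json.loads(blob[4:4 + header_len])
    start, length = header["offsets"][name]
    body_start = 4 + header_len
    chunk = blob[body_start + start:body_start + start + length]
    return json.loads(zlib.decompress(chunk))
```

Because the header records each column's offset, a query that touches only one column can read just that chunk, which is the practical benefit of storing offsets in the header.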
- Query processing 130 may handle query execution within elastic clusters of virtual machines, referred to herein as virtual warehouses or data warehouses. Thus, query processing 130 may include one or more virtual warehouses 131, which may also be referred to herein as data warehouses. The virtual warehouses 131 may be one or more virtual machines operating on the cloud computing platform 110. The virtual warehouses 131 may be compute resources that may be created, destroyed, or resized at any point, on demand. This functionality may create an “elastic” virtual warehouse that expands, contracts, or shuts down according to the user's needs. Expanding a virtual warehouse involves adding one or more compute nodes 132 to a virtual warehouse 131. Contracting a virtual warehouse involves removing one or more compute nodes 132 from a virtual warehouse 131. More compute nodes 132 may lead to faster compute times. For example, a data load which takes fifteen hours on a system with four nodes might take only two hours with thirty-two nodes.
- Cloud services 120 may be a collection of services that coordinate activities across the cloud computing service 112. These services tie together all of the different components of the cloud computing service 112 in order to process user requests, from login to query dispatch. Cloud services 120 may operate on compute instances provisioned by the cloud computing service 112 from the cloud computing platform 110. Cloud services 120 may include a collection of services that manage virtual warehouses, queries, transactions, data exchanges, and the metadata associated with such services, such as database schemas, access control information, encryption keys, and usage statistics. Cloud services 120 may include, but not be limited to, authentication engine 121, infrastructure manager 122, optimizer 123, exchange manager 124, security engine 125, and metadata storage 126.
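Returning to the elastic virtual warehouses 131 described above, the fifteen-hour example can be checked with a short sketch. It assumes perfectly linear scaling, which is an idealization not claimed by the source; the class `VirtualWarehouse` and the function `ideal_load_hours` are hypothetical names used only for illustration.

```python
class VirtualWarehouse:
    """Toy model of an elastic warehouse that tracks only its compute-node count."""

    def __init__(self, nodes=0):
        self.nodes = nodes

    def expand(self, count):
        # Expanding adds compute nodes to the warehouse.
        self.nodes += count

    def contract(self, count):
        # Contracting removes compute nodes; it cannot remove more than exist.
        if count > self.nodes:
            raise ValueError("cannot remove more nodes than the warehouse has")
        self.nodes -= count


def ideal_load_hours(baseline_hours, baseline_nodes, target_nodes):
    """Estimate run time under the assumption that work divides evenly across nodes."""
    if baseline_nodes <= 0 or target_nodes <= 0:
        raise ValueError("node counts must be positive")
    node_hours = baseline_hours * baseline_nodes  # total work, in node-hours
    return node_hours / target_nodes


warehouse = VirtualWarehouse(nodes=4)
warehouse.expand(28)  # grow from four to thirty-two nodes
estimate = ideal_load_hours(15, 4, warehouse.nodes)
```

Under this assumption the load represents 60 node-hours, so thirty-two nodes would need 1.875 hours, consistent with the "only two hours" figure in the text. Real workloads rarely scale perfectly linearly, so the estimate is a lower bound rather than a prediction.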
- FIG. 1B is a block diagram illustrating an example virtual warehouse 131. The exchange manager 124 may facilitate the sharing of data between data providers and data consumers, using, for example, a private data exchange. For example, cloud computing service 112 may manage the storage and access of a database 108. The database 108 may include various instances of user data 150 for different users, e.g., different enterprises or individuals. The user data may include a user database 152 of data stored and accessed by that user. The user database 152 may be subject to access controls such that only the owner of the data is allowed to change and access the user database 152 upon authenticating with the cloud computing service 112. For example, data may be encrypted such that it can only be decrypted using decryption information possessed by the owner of the data. Using the exchange manager 124, specific data from a user database 152 that is subject to these access controls may be shared with other users in a controlled manner according to the methods disclosed herein. In particular, a user may specify shares 154 that may be shared in a public or private data exchange in an uncontrolled manner or shared with specific other users in a controlled manner as described above. A “share” encapsulates all of the information required to share data in a database. A share may include at least three pieces of information: (1) privileges that grant access to the database(s) and the schema containing the objects to share, (2) the privileges that grant access to the specific objects (e.g., tables, secure views, and secure UDFs), and (3) the consumer accounts with which the database and its objects are shared. When data is shared, no data is copied or transferred between users. Sharing is accomplished through the cloud services 120 of cloud computing service 112.
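The three pieces of information that make up a share can be modeled as a small data structure. This is a sketch under stated assumptions, not the data model used by the described system: the class name `Share`, its field names, and the `can_read` check are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Share:
    """Hypothetical model of a share: two kinds of privileges plus the consumer accounts."""

    name: str
    # (1) Privileges on the database(s) and schema containing the shared objects.
    database_privileges: set = field(default_factory=set)  # {(database, schema)}
    # (2) Privileges on the specific objects (tables, secure views, secure UDFs).
    object_privileges: set = field(default_factory=set)  # {"database.schema.object"}
    # (3) Consumer accounts with which the database and its objects are shared.
    consumer_accounts: set = field(default_factory=set)

    def can_read(self, account, database, schema, obj):
        """An account may read an object only if all three pieces of information agree."""
        return (
            account in self.consumer_accounts
            and (database, schema) in self.database_privileges
            and f"{database}.{schema}.{obj}" in self.object_privileges
        )


share = Share(name="clothing_purchases_share")
share.database_privileges.add(("consumer_db", "public"))
share.object_privileges.add("consumer_db.public.clothing_purchases")
share.consumer_accounts.add("new_clothing_brand")
```

Note that the share holds only privileges and account identifiers, never rows, which mirrors the statement that no data is copied or transferred when data is shared.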
- Sharing data may be performed when a data provider creates a share of a database in the data provider's account and grants access to particular objects (e.g., tables, secure views, and secure user-defined functions (UDFs)). Then a read-only database may be created using the information provided in the share. Access to this database may be controlled by the data provider.
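By way of illustration only, the three pieces of information in a share and the creation of a read-only database from that share might be sketched as follows. The names `Share`, `create_read_only_database`, `provider_storage`, and the account and table names are hypothetical and are not part of the disclosure; the sketch only assumes that a share holds references and grants rather than copies of data.

```python
from dataclasses import dataclass, field
from types import MappingProxyType


@dataclass
class Share:
    """Hypothetical model of a share: grants and references, never copied data."""
    name: str
    database: str                                         # (1) database granted by the share
    schema: str                                           # (1) schema containing the shared objects
    objects: set = field(default_factory=set)             # (2) tables, secure views, secure UDFs
    consumer_accounts: set = field(default_factory=set)   # (3) accounts the share is granted to


def create_read_only_database(share, provider_storage, account):
    """Build a read-only view over the objects named in the share.

    provider_storage maps object names to rows and remains in the provider's
    account. The returned mapping references those rows instead of copying them.
    """
    if account not in share.consumer_accounts:
        raise PermissionError(f"{account} has no grant on share {share.name}")
    visible = {name: provider_storage[name] for name in share.objects}
    # The top-level mapping rejects assignment; a real system would also
    # have to prevent modification of the rows themselves.
    return MappingProxyType(visible)


# Illustrative data only.
provider_storage = {
    "orders": [{"id": 1, "total": 40}],
    "internal_notes": [{"id": 1, "note": "private"}],
}
share = Share("sales_share", "sales_db", "public",
              objects={"orders"}, consumer_accounts={"acct_b"})
db = create_read_only_database(share, provider_storage, "acct_b")
```

In this sketch the consumer sees only the objects named in the share, and the provider retains control because the view is derived from the provider's own grants each time it is created.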
- Shared data may then be used to process SQL queries, possibly including joins, aggregations, or other analysis. In some instances, a data provider may define a share such that “secure joins” are permitted to be performed with respect to the shared data. A secure join may be performed such that analysis may be performed with respect to shared data but the actual shared data is not accessible by the data consumer (e.g., recipient of the share). A secure join may be performed as described in U.S. application Ser. No. 16/368,339, filed Mar. 18, 2019.
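As a minimal sketch of the idea that analysis may be performed on shared data without the shared data itself being accessible, the following hypothetical function returns only an aggregate of a join. It is not the method of the application referenced above; the function name `secure_join_count` and the sample rows are assumptions made for illustration.

```python
def secure_join_count(consumer_rows, provider_rows, key):
    """Count consumer rows whose key also appears in the provider's shared rows.

    Only the aggregate count is returned, so no provider row is exposed
    to the caller.
    """
    provider_keys = {row[key] for row in provider_rows}
    return sum(1 for row in consumer_rows if row[key] in provider_keys)


# Illustrative data only.
consumer_rows = [{"email": "a@example.com"}, {"email": "b@example.com"}]
provider_rows = [
    {"email": "b@example.com", "segment": "gold"},
    {"email": "c@example.com", "segment": "silver"},
]
overlap = secure_join_count(consumer_rows, provider_rows, "email")
```

Here the consumer learns how many of its records overlap with the provider's data, but never receives the provider's `segment` values or non-matching rows.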
- User devices 101-104, such as laptop computers, desktop computers, mobile phones, tablet computers, cloud-hosted computers, cloud-hosted serverless processes, or other computing processes or devices may be used to access the
virtual warehouse 131 or cloud service 120 by way of a network 105, such as the Internet or a private network. - In the description below, actions are ascribed to users, particularly consumers and providers. Such actions shall be understood to be performed with respect to devices 101-104 operated by such users. For example, notification to a user may be understood to be a notification transmitted to devices 101-104, an input or instruction from a user may be understood to be received by way of the user's devices 101-104, and interaction with an interface by a user shall be understood to be interaction with the interface on the user's devices 101-104. In addition, database operations (joining, aggregating, analysis, etc.) ascribed to a user (consumer or provider) shall be understood to include performing such actions by the
cloud computing service 112 in response to an instruction from that user. -
FIG. 2 is a schematic block diagram of data that may be used to implement a public or private data exchange in accordance with an embodiment of the present invention. The exchange manager 124 may operate with respect to some or all of the illustrated exchange data 200, which may be stored on the platform executing the exchange manager 124 (e.g., the cloud computing platform 110) or at some other location. The exchange data 200 may include a plurality of listings 202 describing data that is shared by a first user (“the provider”). The listings 202 may be listings in a private data exchange or in a public data exchange. The access controls, management, and governance of the listings may be similar for both a public data exchange and a private data exchange. A listing 202 may include metadata 204 describing the shared data. The metadata 204 may include some or all of the following information: an identifier of the sharer of the shared data, a URL associated with the sharer, a name of the share, names of tables, a category to which the shared data belongs, an update frequency of the shared data, a catalog of the tables, a number of columns and a number of rows in each table, as well as names and descriptions of the columns. The metadata 204 may also include examples to aid a user in using the data. Such examples may include sample tables or views that include a sample of rows and columns of an example table, example queries that may be run against the tables and/or possibly the results thereof, example views of an example table, and example visualizations (e.g., graphs, dashboards) based on a table's data.
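For illustration, the metadata fields enumerated above might be represented as a JSON-serializable object along the following lines. Every field name and value here is a hypothetical placeholder chosen for the sketch; the disclosure does not prescribe a particular schema.

```python
import json

# Hypothetical listing metadata; all keys and values are illustrative.
listing_metadata = {
    "provider_id": "provider-001",               # identifier of the sharer
    "provider_url": "https://provider.example",  # URL associated with the sharer
    "share_name": "weather_share",
    "category": "weather",
    "update_frequency": "daily",
    "tables": [
        {
            "name": "daily_observations",
            "row_count": 3,
            "columns": [
                {"name": "station", "description": "Station code"},
                {"name": "temp_c", "description": "Mean temperature in Celsius"},
            ],
        }
    ],
    "example_queries": [
        "SELECT station, temp_c FROM daily_observations LIMIT 10",
    ],
}


def summarize(metadata):
    """Return (table name, number of columns, number of rows) for each table."""
    return [
        (table["name"], len(table["columns"]), table["row_count"])
        for table in metadata["tables"]
    ]


# Round-trip through JSON to show the structure is serializable.
serialized = json.dumps(listing_metadata, sort_keys=True)
restored = json.loads(serialized)
```

A flat, serializable structure of this kind lends itself to indexing for search, since keywords and column descriptions can be extracted without touching the shared data itself.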
Other information included in the metadata 204 may be metadata for use by business intelligence tools, text description of data contained in the table, keywords associated with the table to facilitate searching, bloom filters or other full text indices of the data in certain columns, a link (e.g., URL) to documentation related to the shared data, and a refresh interval indicating how frequently the shared data is updated (or an indication that the shared data is updated continuously) along with the date the data was last updated. - The
listing 202 may include access controls 206, which may be configurable to any suitable access configuration. For example, access controls 206 may indicate that the shared data is available to any member of the private exchange without restriction (an “any share” as used elsewhere herein). The access controls 206 may specify a class of users (members of a particular group or organization) that are allowed to access the data and/or see the listing. The access controls 206 may specify a “point-to-point” share (see discussion of FIG. 4) in which users may request access but are only allowed access upon approval of the provider. The access controls 206 may specify a set of user identifiers of users that are excluded from being able to access the data referenced by the listing 202. - Note that some
listings 202 may be discoverable by users without further authentication or access permissions whereas actual accesses are only permitted after a subsequent authentication step (see discussion of FIGS. 4 and 6). The access controls 206 may specify that a listing 202 is only discoverable by specific users or classes of users. - Note also that a default function for
listings 202 is that the data referenced by the share is not exportable or copyable by the consumer. Alternatively, the access controls 206 may specify that this operation is not permitted. For example, access controls 206 may specify that secure operations (secure joins and secure functions as discussed below) may be performed with respect to the shared data such that viewing and exporting of the shared data is not permitted. - In some embodiments, once a user is authenticated with respect to a
listing 202, a reference to that user (e.g., user identifier of the user's account with the virtual warehouse 131) is added to the access controls 206 such that the user will subsequently be able to access the data referenced by the listing 202 without further authentication. - The
listing 202 may define one or more filters 208. For example, the filters 208 may define specific identity data 214 of users that may view references to the listing 202 when browsing the catalog 220. The filters 208 may define a class of users (users of a certain profession, users associated with a particular company or organization, users within a particular geographical area or country) that may view references to the listing 202 when browsing the catalog 220. In this manner, a private exchange may be implemented by the exchange manager 124 using the same components. In some embodiments, an excluded user that is excluded from accessing a listing 202, i.e., adding the listing 202 to the consumed shares 156 of the excluded user, may still be permitted to view a representation of the listing when browsing the catalog 220 and may further be permitted to request access to the listing 202 as discussed below. Requests to access a listing by such excluded users and other users may be listed in an interface presented to the provider of the listing 202. The provider of the listing 202 may then view demand for access to the listing and choose to expand the filters 208 to permit access to excluded users or classes of excluded users (e.g., users in excluded geographic regions or countries). -
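The distinction drawn above between viewing a listing, accessing its data, and requesting access might be sketched as follows. This is a simplified illustration under stated assumptions: the classes `User` and `Listing`, the use of a country filter to stand in for filters 208, and an excluded-user set to stand in for access controls 206 are all hypothetical choices, not the disclosed implementation.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    user_id: str
    organization: str
    country: str


@dataclass
class Listing:
    listing_id: str
    # Stand-in for filters 208: an empty set means visible to every user.
    visible_countries: set = field(default_factory=set)
    # Stand-in for access controls 206: users who may not access the data.
    excluded_user_ids: set = field(default_factory=set)
    # Pending requests that a provider could review to gauge demand.
    access_requests: list = field(default_factory=list)


def can_view(listing, user):
    """A user may see the listing in the catalog if the filter admits them."""
    return not listing.visible_countries or user.country in listing.visible_countries


def can_access(listing, user):
    """Access additionally requires that the user is not excluded."""
    return can_view(listing, user) and user.user_id not in listing.excluded_user_ids


def request_access(listing, user):
    """Return True if access is already allowed; otherwise record the request."""
    if not can_view(listing, user):
        raise PermissionError("listing is not visible to this user")
    if can_access(listing, user):
        return True
    listing.access_requests.append(user.user_id)
    return False
```

For example, a user in an admitted country who appears in `excluded_user_ids` can still see the listing and file a request, and the provider can later remove the exclusion after reviewing `access_requests`.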
Filters 208 may further define what data may be viewed by a user. In particular, filters 208 may indicate that a user that selects a listing 202 to add to the consumed shares 156 of the user is permitted to access the data referenced by the listing but only a filtered version that only includes data associated with the identity data 214 of that user, associated with that user's organization, or specific to some other classification of the user. In some embodiments, a private exchange is by invitation: users invited by a provider to view listings 202 of a private exchange are enabled to do so by the exchange manager 124 upon communicating acceptance of an invitation received from the provider. - In some embodiments, a
listing 202 may be addressed to a single user. Accordingly, a reference to the listing 202 may be added to a set of “pending shares” that is viewable by the user. The listing 202 may then be added to a group of shares of the user upon the user communicating approval to the exchange manager 124. - The
listing 202 may further include usage data 210. For example, the cloud computing service 112 may implement a credit system in which credits are purchased by a user and are consumed each time a user runs a query, stores data, or uses other services implemented by the cloud computing service 112. Accordingly, usage data 210 may record an amount of credits consumed by accessing the shared data. Usage data 210 may include other data such as a number of queries, a number of aggregations of each type of a plurality of types performed against the shared data, or other usage statistics. In some embodiments, usage data for a listing 202 or multiple listings 202 of a user is provided to the user in the form of a shared database, i.e., a reference to a database including the usage data is added by the exchange manager 124 to the consumed shares of the user. - The
listing 202 may also include a heat map 211, which may represent the geographical locations in which users have clicked on that particular listing. The cloud computing service 112 may use the heat map to make replication decisions or other decisions regarding the listing. For example, a private data exchange may display a listing that contains weather data for Georgia, USA. The heat map 211 may indicate that many users in California are selecting the listing to learn more about the weather in Georgia. In view of this information, the cloud computing service 112 may replicate the listing and make it available in a database whose servers are physically located in the western United States, so that consumers in California may have access to the data. In some embodiments, an entity may store its data on servers located in the western United States. A particular listing may be very popular with consumers. The cloud computing service 112 may replicate that data and store it in servers located in the eastern United States, so that consumers in the Midwest and on the East Coast may also have access to that data. - The
listing 202 may also include one or more tags 213. The tags 213 may facilitate simpler sharing of data contained in one or more listings. As an example, a large company may have a human resources (HR) listing containing HR data for its internal employees on a private data exchange. The HR data may contain ten types of HR data (e.g., employee number, selected health insurance, current retirement plan, job title, etc.). The HR listing may be accessible to 100 people in the company (e.g., everyone in the HR department). Management of the HR department may wish to add an eleventh type of HR data (e.g., an employee stock option plan). Instead of manually adding this to the HR listing and granting each of the 100 people access to this new data, management may simply apply an HR tag to the new data set, which can then be used to categorize the data as HR data, list it along with the HR listing, and grant access to the 100 people to view the new data set. - The
listing 202 may also include version metadata 215. Version metadata 215 may provide a way to track how the datasets are changed. This may assist in ensuring that the data that is being viewed by one entity is not changed prematurely. For example, if a company has an original data set and then releases an updated version of that data set, the updates could interfere with another user's processing of that data set, because the update could have different formatting, new columns, and other changes that may be incompatible with the current processing mechanism of the recipient user. To remedy this, the cloud computing service 112 may track version updates using version metadata 215. The cloud computing service 112 may ensure that each data consumer accesses the same version of the data until they accept an updated version that will not interfere with current processing of the data set. - The
exchange data 200 may further include user records 212. The user record 212 may include data identifying the user associated with the user record 212, e.g., an identifier (e.g., warehouse identifier) of a user having user data 130 in service database 158 and managed by the virtual warehouse 131. - The
user record 212 may list shares associated with the user, e.g., reference listings 154 created by the user. The user record 212 may list shares consumed by the user, e.g., reference listings 202 created by another user and that have been associated to the account of the user according to the methods described herein. For example, a listing 202 may have an identifier that will be used to reference it in the shares or consumed shares of a user record 212. - The
exchange data 200 may further include a catalog 220. The catalog 220 may include a listing of all available listings 202 and may include an index of data from the metadata 204 to facilitate browsing and searching according to the methods described herein. In some embodiments, listings 202 are stored in the catalog in the form of JavaScript Object Notation (JSON) objects. - Note that where there are multiple instances of the
virtual warehouse 131 on different cloud computing platforms, the catalog 220 of one instance of the virtual warehouse 131 may store listings or references to listings from other instances on one or more other cloud computing platforms 110. Accordingly, each listing 202 may be globally unique (e.g., be assigned a globally unique identifier across all of the instances of the virtual warehouse 131). For example, the instances of the virtual warehouses 131 may synchronize their copies of the catalog 220 such that each copy indicates the listings 202 available from all instances of the virtual warehouse 131. In some instances, a provider of a listing 202 may specify that it is to be available only on one or more specified computing platforms 110. - In some embodiments, the
catalog 220 is made available on the Internet such that it is searchable by a search engine such as BING or GOOGLE. The catalog may be subject to a search engine optimization (SEO) algorithm to promote its visibility. Potential consumers may therefore browse the catalog 220 from any web browser. The exchange manager 124 may expose uniform resource locators (URLs) linked to each listing 202. The web page underlying each URL may be searchable and can be shared outside of any interface implemented by the exchange manager 124. For example, the provider of a listing 202 may publish the URLs for its listings 202 in order to promote usage of its listings 202 and its brand. -
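By way of a minimal sketch, globally unique listing identifiers, synchronization of catalog copies across instances, and per-listing URLs might look as follows. The functions `publish`, `synchronize`, and `listing_url`, the platform labels, and the base URL are hypothetical; in particular, the use of random UUIDs is only one assumed way of obtaining globally unique identifiers.

```python
import uuid


def publish(catalog, name, platform):
    """Add a listing to one instance's catalog under a globally unique identifier."""
    listing_id = str(uuid.uuid4())
    catalog[listing_id] = {"name": name, "platform": platform}
    return listing_id


def synchronize(catalogs):
    """Make every catalog copy indicate the listings available from all instances."""
    merged = {}
    for catalog in catalogs:
        merged.update(catalog)      # unique identifiers, so entries do not collide
    for catalog in catalogs:
        catalog.update(merged)


def listing_url(listing_id, base_url="https://exchange.example/listings"):
    """Build a shareable URL for a listing (the base URL is a placeholder)."""
    return f"{base_url}/{listing_id}"


# Two catalog copies held by instances on different (hypothetical) platforms.
catalog_a, catalog_b = {}, {}
weather_id = publish(catalog_a, "weather", "platform_a")
hr_id = publish(catalog_b, "hr", "platform_b")
synchronize([catalog_a, catalog_b])
```

Because identifiers are unique across instances, the merge step can be a plain union, and the same identifier yields the same URL regardless of which instance serves the catalog.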
FIG. 3 illustrates various components 300-310 that may be included in the exchange manager 124. A creation module 300 may provide an interface for creating listings 202. For example, a web page interface enables a user on one or more devices 101-104 to select data, e.g., a specific table in user data 150 of the user, for sharing and enter values defining some or all of the metadata 204, access controls 206, and filters 208. In some embodiments, creation may be performed by a user by way of SQL commands in an SQL interpreter executing on the cloud computing platform 110 and accessed by way of a webpage interface on a user device 101-104. - A
validation module 302 may validate information provided by a provider when attempting to create a listing 202. Note that in some embodiments the actions ascribed to the validation module 302 may be performed by a human reviewing the information provided by the provider. In other embodiments, these actions are performed automatically. The validation module 302 may perform, or facilitate performance by a human operator of, various functions. These functions may include verifying that the metadata 204 is consistent with the shared data to which it refers, and verifying that the shared data referenced by metadata 204 is not pirated data, personal identification information (PII), personal health information (PHI), or other data for which sharing is undesirable or illegal. The validation module 302 may also facilitate the verification that the data has been updated within a threshold period of time (e.g., within the last twenty-four hours). The validation module 302 may also facilitate verifying that the data is not static or not available from other static public sources. The validation module 302 may also facilitate verifying that the data is more than merely a sample (e.g., that the data is sufficiently complete to be useful). For example, geographically limited data may be undesirable whereas an aggregation of data that is not otherwise limited may still be of use. - The
exchange manager 124 may include a search module 304. The search module 304 may implement a webpage interface that is accessible by a user on one or more user devices 101-104 in order to invoke searches for search strings with respect to the metadata in the catalog 220, receive responses to searches, and select references to listings 202 in search results for adding to the consumed shares 156 of the user record 212 of the user performing the search. In some embodiments, searches may be performed by a user by way of SQL commands in an SQL interpreter executing on the cloud computing platform 110 and accessed by way of a webpage interface on user devices 101-104. For example, searching for shares may be performed by way of SQL queries against the catalog 220 within the SQL engine 310 discussed below. - The
search module 304 may further implement a recommendation algorithm. For example, the recommendation algorithm could recommend other listings 202 for a user based on other listings in the user's consumed shares 156 or formerly in the user's consumed shares. Recommendations could be based on logical similarity: one source of weather data leads to a recommendation for a second source of weather data. Recommendations could be based on dissimilarity: a listing for data in one domain (geographic area, technical field, etc.) results in a recommendation of a listing for a different domain to facilitate complete coverage by the user's analysis (different geographic area, related technical field, etc.). - The
exchange manager 124 may include an access management module 306. As described above, a user may add a listing 202. This may require authentication with respect to the provider of the listing 202. Once a listing 202 is added to the consumed shares 156 of the user record 212 of a user, the user may be either (a) required to authenticate each time the data referenced by the listing 202 is accessed or (b) automatically authenticated and allowed to access the data once the listing 202 is added. The access management module 306 may manage automatic authentication for subsequent access of data in the consumed shares 156 of a user in order to provide seamless access of the shared data as if it was part of the user data 150 of that user. To that end, the access management module 306 may access the access controls 206 of the listing 202, certificates, tokens, or other authentication material in order to authenticate the user when performing accesses to shared data. - The
exchange manager 124 may include a joining module 308. The joining module 308 manages the integration of shared data referenced by consumed shares 156 of a user with one another, i.e., shared data from different providers, and with a user database 152 of data owned by the user. In particular, the joining module 308 may manage the execution of queries and other computation functions with respect to these various sources of data such that their access is transparent to the user. The joining module 308 may further manage the access of data to enforce restrictions on shared data, e.g., such that analysis may be performed and the results of the analysis displayed without exposing the underlying data to the consumer of the data where this restriction is indicated by the access controls 206 of a listing 202. - The
exchange manager 124 may further include a structured query language (SQL) engine 310 that is programmed to receive queries from a user and execute the query with respect to data referenced by the query, which may include consumed shares 156 of the user and the user database 152 owned by the user. The SQL engine 310 may perform any query processing functionality known in the art. The SQL engine 310 may additionally or alternatively include any other database management tool or data analysis tool known in the art. The SQL engine 310 may define a webpage interface executing on the cloud computing platform 110 through which SQL queries are input and responses to SQL queries are presented. - Referring to
FIG. 4A, the illustrated method 400 may be executed by the exchange manager 124 in order to implement a point-to-point share between a first user (“provider 402”) and a second user (“consumer 404”). - The
method 400 may include the provider entering 406 metadata. This may include a user on devices 101-104 of the provider entering the metadata into fields of a form in a web page provided by the exchange manager 124. In some embodiments, entering 406 of metadata may be made using SQL commands by way of the SQL engine 310. The items of metadata may include some or all of those discussed above with respect to the metadata 204 of a listing 202. Step 406 may include receiving other data for a listing 202, such as access controls 206 and parameters defining a filter 208. - The
provider 402 may then invoke, on the devices 101-104, submission of the form and the data entered. - The
exchange manager 124 may then verify 408 the metadata and validate 410 the data referenced by the metadata. This may include performing some or all of the actions ascribed to the validation module 302. - If the metadata and shared data are not successfully verified 408 and validated 410, the
exchange manager 124 may notify the provider 402, such as by means of a notification through the web interface through which the metadata was submitted at step 406. - If the metadata and shared data are successfully verified 408 and validated 410, the
exchange manager 124 may notify the provider 402, such as by way of the web interface through which the metadata was submitted at step 406. - The
exchange manager 124 may further create 412 a listing 202 including the data submitted at step 406 and may further create an entry in the catalog 220. For example, keywords, descriptive text, and other items of information in the metadata may be indexed to facilitate searching. - Note that steps 406-412 may be performed by means of an interface provided to the
provider 402. Such an interface may include any suitable features including elements for inputting data (e.g., elements 204-210), and elements for generating a data listing. In addition, the interface may include elements to publish or unpublish a data listing to make the listing un-viewable to at least some other users. The interface may also include an element to update versions of the data listing or to roll back to a prior version of the listing or of the metadata associated with the listing. The interface may also include a list of pending requests to add a data listing or to add members to the data exchange. The interface may also include an indication of the number and other non-identifying information related to the data consumers who have accessed a given listing, as well as a representation of usage patterns of the data referenced by a listing by the data consumers of that listing. - Another user acting as a
consumer 404 may then browse 414 the catalog. This may include accessing a webpage providing a search interface to the catalog. This webpage may be external to the virtual warehouse 131, i.e., accessible by users that are not logged into the virtual warehouse 131. In other embodiments, only users that are logged in to the virtual warehouse 131 are able to access the search interface. As noted above, browsing of the catalog 220 may be performed using queries to the SQL engine 310 that reference the catalog 220. For example, user devices 101-104 may have a web-based interface to the SQL engine 310 through which queries against the catalog 220 are input by the consumer 404 and transmitted to the SQL engine 310. - In response to the consumer's browsing activities, the
exchange manager 124 may display the catalog and perform 416 searches with respect to the catalog to identify listings 202 having metadata corresponding to queries or search strings submitted by the consumer 404. The manner in which this search is performed may be according to any search algorithm known in the art. In the case of an SQL query, the query may be processed according to any approach for processing SQL queries known in the art. - The
exchange manager 124 may return results of a search string or SQL query to the consumer's 404 devices 101-104, such as in the form of a listing of references to listings 202 identified according to the search algorithm or processing the SQL query. The listing may include items of metadata or links that the consumer 404 may select to invoke display of metadata. In particular, any of the items of metadata 204 of a listing 202 may be displayed in the listing or linked to by an entry in the listing corresponding to the search record 202. - Note that the exchange referenced in
FIG. 4A may be a private exchange or a public exchange. In particular, those listings 202 that are displayed and searched 416 and viewable by the consumer 404 during browsing 414 may be limited to those having filters 208 that indicate that the listing 202 is viewable by the consumer 404, an organization of the consumer, or some other classification to which the consumer 404 belongs. Where the exchange is public, then the consumer 404 is not required to meet any filter criteria in some embodiments. - The
method 400 may include the consumer 404 requesting 418 to access data corresponding to a listing 202. For example, the consumer 404 may select an entry in the listing on the devices 101-104 of the consumer 404, which invokes transmission of a request to the exchange manager 124 to add the listing 202 corresponding to the entry to the consumed shares 156 in the user record 212 of the consumer 404. - In the illustrated example, the listing 202 of the selected entry has access controls 206. Accordingly, the
exchange manager 124 may forward 420 the request to the provider 402 along with an identifier of the consumer 404. The consumer 404 and provider 402 may then interact to one or both of (a) authenticate (login) 422 the consumer 404 with respect to the provider 402 and (b) process 424 payment for access of the data referenced by the listing 202. This interaction may be according to any approach to logging in or authenticating known in the art. Likewise, any approach for processing payment between parties may be implemented. In some embodiments, the data warehouse module may provide a rebate to the provider 402 due to credits consumed by the consumer 404 when accessing the shared data of the provider. Credits may be units of usage purchased by a user that are then consumed in response to the services of the virtual warehouse 131 used by the consumer 404, e.g., queries and other analytics performed on data hosted by the virtual warehouse 131. The interaction may be directly between devices 126 of the consumer 404 and provider 402 or may be performed by way of the exchange manager 124. In some embodiments, the exchange manager 124 authenticates the consumer 404 using the access control information 206 such that interaction with the provider 402 is not needed. Likewise, the listing 202 may define payment terms such that the exchange manager 124 processes payment without requiring interaction with the provider 402. Once the provider 402 determines that the consumer 404 is authenticated and authorized to access the data referenced by the listing 202, the provider 402 may notify 426 the exchange manager 124 that the consumer 404 may access the data referenced by the listing 202. In response, the exchange manager 124 adds 428 a reference to the listing 202 to the consumed shares 156 in the user record 212 of the consumer 404. - Note that in some instances a
listing 202 does not list specific data, but rather references a particular cloud service 120, e.g., the brand name or company name of a service. Accordingly, the request to access the listing 202 is a request to access user data 150 of the consumer making the request. Accordingly, steps 422, 424, 426 may include authenticating the consumer 404 with respect to the authentication engine 121 such that the cloud service 120 can verify the identity of the consumer 404 and inform the exchange manager 124 of which data to share with the consumer 404 and to indicate that the consumer 404 is authorized to access that data. - In some embodiments, this may be implemented using a “single sign on” approach in which the
consumer 404 authenticates (logs in) once with respect to the cloud service 120 and thereafter is enabled to access the consumer's 404 data in the service database 158. For example, the exchange manager 124 may present an interface to the cloud service 120 on the devices 101-104 of the consumer 404. The consumer 404 inputs authentication information (username and password, certificate, token, etc.) into the interface and this information is forwarded to the authentication engine 121 of the cloud service 120. The authentication engine 121 processes the authentication information and, if the information corresponds to a user account, notifies the exchange manager 124 that the consumer 404 is authenticated with respect to that user account. The exchange manager 124 may then identify the user data 150 for that user account and create a database referencing it. A reference to that database is then added to the consumed shares 156 of the consumer 404. - In some embodiments, the user's authentication with respect to the
virtual warehouse 131 is sufficient to authenticate the user with respect to the cloud service 120 such that steps consumer 404. For example, the virtual warehouse 131 may be indicated by the consumer 404 to the cloud service 120 to be authorized to verify the identity of the consumer 404. - In some embodiments, the
exchange manager 124 authenticates the consumer 404 using the access control information 206 such that interaction with the provider 402 is not needed. Likewise, the listing 202 may define payment terms such that the exchange manager 124 processes payment without requiring interaction with the provider 402. Accordingly, in such embodiments, step 422 is performed by the exchange manager 124 and step 426 is omitted. The exchange manager 124 then performs step 428 once the consumer 404 is authenticated and/or has provided required payment. - In some embodiments, adding a
listing 202 to the consumed shares of a consumer 404 may further include receiving, from the consumer 404, consent to the terms presented to the consumer 404. In some embodiments, where the terms of the agreement are changed by a provider 402 after a consumer 404 has added the listing 202 according to the method 400 or other method described herein, the exchange manager 124 may require the consumer 404 to agree to the changed terms before being allowed to continue to access the data referenced by the listing 202. - Adding 428 the data referenced by the
listing 202 may include creating a database referencing the data. A reference to this database may then be added to the consumed shares 156 and this database may then be used to process queries referencing the data referenced by the share record. Adding 428 the data may include adding data filtered according to filters 208. For example, the added data may be data referenced by the listing 202 (e.g., a filtered view of the data) that is associated with the consumer 404, an organization of the consumer 404, or some other classification of the consumer 404. - In some embodiments, adding the
listing 202 to the user record 212 may include changing the access controls 206 of the listing 202 to reference the identity data 214 of the consumer 404 such that attempts to access the data referenced by the listing 202 will be permitted and executed by the exchange manager 124. - The
consumer 404 may then input 430 queries to the SQL engine 310 by way of the consumer's devices 101-104. The queries may reference the data referenced in the listing 202 added at step 428 as well as other data referenced in the user database 152 and consumed shares 156. The SQL engine 310 then processes 432 the queries using the database created at step 428 and returns the result to the consumer 404 or creates views, materialized views, or other data that may be accessed or analyzed by the user. As noted above, the data of consumed shares operated upon by the queries may have been previously filtered to include only data relating to the consumer 404. Accordingly, different consumers 404 adding the same listing 202 to their consumed shares 156 will see different versions of the database referenced by the listing 202. - Referring to
FIG. 4B, in some embodiments, the private sharing of data and filtering of data according to the identity of the consumer 404 may be implemented using the illustrated data structures. For example, the service database 158 of the provider 402 may include a customer map 434 that includes entries for customer identifiers 436 of users of the service provided by the provider 402, e.g., a service implemented by the cloud service 120 of the server, with the customer identifier 436 being an identifier for authenticating with the authentication interface 120. The customer map 434 may map each customer identifier 436 to a warehouse identifier 438, i.e., a user identifier used by a user to authenticate with the virtual warehouse 131 such that the same user corresponds to both identifiers identifiers - The
customer map 434 may further include a reference 440 to an entitlement table 442, which may be one of a plurality of entitlement tables 442. Each entitlement table 442 defines which of one or more tables 444 of the provider 402 may be accessed with the customer ID 436 to which it is mapped. The entitlement table 442 may further define columns of a table 444 that can be accessed with the customer ID 436. The entitlement table 442 may further define rows or types of rows based on one or more filtration criteria of a table 444 that can be accessed with the customer ID 436. The entitlement table 442 may further define a schema for a table 444 that can be accessed with the customer ID 436. - A listing 202 for a table 444 may therefore specify that access to a data table 444 is to be performed as defined by the
customer map 434. For example, referring to FIG. 4C, when a consumer 404 requests to add a listing 202 for a database for which access is defined according to the customer map 434, the exchange manager 124 may create a secure view 446 according to the customer identifier 436 and entitlement table 442 mapped to the warehouse identifier 438 of the consumer 404. The secure view may be generated by performing an inner join of the data tables 444 of the database specified in the entitlement table 442 (or portions thereof as specified in the entitlement table 442) that is filtered according to the customer identifier 436 such that a result of the join includes only data for the specific customer identifier 436 and includes only those portions of the database (tables 444 and/or portions of tables 444) specified in the entitlement table 442. The manner in which the secure view is generated may be as described in U.S. application Ser. No. 16/055,824 filed Aug. 6, 2018, and entitled SECURE DATA SHARING IN A MULTI-TENANT DATABASE and U.S. application Ser. No. 16/241,463 filed Jan. 7, 2019 and entitled SECURE DATA SHARING IN A MULTI-TENANT DATABASE. - 
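The entitlement-based filtering described above can be illustrated with a minimal sketch. It assumes an in-memory SQLite database standing in for the virtual warehouse, and the table names (`customer_map`, `entitlement`, `orders`), column names, and the function `secure_rows` are hypothetical, chosen only for illustration; the sketch is not the mechanism of the applications cited above. It restricts the result to the rows of the requesting customer and to the columns listed in that customer's entitlement entries.

```python
import sqlite3

def make_demo_db():
    """Build a tiny database with a customer map, an entitlement table, and one data table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer_map (customer_id TEXT, warehouse_id TEXT);
        CREATE TABLE entitlement (customer_id TEXT, column_name TEXT);
        CREATE TABLE orders (customer_id TEXT, item TEXT, amount INTEGER, margin INTEGER);
        INSERT INTO customer_map VALUES ('c1', 'w1'), ('c2', 'w2');
        INSERT INTO entitlement VALUES ('c1', 'item'), ('c1', 'amount'), ('c2', 'item');
        INSERT INTO orders VALUES ('c1', 'bike', 10, 3), ('c2', 'helmet', 5, 1), ('c1', 'lock', 4, 2);
    """)
    return conn

def secure_rows(conn, warehouse_id):
    """Return only the rows and columns the given warehouse identifier is entitled to."""
    # Column names that really exist in the data table (guards the dynamic SELECT list).
    table_cols = {row[1] for row in conn.execute("PRAGMA table_info(orders)")}
    # Entitled columns for the customer mapped to this warehouse identifier.
    allowed = [
        name
        for (name,) in conn.execute(
            "SELECT e.column_name FROM entitlement e "
            "JOIN customer_map m ON m.customer_id = e.customer_id "
            "WHERE m.warehouse_id = ? ORDER BY e.rowid",
            (warehouse_id,),
        )
        if name in table_cols
    ]
    if not allowed:
        return []  # unknown identifier or no entitlements: nothing is exposed
    select_list = ", ".join(f"o.{name}" for name in allowed)
    # Inner join against the customer map filters rows to the requesting customer.
    return conn.execute(
        f"SELECT {select_list} FROM orders o "
        "JOIN customer_map m ON m.customer_id = o.customer_id "
        "WHERE m.warehouse_id = ? ORDER BY o.rowid",
        (warehouse_id,),
    ).fetchall()
```

In this sketch two consumers querying the same table receive different results: warehouse identifier `w1` sees the `item` and `amount` columns of its own two rows, while `w2` sees only the `item` column of its single row.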
FIG. 5 illustrates an alternative method 500 for sharing data that may be performed when the consumer requests 418 to add a listing 202 that is available to the public or to all users of a private exchange. In that case, the exchange manager 124 adds 428 the reference to the listing 202 to the consumed shares 156 of the consumer 404 and authentication or payment steps are omitted. Step 428 may be performed as described above except that no change to access controls 206 is performed. Likewise, steps 430 and 432 may be performed with respect to the shared data as described above. The exchange of FIG. 5 could be a public exchange or a private exchange as described above with respect to FIG. 4. FIG. 5 illustrates the case where, if a listing 202 is viewable (i.e., filter criteria permit viewing by the consumer 404 as described above), the consumer 404 is able to add the listing 202 to the consumed shares 156 of the consumer 404 without further authentication or payment. - Note that when a
listing 202 is added to the consumed shares 156 of a user according to any of the methods disclosed herein, the exchange manager 124 may notify consumers of the listing 202 when the data referenced by the listing 202 is updated. - Referring to
FIG. 6, in some embodiments, a method 600 may include a consumer 404 browsing a catalog and selecting a listing 202 as described for the other methods described herein (see, e.g., FIGS. 4A and 5), and requesting 602, from the exchange manager 124, a bidirectional share with respect to the data referenced by the listing (“the shared data”) and additional data in the user database 152 (“the user's data”). Note that in some embodiments the listing 202 of the provider 402 does not reference any specific data (e.g., a specific table or database) and instead offers to perform a service with respect to data provided by the consumer 404. Accordingly, in such instances “the shared data” as discussed below may be understood to be replaced with “the offered service.” - In response to this request, the
exchange manager 124 implements 604 a point-to-point share of the shared data with respect to the consumer 404 and the provider 402. This may be performed as described above with respect to FIG. 4A, e.g., including authentication of the consumer 404 and possibly filtering of the shared data to only include data associated with the consumer 404 as described above. The exchange manager 124 may further implement a point-to-point share of the user's data with respect to the provider 402 as described with respect to FIG. 4A except: (a) the consumer 404 acts as the provider and the provider 402 acts as the consumer for the user's data and the user's data is added to the consumed shares 156 of the provider 402 and (b) the consumer 404 need not create a listing 202 for the user's data and the user's data need not be listed in the catalog 220. - Following
step 606, both the consumer 404 and the provider 402 have access to the shared data and the user's data. Either may then run queries against both of these, join them, perform aggregations on the joined data, or perform any other actions or enrichments known in the art with respect to multiple databases. - In some embodiments, a bi-directional share may include, or be requested by the
consumer 404 to include, the provider 402 also joining 608 the shared data and the user data to obtain joined data and returning 610 a reference to the joined data to the exchange manager 124 with a request to add 612 a reference to the joined data to the consumed shares 156 of the consumer 404, which the exchange manager 124 does. - Accordingly, the
consumer 404 will now have access to the joined data. Step 608 may further include performing other actions (aggregations, analysis) on the user data and shared data either before or after joining. Step 608 may be performed by the virtual warehouse 131 in response to the request from the consumer 404 to do so. - Note that the result of the join may be either (a) a new database that is a result of the join or (b) a joined database view that defines a join of the shared data and the user data.
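- The difference between options (a) and (b) can be illustrated with a minimal sketch, assuming an in-memory SQLite database in place of the virtual warehouse; the table, view, and function names are hypothetical. A copied join result is a snapshot, whereas a join view stores only the join definition and therefore reflects later changes to the shared data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE shared_data (customer_id TEXT, score INTEGER);
    CREATE TABLE user_data (customer_id TEXT, region TEXT);
    INSERT INTO shared_data VALUES ('c1', 70), ('c2', 55);
    INSERT INTO user_data VALUES ('c1', 'west'), ('c2', 'east');

    -- Option (a): a new table holding a copy of the join result.
    CREATE TABLE joined_copy AS
        SELECT s.customer_id, s.score, u.region
        FROM shared_data s JOIN user_data u ON u.customer_id = s.customer_id;

    -- Option (b): a view that stores only the join definition.
    CREATE VIEW joined_view AS
        SELECT s.customer_id, s.score, u.region
        FROM shared_data s JOIN user_data u ON u.customer_id = s.customer_id;
""")

def scores(relation):
    """Return {customer_id: score} read from one of the two joined relations."""
    if relation not in ("joined_copy", "joined_view"):
        raise ValueError(f"unknown relation: {relation}")
    return dict(conn.execute(f"SELECT customer_id, score FROM {relation}"))

# The provider later updates the shared data.
conn.execute("UPDATE shared_data SET score = 90 WHERE customer_id = 'c1'")
```

After the update, `scores("joined_view")` reports the new score for `c1`, while `scores("joined_copy")` still holds the value captured when the copy was created.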
- The result from step 608 (joining, aggregating, analyzing, etc.) may alternatively be added to the original share performed at
step step 608. - Steps 608-612 may also be performed by the
virtual warehouse 131 in response to a request from the consumer 404 or provider 402 to do so independently from the request made at step 602. - Note that in many instances there are
many consumers 404 that attempt to perform bi-directional shares with respect to the provider 402 and these consumers 404 may seek bi-directional shares with respect to their user data that may be in many different formats (schemas) that may be different from a schema used by the shared data of the provider 402. Accordingly, step 608 may include a transformation step. The transformation step maps a source schema of the user's data to a target schema of the shared data. The transformation may be a static transformation provided by a human operator. The transformation may be according to an algorithm that maps column labels of the source schema to corresponding column labels of the target schema. The algorithm may include a machine learning or artificial intelligence model that is trained to perform the transformation. For example, a plurality of training data entries may be specified by human annotators that each include as an input a source schema and as an output include a mapping between the source schema and the target schema. These entries may then be used to train a machine learning or artificial intelligence algorithm to output a mapping to a target schema for a given input source schema. - Data added to the shares consumed by the
consumer 404 and provider 402 may then be operated on by the consumer 404 and provider 402, respectively, such as by executing queries against the data, aggregating the data, analyzing the data, or performing any other actions described herein as being performed with respect to shares added to the consumed shares 156 of a user. - In particular embodiments, a data provider may improve its relationship with business partners by enabling the secure interchange of data in a bi-directional manner, as discussed above. Traditional methods of bi-directional data sharing have been challenging to accomplish, and only very limited sets of data are shared via APIs, FTP, or file transfer between companies. This often comes at great expense, with added data latency and even some security risk.
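- The schema transformation described above for step 608 can be sketched for its two simplest variants: a static mapping supplied by an operator and a label-matching rule. The function names, the normalization rule, and the sample column labels below are assumptions made for illustration; a trained model could replace the label-matching rule without changing the surrounding flow.

```python
import re

def normalize(label):
    """Lowercase a column label and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", label.lower())

def map_schema(source_columns, target_columns, overrides=None):
    """Return {source label: target label} for every source column that can be mapped.

    `overrides` plays the role of a static, operator-provided transformation;
    remaining columns are matched by comparing normalized labels.
    """
    overrides = overrides or {}
    by_normalized = {normalize(t): t for t in target_columns}
    mapping = {}
    for col in source_columns:
        if col in overrides:
            mapping[col] = overrides[col]
        elif normalize(col) in by_normalized:
            mapping[col] = by_normalized[normalize(col)]
    return mapping

def transform_rows(rows, mapping):
    """Rename the keys of each row dict to the target schema; unmapped columns are dropped."""
    return [{mapping[k]: v for k, v in row.items() if k in mapping} for row in rows]
```

For example, mapping the source labels `Customer-ID`, `E_Mail`, and `zip` onto a target schema of `customer_id`, `email`, and `postal_code` needs only one operator-supplied override (`zip` to `postal_code`); the other two labels match after normalization.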
- A data provider may instead host a private data exchange, and invite their customers and partners to participate in the exchange. Customers and partners may access data in secure views, for example, and they may also push data in the other direction as well. This could be to share data back to the host, but also to potentially list data so that other participants of the ecosystem can securely share it as well. Data from a public data exchange, other private exchanges, or from other external sources may also be included.
- Every large company depends on other companies, and on its customers. Bidirectionally sharing data not only from the company to and from these parties, but also between these external parties themselves, can allow rich, collaborative data ecosystems to develop where groups of companies can work together around data. They can securely discover, combine, and enrich data assets to help service a common customer, or to form new partnerships amongst themselves. Some of these relationships may even lead to opportunities to sell data, secure views of or functions across data to other participants of a walled garden ecosystem.
- Referring to
FIG. 7, the approach to sharing and consuming data as described herein enables enrichment of data and return of that enriched data to the exchange. For example, provider A may request 702 sharing of data (share 1) with the exchange in the same manner as for other methods described herein. The exchange manager 124 verifies, validates, and adds 704 share 1 to the catalog 220. - A second provider B may then browse the
catalog 220 and add 706 share 1 to its consumed shares 156. Provider B may perform 708 operations on the shared data such as joining it with other data, performing aggregations, and/or performing other analysis with respect to share 1, resulting in modified data (share 2). Provider B may then request 710 sharing of share 2 with the exchange as described herein. Note that the joining of step 708 may include joining any number of databases, such as any number of shares based on any number of listings by any number of other users. Accordingly, iterations of steps 702-710 by many users may be viewed as a hierarchy in which a large number of listings 202 of multiple users are narrowed down to a smaller number of listings 202 based on the data from the larger number of listings 202. - The
exchange manager 124 verifies, validates, and adds 712 share 2 to the catalog 220. This process may be repeated 714 with respect to share 2, as provider A, provider B, or a different provider adds share 2, generates modified data based on it, and adds the result back to the catalog in the same manner. In this manner, a rich ecosystem of data and analysis may be made available to users. The shares according to the method 700 may be any shares, point-to-point shares, private exchange shares, or bi-directional exchange shares according to the methods disclosed herein. - Note that there is a possibility that a provider may perform
steps listing 202 that is based on a listing 202. For example, listing L1 of provider A is used by provider B to create listing L2, which is used by provider C to create listing L3, which is used by provider A to define listing L1. Such a flow could include any number of steps. This may be undesirable in some cases such that modification of listing L1 to reference L3 is not permitted in view of L3 being derived from L1. In other instances, such a loop is permitted provided there is a time delay in when the data referenced by each listing is refreshed. For example, L1 may reference L3 provided L3 will not be refreshed until some time after L1 is refreshed and therefore the circular reference will not result in continuous updating of L1 and L3 ad infinitum. Non-looping flows are also contemplated by this disclosure, such that listing L1 is not influenced by other providers' use of listing L1. - The listing created at step 712 (Share 2) may either (a) include copies of the data from
Share 1 remaining after step 708 and as modified according to step 708 or (b) include a view referencing Share 1 (e.g., a database created based on the listing 202 for Share 1 according to the methods disclosed herein) and defining the operations performed at step 708 without including actual data from Share 1 or derived from Share 1. Accordingly, a hierarchy as described above may be a hierarchy of views that reference one or both of listings 202 that are views created according to the method 700 and listings 202 of data from one or more providers according to any of the methods disclosed herein. - In the methods disclosed herein, approaches are disclosed for creating shares (listings 202) and for adding shares. In a like manner, a
consumer 404 may instruct the exchange manager 124 to remove added shares. A provider 402 may instruct the exchange manager 124 to cease sharing certain listings 202. In some embodiments, this may be accompanied by actions to avoid disrupting consumers 404 of those listings 202, such as by notifying these consumers 404 and ceasing to share the listings 202 only after a specified time period after the notification or after all consumers 404 have removed references to the listings 202 from their consumed shares 156. - In a first use case, a company implements a private exchange according to the methods described above. In particular, listings 202 of the company are viewable only by
consumers 404 that are associated (employees, management, investors, etc.) with the company. Likewise, adding of listings 202 is permitted only for those associated with the company. When adding a listing 202 to the consumed shares 156, it may be filtered based on the identity of the consumer that adds it, i.e., limited to data that is relevant to the consumer's role within the company. - In a second use case, a
provider 402 creates a reader or reader/writer account for a consumer 404 that is not yet a user of the virtual warehouse 131. The account may be associated with the account data of the consumer (see the customer map 434 of FIG. 4B discussed above). The consumer 404 may then log on to that account and then access the provider's listings to access the data of the consumer 404 that is managed by the provider 402 (see, e.g., the discussion of FIG. 4A). - In a fifth use case a
consumer 404 adds shares that are private (e.g., accessible due to the identity of the consumer 404 according to the methods described above) and shares that are public. These may then be joined by the consumer 404 and used to process queries. - In a sixth use case, a
listing 202 may be shared based on a subscription (e.g., monthly) or be accessed based on per-query pricing, or a credit uplift multiplier. Accordingly, the exchange manager 124 may manage processing of payment and access such that the consumer 404 is allowed to access the data subject to the pricing model (subscription, per query, etc.). - In a seventh use case, the
exchange manager 124 implements secure functions and secure machine learning models (both training and scoring) that may be used to process private data such that the consumer 404 is allowed to use the result of the function or machine learning model but does not have access to the raw data processed by the function or machine learning model itself. Likewise, the consumer of the shared data is not allowed to export the shared data. The consumer is nonetheless allowed to perform analytical functions with respect to the shared data. For example, the following secure function may be implemented to enable viewing of customer shopping data in a secure manner: - create or replace secure function
UDF_DEMO.PUBLIC.get_market_basket(input_item_sk NUMBER(38))
returns table (input_item NUMBER(38,0), basket_item_sk NUMBER(38,0),
num_baskets NUMBER(38,0))
as
'select input_item_sk, ss_item_sk basket_item, count(distinct
ss_ticket_number) baskets
from udf_demo.public.sales
where ss_ticket_number in (select ss_ticket_number from udf_demo.public.sales where
ss_item_sk = input_item_sk)
group by ss_item_sk
order by 3 desc, 2'; - In an eighth use case, the
exchange manager 124 may provide usage statistics of a listing 202 by one or more consumers 404 to the provider 402 of the listing, e.g., queries, credits used, tables scanned, tables hit, etc. - In a ninth use case, the systems and methods disclosed herein are used for industry-specific applications. For example:
- 1. Cybersecurity
- a. Allows for sharing of risk vectors, bad actors, IP white/black lists, realtime attacks in progress, known good/bad emailers, etc.
- 2. Healthcare
- a. Secure sharing of patient information, including cost information and outcome information, among other types of information
- b. Secure multi-hospital databases so patients can share their information to multiple providers. (e.g., if patient A lives in California and travels to Florida on vacation, is injured, and is treated in an emergency room, the hospital in Florida may be able to access patient A's records from disparate hospitals and providers.)
- Other industries may also benefit from private or public sharing of data according to the systems and methods disclosed herein, such as the financial services industry, telecommunications industry, media and advertising industry, government agencies, militaries, and intelligence agencies.
- In a tenth use case, a first user provides marketing services for a second user and, therefore, the second user shares a customer list with the first user. The first user shares data regarding a marketing campaign to the second user, such as campaign metadata, current user events (session start/end for specific users, purchases for specific users, etc.). This may be accomplished using the bi-directional sharing of
FIG. 6. This data may be joined (customer list + customer events from the first user) in order to obtain a better understanding of events for a specific user or groups of users. As noted above, this exchange of data may be performed without creating copies or transferring data: each user accesses the same copy of the shared data. Since no data is transferred, the data may be accessed in near real time as customer events occur. - 
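The tenth use case can be illustrated with a minimal in-memory sketch in which a share is a reference to the owner's data rather than a copy. The `Account` class, the `share` and `events_by_customer` functions, and the sample records are hypothetical and stand in for accounts, consumed shares, and a join query.

```python
class Account:
    def __init__(self, name):
        self.name = name
        self.data = {}             # data sets owned by this account
        self.consumed_shares = {}  # references to data sets owned by other accounts

def share(owner, dataset, recipient):
    """Add a reference (not a copy) to the owner's data set to the recipient's shares."""
    recipient.consumed_shares[f"{owner.name}.{dataset}"] = owner.data[dataset]

def events_by_customer(customers, events):
    """Join a customer list with customer events on the customer id."""
    names = {c["id"]: c["name"] for c in customers}
    return [(names[e["customer_id"]], e["event"]) for e in events if e["customer_id"] in names]

agency = Account("agency")      # first user: provides marketing services
retailer = Account("retailer")  # second user: owns the customer list
retailer.data["customers"] = [{"id": 1, "name": "Ana"}, {"id": 2, "name": "Bo"}]
agency.data["events"] = [{"customer_id": 1, "event": "session_start"}]

# Bi-directional share: each side receives a reference to the other's data.
share(retailer, "customers", agency)
share(agency, "events", retailer)

def retailer_view():
    """The retailer joins its own customer list with the agency's shared events."""
    return events_by_customer(retailer.data["customers"],
                              retailer.consumed_shares["agency.events"])
```

Because the retailer holds a reference to the agency's event list, an event appended by the agency is visible the next time `retailer_view()` runs, with no transfer step in between.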
FIG. 8 is a block diagram illustrating a network environment in which a data provider may share data via a cloud computing service. A data provider 810 may upload one or more data sets 820 to cloud storage using a cloud computing service 112. These data sets may then become viewable by one or more data consumers 101-104. The data provider 810 may be able to control, monitor, and increase the security of its data using the cloud computing service 112 by way of the methods and systems discussed herein. In particular embodiments, the data provider 810 may implement a private data exchange on its own online domain using the functionality, methods, and systems provided by the cloud computing service 112. Data providers 810 may be any provider of data, such as retail companies, government agencies, polling agencies, non-profit organizations, etc. The data consumers 101-104 may be internal to the data provider 810 or external to the data provider 810. A data consumer that is internal to the data provider may be an employee of the data provider. For example, the data provider may be a bike-share company, which provides bicycles for a daily, monthly, annual, or trip-based fee. The bike-share company may gather data about its users, such as basic demographic information as well as ride information, including date of ride, time of ride, and duration of ride. This information may be available to employees of the bike-share company via the cloud computing service 112. - The interaction between a
data provider 810, private data exchange 812 (as implemented by cloud computing service 112), and a data consumer may be as follows. The data provider may create one or more listings 811 using data sets 820. The listings may be for any suitable data. For example, a consumer data company may create a listing called “video streaming” that contains data related to the video streaming habits of a large number of users. The data provider may set listing policies 821 related to who may view the listing 811, who may access the data in the listing 811, or any other suitable policy. Such listing policies are discussed above with reference to FIG. 2. The data provider 810 may then submit the listing 811 to the private data exchange 812 at step 813. The private data exchange 812 may be embedded inside a web domain of the data provider 810. For example, if the web domain of the consumer data company is www.entityA.com, the private data exchange may be found at www.entityA.com/privatedataexchange. The private data exchange 812 may receive the listing and approve it at step 814 if the listing complies with one or more rules as determined by the cloud computing service 112. The private data exchange 812 may then set up access controls at 815 at least in part according to the listing policies that were set in step 821. The private data exchange 812 may then invite members at step 816. The members may be data consumers 801. The data consumers 801 may accept the invitation at step 817 and then may begin consuming the data at 818. The type of data consumption may depend on the access controls that were established at 815. For example, the data consumer may be able to read the data only or share the data. As another example, a data consumer may be able to perform any combination of the above read or share operations on the data, subject to the access controls. In general, data sharing does not involve altering shared data. - In some embodiments, a
data consumer 801 may independently access the private data exchange 812, either by directly navigating to the private data exchange 812 in a browser, or by clicking on an advertisement for the private data exchange 812, or by any other suitable mechanism. A private data exchange may also be rendered via custom or other code by accessing listing and other information via an API. If the data consumer 801 wishes to access the data within a listing and the listing is not already universally available or the data consumer 801 does not already have access, the data consumer 801 may need to request access at step 820. The data provider may approve or deny the request at 822. If approved, the private data exchange may grant access to the listing at 823. The user may then begin consuming the data as discussed above. - In particular embodiments, one or more data exchange administrator accounts may be designated by the
cloud computing service 112. The data exchange administrator may manage members of the private data exchange by designating members as data providers 810 or data consumers 801. The data exchange administrator may be able to control listing visibility by selecting which members can see a given listing. The data exchange administrator may also have other functions such as approving listings before they are published on the private data exchange, tracking usage of each of the listings, or any other suitable administrative function. In some embodiments, the data provider and the data exchange administrator are part of the same entity; in some embodiments, they are separate entities. The provider may create listings, may test sample queries on the data underlying a listing, may set listing access, grant access to listing requests, and track usage of each of the listings and the data underlying the listings. A data consumer 801 may visit a private data exchange and browse visible listings, which may appear as tiles. To consume the data underlying a listing, the consumer may either immediately access the data, or may request access to the data. - 
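The roles and the listing workflow described above (member designation, listing submission and approval, access request, and provider decision) can be sketched as a small state model. The class `PrivateExchange` and its method names are hypothetical, and the rule that only published listings accept access requests is an assumption of this sketch rather than a requirement stated above.

```python
class PrivateExchange:
    """Toy model of the listing workflow; all names are illustrative."""

    def __init__(self, admin):
        self.admin = admin
        self.roles = {}       # member -> set of roles: "provider" and/or "consumer"
        self.listings = {}    # listing name -> {"provider", "published", "access"}
        self.pending = set()  # (member, listing name) requests awaiting a decision

    def _require_admin(self, actor):
        if actor != self.admin:
            raise PermissionError("only the exchange administrator may do this")

    def designate(self, actor, member, role):
        """Administrator designates a member as a provider or a consumer."""
        self._require_admin(actor)
        self.roles.setdefault(member, set()).add(role)

    def submit_listing(self, provider, name):
        if "provider" not in self.roles.get(provider, set()):
            raise PermissionError(f"{provider} is not a data provider")
        self.listings[name] = {"provider": provider, "published": False, "access": set()}

    def approve_listing(self, actor, name):
        """Administrator approves a listing before it is published."""
        self._require_admin(actor)
        self.listings[name]["published"] = True

    def request_access(self, member, name):
        if "consumer" not in self.roles.get(member, set()):
            raise PermissionError(f"{member} is not a data consumer")
        if not self.listings[name]["published"]:
            raise LookupError(f"{name} is not published")
        self.pending.add((member, name))

    def decide(self, provider, member, name, approve):
        """The listing's provider approves or denies a pending request."""
        if self.listings[name]["provider"] != provider:
            raise PermissionError("only the listing's provider decides requests")
        self.pending.remove((member, name))
        if approve:
            self.listings[name]["access"].add(member)

    def can_consume(self, member, name):
        listing = self.listings.get(name)
        return bool(listing and listing["published"] and member in listing["access"])
```

A typical sequence would be `designate`, `submit_listing`, `approve_listing`, `request_access`, and `decide`, after which `can_consume` returns `True` for the approved consumer.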
FIG. 9 is an example private data exchange 900 in accordance with an embodiment of the present invention. Private data exchange 900 may be what a data consumer sees when she navigates to the private data exchange on the web. For example, the data consumer may enter www.entityA.com/privatedataexchange in her browser. As discussed herein, “Entity A Data Exchange” may be a private data exchange that is facilitated by the cloud computing service 112 and is embedded into Entity A's own web domain or into an application, or may be accessed via an API. Private data exchange 900 may include several listings for different data sets, for example listings A-L. The listings A-L may also be referred to herein as a data catalog, which may allow visitors to the private data exchange to view all the available listings in the private data exchange. These listings may be placed by an administrator internal to Entity A. Providing a data catalog in this manner may serve to combine the benefits of crowdsourced content, data quality, and the right level of centralized control and coordination that can overcome the challenges that have slowed the adoption of other approaches to enterprise data cataloging (e.g., indexing and crawling systems). It allows users across an enterprise to contribute data, use data from other groups, and join data together to create enriched data products, for both internal use as well as potentially for external monetization. - As an example and not by way of limitation, Entity A may be a consumer data company that has collected and analyzed the consuming habits of millions of individuals in several different categories. Their data sets may include data in the following categories: online shopping, video streaming, electricity consumption, automobile usage, internet usage, clothing purchases, mobile application purchases, club memberships, and online subscription services. Each of these data sets may correspond to different listings. 
For example, Listing A may be for online shopping data, Listing B may be for video streaming data, Listing C may be for electricity consumption data, and so on. Note that the data may be anonymized so that individual identities are not revealed. The listings located below
line 915 may correspond to third-party listings that Entity A may allow on its private data exchange. Such listings may be generated by other data providers and may be subject to approval by Entity A before being added to the private data exchange 900. A data consumer may click on and view any of the listings subject to various access controls and policies as discussed above with reference to FIGS. 2, 4, and 8. - In particular embodiments, a data provider may invite members to access its private data exchange, as discussed with reference to
FIG. 8 . One class of members may be the physical and digital supply chain suppliers of the data provider. For example, a data provider may share data with suppliers on its inventory levels or consumption of things provided by the suppliers, so they can better meet the needs of the data provider. In addition, digital data providers may provide data directly into its private data exchange, to make it immediately usable and joinable to the internal enterprise data, saving costs for both parties on transmitting, storing, and loading the data. - Some companies such as hedge funds and marketing agencies bring in data from many external sources. Some hedge funds evaluate hundreds of potential data sets per year. A private data exchange may be used to not only connect with data that has already been purchased, but can also be used to evaluate new data assets. For example, a hedge fund could have potential data suppliers list their data on their private exchange, and the fund could explore and “shop” for data in a private data store where they are the only customer. Such an internal data store could also “tunnel” in data assets from a public Data Exchange (e.g., the SNOWFLAKE public Data Exchange), as discussed with reference to
FIG. 11 . - As another example, an existing provider of marketing data to a company could list some additional datasets that their customer could use via their private exchange on a trial basis, and if the customer finds them useful, the supplier can immediately provide full access through the same exchange. These arrangements can bring much greater depth of data, bi-directional and much fresher data, and greater trust and transparency to relationships between suppliers of data and physical goods and their customers.
-
FIG. 10 is a diagram illustrating an example secure view of shared data from a private data exchange. When a data consumer 1020 wishes to access data in a listing (e.g., Listing H), the cloud computing service 112 may facilitate access via a secure view of shared data 1010. The secure view of shared data 1010 may include metadata 1015 that includes the metadata and access controls discussed herein with reference to FIG. 2. This may allow data providers to share data without exposing the underlying tables or internal details. This makes the data more private and secure. With a secure view of shared data 1010, the view definition and details are only visible to authorized users. - In a private data exchange, data may be shared both within the same entity and between different entities. Additionally, the data sharing may be one-way, two-way, or multi-way. In one embodiment, this can lead to up to five main use cases for sharing data: two-way inter-entity, two-way intra-entity, one-way inter-entity, one-way intra-entity, and multi-way multi-entity. An example of two-way inter-entity data sharing may be data sharing from portfolio companies to a parent company and between portfolio companies. An example of two-way intra-entity data sharing may be data sharing from the headquarters of a large company to the different business units within that company, and also data sharing from the business units to headquarters. An example of one-way inter-entity data sharing may be a large data provider (e.g., a national weather service) that shares data with lots of different entities, but does not receive data from those entities. An example of one-way intra-entity data sharing may be a large company that provides data to its respective business units but does not receive data from those business units. 
In particular embodiments, data may be shared as “point-to-point shares” of specific data, or as “any-shares.” A point-to-point share of specific data may include a private data exchange share between a parent company and specific portfolio companies. An any-share may include a private data exchange share from a parent company to a broad group of data consumers on a public or within a private exchange.
- In particular embodiments, the
cloud computing service 112 may generate a private data exchange for an entity who is the owner of the data to be shared on the private data exchange. The cloud computing service 112 may designate one or more administrators of the private data exchange. These administrators may have control over the access rights of the private data exchange with regard to other users. For example, an administrator may be able to add another user account to the private data exchange and designate that account as a data provider, data consumer, exchange administrator, or a combination of these.
- In particular embodiments, the exchange administrator may control viewing and access rights to the private data exchange. Viewing rights may include a list of entities that may view the listing in the private data exchange. Access rights may include a list of entities that may access the data after selecting a particular listing. For example, a company may publish
private data exchange 900 and may include several listings, Listing A through Listing L. Each of these listings may include its own individual viewing and access rights. For example, Listing A may include a first list of entities that have rights to view the listing on the private data exchange 900 and a second list of entities that have rights to access the listing. Viewing a listing may simply be to see that the listing exists on the private data exchange. Accessing a listing may be to select the listing and access the underlying data for that listing. Access may include viewing the underlying data, manipulating that data, or both. Controlling viewing rights may be useful for data providers who do not want some users to even know that a certain listing exists on the private data exchange. Thus, when a user who does not have viewing rights to a particular listing visits the private data exchange, that user will not even see the listing on the exchange. In particular embodiments, the above discussed viewing and access rights may be provisioned via an application program interface (API). The exchange catalog may be queried and updated via the API. This may allow a data provider to show listings on its own application or website to anyone who visits. When a user wants to access or request access to data, the user may then create an account with the cloud computing service 112 and obtain access. In some embodiments, a URL may be called when a user requests access to data within a listing. This may allow for integration with external request approval workflows. For example, if a user makes an access request, an external request approval workflow of the data provider may be accessed and activated. The external request approval workflow may then operate normally to perform an external request approval process. In some embodiments, a listing may be unlisted, which means that the listing exists but is not visible on the data exchange.
To access an unlisted listing, a consumer may input a global URL into the browser. This may require a unique URL for each listing. - In particular embodiments, when a new private data exchange is created for a data provider, the
cloud computing service 112 may designate an exchange admin (e.g., the data exchange administrator, as discussed above), and may also generate the following metadata about the private data exchange: the name of the private data exchange, which needs to be unique, a display name, a logo, a short description of the private data exchange, and an indication of whether approval from the exchange admin is necessary for publishing (e.g., Admin_Approval_for_Publishing). This may be a true/false statement. It may be set to true if the exchange admin needs to approve listings submitted to the private data exchange before they are published. It may be set to false if the exchange admin does not need to provide such approval. In this case, providers can publish data directly. If the exchange admin sets the Admin_Approval_for_Publishing to True, the exchange admin may be able to see a list of Listings, and select a listing to preview and approve/reject. The owner of the private data exchange may be the account that is paying for the private data exchange. This metadata information may be stored as part of an Exchange object. Also stored in association with the private data exchange may be the users and accounts who provide data to the exchange, the consumers of the exchange, and the exchange admin(s). - In particular embodiments, the exchange admin may add members (e.g., data providers and data consumers) to the private data exchange by inviting the members in any suitable manner. For example, by inviting the users' accounts on the
cloud computing service 112, or by sending an email to the users' email account addresses. When the exchange admin adds a member to the private data exchange, the exchange admin may also specify one or more member-types: exchange admin, provider, or consumer. An exchange admin may be able to add and remove members from the private data exchange and to edit metadata associated with the private data exchange. For each user, the exchange admin may designate whether the user is an exchange admin, a data provider, a data consumer, or a combination of these roles. The following table summarizes the rights associated with each type of account. -
TABLE 1: Rights Associated with Each Type of Private Data Exchange Account
is_Exchange_Admin | is_Data_Provider | is_Data_Consumer | Description
False | False | True | Can Discover & Consume listings (subject to Listing visibility and access rules), but cannot publish listings.
False | True | True | Can Discover & Consume listings (subject to Listing visibility and access rules), plus can publish listings.
False | True | False | Can publish listings, but when they go to the consumer view they only see their own listings. They will not be able to ‘get’ their own listing as they will get a self-sharing prohibited error.
True | False | False | Can add members, remove members, change member roles, access list of member accounts, and edit metadata.
True | True | True | Can do everything a data provider and a data consumer can do, as well as add members, remove members, change member roles, access list of member accounts, and edit metadata.
- In some embodiments, if the exchange admin removes a member or changes a member's type from provider to a consumer only, then existing listings published by that member may become unpublished from the Exchange. Additionally, existing shares added to the Exchange by the member are no longer considered part of the private data exchange. The listings published by that member may be archived, and are no longer visible in the UI to anyone, including the member. The
cloud computing service 112 can un-archive this if the same member (same account on the cloud computing service 112) who has been removed is made a Provider again. - In some embodiments, the exchange admin may be able to specify a list of categories as well as edit an existing list. Categories may have icons associated with them, and the exchange admin may be able to specify the icon along with the category name.
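The role combinations summarized in Table 1 can be illustrated with a minimal Python sketch. The class name, function names, and action strings below are hypothetical and chosen only for illustration; the sketch models the flag-to-rights mapping and the self-sharing restriction, and does not model the provider-only consumer view.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    """Hypothetical member record; the fields mirror the three Table 1 flags."""
    is_exchange_admin: bool = False
    is_data_provider: bool = False
    is_data_consumer: bool = False


def rights(member: Member) -> set[str]:
    """Return the actions that Table 1 associates with a flag combination."""
    granted: set[str] = set()
    if member.is_data_consumer:
        granted |= {"discover_listings", "consume_listings"}
    if member.is_data_provider:
        granted.add("publish_listings")
    if member.is_exchange_admin:
        granted |= {
            "add_members",
            "remove_members",
            "change_member_roles",
            "list_member_accounts",
            "edit_metadata",
        }
    return granted


def can_get(consumer_account: str, listing_owner_account: str) -> bool:
    """A provider may not 'get' its own listing (self-sharing is prohibited)."""
    return consumer_account != listing_owner_account
```

Because each flag contributes its rights independently, a member with all three flags set receives the union of the consumer, provider, and admin rights, which matches the last row of Table 1.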
- When a member becomes a data provider, a provider profile may be generated that includes a logo, a description of the provider, and a URL to the provider's website. When submitting listings, a provider may do the following: select which private data exchange to publish the data in (e.g., many private exchanges may exist and the provider may need to select a subset of these exchanges, which may be one or more), and set metadata about the new listing. The metadata may include a listing title, a listing type (e.g., Standard or Personalized), a listing description, one or more usage examples (e.g., title and sample queries), a listing category, which may be input as free form text, an update frequency for the listing, a support email/URL, and a documentation link. The provider may also set access for the listing. The provider may allow the exchange admin to control the visibility of the listing, or the provider may retain that control for itself. The provider may also associate a share with a listing. For a standard share, a listing may be associated with zero or more shares. The provider may associate shares to a listing through the UI or SQL. For personalized shares, when the provider provisions a share in response to a request, the provider may associate that share with the listing. When the provider wishes to publish the listing, the listing may first need approval from the exchange admin, depending on the publishing rules of the private data exchange.
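The publishing rule described above can be sketched as follows. This is an illustrative Python model only: the `Exchange` class, its method names, and the returned status strings are assumptions, and only the `admin_approval_for_publishing` flag corresponds to the Admin_Approval_for_Publishing setting named in the text.

```python
from dataclasses import dataclass, field


@dataclass
class Exchange:
    """Hypothetical in-memory model of a private data exchange."""
    name: str
    admin_approval_for_publishing: bool  # mirrors Admin_Approval_for_Publishing
    published: list[str] = field(default_factory=list)
    pending: list[str] = field(default_factory=list)

    def submit(self, listing_title: str) -> str:
        """Provider submits a listing; it is queued if admin approval is required."""
        if self.admin_approval_for_publishing:
            self.pending.append(listing_title)
            return "pending_approval"
        self.published.append(listing_title)
        return "published"

    def review(self, listing_title: str, approve: bool) -> str:
        """Exchange admin approves or rejects a pending listing."""
        self.pending.remove(listing_title)  # raises ValueError if not pending
        if approve:
            self.published.append(listing_title)
            return "published"
        return "rejected"
```

With the flag set to false, `submit` publishes directly, which corresponds to the case where providers can publish data without the exchange admin in the loop.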
-
FIG. 11 is a diagram illustrating an example tunneling of a data listing between a public data exchange and a private data exchange. Alternatively, data may be tunneled between two public data exchanges or between two private data exchanges, or from one public exchange to multiple private exchanges, or any other suitable combination. In some embodiments, an entity may wish to offer a publicly listed data listing on its private data exchange. For example, Entity B may wish to include Listing F of public data exchange 1100 on its own private data exchange 1000. The data underlying Listing F may be tunneled from public data exchange 1100 to private data exchange 1000. In particular embodiments, data may be tunneled between two private data exchanges. At times, a first data provider may wish to allow a second data provider to list data belonging to the first data provider on a private data exchange of the second data provider. Tunneling of data listings may allow the two data providers to offer the same listing. As an example, Entity A and Entity B may have a business agreement to share Listing F on each of their private data exchanges. Listing F may be the property of Entity A, but Entity B may have a license to offer it on its private data exchange as well. In this case, both of the listings titled “Listing F” will point to the same data set stored in cloud computing platform 110. The tunnel 1015 is a representation to illustrate that Listing F may be shared securely and easily between two or more data exchanges.
- In particular embodiments, tunnel linking may be accomplished between a private data exchange and a public data exchange, or vice versa. For instance,
data exchange 1100 may be a public data exchange. Entity B may use a listing listed on the public data exchange 1100 on its own private data exchange 1000 via tunnel 1015. In some embodiments, a data listing may be tunneled from one data exchange to another data exchange and then the underlying data may be joined with another data set, and then a new listing may be generated from the combined data set. As an example and not by way of limitation, a first data set may be listed on a private data exchange that includes NBA player shooting statistics over the last five years. A second data set may be listed on a different data exchange that includes weather data over the same time span. These two data sets may be joined and listed as a new listing in either a private or a public data exchange. Data consumers may then access this data set, subject to the viewing and access controls discussed herein, to gain insight into how the weather might affect player shooting percentages. Additionally, if data is listed on a public data exchange (e.g., a data exchange hosted by the cloud computing service 112), this data may be tunneled to a private data exchange.
- In some embodiments, tunneling of datasets may be used to create an “industry-wide” data exchange that is either public or private. Many different entities may tunnel datasets to a “mega ecosystem data exchange.” If a private ecosystem data exchange really takes off, it could become so big and influential that it could become the standard place for a whole industry to interchange, collaborate around, and monetize data. There is probably room for one or two “mega ecosystem data exchanges” in each industry. Once any one gains significant traction, it could become the “go to” place for that industry. If more than one viable exchange emerges in an industry, the respective hosts of these could decide to partner and “cross-tunnel” some (but maybe not all) assets between their exchanges to get critical mass.
- While it is possible that industry coalitions could host such exchanges via tunneling, it may be more likely that one or two large players in each industry will bootstrap ecosystem private data exchanges fast and broadly enough to become the de facto data exchange for their industry. This provides a significant incentive for companies that want to become major players in the data side of their industries to start as soon as they can to build their internal data exchanges and then open them up quickly to their suppliers, customers, and partners.
-
FIG. 12 is a diagram illustrating an example data query and delivery service 1200 according to some embodiments of the invention. Data query and delivery service 1200 illustrates four ways a data provider may be able to share data. The first way is through a data exchange 900. The data exchange 900 may be a public data exchange or a private data exchange. The data provider 1210 may list 1211 the data on the data exchange according to the methods and systems described herein with reference to FIGS. 2, 4, and 8. The data consumer 1220 may access the data in the listings by either accepting an invitation from the data provider as discussed herein or by requesting 1212 access to the listing as discussed herein with reference to FIG. 8. The second way data may be shared is by directly sharing the data at 1213. This may be a point-to-point share as discussed with reference to FIG. 4, or may be any other suitable type of share, accomplished using the secure data share methods discussed herein. Note that the data provider 1210 and the data consumer 1220 are both users of the cloud computing service 112. If the data provider 1210 wishes to share data with a non-user 1230, this is possible as a third way to share the data, with a reader account 1215a or with a reader/writer account 1215b. Here the non-user may need to have a reader account but may not need to be an actual user of the cloud computing service 112. A reader account may allow the non-user 1230 to view the data but do nothing else to the data. Finally, a fourth way to share data is via a file drop to cloud storage 1214. Here the data provider 1210 may make a copy of a data set 1216 and may allow another non-user 1230 to have the data set 1216. This way of sharing data may not allow the data provider 1210 to retain control of the data set. Thus, using the fourth way, the non-user 1230 may be able to view, manipulate, and re-share the data.
- As described above, the private data exchange is used within one data region of a cloud computing service or within a single cloud computing service. A customer may want to have one or more listings across multiple regions of a cloud computing platform and/or across multiple cloud computing platforms. In order for a data provider to share data sets across multiple cloud computing platform regions and/or multiple cloud computing platforms, the data provider would need to set up accounts in different regions, log in to each account to set up replication, and refresh using tasks. The data provider would need to share with the consumer in the target region as well as replicate the entire database. This whole process adds significant overhead for a data provider. In addition, when a virtual private cloud (VPC) customer wants to consume shared data through the data exchange, this customer currently does not have a way to do this within their VPC account. A workaround, such as asking the customer to open a multi-tenant account and persisting the shared data there, puts a burden on the consumer. In one embodiment, a multi-tenant account is an account in a system that supports isolation of computing resources and data between different customers/clients and between different users within the same customer/client. Thus, consuming shared data from within a VPC is difficult. It would be useful to a data provider (and to a customer who consumes this data) to achieve cross-region and cross-cloud data sharing.
- In one embodiment, for global data sharing, there are two types of data: standard and personalized. Standard data represents data that is the same for every consumer. For example, there is no dynamic filtering of rows based on the consumer's account. In contrast, personalized data represents data that is unique to each consumer (or a group of consumers). In one embodiment, personalized data can have a secure view to dynamically filter rows of the data based on the consumer's account, so each consumer sees their own slice of the data. In a further embodiment, for a data provider, this can mean they can create one view for some or all of their consumers instead of a separate view for each consumer.
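The difference between standard and personalized data can be sketched in Python as a row filter keyed on the consumer's account. This is only an illustration: the row layout, the `entitlements` mapping from account to permitted regions, and the function names are assumptions made for the example and do not represent the platform's secure view mechanism.

```python
# Hypothetical shared table: every row is tagged with the region it describes.
ROWS = [
    {"region": "CA", "visits": 10},
    {"region": "NY", "visits": 7},
    {"region": "CA", "visits": 3},
]

# Hypothetical mapping of consumer account to the row keys it may see.
ENTITLEMENTS = {
    "consumer_1": {"CA"},
    "consumer_2": {"NY"},
}


def standard_view(rows):
    """Standard data: every consumer sees the same rows, with no filtering."""
    return list(rows)


def personalized_view(rows, entitlements, consumer_account):
    """Personalized data: one view definition, filtered per consumer account."""
    allowed = entitlements.get(consumer_account, set())
    return [row for row in rows if row["region"] in allowed]
```

A single `personalized_view` definition serves every consumer, which reflects the point that a provider can maintain one view rather than one view per consumer. An account with no entry in the mapping sees no rows.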
- In one embodiment, a listing for shared data can include metadata associated with zero or more shares, or a collection of database (DB) objects. In addition, the listing can be for free data or paid data. Furthermore, a data provider can grant access to a consumer for shared data depending on whether the shared data is standard, where the customers each get access to the same shared data, or whether the shared data is personalized, where data is shared on an individual or group basis. For personalized data, in one embodiment, a data provider adds each customer of the data to an entitlements table. Alternatively, non-personalized data may still require some sort of approval process for the customer to access the data. For example and in one embodiment, in a private exchange, getting access to a share might need an approval workflow even though the data itself is not personalized. Thus, in one embodiment, for a standard data share, a data provider does not need to be in the request fulfillment loop. Any consumer account that discovers the listing can access the data and use this data to create a database from it. Alternatively, for data shared by request, a data provider will need to explicitly add the consumer to the share.
- In another embodiment, standard shares may exist in the context of the data exchange (whether public or private), as the data exchange represents the membership base that the share is made available to and how listings are discovered. In one embodiment, data shared by a data provider is also called a data share. In one embodiment, a consumer does not need to be aware of the underlying share type (e.g., Standard or By Request). In addition, a consumer can always interact with the standard listing. In this embodiment, a data provider will have to be aware of shares to create them. However, the data exchange will handle the creation of these shares.
- In one embodiment, data shares outside of an exchange are, by definition, By Request shares. In addition, the data exchange can include unlisted or undiscoverable Standard Listings which do not have an entry in a data exchange. The effective membership base for this is any Snowflake customer. The listing, however, is not discoverable by consumers as the data provider would send around a listing URL. In this embodiment, anyone who views that URL and logs into the data exchange can view the data and, for example, create a DB from the share.
- In one embodiment, both Standard Listings and By Request listings could be free or paid. If the data share is free, a consumer can create a DB from the shared data once the consumer accepts the provider terms. If the data share is paid, the consumer can accept the terms and arrange for payment, and can then create a DB from the share. In one embodiment, for a standard listing, the data may or may not yet be available in the consumer's region. If the data is not yet available in a region, the consumer can still click “get” and create a DB from the share. At this point, the data exchange can let the consumer know that the data will be available in a certain amount of time (e.g., with a visual countdown if the wait is a matter of seconds, or a progress bar indication in the UI). In one embodiment, a personalized data share can be a type of By Request share or listing.
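The free and paid "get" flow described above can be summarized in a short Python sketch. The `Listing` class, the `get_listing` function, and the status strings are hypothetical names introduced for illustration; the sketch only captures the order of checks (terms, payment, regional availability).

```python
from dataclasses import dataclass, field


@dataclass
class Listing:
    """Hypothetical listing record with pricing and per-region availability."""
    title: str
    paid: bool
    available_regions: set[str] = field(default_factory=set)


def get_listing(listing: Listing, consumer_region: str,
                accepted_terms: bool, payment_arranged: bool = False) -> str:
    """Return the outcome of a consumer requesting a listing."""
    if not accepted_terms:
        return "terms_required"
    if listing.paid and not payment_arranged:
        return "payment_required"
    if consumer_region in listing.available_regions:
        return "database_created"
    # The consumer can still create the DB; the data arrives once replicated.
    return "database_created_data_pending"
```

The final branch corresponds to the case where the exchange would show a countdown or progress indication while the data becomes available in the consumer's region.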
-
FIG. 13A is a block diagram of an example system 1300 of multiple cloud computing platforms sharing data with a data exchange. In FIG. 13A, the system 1300 includes two different cloud computing platforms 1302A-B, where each of the cloud computing platforms 1302A-B can be one of the cloud computing platforms described above in FIG. 1. Each of the cloud computing platforms 1302A-B is coupled to a data exchange 1306. In one embodiment, the data exchange 1306 is similar to the data exchanges described above, with the exception that data exchange 1306 can support data sharing across multiple cloud computing platforms, such as cloud computing platforms 1302A-B.
- In one embodiment,
cloud computing platform 1302A includes data provider share 1304. In this embodiment, the data provider shares some or all of its data 1304 via the data exchange 1306 that is visible to data consumers in both cloud computing platforms 1302A-B. For example and in one embodiment, data consumer 1308 that is in a different cloud computing platform (e.g., cloud computing platform 1302B) can view the listing for the data provider share 1304 via the data exchange. If the data consumer 1308 requests the listing, in response, the data exchange 1306 can create a provider account in cloud computing platform 1302B for the data provider share 1304, and replicate the data share to the cloud computing platform 1302B. With the data share replicated 1310 in cloud computing platform 1302B, the data consumer 1308 can access the data. In one embodiment, the data exchange 1306 can replicate the entire data provider share 1304, or can replicate some of the data provider share 1304. In one embodiment, the data provider indicates which parts of the data provider share 1304 are to be replicated. In another embodiment, the data exchange 1306 infers what parts of the data provider share 1304 to replicate. In this embodiment, the data exchange can infer the objects of the data provider share that need to be replicated, the frequency at which these objects need to be replicated, and/or the region of the consumer account and the corresponding provider secondary account. -
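The request-driven replication described for FIG. 13A can be sketched in Python as follows. The `Platform` class, the `request_listing` function, and the example share contents are hypothetical; the sketch assumes an in-memory model in which a share is a named list of object names, and it shows only the two steps named in the text (creating a provider account on the consumer's platform, then replicating all or part of the share).

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Platform:
    """Hypothetical model of one cloud computing platform (or region)."""
    name: str
    provider_accounts: set[str] = field(default_factory=set)
    shares: dict[str, list[str]] = field(default_factory=dict)


def request_listing(source: Platform, target: Platform, provider: str,
                    share_name: str,
                    objects: Optional[set[str]] = None) -> list[str]:
    """Handle a consumer request that originates on a different platform.

    Creates a secondary provider account on the target platform, then
    replicates either the whole share or only the selected objects.
    """
    target.provider_accounts.add(provider)
    source_objects = source.shares[share_name]
    selected = [obj for obj in source_objects
                if objects is None or obj in objects]
    target.shares[share_name] = selected
    return selected
```

Passing `objects=None` models replicating the entire share, while passing a set of object names models the case where the provider indicates (or the exchange infers) which parts to replicate.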
FIG. 13B is a block diagram of an example system 1320 of a cloud computing platform sharing data with a data exchange across multiple regions of the cloud computing platform. In FIG. 13B, the system 1320 includes a cloud computing platform with two different regions 1322A-B, where each of the cloud computing platform regions 1322A-B can be a different geographic region for that cloud computing platform (e.g., US-West, US-East, Europe, Asia, and/or another type of geographic region). In one embodiment, each of the cloud computing platform regions 1322A-B is coupled to a data exchange 1306. In one embodiment, the data exchange 1306 is similar to the data exchanges described above, with the exception that data exchange 1306 can support data sharing across multiple cloud computing platform regions, such as cloud computing platform regions 1322A-B.
- In one embodiment, the cloud
computing platform region 1322A includes data provider share 1304. In this embodiment, the data provider shares some or all of its data 1304 via the data exchange 1306 that is visible to data consumers in both cloud computing platform regions 1322A-B. For example and in one embodiment, data consumer 1308 that is in a different cloud computing platform region (e.g., cloud computing platform region 1322B) can view the listing for the data provider share 1304 via the data exchange. If the data consumer 1308 requests the listing, in response, the data exchange 1306 can create a provider account in cloud computing platform region 1322B for the data provider share 1304, and replicate the data share to the cloud computing platform region 1322B. With the data share replicated 1310 in cloud computing platform region 1322B, the data consumer 1308 can access the data. In one embodiment, the data exchange 1306 can replicate the entire data provider share 1304, or can replicate some of the data provider share 1304. In one embodiment, the data provider indicates which parts of the data provider share 1304 are to be replicated. In another embodiment, the data exchange 1306 infers what parts of the data provider share 1304 to replicate. In this embodiment, the data exchange can infer the objects of the data provider share that need to be replicated, the frequency at which these objects need to be replicated, and/or the region of the consumer account and the corresponding provider secondary account.
- In one embodiment, there are different types of use cases for data sharing across deployments, such as across regions, across clouds, and/or into or out of a VPC. For example and in one embodiment, a data provider can provide non-personalized By Request data shares either in a data exchange or outside of the data exchange, where consumers can be in different deployments. In one embodiment, this is the basic building block that some of the other use case types can be built on.
Another use case type is a data provider of a standard listing that is used by many different consumers (e.g., weather data on a public or private exchange). A third use case type can be a data provider of personalized shares/listings. This use case type could be on the public data exchange, in a private data exchange, or outside the Exchange (just data sharing). Lastly, a use case type could be a consumer on a VPC wanting access to a data share in a cloud computing platform. In one embodiment, an additional nuance of a private data exchange is that providers are added by the data exchange administrator. The data exchange administrator would ideally like to avoid training the providers to manage the complexities of setting up replication to share data from that provider to other private exchange members who may be on different regions/clouds.
- In one embodiment, a reservations data provider wants to share select tables and/or views, whose data is stored in the US-West region of a cloud computing platform, with a marketing data firm (whose main account is on the same cloud computing platform, but in a different region, say US-East). In this embodiment, the reservations data provider is a customer of the marketing data provider. A primary goal is for the marketing data provider to process and incorporate this information into a product of the marketing data provider. In one embodiment, costs should impact the data exchange account of the marketing data provider, but not the account of the reservations data provider. In this embodiment, a secondary goal is for the marketing data provider to share certain views from the marketing data provider to the reservations data provider. In this case, the reservations data provider is a customer of the marketing data provider, and so the marketing data provider does not want to make the reservations data provider do the work. In addition, the marketing data provider has a view that references objects in another database, so both databases must be replicated. In one embodiment, the reservations data provider determines which tables and/or views to list in the data exchange. In response to the marketing data provider requesting the data share via the data exchange, the data exchange creates an account in the US-East region of the cloud computing platform and replicates the relevant data shares of the reservations data provider (e.g., the main data share indicated by the reservations data provider as well as the dependent data). Sharing data across cloud computing platforms and/or regions is described further in
FIG. 14 below. - In a further embodiment, in a private exchange, a non-profit research institution wants listings that members have to request access to. In this embodiment, these requests trigger a workflow with cloud computing platform, and each consumer is added once the approval is done. This is a common scenario in private exchange, where the data is not personalized but the consumer needs to go through an approval workflow. Sharing data across cloud computing platforms and/or regions using an approval workflow is described further in
FIG. 14 below.
- In another embodiment, a data provider wishes to replicate a customized set of data shares for a consumer across different cloud computing platforms and/or regions. In this embodiment, the data provider wishes only to replicate certain tables of their database. For example, a transportation analytics company stores its data in a cloud computing service, where each table contains data for a specific dataset. This company shares specific datasets with customers based on which datasets each customer is subscribed to. The transportation analytics company wants to model their data in the cloud computing service independent of how shares will be created. Then, when the transportation analytics company creates shares for specific customers, they want only those tables to be replicated. Sharing data across cloud computing platforms and/or regions using an approval workflow is described further in
FIG. 14 below.
- In one embodiment, a standard listing on a public exchange can represent increased opportunities for the data exchange and/or the cloud computing service. Paid standard shares present a new revenue stream with monetization. In one embodiment, the consumer should be able to get the free or paid listing as quickly as possible. Since the provider is not explicitly adding the consumer to the share, the provider is not in the user flow. In addition, a data provider may prefer to incur replication cost based on demand. For example and in one embodiment, a weather data provider has made a subset of their data available as a free standard listing and wishes to have a paid standard listing, where a customer can immediately pay and get the pre-defined package. In addition, for more custom packages, they want the consumer to contact them, and they will set up a direct share with the consumer. The most likely scenario in this case is that the free listing is made available in (almost) all regions. For the paid listing, data providers would want it to be available on any deployment as soon as there is at least one paying consumer in that deployment or if the cloud computing service covers replication cost. Note that while free vs. paid shares could be set up as a filter on rows (a secure view within one share) or set up as separate shares altogether, one scenario is that these are two separate shares. For example, a retail analytics provider would make a different share with a different set of objects available for free vs. paid. Within the paid share, this data provider would have the ability for the consumer to select which rows they want, to create dynamically created packages. For instance, a consumer will select through the data exchange user interface that they want only location data for “Type=McDonald's” and “State=CA”, and therefore pay only for the rows they get.
- In one embodiment, various scenarios can be handled using the data replication described above. For example and in one embodiment, a brand data provider is building its data-driven marketing solution on a cloud computing service. They will be a provider of both standard and personalized shares on the public exchange. They have hundreds of TBs of data and can have hundreds of clients to share with, most of whom will be new to the cloud computing service. Based on the data set a client has purchased on their platform, they automatically insert the client's ID into an entitlements table in the cloud computing service and add the client's consumer account to the share. They want this automatic sharing pipeline to work across regions and clouds with little or no manual effort. In addition, the brand data provider may also want to share data into a VPC.
- A customer relationship company builds mobile marketing campaigns and shares the results with customers via the data exchange. In one example, the campaign data arrives into their 50 TB event table at least every 15 minutes and consumer-specific data gets shared with consumers. In addition, this company wants a 15-minutes latency (or less) to their consumers.
- An equipment manufacturer wants to share machine health data with 200 dealers so they can take corrective actions. Data is not large in volume but is coming throughout the day and they want latency and/or freshness guarantees for their dealers. “I expected remote data sharing to be a continuous data stream instead of a batch model.” The company will set up secure views based on an entitlements table that maps dealers to equipment, so that each dealer only sees rows that are relevant for them.
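To illustrate the entitlements-based secure view described above, here is a small Python sketch under stated assumptions: the entitlements table is modeled as a dictionary from dealer account to equipment identifiers, and the secure view is modeled as a filter function. All names and sample values are hypothetical; an actual deployment would more likely express this as a database view.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HealthRow:
    equipment_id: str
    metric: str
    value: float


# Hypothetical entitlements table: dealer account -> equipment the dealer may see.
ENTITLEMENTS = {
    "dealer_a": {"EQ-1", "EQ-2"},
    "dealer_b": {"EQ-3"},
}


def secure_view(rows, consumer_account, entitlements=ENTITLEMENTS):
    """Return only the rows that the given consumer account is entitled to see."""
    allowed = entitlements.get(consumer_account, set())
    return [row for row in rows if row.equipment_id in allowed]


rows = [
    HealthRow("EQ-1", "oil_temp_c", 92.5),
    HealthRow("EQ-3", "oil_temp_c", 88.0),
    HealthRow("EQ-2", "vibration_mm_s", 4.1),
]
```

Calling `secure_view(rows, "dealer_a")` yields only the EQ-1 and EQ-2 rows, and an account with no entry in the entitlements table sees nothing.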
- In another example and embodiment, VPC customers want to consume shares from data providers who are in a multi-tenant deployment. In one embodiment, a VPC is an on-demand configurable pool of shared computing resources allocated within a public cloud environment, providing a level of isolation between the different organizations using the resources. For example, a financial company wants to consume marketing data from marketing companies. However, when directly consuming data from these data sources, considerable effort is necessary to ingest the data, manage and maintain the ingestion process, and handle change management of the underlying schemas. Solving these problems, usually called the “first mile problem of data ingestion,” can provide tremendous benefits. The financial company can have the data processed by a third party before ingesting the data from the exchange. In addition, the financial company has multiple business units (BUs), which have different needs and visibility requirements for shared data. For that reason, the financial company needs the ability to apply fine-grained security controls when sharing the data provided by the third party company to its internal customers (BUs). Furthermore, any solution to be considered must be easy to use (no additional coding and/or engineering time) and robust (no fragile ongoing or maintenance processes). Handling a workflow of this type is further described in
FIG. 16 below. -
FIG. 14 is a process flow diagram of a method 1400 for sharing data across multiple cloud computing services and/or across multiple regions with a cloud computing service. In general, the method 1400 may be performed by processing logic that may include hardware (e.g., processing device, circuitry, dedicated logic, programmable logic, microcode, hardware of a device, integrated circuit, etc.), software (e.g., instructions run or executed on a processing device), or a combination thereof. For example, the processing logic may be implemented as exchange manager 124. Method 1400 may begin at step 1402, where the processing logic receives an indication for data sharing across different cloud computing platforms or a cloud computing platform that has multiple regions. In one embodiment, a data provider can create a listing that is a standard one available to all customers, a data share that is available by request, and/or other types of data share as described above. Creating a listing is further described in FIG. 15 below. At block 1404, processing logic receives a request for a listing of data sharing. In one embodiment, the request is associated with a customer account that is part of a cloud computing platform or cloud computing platform region. For example and in one embodiment, the request could be associated with a customer account that is in a cloud computing platform and/or cloud computing platform region different from the cloud computing platform and/or cloud computing platform region associated with the listing. Processing logic determines if a provider customer account is allowed in the cloud computing platforms and/or cloud computing platform regions associated with the listing request. For example and in one embodiment, the provider may have information that is not allowed out of a certain region (e.g., security data). If not, execution proceeds to block 1406, where an error is returned. 
If the provider customer account is allowed, processing logic creates the customer account for the listing request at block 1410. In one embodiment, if the request is for a consumer that is in a different cloud computing platform and/or cloud computing platform region, processing logic creates a provider account in that cloud computing platform and/or cloud computing platform region, so the data provider can share the data with the requesting customer. - At
block 1412, processing logic shares the data. In one embodiment, processing logic shares the data by replicating the data to the cloud computing platforms and/or cloud computing platform regions associated with the consumer who made the original listing request. In one embodiment, processing logic can replicate the entire data share, or parts of the data share. In this embodiment, processing logic can infer which parts of the data are to be shared based on characteristics of the requesting consumer (geographical, temporal, etc.). In a further embodiment, processing logic can customize the data based on the consumer (e.g., paid data represents one view as opposed to unpaid data, data shared is based on the region of the consumer, a consumer's affiliations, and/or other types of characteristics). Processing logic sets up tasks for frequency replication at block 1414. In one embodiment, by setting up these tasks, the data that is shared with the different cloud computing platforms and/or cloud computing platform regions can be periodically refreshed, so that the data provider or consumer does not need to manually refresh the data. - In
FIG. 14, a consumer requests shared data across different cloud computing platforms and/or cloud computing platform regions. In one embodiment, the data provider can create a listing that allows for data to be replicated across different cloud computing platforms and/or cloud computing platform regions. FIG. 15 is a process flow diagram of a method 1500 for creating a listing within a data exchange, where the listing is available in different cloud computing services and/or in multiple regions with a cloud computing service. For example, the processing logic may be implemented as exchange manager 124. Method 1500 may begin at step 1502, where the processing logic receives a request to create a data listing from a data provider. In one embodiment, the data listing can include a listing type (e.g., Standard or By Request), whether the listing is free or paid, what data is to be shared (e.g., which tables, rows, etc. of a database to share), any sharing restrictions, data provider information, and/or other information used for the listing. At block 1504, processing logic creates the listing in the data exchange. Processing logic can pre-emptively replicate the shared data, based on characteristics of the data to be shared and the cloud computing entities (e.g., cloud computing platforms and/or cloud computing platform regions) for the listing. For example and in one embodiment, processing logic may share standard data to different cloud computing platforms where existing consumers are present, or may preemptively share data based on geographic regions (e.g., sharing weather data based on cloud computing platform regions), and/or other characteristics. Processing logic sets up tasks for frequency replication at block 1508. 
In one embodiment, by setting up these tasks, the data that is shared with the different cloud computing platforms and/or cloud computing platform regions can be periodically refreshed, so that the data provider or consumer does not need to manually refresh the data. - In one embodiment, another type of sharing model is one where a data provider provides the shared data to one or more third party entities before sharing the data with the consumer. This can be used to personalize the shared data for the consumer.
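The consumer-request flow of FIG. 14 described above (permission check, provider account creation in the consumer's region, replication, and refresh scheduling) can be sketched roughly as follows. This is an illustrative model only: the `Listing` class, the region labels, the account naming scheme, and the refresh interval are all assumptions and are not taken from the disclosure.

```python
from dataclasses import dataclass, field


class RegionNotAllowed(Exception):
    """Raised when the provider does not permit data in the requested region."""


@dataclass
class Listing:
    name: str
    home_region: str
    allowed_regions: set                                    # regions the provider permits
    provider_accounts: dict = field(default_factory=dict)   # region -> provider account id
    replicas: set = field(default_factory=set)              # regions holding a copy
    refresh_tasks: dict = field(default_factory=dict)       # region -> interval in minutes


def handle_listing_request(listing, consumer_region, refresh_minutes=60):
    """Handle one consumer request for a listing (hypothetical model of FIG. 14)."""
    # Error path (block 1406): the provider's data may not leave its allowed regions.
    if consumer_region not in listing.allowed_regions:
        raise RegionNotAllowed(consumer_region)
    # Block 1410: create a provider account in the consumer's region if none exists.
    account = listing.provider_accounts.setdefault(
        consumer_region, f"{listing.name}-provider-{consumer_region}"
    )
    # Block 1412: share the data by replicating it to the consumer's region.
    listing.replicas.add(consumer_region)
    # Block 1414: schedule a periodic refresh so nobody refreshes manually.
    listing.refresh_tasks[consumer_region] = refresh_minutes
    return account


listing = Listing("weather", "region-a", allowed_regions={"region-a", "region-b"})
account = handle_listing_request(listing, "region-b", refresh_minutes=15)
```

A request from a region outside `allowed_regions` raises `RegionNotAllowed` and leaves the listing state untouched, mirroring the error branch of the flow.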
FIG. 16 is a process flow diagram of a method 1600 for creating a listing for personalized shares within a data exchange, where the listing is available in different cloud computing services and/or in multiple regions with a cloud computing service. For example, the processing logic may be implemented as exchange manager 124. Method 1600 may begin at step 1602, where the processing logic receives a request to create a listing from a data provider. In one embodiment, a data provider can create a listing that is a standard one available to all customers, a data share that is available by request, and/or other types of data share as described above. In a further embodiment, the listing can include a set of cloud computing platforms and/or cloud computing platform regions where the listing can be visible. The listing could be visible in all possible cloud computing regions and/or cloud computing platforms, or in a subset of all possible cloud computing regions and/or cloud computing platforms. In addition, the listing can include other information (e.g., allowed consumers for the data). Processing logic creates the listing in the data exchange at block 1604. In one embodiment, processing logic creates the listing by creating an entitlements map, which maps a consumer identifier to the data provider. In addition, the listing can include a secure view of the data, which is added to the shared data. - At
block 1606, processing logic determines which third party account will receive the data. In one embodiment, the shared data can be replicated to another party for processing before the data is shared with a consumer. Processing logic determines which objects are to be replicated to potential third parties at block 1608. At block 1610, processing logic replicates the data to the third party accounts. Processing logic determines the secure view for the shared data for potential consumers at block 1612. In one embodiment, a secure view is used to create a secure way for the potential consumers to access the shared data. In this embodiment, there can be different secure views for different potential consumers or groups of consumers, or the same secure view for all potential consumers. - At block 1614, processing logic can pre-emptively replicate the shared data and the entitlements table, based on characteristics of the data to be shared and the cloud computing entities (e.g., cloud computing platforms and/or cloud computing platform regions) for the listing. For example and in one embodiment, processing logic may share standard data to different cloud computing platforms where existing consumers are present, or may preemptively share data based on geographic regions (e.g., sharing weather data based on cloud computing platform regions), and/or other characteristics.
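One way to picture the pre-emptive replication decision described above is a function that picks target regions before any request arrives. The sketch below assumes a single heuristic, namely replicating to every region where the listing is visible and at least one existing consumer is present; the function name and region labels are hypothetical, and the disclosure leaves the actual selection criteria open.

```python
from collections import Counter


def preemptive_targets(listing_regions, consumer_regions, home_region):
    """Pick regions to replicate to ahead of demand.

    listing_regions: regions where the listing is visible
    consumer_regions: one entry per existing consumer account, giving its region
    home_region: region that already holds the provider's data
    """
    consumers_per_region = Counter(consumer_regions)
    return sorted(
        region
        for region in listing_regions
        if region != home_region and consumers_per_region[region] > 0
    )


targets = preemptive_targets(
    listing_regions={"region-a", "region-b", "region-c"},
    consumer_regions=["region-b", "region-b", "region-d"],
    home_region="region-a",
)
```

In this example only `region-b` is chosen: `region-c` has no consumers yet, and `region-d` is outside the set of regions where the listing is visible.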
- Processing logic receives a listing request at
block 1616. In one embodiment, the request is associated with a consumer account that is part of a cloud computing platform and/or cloud computing region. For example, and in one embodiment, the request could be associated with a consumer account that is on a cloud computing platform and/or cloud computing platform region different from the cloud computing platform and/or cloud computing platform region associated with the listing. Processing logic determines if a provider customer account is allowed in the cloud computing platform, cloud computing platform region, and/or VPC associated with the listing request. For example and in one embodiment, the provider may have information that is not allowed out of a certain region (e.g., security of data, personally identifiable information, government restrictions, and/or other types of restrictions). If not, processing logic returns an error. If the provider customer account is allowed, processing logic creates the customer account for the listing request at block 1618. In one embodiment, if the request is for a consumer that is in a different cloud computing platform, cloud computing platform region, and/or VPC, processing logic creates a provider account in that cloud computing platform and/or cloud computing platform region, so the data provider can share the data with the requesting customer. - At
block 1620, processing logic shares the data using the secure view associated with the consumer. In one embodiment, processing logic shares the data by replicating the data using the secure view to the cloud computing platforms and/or cloud computing platform regions associated with the consumer who made the original listing request. In one embodiment, processing logic can replicate the entire data share, or parts of the data share. In this embodiment, processing logic can infer which parts of the data are to be shared based on characteristics of the requesting consumer (geographical, temporal, etc.). Processing logic sets up tasks for frequency replication at block 1622. In one embodiment, by setting up these tasks, the data that is shared with the cloud computing platforms and/or cloud computing platform regions can be periodically refreshed, so that the data provider or consumer does not need to manually refresh the data. -
FIG. 17 is a process flow diagram of a method 1700 for sharing data with a VPC. For example, the processing logic may be implemented as exchange manager 124. Method 1700 may begin at step 1702, where the processing logic receives a listing request from a consumer using a VPC. In one embodiment, processing logic uses an account of the VPC that has been created beforehand to share the data. At block 1704, processing logic replicates the data to the consumer's VPC using the account. In one embodiment, processing logic can replicate the entire data share, or parts of the data share. In this embodiment, processing logic can infer which parts of the data are to be shared based on characteristics of the requesting consumer (geographical, temporal, etc.). In this embodiment, the consumer sees the data share through a user interface associated with the VPC account. The consumer can create a database from the shared data. Processing logic sets up tasks for frequency replication at block 1706. In one embodiment, by setting up these tasks, the data that is shared with the VPC can be periodically refreshed, so that the data provider or consumer does not need to manually refresh the data. -
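The periodic refresh tasks mentioned above (blocks 1414, 1508, 1622, and 1706) can be modeled, as a minimal sketch, by a list of tasks that each record a target, an interval, and the time of the last run. The class, the function names, and the use of plain numeric timestamps are assumptions for illustration; the disclosure does not describe how the tasks are scheduled.

```python
from dataclasses import dataclass


@dataclass
class RefreshTask:
    target: str              # hypothetical label for a consumer VPC or region
    interval_s: int          # how often the replica should be refreshed
    last_run_s: float = 0.0  # timestamp of the most recent refresh


def due_tasks(tasks, now_s):
    """Return the tasks whose refresh interval has elapsed at time now_s."""
    return [task for task in tasks if now_s - task.last_run_s >= task.interval_s]


def run_due(tasks, now_s, replicate):
    """Refresh every due replica by calling replicate(target); return the targets refreshed."""
    refreshed = []
    for task in due_tasks(tasks, now_s):
        replicate(task.target)
        task.last_run_s = now_s
        refreshed.append(task.target)
    return refreshed


tasks = [
    RefreshTask("vpc-1", interval_s=900),                      # refresh every 15 minutes
    RefreshTask("vpc-2", interval_s=3600, last_run_s=1000.0),  # refreshed recently
]
calls = []
first_pass = run_due(tasks, now_s=1000.0, replicate=calls.append)
```

Passing the current time in explicitly keeps the sketch deterministic; a real scheduler would read a clock and invoke an actual replication routine in place of `calls.append`.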
FIG. 18 is a block diagram of an example computing device 1800 that may perform one or more of the operations described herein, in accordance with some embodiments. Computing device 1800 may be connected to other computing devices in a LAN, an intranet, an extranet, and/or the Internet. The computing device may operate in the capacity of a server machine in client-server network environment or in the capacity of a client in a peer-to-peer network environment. The computing device may be provided by a personal computer (PC), a set-top box (STB), a server, a network router, switch or bridge, or any machine capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken by that machine. Further, while only a single computing device is illustrated, the term “computing device” shall also be taken to include any collection of computing devices that individually or jointly execute a set (or multiple sets) of instructions to perform the methods discussed herein. - The
example computing device 1800 may include a processing device (e.g., a general purpose processor, a PLD, etc.) 1802, a main memory 1804 (e.g., synchronous dynamic random access memory (DRAM), read-only memory (ROM)), a static memory 1806 (e.g., flash memory and a data storage device 1818), which may communicate with each other via a bus 1830. -
Processing device 1802 may be provided by one or more general-purpose processing devices such as a microprocessor, central processing unit, or the like. In an illustrative example, processing device 1802 may comprise a complex instruction set computing (CISC) microprocessor, reduced instruction set computing (RISC) microprocessor, very long instruction word (VLIW) microprocessor, or a processor implementing other instruction sets or processors implementing a combination of instruction sets. Processing device 1802 may also comprise one or more special-purpose processing devices such as an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), a digital signal processor (DSP), network processor, or the like. The processing device 1802 may be configured to execute the operations described herein, in accordance with one or more aspects of the present disclosure, for performing the operations and steps discussed herein. In one embodiment, processing device 1802 represents cloud computing platform 110 of FIG. 1. In another embodiment, processing device 1802 represents a processing device of a client device (e.g., client devices 101-104). -
Computing device 1800 may further include a network interface device 1808 which may communicate with a network 1820. The computing device 1800 also may include a video display unit 1810 (e.g., a liquid crystal display (LCD) or a cathode ray tube (CRT)), an alphanumeric input device 1812 (e.g., a keyboard), a cursor control device 1814 (e.g., a mouse) and an acoustic signal generation device 1816 (e.g., a speaker). In one embodiment, video display unit 1810, alphanumeric input device 1812, and cursor control device 1814 may be combined into a single component or device (e.g., an LCD touch screen). -
Data storage device 1818 may include a computer-readable storage medium 1828 on which may be stored one or more sets of instructions, e.g., instructions for carrying out the operations described herein, in accordance with one or more aspects of the present disclosure. Private data exchange instructions 1826 may also reside, completely or at least partially, within main memory 1804 and/or within processing device 1802 during execution thereof by computing device 1800, main memory 1804 and processing device 1802 also constituting computer-readable media. The instructions may further be transmitted or received over a network 1820 via network interface device 1808.
- While computer-readable storage medium 1828 is shown in an illustrative example to be a single medium, the term “computer-readable storage medium” should be taken to include a single medium or multiple media (e.g., a centralized or distributed database and/or associated caches and servers) that store the one or more sets of instructions. The term “computer-readable storage medium” shall also be taken to include any medium that is capable of storing, encoding or carrying a set of instructions for execution by the machine and that cause the machine to perform the methods described herein. The term “computer-readable storage medium” shall accordingly be taken to include, but not be limited to, solid-state memories, optical media and magnetic media.
- Unless specifically stated otherwise, terms such as “receiving,” “creating,” “determining,” “sharing,” “providing,” “designating,” or the like, refer to actions and processes performed or implemented by computing devices that manipulate and transform data represented as physical (electronic) quantities within the computing device's registers and memories into other data similarly represented as physical quantities within the computing device memories or registers or other such information storage, transmission or display devices. Also, the terms “first,” “second,” “third,” “fourth,” etc., as used herein are meant as labels to distinguish among different elements and may not necessarily have an ordinal meaning according to their numerical designation.
- Examples described herein also relate to an apparatus for performing the operations described herein. This apparatus may be specially constructed for the required purposes, or it may comprise a general purpose computing device selectively programmed by a computer program stored in the computing device. Such a computer program may be stored in a computer-readable non-transitory storage medium.
- The methods and illustrative examples described herein are not inherently related to any particular computer or other apparatus. Various general purpose systems may be used in accordance with the teachings described herein, or it may prove convenient to construct more specialized apparatus to perform the required method steps. The required structure for a variety of these systems will appear as set forth in the description above.
- The above description is intended to be illustrative, and not restrictive. Although the present disclosure has been described with references to specific illustrative examples, it will be recognized that the present disclosure is not limited to the examples described. The scope of the disclosure should be determined with reference to the following claims, along with the full scope of equivalents to which the claims are entitled.
- As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises”, “comprising”, “includes”, and/or “including”, when used herein, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. Therefore, the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting.
- It should also be noted that in some alternative implementations, the functions/acts noted may occur out of the order noted in the figures. For example, two figures shown in succession may in fact be executed substantially concurrently or may sometimes be executed in the reverse order, depending upon the functionality/acts involved.
- Although the method operations were described in a specific order, it should be understood that other operations may be performed in between described operations, described operations may be adjusted so that they occur at slightly different times or the described operations may be distributed in a system which allows the occurrence of the processing operations at various intervals associated with the processing.
- Various units, circuits, or other components may be described or claimed as “configured to” or “configurable to” perform a task or tasks. In such contexts, the phrase “configured to” or “configurable to” is used to connote structure by indicating that the units/circuits/components include structure (e.g., circuitry) that performs the task or tasks during operation. As such, the unit/circuit/component can be said to be configured to perform the task, or configurable to perform the task, even when the specified unit/circuit/component is not currently operational (e.g., is not on). The units/circuits/components used with the “configured to” or “configurable to” language include hardware, for example, circuits, memory storing program instructions executable to implement the operation, etc. Reciting that a unit/circuit/component is “configured to” perform one or more tasks, or is “configurable to” perform one or more tasks, is expressly intended not to invoke 35 U.S.C. 112, sixth paragraph, for that unit/circuit/component. Additionally, “configured to” or “configurable to” can include generic structure (e.g., generic circuitry) that is manipulated by software and/or firmware (e.g., an FPGA or a general-purpose processor executing software) to operate in a manner that is capable of performing the task(s) at issue. “Configured to” may also include adapting a manufacturing process (e.g., a semiconductor fabrication facility) to fabricate devices (e.g., integrated circuits) that are adapted to implement or perform one or more tasks. “Configurable to” is expressly intended not to apply to blank media, an unprogrammed processor or unprogrammed generic computer, or an unprogrammed programmable logic device, programmable gate array, or other unprogrammed device, unless accompanied by programmed media that confers the ability to the unprogrammed device to be configured to perform the disclosed function(s).
- Any combination of one or more computer-usable or computer-readable media may be utilized. For example, a computer-readable medium may include one or more of a portable computer diskette, a hard disk, a random access memory (RAM) device, a read-only memory (ROM) device, an erasable programmable read-only memory (EPROM or Flash memory) device, a portable compact disc read-only memory (CDROM), an optical storage device, and a magnetic storage device. Computer program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages. Such code may be compiled from source code to computer-readable assembly language or machine code suitable for the device or computer on which the code will be executed.
- Embodiments may also be implemented in cloud computing environments. In this description and the following claims, “cloud computing” may be defined as a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned (including via virtualization) and released with minimal management effort or service provider interaction and then scaled accordingly. A cloud model can be composed of various characteristics (e.g., on-demand self-service, broad network access, resource pooling, rapid elasticity, and measured service), service models (e.g., Software as a Service (“SaaS”), Platform as a Service (“PaaS”), and Infrastructure as a Service (“IaaS”)), and deployment models (e.g., private cloud, community cloud, public cloud, and hybrid cloud). The flow diagrams and block diagrams in the attached figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flow diagrams or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It will also be noted that each block of the block diagrams or flow diagrams, and combinations of blocks in the block diagrams or flow diagrams, may be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions. 
These computer program instructions may also be stored in a computer-readable medium that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable medium produce an article of manufacture including instruction means which implement the function/act specified in the flow diagram and/or block diagram block or blocks.
- The foregoing description, for the purpose of explanation, has been described with reference to specific embodiments. However, the illustrative discussions above are not intended to be exhaustive or to limit the invention to the precise forms disclosed. Many modifications and variations are possible in view of the above teachings. The embodiments were chosen and described in order to best explain the principles of the embodiments and their practical applications, to thereby enable others skilled in the art to best utilize the embodiments and various modifications as may be suited to the particular use contemplated. Accordingly, the present embodiments are to be considered as illustrative and not restrictive, and the invention is not to be limited to the details given herein, but may be modified within the scope and equivalents of the appended claims.
Claims (21)
1. A method comprising:
creating, by a processing device, a corresponding second data provider account for each of a set of second cloud computing entities, wherein the user has a set of user accounts in the set of second cloud computing entities and a data set is associated with a first data provider account from a first cloud computing entity; and
copying the data set from the first cloud computing entity to each of the set of second cloud computing entities using the corresponding second data provider accounts of the set of second cloud computing entities, wherein the first cloud computing entity is a first region for a cloud computing platform and at least one of the set of second cloud computing entities is a second region for the cloud computing platform that is different from the first region.
2. The method of claim 1, further comprising:
receiving a request for a listing of the data set from a user associated with the set of second cloud computing entities.
3. The method of claim 1, wherein a data exchange comprises a plurality of data listings provided by a plurality of data providers, the plurality of data listings referencing a plurality of data sets stored in a data storage platform, the data set is one of the plurality of data sets, and the data provider is one of the plurality of data providers.
4. The method of claim 1, further comprising:
determining a frequency of updating the copied data set with the set of second cloud computing entities.
5. The method of claim 4, further comprising:
updating the copied data set using the determined frequency.
6. The method of claim 1, wherein the copying of the data set comprises:
determining a set of one or more objects of the copied data set is to be replicated with the set of second cloud computing entities; and
replicating the set of one or more objects with the set of second cloud computing entities.
7. The method of claim 1, wherein the copied data set is customized for the user.
8. The method of claim 1, wherein the data set is a subset of a database managed by the data provider.
9. The method of claim 1, wherein at least copied data set references at least one object in a second database that is different than a first data database used to store the data set.
10. A non-transitory machine-readable medium storing instructions which, when executed by one or more processors of a computing device, cause the one or more processors to:
create, by a processing device, a corresponding second data provider account for each of a set of second cloud computing entities, wherein the user has a set of user accounts in the set of second cloud computing entities and a data set is associated with a first data provider account from a first cloud computing entity; and
copy the data set from the first cloud computing entity to each of the set of second cloud computing entities using the corresponding second data provider accounts of the set of second cloud computing entities, wherein the first cloud computing entity is a first region for a cloud computing platform and at least one of the set of second cloud computing entities is a second region for the cloud computing platform that is different from the first region.
11. The machine-readable medium of claim 10, wherein the instructions further cause the computing device to:
receive a request for a listing of the data set from a user associated with the set of second cloud computing entities.
12. The machine-readable medium of claim 10, wherein a data exchange comprises a plurality of data listings provided by a plurality of data providers, the plurality of data listings referencing a plurality of data sets stored in a data storage platform, the data set is one of the plurality of data sets, and the data provider is one of the plurality of data providers.
13. The machine-readable medium of claim 10, wherein the instructions further cause the computing device to:
determine a frequency of updating the copied data set with the set of second cloud computing entities.
14. The machine-readable medium of claim 13, wherein the instructions further cause the computing device to:
update the copied data set using the determined frequency.
15. The machine-readable medium of claim 10, wherein the instructions further cause the computing device to copy of the data set by:
determine a set of one or more objects of the copied data set is to be replicated with the set of second cloud computing entities; and
replicate the set of one or more objects with the set of second cloud computing entities.
16. The machine-readable medium of claim 10, wherein the copied data set is customized for the user.
17. The machine-readable medium of claim 10, wherein the data set is a subset of a database managed by the data provider.
18. The machine-readable medium of claim 10, wherein the copied data set references at least one object in a second database that is different than a first data database used to store the data set.
19. A system comprising:
a first cloud computing entity;
a set of second cloud computing entities; and
a data exchange to:
create, by a processing device, a corresponding second data provider account for each of a set of second cloud computing entities, wherein the user has a set of user accounts in the set of second cloud computing entities and a data set is associated with a first data provider account from a first cloud computing entity; and
copy the data set from the first cloud computing entity to each of the set of second cloud computing entities using the corresponding second data provider accounts of the set of second cloud computing entities, wherein the first cloud computing entity is a first region for a cloud computing platform and at least one of the set of second cloud computing entities is a second region for the cloud computing platform that is different from the first region.
20. The system of claim 19, wherein the data exchange is further to:
receive a request for a listing of the data set from a user associated with the set of second cloud computing entities.
21. The system of claim 19, wherein the data exchange comprises a plurality of data listings provided by a plurality of data providers, the plurality of data listings referencing a plurality of data sets stored in a data storage platform, the data set is one of the plurality of data sets, and the data provider is one of the plurality of data providers.
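The flow recited in the claims above (create a provider account in each target region, copy selected objects of the data set into each region, and refresh on a determined frequency) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the specification: the `Region` class, the function names, the `provider@region` account-id format, and the in-memory dictionaries standing in for cloud storage are all hypothetical.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Region:
    """Hypothetical stand-in for a cloud computing entity (e.g. one region)."""
    name: str
    provider_accounts: dict = field(default_factory=dict)  # provider -> account id
    data_sets: dict = field(default_factory=dict)          # account id -> {object: value}


def create_provider_accounts(provider, targets):
    """Create one provider account per target region (account-id format is assumed)."""
    accounts = {}
    for region in targets:
        account_id = f"{provider}@{region.name}"
        region.provider_accounts[provider] = account_id
        region.data_sets.setdefault(account_id, {})
        accounts[region.name] = account_id
    return accounts


def replicate(source_objects, targets, accounts, object_names=None):
    """Copy the selected objects (all of them by default) into every target region."""
    names = list(source_objects) if object_names is None else object_names
    for region in targets:
        account_id = accounts[region.name]
        for name in names:
            # deepcopy so the regional copy is independent of the source object
            region.data_sets[account_id][name] = copy.deepcopy(source_objects[name])


def due_for_refresh(last_refresh, now, interval_seconds):
    """Return True once the determined update interval has elapsed."""
    return now - last_refresh >= interval_seconds


# Usage: share only the "orders" object from the source region to two other regions.
source = {"orders": [1, 2, 3], "customers": ["a", "b"]}
targets = [Region("eu-west"), Region("ap-south")]
accounts = create_provider_accounts("acme", targets)
replicate(source, targets, accounts, object_names=["orders"])
```

Passing `object_names` mirrors the idea that only a subset of objects may need to be replicated, and `due_for_refresh` is one simple way to apply an update frequency; a real deployment would rely on the platform's own account and replication facilities rather than these placeholders.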
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/709,689 US11418577B1 (en) | 2020-01-28 | 2022-03-31 | System and method for global data sharing |
US17/858,645 US11463508B1 (en) | 2020-01-28 | 2022-07-06 | System and method for global data sharing |
US17/940,436 US11743324B2 (en) | 2020-01-28 | 2022-09-08 | System and method for global data sharing |
US18/222,770 US20230362235A1 (en) | 2020-01-28 | 2023-07-17 | System and method for global data sharing |
Applications Claiming Priority (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202062966977P | 2020-01-28 | 2020-01-28 | |
US16/814,875 US10999355B1 (en) | 2020-01-28 | 2020-03-10 | System and method for global data sharing |
US17/220,887 US11082483B1 (en) | 2020-01-28 | 2021-04-01 | System and method for global data sharing |
US17/378,562 US11323506B2 (en) | 2020-01-28 | 2021-07-16 | System and method for global data sharing |
US17/709,689 US11418577B1 (en) | 2020-01-28 | 2022-03-31 | System and method for global data sharing |
Related Parent Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/378,562 Continuation US11323506B2 (en) | 2020-01-28 | 2021-07-16 | System and method for global data sharing |
Related Child Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/858,645 Continuation US11463508B1 (en) | 2020-01-28 | 2022-07-06 | System and method for global data sharing |
Publications (2)
Publication Number | Publication Date |
---|---|
US20220239728A1 (en) | 2022-07-28 |
US11418577B1 US11418577B1 (en) | 2022-08-16 |
Family ID=75689306
Family Applications (10)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/814,883 Active US11030343B1 (en) | 2020-01-28 | 2020-03-10 | System and method for creating a global data sharing listing |
US16/814,875 Active US10999355B1 (en) | 2020-01-28 | 2020-03-10 | System and method for global data sharing |
US17/220,887 Active US11082483B1 (en) | 2020-01-28 | 2021-04-01 | System and method for global data sharing |
US17/244,616 Active US11805167B2 (en) | 2020-01-28 | 2021-04-29 | Creating a global data sharing listing |
US17/378,562 Active US11323506B2 (en) | 2020-01-28 | 2021-07-16 | System and method for global data sharing |
US17/709,689 Active US11418577B1 (en) | 2020-01-28 | 2022-03-31 | System and method for global data sharing |
US17/858,645 Active US11463508B1 (en) | 2020-01-28 | 2022-07-06 | System and method for global data sharing |
US17/940,436 Active US11743324B2 (en) | 2020-01-28 | 2022-09-08 | System and method for global data sharing |
US18/222,770 Pending US20230362235A1 (en) | 2020-01-28 | 2023-07-17 | System and method for global data sharing |
US18/493,606 Pending US20240129360A1 (en) | 2020-01-28 | 2023-10-24 | Creating a global data sharing listing |
Family Applications Before (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/814,883 Active US11030343B1 (en) | 2020-01-28 | 2020-03-10 | System and method for creating a global data sharing listing |
US16/814,875 Active US10999355B1 (en) | 2020-01-28 | 2020-03-10 | System and method for global data sharing |
US17/220,887 Active US11082483B1 (en) | 2020-01-28 | 2021-04-01 | System and method for global data sharing |
US17/244,616 Active US11805167B2 (en) | 2020-01-28 | 2021-04-29 | Creating a global data sharing listing |
US17/378,562 Active US11323506B2 (en) | 2020-01-28 | 2021-07-16 | System and method for global data sharing |
Family Applications After (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/858,645 Active US11463508B1 (en) | 2020-01-28 | 2022-07-06 | System and method for global data sharing |
US17/940,436 Active US11743324B2 (en) | 2020-01-28 | 2022-09-08 | System and method for global data sharing |
US18/222,770 Pending US20230362235A1 (en) | 2020-01-28 | 2023-07-17 | System and method for global data sharing |
US18/493,606 Pending US20240129360A1 (en) | 2020-01-28 | 2023-10-24 | Creating a global data sharing listing |
Country Status (5)
Country | Link |
---|---|
US (10) | US11030343B1 (en) |
EP (1) | EP4097955A4 (en) |
KR (1) | KR20220130728A (en) |
CN (1) | CN115023921B (en) |
WO (1) | WO2021154567A1 (en) |
Families Citing this family (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11810089B2 (en) * | 2020-01-14 | 2023-11-07 | Snowflake Inc. | Data exchange-based platform |
US11488162B2 (en) * | 2020-02-26 | 2022-11-01 | Salesforce.Com, Inc. | Automatically storing metrics relating to payments in a blockchain |
US11269744B2 (en) | 2020-04-22 | 2022-03-08 | Netapp, Inc. | Network storage failover systems and associated methods |
US11416356B2 (en) | 2020-04-22 | 2022-08-16 | Netapp, Inc. | Network storage failover systems and associated methods |
US11216350B2 (en) * | 2020-04-22 | 2022-01-04 | Netapp, Inc. | Network storage failover systems and associated methods |
EP4173232A1 (en) * | 2020-06-29 | 2023-05-03 | Illumina, Inc. | Temporary cloud provider credentials via secure discovery framework |
US11522880B2 (en) * | 2020-07-09 | 2022-12-06 | International Business Machines Corporation | Analytics engine for data exploration and analytics |
US11507693B2 (en) * | 2020-11-20 | 2022-11-22 | TripleBlind, Inc. | Systems and methods for providing a blind de-identification of privacy data |
US11416450B1 (en) * | 2021-03-16 | 2022-08-16 | EMC IP Holding Company LLC | Clustering data management entities distributed across a plurality of processing nodes |
CN113642036B (en) * | 2021-07-07 | 2023-07-28 | 阿里巴巴华北技术有限公司 | Data processing method, device and system |
US11544011B1 (en) | 2021-07-28 | 2023-01-03 | Netapp, Inc. | Write invalidation of a remote location cache entry in a networked storage system |
US11481326B1 (en) | 2021-07-28 | 2022-10-25 | Netapp, Inc. | Networked storage system with a remote storage location cache and associated methods thereof |
US11500591B1 (en) | 2021-07-28 | 2022-11-15 | Netapp, Inc. | Methods and systems for enabling and disabling remote storage location cache usage in a networked storage system |
US11768775B2 (en) | 2021-07-28 | 2023-09-26 | Netapp, Inc. | Methods and systems for managing race conditions during usage of a remote storage location cache in a networked storage system |
US11373000B1 (en) | 2021-10-22 | 2022-06-28 | Akoya LLC | Systems and methods for managing tokens and filtering data to control data access |
US11496483B1 (en) * | 2021-10-22 | 2022-11-08 | Akoya LLC | Systems and methods for managing tokens and filtering data to control data access |
US11641357B1 (en) * | 2021-10-22 | 2023-05-02 | Akoya LLC | Systems and methods for managing tokens and filtering data to control data access |
US11379614B1 (en) | 2021-10-22 | 2022-07-05 | Akoya LLC | Systems and methods for managing tokens and filtering data to control data access |
US11379617B1 (en) | 2021-10-22 | 2022-07-05 | Akoya LLC | Systems and methods for managing tokens and filtering data to control data access |
US11704338B1 (en) * | 2022-02-28 | 2023-07-18 | Snowflake Inc. | Replication of share across deployments in database system |
US20230401181A1 (en) * | 2022-06-10 | 2023-12-14 | Capital One Services, Llc | Data Management Ecosystem for Databases |
US20230401224A1 (en) * | 2022-06-10 | 2023-12-14 | Capital One Services, Llc | Methods of orchestrated data sharing across cloud regions and cloud platforms of cloud-based data warehousing systems |
CN116781608B (en) * | 2023-08-17 | 2023-11-21 | 中移信息系统集成有限公司 | Data transmission system, method, electronic device and readable storage medium |
Family Cites Families (25)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7921299B1 (en) * | 2003-12-05 | 2011-04-05 | Microsoft Corporation | Partner sandboxing in a shared multi-tenant billing system |
US8560456B2 (en) * | 2005-12-02 | 2013-10-15 | Credigy Technologies, Inc. | System and method for an anonymous exchange of private data |
US8452726B2 (en) * | 2010-06-04 | 2013-05-28 | Salesforce.Com, Inc. | Sharing information between tenants of a multi-tenant database |
AU2011296008B2 (en) * | 2010-09-01 | 2015-08-06 | Google Llc | Access control for user-related data |
CN103384237B (en) * | 2012-05-04 | 2017-02-22 | 华为技术有限公司 | Method for sharing IaaS cloud account, shared platform and network device |
US20130311598A1 (en) | 2012-05-16 | 2013-11-21 | Apple Inc. | Cloud-based data item sharing and collaboration among groups of users |
US20140006038A1 (en) * | 2012-06-27 | 2014-01-02 | Prime West Health | Account Tracking System for Health Resource Encounters |
AU2013295603A1 (en) * | 2012-07-26 | 2015-02-05 | Experian Marketing Solutions, Inc. | Systems and methods of aggregating consumer information |
WO2015069234A1 (en) | 2013-11-06 | 2015-05-14 | Intel Corporation | Unifying interface for cloud content sharing services |
US10546149B2 (en) * | 2013-12-10 | 2020-01-28 | Early Warning Services, Llc | System and method of filtering consumer data |
EP3080742A4 (en) * | 2013-12-11 | 2017-08-30 | Intralinks, Inc. | Customizable secure data exchange environment |
AU2015207842B2 (en) * | 2014-07-29 | 2020-07-02 | Samsung Electronics Co., Ltd. | Method and apparatus for sharing data |
US10033702B2 (en) | 2015-08-05 | 2018-07-24 | Intralinks, Inc. | Systems and methods of secure data exchange |
US10671641B1 (en) * | 2016-04-25 | 2020-06-02 | Gravic, Inc. | Method and computer program product for efficiently loading and synchronizing column-oriented databases |
US10592681B2 (en) * | 2017-01-10 | 2020-03-17 | Snowflake Inc. | Data sharing in a multi-tenant database system |
US10437786B2 (en) * | 2017-10-21 | 2019-10-08 | Dropbox, Inc. | Interoperability between content management system and collaborative content system |
US10733168B2 (en) * | 2017-10-26 | 2020-08-04 | Sap Se | Deploying changes to key patterns in multi-tenancy database systems |
KR102441299B1 (en) * | 2017-11-27 | 2022-09-08 | 스노우플레이크 인코포레이티드 | Batch data collection into database system |
US10798165B2 (en) * | 2018-04-02 | 2020-10-06 | Oracle International Corporation | Tenant data comparison for a multi-tenant identity cloud service |
US10673694B2 (en) | 2018-05-29 | 2020-06-02 | Amazon Technologies, Inc. | Private network mirroring |
US10810166B2 (en) * | 2018-09-20 | 2020-10-20 | Paypal, Inc. | Reconciliation of data in a distributed system |
US10754827B2 (en) * | 2018-11-06 | 2020-08-25 | Dropbox, Inc. | Technologies for integrating cloud content items across platforms |
WO2020190931A1 (en) | 2019-03-19 | 2020-09-24 | Sigma Computing, Inc. | Cross-organization worksheet sharing |
US10635642B1 (en) * | 2019-05-09 | 2020-04-28 | Capital One Services, Llc | Multi-cloud bi-directional storage replication system and techniques |
US11461184B2 (en) * | 2019-06-17 | 2022-10-04 | Commvault Systems, Inc. | Data storage management system for protecting cloud-based data including on-demand protection, recovery, and migration of databases-as-a-service and/or serverless database management systems |
2020
- 2020-03-10: US16/814,883 (US), US11030343B1, Active
- 2020-03-10: US16/814,875 (US), US10999355B1, Active
2021
- 2021-01-20: EP21748410.4A (EP), EP4097955A4, Pending
- 2021-01-20: PCT/US2021/014213 (WO), WO2021154567A1, status unknown
- 2021-01-20: KR1020227028068A (KR), KR20220130728A, status unknown
- 2021-01-20: CN202180011492.2A (CN), CN115023921B, Active
- 2021-04-01: US17/220,887 (US), US11082483B1, Active
- 2021-04-29: US17/244,616 (US), US11805167B2, Active
- 2021-07-16: US17/378,562 (US), US11323506B2, Active
2022
- 2022-03-31: US17/709,689 (US), US11418577B1, Active
- 2022-07-06: US17/858,645 (US), US11463508B1, Active
- 2022-09-08: US17/940,436 (US), US11743324B2, Active
2023
- 2023-07-17: US18/222,770 (US), US20230362235A1, Pending
- 2023-10-24: US18/493,606 (US), US20240129360A1, Pending
Also Published As
Publication number | Publication date |
---|---|
US11418577B1 (en) | 2022-08-16 |
US11323506B2 (en) | 2022-05-03 |
US20230362235A1 (en) | 2023-11-09 |
CN115023921A (en) | 2022-09-06 |
US11030343B1 (en) | 2021-06-08 |
KR20220130728A (en) | 2022-09-27 |
EP4097955A4 (en) | 2024-02-21 |
WO2021154567A1 (en) | 2021-08-05 |
US20210250400A1 (en) | 2021-08-12 |
US20240129360A1 (en) | 2024-04-18 |
US11805167B2 (en) | 2023-10-31 |
CN115023921B (en) | 2023-09-01 |
EP4097955A1 (en) | 2022-12-07 |
US11743324B2 (en) | 2023-08-29 |
US20230007074A1 (en) | 2023-01-05 |
US11463508B1 (en) | 2022-10-04 |
US10999355B1 (en) | 2021-05-04 |
US20210344747A1 (en) | 2021-11-04 |
US11082483B1 (en) | 2021-08-03 |
US20210320968A1 (en) | 2021-10-14 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US11418577B1 (en) | System and method for global data sharing | |
US11810089B2 (en) | Data exchange-based platform | |
US11843608B2 (en) | Managing version sharing in a data exchange | |
US10803082B1 (en) | Data exchange | |
US11334604B2 (en) | Private data exchange | |
US20230252179A1 (en) | Organizing, discovering and evaluating marketplace datasets and services by industry business needs | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | FEPP | Fee payment procedure | Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
 | STCF | Information on status: patent grant | Free format text: PATENTED CASE |