US20190377727A1 - Automatic dynamic reusable data recipes - Google Patents

Publication number
US20190377727A1
Authority
US
United States
Prior art keywords
data
recipe
components
live
recipes
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US16/384,474
Inventor
Jeff Burtenshaw
Daren Thayne
Joshua G. James
Paul Baker
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Domo Inc
Original Assignee
Domo Inc
Application filed by Domo Inc
Priority to US16/384,474
Assigned to DOMO, INC. Assignment of assignors interest (see document for details). Assignors: BAKER, PAUL; JAMES, JOSHUA G.; THAYNE, DAREN; BURTENSHAW, JEFF
Publication of US20190377727A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/221 Column-oriented storage; Management thereof
    • G06F16/2228 Indexing structures
    • G06F16/2246 Trees, e.g. B+trees
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2452 Query translation
    • G06F16/24526 Internal representations for queries
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/248 Presentation of query results
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/254 Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification
    • G06F16/289 Object oriented databases
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/93 Document management systems
    • G06F16/95 Retrieval from the web
    • G06F16/954 Navigation, e.g. using categorised browsing

Definitions

  • ADS Application Data Sheet
  • the present application claims priority to: U.S. application Ser. No. 14/257,669 filed on Apr. 21, 2014 and issued as U.S. Pat. No. 10,262,030 on Apr. 16, 2019, and U.S. Provisional Application Ser. No. 61/814,586 for “Automatic Dynamic Reusable Data Recipes,” filed Apr. 22, 2013, each of which is incorporated by reference herein in its entirety.
  • the present invention relates to systems and methods for generating user-requested information, and more particularly, automated creation of data recipes.
  • Many businesses use key performance indicators (KPIs) to make strategic decisions.
  • When a new KPI is requested, considerable work must be done in order to obtain it. For example, a user may have to (1) determine what the component data of the KPI are, (2) determine where this data resides among one or more files, databases, and the like, (3) locate the data, (4) determine how the data should be combined in order to obtain the KPI, and (5) combine the data in the manner determined.
  • data modeling is used to organize and structure data for efficient query and retrieval in various contexts.
  • data modeling is typically a highly manual and expert-based task.
  • Conventional data modeling methods may be expensive and labor-intensive, and may be unavailable to many organizations.
  • many known data modeling techniques require significant computational time to resolve.
  • the systems and methods of the present invention may address such difficulty by providing mechanisms for automatically creating a recipe to provide requested information. This may be done without the need for the user to review the associated data source or locate the constituent data.
  • Various embodiments of the present invention may implement dynamic reusable data recipes that allow a data structure to evolve automatically and dynamically over time as new content is added. Such content can be added for later access from a business semantic layer of a software application.
  • the system may operate in contexts where high volumes of new content may be added and accessed/queried on a continual basis, so that a human data architect or team of data modelers would be overwhelmed by the volume and variety of structures that must be developed to support ever-changing needs.
  • the system of the present invention may automatically create and evolve reusable data structures in real-time with little human intervention.
  • One application of such a system is to enable and implement a community-based service that allows members to create new content and share that content with others. This can include, for example, extending existing content to cover new attributes and/or to answer new facets of questions.
  • an interrogation engine, a categorization engine, and a comparison engine may be provided. These engines may cooperatively evolve data recipes so as to optimize the reuse of existing data recipes and extend them as needed. In this manner, the systems and methods of the present invention may automate the process of generating data structures for efficient data storage and retrieval by query tools. Further details and variations are described herein.
  • FIG. 1A is a block diagram depicting a hardware architecture for practicing the present invention according to one embodiment of the present invention.
  • FIG. 1B is a block diagram depicting a hardware architecture for practicing the present invention in a client/server environment, according to one embodiment of the present invention.
  • FIG. 2 is a block diagram depicting the structure of a data recipe according to one embodiment of the present invention.
  • FIG. 3 is a block diagram depicting the structure of a data map according to one embodiment of the present invention.
  • FIG. 4 is a block diagram depicting the structure of a semantic layer according to one embodiment of the invention.
  • FIG. 5 is a block diagram depicting a system for carrying out automatic information provision, according to one embodiment of the present invention.
  • FIG. 6 is a flowchart depicting a method of carrying out automatic data recipe generation, according to one embodiment of the present invention.
  • FIG. 7 is a block diagram depicting a first recipe used to obtain a first KPI according to one embodiment of the invention.
  • FIG. 8 is a block diagram depicting a second recipe used to obtain a second KPI according to one embodiment of the invention.
  • the systems and methods described and depicted herein may refer to automated generation of data recipes that provide information requested by a user.
  • the data recipes may, in some embodiments, relate to the operation of an enterprise.
  • the techniques of the present invention can be applied to many different types of information, and may apply to many different situations apart from the exemplary enterprise operation context mentioned previously.
  • the present invention can be implemented on any electronic device(s) equipped to receive, store, and present information.
  • Such electronic device(s) may include, for example, one or more desktop computers, laptop computers, smartphones, tablet computers, or the like.
  • Referring now to FIG. 1A, there is shown a block diagram depicting a hardware architecture for practicing the present invention, according to one embodiment.
  • Such an architecture can be used, for example, for implementing the techniques of the present invention in a computer or other device 101 .
  • Device 101 may be any electronic device equipped to receive, store, and/or present information, and to receive user input in connection with such information.
  • device 101 has a number of hardware components well known to those skilled in the art.
  • Input device 102 can be any element that receives input from user 100 , including, for example, a keyboard, mouse, stylus, touch-sensitive screen (touchscreen), touchpad, trackball, accelerometer, five-way switch, microphone, or the like.
  • Input can be provided via any suitable mode, including for example, one or more of: pointing, tapping, typing, dragging, and/or speech.
  • Data store 106 can be any magnetic, optical, or electronic storage device for data in digital form; examples include flash memory, magnetic hard drive, CD-ROM, DVD-ROM, or the like.
  • data store 106 stores information which may include documents 107 and/or libraries 111 that can be utilized and/or displayed according to the techniques of the present invention, as described below.
  • documents 107 and/or libraries 111 can be stored elsewhere, and retrieved by device 101 when needed for presentation to user 100 .
  • Libraries 111 may include one or more data sets, including a first data set 109 , and optionally, a plurality of additional data sets up to an nth data set 119 .
  • Display screen 103 can be any element that graphically displays documents 107 , libraries 111 , and/or the results of steps performed on documents 107 and/or libraries 111 to provide data output incident to automated provision of data recipes.
  • data output may include, for example, one or more prompts that request information from the user 100 , data, data visualizations, prompts requesting input to confirm and/or modify data recipes, and the like.
  • a dynamic control such as a scrolling mechanism, may be available via input device 102 to change which information is currently displayed, and/or to alter the manner in which the information is displayed.
  • Processor 104 can be a conventional microprocessor for performing operations on data under the direction of software, according to well-known techniques.
  • Memory 105 can be random-access memory, having a structure and architecture as are known in the art, for use by processor 104 in the course of running software.
  • Data store 106 can be local or remote with respect to the other components of device 101 .
  • device 101 is configured to retrieve data from a remote data storage device when needed.
  • Such communication between device 101 and other components can take place wirelessly, by Ethernet connection, via a computing network such as the Internet, or by any other appropriate means. This communication with other electronic devices is provided as an example and is not necessary to practice the invention.
  • data store 106 is detachable in the form of a CD-ROM, DVD, flash drive, USB hard drive, or the like.
  • Documents 107 and/or libraries 111 can be entered from a source outside of device 101 into a data store 106 that is detachable, and later displayed after the data store 106 is connected to device 101 .
  • data store 106 is fixed within device 101 .
  • Referring now to FIG. 1B, there is shown a block diagram depicting a hardware architecture for practicing the present invention in a client/server environment, according to one embodiment of the present invention.
  • client/server environment may use a “black box” approach, whereby data storage and processing are done completely independently from user input/output.
  • client/server environment is a web-based implementation, wherein client device 108 runs a browser that provides a user interface for interacting with web pages and/or other web-based resources from server 110 .
  • Documents 107 and/or libraries 111 can be presented as part of such web pages and/or other web-based resources, using known protocols and languages such as Hypertext Markup Language (HTML), Java, JavaScript, and the like.
  • Client device 108 can be any electronic device incorporating the input device 102 and/or display screen 103 , such as a desktop computer, laptop computer, personal digital assistant (PDA), cellular telephone, smartphone, music player, handheld computer, tablet computer, kiosk, game system, or the like.
  • Any suitable type of communications network 113 such as the Internet, can be used as the mechanism for transmitting data between client device 108 and server 110 , according to any suitable protocols and techniques.
  • client device 108 transmits requests for data via communications network 113 , and receives responses from server 110 containing the requested data.
  • server 110 is responsible for data storage and processing, and incorporates data store 106 for storing documents 107 and/or libraries 111 .
  • Server 110 may include additional components as needed for retrieving data and/or libraries 111 from data store 106 in response to requests from client device 108 .
  • documents 107 are organized into one or more well-ordered data sets, with one or more data entries in each set.
  • Data store 106 can have any suitable structure. Accordingly, the particular organization of documents 107 within data store 106 need not resemble the form in which documents 107 are displayed to user 100 .
  • an identifying label is also stored along with each data entry, to be displayed along with each data entry.
  • the libraries 111 may include one or more data sources, which may be stored at one or more locations and in one or more formats. In at least one embodiment, libraries 111 are organized in a file system within data store 106 . Appropriate indexing can be provided to associate particular documents with particular quantitative data elements, reports, other documents, and/or the like. Libraries 111 may include any of a wide variety of data structures known in the database arts. As in FIG. 1A , libraries 111 may include one or more data sets, including a first data set 109 , and optionally, a plurality of additional data sets up to an nth data set 119 .
  • Documents 107 and/or libraries 111 can be retrieved from client-based or server-based data store 106 , and/or from any other source.
  • input device 102 is configured to receive data entries from user 100 , to be added to documents 107 and/or libraries 111 held in data store 106 .
  • User 100 may provide such data entries via the hardware and software components described above according to means that are well known to those skilled in the art.
  • Display screen 103 can be any element that graphically displays documents 107 , libraries 111 , and/or the results of steps performed on documents 107 and/or libraries 111 to provide data output incident to automated provision of data recipes.
  • data output may include, for example, one or more prompts that request information from the user 100 , data, data visualizations, prompts requesting input to confirm and/or modify data recipes, and the like.
  • a dynamic control such as a scrolling mechanism, may be available via input device 102 to change which information is currently displayed, and/or to alter the manner in which the information is displayed.
  • the information displayed on display screen 103 may include data in text and/or graphical form. Such data may comprise visual cues, such as height, distance, and/or area, to convey the value of each data entry.
  • labels accompany data entries on display screen 103 , or can be displayed when user 100 taps on or clicks on a data entry, or causes an onscreen cursor to hover over a data entry.
  • display screen 103 can selectively present a wide variety of data related to automated data recipe generation.
  • user 100 can provide input, such as a selection from a menu containing a variety of options, to determine the various characteristics of the information presented such as the type, scope, and/or format of the information to be displayed on display screen 103 .
  • system can be implemented as software written in any suitable computer programming language, whether in a standalone or client/server architecture. Alternatively, it may be implemented and/or embedded in hardware.
  • a “data recipe” may include any instruction set that enables requested information to be obtained from other data in one or more data sources.
  • a wide variety of data types and processes may be included in a data recipe.
  • Each piece of data used by the data recipe, i.e., each “data recipe ingredient,” may be of any desired length and format.
  • each piece of data may be a character string, integer, floating point number, or any other type of data, and may thus represent any information such as names, times, dates, currency amounts, percentages, fractions, physical dimensions, or any other data that may desirably be stored in a computer.
  • the data recipe 200 may include one or more data recipe ingredients 210 and one or more data processes 220 that describe how the data recipe ingredients 210 are to be combined and/or manipulated to obtain the requested information.
  • the data recipe ingredients 210 may include data of any of the types listed in the preceding paragraph, and the data processes 220 may include one or more formulas or other combination instructions indicating how the data recipe ingredients 210 may be used to obtain the requested information. If the data recipe ingredients 210 include numbers, the data processes 220 may include mathematical formulas or the like.
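  • As a minimal sketch of the structure just described, a data recipe can be modeled as a list of named ingredients plus a process that combines them. The class name, the `evaluate` method, and the revenue-per-order example are illustrative assumptions and are not taken from the specification.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class DataRecipe:
    """Hypothetical model of a data recipe: ingredients plus one data process."""
    name: str
    ingredients: List[str]                         # names of the data recipe ingredients
    process: Callable[[Dict[str, float]], float]   # formula that combines the ingredients

    def evaluate(self, data: Dict[str, float]) -> float:
        # Refuse to run if any ingredient is missing from the supplied data.
        missing = [i for i in self.ingredients if i not in data]
        if missing:
            raise KeyError(f"missing ingredients: {missing}")
        # Pass only the declared ingredients to the process.
        return self.process({i: data[i] for i in self.ingredients})


# Assumed example: a KPI computed from two numeric ingredients.
revenue_per_order = DataRecipe(
    name="revenue_per_order",
    ingredients=["total_revenue", "order_count"],
    process=lambda d: d["total_revenue"] / d["order_count"],
)

result = revenue_per_order.evaluate({"total_revenue": 1200.0, "order_count": 48})
```

  • Keeping the ingredient list separate from the process is one possible way to let another recipe reuse the same ingredients with a different formula.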
  • the data recipe ingredients 210 may include one or more dimensions 230 and/or one or more measures 240 .
  • a measure 240 is a property on which calculations can be made, and a dimension 230 is a data set that can be used for structured labeling of measures.
  • dimensions 230 may represent data that would likely be used as the scale on a data visualization, such as the X-axis on a conventional bar chart or line chart.
  • the measures 240 may represent data that would likely be used for the measurements on a data visualization, such as the vertical displacements of the bars or points of a conventional bar chart or line chart.
  • a set of rules may be used to define which of the data recipe ingredients 210 are dimensions 230 and which are measures 240 .
  • One example of such a set of rules is as follows:
  • Such a rule set may grow in sophistication over time.
  • machine learning techniques and/or other techniques may be used to automatically grow, refine, and/or otherwise develop the rule set used to classify the data recipe ingredients 210 as dimensions 230 or measures 240 .
  • the dimensions 230 and the measures 240 may be defined according to a wide variety of alternative definitions or rules, or alternatively, other classifications (or no classifications) may be applied to the data recipe ingredients 210 .
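  • The distinction between dimensions and measures can be illustrated with a short sketch. The rule set referred to above is not reproduced in this text, so the single rule used here (all-numeric columns are measures, everything else is a dimension) is purely an assumption for illustration, as are the function and column names.

```python
from numbers import Number
from typing import Dict, List, Tuple


def classify_ingredients(
    columns: Dict[str, List[object]]
) -> Tuple[List[str], List[str]]:
    """Split columns into (dimensions, measures).

    Assumed rule, not the specification's rule set: a non-empty column whose
    values are all numeric is a measure; any other column is a dimension.
    """
    dimensions: List[str] = []
    measures: List[str] = []
    for name, values in columns.items():
        numeric = bool(values) and all(
            isinstance(v, Number) and not isinstance(v, bool) for v in values
        )
        (measures if numeric else dimensions).append(name)
    return dimensions, measures


# Hypothetical columns: two labels (candidate chart axes) and one numeric series.
columns = {
    "region": ["East", "West"],
    "month": ["2019-01", "2019-02"],
    "sales": [1500.0, 1725.5],
}
dimensions, measures = classify_ingredients(columns)
```

  • With these inputs, `region` and `month` would label the axis of a chart and `sales` would supply the plotted values, which matches the bar chart analogy above.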
  • Referring to FIG. 3, a block diagram depicts the structure of a data map 300 according to one embodiment of the present invention.
  • the data map 300 may be designed to identify the data recipe ingredients 210 within one or more data stores 106 .
  • the data map 300 may include metadata 302 , which may include records for one or more reference data recipes that are to be mapped to data sources such as the data store 106 . More specifically, the metadata 302 may include a first record 310 pertaining to a first reference data recipe and optionally, one or more additional records pertaining to one or more additional reference data recipes up to an nth record 320 pertaining to an nth reference data recipe.
  • the first record 310 may include first reference data recipe information 330 pertaining to the first reference data recipe.
  • the first reference data recipe information 330 may include a name or other indicator of the first reference data recipe, the information the first reference data recipe is designed to provide, data processes 220 associated with the first reference data recipe, or the like.
  • the first record 310 may include the data recipe ingredients 210 of the first reference data recipe, which may include a first data recipe ingredient 340 and optionally, one or more additional data recipe ingredients up to an nth data recipe ingredient 342 .
  • the metadata 302 may contain a mapping indicating where the data of each data recipe ingredient 210 can be found in one or more data stores 106 .
  • the metadata 302 may also contain, for each of the data recipe ingredients 210 of the first reference data recipe, a data mapping including a first data mapping 350 for the first data recipe ingredient 340 and optionally, one or more additional data mappings up to an nth data mapping 352 for the nth data recipe ingredient 342 .
  • the first data mapping 350 may indicate the location of the first data recipe ingredient 340 in one or more data stores 106 .
  • the nth data mapping 352 may indicate the location of the nth data recipe ingredient 342 in one or more data stores 106 .
  • the nth record 320 may include nth reference data recipe information 360 pertaining to the nth reference data recipe.
  • the nth reference data recipe information 360 may include a name or other indicator of the nth reference data recipe, the information the nth reference data recipe is designed to provide, data processes 220 associated with the nth reference data recipe, or the like.
  • the nth record 320 may include the data recipe ingredients 210 of the nth reference data recipe, which may include a first data recipe ingredient 370 and optionally, one or more additional data recipe ingredients up to an nth data recipe ingredient 372 .
  • the metadata 302 may contain a mapping indicating where the data of each data recipe ingredient 210 can be found in one or more data stores 106 .
  • the metadata 302 may also contain, for each of the data recipe ingredients 210 of the nth reference data recipe, a data mapping including a first data mapping 380 for the first data recipe ingredient 370 and optionally, one or more additional data mappings up to an nth data mapping 382 for the nth data recipe ingredient 372.
  • the first data mapping 380 may indicate the location of the first data recipe ingredient 370 in one or more data stores 106 .
  • the nth data mapping 382 may indicate the location of the nth data recipe ingredient 372 in one or more data stores 106 .
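  • One way to picture the data map is as a collection of records, each holding the reference recipe information and a mapping from every ingredient to its location. The sketch below assumes a location is a (store, table, column) triple; that representation and all class, field, and example names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DataMapping:
    """Assumed location descriptor for one data recipe ingredient."""
    store: str
    table: str
    column: str


@dataclass
class RecipeRecord:
    """One metadata record: recipe information plus a mapping per ingredient."""
    recipe_name: str
    provides: str                                   # information the recipe is designed to provide
    mappings: Dict[str, DataMapping] = field(default_factory=dict)

    @property
    def ingredients(self) -> List[str]:
        return list(self.mappings)


@dataclass
class DataMap:
    """Metadata holding one record per reference data recipe."""
    records: Dict[str, RecipeRecord] = field(default_factory=dict)

    def add(self, record: RecipeRecord) -> None:
        self.records[record.recipe_name] = record

    def locate(self, recipe_name: str, ingredient: str) -> Optional[DataMapping]:
        # Return where an ingredient lives, or None if it is not mapped.
        record = self.records.get(recipe_name)
        return record.mappings.get(ingredient) if record else None


data_map = DataMap()
data_map.add(RecipeRecord(
    recipe_name="revenue_per_order",
    provides="average revenue per order",
    mappings={
        "total_revenue": DataMapping("warehouse", "orders", "net_amount"),
        "order_count": DataMapping("warehouse", "orders", "order_id"),
    },
))
location = data_map.locate("revenue_per_order", "total_revenue")
```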
  • Referring to FIG. 4, a block diagram depicts the structure of a semantic layer 400 according to one embodiment of the invention.
  • the semantic layer 400 may provide pairings between terminology and data to facilitate location of data with a semantic identification, such as a name, description, other semantic metadata, or the like, within one or more data stores 106 .
  • the semantic layer 400 may link a “phrase,” which may include one or more words or other semantic elements, with the location within one or more data stores 106 , where data corresponding to that phrase may be found.
  • the semantic layer 400 may have one or more pairings, which may include a first pairing 410 and optionally, one or more additional pairings up to an nth pairing 420 .
  • the first pairing 410 may include a first phrase 430 and a first phrase mapping 440 , which may indicate one or more locations within one or more data stores 106 where the first phrase 430 , or data corresponding to the first phrase 430 , may be found.
  • the first pairing 410 may further include a first confidence factor 450 indicating a level of confidence in the link between the first phrase 430 and the first phrase mapping 440 .
  • the first confidence factor 450 may, for example, be a number such as a percentage, where 0% indicates no confidence in the link, and 100% indicates absolute certainty that the first phrase mapping 440 is the location of the first phrase 430 , or data related to the first phrase 430 , within one or more data stores 106 . This structure may facilitate machine learning to allow for improved performance in generating and developing data recipes over time.
  • the first confidence factor 450 may be revised over time based on user feedback, further comparisons with the data store 106 , or the like.
  • the nth pairing 420 may include an nth phrase 460 and an nth phrase mapping 470 , which may indicate one or more locations within one or more data stores 106 where the nth phrase 460 , or data corresponding to the nth phrase 460 , may be found.
  • the nth pairing 420 may further include an nth confidence factor 480 indicating a level of confidence in the link between the nth phrase 460 and the nth phrase mapping 470 .
  • the nth confidence factor 480 may be a number or other indicator of a confidence level that the nth phrase mapping 470 is the location of the nth phrase 460, or data related to the nth phrase 460, within one or more data stores 106.
  • the nth confidence factor 480 may also be revised over time based on user feedback, further comparisons with the data store 106 , or the like.
  • the semantic layer 400 may be incorporated into the data map 300 .
  • the semantic layer 400 may be independent of the data map 300 .
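  • The pairings of the semantic layer can be sketched as phrase, mapping, and confidence triples. In the sketch below the confidence is held as a fraction between 0.0 and 1.0 rather than a percentage, and the fixed-step feedback adjustment and the lookup threshold are assumptions chosen for illustration; the text does not specify how confidence factors are revised.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Pairing:
    phrase: str
    mapping: str        # assumed "store.table.column" location string
    confidence: float   # 0.0 = no confidence, 1.0 = absolute certainty


class SemanticLayer:
    """Hypothetical container of phrase-to-location pairings."""

    def __init__(self) -> None:
        self._pairings: Dict[str, Pairing] = {}

    def add(self, phrase: str, mapping: str, confidence: float) -> None:
        if not 0.0 <= confidence <= 1.0:
            raise ValueError("confidence must be between 0.0 and 1.0")
        self._pairings[phrase.lower()] = Pairing(phrase, mapping, confidence)

    def lookup(self, phrase: str, min_confidence: float = 0.5) -> Optional[str]:
        # Only return a mapping that is trusted at least as much as the threshold.
        pairing = self._pairings.get(phrase.lower())
        if pairing is not None and pairing.confidence >= min_confidence:
            return pairing.mapping
        return None

    def feedback(self, phrase: str, confirmed: bool, step: float = 0.1) -> float:
        # Assumed update rule: nudge confidence up or down, clamped to [0, 1].
        pairing = self._pairings[phrase.lower()]
        delta = step if confirmed else -step
        pairing.confidence = min(1.0, max(0.0, pairing.confidence + delta))
        return pairing.confidence


layer = SemanticLayer()
layer.add("net sales", "warehouse.orders.net_amount", 0.55)
before = layer.lookup("Net Sales")               # trusted: 0.55 >= 0.5
layer.feedback("net sales", confirmed=False)     # user rejects the link
after = layer.lookup("net sales")                # now below the threshold
```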
  • the system of the present invention enables automated provision of data recipes for generating, collecting, and/or presenting information requested by users.
  • a data recipe may be formulated by interrogating one or more data sources, such as the data store 106, to obtain data types, categorizing the data types into data recipe ingredients, and comparing the data recipe ingredients with one or more reference data recipes.
  • the new data recipe may then be created independently, or by modifying one or more reference data recipes to obtain the new data recipe.
  • the new data recipe may then be used to generate, collect, and/or present the requested information.
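  • The interrogate, categorize, and compare sequence described above can be sketched as three small functions. Everything here is a simplified assumption: data types are Python type names, relevance is decided by exact name match against the request, and the comparison picks the reference recipe with the largest ingredient overlap.

```python
from typing import Dict, List, Set, Tuple


def interrogate(source: Dict[str, List[object]]) -> Dict[str, str]:
    """Return an assumed data type label for each column of the data source."""
    return {
        column: type(values[0]).__name__ if values else "unknown"
        for column, values in source.items()
    }


def categorize(data_types: Dict[str, str], requested: Set[str]) -> Set[str]:
    """Keep the columns that serve as ingredients (assumed test: name match)."""
    return {column for column in data_types if column in requested}


def compare(ingredients: Set[str], references: Dict[str, Set[str]]) -> Tuple[str, str]:
    """Decide whether to reuse, modify, or create, based on ingredient overlap."""
    best_name, best_overlap = "", 0
    for name, reference in references.items():
        overlap = len(ingredients & reference)
        if overlap > best_overlap:
            best_name, best_overlap = name, overlap
    if not best_name:
        return ("create", "")            # nothing to build on: create independently
    if references[best_name] == ingredients:
        return ("reuse", best_name)      # an existing recipe already fits
    return ("modify", best_name)         # extend the closest reference recipe


# Hypothetical data source, request, and reference recipe.
source = {"total_revenue": [1200.0], "order_count": [48], "region": ["East"]}
requested = {"total_revenue", "order_count"}
references = {"revenue_by_region": {"total_revenue", "region"}}

data_types = interrogate(source)
ingredients = categorize(data_types, requested)
decision = compare(ingredients, references)
```

  • Here the request shares one ingredient with the reference recipe, so the sketch chooses to modify that recipe rather than create a new one from scratch.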
  • Referring to FIG. 5, a block diagram depicts a system 500 for carrying out automatic data recipe generation, according to one embodiment of the present invention.
  • the system 500 may have an interrogation engine 510 , a categorization engine 520 , and a comparison engine 530 that may cooperate to generate a data recipe 200 .
  • User inputs are shown on the left-hand side of FIG. 5, and outputs to the user are shown on the right-hand side of FIG. 5.
  • the system 500 may receive an information request 540 from the user 100 .
  • the information request 540 may indicate one or more pieces of information desired by the user 100 .
  • the information request 540 may be for any type of information.
  • the information request 540 may be provided via the input device 102 , and may include one or more numbers, phrases, natural language questions, menu selections, and/or a variety of other user input elements.
  • the system 500 may be incorporated into a business intelligence system.
  • the information request 540 may relate to organizational performance, and may more specifically be a key performance indicator (KPI).
  • Key performance indicators (KPIs) are performance measurement indicators that can be used to evaluate success of an enterprise, e.g., an entity, activity, organization, or group.
  • KPI reports which summarize key performance indicators, can be very useful in management of an enterprise so that effective decisions can be made regarding business strategy and resource allocation.
  • KPIs can be compiled to create a “dashboard,” which is a compiled snapshot of the most important aspects of the operation of the enterprise.
  • KPIs are numerically measurable aspects of the operation of an enterprise. Some KPIs are well-known and apply to a wide range of businesses. However, the most important KPIs for a business are often highly industry-specific, enterprise-specific, or even department-specific. In management, the challenge is to know which KPIs to focus on. Often, the process of finding the best KPIs to use is an iterative one in which one set of KPIs is utilized, and then refreshed by adding KPIs to and/or removing KPIs from the set under review. The speed at which this process occurs is often limited by the ability of an organization to locate and/or process the data required to calculate the KPIs of interest.
  • the system 500 may beneficially provide the user with a data recipe 200 that can be used to locate and properly process the data necessary to obtain the KPI. This may be performed in several stages.
  • the system 500 of the present invention may apply a sophisticated set of heuristics within a software application.
  • the interrogation engine 510 , the categorization engine 520 , and/or the comparison engine 530 may utilize a heuristic algorithm 570 to perform any of a number of tasks including, but not limited to creation and/or implementation of the data map 300 and/or the semantic layer 400 .
  • FIG. 5 illustrates a connection between the heuristic algorithm 570 and the categorization engine 520 , but the interrogation engine 510 and/or the comparison engine 530 may also utilize the heuristic algorithm 570 , if desired.
  • the information request 540 may be received by the interrogation engine 510 .
  • the interrogation engine 510 may interrogate one or more data sources, which are exemplified by the data store 106 in FIG. 5 .
  • the interrogation engine 510 may identify data types 550 present within the data store 106 . This may optionally be done with the aid of the semantic layer 400 , which may help to map semantic elements of the information request 540 and/or the data types 550 to corresponding data within the data store 106 .
  • Data interrogation may be done based on semantics so that the data types 550 will conform to semantic archetypes.
  • the interrogation engine 510 may function independently of the semantic layer 400 .
  • the interrogation engine 510 may receive data from the data store 106 and parse the data into a NoSQL tree structure for processing. If desired or necessary, the schema of the NoSQL tree structure can be approximated based on the structure of the data store 106 .
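  • The approximation of a schema from parsed data can be sketched as follows. This is only an illustration under stated assumptions: the helper name `infer_schema`, the use of nested Python dictionaries as the tree, and the sample record are all hypothetical and do not come from the description.

```python
from typing import Any


def infer_schema(value: Any) -> Any:
    """Approximate a schema tree from one parsed record (hypothetical helper)."""
    if isinstance(value, dict):
        # Recurse into nested objects, keeping the same key layout.
        return {key: infer_schema(child) for key, child in value.items()}
    if isinstance(value, list):
        # Assumption: a list is described by the schema of its first element.
        return [infer_schema(value[0])] if value else []
    # Leaves are described by the name of their Python type.
    return type(value).__name__


# Hypothetical record standing in for data received from a data store.
record = {
    "product": {"key": 17, "type": "widget"},
    "catalog_price": 9.99,
    "tags": ["new"],
}
schema = infer_schema(record)
```

  • A real interrogation step would likely merge the schemas of many records rather than trust a single one; the single-record version is kept here for brevity.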
  • the data types 550 may be provided to the categorization engine 520 , which may categorize the data types 550 to determine which of the data types 550 are data recipe ingredients 210 of the data recipe 200 to be created. Like the interrogation engine 510 , the categorization engine 520 may operate with the aid of the semantic layer 400 , or independently of the semantic layer 400 .
  • the categorization engine 520 may function by, for example, using the heuristic algorithm 570 to identify the dimensions 230 and/or measure 240 of the data. Relationships between data elements within the schema may be inferred from the structure of the data store 106 , if possible. This may be accomplished, for example, by interrogating the data store 106 to find ingredients in common among different entities.
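  • One possible heuristic for separating dimensions from measures, and for finding ingredients in common among entities, is sketched below. The rule used here (numeric columns are measures unless their name ends in `_key`) is an assumption made for illustration; the description does not specify the heuristic, and all names and sample data are hypothetical.

```python
from numbers import Number


def categorize(columns: dict[str, list]) -> dict[str, list[str]]:
    """Split columns into dimensions and measures using a simple, assumed rule."""
    result: dict[str, list[str]] = {"dimensions": [], "measures": []}
    for name, samples in columns.items():
        numeric = bool(samples) and all(
            isinstance(v, Number) and not isinstance(v, bool) for v in samples
        )
        # Assumption: key-like columns stay dimensions even when numeric.
        if numeric and not name.endswith("_key"):
            result["measures"].append(name)
        else:
            result["dimensions"].append(name)
    return result


def common_ingredients(entities: dict[str, set[str]]) -> set[str]:
    """Return column names shared by every entity, a hint at relationships."""
    return set.intersection(*entities.values()) if entities else set()


# Hypothetical sample data.
columns = {
    "product_key": [1, 2],
    "product_type": ["a", "b"],
    "catalog_price": [9.99, 4.5],
}
entities = {
    "products": {"product_key", "product_type"},
    "prices": {"product_key", "catalog_price"},
}
categories = categorize(columns)
shared = common_ingredients(entities)
```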
  • the data recipe ingredients 210 may be provided to the comparison engine 530 , which may compare the data recipe ingredients 210 with one or more reference data recipes to determine whether one or more of the reference data recipes can be modified to yield the data recipe 200 that provides the requested information.
  • the reference data recipes may be stored in the data map 300 , for example, in the first reference data recipe information 330 through the nth reference data recipe information 360 .
  • the comparison engine 530 may compare the data recipe ingredients 210 with the data map 300 , or more precisely, with the data recipe ingredients stored within the data map 300 . This may entail comparing the data recipe ingredients 210 with the first data recipe ingredient 340 through the nth data recipe ingredient 342 and the data recipe ingredients of the other data recipes, up to the first data recipe ingredient 370 through the nth data recipe ingredient 372 .
  • the determination of whether to modify one of the reference data recipes stored in the data map 300 may be made, for example, based on the degree of similarity between the data recipe ingredients 210 for the data recipe 200 desired, and the data recipe ingredients 210 of the corresponding reference data recipe.
  • if the data recipe ingredients 210 received by the comparison engine 530 are very similar to the first data recipe ingredient 340 through the nth data recipe ingredient 342 of the first record 310 of the data map 300 , it may be easier to modify the corresponding first reference data recipe to obtain the data recipe 200 that satisfies the information request 540 .
  • the data recipe 200 may be created independently of any of the reference data recipes.
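  • The similarity-based choice between modifying a reference data recipe and proceeding independently might look like the following sketch. Jaccard similarity and the 0.5 threshold are assumptions chosen for illustration; the description only refers to a "degree of similarity" and does not name a metric. All identifiers and sample data are hypothetical.

```python
from typing import Optional


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two ingredient sets (assumed metric)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def choose_reference(
    requested: set[str],
    references: dict[str, set[str]],
    threshold: float = 0.5,
) -> Optional[str]:
    """Return the most similar reference recipe name, or None to build from scratch."""
    best_name, best_score = None, 0.0
    for name, ingredients in references.items():
        score = jaccard(requested, ingredients)
        if score > best_score:
            best_name, best_score = name, score
    # Below the (assumed) threshold, creating a recipe independently is preferred.
    return best_name if best_score >= threshold else None


# Hypothetical reference recipes standing in for entries of a data map.
references = {
    "first_kpi": {"product", "product_type", "catalog_price"},
    "headcount": {"employee", "department"},
}
requested = {"product", "product_type", "catalog_price", "region"}
chosen = choose_reference(requested, references)
```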
  • the categorization engine 520 may operate with the aid of the semantic layer 400 , or independently of the semantic layer 400 .
  • the comparison engine 530 may use the heuristic algorithm 570 to attempt to match each element in the schema to the semantic layer 400 based, for example, on name, data type, sample data set, and/or any other suitable data elements.
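  • Matching a schema element to a semantic layer entry on name, data type, and sample data could be sketched as below. The weights, the dictionary layout of elements and terms, and the function names are assumptions made for illustration only; `difflib.SequenceMatcher` from the Python standard library is used as a stand-in for whatever name comparison the heuristic would apply.

```python
from difflib import SequenceMatcher


def match_score(element: dict, term: dict) -> float:
    """Score how well a schema element fits a semantic-layer term (assumed weights)."""
    name_score = SequenceMatcher(
        None, element["name"].lower(), term["name"].lower()
    ).ratio()
    type_score = 1.0 if element["dtype"] == term["dtype"] else 0.0
    samples = set(element["samples"])
    known = set(term["samples"])
    # Fraction of the element's sample values already known for the term.
    sample_score = len(samples & known) / len(samples) if samples else 0.0
    return 0.5 * name_score + 0.2 * type_score + 0.3 * sample_score


def best_match(element: dict, semantic_layer: list[dict]) -> str:
    """Return the name of the highest-scoring semantic-layer term."""
    return max(semantic_layer, key=lambda term: match_score(element, term))["name"]


# Hypothetical schema element and semantic layer entries.
element = {"name": "prod_type", "dtype": "str", "samples": ["widget", "gadget"]}
semantic_layer = [
    {"name": "product type", "dtype": "str", "samples": ["widget", "gizmo"]},
    {"name": "catalog price", "dtype": "float", "samples": []},
]
matched = best_match(element, semantic_layer)
```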
  • the comparison engine 530 may use the data map 300 .
  • the data map 300 may describe one or more reference data recipes, such as the first through nth reference data recipes of FIG. 3 , and may describe how each reference data recipe correlates to the data store 106 .
  • the comparison engine 530 may determine how a data recipe can be applied to a different data source, used to power a data visualization, or otherwise used in a manner different from that of its reference data recipe.
  • comparison of the data recipe ingredients 210 received from the categorization engine 520 with those of the reference data recipes of the data map 300 may be made by comparing attributes of elements of the data recipe ingredients 210 with corresponding attributes of the reference data recipes.
  • attributes that can be associated with one another in accordance with the techniques of the present invention:
  • the comparison engine 530 may utilize any of a wide variety of data comparison techniques known in the art.
  • the system 500 may gather feedback from the user 100 at one or more points in the process via direct interaction (prompt and response) and/or by monitoring manual adjustments to generated content (changes in the state of the model).
  • the system 500 may provide the data recipe 200 to the user for approval or rejection. If the user 100 rejects the data recipe 200 , the data recipe 200 may be revised, for example, by further iterations with the interrogation engine 510 , the categorization engine 520 , and/or the comparison engine 530 .
  • the data recipe 200 may be used to adjust the heuristic algorithm 570 through the use of machine learning or other techniques.
  • the data recipe 200 may also be added to the data map 300 as one of the reference data recipes that may be modified in the process of generating future data recipes.
  • the semantic layer 400 may also be adjusted as the system 500 operates. These adjustments to the semantic layer 400 may be made based upon user feedback, or through the use of machine learning techniques or the like. Such changes to the semantic layer 400 may then be fed back through the system 500 to re-evaluate previously applied schema and thereby improve the performance of the system 500 for future data recipe generation.
  • the data recipe 200 may be provided to the user 100 .
  • the user 100 may then use the data recipe 200 to fulfill the information request 540 .
  • the data recipe 200 may be used repeatedly to obtain the requested information 580 as circumstances change. For example, if the information request 540 represents a KPI, the data recipe 200 may be used to obtain the KPI and then update and/or review it according to an interval desired by the user 100 .
  • the system 500 may also be designed to apply the data recipe 200 to fulfill the information request 540 by providing the requested information 580 automatically for the user 100 .
  • the system 500 may do this only once, or at any interval desired by the user 100 .
  • the system 500 may provide the user 100 with the requested information 580 and then receive feedback based on the requested information 580 . For example, returning to the example of a KPI, if the user deems that the requested information 580 is not the KPI that was requested, the user 100 may provide feedback that causes the system 500 to revise the data recipe 200 to obtain the requested information 580 again from the new data recipe.
  • the new data recipe 200 may be further refined as needed. This may be done, for example, by additional iterations through the interrogation engine 510 , the categorization engine 520 , the comparison engine 530 , and/or the heuristic algorithm 570 .
  • FIG. 6 is a flowchart depicting a method 600 of carrying out automatic data recipe generation, according to one embodiment of the present invention.
  • the method 600 may be carried out, at least in part, by the system 500 as in FIG. 5 , or with a differently-configured data recipe provision system.
  • the method 600 may be performed in connection with input from the user 100 ; such a user 100 may be a developer, customer, enterprise leader, sales representative for business intelligence services, or any other individual.
  • FIG. 6 illustrates a series of steps in a certain order, but those of skill in the art will recognize that these steps may be re-ordered, omitted, replaced with other steps, or supplemented with additional steps, consistent with the spirit of the invention.
  • the method 600 may utilize any suitable source of data, such as for example a spreadsheet, database, website, blog, whitepaper, report, key performance indicator (KPI), dashboard, and/or the like, which may provide the data store 106 illustrated in FIG. 5 . It may then apply a rules-based algorithm to the data store 106 to interrogate the data store 106 , discover the data types contained, categorize the data types into logical data recipe ingredients, compare the data recipe ingredients to existing reference data recipes, and either extend the existing reference data recipes as needed or create a new data recipe.
  • the method 600 may start 610 with a step 620 in which the information request 540 is received from the user 100 . As mentioned in connection with FIG. 5 , this may be done in many ways and with any of a wide variety of input devices 102 .
  • the interrogation engine 510 may interrogate the data store 106 , which may represent one or more data sources and may include any of a variety of data storage devices and/or schema.
  • the step 630 may include interrogation of the data store 106 based on the information request 540 to provide the data types 550 stored within the data store 106 .
  • the interrogation engine 510 may receive data from the data store 106 , and may use the semantic layer 400 to assist with interrogation.
  • the data map 300 and/or the heuristic algorithm 570 may additionally or alternatively be referenced by the interrogation engine 510 in the performance of the step 630 .
  • the categorization engine 520 may categorize the data types 550 from the data store 106 , as provided by the interrogation engine 510 , to define the data recipe ingredients 210 that may be components of the data recipe 200 that is to be generated to satisfy the information request 540 .
  • performance of the step 640 may entail usage of the semantic layer 400 , the heuristic algorithm 570 , and/or the data map 300 .
  • the data recipe ingredients 210 may be the actual and only data recipe ingredients 210 of the data recipe 200 that satisfies the information request 540 , or they may be over-inclusive (i.e., including more data recipe ingredients 210 than the data recipe 200 will need), or may even be under-inclusive in the event that one or more steps of the method 600 , including the step 640 , are to be performed recursively to supply additional data recipe ingredients 210 .
  • the method 600 may then proceed to a step 650 , in which the data recipe ingredients 210 received from the step 640 are compared by the comparison engine 530 with reference data recipes. As set forth in the discussion of FIG. 5 , this may entail comparison of the data recipe ingredients 210 received from the categorization engine 520 with those stored within the data map 300 . As indicated in the description of FIG. 5 , performance of the step 650 may entail usage of the semantic layer 400 , the heuristic algorithm 570 , and/or the data map 300 .
  • the system 500 may determine whether one of the reference data recipes can be modified to create the data recipe 200 that will satisfy the information request 540 . Notably, it may be possible to create any data recipe 200 as a modification of one or more data recipes if enough modification is done. Thus, the query 660 may compare the likelihood of success and/or computational time required to modify one or more of the reference data recipes, with the likelihood of success and/or computational time required if the data recipe 200 is to be created independently of the reference data recipes.
  • the method 600 may progress to a step 662 in which the data recipe 200 that satisfies the information request 540 is created by modifying the one or more reference data recipes. Conversely, if the data recipe 200 that satisfies the information request 540 is to be created “from scratch,” the method 600 may progress to a step 664 in which the data recipe 200 is created independently of the reference data recipes.
  • the result is the creation of a new data recipe 200 .
  • This data recipe 200 may optionally be presented to the user 100 for approval and/or modification.
  • a query 670 may determine whether the user 100 approves the new data recipe 200 without modification. If the user 100 does not approve the data recipe 200 , or provides modifications, either via explicit or implicit user feedback, the method 600 may return to the step 630 and once again query the data store 106 for data types.
  • the step 630 , the step 640 , and/or the step 650 may again be performed, but if desired, may incorporate the feedback provided by the user 100 . If no user feedback has been obtained, settings applicable to the step 630 , the step 640 , and/or the step 650 may be modified so that the resulting data recipe 200 is different from that obtained previously. For example, the data map 300 , the semantic layer 400 , and/or the heuristic algorithm 570 may operate on different settings from those used to obtain the data recipe 200 rejected by the user 100 .
  • the method 600 may proceed to a step 680 in which the data map 300 is updated to include the data recipe 200 .
  • the data recipe 200 may be recorded in the data map 300 as one of the reference data recipes that can be the basis of comparison for future iterations of the step 650 , and may be modified to obtain a new data recipe 200 .
  • the semantic layer 400 and/or the heuristic algorithm 570 may also be updated to reflect the data recipe 200 and/or any adjustments needed. Such adjustments may be made pursuant to known artificial intelligence and/or machine learning techniques based on the results of previous steps and/or queries of the method 600 .
  • the data recipe 200 may then be provided to the user 100 .
  • the user 100 may use the data recipe 200 , one time or repeatedly, to obtain the requested information 580 .
  • the method 600 may proceed to a step 690 in which the system 500 follows the data recipe 200 to obtain the requested information 580 and provide the requested information 580 to the user 100 . This may also be done one time or repeatedly as desired by the user 100 .
  • the method 600 may then end 699 .
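  • The control flow of the method 600 can be summarized in a short skeleton. Every callable passed in is a hypothetical stand-in for the corresponding engine or user interaction, the dictionary used for a recipe is an assumed representation, and the `max_rounds` limit is an assumption added so the loop terminates; the description itself does not bound the number of iterations.

```python
def generate_recipe(request, interrogate, categorize, compare, approve,
                    data_map, max_rounds=3):
    """Skeleton of the flow in FIG. 6; all callables are hypothetical stand-ins."""
    feedback = None
    for _ in range(max_rounds):
        data_types = interrogate(request, feedback)   # step 630
        ingredients = categorize(data_types)          # step 640
        reference = compare(ingredients, data_map)    # step 650 and query 660
        # Step 662 when a reference recipe was chosen, step 664 otherwise.
        recipe = {"based_on": reference, "ingredients": ingredients}
        approved, feedback = approve(recipe)          # query 670
        if approved:
            data_map.append(recipe)                   # step 680
            return recipe
    return None


# Hypothetical stubs: the user rejects the first proposal, then approves.
calls = []

def interrogate(request, feedback):
    calls.append(feedback)
    if feedback is None:
        return ["product", "catalog_price"]
    return ["product", "order_price"]

def categorize(data_types):
    return sorted(data_types)

def compare(ingredients, data_map):
    return None  # no suitable reference recipe in this toy run

def approve(recipe):
    ok = "order_price" in recipe["ingredients"]
    return ok, None if ok else "need order price"

data_map = []
recipe = generate_recipe("order price by product", interrogate, categorize,
                         compare, approve, data_map)
```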
  • a wide variety of methods may be used to generate a wide range of data recipes according to the invention.
  • the following example is presented by way of illustration and not limitation to indicate some of the ways in which a system, such as the system 500 of FIG. 5 , may be used to automatically generate a data recipe that provides information requested by a user through the use of a method such as the method 600 of FIG. 6 .
  • FIG. 7 is a block diagram depicting a first KPI recipe 700 used to obtain a first KPI according to one embodiment of the invention.
  • the first KPI may include products sold by a company broken down into product type and catalog price.
  • the first KPI recipe 700 may include first KPI ingredients 710 , which may include product, product type, and catalog price ingredients. More precisely, as illustrated in the breakout of FIG. 7 , the first KPI ingredients 710 may include one or more dimensions 230 and/or one or more measures 240 , such as a product dimension 720 , a product type dimension 730 , and a catalog price measure 740 . The determination of whether each of the first KPI ingredients 710 is a dimension or a measure may be made, for example, using the criteria set forth in the description of FIG. 2 .
  • the product dimension 720 may have a product key element 750 .
  • the product type dimension 730 may have a product type key element 760 . Since it is a measure 240 , the catalog price measure 740 may include the product key element 750 , the product type key element 760 , and a catalog price element 770 .
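  • The structure of the first KPI ingredients 710 can be modeled with a few small types, as in the sketch below. The class names, field names, and string identifiers are assumptions made for illustration; only the relationships (each dimension has a key element, and the measure carries the keys of its dimensions plus its own value element) follow the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dimension:
    name: str
    key: str  # key element of the dimension


@dataclass(frozen=True)
class Measure:
    name: str
    keys: tuple[str, ...]  # key elements of the dimensions it is broken down by
    value: str             # the measured element itself


@dataclass(frozen=True)
class Recipe:
    dimensions: tuple[Dimension, ...]
    measures: tuple[Measure, ...]


# Reference numerals in comments point back to FIG. 7.
product = Dimension("product", key="product_key")                  # 720, 750
product_type = Dimension("product_type", key="product_type_key")   # 730, 760
catalog_price = Measure(                                            # 740, 770
    "catalog_price",
    keys=(product.key, product_type.key),
    value="catalog_price",
)
first_kpi_recipe = Recipe((product, product_type), (catalog_price,))
```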
  • the first KPI recipe 700 may be generated by following the method 600 of FIG. 6 , and/or utilizing the system 500 of FIG. 5 .
  • the first KPI recipe 700 may be obtained by receiving and interrogating one or more data sources such as the data store 106 .
  • the interrogation engine 510 , the categorization engine 520 , and/or the comparison engine 530 may operate with the aid of the data map 300 , the semantic layer 400 , and/or the heuristic algorithm 570 to provide a proposed model, or a proposed data recipe, to the user 100 .
  • the proposed data recipe may have any number of dimensions 230 .
  • the first KPI ingredients 710 include the product dimension 720 and the product type dimension 730 .
  • the proposed data recipe may be presented to the user 100 , and the user 100 may be prompted to accept or reject the proposed data recipe. If accepted, the proposed data recipe may be provided to the user 100 , used to satisfy the information request 540 by providing the requested information 580 , and/or further processed, for example, by the heuristic algorithm 570 for further refinement as additional data is received.
  • FIG. 8 is a block diagram depicting a second KPI recipe 800 used to obtain a second KPI according to one embodiment of the invention.
  • the second KPI may include products sold by a company broken down into product group, order price, and order date.
  • the second KPI may relate to the same product as the first KPI.
  • the second KPI recipe 800 may include second KPI ingredients 810 , which may include product, order price, product group, and order date ingredients.
  • the breakout of FIG. 8 illustrates the pool of data recipe ingredients 210 from which the second KPI ingredients 810 may be selected.
  • the first KPI ingredients 710 may be included in the pool and may be used to provide one or more of the second KPI ingredients 810 .
  • the first KPI recipe 700 may be modified to facilitate the creation of the second KPI recipe 800 .
  • the second KPI ingredients 810 may include one or more dimensions 230 and/or one or more measures 240 , one or more of which may be obtained from or derived from the first KPI ingredients 710 of the first KPI recipe 700 . More specifically, the second KPI ingredients 810 may include the product dimension 720 from the first KPI ingredients 710 . Additionally, the second KPI ingredients 810 may include an order date dimension 820 , a product group dimension 830 , and an order price measure 840 .
  • the order date dimension 820 may have an order date key element 850 .
  • the product group dimension 830 may have a product group key element 860 . Since it is a measure 240 , the order price measure 840 may have the product key element 750 , the order date key element 850 , the product group key element 860 , and an order price element 870 .
  • the second KPI recipe 800 may also be obtained through the use of the system 500 of FIG. 5 and/or the method 600 of FIG. 6 . This may be facilitated via comparison with the first KPI recipe 700 .
  • the data definition for the second KPI recipe 800 may be compared with the first KPI recipe 700 .
  • One or more dimensions, such as the order date dimension 820 and the product group dimension 830 , may be added.
  • One or more new measures, such as the order price measure 840 , may be created by combining data from the first KPI ingredients 710 , such as the product key element 750 , with data from the other second KPI ingredients 810 , such as the order date key element 850 and the product group key element 860 , and adding the order price element 870 .
  • Such a data recipe modification technique may beneficially allow a data recipe to be modified at runtime, without changing structures that relied on previous definitions.
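  • The derivation of the second KPI recipe 800 from the first KPI recipe 700, without altering the first, might be sketched as follows. The dictionary layout, the function `extend_recipe`, and the identifier strings are assumptions made for illustration; the point shown is that the reference recipe is read but never mutated, so structures relying on it remain valid.

```python
import copy


def extend_recipe(base, reuse, new_dimensions, measure_name):
    """Build a new recipe from a reference recipe without mutating the reference."""
    # Carry over only the dimensions that still apply.
    dimensions = {name: base["dimensions"][name] for name in reuse}
    dimensions.update(new_dimensions)
    # The new measure is keyed by every dimension key plus its own value element.
    measure = {"keys": tuple(dimensions.values()), "value": measure_name}
    return {"dimensions": dimensions, "measures": {measure_name: measure}}


# Hypothetical representation of the first KPI recipe.
first = {
    "dimensions": {"product": "product_key", "product_type": "product_type_key"},
    "measures": {
        "catalog_price": {
            "keys": ("product_key", "product_type_key"),
            "value": "catalog_price",
        }
    },
}
snapshot = copy.deepcopy(first)

second = extend_recipe(
    first,
    reuse=["product"],
    new_dimensions={
        "order_date": "order_date_key",
        "product_group": "product_group_key",
    },
    measure_name="order_price",
)
```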
  • the second KPI recipe 800 may be obtained by receiving and interrogating one or more data sources such as the data store 106 (or alternatively, one or more data sources different from those used to generate the first KPI recipe 700 ).
  • the interrogation engine 510 , the categorization engine 520 , and/or the comparison engine 530 may operate with the aid of the data map 300 , the semantic layer 400 , and/or the heuristic algorithm 570 to provide a proposed model, or a proposed data recipe to obtain the second KPI, to the user 100 .
  • the comparison engine 530 may operate by comparing data recipe ingredients 210 obtained from the categorization engine 520 with the first KPI ingredients 710 to obtain the pool of potential data recipe ingredients shown in FIG. 8 . Then, the comparison engine 530 may modify the first KPI ingredients 710 applicable to the second KPI, and add new data recipe ingredients from the data recipe ingredients 210 obtained from the categorization engine 520 , to obtain the second KPI ingredients 810 .
  • the proposed data recipe for the second KPI may be presented to the user 100 , and the user 100 may be prompted to accept or reject the proposed data recipe. If accepted, the proposed data recipe may be provided to the user 100 , used to satisfy the information request 540 for the second KPI by providing the requested information 580 , and/or further processed, for example, by the heuristic algorithm 570 for further refinement as additional data is received.
  • the use of the heuristic algorithm 570 may beneficially avoid the need for the user 100 to manually map new data elements such as those of the first KPI ingredients 710 and the second KPI ingredients 810 . Rather, the system 500 may iteratively present options to the user 100 until the user 100 approves of a proposed data recipe.
  • the present invention can be implemented as a system or a method for performing the above-described techniques, either singly or in any combination.
  • the present invention can be implemented as a computer program product comprising a non-transitory computer-readable storage medium and computer program code, encoded on the medium, for causing a processor in a computing device or other electronic device to perform the above-described techniques.
  • Certain aspects of the present invention include process steps and instructions described herein in the form of an algorithm. It should be noted that the process steps and instructions of the present invention can be embodied in software, firmware and/or hardware, and when embodied in software, can be downloaded to reside on and be operated from different platforms used by a variety of operating systems.
  • the present invention also relates to an apparatus for performing the operations herein.
  • This apparatus may be specially constructed for the required purposes, or it may comprise a general-purpose computing device selectively activated or reconfigured by a computer program stored in the computing device.
  • a computer program may be stored in a computer readable storage medium, such as, but not limited to, any type of disk including floppy disks, optical disks, CD-ROMs, DVD-ROMs, magnetic-optical disks, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs, flash memory, solid state drives, magnetic or optical cards, application specific integrated circuits (ASICs), or any type of media suitable for storing electronic instructions, and each coupled to a computer system bus.
  • the computing devices referred to herein may include a single processor or may be architectures employing multiple processor designs for increased computing capability.
  • the present invention can be implemented as software, hardware, and/or other elements for controlling a computer system, computing device, or other electronic device, or any combination or plurality thereof.
  • an electronic device can include, for example, a processor, an input device (such as a keyboard, mouse, touchpad, track pad, joystick, trackball, microphone, and/or any combination thereof), an output device (such as a screen, speaker, and/or the like), memory, long-term storage (such as magnetic storage, optical storage, and/or the like), and/or network connectivity, according to techniques that are well known in the art.
  • Such an electronic device may be portable or non-portable.
  • Examples of electronic devices that may be used for implementing the invention include: a mobile phone, personal digital assistant, smartphone, kiosk, server computer, enterprise computing device, desktop computer, laptop computer, tablet computer, consumer electronic device, or the like.
  • An electronic device for implementing the present invention may use any operating system such as, for example and without limitation: Linux; Microsoft Windows, available from Microsoft Corporation of Redmond, Wash.; Mac OS X, available from Apple Inc. of Cupertino, Calif.; iOS, available from Apple Inc. of Cupertino, Calif.; Android, available from Google, Inc. of Mountain View, Calif.; and/or any other operating system that is adapted for use on the device.

Abstract

A data recipe may be automatically generated to provide requested information to a user. After the information is requested, one or more data sources may be interrogated to discover a plurality of data types of data stored in the data sources. The data types may be categorized to define a plurality of data recipe ingredients that are likely to be needed to provide the requested information. The data recipe ingredients may be compared with a reference data recipe. Based on the results of the comparison, a new data recipe that provides the requested information may be made by either modifying the reference data recipe or by proceeding independently of the reference data recipe. The new data recipe may, for example, calculate a key performance indicator used to measure organizational performance.

Description

    CROSS-REFERENCE TO RELATED APPLICATION
  • The Application Data Sheet (“ADS”) filed in the present application is incorporated by reference. Any applications claimed on the ADS for priority under 35 U.S.C. §§ 119, 120, 121, or 365(c), and any and all parent, grandparent, great-grandparent, etc., applications of such applications, are also incorporated by reference, including any priority claims made in those applications and any material incorporated by reference, to the extent such subject matter is not inconsistent herewith. The present application claims priority to: U.S. application Ser. No. 14/257,669 filed on Apr. 21, 2014 and issued as U.S. Pat. No. 10,262,030 on Apr. 16, 2019, and U.S. Provisional Application Ser. No. 61/814,586 for “Automatic Dynamic Reusable Data Recipes,” filed Apr. 22, 2013, each of which is incorporated by reference herein in its entirety.
  • FIELD OF THE INVENTION
  • The present invention relates to systems and methods for generating user-requested information, and more particularly, automated creation of data recipes.
  • DESCRIPTION OF THE RELATED ART
  • Many organizations possess vast troves of data, which may be stored in a variety of locations. Many aspects of organizational activity are often tracked and recorded. Nevertheless, despite the immense quantities of data available, many organizations find that they are unable to locate the information they currently need, particularly when multiple data sets must be combined to provide the information.
  • One example of this is the generation of key performance indicators, or KPIs, that provide metrics for assessing organizational performance. Many businesses use KPIs to make strategic decisions. Unfortunately, in many instances, when a new KPI is requested, considerable work must be done in order to obtain it. For example, a user may have to (1) determine what the component data of the KPI are, (2) determine where this data resides among one or more files, databases, and the like, (3) locate the data, (4) determine how the data should be combined in order to obtain the KPI, and (5) combine the data in the manner determined.
  • This can be exhausting, particularly for a large organization in which the component data for the KPI are managed by multiple individuals. Hence, what may seem to be a simple request for information can be surprisingly difficult to fulfill. This problem may be compounded when, as is often the case, there is no standard format, nomenclature, or other metadata that can be used to automate the search for relevant data. The user must then engage in some analysis to determine where the desired data are likely to reside and how they are likely to be identified in the associated file and/or database. Hence, it is often prohibitively time-consuming for an organization to find and use the data needed. Thus, tools that could be used to enhance the performance and strategic decision-making of the organization are simply not available.
  • Conventionally, data modeling is used to organize and structure data for efficient query and retrieval in various contexts. However, data modeling is typically a highly manual and expert-based task. Conventional data modeling methods may be expensive and labor-intensive, and may be unavailable to many organizations. Furthermore, many known data modeling techniques require significant computational time to resolve.
  • SUMMARY
  • As set forth above, locating, compiling, and processing data needed to provide a requested piece of information can be very difficult and time-consuming. The systems and methods of the present invention may address such difficulty by providing mechanisms for automatically creating a recipe to provide requested information. This may be done without the need for the user to review the associated data source or locate the constituent data.
  • Various embodiments of the present invention may implement dynamic reusable data recipes that allow a data structure to evolve automatically and dynamically over time as new content is added. Such content can be added for later access from a business semantic layer of a software application.
  • In at least one embodiment, the system may operate in contexts where high volumes of new content may be added and accessed/queried on a continual basis, so that a human data architect or team of data modelers would be overwhelmed by the volume and variety of structures that must be developed to support ever-changing needs.
  • In at least one embodiment, the system of the present invention may automatically create and evolve reusable data structures in real-time with little human intervention. One application of such a system is to enable and implement a community-based service that allows members to create new content and share that content with others. This can include, for example, extending existing content to cover new attributes and/or to answer new facets of questions.
  • In at least one embodiment, an interrogation engine, a categorization engine, and a comparison engine may be provided. These engines may, co-operatively, have the capability to evolve data recipes so as to optimize the reuse of existing data recipes and extend them as needed. In this manner, the systems and methods of the present invention may automate the process of generating data structures for efficient data storage and retrieval of query tools. Further details and variations are described herein.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The accompanying drawings illustrate several embodiments of the invention. Together with the description, they serve to explain the principles of the invention according to the embodiments. One skilled in the art will recognize that the particular embodiments illustrated in the drawings are merely exemplary, and are not intended to limit the scope of the present invention.
  • FIG. 1A is a block diagram depicting a hardware architecture for practicing the present invention according to one embodiment of the present invention.
  • FIG. 1B is a block diagram depicting a hardware architecture for practicing the present invention in a client/server environment, according to one embodiment of the present invention.
  • FIG. 2 is a block diagram depicting the structure of a data recipe according to one embodiment of the present invention.
  • FIG. 3 is a block diagram depicting the structure of a data map according to one embodiment of the present invention.
  • FIG. 4 is a block diagram depicting the structure of a semantic layer according to one embodiment of the invention.
  • FIG. 5 is a block diagram depicting a system for carrying out automatic information provision, according to one embodiment of the present invention.
  • FIG. 6 is a flowchart depicting a method of carrying out automatic data recipe generation, according to one embodiment of the present invention.
  • FIG. 7 is a block diagram depicting a first recipe used to obtain a first KPI according to one embodiment of the invention.
  • FIG. 8 is a block diagram depicting a second recipe used to obtain a second KPI according to one embodiment of the invention.
  • DETAILED DESCRIPTION OF THE EMBODIMENTS
  • For illustrative purposes, the systems and methods described and depicted herein may refer to automated generation of data recipes that provide information requested by a user. The data recipes may, in some embodiments, relate to the operation of an enterprise. However, one skilled in the art will recognize that the techniques of the present invention can be applied to many different types of information, and may apply to many different situations apart from the exemplary enterprise operation context mentioned previously.
  • System Architecture
  • According to various embodiments, the present invention can be implemented on any electronic device(s) equipped to receive, store, and present information. Such electronic device(s) may include, for example, one or more desktop computers, laptop computers, smartphones, tablet computers, or the like.
  • Although the invention is described herein in connection with an implementation in a computer, one skilled in the art will recognize that the techniques of the present invention can be implemented in other contexts, and indeed in any suitable device capable of receiving and/or processing user input. Accordingly, the following description is intended to illustrate various embodiments of the invention by way of example, rather than to limit the scope of the claimed invention.
  • Referring now to FIG. 1A, there is shown a block diagram depicting a hardware architecture for practicing the present invention, according to one embodiment. Such an architecture can be used, for example, for implementing the techniques of the present invention in a computer or other device 101. Device 101 may be any electronic device equipped to receive, store, and/or present information, and to receive user input in connection with such information.
  • In at least one embodiment, device 101 has a number of hardware components well known to those skilled in the art. Input device 102 can be any element that receives input from user 100, including, for example, a keyboard, mouse, stylus, touch-sensitive screen (touchscreen), touchpad, trackball, accelerometer, five-way switch, microphone, or the like. Input can be provided via any suitable mode, including for example, one or more of: pointing, tapping, typing, dragging, and/or speech.
  • Data store 106 can be any magnetic, optical, or electronic storage device for data in digital form; examples include flash memory, magnetic hard drive, CD-ROM, DVD-ROM, or the like. In at least one embodiment, data store 106 stores information which may include documents 107 and/or libraries 111 that can be utilized and/or displayed according to the techniques of the present invention, as described below. In another embodiment, documents 107 and/or libraries 111 can be stored elsewhere, and retrieved by device 101 when needed for presentation to user 100. Libraries 111 may include one or more data sets, including a first data set 109, and optionally, a plurality of additional data sets up to an nth data set 119.
  • Display screen 103 can be any element that graphically displays documents 107, libraries 111, and/or the results of steps performed on documents 107 and/or libraries 111 to provide data output incident to automated provision of data recipes. Such data output may include, for example, one or more prompts that request information from the user 100, data, data visualizations, prompts requesting input to confirm and/or modify data recipes, and the like. In at least one embodiment where only some of the desired output is presented at a time, a dynamic control, such as a scrolling mechanism, may be available via input device 102 to change which information is currently displayed, and/or to alter the manner in which the information is displayed.
  • Processor 104 can be a conventional microprocessor for performing operations on data under the direction of software, according to well-known techniques. Memory 105 can be random-access memory, having a structure and architecture as are known in the art, for use by processor 104 in the course of running software.
  • Data store 106 can be local or remote with respect to the other components of device 101. In at least one embodiment, device 101 is configured to retrieve data from a remote data storage device when needed. Such communication between device 101 and other components can take place wirelessly, by Ethernet connection, via a computing network such as the Internet, or by any other appropriate means. This communication with other electronic devices is provided as an example and is not necessary to practice the invention.
  • In at least one embodiment, data store 106 is detachable in the form of a CD-ROM, DVD, flash drive, USB hard drive, or the like. Documents 107 and/or libraries 111 can be entered from a source outside of device 101 into a data store 106 that is detachable, and later displayed after the data store 106 is connected to device 101. In another embodiment, data store 106 is fixed within device 101.
  • Referring now to FIG. 1B, there is shown a block diagram depicting a hardware architecture for practicing the present invention in a client/server environment, according to one embodiment of the present invention. Such an implementation may use a “black box” approach, whereby data storage and processing are done completely independently from user input/output. An example of such a client/server environment is a web-based implementation, wherein client device 108 runs a browser that provides a user interface for interacting with web pages and/or other web-based resources from server 110. Documents 107 and/or libraries 111 can be presented as part of such web pages and/or other web-based resources, using known protocols and languages such as Hypertext Markup Language (HTML), Java, JavaScript, and the like.
  • Client device 108 can be any electronic device incorporating the input device 102 and/or display screen 103, such as a desktop computer, laptop computer, personal digital assistant (PDA), cellular telephone, smartphone, music player, handheld computer, tablet computer, kiosk, game system, or the like. Any suitable type of communications network 113, such as the Internet, can be used as the mechanism for transmitting data between client device 108 and server 110, according to any suitable protocols and techniques. In addition to the Internet, other examples include cellular telephone networks, EDGE, 3G, 4G, long term evolution (LTE), Session Initiation Protocol (SIP), Short Message Peer-to-Peer protocol (SMPP), SS7, Wi-Fi, Bluetooth, ZigBee, Hypertext Transfer Protocol (HTTP), Secure Hypertext Transfer Protocol (SHTTP), Transmission Control Protocol/Internet Protocol (TCP/IP), and/or the like, and/or any combination thereof. In at least one embodiment, client device 108 transmits requests for data via communications network 113, and receives responses from server 110 containing the requested data.
  • In this implementation, server 110 is responsible for data storage and processing, and incorporates data store 106 for storing documents 107 and/or libraries 111. Server 110 may include additional components as needed for retrieving data and/or libraries 111 from data store 106 in response to requests from client device 108.
  • In at least one embodiment, documents 107 are organized into one or more well-ordered data sets, with one or more data entries in each set. Data store 106, however, can have any suitable structure. Accordingly, the particular organization of documents 107 within data store 106 need not resemble the form in which documents 107 are displayed to user 100. In at least one embodiment, an identifying label is also stored along with each data entry, to be displayed along with each data entry.
  • The libraries 111 may include one or more data sources, which may be stored at one or more locations and in one or more formats. In at least one embodiment, libraries 111 are organized in a file system within data store 106. Appropriate indexing can be provided to associate particular documents with particular quantitative data elements, reports, other documents, and/or the like. Libraries 111 may include any of a wide variety of data structures known in the database arts. As in FIG. 1A, libraries 111 may include one or more data sets, including a first data set 109, and optionally, a plurality of additional data sets up to an nth data set 119.
  • Documents 107 and/or libraries 111 can be retrieved from client-based or server-based data store 106, and/or from any other source. In at least one embodiment, input device 102 is configured to receive data entries from user 100, to be added to documents 107 and/or libraries 111 held in data store 106. User 100 may provide such data entries via the hardware and software components described above according to means that are well known to those skilled in the art.
  • In at least one embodiment, the information displayed on display screen 103 may include data in text and/or graphical form. Such data may comprise visual cues, such as height, distance, and/or area, to convey the value of each data entry. In at least one embodiment, labels accompany data entries on display screen 103, or can be displayed when user 100 taps on or clicks on a data entry, or causes an onscreen cursor to hover over a data entry.
  • Furthermore, as described in more detail below, display screen 103 can selectively present a wide variety of data related to automated data recipe generation. In particular, as described herein, user 100 can provide input, such as a selection from a menu containing a variety of options, to determine the various characteristics of the information presented such as the type, scope, and/or format of the information to be displayed on display screen 103.
  • In one embodiment, the system can be implemented as software written in any suitable computer programming language, whether in a standalone or client/server architecture. Alternatively, it may be implemented and/or embedded in hardware.
  • Data Recipe, Data Map, and Semantic Layer Structure
  • In general, a “data recipe” may include any instruction set that enables requested information to be obtained from other data in one or more data sources. A wide variety of data types and processes may be included in a data recipe. Each piece of data used by the data recipe (i.e., each “data recipe ingredient”) may be of any desired length and format. Thus, each piece of data may be a character string, integer, floating point number, or any other type of data, and may thus represent any information such as names, times, dates, currency amounts, percentages, fractions, physical dimensions, or any other data that may desirably be stored in a computer.
  • Referring to FIG. 2, a block diagram depicts the structure of a data recipe 200 according to one embodiment of the present invention. The data recipe 200 may include one or more data recipe ingredients 210 and one or more data processes 220 that describe how the data recipe ingredients 210 are to be combined and/or manipulated to obtain the requested information. Thus, the data recipe ingredients 210 may include data of any of the types listed in the preceding paragraph, and the data processes 220 may include one or more formulas or other combination instructions indicating how the data recipe ingredients 210 may be used to obtain the requested information. If the data recipe ingredients 210 include numbers, the data processes 220 may include mathematical formulas or the like.
  • The data recipe ingredients 210 may include one or more dimensions 230 and/or one or more measures 240. In general, a measure 240 is a property on which calculations can be made, while a dimension 230 is a data set that can be used for structured labeling of measures. For example, dimensions 230 may represent data that would likely be used as the scale on a data visualization, such as the X-axis on a conventional bar chart or line chart. The measures 240 may represent data that would likely be used for the measurements on a data visualization, such as the vertical displacements of the bars or points of a conventional bar chart or line chart.
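  • The relationship between data recipe ingredients, dimensions, measures, and data processes described above can be illustrated with a minimal sketch. The class names `Ingredient` and `DataRecipe`, the `role` field, and the sample month/revenue data are hypothetical and are not taken from the specification; they only show one way a recipe could bundle ingredients with the processes that combine them.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Ingredient:
    """One data recipe ingredient (hypothetical representation)."""
    name: str
    role: str            # "dimension" or "measure"
    values: List[object]


@dataclass
class DataRecipe:
    """Ingredients plus the processes that combine them (illustrative only)."""
    ingredients: Dict[str, Ingredient]
    processes: List[Callable[[Dict[str, Ingredient]], object]] = field(default_factory=list)

    def run(self) -> List[object]:
        # Apply each data process to the full set of ingredients.
        return [process(self.ingredients) for process in self.processes]


def total_revenue(ingredients):
    # A data process expressed as a simple mathematical formula on a measure.
    return sum(ingredients["revenue"].values)


def revenue_by_month(ingredients):
    # A data process that labels a measure with a dimension.
    return dict(zip(ingredients["month"].values, ingredients["revenue"].values))


# Made-up sample data for illustration.
months = Ingredient("month", "dimension", ["Jan", "Feb", "Mar"])
revenue = Ingredient("revenue", "measure", [100.0, 150.0, 125.0])

recipe = DataRecipe(
    ingredients={"month": months, "revenue": revenue},
    processes=[total_revenue, revenue_by_month],
)
results = recipe.run()
```

  • In this sketch, the dimension plays the role of the X-axis labels of a bar chart and the measure supplies the bar heights.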
  • According to one example, a set of rules may be used to define which of the data recipe ingredients 210 are dimensions 230 and which are measures 240. One example of such a set of rules is as follows:
      • Any of the data recipe ingredients 210 that represent dates may be classified as dimensions 230.
      • Any of the data recipe ingredients 210 that do not represent dates but are alpha-numeric may also be classified as dimensions 230.
      • Any of the data recipe ingredients 210 that are numeric and do not have a name that includes “ID,” “type,” “group,” “category,” or a similar descriptor may be measures 240.
      • Any of the data recipe ingredients 210 that are numeric and have at least a threshold number of distinct values, such as twenty distinct values, may be measures 240.
  • Such a rule set may grow in sophistication over time. In at least one embodiment, machine learning techniques and/or other techniques may be used to automatically grow, refine, and/or otherwise develop the rule set used to classify the data recipe ingredients 210 as dimensions 230 or measures 240. In alternative embodiments, the dimensions 230 and the measures 240 may be defined according to a wide variety of alternative definitions or rules, or alternatively, other classifications (or no classifications) may be applied to the data recipe ingredients 210.
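  • The exemplary rule set above could be sketched as follows. This is only one possible reading: the specification does not say how the two numeric rules combine, so the sketch assumes a numeric ingredient is a measure only when both rules are satisfied, and a dimension otherwise. The function name `classify`, the name tokenizer, and the constant names are hypothetical.

```python
import re
from datetime import date, datetime
from numbers import Number

# Descriptor words named in the rule set; "or a similar descriptor" is not modeled.
DESCRIPTOR_WORDS = {"id", "type", "group", "category"}
# Example threshold from the text (twenty distinct values).
MIN_DISTINCT = 20


def name_tokens(name):
    """Split names such as 'customer_id' or 'CustomerID' into lowercase words."""
    return {t.lower() for t in re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", name)}


def classify(name, values):
    """Return 'dimension' or 'measure' for one ingredient (illustrative sketch)."""
    non_null = [v for v in values if v is not None]

    # Rule 1: dates are dimensions.
    if non_null and all(isinstance(v, (date, datetime)) for v in non_null):
        return "dimension"

    is_numeric = bool(non_null) and all(
        isinstance(v, Number) and not isinstance(v, bool) for v in non_null
    )
    if is_numeric:
        # Rules 3 and 4, combined with AND (an assumption of this sketch).
        looks_like_label = bool(name_tokens(name) & DESCRIPTOR_WORDS)
        enough_distinct = len(set(non_null)) >= MIN_DISTINCT
        if not looks_like_label and enough_distinct:
            return "measure"
        return "dimension"

    # Rule 2: remaining (alphanumeric) ingredients are dimensions.
    return "dimension"
```

  • For instance, under these assumptions a numeric column named `customer_id` would be treated as a dimension regardless of how many distinct values it holds, while a `revenue` column with many distinct values would be treated as a measure.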
  • Referring to FIG. 3, a block diagram depicts the structure of a data map 300 according to one embodiment of the present invention. The data map 300 may be designed to identify the data recipe ingredients 210 within one or more data stores 106.
  • As shown, the data map 300 may include metadata 302, which may include records for one or more reference data recipes that are to be mapped to data sources such as the data store 106. More specifically, the metadata 302 may include a first record 310 pertaining to a first reference data recipe and optionally, one or more additional records pertaining to one or more additional reference data recipes up to an nth record 320 pertaining to an nth reference data recipe.
  • The first record 310 may include first reference data recipe information 330 pertaining to the first reference data recipe. The first reference data recipe information 330 may include a name or other indicator of the first reference data recipe, the information the first reference data recipe is designed to provide, data processes 220 associated with the first reference data recipe, or the like.
  • Additionally, the first record 310 may include the data recipe ingredients 210 of the first reference data recipe, which may include a first data recipe ingredient 340 and optionally, one or more additional data recipe ingredients up to an nth data recipe ingredient 342. For each of the data recipe ingredients 210 of the first reference data recipe, the metadata 302 may contain a mapping indicating where the data of each data recipe ingredient 210 can be found in one or more data stores 106.
  • More specifically, the metadata 302 may also contain, for each of the data recipe ingredients 210 of the first reference data recipe, a data mapping including a first data mapping 350 for the first data recipe ingredient 340 and optionally, one or more additional data mappings up to an nth data mapping 352 for the nth data recipe ingredient 342. The first data mapping 350 may indicate the location of the first data recipe ingredient 340 in one or more data stores 106. Similarly, the nth data mapping 352 may indicate the location of the nth data recipe ingredient 342 in one or more data stores 106.
  • Similarly, the nth record 320 may include nth reference data recipe information 360 pertaining to the nth reference data recipe. The nth reference data recipe information 360 may include a name or other indicator of the nth reference data recipe, the information the nth reference data recipe is designed to provide, data processes 220 associated with the nth reference data recipe, or the like.
  • Additionally, the nth record 320 may include the data recipe ingredients 210 of the nth reference data recipe, which may include a first data recipe ingredient 370 and optionally, one or more additional data recipe ingredients up to an nth data recipe ingredient 372. For each of the data recipe ingredients 210 of the nth reference data recipe, the metadata 302 may contain a mapping indicating where the data of each data recipe ingredient 210 can be found in one or more data stores 106.
  • More specifically, the metadata 302 may also contain, for each of the data recipe ingredients 210 of the nth reference data recipe, a data mapping including a first data mapping 380 for the first data recipe ingredient 370 and optionally, one or more additional data mappings up to an nth data mapping 382 for the nth data recipe ingredient 372. The first data mapping 380 may indicate the location of the first data recipe ingredient 370 in one or more data stores 106. Similarly, the nth data mapping 382 may indicate the location of the nth data recipe ingredient 372 in one or more data stores 106.
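  • As a rough illustration of the data map described above, the following sketch stores one record per reference data recipe, with each ingredient name mapped to a location in a data store. The class names, the store/table/column form of a location, and the sample values are assumptions made for this example; the specification does not prescribe how a location is encoded.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class DataMapping:
    """Where one ingredient's data lives (hypothetical store/table/column form)."""
    store: str
    table: str
    column: str


@dataclass
class ReferenceRecipeRecord:
    """One record of the data map: recipe information plus ingredient mappings."""
    name: str
    provides: str                      # the information the recipe is designed to provide
    processes: List[str]               # descriptions of the associated data processes
    mappings: Dict[str, DataMapping]   # ingredient name -> location


class DataMap:
    """Metadata holding the first through nth reference recipe records."""

    def __init__(self):
        self.records: Dict[str, ReferenceRecipeRecord] = {}

    def add(self, record: ReferenceRecipeRecord) -> None:
        self.records[record.name] = record

    def locate(self, recipe_name: str, ingredient: str) -> DataMapping:
        # Look up where a given ingredient of a given reference recipe is stored.
        return self.records[recipe_name].mappings[ingredient]


# Made-up example record.
data_map = DataMap()
data_map.add(
    ReferenceRecipeRecord(
        name="monthly_revenue",
        provides="Revenue per month",
        processes=["sum(revenue) grouped by month"],
        mappings={
            "month": DataMapping("warehouse", "orders", "order_month"),
            "revenue": DataMapping("warehouse", "orders", "total"),
        },
    )
)
location = data_map.locate("monthly_revenue", "revenue")
```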
  • Referring to FIG. 4, a block diagram depicts the structure of a semantic layer 400 according to one embodiment of the invention. The semantic layer 400 may provide pairings between terminology and data to facilitate location of data with a semantic identification, such as a name, description, other semantic metadata, or the like, within one or more data stores 106. The semantic layer 400 may link a “phrase,” which may include one or more words or other semantic elements, with the location within one or more data stores 106, where data corresponding to that phrase may be found.
  • As shown, the semantic layer 400 may have one or more pairings, which may include a first pairing 410 and optionally, one or more additional pairings up to an nth pairing 420. The first pairing 410 may include a first phrase 430 and a first phrase mapping 440, which may indicate one or more locations within one or more data stores 106 where the first phrase 430, or data corresponding to the first phrase 430, may be found.
  • The first pairing 410 may further include a first confidence factor 450 indicating a level of confidence in the link between the first phrase 430 and the first phrase mapping 440. The first confidence factor 450 may, for example, be a number such as a percentage, where 0% indicates no confidence in the link, and 100% indicates absolute certainty that the first phrase mapping 440 is the location of the first phrase 430, or data related to the first phrase 430, within one or more data stores 106. This structure may facilitate machine learning to allow for improved performance in generating and developing data recipes over time. The first confidence factor 450 may be revised over time based on user feedback, further comparisons with the data store 106, or the like.
  • Similarly, the nth pairing 420 may include an nth phrase 460 and an nth phrase mapping 470, which may indicate one or more locations within one or more data stores 106 where the nth phrase 460, or data corresponding to the nth phrase 460, may be found.
  • The nth pairing 420 may further include an nth confidence factor 480 indicating a level of confidence in the link between the nth phrase 460 and the nth phrase mapping 470. Like the first confidence factor 450, the nth confidence factor 480 may be a number or other indicator of a confidence level that the nth phrase mapping 470 is the location of the nth phrase 460, or data related to the nth phrase 460, within one or more data stores 106. The nth confidence factor 480 may also be revised over time based on user feedback, further comparisons with the data store 106, or the like.
  • If desired, the semantic layer 400 may be incorporated into the data map 300. Alternatively, the semantic layer 400 may be independent of the data map 300.
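  • The pairing of phrases, phrase mappings, and confidence factors described above might be sketched as follows. The class and method names are hypothetical, locations are shown as plain strings, and the update rule in `feedback` (moving the confidence a fixed fraction toward 1.0 or 0.0) is an assumption of this example; the specification states only that the confidence factor may be revised over time.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Pairing:
    phrase: str         # one or more words or other semantic elements
    mapping: str        # location of the corresponding data (format assumed)
    confidence: float   # 0.0 = no confidence, 1.0 = absolute certainty


class SemanticLayer:
    def __init__(self, pairings: Iterable[Pairing]):
        self.pairings = list(pairings)

    def lookup(self, phrase: str) -> Optional[Pairing]:
        """Return the highest-confidence pairing for a phrase, if any."""
        candidates = [p for p in self.pairings if p.phrase.lower() == phrase.lower()]
        return max(candidates, key=lambda p: p.confidence, default=None)

    def feedback(self, pairing: Pairing, confirmed: bool, rate: float = 0.2) -> None:
        """Nudge the confidence toward 1.0 (confirmed) or 0.0 (rejected).

        The update rule and the rate are assumptions, not part of the source.
        """
        target = 1.0 if confirmed else 0.0
        pairing.confidence += rate * (target - pairing.confidence)


# Made-up pairings: two candidate locations for the phrase "revenue".
layer = SemanticLayer(
    [
        Pairing("revenue", "sales.orders.total", 0.60),
        Pairing("revenue", "finance.ledger.amount", 0.50),
    ]
)
first_choice = layer.lookup("revenue")
layer.feedback(first_choice, confirmed=False)   # user rejects the first mapping
second_choice = layer.lookup("revenue")
```

  • Under these assumptions, a rejected mapping loses confidence and a competing mapping for the same phrase can overtake it on the next lookup.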
  • Conceptual Architecture
  • In at least one embodiment, the system of the present invention enables automated provision of data recipes for generating, collecting, and/or presenting information requested by users. A data recipe may be formulated by interrogating one or more data sources, such as the data store 106, to obtain data types, categorizing the data types into data recipe ingredients, and comparing the data recipe ingredients with one or more reference data recipes. The new data recipe may then be created independently, or by modifying one or more reference data recipes to obtain the new data recipe. The new data recipe may then be used to generate, collect, and/or present the requested information.
  • Referring to FIG. 5, a block diagram depicts a system 500 for carrying out automatic data recipe generation, according to one embodiment of the present invention. As shown, the system 500 may have an interrogation engine 510, a categorization engine 520, and a comparison engine 530 that may cooperate to generate a data recipe 200. User inputs are shown on the left-hand side of FIG. 5, and outputs to the user are shown on the right-hand side of FIG. 5.
  • As shown, the system 500 may receive an information request 540 from the user 100. The information request 540 may indicate one or more pieces of information desired by the user 100. The information request 540 may be for any type of information. The information request 540 may be provided via the input device 102, and may include one or more numbers, phrases, natural language questions, menu selections, and/or a variety of other user input elements.
  • In some embodiments, the system 500 may be incorporated into a business intelligence system. The information request 540 may relate to organizational performance, and may more specifically be a key performance indicator (KPI).
  • Key performance indicators are performance measurement indicators that can be used to evaluate success of an enterprise, e.g., an entity, activity, organization, or group. KPI reports, which summarize key performance indicators, can be very useful in management of an enterprise so that effective decisions can be made regarding business strategy and resource allocation. KPIs can be compiled to create a “dashboard,” which is a compiled snapshot of the most important aspects of the operation of the enterprise.
  • Generally, KPIs are numerically measurable aspects of the operation of an enterprise. Some KPIs are well-known and apply to a wide range of businesses. However, the most important KPIs for a business are often highly industry-specific, enterprise-specific, or even department-specific. In management, the challenge is to know which KPIs to focus on. Often, the process of finding the best KPIs to use is an iterative one in which one set of KPIs is utilized, and then refreshed by adding KPIs to and/or removing KPIs from the set under review. The speed at which this process occurs is often limited by the ability of an organization to locate and/or process the data required to calculate the KPIs of interest.
  • Thus, if the information request 540 is for a KPI, the system 500 may beneficially provide the user with a data recipe 200 that can be used to locate and properly process the data necessary to obtain the KPI. This may be performed in several stages.
  • In at least one embodiment, the system 500 of the present invention may apply a sophisticated set of heuristics within a software application. The interrogation engine 510, the categorization engine 520, and/or the comparison engine 530 may utilize a heuristic algorithm 570 to perform any of a number of tasks including, but not limited to, creation and/or implementation of the data map 300 and/or the semantic layer 400. FIG. 5 illustrates a connection between the heuristic algorithm 570 and the categorization engine 520, but the interrogation engine 510 and/or the comparison engine 530 may also utilize the heuristic algorithm 570, if desired.
  • The information request 540 may be received by the interrogation engine 510. The interrogation engine 510 may interrogate one or more data sources, which are exemplified by the data store 106 in FIG. 5. The interrogation engine 510 may identify data types 550 present within the data store 106. This may optionally be done with the aid of the semantic layer 400, which may help to map semantic elements of the information request 540 and/or the data types 550 to corresponding data within the data store 106. Data interrogation may be done based on semantics so that the data types 550 will conform to semantic archetypes. Alternatively, the interrogation engine 510 may function independently of the semantic layer 400.
  • In at least one embodiment, the interrogation engine 510 may receive data from the data store 106 and parse the data into a NoSQL tree structure for processing. If desired or necessary, the schema of the NoSQL tree structure can be approximated based on the structure of the data store 106.
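  • One way to approximate a schema from tree-structured data is to walk each record and note which value types appear at each path. The sketch below assumes the data has already been parsed into nested Python dictionaries and lists (for example, from JSON documents); the function name `infer_schema`, the dotted path notation, and the sample records are hypothetical and are not drawn from the specification.

```python
def infer_schema(records):
    """Map each leaf path in a list of nested records to the set of type names seen."""
    schema = {}

    def walk(node, path):
        if isinstance(node, dict):
            for key, value in node.items():
                walk(value, f"{path}.{key}" if path else key)
        elif isinstance(node, list):
            # All list items share one path, marked with "[]".
            for item in node:
                walk(item, path + "[]")
        else:
            schema.setdefault(path, set()).add(type(node).__name__)

    for record in records:
        walk(record, "")
    return schema


# Made-up sample records.
records = [
    {"order": {"id": 1, "total": 9.5}, "tags": ["new"]},
    {"order": {"id": 2, "total": None}},
]
schema = infer_schema(records)
```

  • The resulting mapping of paths to observed types could then serve as the list of data types passed on for categorization.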
  • The data types 550 may be provided to the categorization engine 520, which may categorize the data types 550 to determine which of the data types 550 are data recipe ingredients 210 of the data recipe 200 to be created. Like the interrogation engine 510, the categorization engine 520 may operate with the aid of the semantic layer 400, or independently of the semantic layer 400.
  • The categorization engine 520 may function by, for example, using the heuristic algorithm 570 to identify the dimensions 230 and/or measures 240 of the data. Relationships between data elements within the schema may be inferred from the structure of the data store 106, if possible. This may be accomplished, for example, by interrogating the data store 106 to find ingredients in common among different entities.
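  • Finding ingredients in common among different entities could be sketched as a pairwise intersection of ingredient names. The function name `shared_ingredients` and the sample entities are hypothetical, and matching by exact name is a simplifying assumption; a fuller implementation might also compare data types or sample values.

```python
from itertools import combinations


def shared_ingredients(entities):
    """Return {(entity_a, entity_b): [common ingredient names]} for related pairs.

    `entities` maps an entity (e.g. table) name to its ingredient (column) names.
    """
    links = {}
    for (name_a, cols_a), (name_b, cols_b) in combinations(sorted(entities.items()), 2):
        common = set(cols_a) & set(cols_b)
        if common:
            links[(name_a, name_b)] = sorted(common)
    return links


# Made-up entities for illustration.
entities = {
    "orders": ["order_id", "customer_id", "total"],
    "customers": ["customer_id", "name"],
    "products": ["sku", "price"],
}
links = shared_ingredients(entities)
```

  • Here the shared `customer_id` ingredient suggests a relationship between the two entities that contain it, while the third entity is left unrelated.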
  • The data recipe ingredients 210 may be provided to the comparison engine 530, which may compare the data recipe ingredients 210 with one or more reference data recipes to determine whether one or more of the reference data recipes can be modified to yield the data recipe 200 that provides the requested information. The reference data recipes may be stored in the data map 300, for example, in the first reference data recipe information 330 through the nth reference data recipe information 360. Thus, the comparison engine 530 may compare the data recipe ingredients 210 with the data map 300, or more precisely, with the data recipe ingredients stored within the data map 300. This may entail comparing the data recipe ingredients 210 with the first data recipe ingredient 340 through the nth data recipe ingredient 342 and the data recipe ingredients of the other data recipes, up to the first data recipe ingredient 370 through the nth data recipe ingredient 372.
  • The determination of whether to modify one of the reference data recipes stored in the data map 300 may be made, for example, based on the degree of similarity between the data recipe ingredients 210 for the data recipe 200 desired, and the data recipe ingredients 210 of the corresponding reference data recipe. Thus, for example, if the data recipe ingredients 210 received by the comparison engine 530 are very similar to the first data recipe ingredient 340 through the nth data recipe ingredient 342 of the first record 310 of the data map 300, it may be easier to modify the corresponding first reference data recipe to obtain the data recipe 200 that satisfies the information request 540. Conversely, if none of the reference data recipes stored in the data map 300 have data recipe ingredients 210 that are similar to the data recipe ingredients 210 received from the categorization engine 520, the data recipe 200 may be created independently of any of the reference data recipes.
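  • The modify-or-create decision described above can be illustrated with a simple set-similarity measure. The specification does not name a particular similarity metric; the sketch below substitutes Jaccard similarity over ingredient names, and the 0.5 threshold, the function names, and the sample recipes are all assumptions made for this example.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections of ingredient names."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def choose_reference(requested, references, threshold=0.5):
    """Decide whether to modify the closest reference recipe or create a new one.

    Returns (action, reference_name, score); the threshold is an assumed value.
    """
    best_name, best_score = None, 0.0
    for name, ingredients in references.items():
        score = jaccard(requested, ingredients)
        if score > best_score:
            best_name, best_score = name, score
    if best_score >= threshold:
        return ("modify", best_name, best_score)
    return ("create", None, best_score)


# Made-up reference recipes.
references = {
    "monthly_revenue": ["month", "revenue", "region"],
    "headcount": ["department", "employee_id"],
}
similar = choose_reference(["month", "revenue", "product"], references)
unrelated = choose_reference(["sku", "price"], references)
```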
  • Like the interrogation engine 510 and the categorization engine 520, the comparison engine 530 may operate with the aid of the semantic layer 400, or independently of the semantic layer 400. In one example, the comparison engine 530 may use the heuristic algorithm 570 to attempt to match each element in the schema to the semantic layer 400 based, for example, on name, data type, sample data set, and/or any other suitable data elements.
  • In at least one embodiment, the comparison engine 530 may use the data map 300. As set forth previously, the data map 300 may describe one or more reference data recipes, such as the first through nth reference data recipes of FIG. 3, and may describe how each reference data recipe correlates to the data store 106. In this manner, the comparison engine 530 may determine how a data recipe can be applied to a different data source, used to power a data visualization, or otherwise used in a manner different from that of its reference data recipe.
  • If desired, comparison of the data recipe ingredients 210 received from the categorization engine 520 with those of the reference data recipes of the data map 300 may be made by comparing attributes of elements of the data recipe ingredients 210 with corresponding attributes of the reference data recipes. The following is an exemplary list of attributes that can be associated with one another in accordance with the techniques of the present invention:
      • A name of the element;
      • A data type of the element;
      • A display name of the element;
      • An alias or tag name of the element, which may yield more commonalities for automated data recipe creation than elements such as the source column name;
      • A metadata tag associated with the element;
      • An average size of data of the element;
      • A level of uniformity of the element, which may, for example, be calculated as the number of distinct data points within the data divided by the total number of data points within the data;
      • A cleanliness level of the element, which may be determined, for example, by:
        • A presence and/or prevalence of NULLs in relationship keys;
        • A presence and/or prevalence of malformed data based on assumed data types; and/or
        • A presence and/or prevalence of multiple variations of a standard data point;
      • An anticipated relationship between the element and other data, which may be determined by, for example:
        • Groupings via many-to-many tag structures (related measures may share related tags)
        • Relationships to possible dimensions (which may include percentages of probable matches or the like)
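  • Several of the listed attributes can be computed directly from a sample of an element's values. The sketch below computes the level of uniformity as defined above (distinct data points divided by total data points), together with a NULL prevalence and an average size. Excluding NULLs from the distinct count, measuring size in characters, and the name `profile_element` are assumptions of this sketch.

```python
from typing import Dict, Optional, Sequence


def profile_element(values: Sequence[Optional[str]]) -> Dict[str, float]:
    """Compute simple comparison attributes for one data element."""
    total = len(values)
    if total == 0:
        return {"uniformity": 0.0, "null_ratio": 0.0, "average_size": 0.0}
    present = [v for v in values if v is not None]
    # Average length, in characters, of the non-NULL values.
    average_size = sum(len(v) for v in present) / len(present) if present else 0.0
    return {
        # Distinct non-NULL data points divided by the total number of data points.
        "uniformity": len(set(present)) / total,
        # Prevalence of NULLs, one of the cleanliness indicators.
        "null_ratio": (total - len(present)) / total,
        "average_size": average_size,
    }


profile = profile_element(["A", "A", "B", None])
```

  • For the four sample values shown, two distinct non-NULL values out of four data points give a uniformity of 0.5, and one NULL out of four gives a NULL prevalence of 0.25.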
  • The foregoing is merely exemplary; the comparison engine 530 may utilize any of a wide variety of data comparison techniques known in the art. In at least one embodiment, the system 500 may gather feedback from the user 100 at one or more points in the process via direct interaction (prompt and response) and/or by monitoring manual adjustments to generated content (changes in the state of the model). In one embodiment, the system 500 may provide the data recipe 200 to the user for approval or rejection. If the user 100 rejects the data recipe 200, the data recipe 200 may be revised, for example, by further iterations with the interrogation engine 510, the categorization engine 520, and/or the comparison engine 530.
  • Once the data recipe 200 has been provided to the satisfaction of the user 100, it may be used to adjust the heuristic algorithm 570 through the use of machine learning or other techniques. The data recipe 200 may also be added to the data map 300 as one of the reference data recipes that may be modified in the process of generating future data recipes.
  • Additionally or alternatively, the semantic layer 400 may also be adjusted as the system 500 operates. These adjustments to the semantic layer 400 may be made based upon user feedback, or through the use of machine learning techniques or the like. Such changes to the semantic layer 400 may then be fed back through the system 500 to re-evaluate previously applied schema and thereby improve the performance of the system 500 for future data recipe generation.
  • If desired, the data recipe 200 may be provided to the user 100. The user 100 may then use the data recipe 200 to fulfill the information request 540. The data recipe 200 may be used repeatedly to obtain the requested information 580 as circumstances change. For example, if the information request 540 represents a KPI, the data recipe 200 may be used to obtain the KPI and then update and/or review it according to an interval desired by the user 100.
  • Additionally or alternatively, the system 500 may also be designed to apply the data recipe 200 to fulfill the information request 540 by providing the requested information 580 automatically for the user 100. The system 500 may do this only once, or at any interval desired by the user 100. If desired, the system 500 may provide the user 100 with the requested information 580 and then receive feedback based on the requested information 580. For example, returning to the example of a KPI, if the user deems that the requested information 580 is not the KPI that was requested, the user 100 may provide feedback that causes the system 500 to revise the data recipe 200 to obtain the requested information 580 again from the new data recipe.
  • Additionally or alternatively, after approval of the new data recipe 200, the new data recipe 200 may be further refined as needed. This may be done, for example, by additional iterations through the interrogation engine 510, the categorization engine 520, the comparison engine 530, and/or the heuristic algorithm 570.
  • Automatic Data Recipe Generation
  • Referring to FIG. 6, a flowchart depicts a method 600 of carrying out automatic data recipe generation, according to one embodiment of the present invention. The method 600 may be carried out, at least in part, by the system 500 as in FIG. 5, or with a differently-configured data recipe provision system. The method 600 may be performed in connection with input from the user 100; such a user 100 may be a developer, customer, enterprise leader, sales representative for business intelligence services, or any other individual. FIG. 6 illustrates a series of steps in a certain order, but those of skill in the art will recognize that these steps may be re-ordered, omitted, replaced with other steps, or supplemented with additional steps, consistent with the spirit of the invention.
  • The method 600 may utilize any suitable source of data, such as for example a spreadsheet, database, website, blog, whitepaper, report, key performance indicator (KPI), dashboard, and/or the like, which may provide the data store 106 illustrated in FIG. 5. It may then apply a rules-based algorithm to the data store 106 to interrogate the data store 106, discover the data types contained, categorize the data types into logical data recipe ingredients, compare the data recipe ingredients to existing reference data recipes, and either extend the existing reference data recipes as needed or create a new data recipe.
  • The method 600 may start 610 with a step 620 in which the information request 540 is received from the user 100. As mentioned in connection with FIG. 5, this may be done in many ways and with any of a wide variety of input devices 102.
  • Then in a step 630, the interrogation engine 510 may interrogate the data store 106, which may represent one or more data sources and may include any of a variety of data storage devices and/or schema. The step 630 may include interrogation of the data store 106 based on the information request 540 to provide the data types 550 stored within the data store 106. As set forth in the discussion of FIG. 5, the interrogation engine 510 may receive data from the data store 106, and may use the semantic layer 400 to assist with interrogation. The data map 300 and/or the heuristic algorithm 570 may additionally or alternatively be referenced by the interrogation engine 510 in the performance of the step 630.
  • Then, in a step 640, the categorization engine 520 may categorize the data types 550 from the data store 106, as provided by the interrogation engine 510, to define the data recipe ingredients 210 that may be components of the data recipe 200 that is to be generated to satisfy the information request 540. As mentioned in the description of FIG. 5, performance of the step 640 may entail usage of the semantic layer 400, the heuristic algorithm 570, and/or the data map 300. The data recipe ingredients 210 may be the actual and only data recipe ingredients 210 of the data recipe 200 that satisfies the information request 540, or they may be over-inclusive (i.e., including more data recipe ingredients 210 than the data recipe 200 will need), or may even be under-inclusive in the event that one or more steps of the method 600, including the step 640, are to be performed recursively to supply additional data recipe ingredients 210.
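  • As a minimal sketch of the step 640, the categorization may be expressed as a rule table. The three rules below follow those recited in claim 9 (date and alpha-numeric data types become dimensions; numeric data types become measures); the type labels, the "uncategorized" fallback, and the name `categorize` are assumptions made for illustration.

```python
from typing import Dict

# Rule table: data type -> kind of data recipe ingredient.
RULES: Dict[str, str] = {
    "date": "dimension",
    "alphanumeric": "dimension",
    "numeric": "measure",
}


def categorize(data_types: Dict[str, str]) -> Dict[str, str]:
    """Map each element name to 'dimension' or 'measure'.

    Elements whose data type has no rule are marked 'uncategorized' so that a
    later pass (or the user) can resolve them.
    """
    return {
        name: RULES.get(dtype, "uncategorized")
        for name, dtype in data_types.items()
    }


ingredients = categorize({
    "order_date": "date",
    "product": "alphanumeric",
    "order_price": "numeric",
    "thumbnail": "binary",
})
```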
  • The method 600 may then proceed to a step 650, in which the data recipe ingredients 210 received from the step 640 are compared by the comparison engine 530 with reference data recipes. As set forth in the discussion of FIG. 5, this may entail comparison of the data recipe ingredients 210 received from the categorization engine 520 with those stored within the data map 300. As indicated in the description of FIG. 5, performance of the step 650 may entail usage of the semantic layer 400, the heuristic algorithm 570, and/or the data map 300.
  • Then, in a query 660, the system 500 may determine whether one of the reference data recipes can be modified to create the data recipe 200 that will satisfy the information request 540. Notably, it may be possible to create any data recipe 200 as a modification of one or more data recipes if enough modification is done. Thus, the query 660 may compare the likelihood of success and/or computational time required to modify one or more of the reference data recipes, with the likelihood of success and/or computational time required if the data recipe 200 is to be created independently of the reference data recipes.
  • If one or more reference data recipes are to be modified, the method 600 may progress to a step 662 in which the data recipe 200 that satisfies the information request 540 is created by modifying the one or more reference data recipes. Conversely, if the data recipe 200 that satisfies the information request 540 is to be created “from scratch,” the method 600 may progress to a step 664 in which the data recipe 200 is created independently of the reference data recipes.
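  • The query 660 weighs likelihood of success against computational time for each path. One way to combine the two, offered purely as an assumption of this sketch, is to compare the expected time per successful attempt (time divided by likelihood of success). The names `Estimate`, `expected_cost`, and `query_660`, and the returned labels, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Estimate:
    success_likelihood: float  # in the range (0, 1]
    compute_seconds: float


def expected_cost(estimate: Estimate) -> float:
    """Expected seconds per successful attempt (assumed scoring rule)."""
    if not 0.0 < estimate.success_likelihood <= 1.0:
        raise ValueError("success_likelihood must be in (0, 1]")
    return estimate.compute_seconds / estimate.success_likelihood


def query_660(modify: Estimate, create: Estimate) -> str:
    """Choose between modifying a reference recipe and creating one from scratch."""
    if expected_cost(modify) <= expected_cost(create):
        return "step_662_modify"
    return "step_664_create"
```

  • For example, a modification expected to take 4 seconds with a 0.8 likelihood of success (5 expected seconds) would be preferred over independent creation taking 9 seconds with a 0.9 likelihood (10 expected seconds).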
  • In either case, the result is the creation of a new data recipe 200. This data recipe 200 may optionally be presented to the user 100 for approval and/or modification. A query 670 may determine whether the user 100 approves the new data recipe 200 without modification. If the user 100 does not approve the data recipe 200, or provides modifications, via either explicit or implicit user feedback, the method 600 may return to the step 630 and once again query the data store 106 for data types.
  • The step 630, the step 640, and/or the step 650 may again be performed, but if desired, may incorporate the feedback provided by the user 100. If no user feedback has been obtained, settings applicable to the step 630, the step 640, and/or the step 650 may be modified so that the resulting data recipe 200 is different from that obtained previously. For example, the data map 300, the semantic layer 400, and/or the heuristic algorithm 570 may operate on different settings from those used to obtain the data recipe 200 rejected by the user 100.
  • If the user 100 approves the data recipe 200 generated by the step 662 or the step 664 without modification, the method 600 may proceed to a step 680 in which the data map 300 is updated to include the data recipe 200. The data recipe 200 may be recorded in the data map 300 as one of the reference data recipes that can be the basis of comparison for future iterations of the step 650, and may be modified to obtain a new data recipe 200. If desired, the semantic layer 400 and/or the heuristic algorithm 570 may also be updated to reflect the data recipe 200 and/or any adjustments needed. Such adjustments may be made pursuant to known artificial intelligence and/or machine learning techniques based on the results of previous steps and/or queries of the method 600.
  • The data recipe 200 may then be provided to the user 100. As mentioned previously, the user 100 may use the data recipe 200, one time or repeatedly, to obtain the requested information 580. Additionally or alternatively, the method 600 may proceed to a step 690 in which the system 500 follows the data recipe 200 to obtain the requested information 580 and provide the requested information 580 to the user 100. This may also be done one time or repeatedly as desired by the user 100. The method 600 may then end 699.
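  • The overall control flow of the method 600 may be summarized in the sketch below, which collapses the steps 630 through 664 into a single `generate` callback and models the query 670 as an `approves` callback. The callbacks, the `max_rounds` limit, and the tuple representation of a data recipe are assumptions made for illustration, not features of the described method.

```python
from typing import Callable, List, Tuple

Recipe = Tuple[str, ...]


def run_method_600(
    generate: Callable[[int], Recipe],
    approves: Callable[[Recipe], bool],
    data_map: List[Recipe],
    max_rounds: int = 5,  # assumed safeguard against endless iteration
) -> Recipe:
    """Iterate until the user approves, then record the recipe in the data map."""
    for attempt in range(max_rounds):
        # Steps 630-664; the attempt number stands in for changed settings
        # or incorporated user feedback on later passes.
        recipe = generate(attempt)
        if approves(recipe):          # query 670
            data_map.append(recipe)   # step 680
            return recipe
    raise RuntimeError("no data recipe approved within max_rounds")


# Hypothetical usage: the first proposal is rejected, the second approved.
proposals = [("product",), ("product", "order_price")]
data_map: List[Recipe] = []
approved = run_method_600(
    generate=lambda attempt: proposals[min(attempt, len(proposals) - 1)],
    approves=lambda recipe: "order_price" in recipe,
    data_map=data_map,
)
```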
  • Example
  • A wide variety of methods may be used to generate a wide range of data recipes according to the invention. The following example is presented by way of illustration and not limitation to indicate some of the ways in which a system, such as the system 500 of FIG. 5, may be used to automatically generate a data recipe that provides information requested by a user through the use of a method such as the method 600 of FIG. 6.
  • Referring to FIG. 7, a block diagram depicts a first KPI recipe 700 used to obtain a first KPI according to one embodiment of the invention. The first KPI may include products sold by a company broken down into product type and catalog price.
  • The first KPI recipe 700 may include first KPI ingredients 710, which may include product, product type, and catalog price ingredients. More precisely, as illustrated in the breakout of FIG. 7, the first KPI ingredients 710 may include one or more dimensions 230 and/or one or more measures 240, such as a product dimension 720, a product type dimension 730, and a catalog price measure 740. The determination of whether each of the first KPI ingredients 710 is a dimension or a measure may be made, for example, using the criteria set forth in the description of FIG. 2.
  • As shown, the product dimension 720 may have a product key element 750. The product type dimension 730 may have a product type key element 760. Since it is a measure 240, the catalog price measure 740 may include the product key element 750, the product type key element 760, and a catalog price element 770.
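  • The structure of FIG. 7 as described above may be modeled with a few record types, as in the following sketch. The class names, field names, and the `dangling_keys` consistency check are assumptions made for illustration; only the dimensions, the measure, and their key elements come from the description.

```python
from dataclasses import dataclass
from typing import Set, Tuple


@dataclass(frozen=True)
class Dimension:
    name: str
    key: str  # key element of the dimension


@dataclass(frozen=True)
class Measure:
    name: str
    keys: Tuple[str, ...]  # key elements of the dimensions the measure relates to
    value: str             # element holding the measured quantity


@dataclass(frozen=True)
class Recipe:
    dimensions: Tuple[Dimension, ...]
    measures: Tuple[Measure, ...]

    def dangling_keys(self) -> Set[str]:
        """Keys referenced by measures that no dimension in the recipe supplies."""
        supplied = {d.key for d in self.dimensions}
        referenced = {k for m in self.measures for k in m.keys}
        return referenced - supplied


# First KPI recipe 700: two dimensions and one measure.
first_kpi = Recipe(
    dimensions=(
        Dimension("product", "product_key"),            # 720 / 750
        Dimension("product_type", "product_type_key"),  # 730 / 760
    ),
    measures=(
        Measure("catalog_price", ("product_key", "product_type_key"), "catalog_price"),  # 740 / 770
    ),
)
```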
  • The first KPI recipe 700 may be generated by following the method 600 of FIG. 6, and/or utilizing the system 500 of FIG. 5. Thus, after the user 100 submits the information request 540 for the first KPI, the first KPI recipe 700 may be obtained by receiving and interrogating one or more data sources such as the data store 106. The interrogation engine 510, the categorization engine 520, and/or the comparison engine 530 may operate with the aid of the data map 300, the semantic layer 400, and/or the heuristic algorithm 570 to provide a proposed model, or a proposed data recipe, to the user 100.
  • The proposed data recipe may have any number of dimensions 230. In the example of FIG. 7, the first KPI ingredients 710 include the product dimension 720 and the product type dimension 730. If desired, the proposed data recipe may be presented to the user 100, and the user 100 may be prompted to accept or reject the proposed data recipe. If accepted, the proposed data recipe may be provided to the user 100, used to satisfy the information request 540 by providing the requested information 580, and/or further processed, for example, by the heuristic algorithm 570 for further refinement as additional data is received.
  • Referring to FIG. 8, a block diagram depicts a second KPI recipe 800 used to obtain a second KPI according to one embodiment of the invention. As shown, the second KPI may include products sold by a company broken down into product group, order price, and order date. The second KPI may relate to the same product as the first KPI.
  • The second KPI recipe 800 may include second KPI ingredients 810, which may include product, order price, product group, and order date ingredients. The breakout of FIG. 8 illustrates the pool of data recipe ingredients 210 from which the second KPI ingredients 810 may be selected. The first KPI ingredients 710 may be included in the pool and may be used to provide one or more of the second KPI ingredients 810. Thus, the first KPI recipe 700 may be modified to facilitate the creation of the second KPI recipe 800.
  • Like the first KPI ingredients 710, the second KPI ingredients 810 may include one or more dimensions 230 and/or one or more measures 240, one or more of which may be obtained from or derived from the first KPI ingredients 710 of the first KPI recipe 700. More specifically, the second KPI ingredients 810 may include the product dimension 720 from the first KPI ingredients 710. Additionally, the second KPI ingredients 810 may include an order date dimension 820, a product group dimension 830, and an order price measure 840.
  • As shown, the order date dimension 820 may have an order date key element 850. The product group dimension 830 may have a product group key element 860. Since it is a measure 240, the order price measure 840 may have the product key element 750, the order date key element 850, the product group key element 860, and an order price element 870.
  • The second KPI recipe 800 may also be obtained through the use of the system 500 of FIG. 5 and/or the method 600 of FIG. 6. This may be facilitated via comparison with the first KPI recipe 700. For example, the data definition for the second KPI recipe 800 may be compared with the first KPI recipe 700. One or more dimensions, such as the order date dimension 820 and the product group dimension 830, may be added. One or more new measures, such as the order price measure 840, may be created by combining data from the first KPI ingredients 710, such as the product key element 750, with data from the other second KPI ingredients 810, such as the order date key element 850 and the product group key element 860, and adding the order price element 870. Such a data recipe modification technique may beneficially allow a data recipe to be modified at runtime, without changing structures that relied on previous definitions.
  • After the user 100 submits the information request 540 for the second KPI, the second KPI recipe 800 may be obtained by receiving and interrogating one or more data sources such as the data store 106 (or alternatively, one or more data sources different from those used to generate the first KPI recipe 700). The interrogation engine 510, the categorization engine 520, and/or the comparison engine 530 may operate with the aid of the data map 300, the semantic layer 400, and/or the heuristic algorithm 570 to provide a proposed model, or a proposed data recipe to obtain the second KPI, to the user 100.
  • The comparison engine 530 may operate by comparing data recipe ingredients 210 obtained from the categorization engine 520 with the first KPI ingredients 710 to obtain the pool of potential data recipe ingredients shown in FIG. 8. Then, the comparison engine 530 may modify the first KPI ingredients 710 applicable to the second KPI, and add new data recipe ingredients from the data recipe ingredients 210 obtained from the categorization engine 520, to obtain the second KPI ingredients 810.
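  • The derivation of the second KPI ingredients 810 from the first KPI ingredients 710 may be sketched as follows. The dictionary layout and the `derive` function are assumptions made for illustration. The sketch builds a new recipe rather than altering the reference, which is one way to leave structures that rely on the earlier definition unchanged.

```python
from typing import Dict, Set, Tuple

Ingredients = Dict[str, Tuple[str, ...]]
Recipe = Dict[str, Ingredients]

# First KPI recipe 700, written as ingredient name -> elements.
first_kpi: Recipe = {
    "dimensions": {
        "product": ("product_key",),
        "product_type": ("product_type_key",),
    },
    "measures": {
        "catalog_price": ("product_key", "product_type_key", "catalog_price"),
    },
}


def derive(
    reference: Recipe,
    keep_dimensions: Set[str],
    new_dimensions: Ingredients,
    new_measures: Ingredients,
) -> Recipe:
    """Build a new recipe from a reference recipe without mutating the reference."""
    kept = {
        name: elements
        for name, elements in reference["dimensions"].items()
        if name in keep_dimensions
    }
    return {
        "dimensions": {**kept, **new_dimensions},
        "measures": dict(new_measures),
    }


# Second KPI recipe 800: reuse the product dimension, add two dimensions and a measure.
second_kpi = derive(
    first_kpi,
    keep_dimensions={"product"},
    new_dimensions={
        "order_date": ("order_date_key",),
        "product_group": ("product_group_key",),
    },
    new_measures={
        "order_price": ("product_key", "order_date_key", "product_group_key", "order_price"),
    },
)
```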
  • As in the creation of the first KPI recipe 700, the proposed data recipe for the second KPI may be presented to the user 100, and the user 100 may be prompted to accept or reject the proposed data recipe. If accepted, the proposed data recipe may be provided to the user 100, used to satisfy the information request 540 for the second KPI by providing the requested information 580, and/or further processed, for example, by the heuristic algorithm 570 for further refinement as additional data is received.
  • The use of the heuristic algorithm 570 may beneficially avoid the need for the user 100 to manually map new data elements such as those of the first KPI ingredients 710 and the second KPI ingredients 810. Rather, the system 500 may iteratively present options to the user 100 until the user 100 approves of a proposed data recipe.
  • One skilled in the art will recognize that the examples depicted and described herein are merely illustrative, and that other arrangements of user interface elements can be used. In addition, some of the depicted elements can be omitted or changed, and additional elements depicted, without departing from the essential characteristics of the invention.
  • The present invention has been described in particular detail with respect to possible embodiments. Those of skill in the art will appreciate that the invention may be practiced in other embodiments. First, the particular naming of the components, capitalization of terms, the attributes, data structures, or any other programming or structural aspect is not mandatory or significant, and the mechanisms that implement the invention or its features may have different names, formats, or protocols. Further, the system may be implemented via a combination of hardware and software, or entirely in hardware elements, or entirely in software elements. Also, the particular division of functionality between the various system components described herein is merely exemplary, and not mandatory; functions performed by a single system component may instead be performed by multiple components, and functions performed by multiple components may instead be performed by a single component.
  • Reference in the specification to “one embodiment” or to “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiments is included in at least one embodiment of the invention. The appearances of the phrases “in one embodiment” or “in at least one embodiment” in various places in the specification are not necessarily all referring to the same embodiment.
  • In various embodiments, the present invention can be implemented as a system or a method for performing the above-described techniques, either singly or in any combination. In another embodiment, the present invention can be implemented as a computer program product comprising a non-transitory computer-readable storage medium and computer program code, encoded on the medium, for causing a processor in a computing device or other electronic device to perform the above-described techniques.
  • Some portions of the above are presented in terms of algorithms and symbolic representations of operations on data bits within a memory of a computing device. These algorithmic descriptions and representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. An algorithm is here, and generally, conceived to be a self-consistent sequence of steps (instructions) leading to a desired result. The steps are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical, magnetic or optical signals capable of being stored, transferred, combined, compared and otherwise manipulated. It is convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like. Furthermore, it is also convenient at times, to refer to certain arrangements of steps requiring physical manipulations of physical quantities as modules or code devices, without loss of generality.
  • It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the following discussion, it is appreciated that throughout the description, discussions utilizing terms such as “processing” or “computing” or “calculating” or “displaying” or “determining” or the like, refer to the action and processes of a computer system, or similar electronic computing module and/or device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system memories or registers or other such information storage, transmission or display devices.
  • Certain aspects of the present invention include process steps and instructions described herein in the form of an algorithm. It should be noted that the process steps and instructions of the present invention can be embodied in software, firmware and/or hardware, and when embodied in software, can be downloaded to reside on and be operated from different platforms used by a variety of operating systems.
  • The present invention also relates to an apparatus for performing the operations herein. This apparatus may be specially constructed for the required purposes, or it may comprise a general-purpose computing device selectively activated or reconfigured by a computer program stored in the computing device. Such a computer program may be stored in a computer readable storage medium, such as, but not limited to, any type of disk including floppy disks, optical disks, CD-ROMs, DVD-ROMs, magnetic-optical disks, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs, flash memory, solid state drives, magnetic or optical cards, application specific integrated circuits (ASICs), or any type of media suitable for storing electronic instructions, and each coupled to a computer system bus. Further, the computing devices referred to herein may include a single processor or may be architectures employing multiple processor designs for increased computing capability.
  • The algorithms and displays presented herein are not inherently related to any particular computing device, virtualized system, or other apparatus. Various general-purpose systems may also be used with programs in accordance with the teachings herein, or it may prove convenient to construct more specialized apparatus to perform the required method steps. The required structure for a variety of these systems will be apparent from the description provided herein. In addition, the present invention is not described with reference to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the present invention as described herein, and any references above to specific languages are provided for disclosure of enablement and best mode of the present invention.
  • Accordingly, in various embodiments, the present invention can be implemented as software, hardware, and/or other elements for controlling a computer system, computing device, or other electronic device, or any combination or plurality thereof. Such an electronic device can include, for example, a processor, an input device (such as a keyboard, mouse, touchpad, track pad, joystick, trackball, microphone, and/or any combination thereof), an output device (such as a screen, speaker, and/or the like), memory, long-term storage (such as magnetic storage, optical storage, and/or the like), and/or network connectivity, according to techniques that are well known in the art. Such an electronic device may be portable or non-portable. Examples of electronic devices that may be used for implementing the invention include: a mobile phone, personal digital assistant, smartphone, kiosk, server computer, enterprise computing device, desktop computer, laptop computer, tablet computer, consumer electronic device, or the like. An electronic device for implementing the present invention may use any operating system such as, for example and without limitation: Linux; Microsoft Windows, available from Microsoft Corporation of Redmond, Wash.; Mac OS X, available from Apple Inc. of Cupertino, Calif.; iOS, available from Apple Inc. of Cupertino, Calif.; Android, available from Google, Inc. of Mountain View, Calif.; and/or any other operating system that is adapted for use on the device.
  • While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having benefit of the above description, will appreciate that other embodiments may be devised which do not depart from the scope of the present invention as described herein. In addition, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter. Accordingly, the disclosure of the present invention is intended to be illustrative, but not limiting, of the scope of the invention, which is set forth in the claims.

Claims (14)

What is claimed is:
1. A computer-implemented method for processing structured data, the method comprising:
providing a library on a non-transitory storage medium, the library defining a plurality of reference data recipes, each reference data recipe comprising a data process configured to produce quantitative output data by use of one or more reference data components;
deriving live data components from data elements managed by a data source, the deriving comprising:
determining attributes of respective data elements of a plurality of data elements managed by the data source, and
applying pre-determined categorization rules to the attributes determined for the respective data elements, wherein applying the pre-determined categorization rules to a live data component corresponding to a specified data element comprises categorizing the live data component as one of a measurement component and a dimension component;
selecting a reference data recipe from the library in response to a request, the selecting comprising:
comparing reference data components of respective reference data recipes to live data components derived from the data elements managed by the one or more data sources, and
comparing data processes of the respective reference data recipes to the request;
producing a live data recipe, the producing comprising substituting reference data components of the selected reference data recipe with designated live data components; and
generating quantitative output data in response to the request, wherein generating the quantitative output data comprises applying a first data process to a measurement component of the live data recipe and a dimension component of the live data recipe, the measurement component comprising a first live data component corresponding to a first data element managed by the data source, and the dimension component comprising a second live data component corresponding to a second data element managed by the data source.
2. The method of claim 1, wherein comparing the data processes of the respective reference data recipes to the request comprises comparing semantic processing metadata of the request to the respective reference data recipes.
3. The method of claim 2, wherein the semantic processing metadata of the request comprises a key performance indicator.
4. The method of claim 1, wherein producing the live data recipe comprises modifying a data process of the selected reference recipe to operate on the designated live data components.
5. The method of claim 4, wherein generating the quantitative output data comprises applying the data process of the live data recipe to data elements corresponding to the designated live data components.
6. The method of claim 1, wherein determining the attributes of the respective data elements comprises parsing a schema of the data source.
7. The method of claim 6, wherein parsing the schema of the data source comprises parsing data elements stored within the data source into a NoSQL tree structure.
8. The method of claim 7, wherein parsing the data elements into the NoSQL tree structure comprises selecting a schema for the NoSQL tree structure based on a structure of the data source.
9. The method of claim 1, wherein the pre-determined categorization rules comprise:
a first rule to categorize date data types as dimension components,
a second rule to categorize alpha-numeric data types as dimension components, and
a third rule to categorize numeric data types as measure components.
10. The method of claim 1, further comprising categorizing the live data components by use of the pre-determined categorization rules, the categorizing further comprising inferring relationships among measure components and dimension components based on a structure of the one or more data sources.
11. The method of claim 1, wherein comparing the reference data components of respective reference data recipes to the live data components comprises matching each of the live data components to a semantic layer.
12. The method of claim 11, wherein matching each of the live data components to the semantic layer comprises:
locating an identifying characteristic of the data element corresponding to each live data component;
locating, within the semantic layer, a phrase matching each identifying characteristic, wherein the identifying characteristic is selected from the group consisting of:
a name of the data element;
a data type of the data element;
a data size of the data element;
a data structure of the data element;
metadata of the data element; and
a sample data set related to the data element.
13. The method of claim 11, wherein the semantic layer comprises a plurality of pairings, wherein each pairing comprises a phrase, a phrase mapping, and a confidence factor that indicates a likelihood that a data element is related to the phrase, wherein matching each of the live data components to the semantic layer comprises using the confidence factor of the pairing with a phrase that matches the data element of the live data component.
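The pairing structure of claim 13 can be pictured with a short Python sketch. It is illustrative only. The `Pairing` class, the substring test used to decide that a phrase "matches" a data element name, and the sample confidence values are all assumptions introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pairing:
    phrase: str        # phrase stored in the semantic layer
    mapping: str       # what the phrase maps to
    confidence: float  # likelihood that a data element relates to the phrase


def match(element_name, semantic_layer):
    """Return the highest-confidence pairing whose phrase occurs in the name.

    Returns None when no phrase in the semantic layer matches.
    """
    normalized = element_name.lower().replace("_", " ")
    candidates = [p for p in semantic_layer if p.phrase in normalized]
    return max(candidates, key=lambda p: p.confidence, default=None)


layer = [
    Pairing("rev", "revenue", 0.60),
    Pairing("revenue", "revenue", 0.95),
    Pairing("date", "order date", 0.80),
]
best = match("Total_Revenue", layer)
```

Here the name of the data element serves as the identifying characteristic; claim 12 lists other characteristics (data type, size, structure, metadata, sample data) that a fuller matcher could also consult.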
14. The method of claim 1, wherein comparing the reference data components of respective reference data recipes to the live data components comprises using a data map to quantify relationships between the reference data components and data elements of the data source.
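One way to picture the data map of claim 14 is as a table of scores keyed by (reference data component, data element) pairs. The Python sketch below is a hypothetical illustration. The claim does not say how relationships are quantified, so name similarity from the standard library's `difflib.SequenceMatcher` is swapped in as a stand-in, and the function names are invented for this example.

```python
from difflib import SequenceMatcher


def build_data_map(reference_components, data_elements):
    """Score every (reference component, data element) pair from 0.0 to 1.0.

    Assumption: name similarity stands in for the relationship measure.
    """
    return {
        (ref, elem): SequenceMatcher(None, ref.lower(), elem.lower()).ratio()
        for ref in reference_components
        for elem in data_elements
    }


def recipe_score(reference_components, data_elements):
    """Average, over reference components, of the best-matching element score.

    Both inputs must be non-empty.
    """
    data_map = build_data_map(reference_components, data_elements)
    best = [
        max(data_map[(ref, elem)] for elem in data_elements)
        for ref in reference_components
    ]
    return sum(best) / len(best)


good_fit = recipe_score(["revenue", "region"], ["Revenue", "Region", "sku"])
poor_fit = recipe_score(["revenue"], ["sku"])
```

A higher score suggests that more of a reference recipe's components have a counterpart in the data source, which is one plausible basis for ranking candidate recipes.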
US16/384,474 2013-04-22 2019-04-15 Automatic dynamic reusable data recipes Abandoned US20190377727A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/384,474 US20190377727A1 (en) 2013-04-22 2019-04-15 Automatic dynamic reusable data recipes

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US201361814586P 2013-04-22 2013-04-22
US14/257,669 US10262030B1 (en) 2013-04-22 2014-04-21 Automatic dynamic reusable data recipes
US16/384,474 US20190377727A1 (en) 2013-04-22 2019-04-15 Automatic dynamic reusable data recipes

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US14/257,669 Continuation US10262030B1 (en) 2013-04-22 2014-04-21 Automatic dynamic reusable data recipes

Publications (1)

Publication Number Publication Date
US20190377727A1 (en) 2019-12-12

Family

ID=66098723

Family Applications (2)

Application Number Title Priority Date Filing Date
US14/257,669 Expired - Fee Related US10262030B1 (en) 2013-04-22 2014-04-21 Automatic dynamic reusable data recipes
US16/384,474 Abandoned US20190377727A1 (en) 2013-04-22 2019-04-15 Automatic dynamic reusable data recipes

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US14/257,669 Expired - Fee Related US10262030B1 (en) 2013-04-22 2014-04-21 Automatic dynamic reusable data recipes

Country Status (1)

Country Link
US (2) US10262030B1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20230069428A (en) * 2021-11-12 2023-05-19 주식회사 스타캣 Method for generating metadata for automatically determining type of data and apparatus for determining type of data using a machine learning/deep learning model for the same

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9779134B2 (en) * 2014-12-26 2017-10-03 Business Objects Software Ltd. System and method of data wrangling
WO2018081633A1 (en) * 2016-10-28 2018-05-03 Roam Analytics, Inc. Semantic parsing engine
US11847170B2 (en) 2020-01-17 2023-12-19 Target Brands, Inc. Data visualization tool with guided visualization creation and secure publication features, and graphical user interface thereof
USD941836S1 (en) 2020-01-17 2022-01-25 Target Brands, Inc. Display panel or portion thereof with a computer-generated graphical user interface
US11921991B2 (en) 2020-01-17 2024-03-05 Target Brands, Inc. Data visualization tool with guided visualization creation and secure publication features, and graphical user interface thereof
US11880393B1 (en) * 2022-10-28 2024-01-23 Kpn Innovations, Llc. Apparatus and method for generating an ingredient chain

Family Cites Families (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6975910B1 (en) * 2000-04-28 2005-12-13 International Business Machines Corporation Managing an electronic cookbook
US7149746B2 (en) * 2002-05-10 2006-12-12 International Business Machines Corporation Method for schema mapping and data transformation
US7127469B2 (en) * 2002-06-13 2006-10-24 Mark Logic Corporation XML database mixed structural-textual classification system
EP1552427A4 (en) * 2002-06-13 2009-12-16 Mark Logic Corp Parent-child query indexing for xml databases
US7702636B1 (en) * 2002-07-31 2010-04-20 Cadence Design Systems, Inc. Federated system and methods and mechanisms of implementing and using such a system
US8335779B2 (en) * 2002-08-16 2012-12-18 Gamroe Applications, Llc Method and apparatus for gathering, categorizing and parameterizing data
US7076477B2 (en) * 2002-12-19 2006-07-11 International Business Machines Corporation Fast and robust optimization of complex database queries
US7664795B2 (en) * 2003-09-26 2010-02-16 Microsoft Corporation Apparatus and method for database migration
US8117143B2 (en) * 2004-05-28 2012-02-14 Intel Corporation Using affinity measures with supervised classifiers
US7596546B2 (en) * 2004-06-14 2009-09-29 Matchett Douglas K Method and apparatus for organizing, visualizing and using measured or modeled system statistics
US7979468B2 (en) * 2005-06-14 2011-07-12 Enterprise Elements, Inc. Database data dictionary
US8036997B2 (en) * 2005-06-16 2011-10-11 Board Of Trustees Of Michigan State University Methods for data classification
US7849049B2 (en) * 2005-07-05 2010-12-07 Clarabridge, Inc. Schema and ETL tools for structured and unstructured data
US20070269557A1 (en) * 2006-05-19 2007-11-22 Hannaford Licensing Corp. Method and system for assessing, scoring, grouping and presenting nutritional value information of food products
US7962476B2 (en) * 2006-07-26 2011-06-14 Applied Minds, Inc. Method and apparatus for performing a depth-first join in a database
US20080140696A1 (en) * 2006-12-07 2008-06-12 Pantheon Systems, Inc. System and method for analyzing data sources to generate metadata
US8022952B2 (en) * 2007-07-31 2011-09-20 Hewlett-Packard Development Company, L.P. Generating a visualization to show mining results produced from selected data items and attribute(s) in a selected focus area and other portions of a data set
US8555206B2 (en) * 2007-12-21 2013-10-08 Fisher-Rosemount Systems, Inc. Methods and apparatus to present recipe progress status information
US8261186B2 (en) * 2009-01-02 2012-09-04 Apple Inc. Methods for efficient cluster analysis
US8200642B2 (en) * 2009-06-23 2012-06-12 Maze Gary R System and method for managing electronic documents in a litigation context
US8200548B2 (en) * 2009-08-31 2012-06-12 Peter Wiedl Recipe engine system and method
US8204848B2 (en) * 2009-11-17 2012-06-19 Business Objects Software Limited Detecting and applying database schema changes to reports
US20120253828A1 (en) * 2011-04-01 2012-10-04 Bellacicco Jr John A System and method for sensitivity or nutritional factor exposure monitoring
US20120322032A1 (en) * 2011-06-17 2012-12-20 Spinning Plates, Llc Methods and systems for electronic meal planning
EP2701087A4 (en) * 2012-06-27 2014-07-09 Rakuten Inc Information processing device, information processing method, and information processing program
US9536237B2 (en) * 2012-11-28 2017-01-03 Wal-Mart Stores, Inc. Recipe suggestion apparatus and method
US9495360B2 (en) * 2014-01-31 2016-11-15 International Business Machines Corporation Recipe creation using text analytics
US9886670B2 (en) * 2014-06-30 2018-02-06 Amazon Technologies, Inc. Feature processing recipes for machine learning

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20230069428A (en) * 2021-11-12 2023-05-19 주식회사 스타캣 Method for generating metadata for automatically determining type of data and apparatus for determining type of data using a machine learning/deep learning model for the same
KR20230154157A (en) * 2021-11-12 2023-11-07 주식회사 스타캣 Method for generating metadata for automatically determining type of data and apparatus for determining type of data using a machine learning/deep learning model for the same
KR20230155390A (en) * 2021-11-12 2023-11-10 주식회사 스타캣 Method for generating metadata for automatically determining type of data and apparatus for determining type of data using a machine learning/deep learning model for the same
KR102622434B1 (en) 2021-11-12 2024-01-09 주식회사 스타캣 Method for generating metadata for automatically determining type of data and apparatus for determining type of data using a machine learning/deep learning model for the same
KR102623561B1 (en) 2021-11-12 2024-01-09 주식회사 스타캣 Method for generating metadata for automatically determining type of data and apparatus for determining type of data using a machine learning/deep learning model for the same
KR102622433B1 (en) 2021-11-12 2024-01-09 주식회사 스타캣 Method for generating metadata for automatically determining type of data and apparatus for determining type of data using a machine learning/deep learning model for the same

Also Published As

Publication number Publication date
US10262030B1 (en) 2019-04-16

Similar Documents

Publication Publication Date Title
US20190377727A1 (en) Automatic dynamic reusable data recipes
US20210224818A1 (en) User Interface and Process Flow for Providing an Intent Suggestion to a User in a Text-Based Conversational Experience with User Feedback
US10192425B2 (en) Systems and methods for automated alerts
US20180196579A1 (en) Master View of Tasks
US9098314B2 (en) Systems and methods for web based application modeling and generation
US11074250B2 (en) Technologies for implementing ontological models for natural language queries
US10102246B2 (en) Natural language consumer segmentation
US11341449B2 (en) Data distillery for signal detection
US10803390B1 (en) Method for the management of artifacts in knowledge ecosystems
US10860656B2 (en) Modular data insight handling for user application data
US20170344643A1 (en) Providing travel or promotion based recommendation associated with social graph
US11269894B2 (en) Topic-specific reputation scoring and topic-specific endorsement notifications in a collaboration tool
US20160216946A1 (en) Access operation with dynamic linking and access of data within plural data sources
US20210390258A1 (en) Systems and methods for identification of repetitive language in document using linguistic analysis and correction thereof
KR20240020166A (en) Method for learning machine-learning model with structured ESG data using ESG auxiliary tool and service server for generating automatically completed ESG documents with the machine-learning model
US11789962B1 (en) Systems and methods for interaction between multiple computing devices to process data records
US20150235281A1 (en) Categorizing data based on cross-category relevance
US20220108359A1 (en) System and method for continuous automated universal rating aggregation and generation
US11113081B2 (en) Generating a video for an interactive session on a user interface
US20180060440A1 (en) Systems and methods to cognitively update static bi models
Pope Big data analytics with SAS: Get actionable insights from your big data using the power of SAS
JP2016518646A (en) System, apparatus, and method for generating contextual objects mapped to data measurements by dimensional data
US11823286B2 (en) Dependent dimensions
US20170255972A1 (en) Enhancement to customer feedback systems
US11531675B1 (en) Techniques for linking data to provide improved searching capabilities

Legal Events

Date Code Title Description
AS Assignment

Owner name: DOMO, INC., UTAH

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: BURTENSHAW, JEFF; THAYNE, DAREN; JAMES, JOSHUA G.; AND OTHERS; SIGNING DATES FROM 20140402 TO 20181206; REEL/FRAME: 049061/0547

STPP Information on status: patent application and granting procedure in general

Free format text: APPLICATION DISPATCHED FROM PREEXAM, NOT YET DOCKETED

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO PAY ISSUE FEE