MXPA00012346A - Data retrieval method and apparatus with multiple source capability

Info

Publication number
MXPA00012346A
Authority
MX
Mexico
Prior art keywords
data
information
database
data source
source
Application number
MXPA/A/2000/012346A
Other languages
Spanish (es)
Inventor
David B Kouchi
David Yarnall
Donald K Babcock
Original Assignee
Timeline Inc

Abstract

Generation of output or reports of information contained in a data source, which may be any of two or more types of source data, is provided in a standardized or uniform manner. A plurality of drivers (804) are provided, specific to different types of source data, which include programming for identifying structural or other characteristics of the various data sources, e.g. for use in defining a new database. Preferably the new database is configured to permit highly flexible and/or rapid output or reporting, or is otherwise optimized for reporting purposes. In one embodiment, the present invention includes conversion of one or more data sources into one or more uniform databases (812), preferably generating one or more key categories for organizing the data, and optionally generating category groupings or rollups and additional data or optional references.

Description

DATA RETRIEVAL METHOD AND APPARATUS WITH MULTIPLE SOURCE CAPABILITY
The present application is a continuation-in-part of application serial number 08/582,062, filed on January 2, 1996, and serial number 08/593,118, filed on February 1, 1996, both entitled DATA RETRIEVAL METHOD AND APPARATUS WITH MULTIPLE SOURCE CAPABILITY and both incorporated herein by reference. The present invention relates to a computer-implemented system that is capable of retrieving information stored in one or more of many different sources, which may be in any of several different formats, and/or that provides reports and analyses based on the information, and in particular to a computer-implemented method and apparatus that can retrieve information from databases stored in any of a plurality of formats, including structural and/or relational information, without the need to rely on human analysis of the source data.
BACKGROUND INFORMATION
Many ways of organizing computer-accessible information have been developed, such as hierarchical or relational database management systems, flat file data systems, spreadsheet systems, and the like. These systems are used to store, manipulate and display a myriad of information types, including financial or accounting information, scientific or technical data, corporate or business data, name, address and telephone data, and statistical data. Different formats and data structures have been developed, and this situation has both desirable and undesirable ramifications. On the positive side, the multiplicity of available system types makes it possible to provide different systems that are optimized for different purposes (for example, optimized for data entry or storage versus flexibility or speed of data analysis and reporting, or optimized for accounting data versus company data, and the like), or that provide user interfaces or other features suited to personal or company preferences. This proliferation of information systems, however, presents a substantial barrier in situations where it would be useful to have access to information in two or more systems, for example to coordinate or combine that information.
Examples of such situations include: (1) an accountant who wants to produce standardized reports but who has multiple clients, each of whom keeps its accounting data in a different data source; (2) a corporation with different divisions that wants to produce uniform reports, but in which the different divisions use different corporate or financial programs; (3) a corporation that wants to produce uniform reports but maintains its accounting information in a first type or brand of database (or other data source) and its corporate information in a second database of a different type; (4) a group of scientists investigating a common problem, each of whom keeps or has access to data that is kept in a different type or brand of database or other data source. Other examples will occur to the reader after understanding this description. Additionally, in some situations, even when all the desired information is in a single type of data source, or even stored in a single data file, it may be desirable to provide a way to access the data, for example to provide uniform and/or augmented data reports and analyses. These situations present difficulties for many reasons, including differences in ways of organizing information and differences between types of data sources. In some situations, similar categories of information may be organized in different ways even when the same database program is used. For example, in a first instance, using a first database program package, a user could organize a company's personnel records so that all personnel names are stored in a first table or list, all addresses in a second table or list and all telephone numbers in a third table or list, with pointers or links stored to indicate which names are associated with which addresses and which telephone numbers.
However, another instance may occur using the same program in which a different person organizing the personnel information provides a single table in which each line or "record" of information includes a name, an address and a telephone number, so that there are no links or pointers from a record in one table to a record in another table. In addition, different types of data sources may have different structures and/or different data storage formats or schemes. Some data are organized hierarchically (for example, in a tree), while others may be organized as relational databases (modeled on two-dimensional tables of rows and columns). In addition, information can be stored in forms that are not, strictly speaking, databases, such as "flat files", spreadsheets, and the like. Additionally, different types of data sources can store the data in different formats. For example, some database products store each table, each report format and each query as a separate file on a storage device such as a hard drive, while other programs store all the tables, relationships, queries, report formats, etc. in a single file. Some products store each record and/or field as fixed-length data and/or at a fixed position in a file, while others use delimiters to distinguish between one record and the next, or between one field and the next within a record. Even if two different program products store a particular type of information at a predetermined location, that location may differ between the products. In addition, the data may be encoded differently in different program products, such as using ASCII code in one product and multi-byte (multilanguage) characters in another. In some cases the data may be compressed or encrypted.
In view of the wide variety of data types, in the past, when it was desired to access stored information (for example to standardize reports and analyses and/or to combine or coordinate information from two or more databases), a consultant or other expert individually or "manually" analyzed each "source" data file or database to understand its structure, relationships, data storage format, the organization of data within the database, and the like. The expert would then build some way of importing or encoding the data in the source data file or database in order to achieve the desired access, coordination or combination. Although this approach is workable, it is labor intensive, since it requires human analysis, and time consuming, since it often takes the expert or consultant days or weeks to complete the analysis needed to achieve the access, combination or coordination. Accordingly, it would be useful to provide a system in which information that is in different formats or forms, or organized in different ways, can be accessed, combined or coordinated while reducing or eliminating the need for human analysis, thus providing a system that is at least partially automated and preferably less labor intensive and less time consuming than certain previous methods.
SUMMARY OF THE INVENTION
The present invention relates to a system that provides access to stored information, for example to access information or to achieve the coordination and/or combination of information in two different storage systems. Preferably some or all of the analysis involved is performed automatically (for example, without the need for human analysis), in one embodiment using a properly programmed computer. In one embodiment, information, preferably including at least some information that is automatically obtained from the data source, is used to define and/or populate a new database. In some embodiments more than one database can be provided. For example, a first new database may be used as a source for distributing information to a plurality of information consumers, and the distributed information may itself be in the form of a plurality of databases, which may differ from one another. Preferably, the system is flexible in the sense that it is not inherently limited in the data formats it can access but can be configured to obtain data from virtually any computer-readable information source. Preferably the system is extensible (more preferably modularly extensible) in the sense that components can be added to allow access to additional data types, formats or organizations. In one embodiment, the access, coordination or combination of data is accompanied by enhanced data analysis, for example by providing types of data analysis and/or reporting not used or not found in the original data source. Preferably the system can be used to standardize data analysis or reporting across different types of data sources. In one embodiment, the system uses the content of the source data files or databases, as well as information about their structure, to achieve the desired results (such as through text recognition, artificial intelligence and/or expert systems).
In one embodiment, the system uses that information to at least partially control the way in which the data is made available for analysis or reporting. Generation of output or reports of information contained in a data source, which may be any of two or more types of source data, is provided in a standardized or uniform manner. A plurality of drivers specific to different types of source data are provided, including programming to identify structural or other characteristics of the different data sources, for example for use in defining a new database. Preferably, the new database is configured to allow fast and/or highly flexible output or reporting, or is otherwise optimized for reporting purposes. In one embodiment, the present invention includes the conversion of one or more data sources into one or more uniform databases, preferably generating one or more key categories to organize and/or validate the data, and optionally generating category groupings or rollups and additional data or optional references. In one embodiment, the present invention creates or populates a database based on accounting data or other data converted from existing data files, such as data files created by accounting programs or other earlier programs.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 is a schematic depiction of an example of flat file data storage. Figure 2 depicts a directory structure of the type used in connection with the data storage depicted in Figure 1. Figures 3A-3C depict examples of data storage formats that can be used in connection with the data storage depicted in Figure 1.
Figures 4A-4F are schematic depictions of an example of data stored in relational database tables. Figure 5 depicts a directory structure of the type used in connection with the data storage depicted in Figures 4A-4F. Figure 6 is a schematic depiction of an example of flat file storage. Figures 7A-7D are schematic depictions of data stored in tables of a relational database.
Figure 8 is a block diagram of a system for data retrieval in accordance with an embodiment of the present invention. Figure 9 is a schematic depiction of the contents of a function module in accordance with an embodiment of the present invention. Figure 10 is a flowchart of a process for data retrieval in accordance with one embodiment of the present invention. Figures 11A and 11B depict pseudocode methods for selecting or searching directories in accordance with one embodiment of the present invention. Figure 12 is a schematic depiction of data stored in tables of a database 808 provided in accordance with an embodiment of the present invention. Figure 13 is a block diagram illustrating a multi-database structure in accordance with one embodiment of the present invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
Before describing certain aspects of the present invention, it will be useful, in order to promote an understanding of the invention, to provide examples of different ways of storing information. This will be done by providing several examples, including examples of accounting information and examples of technical or scientific information. Table I provides a comparison of the types of data that can be stored by two different corporations. Table I is intended to indicate the conceptual organization of accounting and other information for two corporations, and is not necessarily information that will be stored in a database (although it could be, if desired).
TABLE I. EXAMPLES OF ACCOUNTING INFORMATION
[Table I: two columns, headed Corporation #1 and Corporation #2, each beginning with an Account category whose first component is Cash.]
In the first example of Table I, Corporation #1 stores account information, employee information, project information and product information, and thus has four parts. The accounting information of Corporation #1 includes only three components: cash, accounts payable and accounts receivable. The corporation maintains a list of its vendors, tracks information for two types of projects (research and sales) and maintains a list of its products. The second example in Table I is somewhat longer (but still simplified). In this example, the account has multiple components in a hierarchy. Although the account of Corporation #2 has categories of cash, accounts payable and accounts receivable, each of these categories has subcategories, and some subcategories have even finer divisions. Similarly, information on employees, projects and products is broken into different categories and subcategories, and Corporation #2 also has additional items that are tracked, such as subsidiary companies. Table I illustrates that, even before considering the differences between data sources and/or data storage formats, the structure of a corporation and/or the manner in which it chooses to organize its information will cause differences from one system to another. For example, if an accountant had Corporation #1 and Corporation #2 as clients and wanted to use a standard or uniform system to report on and analyze these two corporations, there would be significant difficulties in doing so, even if Corporation #1 and Corporation #2 used the same database program and even stored the account information within that program in a similar way.
Thus, using previous methods, human analysis and understanding of the information in Table I would typically have been required in order to provide uniform or standardized reporting and analysis for the two corporations based on the corporations' databases. Even in the context of account information, different types of information storage can be used to store the account information of Corporation #1 and Corporation #2. For example, the information can be stored as one or more flat files. It is noted that, at least in accordance with some usages, "flat file" information storage is not a true database system. However, the present invention, at least in some embodiments, is capable of accommodating flat file data as well as other database and non-database storage methods.
Figure 1 is a schematic depiction of how a plurality of flat files can be used to store information for Corporation #2 of Table I. Although Figure 1 shows the information as it might appear in written form on a plurality of sheets of paper, the data will, in fact, be stored on a computer-readable medium such as a hard drive, for example as described below. The format of Figure 1 is intended to depict the logical structure of the data, organized in a plurality of files 101a through 101f, each file having file identification information 104, depicted in Figure 1 as header or title information 104a, 104b, and a plurality of records, depicted in Figure 1 as lines of information 106a, 106b, 106c, each record having a plurality of fields (organized in columns 108a, 108b, 108c, 108d in the depiction of Figure 1). Methods and apparatus for storing and accessing data so as to have or reflect the logical row or column structure depicted in Figure 1 are well known to those skilled in the art of programming. The present invention can be used in connection with numerous flat file storage programs.
Examples of such storage programs include those sold under the trade names Simply Accounting™ and MAS-90™. The data organized in the logical structure depicted in Figure 1 can be stored in many different formats. For example, in one embodiment the data in each flat file 101a through 101f is stored in a separate file on the hard disk of a personal computer. Figure 2 depicts a directory/file structure that can be used to store those files, in which all the files depicted in Figure 1 are stored in a single subdirectory. As is well known to those skilled in the art, the different files, even though organized in a directory hierarchy as depicted in Figure 2, can be physically stored on a hard disk at a plurality of different addresses. Numerous formats can be used to store the data in a file. Examples are depicted in Figures 3A through 3C. In the example of Figure 3A, the file includes header information, followed by storage of the first record 106a, and so on. In the example of Figure 3A, a fixed-length data format is used in which each record 106a, 106b has an identical length 304 (for example, it occupies a fixed number of bits). In the embodiment of Figure 3A each field in each record also has a fixed length 308a through 308d. Figure 3B depicts another method of storing fixed-length data, in which the data is stored in column order rather than row order (all dates sequentially, then all descriptions sequentially, etc.). In the embodiment of Figure 3B, in order to assist in locating the desired data, it may be useful to store an indication 322 of the number of records, for example as part of the header. In a fixed-length system such as that depicted in Figure 3A or 3B, a particular piece of data will be located at a given distance (for example, a given number of bits) from the start of the data.
For example, in Figure 3A, if it is known that the header 302 has a length of four bytes and that the record length 304 is 8 bytes, the date information for the first record 106a will necessarily begin at byte number 5, the date information for the second record 106b will necessarily begin at byte number 13, and so on. Figure 3C depicts the storage of data in delimited fields rather than fixed-length fields. In the delimited format, a special symbol, for example a bit pattern that is different from any pattern used to store data, is used to mark the end or start of a record and/or field. In the embodiment depicted in Figure 3C, two different special symbols are used, one to mark the beginning of a record and another to mark the beginning of a field. These symbols are indicated in Figure 3C as commas and semicolons respectively, although any bit patterns or symbols can be used. In the data format of Figure 3C, it is possible to identify the date information for the first record 106a as the information following the first new-record symbol 324a and preceding the first new-field symbol 324b. The date information for the second record 106b would be the information that follows the second new-record symbol 326a and precedes the next new-field symbol 326b, etc. Many other formats for storing information are possible. It should be apparent from the illustrations of Figures 3A through 3C that the multiplicity of data storage formats presents yet another problem for accessing, coordinating and combining data in different types of information storage systems.
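The offset arithmetic for the fixed-length layouts of Figures 3A and 3B, and the splitting of the delimited layout of Figure 3C, can be sketched as follows. This is a minimal illustration and not the patent's implementation: the function names, the 0-based offsets, the example field widths and the sample string are assumptions; only the 4-byte header, the 8-byte record and the comma and semicolon markers come from the text.

```python
HEADER_LEN = 4  # header length in bytes, from the Figure 3A example
RECORD_LEN = 8  # fixed record length in bytes, from the same example


def row_order_offset(record_index):
    """0-based byte offset of a record (0-based index) stored row by row."""
    return HEADER_LEN + record_index * RECORD_LEN


def column_order_offset(field_index, record_index, field_lens, record_count):
    """0-based byte offset of one field value when all values of the first
    field are stored first, then all values of the second field, and so on.
    The record count (item 322 in Figure 3B) is needed to skip whole columns."""
    skipped_columns = sum(field_lens[:field_index]) * record_count
    return HEADER_LEN + skipped_columns + record_index * field_lens[field_index]


def parse_delimited(data, record_mark=",", field_mark=";"):
    """Split data in which record_mark starts each record and field_mark
    starts each field after the first, as in the Figure 3C description."""
    return [chunk.split(field_mark) for chunk in data.split(record_mark) if chunk]


if __name__ == "__main__":
    # Offsets are 0-based, so offset 4 corresponds to "byte number 5" in the text.
    print(row_order_offset(0) + 1, row_order_offset(1) + 1)
    # Hypothetical sample data: two records of four fields each.
    print(parse_delimited(",0102;rent;100;100,0103;sale;50;150"))
```

A reader that does not know these lengths or marker symbols in advance cannot locate a field, which is why the storage format has to be discovered before the data can be accessed directly.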
Previously, those who wished to access information directly (for example, without using the database management system or other software designed to read the stored information) required knowledge of data storage formats, which could, in some cases, be acquired by analyzing examples of stored information. Thus, accommodating the needs of an accountant who wishes to access the stored information for both Corporation #1 and Corporation #2 of Table I would require not only information about the logical organization of the data (Figure 1) and the logical directory structure (Figure 2), but also information about the data storage format (Figures 3A through 3C).
In addition to storing information as flat files, there are many other possibilities. Figures 4A through 4F depict a possible organization of information in the context of a relational database. In the example of Figures 4A through 4F, a first transaction table 402 is stored, having a plurality of records 406a through 406d. It will be noted that the records depicted in Figure 4A are similar in some respects to those depicted in Figure 1; for example, each includes a date field, a description field, an amount field and a total field. In the example of Figure 4A an additional Index field 408 is provided for each record. The transaction table in the example of Figure 4A does not include header information 104a of the type depicted in Figure 1, and only one table of information is provided in Figure 4A (rather than a multiplicity of tables 101a through 101f as in the embodiment of Figure 1). In the relational database depicted in Figures 4A through 4F, additional tables are provided that may reflect the organization shown in Table I. For example, an account table 412 includes a list of all the categories defined in Table I, with an Index 414 associated with each account. Similarly, an employee table 416 includes the names of the employees in Table I, each with an Index 418 associated with it. Also in Figure 4C there is an indication, for each name, of whether the person is associated with the sales force or the research force (reflecting the hierarchy shown in Table I). In addition, a field is included to indicate the employee's address. Additional tables (not shown) can be provided to list the different projects, products and subsidiaries of Corporation #2, reflecting the organization of Table I. Figure 4D depicts a link table 422, which indicates, for each record of the transaction table 402, any desired link to other tables. For example, if the first transaction 406a is a transaction that relates to the cash component of bank number 1, a record 428 will be provided indicating that, for the transaction record having Index value 1, the reference to the appropriate account is the one having Index number 424. Similarly, links can be created to the employee table 416 or other tables (not shown). Thus, while in Figure 1 it was necessary to provide a separate file for each possible combination of account, subsidiary, product, project, etc. (potentially leading to a large number of files for a relatively complicated accounting structure), only a single transaction table 402 is required in the embodiment of Figures 4A through 4F, with the link table 422 of Figure 4D providing the information which, in the example of Figure 1, is obtained by knowing in which flat file a transaction is stored. In a typical relational database, it is possible to identify and display only information that meets certain criteria, for example only those transactions for a particular account and a particular employee. In some database programs it is possible to store such criteria or "queries" for use, for example, when particular selected information is commonly needed.
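The way a link table ties a single transaction table to account and employee tables, and the kind of selective query mentioned above, can be sketched with Python's built-in `sqlite3` module. This is an illustration under stated assumptions, not the schema of Figures 4A through 4F: the table and column names, the helper names and every sample row are invented for the example.

```python
import sqlite3


def build_example_db():
    """Create an in-memory database loosely shaped like Figures 4A-4D.
    All table names, column names and rows are hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE transactions (idx INTEGER PRIMARY KEY, date TEXT,
                                   description TEXT, amount REAL);
        CREATE TABLE accounts  (idx INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE employees (idx INTEGER PRIMARY KEY, name TEXT, team TEXT);
        -- One row per transaction, pointing at the related rows elsewhere.
        CREATE TABLE links (transaction_idx INTEGER, account_idx INTEGER,
                            employee_idx INTEGER);
        """
    )
    conn.executemany(
        "INSERT INTO transactions VALUES (?, ?, ?, ?)",
        [(1, "1996-01-02", "Deposit", 100.0),
         (2, "1996-01-03", "Invoice", 250.0),
         (3, "1996-01-04", "Withdrawal", -40.0)],
    )
    conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                     [(1, "Cash"), (2, "Accounts receivable")])
    conn.executemany("INSERT INTO employees VALUES (?, ?, ?)",
                     [(1, "Smith", "sales"), (2, "Jones", "research")])
    conn.executemany("INSERT INTO links VALUES (?, ?, ?)",
                     [(1, 1, 1), (2, 2, 1), (3, 1, 2)])
    return conn


def transactions_for(conn, account, employee):
    """Return (date, description, amount) rows for one account and employee."""
    return conn.execute(
        """
        SELECT t.date, t.description, t.amount
        FROM transactions AS t
        JOIN links     AS l ON l.transaction_idx = t.idx
        JOIN accounts  AS a ON a.idx = l.account_idx
        JOIN employees AS e ON e.idx = l.employee_idx
        WHERE a.name = ? AND e.name = ?
        ORDER BY t.idx
        """,
        (account, employee),
    ).fetchall()


if __name__ == "__main__":
    print(transactions_for(build_example_db(), "Cash", "Smith"))
```

In the flat file arrangement of Figure 1 the same selection would instead be made by choosing which file to open, since each file holds the transactions for one combination of categories.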
Figure 4E depicts a table that stores a plurality of such queries using, for example, the Structured Query Language (SQL). The structures used in a particular database system may reflect the way in which the data is analyzed or organized by a company. Thus, an accountant interested in standardizing reports and analyses based on information in that database might wish to know about and/or be able to reproduce data analyses of the type represented by the stored queries (Figure 4E). Additionally, many types of database allow a user to design a report (either for display or for printing) and, in some cases, to store information that defines that report, for example for repeated use. Thus, another table or set of tables (not shown) can be stored as part of or in connection with a relational database to preserve information regarding those reports. The information and structure represented in Figures 4A through 4F can be stored in many different ways. Figure 5 depicts a directory/file hierarchy that can be used to store a plurality of data tables, link tables, query tables and/or report formats. These data can be stored in many different data formats, such as any of those depicted in Figures 3A through 3C, or others, as is known to those skilled in the art.
Another example of information that can be stored in different formats is technical or scientific information. In Figure 6, a flat file system is provided to store surface temperature information for, for example, meteorological research. In the example of Figure 6, each file 602a, 602b, 602c stores information for a particular location and unit of measure (for example Fahrenheit or Celsius), indicated in the header 604 of that file. For each record 606a, 606b, 606c, the date and the reading for each hour of the day are stored in separate fields. A relational database system for storing this type of data is depicted in Figures 7A through 7D.
A data table 702 contains all the observations and, for each observation, the unit of measurement 704a, together with an Index 704b. Tables 706 (Figure 7B) and 708 (Figure 7C) store information regarding the location and the times of day. Each line 722, 724 of the link table (Figure 7D) can be used to indicate, for any data point stored in Figure 7A (as indicated by its Index value 704b), the Index value for the location associated with that data point and the time of day associated with that data point. Although the examples in Figures 6 and 7A through 7D are simplified, it will be apparent to those skilled in the art how to construct information systems for storing more complicated data sets such as weather data, including wind speed and wind direction, radiosonde data and the like. Thus, if a researcher wished to correlate information from two meteorological data sources, one of which stored information in the system depicted in Figure 6 and the other in the system depicted in Figures 7A through 7D, with previous methods it would typically be necessary to have a consultant or other expert analyze the information storage structure and organization depicted in Figures 6 and 7A through 7D, and manually develop a system to access the information in a way that allows the combination or coordination of the data therein.
Figure 8 is a block diagram of a system, in accordance with an embodiment of the present invention, for use in overcoming the difficulties described above in relation to access, coordination and combination of data in different information systems. In the embodiment of Figure 8, a main process 802 selectively activates different drivers 804a through 804d. For discussion purposes, the source data 806a through 806d depicted in Figure 8 may be data that is maintained or created by any of many programs or systems for organizing or storing data, such as flat file systems, databases, spreadsheets, etc.,
as discussed above. The processes and data in Figure 8 can reside on one or many computers. In one embodiment, the process is implemented in the context of a local area network (LAN), which has a network server computer, disks or other associated storage devices, and one or more client computers. In one embodiment, the main process 802 runs on a client computer while the information sources 806 and the data files for the new database 808 are stored on a disk (or other data storage) associated with the network server. The invention can be implemented in many other architectures, such as on a single computer, on multiple non-networked computers, or using computers that are linked through a communication link such as a wide area network, modem communications, the Internet, etc. Many types of computers can be used to implement the invention, including mainframes and personal computers such as those produced by International Business Machines (e.g., 386, 486, Pentium® or Pentium Pro computers), Apple Corp. (such as Macintosh computers) and clones of those computers. In one embodiment the processes are implemented using a DOS operating system and/or a Microsoft Windows® or Windows 95 interface. The items referred to here as drivers 804a through 804d should not be confused with a data filter of the type provided in certain database programs. A data filter is typically a type of query or logical test for selecting certain records and/or fields in accordance with a criterion determined by the user. The drivers 804a through 804d, on the other hand, as described more fully below, are processes that have multiple functions for analyzing and accessing different types of source data. In one embodiment, the function of the driver modules 804 is provided as dynamic link libraries (DLLs) in a manner that will be understood by those skilled in the art after understanding the present disclosure.
Drivers 804a through 804d are configured to operate with one or more types of data sources, such as data files produced using a particular database program. Depending on the characteristics of the database program, it might be necessary to have, for example, two separate drivers for data files produced by two separate versions of a database program package. In some circumstances it may be possible to provide a single driver that can be used in connection with data files produced using two (or more) different types or brands of program (or different versions of a given database or other program brand). The source data 806a through 806d depicted in Figure 8 can, in general, be any source of computer-readable information. Examples include flat file source data, hierarchical databases, relational databases, spreadsheets, and the like. Although Figure 8 depicts an embodiment in which four data sources are shown, the present invention can be used in a context in which there is only a single data source, or in which five or more data sources are used. Although the present invention can be used in situations where each data source 806a through 806d is produced using a different brand or type of program, it is also possible to use the present invention in situations where two or more of the data sources are produced using the same type or brand of database or other program. As an example, the first driver 804a may be configured for use in retrieving information from data files produced using dBase II®, the second driver may be configured for use in retrieving information from data files produced using dBase III®, the third driver may be configured for use in retrieving information from data files produced using a flat file system such as Simply Accounting™, and the fourth driver may be configured for use in retrieving information from data files produced using Microsoft Access®.
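The arrangement of a main process selecting among source-specific drivers can be sketched as a small plug-in registry. This is a hedged illustration only: the text describes the drivers as DLLs with multiple analysis and access functions, whereas the interface below (`SourceDriver`, `describe_structure`, `read_records`, `register`, `driver_for`), the source-type labels and the two toy formats are all hypothetical and are chosen merely to show one driver serving two format versions while another format gets its own driver.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Type


class SourceDriver(ABC):
    """Hypothetical driver interface: analyze a source and read its records."""
    source_types: tuple = ()

    @abstractmethod
    def describe_structure(self, raw: str) -> List[str]:
        """Return the field names discovered in the source."""

    @abstractmethod
    def read_records(self, raw: str) -> List[dict]:
        """Return the records as dictionaries keyed by field name."""


_REGISTRY: Dict[str, Type[SourceDriver]] = {}


def register(driver_cls):
    """Class decorator: make a driver available for each type it supports."""
    for source_type in driver_cls.source_types:
        _REGISTRY[source_type] = driver_cls
    return driver_cls


@register
class DelimitedTextDriver(SourceDriver):
    source_types = ("delimited-v1", "delimited-v2")  # one driver, two versions

    def describe_structure(self, raw):
        return raw.splitlines()[0].split(",")  # assumed: first line names fields

    def read_records(self, raw):
        names = self.describe_structure(raw)
        return [dict(zip(names, line.split(",")))
                for line in raw.splitlines()[1:] if line]


@register
class FixedWidthDriver(SourceDriver):
    source_types = ("fixed-width",)
    layout = (("date", 8), ("amount", 6))  # assumed field names and widths

    def describe_structure(self, raw):
        return [name for name, _ in self.layout]

    def read_records(self, raw):
        records = []
        for line in raw.splitlines():
            pos, record = 0, {}
            for name, width in self.layout:
                record[name] = line[pos:pos + width].strip()
                pos += width
            records.append(record)
        return records


def driver_for(source_type: str) -> SourceDriver:
    """What a main process might call to activate the right driver."""
    if source_type not in _REGISTRY:
        raise ValueError(f"no driver registered for {source_type!r}")
    return _REGISTRY[source_type]()
```

Supporting a new kind of source then means adding one more registered class, which mirrors the modular extensibility the description calls for.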
Once the data source or sources have been analyzed (as described more fully below), the results of those analyses can be used in many ways, including providing the user with access to the information in the data sources in order to view or edit it, copying all or some of the data, preferably with enhancements (as described below), to create a new database, creating data reports (to view, print, store, transmit, etc.), queries and the like. In the embodiment of Figure 8, after the main process 802, using the drivers 804, performs an analysis of the source data 806a through 806d, it can create one or more new databases 808 containing data from one or more of the different data sources 806a to 806d. In one embodiment, a new database is created for each data source. It might be desirable to join two or more of these databases, for example using standard database techniques, such as when those databases have a similar structure. In another embodiment, a database 808 may contain information from two or more data sources (for example, if a company uses one database or other data source to store sales information and another database or other data source to store employee information).
In at least some embodiments, some amount of preparation, reformatting or other processing is preferably applied to the source data in connection with the creation of a new database 808. Although in one embodiment all of that processing is handled by the appropriate drivers, in other embodiments part of that processing could be handled within the data source 806 and/or in the new database 808. In one embodiment, after the data has been prepared or processed in this way, it can be used to populate the new database 808 by employing a universal routine, that is, a routine that can operate in connection with any of a plurality of different data sources.
The new database or databases 808 can, if desired, be used to generate reports, for example using a report writer 810, and can, if desired, be used to enter, view or analyze data, for example using a database management system 812 or another program 814. In one embodiment, the database 808 is a Microsoft® Access database that includes base code having one or more wizards, templates, filters and/or tool-kit programs (as those terms are understood by those skilled in programming with Microsoft® Access), for example to provide database reporting and analysis such as standard financial reports. In one embodiment, administrative and financial reporting programs are provided as an extension or modification of those available under the METAVIEW™ trade name from Timeline Inc. of Bellevue, WA. In one embodiment the information in the new database or databases 808 is enhanced, for example by the main process 802, in the sense that it is configured to generate, display or output analyses or data relationships that were not displayed by, or available using, the source data 806a through 806d.
As depicted in Figure 13, in one embodiment, the new database 808 is just one of many databases that are formed. In the depicted embodiment, a plurality of databases 1302a-1302c are provided by using the new database 808 as a source of information for distributing databases 1302a-1302c to the end users of the information. Each of the databases 1302a-1302c is populated with at least some information from the database 808. In one embodiment, the databases 1302a-1302c may differ from one another, for example by having different data and/or different database structures, reports, queries and the like.
As an example, when database 808 contains information pertaining to a plurality of different companies (e.g., different subsidiaries of a parent company), each of the databases 1302a-1302c could contain information to be used by one of the companies and would then typically contain only information related to that company. Since the different subsidiary companies may be organized differently, each one may have its own reports or other predefined structures. It is also possible that one or more of the databases 1302a-1302c could contain information related to two or more of the subsidiary companies, such as account information summarizing the activity of all subsidiaries (or a selected group of them), for example for use by the parent company.
The databases 1302a-1302c can originate from a so-called push procedure (a series of commands that originate from the computer system where the new database 808 resides) and/or from a so-called pull procedure (a series of commands that originate from the computer system where the databases 1302a-1302c reside). Each database 1302a-1302c can be constructed as a result of querying database 808 and selecting the structure and data to be included in that database 1302a-1302c, as will be clear to those skilled in the art after understanding the present description. In one embodiment, the databases 1302a-1302c can be used in a normal manner by the end user(s) of each database, which typically would involve adding or updating information in the databases 1302a-1302c. Preferably that new or updated information is also written back, preferably in a partially or fully automatic manner, to the database 808. In this way, others who access the database 808, or who access another database 1302a-c that could contain information related to what has been added or updated (for example, the parent company), will receive up-to-date data.
The write-back to the database 808 may occur periodically, on request, or each time any of the databases 1302a-1302c is updated. In some cases the changes made to the databases 1302a-1302c are better handled by changing some of the structure of the database 808 (such as by adding new tables, new fields, new indexes, new reports and the like). Preferably, the desirability of changing part of the structure of the database 808 is detected at least partly automatically, such as by comparing the present structure of database 808 with structure definition information that is automatically detected in databases 1302a-1302c, for example by using techniques similar to those described above in relation to detecting structure definition information from data sources 806.
In one embodiment, some or all of the databases 1302a-1302c are configured with queries, reports and similar structures, but at least some or all of the data remains permanently stored in the database 808. For example, in one embodiment, a database 1302a may have certain queries defined, but when a query is executed, a communication link 1304 is used so that the query is executed on the data residing in the database 808. In one embodiment, the information resulting from the execution of the query is cached in a data storage device or memory, either at the site of the database 808 or at the site of the requesting database 1302. If desired, that caching can be performed only when at least a threshold number of requests have been received for the same query. Regardless of whether a threshold is implemented, the memory or storage available for caching can be recycled, for example on a least-recently-used basis, as understood by those skilled in the art after understanding the present description.
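The caching behavior described above (store a query result only after a threshold number of requests, and recycle storage on a least-recently-used basis) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the class name `QueryCache`, the `run_query` callable standing in for execution against database 808, and the capacity and threshold values are all assumptions.

```python
from collections import OrderedDict


class QueryCache:
    """Hypothetical cache: stores a result once a query has been requested
    `threshold` times and evicts the least recently used entry when full."""

    def __init__(self, run_query, capacity=2, threshold=2):
        self.run_query = run_query      # assumed callable that executes a query on database 808
        self.capacity = capacity        # maximum number of cached results
        self.threshold = threshold      # requests needed before a result is cached
        self.requests = {}              # query -> number of uncached requests seen
        self.results = OrderedDict()    # query -> cached result, least recently used first

    def get(self, query):
        if query in self.results:
            self.results.move_to_end(query)        # mark as most recently used
            return self.results[query]
        self.requests[query] = self.requests.get(query, 0) + 1
        result = self.run_query(query)
        if self.requests[query] >= self.threshold:
            self.results[query] = result
            if len(self.results) > self.capacity:
                self.results.popitem(last=False)   # recycle the least recently used slot
        return result


calls = []


def run_on_808(query):
    """Stand-in for executing the query over communication link 1304."""
    calls.append(query)
    return f"rows for {query}"


cache = QueryCache(run_on_808, capacity=2, threshold=2)
cache.get("sales by region")   # first request: executed, not yet cached
cache.get("sales by region")   # second request: executed, then cached
cache.get("sales by region")   # third request: served from the cache
```

With these assumed settings the underlying query runs twice and the third request is answered from the cache.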
Preferably, before cached data is supplied in response to a query or other request, a check is made to determine whether there have been relevant changes to the data (for example, whether the cached data will be identical to the response that would result from executing the query on the most recent data available). That check should be made in relation to each potential source of data changes or updates. When the system is configured so that the database 808 can be accessed directly, the database 808 should be checked to determine whether any changes have been made directly that would affect the cached data. One way to perform this check is to store, along with the cached data itself, an identification of certain categories of data (for example, an identification of tables, fields and the like) that were used in producing the cached data. Each time a change is made in the database 808, it is determined whether any of the tables, fields, etc. used by the cached data are being changed. If so, the cached data associated with those tables or fields is marked (for example, by setting a flag) to indicate that the cached data may no longer be up to date. Preferably, the memory or storage used for that cached data is released for recycling (for example, for storing a response to another query or request). When the system is configured to write data back from the databases 1302a-c to the database 808, then, before cached data is supplied, it must be determined whether changes have been made to the databases 1302a-c (for example, changes to stored queries), not yet reflected in the database 808, that would make the cached data out of date. Similarly, when the system is configured so that changes can be made to the data sources 806 that may not be reflected in updates to the database 808, a check should be made for changes that could affect the currency of the cached data before that cached data is supplied in response to a request.
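One way to picture the check described above is a cache that records, for each stored result, the tables or fields it was built from, and flags the result as stale when any of them changes. The sketch below is illustrative only; the class name, the method names and the "table.field" naming of dependencies are assumptions rather than anything specified in this description.

```python
class DependencyTrackedCache:
    """Hypothetical cache that remembers which tables/fields each result used."""

    def __init__(self):
        # query -> {"result": ..., "depends_on": set of names, "stale": bool}
        self.entries = {}

    def store(self, query, result, depends_on):
        self.entries[query] = {
            "result": result,
            "depends_on": set(depends_on),
            "stale": False,
        }

    def note_change(self, changed):
        """Called whenever tables or fields of database 808 are changed."""
        changed = set(changed)
        for entry in self.entries.values():
            if entry["depends_on"] & changed:
                entry["stale"] = True          # flag: may no longer be up to date

    def lookup(self, query):
        """Return the cached result, or None if it is missing or stale."""
        entry = self.entries.get(query)
        if entry is None or entry["stale"]:
            self.entries.pop(query, None)      # release the storage for recycling
            return None
        return entry["result"]


cache = DependencyTrackedCache()
cache.store("balances", [("cash", 100)], depends_on={"accounts.balance"})
cache.store("staff", [("Lee",)], depends_on={"employees.name"})
cache.note_change({"accounts.balance"})        # a direct edit to database 808
stale_lookup = cache.lookup("balances")        # None: must be re-run
fresh_lookup = cache.lookup("staff")           # still valid
```

A fuller version would run the same kind of check against pending changes in databases 1302a-c and in the data sources 806 before answering from the cache.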
In some cases, multiple checks should be made (checks of the database 808, the databases 1302a-c and the data sources 806) before the cached data is supplied in response to a request.
Figure 9 is a schematic depiction of different drivers 804a through 804d. Each driver includes a plurality of defined processes or functions 901 through 910. Each function may include computer program instructions 912, for example to implement and carry out one or more of the steps described below and depicted in Figure 10. In one embodiment, each function 901-910 is a callable subroutine or procedure. Functions 901 through 909 defined in a given driver 804b include functions that must be performed, or performed differently, depending on the source data type 806a, 806b. Thus, for example, in relation to function one 901, which is a function designed to select certain directories on a hard disk or other information storage device where the desired information is stored, the procedure for selecting directories will differ depending on the type of source data 806, as can be seen, for example, from a comparison of the example of Figure 2 with the example of Figure 5. Accordingly, the programming 912 that implements function one in the first driver 804a may be different from the programming code that implements the corresponding function in the second driver 804b. In this way, the drivers each define one or more procedures for performing a given function, with the procedures configured to accommodate the differing characteristics of two or more different types of data sources.
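The arrangement in which every driver exposes the same named function but implements it differently for its own type of data source can be sketched as a common interface with per-driver implementations. The two directory layouts below are invented for illustration (they are not the layouts of Figures 2 and 5), and the class and function names are assumptions.

```python
from abc import ABC, abstractmethod


class Driver(ABC):
    """Hypothetical common interface; each driver supplies its own version
    of a function such as 'select directories' (function 901)."""

    @abstractmethod
    def select_directories(self, root, listing):
        """Return the directories that should later be searched."""


class OneDirectoryPerCompanyDriver(Driver):
    # Assumed layout: each company's files live in their own subdirectory.
    def select_directories(self, root, listing):
        return sorted(f"{root}/{name}" for name in listing)


class SingleDirectoryDriver(Driver):
    # Assumed layout: all files live directly in the root directory.
    def select_directories(self, root, listing):
        return [root]


def main_process(driver, root, listing):
    """The caller invokes the same function regardless of the source type."""
    return driver.select_directories(root, listing)


per_company = main_process(OneDirectoryPerCompanyDriver(), "/data", ["west", "east"])
single = main_process(SingleDirectoryDriver(), "/data", ["west", "east"])
```

The main process needs no knowledge of either layout; it only relies on each driver providing the function under the same name.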
For example, Figure 11A depicts a portion of a procedure, expressed in pseudocode, of a type that can be used for selecting and/or searching directories in connection with the directory structure depicted in Figure 2, while Figure 11B shows corresponding portions of pseudocode for a procedure that can be used for selecting or searching directories in the directory structure depicted in Figure 5. Those skilled in the art will understand, from the examples of Figures 11A and 11B, how to configure drivers to perform the same function on two different types of source data. Although Figure 9 depicts a driver having nine functions, the present invention can be used in connection with a driver having more or fewer functions. It is possible to configure a system in which different drivers define different numbers of functions and/or in which one or more functions are configured to return a null value or a constant value or information.
Many methods can be employed to initiate 1002 the procedure depicted in Figure 10. In one embodiment, the procedure of Figure 10 is implemented using a computer program stored on a medium such as a hard disk, CD-ROM or other non-volatile medium, and the procedure is initiated by issuing instructions to the computer (for example via keyboard, mouse, etc.) to launch the program, for example to load the program into memory and execute it. Alternatively, the program can be launched by another program. For example, in one embodiment, the new database 808 is a Microsoft® Access database that can include a routine, such as a so-called "wizard", to launch the program, with the program (Figure 10) in turn accessing data in the data source 806 to populate or update the database 808.
In this embodiment, it may be useful to use the wizard to display prompts or dialog boxes for requesting input from the user as needed (for example in step 1020), so that the user interface will have an appearance that is consistent with the user interface of the database 808. It should be noted that even though a process such as the one described here is partially or fully automatic, and/or can be performed without the need for user input, it may nevertheless be desirable to provide for user input for different purposes, for example to provide options for reducing processing times, for eliminating or selecting default or optional features, and the like.
In the procedure depicted in Figure 10, the first step after the procedure starts 1002 is to identify and initialize the dynamic drivers 1004. In this context, the drivers 804 are considered dynamic in the sense that drivers can be added or deleted modularly, for example to accommodate a new or different type of data source. For example, a user may initially be provided with a system such as the one depicted in Figure 8 having four drivers, but may at a future time add further drivers by purchasing them from a program vendor, downloading them from an information service, network, Internet connection and the like, or by writing a custom driver. Because of the dynamic and modular nature of the drivers, it will not be known in advance which drivers are available, and so, when the program starts 1002, the program identifies the drivers that are available to it. In one embodiment this is done by searching a disk, or a directory of that disk, for files that have predetermined file extensions or (partial) file names. In one embodiment, the program may additionally analyze selected portions of each file, for example header information, to verify that the files identified by file name and/or extension are in fact the desired drivers.
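The driver identification step can be illustrated with a short sketch that scans a directory for files with a predetermined extension and then confirms each candidate by inspecting its header. The `.drv` extension and the `DRIVER1` signature are purely hypothetical; the description does not specify either, and a real system would then load and link the driver functions (for example from a DLL).

```python
import tempfile
from pathlib import Path

DRIVER_SUFFIX = ".drv"       # hypothetical predetermined file extension
DRIVER_MAGIC = b"DRIVER1"    # hypothetical signature expected in the file header


def find_drivers(directory):
    """Return the names of files that look like drivers: the extension must
    match and the header must begin with the expected signature."""
    found = []
    for path in sorted(Path(directory).glob(f"*{DRIVER_SUFFIX}")):
        with path.open("rb") as handle:
            if handle.read(len(DRIVER_MAGIC)) == DRIVER_MAGIC:
                found.append(path.name)
    return found


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "dbase3.drv").write_bytes(DRIVER_MAGIC + b" payload")  # valid driver
    Path(tmp, "notes.drv").write_bytes(b"not a driver")              # right name, wrong header
    Path(tmp, "readme.txt").write_bytes(DRIVER_MAGIC)                # right header, wrong name
    available = find_drivers(tmp)
```

Only the file that passes both the name test and the header test is reported as an available driver.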
Initializing the drivers typically includes identifying and linking the driver functions and initializing the data within each driver. It is then determined whether the procedure will perform an import or an update 1006. In an import, a procedure is performed in which all or most of the data and structure in the data source is accessed and saved to a new database. In an update, a procedure is performed in which only a selected portion of the data and/or structure is accessed, for example to ensure that the information in the new databases 808 reflects recent changes or additions that may have been made to the data source 806. In a typical situation, an import will be performed the first time that the system of Figure 8 accesses or uses information from a given data source, or if relatively large changes or additions have been made to the data source. Typically, an update is performed on a regular basis (for example daily, weekly, etc.) in order to synchronize the data in the new database 808 with the data in the source data 806. In one embodiment the choice between import and update 1006 is made automatically, for example by performing an update unless this is the first time that the procedure has accessed the particular data source. In another embodiment, a user is allowed to choose between importing and updating by providing an input, for example via a keyboard selection, using a pointing device and the like.
If an import 1008 has been selected, the main process 802 will initiate execution of a function of one or more of the drivers to select the directories to be searched 1010. Which driver 804 is loaded or called by the main process 802 will depend on what type of source data is being accessed. In particular, for a given data source 806a, the main process 802 will employ the driver 804b that is configured to accommodate that source data type 806a.
If more than one data source 806 is to be accessed, the main process 802 will use whichever driver 804 is configured for each type of data source 806. Preferably, the type of each data source is determined automatically based on characteristics such as the names (or extensions) of the files and/or directories, the number, size and structure of the files, and header and other information in the files. In another embodiment, the user is allowed or requested to indicate the type of data source (for example, by identifying the brand name and version number of the program that was used to create the data source files) or to indicate whether the user wants the procedure to search only files on the local disk or to perform a search that includes network files.
In one embodiment, a driver (or driver subroutines or parameters) is selected depending on the language used in the source data 806. For example, when determining the type of the data source depends, in part, on a file name or on file header information, that name or information can take a different form (even for the same brand of source data program) depending on whether the source data is installed for use by a speaker of Spanish, English, Japanese, etc. Accordingly, in one embodiment, different drivers are used for the same type of data source when it is installed for speakers of different languages. Alternatively, it is possible to use substantially similar drivers for data sources in different languages, but to configure the drivers to obtain the file names or similar items to be looked for in the appropriate language(s), for example from a table, from passed parameters or the like. In this way the drivers can analyze databases in any of a plurality of languages.
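The table-driven alternative, in which one driver looks up the file name it should expect for each installation language, might be sketched as follows. The source-type names, language codes and file names are invented for illustration and do not come from this description.

```python
# Hypothetical lookup table: the file name that identifies a data source type,
# keyed by (source type, installation language). All entries are assumptions.
EXPECTED_FILES = {
    ("ledger_app", "en"): "accounts.dat",
    ("ledger_app", "es"): "cuentas.dat",
    ("sheet_app", "en"): "budget.wks",
}


def detect_source(file_names):
    """Return (source_type, language) for the first known identifying file
    found among `file_names`, or None if the source is not recognized."""
    present = {name.lower() for name in file_names}
    for (source_type, language), expected in EXPECTED_FILES.items():
        if expected in present:
            return source_type, language
    return None


spanish_install = detect_source(["CUENTAS.DAT", "readme.txt"])
unknown = detect_source(["holiday.jpg"])
```

A real driver would typically combine such name checks with the header and structure checks mentioned above before settling on a source type.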
Although one embodiment provides that language capability substantially automatically, it is also possible to configure the drivers to allow (or require) user input that specifies one or more languages of the source data. In addition, when a new database 808 is constructed, it can be constructed using a language (for example for field names, captions, titles and the like in the output of the database 808) that is the same as, or different from, the language(s) in which the data source 806 was installed, for example using a table that provides the corresponding field names, captions, titles and the like in a plurality of different languages. That configuration can be used, for example, to take a database or other data source that was created or installed for use in a first language and, at least partially automatically, create or update databases 808 that are directed to different languages. This gives the end user the ability to use his or her native language to query a foreign-language database. For example, an end user can force a local-language database 808 to update itself (if needed), based on information in a foreign-language data source 806, by sending a query or a report request in the native language to the local-language database 808. Similarly, databases 1302a-c (Figure 13) can be configured with field names, identifiers, titles, captions and the like in different languages, with each having access to the same raw data by accessing or updating from the database 808, in order to display the requested data in a report using the local currency and/or local-language identifiers.
At the end of step 1010, the main process 802 will have access to a stored list of the directories that will be searched for all data sources 806, as identified by the drivers 804 that were identified in step 1004.
After step 1010, the main process 802 loads, activates or calls another function 902 of the drivers 804a through 804d in order to search the directories that were selected in step 1010 for the data to be imported 1012. The directory search 1012 is performed in a manner that will depend on how the information is stored in the different data sources 806. For example, for some types of source data it may be sufficient to identify the files having a certain file name and/or a certain file extension. For other types of source data 806 it may be necessary to scan the data in different files to identify files that have a certain structure or content, for example in a header portion of a file or at another location. Thus, different drivers 804 will be configured to provide the "search directories" function 902 in different ways, to accommodate different data sources 806.
If an update 1014 is to be performed instead of an import 1008, it is not necessary to select and search directories since, preferably, when an import is initially performed, the results of functions 901, 902 for selecting and searching directories are stored in a way that can be accessed by the main process 802 at a later time. Thus, using this stored information, the main process 802 is able to identify data that was previously imported or updated. In one embodiment, this is useful for preventing the loading of redundant data, for example data that is already present in the new databases 808. In general, it is desired in step 1016 to identify data that is new or changed since the last import or update, such that at least some data already in the databases 808 will not be reloaded. In one embodiment, in order to prevent lengthy data loads, the system will attempt to identify data that has not changed since the last import or update.
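The idea of loading only what is new or changed since the last import or update can be sketched by comparing a stored fingerprint of each source file with its current fingerprint. The fingerprint could be a size and modification time, a checksum, or anything else a driver can compute; the function name and the sample values below are assumptions.

```python
def plan_update(previous, current):
    """Compare fingerprints saved at the last import/update with current ones.

    Both arguments map a file name to a fingerprint (hypothetical values here).
    Returns (files to load, files that can be skipped)."""
    to_load = sorted(name for name, mark in current.items()
                     if previous.get(name) != mark)
    unchanged = sorted(name for name, mark in current.items()
                       if previous.get(name) == mark)
    return to_load, unchanged


# Fingerprints recorded when the data was last imported (assumed values).
previous = {"jan.dat": "a1", "feb.dat": "b1"}
# Fingerprints observed now: feb.dat was edited and mar.dat is new.
current = {"jan.dat": "a1", "feb.dat": "b2", "mar.dat": "c1"}

to_load, unchanged = plan_update(previous, current)
```

Here only the edited file and the new file are queued for loading, while the untouched file is skipped.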
In general, if this process is followed, at the end of the procedure the data in the new databases 808 will be synchronized with the information in the source data 806; that is, it will contain information that accurately reflects the structure and data of the data sources 806 in their current state. In the embodiment of Figure 10, an identification of the data that will be imported or updated is displayed, in order to give the user an opportunity to select the data to be imported or updated or to choose to refrain from importing or updating certain data. The way in which the indication of the data to be imported or updated is organized for display will depend on what type of data source is being accessed, and will be provided in response to a call or activation of a function in one of the drivers 804 (902a). For example, a function 902a of a driver that is configured for use in connection with the data source depicted in Figures 4A through 4F may show a list of the subsidiary companies listed in table 430 (Figure 4F) in order to give the user an opportunity to import or update data for some companies but not for others. Preferably, the user can select one or more companies from the displayed list. The display and selection steps 1018, 1020 may be repeated for other types of organizations or data in the data source, for example to display and allow the selection of data specific to a certain employee 416 or to certain accounts 412, depending on how the display function 902a of the driver is written or configured. In some cases, it can be determined in advance that one always wants to import or update all available information from the data source, and so the function 902a for the applicable driver can merely return program flow to the main process 802.
For example, in relation to the data source depicted in Figures 7A to 7D, it can be determined that all available surface temperature data for all of the sites will always be included in each import or update. In one embodiment, the user can be provided with a display indicating the location of the data corresponding to the user's selection, such as a display of the directories, sub-directories and files containing the information, and may optionally be given an opportunity to select which directories, files, etc. will be accessed.
Once it has been determined, for example by means of steps 1018 and 1020, which data will be imported or updated, general information is loaded 1022. If it is desired to access information from two or more data sources, this can be done either in series (for example, performing steps 1022 through 1046 on a first data source using a first driver, followed by performing steps 1022 through 1046 on a second data source using the appropriate driver, and so on) or in parallel (for example, performing each step on every desired data source, using the appropriate drivers, before performing further steps on each data source). The general information includes information about the structure of the data in the data source. The type of general information that is loaded in this step 1022 will vary for different types of source data. For example, for a function 903 that is written or configured for use in connection with a database such as that depicted in Figures 4A through 4F, general information may include, for example, an account identifier or other categories used in the data source 806. On the other hand, if the function 903 of a driver 804 is configured or written for use in connection with the data source depicted in Figure 1, it may still be desired to determine how many components are used in the data source, but in this case this information will be determined by the number of flat files 101a to 101f that are in the data source 806. The general information may also include information such as how many projects 112, how many products 114 and/or product lines 116, or how many subsidiaries 118 are defined in data source 806. General information may also include the name of the company, the first month of the fiscal year and, in general, any other information that can be loaded once (as opposed to, for example, the information loaded in steps 1024, 1033 and 1036, which is typically loaded in a loop).
If a "load general information" function 903 is provided in a driver configured for use in connection with the data source depicted in Figure 6, general information such as the number of addresses 612 in the database could be loaded in step 1022.
The main process 802 also calls or activates a function 904 of the appropriate driver or drivers 804 to load data definitions 1024. The data definitions may include information such as the text name stored as an identifier for a particular class or category of data in the data source 806, the field size, the data type (string, integer or decimal, number of decimal places) and similar characteristics for the different data categories. Loading data definitions includes interrogating the data to obtain the information needed to store an indication of the architecture or structure of the information in the data source, and of the data elements in the data source, as required to generate one or more new databases 808 that will contain all the structure and data necessary for the type of reporting or analysis that will be performed on the new database. The interrogation of the data in the "load data definitions" step is an intelligent interrogation in the sense that it can be adapted to virtually any data source and can identify what is required to store a standard form of the data source, for example for reporting and analysis. In the example of Figures 4A through 4F, the information needed to indicate the architecture of the source data will include, for example, the names of the four account parts (Account, Company, Employee and Address) as well as the data type (for example, numeric or string) and the length required to store each part of the account string. In the example of Figure 6, the information needed to indicate the architecture of that data source will include the names of the account parts (Address and Date) as well as the names of the references used for this data (Unit).
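As a rough illustration of loading data definitions, the sketch below interrogates a stand-in data source and records the name and declared type of each field. It assumes, purely for demonstration, that the source is an SQLite database with a single `ledger` table whose columns mirror the four account parts named above plus an amount; the sources discussed in this description are of other kinds, and a real driver would use whatever interrogation its source type requires. `PRAGMA table_info` is SQLite's standard way of listing a table's columns.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class FieldDefinition:
    """Hypothetical record of one data definition loaded in step 1024."""
    name: str
    data_type: str


def load_data_definitions(connection, table):
    """Interrogate the stand-in source for the fields of `table`."""
    rows = connection.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return [FieldDefinition(name=row[1], data_type=row[2]) for row in rows]


# Stand-in data source (an assumption made only for this sketch).
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE ledger ("
    "Account TEXT, Company TEXT, Employee TEXT, Address TEXT, Amount REAL)"
)

definitions = load_data_definitions(source, "ledger")
field_names = [definition.name for definition in definitions]
```

The resulting list of definitions is the kind of information the main process would need in order to create matching tables in the new database 808.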
The interrogation can include identifying other optional data that can be loaded, for example invoice numbers. The particular type of interrogation used will depend on the characteristics of the particular data source being analyzed and is thus different for each driver 804. In general, the steps of loading data or information 1022, 1024, 1033, 1036 are performed by functions in the drivers 804, while the steps of saving information 1026, 1028, 1030, 1032, 1034, 1038 are performed by the main process 802.
The main process 802 then identifies or creates a repository database 1026, for example by identifying or creating a file or other storage structure in the new databases 808 that will serve as the location where the information loaded from the data source or sources 806 will be saved. Typically an update does not need to create a new database or database table, since updates are usually simply added to existing tables within an existing database. In one embodiment, the new database(s) has a predefined structure and, accordingly, in this embodiment it is not necessary to create or define a structure for the new database, or to obtain information from the data source(s) for the purpose of defining that structure. However, it is anticipated that, commonly, not all of the possible tables or other data repositories of the predefined database will be filled with data from a data source. For example, the predefined database may have a table for holding information about company divisions, while the data source may relate to a company that has no divisions. Thus, in this embodiment, the driver(s) is preferably configured to determine which data from the data source(s) should be loaded into which tables (or other data structures) of the predefined database, depending, for example, on what data are available in the data source(s).
When tables in database 808 have not yet been created, the main process 802 then calls the appropriate function 905 of one or more drivers to create database tables 1028 that will be used to store data saved from the source data to the new database 808. The database tables are preferably created in a manner that takes into account both the structure of the data in the data source or sources 806 and the way in which the new databases 808 will be used, for example for analysis, generating reports, etc. Because the particular tables that are created vary depending on the characteristics of the information in the data source 806 (as determined, for example, by steps 1022 and 1024), the creation of database tables 1028 is a function provided by the drivers 804 configured for the particular data source that is being accessed. For example, when a "create database table" function 905 is written or configured for use in connection with a data source such as the one depicted in Figures 4A through 4F, the database tables that are created can include, for example, an accounts table, an employees table, a subsidiaries table (which would be an account rollup) and a detail table (as described in more detail below), while a "create database table" function written or provided in a driver configured for use in connection with the database depicted in Figures 7A through 7D may create an address table, a date table, a time table, a unit table and a detail table.
Although the structure and data for the new database 808 may depend, at least in part, on what information is available in the data source(s) 806, it is also possible to configure the system so that the structure and/or the data to be loaded into the new database 808 can be at least partially specified or selected (manually or automatically) from among a plurality of options.
For example, the system can be configured so that the user can specify or select (or the system can, by default, automatically configure itself with) an overall general database structure for the new database 808. In one variant, the system is configured to recognize certain commonly used terms (for example, "Net Income") and to use controllers that, for each type of data source, automatically map those terms to the commonly used definitions, commands, subroutines and the like that obtain the data (and perform calculations as needed) required, in the context of the type of data source for which the controller is configured, to provide "net income" information in the new database 808. In an environment with multiple data sources, this allows the user to use the same requests to obtain corresponding numbers from each source without, for example, having to have knowledge of the underlying data sources 806 or their structures, commands, etc. In this way, one or more user queries or specifications, preferably stated in "natural" or semantic language, can cause the appropriate controller to build the desired new database 808.
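The term mapping described above, in which the same commonly used term is resolved differently for each type of data source, might look something like the following sketch. The source type names, table layouts, and the SQL definitions of the term are all assumptions made for illustration; amounts are stored as integers to keep the arithmetic exact.

```python
import sqlite3

# Hypothetical per-source definitions of the same business term.
TERM_MAP = {
    "ledger_a": {"net income": "SELECT SUM(credit) - SUM(debit) FROM entries"},
    "ledger_b": {"net income": "SELECT SUM(amount) FROM postings"},
}


def resolve(term: str, source_type: str) -> str:
    """Map a user-facing term to the query that computes it for one source type."""
    return TERM_MAP[source_type][term.strip().lower()]


def answer(term: str, source_type: str, conn: sqlite3.Connection) -> int:
    """Run the source-specific definition of the term and return the number."""
    return conn.execute(resolve(term, source_type)).fetchone()[0]


# Two sources with different structures holding comparable information.
conn_a = sqlite3.connect(":memory:")
conn_a.execute("CREATE TABLE entries (credit INTEGER, debit INTEGER)")
conn_a.executemany("INSERT INTO entries VALUES (?, ?)", [(100, 0), (0, 30)])

conn_b = sqlite3.connect(":memory:")
conn_b.execute("CREATE TABLE postings (amount INTEGER)")
conn_b.executemany("INSERT INTO postings VALUES (?)", [(250,), (-50,)])

result_a = answer("Net Income", "ledger_a", conn_a)
result_b = answer("Net Income", "ledger_b", conn_b)
print(result_a, result_b)
```

The caller issues the same request against both sources and never sees the underlying table or column names.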
Preferably, the tables created in the new databases 808 have a structure or architecture that is dynamic in the sense that it can accommodate virtually any type of data definitions or structures that might be found in the different data sources 806. In one embodiment, it is intended that the new databases 808 be used primarily for output of information, such as generating reports and analyses, and they are thus preferably configured, as described below, to provide superior output performance, such as high flexibility in the types of outputs and data analyses available and relatively rapid execution of those analyses and/or outputs. In this context, a database is optimized for speed and/or flexibility of output if it provides output speed or flexibility that is greater than that of another possible configuration. Thus, in this context, "optimized" does not necessarily require mathematically precise optimization. In one embodiment, three general types of tables are provided in step 1028: a plurality of category tables (including rollup tables where appropriate), at least one detail table and an input table. Preferably, a category table is provided for each way in which a particular data point or record can be categorized. For example, if the "create database table" function 905 is provided in a controller 804 that is configured for use with the database described in Figures 4A through 4F, the new database, as described in Figure 12, will contain a plurality of category tables 1202 that include, for example, an Accounts table 1203 that lists all possible account categories, a table of subsidiaries 1230 that lists all subsidiaries found in the data source 806, a table of employees 1216 that lists all employees recorded in the data source 806, and a table of addresses 1234 that lists the different addresses, sales regions, etc., noted in the data source 806.
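The category tables described above, together with a detail table that refers to them by index, could be created dynamically along the following lines. This is a minimal sketch using SQLite; the table names, column names, and sample transactions are hypothetical, and the patent does not prescribe any particular database engine or schema.

```python
import sqlite3


def create_tables(conn, categories):
    """Create one category table per category, plus a detail table that
    holds one index column for each category."""
    for cat in categories:
        conn.execute(
            f'CREATE TABLE "{cat}" (idx INTEGER PRIMARY KEY, name TEXT UNIQUE)'
        )
    cat_cols = ", ".join(f'"{cat}_idx" INTEGER' for cat in categories)
    conn.execute(
        "CREATE TABLE detail (idx INTEGER PRIMARY KEY, date TEXT, "
        f"description TEXT, quantity INTEGER, total INTEGER, {cat_cols})"
    )


def category_index(conn, cat, name):
    """Return the index of `name` in category table `cat`, adding it if new."""
    conn.execute(f'INSERT OR IGNORE INTO "{cat}" (name) VALUES (?)', (name,))
    row = conn.execute(f'SELECT idx FROM "{cat}" WHERE name = ?', (name,))
    return row.fetchone()[0]


conn = sqlite3.connect(":memory:")
create_tables(conn, ["account", "subsidiary"])

# Illustrative transactions: date, description, quantity, total, account, subsidiary.
transactions = [
    ("1998-01-05", "Widgets", 2, 40, "Sales", "East"),
    ("1998-01-09", "Paper", 1, 15, "Supplies", "East"),
    ("1998-02-11", "Widgets", 3, 60, "Sales", "West"),
]
for date, desc, qty, total, account, subsidiary in transactions:
    conn.execute(
        "INSERT INTO detail (date, description, quantity, total, "
        "account_idx, subsidiary_idx) VALUES (?, ?, ?, ?, ?, ?)",
        (
            date, desc, qty, total,
            category_index(conn, "account", account),
            category_index(conn, "subsidiary", subsidiary),
        ),
    )

# Report grouped by account, resolved through the category index.
totals_by_account = conn.execute(
    "SELECT a.name, SUM(d.total) FROM detail d "
    "JOIN account a ON a.idx = d.account_idx "
    "GROUP BY a.name ORDER BY a.name"
).fetchall()
print(totals_by_account)
```

Storing each category index directly in the detail record means that grouping by any category needs only a single join, which is one plausible reading of why such a layout suits reporting.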
In the embodiment described, each record or item in each of the category tables 1202 is associated with an index for use in a detail table, as described below. In the embodiment described in Figure 12, a detail table 1240 is provided which, once populated, will have a record for each account entry or transaction in the data source or sources 806. In the embodiment of Figure 12, four fields 1244 are provided, namely a date field 1242b, a description field 1242c, a quantity field 1242d and a total field 1242e, relating to a transaction in the data source (Figure 4A). An index field 1242a is provided to store an identifier or index number for each record. Additionally, a separate field is provided in each record to store an indication of any information pertaining to each of the categories defined in the general information loading step 1022, which, in the example of Figure 12, include the account category 1242f, subsidiary category 1242g, product category 1242h and employee category 1242i. In general, it is desirable to provide as many fields, for example, categories, as are present in the data source 806 and are needed to analyze or extract data. Thus, since it may be desirable to obtain a report that groups transactions according to the account to which each transaction relates, it is useful to have an account category. Likewise, since at some point it may be desirable to print a separate report for each subsidiary, or a report in which the transactions are grouped by subsidiary, it is useful to have a subsidiary category 1242g. In general, for each desired way of selecting, grouping, reporting, printing or analyzing the data, a separate field can be provided in the detail table 1240. The structure of the database described in Figure 12 can be contrasted with the structure of the data source described in Figures 4A through 4F and the structure of the data source described in Figure 1.
For example, in the structure described in Figures 4A through 4F, the manner in which a particular transaction (Figure 4A) was associated with a particular account (Figure 4B) was indicated in a separate table of links (Figure 4D), while in the embodiment of Figure 12 the index for the appropriate account 1242f is stored in its own field in the same record that contains the transaction information 1244. Thus, although a database 808 having tables such as those described in Figure 12 can store the information found in either a data source such as that described in Figure 1 or a data source such as that described in Figures 4A through 4F (or source data structured in another way), the structure or architecture of the database in the example of Figure 12 is different from that of either the data source described in Figure 1 or the data source described in Figures 4A through 4F. Similarly, the relational database structure of Figure 12 is different from the flat file structure described in Figure 1, even though the type of information stored in the two organizations is similar. The main process 802 can be configured to save the general information 1030 (loaded in step 1022) and save the data definitions 1032 (loaded in step 1024), for example, in additional tables provided in the databases 808, for example, for use in later steps of Figure 10 and/or in subsequent updates. The main process 802 uses appropriate functions 908 from one or more controllers 804 to load into the new databases 808 the data definition codes (e.g., field width, data type, etc.) for the different tables created in step 1028 (1033). In one embodiment, the rollup information is also loaded at this time. In general, rollup information refers to information used to define subcategories of data, for example, groups of items within one of the category tables.
As an example, as described in Figure 12, the employee category table may be associated with an address or region code, for example, to identify the address, site or region where each employee of the company is located or has responsibility. For example, a company may have numerous sales employees, each of whom is associated with a sales region. As another example, the different products of a company can be rolled up or grouped into product lines. By defining a field 1238 for an address rollup code, the structure in Figure 12 makes it possible to output a report grouped by sales region. Rollups can also be used to provide statistical analysis of grouped data, such as averages, means, standard deviations, etc. Although in the embodiment of Figure 12 the field for the address rollup code 1238 is shown as a field of the employee category table 1216, an address field 1238 may also be provided in the detail table if desired. The manner in which a function operates to load a rollup code 908 varies depending on the type of source data 806 with which it is configured to operate, and it is thus provided as a function of the different controllers 804 so that different programming instructions can be provided for use with different types of source data. As an example, a "define rollup code" function 908 can be provided in a controller 804 configured for use in connection with the database described in Figures 4A through 4F. In that data source, an address field has already been defined in the employee table 416 and can be used directly for address rollup code purposes. In contrast, in the embodiment described in Figure 1, there is no address indication for an employee 120 associated with a particular flat file 101a. Thus, in one embodiment, an address rollup for the data obtained from the data source of Figure 1 may not be possible.
If, however, there is, for example, another file that provides the home address of each employee of the company, it would be possible to use, for example, the state of residence of each salesperson to infer the sales region for which that person is responsible and thus inferentially define the address rollup code. As another example, in connection with the data source described in Figure 6, the "define rollup code" function 908 may contain a table indicating, for each temperature station 612, whether that station is in the northern hemisphere or the southern hemisphere, and could thus create a hemisphere rollup code on that basis. In some cases it might be desirable to provide a keyword database for word recognition and/or searching in order to define rollups and/or additional structures. In some situations, the rollup code will relate to information that was not used in the data source as a basis for analyzing or grouping data (for example, the hemisphere rollup code for the data source of Figure 6). Thus, in these situations, providing a rollup code amounts to augmenting the data by automatically providing additional items that were not available (or at least not used) for extracting or analyzing information in the data source 806. Preferably, the data are classified into categories and then grouped by means of an in-depth analysis of the source data. In addition to the defined rollups, the process can also store optional reference fields. In general, optional reference fields are fields that typically would not be used to group data, such as free-text fields (comments, memo fields, billing numbers, etc.), but that may be desired for inclusion in reports, etc. Following loading of the data definition codes and rollups, these data definition codes and rollups are saved 1034 in the new database 808, for example, by listing the categories in the different category tables 1202.
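The hemisphere example above, in which a controller function carries its own lookup table and uses it to attach a grouping (rollup) code that the source never recorded, can be sketched as follows. The station names, readings, and schema are invented for illustration only.

```python
import sqlite3

# Hypothetical lookup held inside the controller function: station -> hemisphere.
STATION_HEMISPHERE = {"Oslo": "north", "Cairo": "north", "Lima": "south"}

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE station (idx INTEGER PRIMARY KEY, name TEXT, hemisphere TEXT)"
)
conn.execute(
    "CREATE TABLE detail (idx INTEGER PRIMARY KEY, station_idx INTEGER, "
    "temperature INTEGER)"
)


def define_rollup_codes(conn):
    """Fill the station category table, adding the inferred hemisphere code."""
    for name, hemisphere in STATION_HEMISPHERE.items():
        conn.execute(
            "INSERT INTO station (name, hemisphere) VALUES (?, ?)",
            (name, hemisphere),
        )


define_rollup_codes(conn)

# Illustrative readings as they might arrive from the source: (station, temperature).
readings = [("Oslo", 4), ("Oslo", 8), ("Cairo", 30), ("Lima", 20)]
for name, temperature in readings:
    conn.execute(
        "INSERT INTO detail (station_idx, temperature) "
        "SELECT idx, ? FROM station WHERE name = ?",
        (temperature, name),
    )

# Report grouped by a code that did not exist in the source data.
average_by_hemisphere = conn.execute(
    "SELECT s.hemisphere, AVG(d.temperature) FROM detail d "
    "JOIN station s ON s.idx = d.station_idx "
    "GROUP BY s.hemisphere ORDER BY s.hemisphere"
).fetchall()
print(average_by_hemisphere)
```

Keeping the code on the category table rather than on every detail record means the grouping can be revised later without touching the loaded data.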
The loading and saving of the data definition codes and rollup codes 1033, 1034 are performed as a loop 1035 in order to load and save the specific categories (the specific accounts, companies, employees and addresses, in the example of Figures 4A through 4F). In summary, in relation to the embodiment described, step 1024 defines the categories (for example, Accounts, Company, Employees and Address) as well as their data types (for example, string or numeric). Step 1028 creates the category tables defined in step 1024 (and any other tables defined in step 1022). Step 1032 saves the data definitions, typically in a standard table. Steps 1033 and 1034 load and save the data definition codes and rollup codes. At this point in the process, although information regarding the structure of the data has been placed in the new databases 808, the data that are the principal subject of the source data 806 (for example, the accounting entries or transactions in the case of accounting source data, or the temperature data in the case of meteorological temperature source data) have not been loaded into the new databases 808. Accordingly, the main process 802 calls or activates a function 909 in the appropriate controllers 804 to load the data 1036, save the data 1038, and repeat the process 1039 until all the desired data have been loaded and saved 1040. Thus, at the end of this procedure 1036, 1038, 1040, the new database 808 will be populated with data from one or more data sources 806. In one embodiment, the information is verified as part of the system described herein. For example, rules can be defined in a controller or as part of a new database 808 to check the data that are extracted from the data source 806. This can take the form of validating data as they are brought into the new database 808 and, preferably, reporting any item that does not meet the criteria of those rules, so as to provide error reporting and trapping.
For example, in the context of accounting packages, the present invention can be configured to verify whether the "books" are balanced. When two or more data sources that have different structures and/or are produced using different brands or types of programs are combined by the use of the procedure in Figure 10, the data from those different types of sources can be populated into a common database structure, for example, as described in Figure 12. This facilitates common or standardized analysis and reporting of the data, preferably optimized to provide flexibility and speed of output. In the embodiment of Figure 10, the main process 802 can now build and, if desired, execute data queries such as summarization queries 1042. In general, there are at least three types of queries that can be constructed. A first type of query, which would be common to all the new databases 808 created using the procedure of Figure 10, can be provided, such as a query that returns the number of entries in the detail table or the number of entries for a given date range (for example, per quarter). Other queries may be constructed depending at least partially on the general information and data definitions obtained in relation to one or more of the data sources 806, including any rollups that may have been provided, and accordingly may, if desired, be provided as part of a controller 804 specific to a particular data source. A third type of query can be provided to replicate or include queries or reports that were used in the original data source (for example, as described in Figure 4E). Once a new database 808 is populated and the appropriate queries are constructed, the main process 802 can close the tables and databases 1044 and unload the dynamic controllers 1046, for example, to free memory.
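Two of the ideas in the passage above, a balance check on loaded accounting data and a generic summarization query that counts detail entries by date range, can be sketched as follows. The detail table layout and the sample entries are assumptions, with amounts held as integers so that the comparison is exact.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE detail (idx INTEGER PRIMARY KEY, date TEXT, "
    "debit INTEGER, credit INTEGER)"
)
conn.executemany(
    "INSERT INTO detail (date, debit, credit) VALUES (?, ?, ?)",
    [
        ("1998-01-15", 100, 0),
        ("1998-02-03", 0, 100),
        ("1998-05-20", 40, 0),
        ("1998-05-21", 0, 40),
    ],
)


def books_balanced(conn):
    """Validation rule: total debits must equal total credits."""
    debit, credit = conn.execute(
        "SELECT SUM(debit), SUM(credit) FROM detail"
    ).fetchone()
    return debit == credit


def entries_per_quarter(conn):
    """Generic summary query: number of detail entries per year and quarter."""
    return conn.execute(
        "SELECT strftime('%Y', date) AS year, "
        "(CAST(strftime('%m', date) AS INTEGER) + 2) / 3 AS quarter, "
        "COUNT(*) FROM detail GROUP BY year, quarter ORDER BY year, quarter"
    ).fetchall()


balanced = books_balanced(conn)
per_quarter = entries_per_quarter(conn)
print(balanced, per_quarter)
```

Since the query depends only on the detail table's date column, it could in principle be run unchanged against any database built to the same common structure.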
Although one major use contemplated for the present invention is in providing standardized and/or improved reports and analyses from one or more data sources, it is also possible to use the present invention for data entry and data storage by using a database management system (for example, Microsoft Access®, Excel®, FoxPro®, Btrieve®, etc.) in connection with the new databases 808. Although it is contemplated that most use of the present invention will involve continuing to use the original source data 806 for data entry and storage while maintaining a copy of the same information in the new databases 808 for reporting and analysis purposes, it is also possible to use the present invention to transfer data from one type of source data 806a, b, c to another type, for example 806c, by first storing the data in the new database 808, as described above, and then downloading or exporting the information from the new databases 808 to a different type of source data 806c. In light of the above description, many advantages of the present invention can be seen. The present invention facilitates the standardization of reports and analyses despite variety in the brands or types of data sources used. The present invention provides for a system that can optimize or otherwise improve its performance in extracting or reporting data. The present invention provides for reporting and analysis of data that can be improved compared with the reports and data analysis of the data sources themselves. Through sophisticated interrogation of the source data, in the context of an accounting system the present invention is able to mirror the chart of accounts laid out in the data source. In one embodiment, the process extracts some or all of the defined rollups, optional reference fields and accounting period information.
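The transfer between source types mentioned above, in which data are first stored in the new database and then exported to a different format, might be sketched as follows. Here a CSV text stands in for one type of source and JSON for another; both formats, the field names, and the function names are assumptions chosen purely for illustration.

```python
import csv
import io
import json
import sqlite3


def import_csv(conn, text):
    """Load rows from a CSV-formatted source into the intermediate database."""
    rows = list(csv.DictReader(io.StringIO(text)))
    conn.execute(
        "CREATE TABLE detail (date TEXT, description TEXT, total INTEGER)"
    )
    conn.executemany(
        "INSERT INTO detail VALUES (:date, :description, :total)", rows
    )


def export_json(conn):
    """Export the intermediate database contents in a different format."""
    cur = conn.execute(
        "SELECT date, description, total FROM detail ORDER BY date"
    )
    columns = [col[0] for col in cur.description]
    return json.dumps([dict(zip(columns, row)) for row in cur.fetchall()])


source_csv = (
    "date,description,total\n"
    "1998-01-05,Widgets,40\n"
    "1998-01-09,Paper,15\n"
)

conn = sqlite3.connect(":memory:")
import_csv(conn, source_csv)
exported = export_json(conn)
print(exported)
```

With the common database in the middle, each format needs only one importer and one exporter rather than a converter for every pair of formats.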
By performing the tasks automatically and avoiding (or reducing) the need for human analysis, the present invention is less labor-intensive and less time-consuming than previous methods, in some cases making it possible to populate new databases 808 in a matter of a few minutes or hours in situations that would require several days or weeks under the previous methods. In one embodiment, the controller 804 can be configured to detect, analyze and maintain in the new databases 808 any security measures, passwords, permissions, etc. that are used in the data sources 806. Thus, a system administrator does not need to maintain a new and separate set of accounts, passwords, permissions, etc. for the new databases 808 in addition to that maintained for the original data source 806. Preferably, the system can be configured to perform updates at substantially predetermined intervals, such as daily, weekly, etc. Many variations and modifications of the present invention may also be used. It is possible to use some aspects of the invention without using others. For example, it is possible to provide for populating a new database 808 without defining new or additional rollups. Although in the above description the different controllers 804 are provided as separate DLLs and are dynamic in the sense that as many can be added as desired simply by storing additional DLLs in the appropriate directory, it would also be possible to make an operable version of the invention in which the functions performed by the function modules are provided as portions or subroutines called by the main process 802 rather than as separately stored modules. While the invention has been described by way of a preferred embodiment and certain variations and modifications, other variations and modifications may also be used, the invention being defined by the following claims.

Claims (18)

  1. A computer-implemented method comprising: providing a first controller that issues instructions for accessing data stored in a first data source, said first controller containing program instructions configured to be used in relation to said first data source, wherein said first data source includes data identifiers in a first language; using said first controller to automatically obtain first information about the data structure of said first data source without the need for human analysis of the first data source; and creating a database to store at least some data of said first data source, said database having a structure based on at least some of said first information; wherein said database includes data identifiers in a second language, different from said first language.
  2. A method, as claimed in claim 1, wherein one of said first and second languages is English.
  3. A method, as claimed in claim 1, wherein said data identifiers are selected from the group consisting of field names, legends, labels and titles.
  4. A method, as claimed in claim 1, further comprising: consulting said database using at least a first term in said second language; and that it automatically obtains information from said first data source in response to said query, wherein said first term in said second language is absent from said first data source.
  5. A computer-implemented method comprising: providing a first controller that issues instructions for accessing data stored in a first data source, said first controller containing program instructions configured to be used in relation to said first data source; using said first controller to automatically obtain first information about the data structure of said first data source without the need for human analysis of the first data source; creating at least first and second databases to store at least some data of said first data source, at least one of said first and second databases having a structure based on at least some of said first information; changing the data in said second database to provide changed data; and automatically writing back said changed data from said second database to said first database.
  6. A method, as claimed in claim 5, wherein said step of writing back automatically is performed periodically.
  7. A method, as claimed in claim 5, wherein said step of writing back automatically is performed in response to said step of changing the data in said second database.
  8. A method, as claimed in claim 5, wherein said step of writing back automatically is performed when a request is made, on said first database, for information that corresponds to information that has been changed in said second database.
  9. A computer-implemented method comprising: receiving instructions from a user that include at least a first natural language term; providing a first controller that issues instructions for accessing data stored in a first data source, said first controller containing program instructions configured to be used in relation to said first data source; and using said first controller to automatically obtain first information from said first data source without the need for human analysis of the first data source, wherein said first information is information that is needed to perform said instructions.
  10. A method, as claimed in claim 9, further comprising: providing a second controller that issues instructions for accessing data stored in a second data source, different from said first data source; and using said second controller to automatically obtain second information from said second data source without the need for human analysis of the second data source, wherein said second information is information that is needed to perform said instructions, and wherein corresponding information from said first and second data sources is provided in response to said user's instructions.
  11. A computer-implemented method comprising: providing a first controller that issues instructions for accessing data stored in a first data source, said first controller containing program instructions configured to be used in relation to said first data source; using said first controller to automatically obtain first information about the data structure of said first data source without the need for human analysis of the first data source; creating at least first and second databases to store at least some data of said first data source, at least one of said first and second databases having a structure based on at least some of said first information; inputting, at a first time, at least a first query for execution by said first database to produce a first query result; storing said first query result; inputting, at a second time, said first query; and outputting said first query result in response to said step of inputting said first query at said second time, and without executing said first query after said second time.
  12. A method, as claimed in claim 11, further comprising checking for changes in data prior to said step of outputting said first query result.
  13. A method, as claimed in claim 12, wherein said step of storing said first query is performed only after said first query has executed at least a previously determined number of times.
  14. An apparatus that can be used in relation to accessing data stored in a first data source configured to generate at least a first output, the apparatus comprising a computer coupled to said first data source and programmed to: provide a first controller that issues instructions for accessing data stored in a first data source, said first controller containing program instructions configured to be used in relation to said first data source, wherein said first data source includes data identifiers in a first language; use said first controller to automatically obtain first information about the data structure of said first data source without the need for human analysis of the first data source; and create a database to store at least some data of said first data source, said database having a structure based on at least some of said first information; wherein said database includes data identifiers in a second language, different from said first language.
  15. A computer-readable medium having stored therein a computer program that can be used in relation to accessing data that can be stored in any of first or second different data sources, at least one of said first and second data sources configured to generate at least a first output, the computer program comprising instructions for: providing a first controller that issues instructions for accessing data stored in a first data source, said first controller containing program instructions configured to be used in relation to said first data source, wherein said first data source includes data identifiers in a first language; using said first controller to automatically obtain first information about the data structure of said first data source without the need for human analysis of the first data source; and creating a database for storing at least some data of said first data source, said database having a structure based on at least some of said first information; wherein said database includes data identifiers in a second language, different from said first language.
  16. A computer-implemented method comprising: providing a first controller that issues instructions for accessing data stored in a first data source, said first controller containing program instructions configured for use in relation to said first data source; and storing at least some data of said first data source in a database different from said database.
  17. A method, as claimed in claim 15, wherein said step of using said first controller to automatically obtain first information from said first data source includes selecting information depending on the information available in said data source.
  18. A method, as claimed in claim 16, further comprising obtaining user input, and wherein said step of using said first controller to automatically obtain first information comprises using said user input to automatically obtain said first information.

SUMMARY
Generation of outputs or reports from data contained in a data source, which may be any of two or more types of data sources, in a standardized or uniform manner is provided. A plurality of controllers specific to different types of data sources are provided, which include programming for identifying structural or other characteristics of the different data sources, for example, for use in defining a new database. Preferably the new database is configured to permit highly flexible and/or rapid output or reporting, or is otherwise optimized for reporting purposes. In one embodiment, the present invention includes conversion of one or more data sources into one or more uniform databases (812), preferably generating one or more key categories for organizing the data, and optionally generating category groupings or rollups and additional data or optional references.
MXPA/A/2000/012346A 1998-06-29 2000-12-13 Data retrieval method and apparatus with multiple source capability MXPA00012346A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US09106538 1998-06-29

Publications (1)

Publication Number Publication Date
MXPA00012346A true MXPA00012346A (en) 2002-05-09
