WO2020091627A1 - Method of forming and structuring an electronic database
- Publication number
- WO2020091627A1 (PCT/RU2019/000747)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- identifier
- column
- database
- final
- type
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
Definitions
- the invention relates to the field of data processing.
- the patent document RU 2386166 C2 discloses a method of structuring and functioning a database of regulatory documents, the method comprising the following steps: storing objects of a regulatory documentation in at least one data warehouse; interacting with the data warehouse in the following manner: the entire volume of regulatory documentation and the information contained in the documentation for any subject area of activity is placed in a separate area of the database and represented in the form of a three- dimensional information space; choosing three main comparand words and assigning their names to the X, Y, Z coordinate axes; determining components of the comparand words and plotting the components along the coordinate axes formed by unit vectors (unit segments), which determine the components of the comparand words; assigning codes based on designations used in practice; forming clusters, which are regions of the three-dimensional space, from the unit vectors; the regions are bounded by unit surfaces of the unit segments; forming a code of each cluster from codes
- the closest analogue of the claimed solution is the patent document RU 2650032 C1 (Aleksey Petrovich Semenov, April 6, 2018).
- the document discloses a method for creating an electronic database. The method is executed using a database management system through an electronic device by the following steps: structuring data into a single table with the use of a database management system; wherein the table contains at least four columns and at least five lines; forming a first line that displays a root element; storing a first identification number in a first column of the first line; forming a second line that displays the data type; storing a unique identification number of the data type in a first column of the second line; wherein the unique identification number of the data type is different from the first identification number, storing a name of the used data type in a third column; storing a code of the data type in a fourth column; forming a third line that displays a term, storing a unique identification number of the term in a first column of the third line; storing a name of
- the disadvantages of the above mentioned analogues are the complexity and awkwardness of the methods, a large number of operations, which leads to a decrease in the software performance and a decrease in the data processing rate, the difficulty in perceiving identifiers and their recognition during calculations and other operations with numeric values, an increase in a number of human errors, the difficulty during the operation with the database.
- the problem addressed by the present invention, and its technical result, is to provide a more convenient visual perception of identifiers and their recognition during calculations and other actions with numeric values in databases, to simplify the operation with a database, to reduce the significance of the human factor and thereby decrease human errors, to increase the software performance, and to significantly increase the data processing rate during the operation with databases.
Disclosure of the invention
- a method of forming and structuring an electronic database, the method being implemented using a database management system and / or programming languages by means of an electronic device according to the following steps: a) placing data into a table or tables with the necessary structure, wherein the structure contains at least one column for a final identifier of numeric values and / or an index that uniquely indicates the final identifier of numeric values; b) creating a table containing at least one column, and forming at least two basic elements of the identifier in lines, wherein one of the basic elements determines a type of the numeric value of the identifier; the column of the basic element of the identifier has a text data type that allows storing data in the form of a set of characters of the natural language and / or other characters supported by electronic databases; c) using the table created during the step b) with the addition of at least one column or creating another table containing at least two columns; and forming a preliminary identifier in
- the invention is a method of constructing/optimizing an organizational structure of data for enterprises/companies that need sequential information processing and require careful analysis of numeric data.
- the purpose of creating a table of basic elements of the identifier is to store and form basic elements of the identifier in the database.
- the table of basic elements of the identifier is conventionally called the Alphabet in this description of the invention and contains five columns sufficient to structure a semantic identifier of numeric data.
- a column of the basic element of the identifier has a text type of the data that allows storing the data in the form of a set of natural language characters and / or other characters supported by electronic databases
- the columns of the type, group, and priority may have a text, digital, or user-defined data type depending on the implementation of the invention; they cannot have a date or logical (Boolean) data type.
- a first column contains a text expression of the basic element of the identifier
- a second column contains a text and / or digital expression of the type of the basic element of the identifier
- a third column contains a text and / or digital expression of a group membership of the basic element of the identifier
- a fourth column contains a text and / or digital expression of an order number of the basic element of the identifier within a type; this order number expresses the priority of the element
- the fifth column contains a text and / or digital expression of a separator character of the basic element of the identifier. Columns of the type, group, priority, and separator may be absent in various implementations of the invention, wherein the column of the basic element of the identifier is always present.
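The five-column Alphabet described above can be sketched as a database table. This is a minimal illustration using Python's built-in `sqlite3` module. The table name, the column names, the sample Letters, and the convention of marking value-type Letters with type `V` are assumptions made for the example and do not come from the description.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
def create_alphabet(conn):
    # letter    : text expression of the basic element (always present)
    # type      : type of the basic element
    # grp       : group membership ("group" is avoided as a name, it is an SQL keyword)
    # priority  : order number within the type
    # separator : separator character of the basic element
    conn.execute(
        """
        CREATE TABLE alphabet (
            letter    TEXT NOT NULL,
            type      TEXT,
            grp       TEXT,
            priority  INTEGER,
            separator TEXT
        )
        """
    )

# Invented sample Letters; type "V" marks Letters that determine
# the type of the numeric value (an assumption of this sketch).
SAMPLE_LETTERS = [
    ("PLN", "1", "kind", 1, "_"),
    ("FCT", "1", "kind", 2, "_"),
    ("IN", "2", "flow", 1, "_"),
    ("TAPE", "3", "goods", 1, "_"),
    ("RUB", "V", "value", 1, "."),
    ("PCS", "V", "value", 2, "."),
]

conn = sqlite3.connect(":memory:")
create_alphabet(conn)
conn.executemany("INSERT INTO alphabet VALUES (?, ?, ?, ?, ?)", SAMPLE_LETTERS)

# Letters that determine the type of numeric value, in priority order.
value_letters = [
    row[0]
    for row in conn.execute(
        "SELECT letter FROM alphabet WHERE type = 'V' ORDER BY priority"
    )
]
```

Only the `letter` column is declared `NOT NULL`, reflecting the statement that the other four columns may be absent in some implementations.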
- the expression of the basic element of the identifier is a combination of natural language characters and / or other characters supported by electronic databases.
- the basic element of the identifier is conventionally called the Letter in this description of the invention.
- the Letter can provide a function of a group membership, a descriptive function, a function of the final identification of the type of numeric value assigned to the identifier (Table 2) and / or other functions provided by the implementation of the invention.
- the type of the Letter can be formed on the basis of functional membership or on the basis of other needs / preferences of the developer / user of the database, wherein the type performing the function of determining the type of numeric value is always created (examples are table 2, lines 10-14); this type is crucial for the formation of the final identifier and the binding of numeric data.
- the type of the Letter can be expressed in a digital and / or other form supported by fields of the column of the table of the database, and defines the primary grouping and sorting of the Letters for the purpose of further building the semantic identifier.
- the group of the Letter is formed on the basis of the needs / preferences of the developer / user of the database; the group can be expressed in a digital and / or other form supported by fields of the column of the table of the database, and defines the secondary grouping of the Letter for the purpose of further building the semantic identifier.
- the column of the group may have other functions or be absent in various implementations of the invention.
- the priority of the Letter is formed on the basis of the needs / preferences of the developer / user of the database; the priority can be expressed in a digital and / or other form supported by fields of the column of the table of the database, and performs the function of sorting within the type in the final identifier.
- the column of the priority may have other functions or be absent in various implementations of the invention.
- the separator of the Letter is formed on the basis of the needs / preferences of the developer / user of the database; the separator can be expressed in a digital and / or other form supported by fields of the column of the table of the database, and performs the function of visual separation of the Letters in the identifier.
- the separator can be expressed by one or several characters for all Letters or a character / characters defined in advance in the application code or be absent in various implementations of the invention.
Table of forming preliminary identifiers
- the purpose of creating a table of constructing preliminary identifiers is the formation and storage of preliminary identifiers in the database.
- the table for constructing preliminary identifiers is conventionally called the Dictionary in this description of the invention and contains six columns sufficient to build a semantic identifier for digital data and its further storage in the database.
- the first column of the table of the preliminary identifiers has a text data type that allows storing data in the form of a set of natural language characters and other characters supported by electronic databases.
- the remaining five columns which are columns of the basic element of the identifier, type, group, priority, and separator, are similar to columns of the table of basic elements of the identifier, and these remaining columns obey the same rules of forming.
- Columns of the type, group, priority, and separator may be absent in various implementations of the invention, wherein the column of the basic element of the identifier and the column of the preliminary identifier are always present.
- the expression of the preliminary identifier is a set of elements (Letters) formed in the Alphabet (table of basic elements of the identifier) and separators, combined by the concatenation operation in the application code or the database code, or joined manually. One or several elements (Letters) are selected within the Group and sorted first by Type, then by Priority (sorting may also be absent).
- the Letters determining the type of numeric value are not attached to the preliminary identifier.
- the preliminary identifier is conventionally called the Word in this description of the invention.
- the result of the formation of the Word is a text expression that allows providing the accurate identification of the object of measurement, its group membership, properties and type of numeric values in further operations with lines / cells of the database (Fig. 1).
- the remaining columns of the Dictionary perform the function of assigning the Letters (basic elements of the identifier) to the Word (preliminary identifier) and further storage of the generated structure in the database.
- Columns of the type, group, priority, and separator may be absent in various implementations of the invention, wherein the columns of the preliminary identifier and the Letters are always present.
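The formation of a Word from Letters can be sketched as follows. The `Letter` class, its field names, the sample Letters, and the choice to place each Letter's own separator in front of it are assumptions made for illustration. The description only requires sorting by Type and then Priority, concatenation, and the exclusion of Letters that determine the type of numeric value.

```python
from dataclasses import dataclass

# Hypothetical in-memory form of one Alphabet row; field names are illustrative.
@dataclass(frozen=True)
class Letter:
    text: str
    type: int
    group: str
    priority: int
    separator: str = "_"
    is_value_type: bool = False  # True for Letters that determine the numeric value type

def build_word(letters):
    """Concatenate Letters into a Word, sorted by type and then by priority."""
    # Letters determining the type of numeric value are not attached to the Word.
    parts = sorted(
        (letter for letter in letters if not letter.is_value_type),
        key=lambda letter: (letter.type, letter.priority),
    )
    word = ""
    for index, letter in enumerate(parts):
        if index > 0:
            word += letter.separator  # assumption: a Letter's separator precedes it
        word += letter.text
    return word

def dictionary_rows(letters):
    """Return (Word, Letter) pairs: one preliminary identifier to many basic elements."""
    word = build_word(letters)
    return [(word, letter.text) for letter in letters if not letter.is_value_type]

# Invented sample Letters.
letters = [
    Letter("TAPE", 3, "goods", 1),
    Letter("IN", 2, "flow", 1),
    Letter("PLN", 1, "kind", 1),
    Letter("RUB", 9, "value", 1, ".", True),
]
word = build_word(letters)
```

With these sample Letters, `build_word` yields `PLN_IN_TAPE`, and `dictionary_rows` produces one row per Letter assigned to that Word.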
- the preliminary identifier called the Word allows providing the visual determination of objects of measurement in databases; this identifier has undeniable advantages with respect to traditional identification in the form of numeric product items / indexes and / or identifiers, since it is semantically readable, requiring no further explanation of its meaning, group memberships, and properties of the object identified by it.
- this identifier allows performing the same functions as traditional product items / indexes / identifiers expressed by digital encoding, namely the unique identification of database lines and their use in queries in SQL and NoSQL languages.
- the Word identifies the object of measurement, but not the type of measurement, therefore, in order to form the final identifier and bind numeric data to it, the Word is concatenated with the separator of the Letter expressing the type that determines the type of the numeric value or with the separator defined in the application / database code, if the presence of the separator is determined by the user / developer, and then with the Letter defining the type of numeric value (table 2, lines 10-14). The concatenation is performed in the application code or the database code, or manually.
- the number of final identifiers for one Word is equal to the number of types of numeric value defined in the Word structure (Fig. 2).
- the final identifier is conventionally called the Line Forming structure in this description of the invention.
- the Line Forming structure has the function of uniquely identifying an array of numeric values for filtering, grouping, sorting, and performing mathematical operations on array elements.
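The step from a Word to its final identifiers can be sketched as a small function. The function name, the sample Word, the sample value-type Letters, and the default separator are assumptions made for illustration. The description states only that the Word is concatenated with a separator (taken from the Letter or defined in code) and then with the Letter defining the type of numeric value.

```python
def final_identifiers(word, value_letters, default_separator="."):
    """Build one final identifier (Line Forming structure) per value-type Letter.

    value_letters is a list of (letter_text, separator) pairs; a separator of
    None means "use the separator defined in the application code".
    """
    identifiers = []
    for text, separator in value_letters:
        if separator is None:
            separator = default_separator
        identifiers.append(f"{word}{separator}{text}")
    return identifiers

# Invented example: one Word measured in rubles, pieces, and rolls.
lines = final_identifiers(
    "PLN_IN_TAPE",
    [("RUB", "."), ("PCS", "."), ("ROLL", None)],
)
```

The result contains exactly as many final identifiers as there are value-type Letters, which matches the counting rule stated above.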
- the array of numeric data can be expressed in any form that allows storing numeric data in digital form via data storage devices.
- a separate table / tables can be created to store the Line Forming structures or it is possible to generate them on the fly in the application / database code.
- the binding of numeric data means an association of the final identifier (Line Forming structure) with a set of numeric data; the association allows determining the direct relationship of the data with an object of identification.
- the object of identification means any tangible / intangible object that can be expressed numerically.
- the binding of numeric data is carried out by providing a clear relationship of numeric data with the Line Forming structure in tables / database code or application code.
- the relationship of numeric data with the Line Forming structure can be provided in several ways: by storing the numeric data in the same line of the table as the Line Forming structure; by constructing a multi-stage correlation from identifiers / product items / indexes and / or their combinations stored in lines of database tables, so that the correlation of numeric data with the Line Forming structure is determined reliably; or by storing the Line Forming structure, its Letters and / or their combination in the application / database code, and creating tables and their lines / columns based on them.
- Different implementations may use other methods of providing the relationship of numeric data with the Line Forming structure, but this relationship is determined reliably in all cases.
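The simplest binding option, storing numeric data in the same line as the Line Forming structure, might look like the sketch below. The table name `target`, its columns, and the sample values are assumptions made for illustration. The `period` column is not part of the description and is added only to show several values bound to one identifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical target table: each row keeps a numeric value next to its
# final identifier (Line Forming structure).
conn.execute(
    """
    CREATE TABLE target (
        line_forming TEXT NOT NULL,
        period       TEXT NOT NULL,
        value        REAL NOT NULL
    )
    """
)

# Invented sample data.
rows = [
    ("PLN_IN_TAPE.RUB", "2019-01", 1000.0),
    ("PLN_IN_TAPE.RUB", "2019-02", 1500.0),
    ("PLN_IN_TAPE.PCS", "2019-01", 20.0),
]
conn.executemany("INSERT INTO target VALUES (?, ?, ?)", rows)

def total_for(conn, identifier):
    """Sum all numeric values bound to one Line Forming structure."""
    (total,) = conn.execute(
        "SELECT COALESCE(SUM(value), 0) FROM target WHERE line_forming = ?",
        (identifier,),
    ).fetchone()
    return total

rub_plan_total = total_for(conn, "PLN_IN_TAPE.RUB")
```

Filtering on the identifier column alone is enough to select the array of values, which is the relationship the description calls binding.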
- the method is implemented according to the following steps: deciding on the goals of the calculation system of the future database / application, for example, maintaining warehouse records, financial planning, or a comprehensive analysis of the enterprise operations; based on this decision, determining the arrays of numeric data that require structuring, for example, a sales plan, a sales report, and a comparison of the plan with the fact of sales; defining the required parameters of filtering, grouping, and sorting, for example, dimensions of goods (width, length, thickness, etc.), direction of flow (income, expenditure), type of data about the object (plan, fact); determining the required types of measurement, for example, rubles, pieces, kilograms.
- Tables (for example, the following table) necessary to achieve the goals of the decision are created using the DBMS and its means and / or external means of working with the DBMS (for example, ODBC driver) to create / encode tables:
- the Word identifies numeric data as the plan of product receipts in ruble, piece, and packaging (roll) terms.
- the example shows that when the semantic identifier is used there is no need to create separate tables for the plan, fact, income, expenditure, and explanation of a product item; all these functions are performed by the identifier itself, which allows the data to be identified accurately.
- the FCT (fact) Word can be used instead of the PLN (plan) Word.
- Lines identify factual data in this case. Being in the same table, such Line Forming structures make it easy to get reporting results on the matching of the plan and the fact by performing simple SQL queries, wherein it is enough to look at the Line Forming structure one time to determine data that the Line Forming structure identifies.
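A plan-versus-fact report over a single table could be sketched as the query below. The table layout, the identifiers, and the numbers are invented for illustration. The sketch also assumes that the PLN / FCT Letter comes first in the identifier and is followed by a one-character separator, so that the rest of the identifier names the same object in both cases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (line_forming TEXT NOT NULL, value REAL NOT NULL)")

# Invented sample data: plan and fact rows live in the same table.
conn.executemany(
    "INSERT INTO target VALUES (?, ?)",
    [
        ("PLN_IN_TAPE.RUB", 1000.0),
        ("FCT_IN_TAPE.RUB", 900.0),
        ("PLN_IN_TAPE.PCS", 20.0),
        ("FCT_IN_TAPE.PCS", 25.0),
    ],
)

# Assumption: the first three characters are the PLN / FCT Letter and the
# fourth is its separator, so substr(line_forming, 5) is the shared remainder.
QUERY = """
SELECT substr(line_forming, 5) AS item,
       SUM(CASE WHEN substr(line_forming, 1, 3) = 'PLN' THEN value ELSE 0 END) AS plan,
       SUM(CASE WHEN substr(line_forming, 1, 3) = 'FCT' THEN value ELSE 0 END) AS fact
FROM target
GROUP BY substr(line_forming, 5)
ORDER BY item
"""

report = conn.execute(QUERY).fetchall()
```

Each row of `report` pairs the planned and actual value for one object and one type of measurement, without a join between separate plan and fact tables.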
- the only necessary condition for the implementation of identification is the presence of a column containing the Line Forming structures or some index relating to the Line Forming structure in target tables, if the developer / user decides to use additional indexing, which is also acceptable.
- the table of basic elements of the identifier (Alphabet) and the table of preliminary identifiers (Dictionary) are formed after selecting the composition and structure of the target table / tables and creating this table / tables. After that, basic elements are formed in the Alphabet table. Types, group memberships, priorities, and separators are specified based on the formation rules described in the "Table of basic elements of the identifier" subsection of the description. Then the preliminary identifiers (Words) are formed in the Dictionary table based on values in the filled Alphabet table and according to the formation rules described in the "Table of forming preliminary identifiers" subsection of the description. The final identifier (Line Forming structure) is generated after both tables have been formed, according to the formation rules described in the "Forming the final identifier and binding numeric values" subsection of the description.
- the small number of operations and tables, the absence of multiple value duplicates, and the use of final identifiers (Line Forming structures) simplify the operation with the database and reduce the significance of the human factor, thereby decreasing human errors.
- the use of the text data type providing the storage of the data in the form of a set of natural language characters and / or other characters supported by electronic databases for the column of the basic element of the identifier and the column of the preliminary identifiers, as well as the use of the final identifiers (Line Forming structure) allow visually defining objects of measurement, types of numeric values, therefore, providing a more convenient visual perception of identifiers and their recognition during calculations and other actions with numeric values.
- the features that discloses b) creating a table containing at least one column, and forming at least two basic elements of the identifier in lines, wherein one of the basic elements determines a type of the numeric value of the identifier; the column of the basic element of the identifier has a text data type that allows storing data in the form of a set of characters of the natural language and / or other characters supported by electronic databases; c) using the table created during the step b) with the addition of at least one column or creating another table containing at least two columns; and forming a preliminary identifier in the lines of one of these tables; wherein the preliminary identifier is based on the basic elements of the identifier formed during the step b) so that the lines reflect a composition of the preliminary identifier in the form of a ratio of one preliminary identifier to many basic elements; wherein the column of preliminary identifiers has a text data type that allows storing data in the form of a set of characters of the natural language and / or other characters supported
- the achievement of the required technical result is ensured, namely: providing a more convenient visual perception of identifiers and their recognition during calculations and other actions with numeric values in databases, simplifying the operation with the database, reducing the significance of the human factor, and, therefore, decreasing human errors, increasing the software performance, as well as providing a significant increase in the data processing rate during the operation with databases.
- Implementations of the claimed invention have at least one of the above-mentioned objects and / or aspects, but do not necessarily have all of them. It should be understood that some aspects of the present invention that arose from attempts to achieve the aforementioned object may not satisfy this object and / or may satisfy other objects not specifically mentioned here.
- the specified order of filling the lines and columns is given as an example and used to describe the essence of the invention. In each case, the order of lines and columns and the order of their filling may be different depending on decisions of the developer / user.
- the invention can be implemented using any electronic device that allows installing and running the software, for example, personal computers, smartphones, tablets, laptops.
- the invention can be implemented with the use of a client-server technology for organizing user interfaces, as well as in local applications, wherein the server and client can be a hardware system, as well as a software system, for example, the separation of physical electronic devices into a client and a server, or the separation of software products into the client and the server.
- the invention can be implemented using any relational and non-relational DBMSs that support storing tables and text and / or numeric data in these tables, for example, MySQL, Microsoft Access, Oracle Database, HBase, etc.
- the invention can be implemented using any programming language, both low-level and high-level, for example, Assembler, C++, Visual Basic, Java, PHP, etc.
- tables which are created as a part of the method of structuring data representing the distribution of data by lines and columns, can be implemented using DBMS means or programming languages.
- the concatenation is the operation of gluing objects of a linear structure.
- the concatenation of the words“micro” and“world” will give the word“microworld”.
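The same operation can be written as a short sketch; the function name `concatenate` is chosen for illustration only.

```python
def concatenate(*parts: str) -> str:
    """Glue strings together in the given order, with nothing in between."""
    return "".join(parts)

result = concatenate("micro", "world")
```

Here `result` is the string `microworld`, matching the example above.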
- SQL is the structured query language.
- DBMS is a database management system.
- NoSQL is a term denoting a number of approaches aimed at implementing database storages that differ significantly from models used in traditional relational DBMSs with the access to data using SQL means.
- an interface is a set of means, with the use of which a user interacts with an application running in hardware of an electronic device.
- a database is a complex of independent data presented in an objective form.
- an electronic device is any computer equipment that can run software for a given task.
- the electronic device term means that the device can function as a server for other electronic devices; however, this is not necessary in relation to the present invention.
- some (non-limiting) examples of electronic devices are personal computers (desktop computers, laptops, netbooks, etc.), smartphones and tablets, as well as network equipment such as routers, switches, and gateways.
- the fact that a device functions as an electronic device does not mean that it cannot function as a server for other electronic devices
- the use of the electronic device expression does not exclude the use of several electronic devices for receiving / sending, executing, or invoking the execution of a task or request or consequences of any task or request or steps of any method described herein.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to the field of data processing. The invention provides a more convenient visual perception of identifiers and their recognition during calculations and other actions with numeric values in databases, simplifies the operation with a database, reduces the significance of the human factor and thereby decreases human errors, increases the software performance, and provides a significant increase in the data processing rate during the operation with databases. A method of forming and structuring an electronic database is implemented using a database management system and / or programming languages by means of an electronic device.
Description
Method of forming and structuring an electronic database
Field of the invention
The invention relates to the field of data processing.
Background of the invention
Currently, traditional systems for the automation of planning, calculation, and analytical activities focus mainly on the document flow of enterprises / companies, i.e. all initial data for such systems are usually obtained from documents. This approach makes automation possible, but its effectiveness leaves much to be desired. The operating activities of most enterprises, especially production and trading companies, relate to the turnover of goods; the main entities requiring digital processing are therefore goods and other material assets, not documents. Of course, the documents reflect most of the information about goods and material assets, but they do not represent data for the automation and serve legal functions more than informational ones.
Many existing identification systems are tied to documents. For example, an employee fills in a sales plan as a separate document; this document is stored in a database in a separate table or in several tables; the same applies to a sales report; the data are linked for analysis only after both documents have been filled in, although the numeric data objects in these documents are the same; for detailed analytics (for example, the profitability of a specific item of goods or services), these documents are taken apart for an individual analysis of the objects. This approach to data processing makes it necessary to regularly collect / disassemble numeric data arrays about the same objects, which lengthens the algorithmic chain and complicates the operation with the database.
At the same time, the existing solutions for forming a structure of databases and their filling are not accurate enough; these solutions have multiple value duplicates and cause difficulties in the visual perception of identifiers and their recognition during calculations and other actions with numeric values; this fact leads to an increase in the significance of a human factor and, therefore, leads to errors.
The patent document RU 2386166 C2 (Open Joint-Stock Company Taganrog Aviation Scientific and Technical Complex named after G. M. Beriev, April 10, 2010) discloses a method of structuring and functioning a database of regulatory documents, the method comprising the following steps: storing objects of a regulatory documentation in at least one data warehouse; interacting with the data warehouse in the following manner: the entire volume of regulatory documentation and the information contained in the documentation for any subject area of activity is placed in a separate area of the database and represented in the form of a three-dimensional information space; choosing three main comparand words and assigning their names to the X, Y, Z coordinate axes; determining components of the comparand words and plotting the components along the coordinate axes formed by unit vectors (unit segments), which determine the components of the comparand words; assigning codes based on designations used in practice; forming clusters, which are regions of the three-dimensional space, from the unit vectors; the regions are bounded by unit surfaces of the unit segments; forming a code of each cluster from codes of the corresponding unit vectors of the comparand words; obtaining a three-dimensional information space consisting of clusters with codes assigned to them; placing the formed three-dimensional information space together with other databases in a common information warehouse in the form of an object-oriented or relational database; transforming each document into the XML format; defining keys and attributes; locating links for search and analysis criteria and in a formatted form in the corresponding area of the database; analyzing a document, a document part or information included in the document with respect to the membership to clusters of the three-dimensional information space; forming a full identification number of the document or its part along three axes; wherein the full identification number consists of codes of the unit vectors depending on the membership to one or several clusters and the identification number of the document, wherein if the document along any axis belongs to several clusters, then the code of the unit vector along this axis is replaced by zero value; then placing the full identification number of the document in the cluster or clusters, to which the document or a part of information contained in it belong; forming keys, indexes and attributes through the cluster; adding them to the document in the XML format; in the course of analyzing documents, filling tables of membership of the documents to the comparand words; wherein the tables are stored in the database and allow determining components of the comparand words, to which the document belongs, through the full identification number; then checking the availability of documents in the cluster and analyzing the sufficiency of information in these documents according to the criteria of completeness of the documentation using database add-ons; placing the analysis results in an analyzed cluster and storing together with the full identification numbers of the documents; in order to analyze a wider area that defines one component or the entire comparand word, providing a cross section of the three-dimensional information space in the selected plane using the database add-ons; analyzing the status of providing any direction of the chosen area of activity with the normative documentation and taking decision on the need to create a new document or supplement an existing document based on the fact that each cluster of the three-dimensional information space should ideally contain only one document fully describing the area of activity limited by this cluster; using a pattern base and the criteria of completeness of the documentation, providing the development or adjusting regulatory documents during the development process, wherein for the final version, the new document is iteratively subjected to analysis to determine the quality and completeness of the information and to place the document in the three-dimensional information space; selecting and finding the regulatory documents through one, two or three comparand words with the use of tools of the database; providing access to information in relation to a matter of interest; entering (deleting) information; controlling the completeness and quality of the received information; wherein during developing the three-dimensional information space of the areas of activity, if a part of the volume of regulatory documentation and components of the comparand words are the same, similar clusters of the new database are obtained by replacing them with existing ones.
The closest analogue of the claimed solution is the patent document RU 2650032 C1 (Aleksey Petrovich Semenov, April 6, 2018). The document discloses a method for creating an electronic database. The method is executed using a database management system through an electronic device by the following steps: structuring data into a single table with the use of a database management system; wherein the table contains at least four columns and at least five lines; forming a first line that displays a root element; storing a first identification number in a first column of the first line; forming a second line that displays the data type; storing a unique identification number of the data type in a first column of the second line; wherein the unique identification number of the data type is different from the first identification number, storing a name of the used data type in a third column; storing a code of the data type in a fourth column; forming a third line that displays a term, storing a unique identification number of the term in a first column of the third line; storing a name of the term in a third column; storing a code of the used data type in a fourth column; forming a fourth line that displays an attribute of the term; storing a unique identification number of the attribute of the term in a first column of the fourth line; storing an identification number of a parent element of the
attribute of the term in a second column; wherein the attribute of the term depends on the parent element; forming a fifth line that displays data; storing a unique identification number of the data in a first column of the fifth line; storing an identification number of a parent element of the data in a second column; wherein the data depend on the parent element; storing the data in a third column, and storing a unique code of the data type in a fourth column.
Moreover, if large amounts of data from many different sources must be processed, these methods are ineffective because they are complex and cumbersome: the large number of operations reduces the software performance and the data processing rate. Furthermore, the large number of tables and the data in them make identifiers harder to perceive and recognize during calculations and other actions with numeric values, hinder the operation with the database, and increase the number of human errors.
Therefore, the disadvantages of the above mentioned analogues (including the prototype) are the complexity and awkwardness of the methods, a large number of operations, which leads to a decrease in the software performance and a decrease in the data processing rate, the difficulty in perceiving identifiers and their recognition during calculations and other operations with numeric values, an increase in a number of human errors, the difficulty during the operation with the database.
Problem and technical result
The problem and the technical result of the present invention are providing a more convenient visual perception of identifiers and their recognition during calculations and other actions with numeric values in databases, simplifying the operation with a database, reducing the significance of a human factor, and, therefore, decreasing human errors, increasing the software performance, as well as providing a significant increase in the data processing rate during the operation with databases.
Disclosure of the invention
The problem is solved, and the required technical result is achieved during using the invention by a method of forming and structuring an electronic database, the method implemented using a database management system and / or programming languages by means of an electronic device according to the following steps:
a) placing data into a table or tables with the necessary structure; wherein the structure contains at least one column for a final identifier of numeric values and / or index that uniquely indicates the final identifier of numeric values;
b) creating a table containing at least one column, and forming at least two basic elements of the identifier in lines, wherein one of the basic elements determines a type of the numeric value of the identifier; the column of the basic element of the identifier has a text data type that allows storing data in the form of a set of characters of the natural language and / or other characters supported by electronic databases;
c) using the table created during the step b) with the addition of at least one column or creating another table containing at least two columns; and forming a preliminary identifier in the lines of one of these tables; wherein the preliminary identifier is based on the basic elements of the identifier formed during the step b) so that the lines reflect a composition of the preliminary identifier in the form of a ratio of one preliminary identifier to many basic elements; wherein the column of preliminary identifiers has a text data type that allows storing data in the form of a set of characters of the natural language and / or other characters supported by electronic databases;
d) generating a final identifier by concatenating the preliminary identifier with a separator and / or the basic element of the identifier expressing the type that determines the type of numeric value; wherein the final identifier is based on values of the basic elements of the identifier formed during the steps b), c) so that at least one of them determines the type of numeric value;
e) forming the lines of the table or tables created during the step a) based on the final identifiers generated during the step d), and attaching the numeric values to the final identifiers in the form of a ratio of one final identifier to many numeric values, wherein the ratio allows determining their direct relationship with the identification object.
Essence of the invention
The invention is a method of constructing/optimizing an organizational structure of data for enterprises/companies that need sequential information processing and require careful analysis of numeric data.
Table of basic elements of the identifier
The purpose of creating a table of basic elements of the identifier is to store and form basic elements of the identifier in the database. The table of basic elements of the identifier is conventionally called the Alphabet in this description of the invention and contains five columns sufficient to structure a semantic identifier of numeric data.
Table 1 - Example of filling the table of basic elements of the identifier
A column of the basic element of the identifier has a text type of the data that allows storing the data in the form of a set of natural language characters and / or other characters supported by electronic databases. Columns of a type, group, and priority may have a text, digital, or user-defined data type depending on the implementation of the invention; they can’t have the data type in the form of a date or logical types. A first column contains a text expression of the basic element of the identifier, a second column contains a text and / or digital expression of the type of the basic element of the identifier, a third column contains a text and / or digital expression of a group membership of the basic element of the identifier, a fourth column contains a text and / or digital expression of an order number of the basic element of the identifier within a type; and it is a priority relationship between types, the fifth column contains a text and / or digital expression of a separator character of the basic element of the identifier. Columns of the type, group, priority, and separator may be absent in various implementations of the invention, wherein the column of the basic element of the identifier is always present.
Basic element of the identifier
The expression of the basic element of the identifier is a combination of natural language characters and / or other characters supported by electronic databases. The basic element of the identifier is conventionally called the Letter in this description of the invention. The Letter can provide a function of a group membership, a descriptive function, a function of the final identification of the type of numeric value assigned to the identifier
(Table 2) and / or other functions provided by the implementation of the invention.
Table 2 - Examples of basic elements of the identifier - the Letters
The examples in Table 2 are given for understanding within the framework of this description and for providing analogies in the case of similar applications; the Letters can be formed in any composition and quantity, but always perform the functions specified in this description in various implementations of the invention.
Type of the Letter
The type of the Letter can be formed on the basis of functional membership or on the basis of other needs / preferences of the developer / user of the database, wherein the type performing the function of
determining the type of numeric value is always created (examples are table 2, lines 10-14); this type is crucial for the formation of the final identifier and the binding of numeric data. The type of the Letter can be expressed in a digital and / or other form supported by fields of the column of the table of the database, and defines the primary grouping and sorting of the Letters for the purpose of further building the semantic identifier.
Group of the Letter
The group of the Letter is formed on the basis of the needs / preferences of the developer / user of the database; the group can be expressed in a digital and / or other form supported by fields of the column of the table of the database, and defines the secondary grouping of the Letter for the purpose of further building the semantic identifier. The column of the group may have other functions or be absent in various implementations of the invention.
Priority of the Letter
The priority of the Letter is formed on the basis of the needs / preferences of the developer / user of the database; the priority can be expressed in a digital and / or other form supported by fields of the column of the table of the database, and performs the function of sorting within the type in the final identifier. The column of the priority may have other functions or be absent in various implementations of the invention.
Separator of the Letter
The separator of the Letter is formed on the basis of the needs / preferences of the developer / user of the database; the separator can be expressed in a digital and / or other form supported by fields of the column of the table of the database, and performs the function of visual separation of the Letters in the identifier. The separator can be expressed by one or several characters for all Letters or a character / characters defined in advance in the application code or be absent in various implementations of the invention.
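The Alphabet table described above can be sketched with Python's standard sqlite3 module. This is only an illustration under stated assumptions: the column names, the sample Letters, the numeric type codes (with 9 standing for the type that determines the type of numeric value), and the "-" separator are hypothetical and do not reproduce Table 1 or Table 2.

```python
import sqlite3

# Hypothetical Alphabet rows: (letter, type, group, priority, separator).
# All Letters, type codes, and the "-" separator are illustrative assumptions.
ALPHABET_ROWS = [
    ("PLN", 1, "kind", 1, "-"),
    ("FCT", 1, "kind", 2, "-"),
    ("IN", 2, "flow", 1, "-"),
    ("OUT", 2, "flow", 2, "-"),
    ("RUB", 9, "unit", 1, "-"),  # type 9: assumed code for value-type Letters
    ("PCS", 9, "unit", 2, "-"),
]


def create_alphabet(conn):
    """Create the Alphabet table; the Letter column is text, as the description requires."""
    conn.execute(
        """CREATE TABLE alphabet (
               letter       TEXT PRIMARY KEY,
               letter_type  INTEGER,
               letter_group TEXT,
               priority     INTEGER,
               separator    TEXT
           )"""
    )
    conn.executemany("INSERT INTO alphabet VALUES (?, ?, ?, ?, ?)", ALPHABET_ROWS)
    conn.commit()


conn = sqlite3.connect(":memory:")
create_alphabet(conn)

# Primary ordering by type, then by priority within the type.
ordered_letters = [
    row[0]
    for row in conn.execute(
        "SELECT letter FROM alphabet ORDER BY letter_type, priority"
    )
]
```

In this sketch only the `letter` column is mandatory; the other four columns mirror the optional type, group, priority, and separator columns.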
Table of forming preliminary identifiers
The purpose of creating a table of constructing preliminary identifiers is the formation and storage of preliminary identifiers in the database. The table for constructing preliminary identifiers is conventionally called the Dictionary in this description of the invention and contains six columns sufficient to build a semantic identifier for digital data and its further storage in the database.
The first column of the table of the preliminary identifiers has a text data type that allows storing data in the form of a set of natural language characters and other characters supported by electronic databases. The remaining five columns, which are columns of the basic element of the identifier, type, group, priority, and separator, are similar to columns of the table of basic elements of the identifier, and these remaining columns obey the same rules of forming. Columns of the type, group, priority, and separator may be absent in various implementations of the invention,
wherein the column of the basic element of the identifier and the column of the preliminary identifier are always present.
Preliminary identifier
The expression of the preliminary identifier is a complex of elements (Letters) formed in the Alphabet (table of basic elements of the identifier) and separators combined by the concatenation operation in the application code or the database code, or connected manually with the use of sorting firstly by Type, then by Priority (or sorting may be absent) by selecting one / several elements (Letters) within the Group. The Letters determining the type of numeric value are not attached to the preliminary identifier. The preliminary identifier is conventionally called the Word in this description of the invention. The result of the formation of the Word is a text expression that allows providing the accurate identification of the object of measurement, its group membership, properties and type of numeric values in further operations with lines / cells of the database (Fig. 1).
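The formation of a Word can be sketched as a small Python function. It is a minimal sketch, not the claimed implementation: the `Letter` class, the `VALUE_TYPE` code, the sample Letters, and the choice to place each Letter's own separator in front of it are all assumptions made for illustration.

```python
from dataclasses import dataclass

VALUE_TYPE = 9  # assumed code of the type that determines the type of numeric value


@dataclass(frozen=True)
class Letter:
    """Hypothetical in-memory form of one Alphabet row."""
    text: str
    letter_type: int
    group: str
    priority: int
    separator: str = "-"


def build_word(letters):
    """Concatenate Letters into a Word, sorted first by type, then by priority.

    Letters of the value type are skipped, because they are not attached
    to the preliminary identifier.
    """
    parts = sorted(
        (l for l in letters if l.letter_type != VALUE_TYPE),
        key=lambda l: (l.letter_type, l.priority),
    )
    if not parts:
        return ""
    word = parts[0].text
    for letter in parts[1:]:
        # Assumption: the separator stored with a Letter precedes that Letter.
        word += letter.separator + letter.text
    return word


sample = [
    Letter("IN", 2, "flow", 1),
    Letter("RUB", VALUE_TYPE, "unit", 1),
    Letter("PLN", 1, "kind", 1),
    Letter("FILM", 3, "goods", 1),
]
word = build_word(sample)  # "PLN-IN-FILM"
```

The value-type Letter `RUB` does not appear in the result; it is used only later, when the final identifier is generated.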
Other columns of the Dictionary
The remaining columns of the Dictionary perform the function of assigning the Letters (basic elements of the identifier) to the Word (preliminary identifier) and further storage of the generated structure in the database. Columns of the type, group, priority, and separator may be absent in various implementations of the invention, wherein the columns of the preliminary identifier and the Letters are always present.
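The one-to-many relation between a Word and its Letters in the Dictionary can be sketched as follows. The table layout follows the six columns named above, but the column names, the sample Words, and the composite primary key are assumptions of this sketch.

```python
import sqlite3

# Hypothetical Dictionary rows: one line per (Word, Letter) pair.
DICTIONARY_ROWS = [
    ("PLN-IN-FILM", "PLN", 1, "kind", 1, "-"),
    ("PLN-IN-FILM", "IN", 2, "flow", 1, "-"),
    ("PLN-IN-FILM", "FILM", 3, "goods", 1, "-"),
    ("FCT-IN-FILM", "FCT", 1, "kind", 2, "-"),
    ("FCT-IN-FILM", "IN", 2, "flow", 1, "-"),
    ("FCT-IN-FILM", "FILM", 3, "goods", 1, "-"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE dictionary (
           word         TEXT NOT NULL,
           letter       TEXT NOT NULL,
           letter_type  INTEGER,
           letter_group TEXT,
           priority     INTEGER,
           separator    TEXT,
           PRIMARY KEY (word, letter)
       )"""
)
conn.executemany("INSERT INTO dictionary VALUES (?, ?, ?, ?, ?, ?)", DICTIONARY_ROWS)
conn.commit()


def letters_of(conn, word):
    """Return the Letters that compose a Word, in type and priority order."""
    rows = conn.execute(
        "SELECT letter FROM dictionary WHERE word = ? "
        "ORDER BY letter_type, priority",
        (word,),
    )
    return [row[0] for row in rows]


composition = letters_of(conn, "PLN-IN-FILM")  # ["PLN", "IN", "FILM"]
```

Because the composition is stored line by line, a query can also go the other way, for example to find every Word that contains the Letter `IN`.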
Forming the final identifier and binding numeric values
The preliminary identifier called the Word allows providing the visual determination of objects of measurement in databases; this identifier has undeniable advantages with respect to traditional identification in the form of numeric product items / indexes and / or identifiers, since it is semantically readable, requiring no further explanation of its meaning, group memberships, and properties of the object identified by it. At the
same time, this identifier allows performing the same functions as traditional product items / indexes / identifiers expressed by digital encoding. These functions are the unique identification of database lines, queries in SQL and NoSQL languages.
Final identifier
The Word identifies the object of measurement, but not the type of measurement, therefore, in order to form the final identifier and bind numeric data to it, the Word is concatenated with the separator of the Letter expressing the type that determines the type of the numeric value or with the separator defined in the application / database code, if the presence of the separator is determined by the user / developer, and then with the Letter defining the type of numeric value (table 2, lines 10-14). The concatenation is performed in the application code or the database code, or manually. Thus, a number of the final identifiers for one Word is equal to a number of types of numeric value defined in the Word structure (Fig. 2).
The final identifier is conventionally called the Line Forming structure in this description of the invention. The Line Forming structure has the function of uniquely identifying an array of numeric values for filtering, grouping, sorting, and performing mathematical operations on array elements. The array of numeric data can be expressed in any form that allows storing numeric data in digital form via data storage devices.
A separate table / tables can be created to store the Line Forming structures or it is possible to generate them on the fly in the application / database code.
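Generating Line Forming structures on the fly can be sketched in a few lines of Python. The function name, the default "-" separator, and the sample Letters `RUB`, `PCS`, and `ROLL` are hypothetical; the sketch only shows that one Word yields as many final identifiers as there are types of numeric value.

```python
def final_identifiers(word, value_type_letters, separator="-"):
    """Concatenate a Word with each value-type Letter.

    An empty separator models implementations in which the separator is absent.
    """
    return [f"{word}{separator}{letter}" for letter in value_type_letters]


value_letters = ["RUB", "PCS", "ROLL"]  # assumed value-type Letters
line_forming = final_identifiers("PLN-IN-FILM", value_letters)
# ["PLN-IN-FILM-RUB", "PLN-IN-FILM-PCS", "PLN-IN-FILM-ROLL"]
```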
Binding numeric data
The binding of numeric data means a ratio of the final identifier (Line Forming structure) with a set of numeric data; the ratio allows determining their direct relationship with an object of identification. The object of identification means any tangible / intangible object that can be expressed numerically.
The binding of numeric data is carried out by providing a clear relationship of numeric data with the Line Forming structure in tables / database code or application code. The relationship of numeric data with the Line Forming structure can be provided in several ways: by storing the numeric data in one line of the table with the Line Forming structure; by constructing a multi-stage correlation from identifiers / product items / indexes and / or their combinations stored in lines of database tables, which reliably determines the correlation of numeric data with the Line Forming structure; or by storing the Line Forming structure, its Letters and / or their combination in the application / database code, and creating tables and their lines / columns based on them. Different implementations may use other methods of providing the relationship of numeric data with the Line Forming structure, but this relationship is determined reliably in all cases.
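The simplest binding option, storing numeric values in the same line as the Line Forming structure, can be sketched with sqlite3. The table name `target`, the `period` column, and the sample values are assumptions of this sketch and are not taken from the description.

```python
import sqlite3

# Hypothetical target rows: (Line Forming structure, period, numeric value).
ROWS = [
    ("PLN-IN-FILM-RUB", "2019-01", 1000.0),
    ("PLN-IN-FILM-RUB", "2019-02", 1500.0),
    ("PLN-IN-FILM-PCS", "2019-01", 10.0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE target (
           line_forming TEXT NOT NULL,
           period       TEXT NOT NULL,
           value        REAL NOT NULL
       )"""
)
conn.executemany("INSERT INTO target VALUES (?, ?, ?)", ROWS)
conn.commit()


def values_for(conn, line_forming):
    """Return every numeric value bound to one final identifier."""
    rows = conn.execute(
        "SELECT value FROM target WHERE line_forming = ? ORDER BY period",
        (line_forming,),
    )
    return [row[0] for row in rows]


def total_for(conn, line_forming):
    """Sum the values bound to one final identifier (0 if there are none)."""
    (total,) = conn.execute(
        "SELECT COALESCE(SUM(value), 0) FROM target WHERE line_forming = ?",
        (line_forming,),
    ).fetchone()
    return total


rub_values = values_for(conn, "PLN-IN-FILM-RUB")
rub_total = total_for(conn, "PLN-IN-FILM-RUB")
```

Here one final identifier maps to many numeric values, which is the ratio the description refers to.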
Implementation of the invention
The method is implemented according to the following steps: taking the decision on the goals of a calculation system of the future database / application, for example, maintaining warehouse records, financial planning, a comprehensive analysis of the enterprise operations; based on the decision made, determining arrays of numeric data that require structuring, for example: a sales plan, a sales report, and comparing a plan with the fact of sales; defining the required parameters of filtration, grouping, sorting, for example: isometric proportions of goods (width, length, thickness, etc.), direction of flow (income, expenditure), type of data about the object (plan, fact); determining the required types of measurement, for example, rubles, pieces, kilograms.
Tables (for example, the following table) necessary to achieve the goals of the decision are created using the DBMS and its means and / or external means of working with the DBMS (for example, ODBC driver) to create / encode tables:
Table 4 - Example of the target table "Availability of goods in the storage taking into account the balance in the workshop"
In this example, the Word identifies numeric data as the product receipts plan in ruble, piece, and packaging (roll) terms. The example shows that when the semantic identifier is used, there is no need to create separate tables for the plan, fact, income, expenditure, and explanation of a product item; all these functions are performed by the identifier itself, which allows accurately identifying the data. For example, the FCT (fact) Word can be used instead of the PLN (plan) Word; the lines then identify factual data. Being in the same table, such Line Forming structures make it easy to get reporting results on the matching of the plan and the fact by performing simple SQL queries, and a single look at the Line Forming structure is enough to determine the data that it identifies. The only necessary condition for the implementation of identification is the presence, in target tables, of a column containing the Line Forming structures or some index relating to the Line Forming structure, if the developer / user decides to use additional indexing, which is also acceptable.
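A plan versus fact comparison over a single table can be sketched as one SQL query run through sqlite3. The table layout, the identifiers, and the values are hypothetical; the query relies on the assumption that plan and fact identifiers differ only in a three-character `PLN` or `FCT` prefix.

```python
import sqlite3

# Hypothetical target rows: (Line Forming structure, numeric value).
ROWS = [
    ("PLN-IN-FILM-RUB", 1000.0),
    ("PLN-IN-FILM-RUB", 500.0),
    ("FCT-IN-FILM-RUB", 900.0),
    ("FCT-IN-FILM-RUB", 450.0),
    ("PLN-IN-FILM-PCS", 10.0),
    ("FCT-IN-FILM-PCS", 12.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (line_forming TEXT NOT NULL, value REAL NOT NULL)")
conn.executemany("INSERT INTO target VALUES (?, ?)", ROWS)
conn.commit()

# Sum the values per identifier, then pair each PLN identifier with the
# FCT identifier that shares the rest of the structure.
PLAN_FACT_SQL = """
WITH totals AS (
    SELECT line_forming, SUM(value) AS total
    FROM target
    GROUP BY line_forming
)
SELECT substr(p.line_forming, 5) AS item,
       p.total                   AS plan,
       f.total                   AS fact,
       f.total - p.total         AS deviation
FROM totals AS p
JOIN totals AS f
  ON f.line_forming = 'FCT' || substr(p.line_forming, 4)
WHERE p.line_forming LIKE 'PLN-%'
ORDER BY item
"""

report = conn.execute(PLAN_FACT_SQL).fetchall()
```

With the sample rows, `report` contains one line per item and unit, for example the ruble line with a plan of 1500, a fact of 1350, and a deviation of -150.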
The table of basic elements of the identifier (Alphabet) and the table of preliminary identifiers (Dictionary) are formed after selecting a composition and structure of a target table / tables and creating this table / tables. After that basic elements are formed in the Alphabet table. Types, group memberships, priorities, and separators are specified based on the
formation rules described in the “Table of basic elements of the identifier” subsection of the description. Then the preliminary identifiers (Words) are formed in the Dictionary table based on values in the filled Alphabet table and according to the formation rules described in the “Table of forming preliminary identifiers” subsection of the description. The final identifier (Line Forming structure) is generated after both tables are formed, according to the formation rules described in the “Forming the final identifier and binding numeric values” subsection of the description.
Therefore, using the principle of constructing the Line Forming structures, all operations with data related to objects of identification receive a single identifier that allows providing the accurate determination of data membership and their attributes to the identification object, without unnecessarily increasing the volume of the database and algorithmic chains, as well as by optimizing sampling processes that leads to increasing the software performance, as well as providing a significant increase in the processing rate of the data associated with it.
A small number of operations, tables, the absence of multiple value duplicates, and the use of final identifiers of the Line Forming structures provide simplifying the operation with the database, reducing the significance of the human factor, and, therefore, decreasing human errors.
The use of the text data type providing the storage of the data in the form of a set of natural language characters and / or other characters supported by electronic databases for the column of the basic element of the identifier and the column of the preliminary identifiers, as well as the use of the final identifiers (Line Forming structure) allow visually defining objects of measurement, types of numeric values, therefore, providing a more convenient visual perception of identifiers and their recognition during calculations and other actions with numeric values.
Therefore, the following features are significant in terms of the technical result:
b) creating a table containing at least one column, and forming at least two basic elements of the identifier in lines, wherein one of the basic elements determines a type of the numeric value of the identifier; the column of the basic element of the identifier has a text data type that allows storing data in the form of a set of characters of the natural language and / or other characters supported by electronic databases;
c) using the table created during the step b) with the addition of at least one column or creating another table containing at least two columns; and forming a preliminary identifier in the lines of one of these tables; wherein the preliminary identifier is based on the basic elements of the identifier formed during the step b) so that the lines reflect a composition of the preliminary identifier in the form of a ratio of one preliminary identifier to many basic elements; wherein the column of preliminary identifiers has a text data type that allows storing data in the form of a set of characters of the natural language and / or other characters supported by electronic databases;
d) generating a final identifier by concatenating the preliminary identifier with a separator and / or the basic element of the identifier expressing the type that determines the type of numeric value; wherein the final identifier is based on values of the basic elements of the identifier formed during the steps b), c) so that at least one of them determines the type of numeric value;
e) forming the lines of the table or tables created during the step a) based on the final identifiers generated during the step d), and attaching the numeric values to the final identifiers in the form of a ratio of one final identifier to many numeric values, wherein the ratio allows determining their direct relationship with the identification object.
These features are significant namely in terms of providing a more convenient visual perception of identifiers and their recognition during calculations and other actions with numeric values in databases, simplifying the operation with a database, reducing the significance of a human factor, and, therefore, decreasing human errors, increasing the software performance, as well as providing a significant increase in the data processing rate during the operation with databases.
Therefore, the achievement of the required technical result is ensured, namely: providing a more convenient visual perception of identifiers and their recognition during calculations and other actions with numeric values in databases, simplifying the operation with the database, reducing the significance of the human factor, and, therefore, decreasing human errors, increasing the software performance, as well as providing a significant increase in the data processing rate during the operation with databases.
Implementations of the claimed invention have at least one of the above mentioned objects and / or aspects, but do not necessarily contain all of them. It should be understood that some aspects of the present invention that arose from attempts to reach the aforementioned object may not satisfy this object and / or may satisfy other objects not specifically mentioned here.
The specified order of filling the lines and columns is given as an example and used to describe the essence of the invention. In each case, the order of lines and columns and the order of their filling may be different depending on decisions of the developer / user.
The invention can be implemented using any electronic device that allows installing and running the software, for example, personal computers, smartphones, tablets, laptops.
The invention can be implemented with the use of a client-server technology for organizing user interfaces, as well as in local applications, wherein the server and client can be a hardware system, as well as a software system, for example, the separation of physical electronic devices into a client and a server, or the separation of software products into the client and the server.
The invention can be implemented using any relational and nonrelational DBMSs that support storing tables and text and/or numeric data in these tables, for example, MySQL, Microsoft Access, Oracle Database, HBase, etc.
The invention can be implemented using any programming language, both low and high level languages, for example, Assembler, C++, Visual Basic, Java, PHP, etc.
In the context of this description, tables, which are created as a part of the method of structuring data representing the distribution of data by lines and columns, can be implemented using DBMS means or programming languages.
In the context of this description, the concatenation is the operation of gluing objects of a linear structure. For example, the concatenation of the words “micro” and “world” will give the word “microworld”.
In the context of this description, SQL (structured query language) is a formal non-procedural programming language used to create, modify and manage data in an arbitrary relational database controlled by the corresponding database management system (DBMS).
In the context of this description, NoSQL is a term denoting a number of approaches aimed at implementing database storages that differ significantly from models used in traditional relational DBMSs with the access to data using SQL means.
In the context of this description, an interface is a set of means, with the use of which a user interacts with an application running in hardware of an electronic device.
In the context of this description, a database is a complex of independent data presented in an objective form.
In the context of this description, a database management system (DBMS) is a complex of general and special software and linguistic tools that provide the control over the creation and use of databases.
In the context of this description, an electronic device is any computer equipment that can run software for a given task. In the context of this description, the electronic device term means that the device can function as a server for other electronic devices; however, this is not necessary in relation to the present invention. Thus, some (non-limiting) examples of electronic devices are personal computers (desktop computers, laptops, netbooks, etc.), smartphones and tablets, as well as network equipment such as routers, switches, and gateways. It should be understood that in this context, the fact that the device functions as an electronic device does not mean that it cannot function as a server for other electronic devices. The use of the electronic device expression does not exclude the use of several electronic devices for receiving / sending, executing, or invoking the execution of a task or request or consequences of any task or request or steps of any method described herein.
The examples and the reference language provided here are mainly intended to help a reader understand the principles of the claimed invention and not to limit the scope of its application by the indicated examples and conditions. It is obvious that persons skilled in the art can develop various devices that are not explicitly described or shown here, but put into action the principles of the claimed invention, as well as are included in its essence and scope of application.
Taking into consideration the novelty of the set of essential features, the technical solution of the problem, the inventive step and the essentiality of all general and particular features of the invention proven in the “Background of the invention” section and the “Disclosure of the invention” section, technical feasibility and industrial applicability of the invention proven in the “Implementation of the invention” section, the solution of the determined inventive tasks and the confident achievement of the required technical result during the implementation and use of the invention, in our opinion, the claimed invention meets all protectability requirements applicable to inventions.
The conducted analysis also shows that all general and particular features of the invention are essential, since each of these features is necessary, and together they are not only sufficient to achieve the purpose of the invention, but also make it possible to implement the invention in an industrial way.
Claims
Method of forming and structuring an electronic database, the method implemented using a database management system and / or programming languages by means of an electronic device according to the following steps:
a) placing data into a table or tables with the necessary structure; wherein the structure contains at least one column for a final identifier of numeric values and / or index that uniquely indicates the final identifier of numeric values;
b) creating a table containing at least one column, and forming at least two basic elements of the identifier in lines, wherein one of the basic elements determines a type of the numeric value of the identifier; the column of the basic element of the identifier has a text data type that allows storing data in the form of a set of characters of the natural language and / or other characters supported by electronic databases;
c) using the table created during the step b) with the addition of at least one column or creating another table containing at least two columns; and forming a preliminary identifier in the lines of one of these tables; wherein the preliminary identifier is based on the basic elements of the identifier formed during the step b) so that the lines reflect a composition of the preliminary identifier in the form of a ratio of one preliminary identifier to many basic elements; wherein the column of preliminary identifiers has a text data type that allows storing data in the form of a set of characters of the natural language and / or other characters supported by electronic databases;
d) generating a final identifier by concatenating the preliminary identifier with a separator and / or the basic element of the identifier expressing the type that determines the type of numeric value; wherein the final identifier is based on values of the basic elements of the identifier
formed during the steps b), c) so that at least one of them determines the type of numeric value;
e) forming the lines of the table or tables created during the step a) based on the final identifiers generated during the step d), and attaching the numeric values to the final identifiers in the form of a ratio of one final identifier to many numeric values, wherein the ratio allows determining their direct relationship with the identification object.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
RU2018138502A RU2696295C1 (en) | 2018-10-31 | 2018-10-31 | Method of forming and structuring an electronic database |
RU2018138502 | 2018-10-31 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020091627A1 (en) | 2020-05-07 |
Family
ID=67586926
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/RU2019/000747 WO2020091627A1 (en) | 2018-10-31 | 2019-10-18 | Method of forming and structuring an electronic database |
Country Status (2)
Country | Link |
---|---|
RU (1) | RU2696295C1 (en) |
WO (1) | WO2020091627A1 (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6505205B1 (en) * | 1999-05-29 | 2003-01-07 | Oracle Corporation | Relational database system for storing nodes of a hierarchical index of multi-dimensional data in a first module and metadata regarding the index in a second module |
US20060294060A1 (en) * | 2003-09-30 | 2006-12-28 | Hiroaki Masuyama | Similarity calculation device and similarity calculation program |
RU2386166C2 (en) * | 2008-02-04 | 2010-04-10 | Открытое акционерное общество Таганрогский авиационный научно-технический комплекс им. Г.М. Бериева | Method and system for arrangement and functioning of regulatory documentation database |
RU2650032C1 (en) * | 2017-03-20 | 2018-04-06 | Алексей Петрович Семенов | Electronic database and method of its formation |
- 2018-10-31: RU application RU2018138502A, patent RU2696295C1 (status: active)
- 2019-10-18: WO application PCT/RU2019/000747, publication WO2020091627A1 (status: active, application filing)
Also Published As
Publication number | Publication date |
---|---|
RU2696295C1 (en) | 2019-08-01 |
Legal Events
Code | Title | Description |
---|---|---|
121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 19879119; Country of ref document: EP; Kind code of ref document: A1 |
NENP | Non-entry into the national phase | Ref country code: DE |
32PN | EP: public notification in the EP bulletin as address of the addressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 22/06/2021) |
122 | EP: PCT application non-entry in European phase | Ref document number: 19879119; Country of ref document: EP; Kind code of ref document: A1 |