EP3095052A1 - System and method for data search in a graph database - Google Patents
System and method for data search in a graph database
- Publication number
- EP3095052A1, EP15771496.5A
- Authority
- EP
- European Patent Office
- Prior art keywords
- data
- autotag
- search
- auto
- item
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/901—Indexing; Data structures therefor; Storage structures
- G06F16/9024—Graphs; Linked lists
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
Definitions
- the present invention is related to methods for data search, and in particular, to a method for searching data in graph-based databases.
- Computer systems are often used to store large amounts of data from which individual records must be retrieved according to some search criterion.
- Data search is inextricably linked with the concept of data filtering.
- Search is a procedure for examining large amounts of data in order to find non-obvious, hidden, or lost parts.
- Data search is usually associated with processing a data storage. Many search algorithms exist, and the choice depends on the nature of the data.
- Data search can mean a search for files in a data storage, a search for data within files or documents, a search for information on the Internet, a search for data in a database, and so on.
- Data search (i.e., a search for an item) is based on various parameters that distinguish the given data from other data (i.e., a search by one or more parameters that uniquely characterize the desired data).
- For example, a file name, a file type, or a file size can serve as such distinctive characteristics for files; a table name or a table ID for databases; the number of characters in a word or its part of speech for words; and so on.
- Tags can be used to facilitate and optimize data search.
- A tag can be assigned to a document manually while the data or document is created, stored, and/or added to a database.
- Tags characterize the data, so these tags can be used for the data searching.
- a tag is a non-hierarchical keyword or term assigned to a piece of information (such as an Internet bookmark, digital image, or computer file). This kind of metadata helps to describe the data and allows for it to be found again by searching.
- Tags are generally chosen informally and personally by the data's creator or by its viewer, depending on the system, the data structure, the type of data, the data content, the data context, and so on. For example, data related to computer science can be characterized by the tags "computer," "science," "information," "software," "hardware," etc.
- Various algorithms can be used for tag creation.
- tags can be created based on the results of the analysis of the document's text.
- Tags can be created automatically for a document (referred to as "autotags" below) and then associated with it.
- Autotags can be created not only for a data or document search, but also for an Item search, for example, in a database (DB).
- An Item is an entity, for example, a business entity: the task "Add a description for the animal picture" is an Item, an employee "Bill" is an Item as well, as is a request for "The bridge building," a record in the DB for a user, the IT department, the HR department, or any other entity.
- In programming, such entities are called class instances.
- Tags can be created for any data types stored in various forms, for example, in the form of triples/n-tuples.
- Triples can be stored in various types of databases, for example, relational, hierarchical, network-based, or object-oriented databases.
- the triples are stored in a triplestore.
- the triplestore is a special database for storage and retrieval of the triples.
- a triplestore is a purpose-built database for the storage and retrieval of triples, a triple being a data entity (also known as a statement) composed of Subject- Predicate-Object, like "John is 35" or "John knows Helen.”
- a triplestore is optimized for the storage and retrieval of triples.
- a query language is used for accessing the triples stored in the triplestore.
- triples can usually be imported/exported using Resource Description Framework (RDF) and other formats.
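The Subject-Predicate-Object structure described above can be illustrated with a minimal in-memory sketch. This is not the patent's triplestore or any real product's API: the `TripleStore` class, its `add` and `match` methods, and the predicate name `age` are hypothetical names chosen for illustration, and a production triplestore would use indexes and a query language rather than a linear scan.

```python
from typing import Iterator, Optional

Triple = tuple[str, str, str]


class TripleStore:
    """Toy in-memory store of (subject, predicate, object) triples."""

    def __init__(self) -> None:
        self._triples: set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self._triples.add((subject, predicate, obj))

    def match(
        self,
        subject: Optional[str] = None,
        predicate: Optional[str] = None,
        obj: Optional[str] = None,
    ) -> Iterator[Triple]:
        """Yield triples matching the pattern; None acts as a wildcard."""
        for s, p, o in sorted(self._triples):
            if subject in (None, s) and predicate in (None, p) and obj in (None, o):
                yield (s, p, o)


store = TripleStore()
store.add("John", "age", "35")        # "John is 35"
store.add("John", "knows", "Helen")   # "John knows Helen"
john_facts = list(store.match(subject="John"))
```

Leaving any position of the pattern unset turns it into a wildcard, which is the basic operation that triple query languages build on.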
- Some triplestores are built as database engines from scratch, while others are built on top of the existing commercial relational database engines (i.e., SQL-based).
- Triplestores built from scratch are expected to have the advantage in terms of performance.
- A difficulty with implementing triplestores over SQL is that, although triples can be stored, efficiently translating queries over a graph-based RDF model into SQL queries is difficult.
- The number of tags increases as the amount of data increases. Stored data changes over time, so old tags are deleted or replaced by new tags, and new tags must be created. Data items usually intersect with each other, so changing one piece of data can affect other data. Manually creating tags for new data and updating tags for changed data is an extremely time-consuming and costly process.
- The present invention is related to a method for searching data in graph-based databases that substantially overcomes the disadvantages of the related art.
- A method for auto-generation of tags and retrieval of data from a graph-based database is provided.
- triples are stored in a triplestore.
- the triplestore is a special database for storage and retrieval of the triples.
- a triplestore is a purpose-built database for the storage and retrieval of triples, a triple being a data entity (also known as a statement) composed of Subject-Predicate-Object.
- a triplestore is optimized for the storage and retrieval of triples.
- a query language is used for accessing the triples stored in the triplestore.
- triples can usually be imported/exported using Resource Description Framework (RDF) and other formats.
- the search items need to be tagged for search and retrieval.
- the number of tags increases as the amount of data increases. Stored data changes over time, so the old tags are deleted or replaced by the new tags. The new tags must be created.
- the tags are created automatically (i.e., autotags) to facilitate efficient data retrieval.
- FIG. 1 illustrates an example of a graph, in accordance with the exemplary embodiment
- FIG. 2 illustrates a portion of the semantic web, in accordance with the exemplary embodiment
- FIG. 3 illustrates a data processing algorithm, according to one embodiment of the invention
- FIG. 4 illustrates a flowchart for finding autotags for a changed Item and for related Items, according to one embodiment of the invention
- FIG. 5 illustrates an example of data relationships as it applies to searching the Items
- FIG. 6 illustrates the process of data processing during the Item creation
- FIG. 7 illustrates an example of different business applications used in different departments of a company and data processing within them, in the exemplary case
- FIG. 8 illustrates the system in accordance with the exemplary embodiment
- FIG. 9 illustrates a computer or server which can be used in the exemplary embodiment.
- RDF graphs can be represented in XML format (which is frequently more convenient for computer-based processing) and in the form of N-Triples or N3 (which is used in the present approach and is more convenient for human understanding).
- N3 syntax would be as follows:
- The XML syntax is far more verbose than the N3 syntax, but it is much easier for computers to process.
- The triple is the basic unit of the Resource Description Framework (RDF) and consists of a Subject, a Predicate, and an Object.
- A set of triples is commonly referred to as an RDF graph, an example of which is shown in FIG. 1.
- The direction of an arrow (e.g., 110a, 110b) in any given triple (e.g., 120) points from the Subject (130) to the Object (140).
- The RDF data model is similar to classic conceptual modeling approaches such as entity-relationship or class diagrams, as it is based upon the idea of making statements about resources (in particular web resources) in the form of Subject-Predicate-Object expressions.
- The Predicate denotes traits or aspects of the resource and expresses a relationship between the Subject and the Object.
- a collection of RDF statements intrinsically represents a labeled, directed multi-graph. As such, an RDF-based data model is more naturally suited to certain kinds of knowledge representation than the relational model and other ontological models.
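To make the "labeled, directed multi-graph" reading concrete, the following sketch turns a list of triples into an adjacency map in which every edge points from the Subject to the Object and carries the Predicate as its label. The triples echo the names used for FIG. 1, but the exact facts and the `build_graph` helper are assumptions made for illustration.

```python
from collections import defaultdict

# Illustrative triples; the precise contents of FIG. 1 are assumed here.
triples = [
    ("John", "is", "Man"),
    ("John", "position", "Manager"),
    ("Alex", "position", "Manager"),
]


def build_graph(triples):
    """Map each Subject to its outgoing (Predicate, Object) edges."""
    outgoing = defaultdict(list)
    for subject, predicate, obj in triples:
        outgoing[subject].append((predicate, obj))
    return dict(outgoing)


graph = build_graph(triples)
```

Because edges are stored as a list of labeled pairs, two nodes can be joined by several edges with different labels, which is what distinguishes a multi-graph from a simple graph.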
- RDF data often resides in a relational database or in native triplestores, or in quad stores if context (i.e., the named graph) is also stored for each RDF triple.
- Named graphs are a key concept of Semantic Web architecture in which a set of Resource Description Framework (RDF) statements (a graph) are identified using a URI, allowing descriptions to be made of that set of statements, such as context, provenance information or other metadata.
- Named graphs are a simple extension of the RDF data model through which graphs can be created, but the model lacks an effective means of distinguishing between them once published on the Web at large. While named graphs may appear on the Web as simple linked documents (i.e., Linked Data), they are also very useful for managing sets of RDF data within an RDF store.
- The objects "Man" and "Manager" (140) and the subjects "John" (130), "Michael," "Morgan," "Mona," and "Alex" of the RDF statements are Uniform Resource Identifiers (URIs), which denote resources. Resources can also be indicated by blank nodes. Blank nodes are not directly identifiable from the RDF statement. A blank node is a node in the RDF graph representing a resource for which a URI or literal is not given. The resource represented by the blank node is also called an anonymous resource. According to the RDF standard, a blank node can only be used as the Subject or Object of an RDF triple. Blank nodes can be denoted through blank node identifiers in the following formats: RDF/XML, Turtle, N3, and N-Triples. The following example shows how it works in RDF/XML:
- Blank node identifiers are limited in scope to the serialization of a particular RDF graph, i.e., the node "_:b" in the subsequent example does not represent the same node as a node named "_:b" in any other graph.
- the blank nodes are treated as simply indicating the existence of a thing, without using a URI (Uniform Resource Identifier) to identify any particular thing. This is not the same as assuming that the blank node indicates an "unknown" URI.
- The Predicate ("is" 110a, "position" 110b) is a URI, which also indicates a resource, representing a relationship.
- the Object (“Manager,” “Developer,” “CEO” and in particular cases “John,” “Michael,” “Morgan,” “Mona,” “Alex”) is a URI, blank node or a Unicode string literal.
- the triple approach is one that is utilized in the present invention to process information from various sources.
- the semantic stack utilized in the exemplary embodiment includes the Uniform Resource Identifier (URI) 201.
- A URI can be classified as a locator (URL), a name (URN), or both.
- A uniform resource name (URN) functions like a person's name, while a uniform resource locator (URL) resembles that person's street address.
- the URN defines an Item's identity, while the URL provides a method for finding it.
- CmwL (Comindware Language) 211 describes the function and relationship of each of these components of the semantic web stack;
- XML 203 provides an elemental syntax for content structure within documents, yet associates no semantics with the meaning of the content contained within;
- the RDF 205 is a simple language for expressing data models, which refers to objects ("resources") and their relationships.
- An RDF-based model can be represented in XML syntax.
- the RDF schema 207 extends the RDF and is a vocabulary for describing properties and classes of RDF-based resources and semantics for generalized-hierarchies of such properties and classes.
- Ontology 215 formally represents knowledge as a set of concepts within a domain, using a shared vocabulary to denote the types, properties and interrelationships of these concepts.
- Ontologies are the structural frameworks for organizing information. Ontologies are described by the Web Ontology Language (OWL) or by CmwL, which make it possible to describe Classes and their relations with each other and with other entities (see below).
- Ontologies can extend a pre-defined dictionary (for example, the RDF or OWL dictionaries).
- A dictionary is a collection of data/information about certain terms that have the same meaning in all contexts.
- An ontology uses a pre-defined reserved dictionary/glossary of terms to define concepts and relations for a particular domain/subject area.
- Ontologies can be used to express the semantics of dictionary terms, their relationships, and contexts of usage.
- RDF Schema is a dictionary for RDF.
- OWL or CmwL can be used to record the semantics of subject areas in ontologies.
- any data for example, ontologies or taxonomies, can be expressed in triples. The triple is a fact.
- Taxonomy 209 is a hierarchical way to categorize all the items in a given world: books, products, species, concepts, etc.
- A taxonomy is a dictionary of terms and their precise definitions. When a dictionary is ordered logically within a hierarchy, it is called a taxonomy. It is a shared resource that everyone in an information ecosystem uses to sync the meaning of terms.
- Comindware language 211 is used instead of the Web Ontology Language (OWL) in the semantic stack.
- Comindware language represents a limited version of OWL, in order to improve performance and get rid of functionality and operations that are not necessary for the purposes of business applications and/or for using with the ontologies (but using OWL vocabulary and some of its rules 213).
- In terms of data storage, a relational database is not the best choice for an RDF repository, as it is ill-suited to work with loosely structured data.
- In MySQL, searching for information by tags is solved by introducing a staging (linking) table, so that the following structure is obtained:
- Tags table: tag_id, title
- Linking table
- the SQL JOIN clause combines records from two or more tables in a database. It creates a set that can be saved as a table or used as is.
- a JOIN is the means for combining fields from two tables by using values common to each (IDs for example). This and other related operations require a large amount of time.
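The relational approach described above can be sketched with Python's built-in `sqlite3` module. The schema is an assumption: only a Tags table with `tag_id` and `title` is named in the text, so the `items` table, the `item_tags` linking table, and all sample rows are hypothetical. The point of the sketch is that even a simple "find Items by tag" lookup needs two JOINs across three tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE items (item_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE tags (tag_id INTEGER PRIMARY KEY, title TEXT);
    -- Hypothetical linking table that relates Items to Tags.
    CREATE TABLE item_tags (item_id INTEGER, tag_id INTEGER);
    INSERT INTO items VALUES (1, 'Bug No. 2121'), (2, 'Release notes');
    INSERT INTO tags VALUES (1, 'high priority'), (2, 'docs');
    INSERT INTO item_tags VALUES (1, 1), (2, 2);
    """
)


def items_with_tag(tag_title: str) -> list[str]:
    """Return titles of all Items linked to the tag with the given title."""
    rows = conn.execute(
        """
        SELECT items.title
        FROM items
        JOIN item_tags ON item_tags.item_id = items.item_id
        JOIN tags ON tags.tag_id = item_tags.tag_id
        WHERE tags.title = ?
        ORDER BY items.title
        """,
        (tag_title,),
    ).fetchall()
    return [title for (title,) in rows]
```

Adding a new kind of entity (for example, departments) to such a schema means new tables and new JOIN queries, which is the maintenance cost the surrounding text is concerned with.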
- Relational databases provide sufficiently fast search only for the data for which the search was set up in advance.
- A search in SQL databases uses SQL queries that have to be rewritten every time new data is added to the database, or whenever new data types that differ from those already stored are used.
- Another example is adding a department to a database in which no departments were stored previously and no such Item was used.
- It must be possible to add new departments: Information Technology (IT), Human Resources (HR), Research and Development (R&D), and others. Several tables must therefore be created, at least one of which stores the department data and at least one other of which stores the relations with other database tables.
- The exemplary embodiment provides a system that automatically adapts to changes in the database data and that implements search requests that are simple for a user, such as: "What should I find?" -> "Bugs" + "High priority", or "manager" + "Alex".
- A search request can be entered into text fields.
- The system must allow selection of the data type/search set in which the search is to be performed.
- The Graphical User Interface (GUI) can include a text field where a user can enter a search term to start the search for Items (sometimes referred to as "data objects").
- The GUI can contain more than one text field.
- The user can enter part of the search word into the text field, and the system offers variants for auto-completing the word. For example, a user can enter the first characters of the word "Morgan" (or characters from the middle of the word), i.e., "Mo", into the text area.
- The system then offers possible auto-complete variants (words in which this character sequence is found, or in which these characters are found in any order), such as "Morgan" and "Mona." Further, the system can present the search area to the user, for example, by using the GUI.
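The auto-complete behavior can be sketched as a case-insensitive substring match over the known words. This is a simplification made for illustration: the `autocomplete` function is hypothetical, the word list reuses the names from the FIG. 1 discussion, and only the "sequence found anywhere in the word" variant is implemented, not the "characters in any order" variant.

```python
def autocomplete(fragment: str, known_words: list[str]) -> list[str]:
    """Return known words containing the fragment, ignoring case."""
    needle = fragment.casefold()
    return sorted(word for word in known_words if needle in word.casefold())


names = ["John", "Michael", "Morgan", "Mona", "Alex"]
suggestions = autocomplete("Mo", names)
```

With the names above, entering "Mo" suggests "Mona" and "Morgan", and a fragment from the middle of a word, such as "rga", still finds "Morgan".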
- The search area is represented by triple predicates that are associated with the search word. For example, for the word "Morgan" the system offers the following search areas: "Creator," "Bug Fixer," and others, if they are found in the database. These search area variants are the autotag predicates of the autotags found for the word "Morgan." If the user selects the "Creator" search area, the search result includes "Bug No. 2121", according to FIG. 1.
- a “Bug” is a Class
- “Bug No. 2121” is an instance of this Class.
- The search area can be defined and narrowed further. Examples of Classes and instances for specific cases are discussed below.
- the GUI allows the user to select at least one search area for entered search word or a part of it. The user can skip a selection of the specific search area, such as "Creator” or "BugFixer.” The user can reject the selected/provided search area. In this case, such decision is equal to the selection of all search areas. If at least one search area is selected, the search is performed in this search area.
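The interplay of search word and search area can be sketched as follows, under the assumption that a search area is simply a predicate whose Object matches the search word. Only "Bug No. 2121" with the "Creator" area comes from the text; the other two triples and both helper functions are hypothetical.

```python
triples = [
    ("Bug No. 2121", "Creator", "Morgan"),
    ("Bug No. 2200", "Bug Fixer", "Morgan"),  # hypothetical Item
    ("Bug No. 2300", "Creator", "Mona"),      # hypothetical Item
]


def search_areas(word: str) -> list[str]:
    """Predicates (search areas) under which the search word occurs."""
    return sorted({p for _, p, o in triples if o == word})


def find_items(word: str, areas=None) -> list[str]:
    """Items for the word; no selected area means all areas are searched."""
    return sorted(
        s for s, p, o in triples if o == word and (not areas or p in areas)
    )
```

Passing no areas mirrors the rule stated above that skipping or rejecting the selection is equivalent to selecting all search areas.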
- Class instances act as the required Items (which are associated with the search word). "Michael," "Morgan" (FIG.
- Data can be stored in the form of triples or n-tuples.
- Data search terms can be expressed as triples, but then all triples related to the searched data must be known.
- Using autotags is a more convenient way to search data. An example of an autotag definition and of the process of finding autotag values is provided below, along with an example of an Item search by autotag values.
- The following source code is written in N3. N3 is based on RDF standards and is equivalent to RDF/XML syntax, but has extra features such as rules and formulas.
- @prefix cmw: <http://comindware.com/logics#>.
- This directive declares the namespace http://comindware.com/logics# for the "cmw" prefix.
- the @prefix directive binds a prefix to a namespace URI. It indicates that a qualified name (qname) with that prefix will thereafter be shorthand for a URI consisting of the concatenation of the namespace identifier and the bit of the qname to the right of the colon.
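The concatenation rule for qualified names can be shown in a few lines. The `expand_qname` helper is hypothetical, and the namespace URI below is a placeholder on example.org rather than the one used in the source listing.

```python
def expand_qname(qname: str, prefixes: dict[str, str]) -> str:
    """Expand 'prefix:local' to the namespace URI followed by the local part."""
    prefix, colon, local = qname.partition(":")
    if not colon or prefix not in prefixes:
        raise KeyError(f"unknown prefix in {qname!r}")
    return prefixes[prefix] + local


# Placeholder namespace; the real URI is whatever the @prefix directive binds.
prefixes = {"cmw": "http://example.org/logics#"}
full_uri = expand_qname("cmw:title", prefixes)
```

Everything to the right of the colon is appended unchanged to the namespace URI bound to the prefix.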
- comments describing a particular block of the document follow the "#" sign.
- the comments can be enclosed as "##” as well.
- Classes just tell about the thing which is in them.
- An Object can be represented/described in many classes. Any sort of a hierarchical relationship is not required. If there is a relationship between two classes it is possible to state it. See properties of classes in the RDF Schema (http://www.w3.org/TR/rdf-schema/) and OWL vocabularies (http://www.w3.org/TR/owl-guide/ or http://www.w3.org/TR/owl2-overview/).
- the invention can be used with triples, quadruples, and so on. For purposes of examples, triples are used.
- the Enumeration (enum) autotag ##.
- This is an autotag whose type is a property, where the property takes its value from a list of specific values (in other words, an autotag for a property that consists of a list of values).
- An enumerator (an enum) can be used as the list of values.
- An example of such enum is a bug or task importance/priority:
- the bug severity is a property.
- the name of the property (bug severity) is “Bug Severity.”
- the type of the property is “enumProperty.”
- The property can have a description, such as "Severity of the bug," which the GUI can display as a hint or additional information for the user. This hint can be shown as text that becomes visible when the user holds the mouse pointer over a GUI element without clicking on it.
- The bug severity has values ("valueVariant"). In this case, the bug can have a low, a medium, or a high severity (represented as "bugSeverity:low," "bugSeverity:medium," and "bugSeverity:high").
- Each bug severity value (low, medium, and high, represented as bugSeverity:low, bugSeverity:medium, and bugSeverity:high) is an instance of the "ValueVariant" Class:
- Each bug severity value can have a name (as can the property from the source code block provided above):
- The bug severity autotag is declared as an autotag.
- Each bug severity value (low, medium, high) is a value of the bug severity autotag.
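The enumeration autotag described above can be modeled as plain triples in Python. The identifiers `example:bugSeverityAutotag`, `autotag:predicate`, `autotag:value`, `cmw:variantName`, and the `bugSeverity:*` values follow the statements quoted in this document; the display names "Low" and "Medium" and both helper functions are assumptions added for illustration.

```python
triples = [
    ("bugSeverity:low", "cmw:variantName", "Low"),        # name assumed
    ("bugSeverity:medium", "cmw:variantName", "Medium"),  # name assumed
    ("bugSeverity:high", "cmw:variantName", "High"),
    ("example:bugSeverityAutotag", "autotag:predicate", "example:bugSeverity"),
    ("example:bugSeverityAutotag", "autotag:value", "bugSeverity:low"),
    ("example:bugSeverityAutotag", "autotag:value", "bugSeverity:medium"),
    ("example:bugSeverityAutotag", "autotag:value", "bugSeverity:high"),
]


def objects(subject: str, predicate: str) -> list[str]:
    """All Objects of triples with the given Subject and Predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def autotag_value_names(autotag: str) -> list[str]:
    """Display names of every value declared for the autotag."""
    return [
        name
        for value in objects(autotag, "autotag:value")
        for name in objects(value, "cmw:variantName")
    ]


severity_names = autotag_value_names("example:bugSeverityAutotag")
```

The display names are what a GUI could offer to the user, while the `bugSeverity:*` identifiers are what the search operates on.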
- The product is now the predicate of the autotag.
- Product1, Product2, and Product3 will be defined as instances of the Product Class, and titles (names) will be assigned to them.
- Autotag values can be defined manually or found automatically. In the lines above, Product1, Product2, and Product3 were defined directly (manually) as the autotag values. Manually defined autotag values must be created each time a new Item is created or changed, and if an Item is deleted, the corresponding autotag values must be deleted as well. Rules can be used to avoid defining autotag values manually.
- Rule-defined autotag values can be used instead of directly defined autotag values.
- The rule definition method is the preferable way to define autotag values.
- The "productForBugAutotag" part of a statement can be interpreted as "the product for the Bug Autotag" or as "the Autotag with the name 'Product for Bug'"; in other words, "the Autotag that associates the Product with the Bug."
- Another example is "cmw:propertyAttributes", which can be interpreted as attributes of a property, or property attributes.
- The "creatorAutotag" part of the statement can be interpreted as the Autotag whose name is "Creator," or author, i.e., the person who creates something, for example, a task or a bug.
- N3 is based on RDF standards and is equivalent to RDF/XML syntax, but has extra features such as rules and formulas. Rules can be written in N3, OWL, XML, and other languages.
- The curly brackets here enclose a set of statements and represent a formula. All formulas are enclosed in curly brackets. Apart from the fact that the Subject and the Object of the triple are formulas (the Subject formula consists of two statements and the Object formula of one statement), the example shown above is just a single statement.
- A formula is a part of a rule and can be represented by a set of statements (at least one statement), wherein the rule is a statement as well, and wherein a statement is a triple.
- "?x" is a variable (an unknown, desired value).
- The position of "?x" does not always have to hold an unknown: a known part of the statement (a URI) can be used in place of "?x" to verify a fact.
- Any data, such as an entity, an event, and so on, can be described by a triple.
- a triple can describe the fact as well, for example, "Bill is a Man,” “John is a Man,” “John's position is a manager.”
- The directly defined values for the autotags must be updated by a software developer (or someone else with the necessary permissions) each time an Item is created or changed. These values must be deleted after the corresponding Items are deleted. So, if the "Product4" Item is added (for example, if the new triple "products:Product4 a example:Product;" is created or generated automatically and optionally added to a database), the triple "example:productForBugAutotag autotag:value products:Product1, products:Product2, products:Product3." must be changed to the triple "example:productForBugAutotag autotag:value products:Product1, products:Product2, products:Product3, products:Product4." With the "rule definition" method, after the new Item is created by the user, the engine 305 finds the new values for the autotag (here the autotag values are the result of software processing of the data/triples).
- When a new product is created, the engine 305 transforms it into at least one triple, such as "products:Product4 a example:Product;", and into other necessary triples, such as "products:Product4 cmw:title "Case Management Software v2.5"."
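The difference between directly defined and rule-defined autotag values can be sketched in Python. The function below is a hand-written stand-in for a rule of the form "every instance of example:Product is a value of example:productForBugAutotag"; it is not the engine 305, and the function name is hypothetical. The triple identifiers follow the statements quoted above.

```python
def derive_product_values(facts):
    """Derive autotag:value triples from the 'a example:Product' facts."""
    return {
        ("example:productForBugAutotag", "autotag:value", subject)
        for subject, predicate, obj in facts
        if predicate == "a" and obj == "example:Product"
    }


facts = {(f"products:Product{i}", "a", "example:Product") for i in (1, 2, 3)}
before = derive_product_values(facts)

# A new Item arrives as triples; no autotag statement is edited by hand.
facts.add(("products:Product4", "a", "example:Product"))
facts.add(("products:Product4", "cmw:title", "Case Management Software v2.5"))
after = derive_product_values(facts)
```

Re-running the derivation after the new triples are stored picks up Product4 automatically, whereas the directly defined list would have to be rewritten.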
- The "Event Filter-Interceptor Of Actions With The Item" 310, which is a part of the engine 305, intercepts all events involving Items.
- The "Structured Data Processing Module" 315, which is also a part of the engine 305, translates all events involving Items into triples.
- The applicable values that correspond to the added Item are found for the autotag during Item creation.
- Triples for a new or changed Item can be added to the database (or to RAM and/or a cache) while the Item parameters are being adjusted, i.e., if at least one Item is created or changed, one or more triples can be generated and stored in the database before the Item is saved.
- This set of triples describes the Item, the Item's relations with other Items, and the data storage where the Item triples are stored. Note that these triples can be written manually by a developer, a user, a database administrator, etc. Such triples can also be created by software.
- Saving the Item (to RAM, an HDD, network/cloud storage, etc.) can be an intermediate save, for example, when the Item parameters and attributes are stored while the Item is being adjusted. A complete save can be performed when a user, a system administrator, a developer, or a database user or administrator decides that the Item is modified/adjusted/configured enough for the current purposes. Note that the Item and its parameters can be changed or reconfigured in the future; for example, a new ontology can be added, the Item can be marked as an autotag, and so on.
- Autotag values can be found not only after Item creation, but also on a search request from the user (search requests are discussed below). In that case, the system sends a request to find the data (translated from the user's search request) -> the autotags are found for these data -> the autotag values are found by using the autotag predicates -> the autotag values are used to find Items that form the answer to the search query. The rule described above is thus used to find the autotag values (the so-called calculation of autotags) that are necessary to find Items for the search words.
- the rule definition method replaces the manual adding of the autotag values.
- the rule definition method does not limit the possibility of the manual adding of triples, including part of statements, including properties/attributes and values.
- the manual adding of triples does not limit the possibility of the automatic adding of triples, including part of statements, including properties/attributes and values.
- A combination of manual and automatic (software) methods for creating triples can be used for fine-tuning or for a more detailed description of Items or of the system as a whole (for example, a bug tracking and fixing system in a customer service system, a personnel management system, etc.). It can also be used to automate Item definition, for example, in the case of duplicated Item parameters, by using the parameters of similar/related Items and similar triples, including ontologies.
- Ontologies are used to describe Items; this information can be obtained by calculation (from the rules) or can be hard-coded in the form of axioms. Multiple ontologies can be used for one Item, and a single ontology can be used for several Classes, for example, for similar Classes.
- cmw:propertyAttributes cmw:predefined, cmw:readonly.
- These two lines introduce accountTagAttribute as an attribute of a property (a kind of flag). Each bug creator is marked by this flag, i.e., every creator will have the accountTagAttribute property attribute.
- The left side of this rule is a formula that consists of two statements.
- The first statement searches for all things that have accountTagAttribute as a property attribute.
- The second statement searches for all things that are related by the propertyName predicate; these things are stored in the "?property" and "?name" variables.
- Everything found and stored in the variables is passed to the right side of the rule. In other words, all pairs for the propertyName predicate are found and then passed to the right side of the rule.
- The first two statements on the right side of the rule state that all things found on the left side of the rule are autotags and are autotag predicates.
- The third statement uses the data stored in the ?name variable as autotag names.
- The fourth statement declares Account as the autotag type.
- The first statement on the left side of the rule searches for all things with the autotag type Account.
- The second statement of the rule searches for all things that are accounts, i.e., the Items that are instances of the Account Class.
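The two rules walked through above can be approximated by a small forward-chaining function in Python. This is a sketch of the described logic, not the N3 source: `cmw:propertyAttributes` and `autotag:accountTagAttribute` appear in the text, while `cmw:propertyName`, `autotag:name`, `autotag:type`, `autotag:Autotag`, `cmw:Account`, and the sample facts are spelled here by assumption.

```python
Triple = tuple[str, str, str]


def apply_rules(facts: set[Triple]) -> set[Triple]:
    derived = set(facts)
    # Rule 1, left side: properties flagged with accountTagAttribute
    # that also have a propertyName.
    flagged = {
        s for s, p, o in facts
        if p == "cmw:propertyAttributes" and o == "autotag:accountTagAttribute"
    }
    for s, p, o in facts:
        if p == "cmw:propertyName" and s in flagged:
            # Rule 1, right side: autotag, its predicate, its name, its type.
            derived |= {
                (s, "a", "autotag:Autotag"),
                (s, "autotag:predicate", s),
                (s, "autotag:name", o),
                (s, "autotag:type", "cmw:Account"),
            }
    # Rule 2: every account is a value of every Account-typed autotag.
    account_tags = {
        s for s, p, o in derived if p == "autotag:type" and o == "cmw:Account"
    }
    accounts = {s for s, p, o in derived if p == "a" and o == "cmw:Account"}
    derived |= {(tag, "autotag:value", acc) for tag in account_tags for acc in accounts}
    return derived


facts = {
    ("cmw:creator", "cmw:propertyAttributes", "autotag:accountTagAttribute"),
    ("cmw:creator", "cmw:propertyName", "Creator"),
    ("cmw:title", "cmw:propertyName", "Title"),  # not flagged, so not an autotag
    ("account:Morgan", "a", "cmw:Account"),
}
result = apply_rules(facts)
```

Only the flagged property becomes an autotag, and the account "account:Morgan" becomes one of its values without any value being listed by hand.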
- the autotag with the URI "example:userRelatedToBugAutotag” is created.
- This autotag is used to establish the relation(s) between a person and a bug.
- Such relations can be represented by the bug and bug creator (in some embodiments of the invention the bug creator is the person, who detected the bug and/or created the bug Item).
- a developer, a bug-fixer, a user, DB or system administrator, an operator and others can be such a person.
- the following statements show that some persons can be related with the bug, for example, a bug creator, and a developer/or a bug fixer, who will fix this bug. In other words, in this case, the bug can be related with or assigned to persons responsible for this bug.
- autotag predicates “cmwxreator” and “example:bugFixer” are defined for the autotag “example:userRelatedToBugAutotag.”
- the autotag predicates are the characteristic of the autotag. Based on the predicates it is possible to say which search area the autotag belongs to. In other words, autotag predicates of the specific autotag belong to one type.
- the values of ?property were obtained using the "?property cmw:propertyAttributes autotag:accountTagAttribute" triple, from which it can be seen that only one ?property type is used. Further in the example, the values of ?property are used as the autotag predicates.
- the second statement is a complex statement written in the Comindware Language.
- the Comindware Language allows the "or" logical operator to be used, so that the "?product" variable can take the value "products:Product1" or "products:Product2".
- the first statements written above can be described as: find all pairs "?x + ?product" with the "example:product" predicate, wherein the "?product" variable can have the "products:Product1" value or the "products:Product2" value. All things (bugs in this case) with the high severity status are found in the third statement of the rule (on the left side of the rule).
- the "query:42" is the interpretation of the user's search request.
- the above example describes the search for all bugs for "Product1" and "Product2."
- This example can be considered as follows: a user enters "high" as a search word in the GUI textbox; the system associates the entered word with at least one statement element (in this case with "bugSeverity:high") based on the analysis of statements.
- the statement "bugSeverity:high cmw:variantName "High"" can be one of such analyzed statements.
- the statement element is related by the system to the "autotag:value" predicate (as is evident from the "example:bugSeverityAutotag autotag:value bugSeverity:low, bugSeverity:medium, bugSeverity:high." statement).
- the system finds all Items by using the mentioned rule for at least one appropriate autotag predicate (in this case, the "example:bugSeverity" predicate from the "example:bugSeverityAutotag autotag:predicate example:bugSeverity;" statement).
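The chain described above (search word, statement element, autotag value, autotag predicate, Items) can be sketched in Python as follows. This is only an illustration under stated assumptions: the in-memory set of triples, the Items "bug:1" and "bug:2", and the function name `find_items` are hypothetical and do not come from the described system, which evaluates rules in the Comindware Language.

```python
# Hypothetical in-memory triple set; the identifiers echo the statements quoted above,
# while "bug:1" and "bug:2" are invented Items used only for illustration.
TRIPLES = {
    ("bugSeverity:high", "cmw:variantName", "High"),
    ("example:bugSeverityAutotag", "autotag:value", "bugSeverity:high"),
    ("example:bugSeverityAutotag", "autotag:value", "bugSeverity:low"),
    ("example:bugSeverityAutotag", "autotag:predicate", "example:bugSeverity"),
    ("bug:1", "example:bugSeverity", "bugSeverity:high"),
    ("bug:2", "example:bugSeverity", "bugSeverity:low"),
}


def find_items(search_word, triples):
    """Resolve a search word to Items through autotag values and predicates."""
    word = search_word.lower()
    # 1. Statement elements whose variant name matches the search word.
    elements = {s for s, p, o in triples
                if p == "cmw:variantName" and o.lower() == word}
    # 2. Autotags that list one of those elements as an autotag value.
    autotags = {s for s, p, o in triples
                if p == "autotag:value" and o in elements}
    # 3. Autotag predicates declared for those autotags.
    predicates = {o for s, p, o in triples
                  if p == "autotag:predicate" and s in autotags}
    # 4. Items linked to a matching element through one of those predicates.
    return sorted(s for s, p, o in triples
                  if p in predicates and o in elements)
```

Calling `find_items("high", TRIPLES)` walks the four steps and returns only the Item whose severity is "bugSeverity:high".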
- any type of a machine/processor/application readable request/query can be used.
- the query characterizes and describes the request from the user, and the result(s) of the autotag search are stored in the "?x" variable.
- the following example is an alternative implementation of the search request for bugs that have a high severity status and are related to the "Product1" and "Product2" products.
- the second rule is similar to the above example of a query/search request for the Item search.
- the Item search itself is processed in the first formula, in which the found Items are stored in "?x", and "?values" and "?tag" from the second rule are used as the input parameters.
- the first rule is called from the second rule with the specified parameters, i.e., the values of "?tag" and "?values".
- in the right-hand side of the first rule, the content of the brackets (?tag ?values) is an array/list of two elements/variables.
- a list or sequence is an abstract data type that implements a finite ordered collection of values, where the same value may occur more than once.
- An instance of a list is a computer representation of the mathematical concept of a finite sequence; the (potentially) infinite analog of a list is a stream. All things found on the left side of the second rule (and stored in the "?x" variable) are the result of the "query:42" search request.
- the transitive autotag can be used for the Item search, where the Items are related with other Items.
- such autotags can be used for Item search in the Item group.
- the Item group can be created for some Items to provide links/relations of Items.
- linked/related Items (for example, Items linked by at least one group/family) have a characteristic which can further be used as an autotag predicate.
- the "example:productFamily" predicates can be used for establishing the Item belonging to at least one group/family.
- some common Item attributes/properties can be described and used for all Items included into the group, for example, bug or task priority/severity, belonging to the same product, etc.
- the example:ProductFamily statement element is determined as a Class. Further, "productFamilies:Family1" and "productFamilies:Family2" are determined as instances of "example:ProductFamily." Thus, a concept of the product family is introduced. Such families can be used to search for the Items (products in this case) included in a family. This can be implemented by declaring a property of the statement element and using it to search for bugs that are related to a family, for example, to products that belong to Enterprise products. Such properties can be used as autotag values for the Item search. As mentioned above, the Item has attributes such as a name, attribute properties, property types, etc. Note that some attributes can be assigned to the Item by default, for example, by using ontologies that can consist of facts and/or rules.
- the Item attribute name can be assigned from the system data, external application/module, combined from the other Item name or attribute and property.
- a counter: each Item can be numbered according to a counter
- the current date, for example, can be added to the Item attribute; resulting names can look like Item001, Item002 or Bug_10_10_2014.
- the autotag name and autotag predicate are defined: autotag:name "Product Family for Bug";
- This rule is another demonstration of an automated search for autotag values without having to specify them manually, although the implementation of the present invention also allows the use of hand-written autotags and their predicates, values, etc. It is worth noting that one of the implementations of the present invention allows for combining all of them.
- the Item search is performed across all possible autotags and all autotags that are associated with the search word.
- the user can select at least one of the search areas (FIG. 5), which can be used for the Item search.
- the Item search in the defined search area can be implemented by using rules.
- autotag predicate names can be displayed to a user as search areas via a GUI.
- the word "high" entered by the user is the name of "bugSeverity:high" (as can be seen from the statements "example:bugSeverity cmw:valueVariant (bugSeverity:low bugSeverity:medium bugSeverity:high)." and "bugSeverity:high cmw:variantName "High".").
- the importance/severity property of the bug has the name "Bug Severity" (see the statement above: "example:bugSeverity cmw:propertyName "Bug Severity";"). This name can be used as the search area.
- the severity of the bug is an autotag predicate (see the statement above: "example:bugSeverityAutotag autotag:predicate example:bugSeverity;"), and the autotag has the name "Severity" (the statement from the source code: "example:bugSeverityAutotag autotag:name "Severity";"), which can also be shown to the user as the search area.
- the search by autotags can be processed in at least three ways:
- a user knows the Item property by which he wants to search, for example, "assignee" or "manager." If the user selects "manager," the system offers the user one of the autotags associated with the selected Item property, for example "managerName." After the user selects the appropriate autotag and then selects the user name from the list, the system stores the autotag and autotag value (the autotag predicate can be stored as well). These stored data can be used for a further Item search and/or for a search of all Items related to the current Item.
- a user knows exactly who/what needs to be found (i.e., he knows the name of the Item). The user can enter "Bill," and the system will provide him with such autotags as "assignee," "manager" and other autotags associated with the entered search word "Bill." In this case, the Item name is known, and appropriate autotags are found for it.
- the search process is carried out as described in the previous paragraph.
- the user can enter the search word without knowing which of the search areas will satisfy him, so he can select the search in all autotags by using the logical operator "OR". In this case, all search results (whose predicates are related to the search word) are shown to the user.
- autotags can belong to autotag set(s).
- the autotag set can be created/described for the bug severity, for the product, and so on. Values of autotags from the autotag set belong to this autotag set.
- the autotag searches are performed using the logical operators "AND" and "OR."
- the operator "OR" is used to find Items with autotag values from the same autotag set. For example, if the user wants to find bugs with the severity "High" and "Low," then the operator "OR" is used and the request can be interpreted as "search for all bugs whose severity is High OR Low."
- the "AND" operator is usually used for combining search requests from different autotag sets.
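The combination of the two operators described above can be sketched in Python: values from one autotag set are alternatives (OR), and different autotag sets must all be satisfied (AND). The dictionary layout of Items and the names `matches` and `BUGS` are assumptions made for this sketch, not part of the described system.

```python
def matches(item, selection):
    """Return True if the item satisfies the selection.

    `selection` maps an autotag set name to the set of accepted values.
    Values inside one autotag set are alternatives (OR); every autotag set
    in the selection must be satisfied (AND).
    """
    return all(item.get(tag_set) in accepted
               for tag_set, accepted in selection.items())


# Hypothetical Items, each described by one value per autotag set.
BUGS = [
    {"id": "bug:1", "severity": "High", "product": "Product1"},
    {"id": "bug:2", "severity": "Low", "product": "Product2"},
    {"id": "bug:3", "severity": "Medium", "product": "Product1"},
]

# "severity is High OR Low" AND "product is Product1"
selection = {"severity": {"High", "Low"}, "product": {"Product1"}}
found = [bug["id"] for bug in BUGS if matches(bug, selection)]
```

With this selection only "bug:1" is found; dropping the product constraint would also return "bug:2".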
- the autotag can be determined by the autotag predicate, autotag property, Item property.
- Item's content can be analyzed during the Item search, i.e., the Item main body text, a content of Item text fields, attached documents can be parsed during the Item search. Parsing or syntactic analysis is the process of analyzing a string of symbols, either in a natural language or in computer languages, according to the rules of a formal grammar. A search in such elements can be provided instead of the parsing. Such search or parsing can be provided by third-party modules (an additional search engine, which can provide a full search in Items, for example). However, autotags can be created for such search, for example for the data search in Item content.
- Autotags can reflect the character or emotions of the data/content.
- Such autotag values can be described based on the Item data analysis; for example, such data can be found in the Item description, in Item attachments, etc. So if the Item data or attachments consist of the content elements (words or sets of words) "A", "B", "C" and "D," then "ABCDE" can be considered as the autotag value.
- Such summary content can be compressed before being used as an autotag value, for example, it can be hashed.
- Such content elements can be common elements in various contents of Items. So conclusions can be made based on the Item data, and these conclusions can be used as autotag values. A semantic analysis of the Item data can be used for analysis discussed above.
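Compressing a content summary into an autotag value by hashing can be sketched as follows. The choice of SHA-256 and the function name `hashed_autotag_value` are assumptions of this sketch; the text only states that the summary can be hashed.

```python
import hashlib


def hashed_autotag_value(content_elements):
    """Join content elements into a summary and return its SHA-256 hex digest."""
    summary = "".join(content_elements)
    return hashlib.sha256(summary.encode("utf-8")).hexdigest()


# The same content elements always yield the same compact autotag value.
value = hashed_autotag_value(["A", "B", "C", "D"])
```

The digest has a fixed length regardless of how long the summary is, which is the property that makes it usable as a compact autotag value.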
- the above described example of the invention (when a user initiates a process of Item search) is also called on-demand and on-the-fly search.
- the previously found autotags, autotag values and autotag predicates can be used in the next data/things search, which can be initiated by the user or by the system.
- Such found autotags, autotag predicates and autotag values can be used for searching the Items (which correspond to the user search words and selected parameters) that have not been changed since the last search request.
- the on-demand search method provides relevant search results, and it is not necessary to start a new search process.
- the Items that have been changed after the last search can be marked in the list of changed Items.
- the list of changed Items can be used during the on-demand search to signal to the system for which Items the autotag values must be re-found and for which Items the autotag values are still relevant.
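The role of the list of changed Items can be sketched as a small cache: previously found autotag values are reused until the Item is marked as changed, and only then are they found again. The class `AutotagCache` and its `compute` callback are hypothetical names introduced for this sketch.

```python
class AutotagCache:
    """Reuse previously found autotag values until an Item is marked as changed."""

    def __init__(self, compute):
        self._compute = compute   # hypothetical function: item_id -> autotag values
        self._values = {}         # previously found autotag values per Item
        self._changed = set()     # the list of changed Items

    def mark_changed(self, item_id):
        """Record that the Item was edited after the last search."""
        self._changed.add(item_id)

    def get(self, item_id):
        """Return autotag values, re-finding them only for new or changed Items."""
        if item_id in self._changed or item_id not in self._values:
            self._values[item_id] = self._compute(item_id)
            self._changed.discard(item_id)
        return self._values[item_id]
```

A second `get` for an unchanged Item returns the stored values without calling `compute` again.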
- FIG. 3 illustrates a data processing algorithm, according to one embodiment of the invention.
- the Item Processing Engine 305 coupled with a GUI is used to allow the user to create, delete or edit Items.
- the Engine 305, by using the Module 315, converts the user actions with the Items to triples and stores them in the Database 390.
- the triples are stored in a database(s), which can reside on data storage such as local data storage, cloud storage in cloud services, SAN, NAS, various web services and others.
- any known data storing system, for example data serialization, can be used for storing the data, for example in XML format
- the triples can be stored in the database format or in the format of triples.
- the user can create a new task with a title "Add new color image with the mobile device to the product folder" for the employee "Maxim" (who is the assignee for it) and relate this task to a product, for example, "Comindware Process," i.e., link the task with the product to which the task belongs. Also, the user can change the status (from "Opened" to "Closed"), title and other Item properties.
- the Engine 305 transforms the result of the user actions into triples, so, for example, the triple "Task status Closed" is created and replaces the triple "Task status Opened" in the database 390.
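Replacing one triple with another when the user edits an Item can be sketched as follows. The set-of-tuples store and the function name `set_property` are assumptions of this sketch; the actual database 390 and its storage format are not specified here.

```python
def set_property(triples, subject, predicate, new_object):
    """Return a new triple set in which (subject, predicate) has exactly one value."""
    # Drop every existing triple for this subject and predicate...
    kept = {t for t in triples if not (t[0] == subject and t[1] == predicate)}
    # ...and add the triple that reflects the user's action.
    kept.add((subject, predicate, new_object))
    return kept


# Hypothetical store holding the task from the example above.
store = {
    ("Task", "status", "Opened"),
    ("Task", "assignee", "Maxim"),
}
store = set_property(store, "Task", "status", "Closed")
```

After the call the store holds "Task status Closed" instead of "Task status Opened", and the assignee triple is untouched.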
- Actions with Items are tracked by Event Filter-Interceptor Of Actions With The Item module 310.
- One of the purposes of this module is tracking the need for marking/unmarking the Item as a potential one for generating the auto-tag.
- an example of a system/application-initiated action is the automatic closing of a bug after support for the product (for which the bug was created) is stopped.
- Another example of such actions is the creation of a bug (and/or of a task for this bug) by the system/application after the bug is detected by the user, by the software debugger, or by a "try-catch" function.
- the Engine 305 also sends at least one request to the Autotag Processing Engine 320 for the Item search according to a search word.
- the request initiates the Item search process (in Engine 320) by using the rules mentioned above for the search of the autotag values.
- the "query:42" can be considered as an example of such a request.
- At least one found Item is the result of the functioning of the Engine 320. Note that a "null" result can be obtained from the Engine 320 if no Item was found during the Item search.
- autotags, autotag predicates and autotag values can be stored to RAM, including optional caching of them.
- data stored in RAM can be used, for example, by GUI to display found Items, and for further Items search, which can be initiated by the user by using other search words and same or other search area(s).
- the search areas can be extended by the user.
- Module for Determining Related Items 330 is responsible for determining the relations of Items with each other.
- an example of Item relations is the "transitive autotag," which is used for establishing direct and indirect relations between the Items.
- a direct Item relation can be represented as a link between two Items, and an indirect Item relation as a link to another Item through at least one other Item.
- the relations between Items can be established by using the rules and by using the description of the Items, Item properties and can be represented as autotag and autotag predicates.
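The difference between direct and indirect relations can be sketched with a breadth-first traversal over Item links. The adjacency dictionary `LINKS`, the Item identifiers and the function name `related_items` are hypothetical; the described system derives relations from rules and Item descriptions rather than from a fixed dictionary.

```python
from collections import deque


def related_items(start, links):
    """Return (direct, indirect) sets of Items reachable from `start`."""
    direct = set(links.get(start, ()))
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbor in links.get(current, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    # Everything reachable that is not a direct link is an indirect relation.
    indirect = seen - direct - {start}
    return direct, indirect


# Hypothetical links: a task belongs to a product, the product to a family.
LINKS = {
    "task:1": ["product:A"],
    "product:A": ["family:Enterprise"],
}
direct, indirect = related_items("task:1", LINKS)
```

Here the product is a direct relation of the task, and the product family is reached only through the product, so it is an indirect relation.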
- the search for autotags, autotag predicates and autotag values for related Items is performed based on the data received from the module 330 (and data from the RAM) and from the database 390.
- the result of previously searched Items and all data associated with them can be stored in RAM.
- the usage of the data from RAM can speed up the execution of the next search requests.
- the data from the database 390 can be obtained (and used by the Engine 305 and Engine 320) in the form of triples, a list of Items with relations, and in the other forms.
- the Item relations data (and the autotags, autotag values and autotag predicates related to them) can optionally be stored in the database 390 (or in another database, which can be used for storing the related Item triples) or in RAM for further processing. For example, such Item relations data can be used for the second search request from the user or from the system.
- the Autotags processing engine 320 uses the data from the module 330 and from the database 390 via the optional triple processing module 360.
- the optional module 360 is responsible for representation of the data from the database 390 in the applicable form for the autotag calculation engine, for example, if such data were stored not in the Subject-Predicate-Object format, but in the DB format.
- the data can be converted by DB means or modules instead of converting in the module 360.
- the module 360 is responsible for applying the rules to facts from a database.
- the module 360 can be a part of the Semantic processor 840 (FIG. 8) and perform all its functions or part of them.
- the module 360 is responsible for representation of the data from the Item Processing Engine 305 to the database format.
- the data from the Item Processing Engine 305 can be written to the database 390 after being converted to an appropriate database format, if necessary.
- FIG. 4 illustrates a flowchart of the part of the Item search process (406) that comprises finding autotags for changed or created Items.
- the autotags and other data that are necessary for Item search process can be found for related Items.
- the embodiment which does not employ on-demand search for autotags, autotag values and autotag predicates is depicted in FIG. 4.
- the autotags, autotag values and predicates can be found when the user asks to find things using the search words (in which case the autotags and other data are found on demand) or when an Item is changed or a new Item is created.
- the process goes to step 410.
- in step 410, the process determines if the current Item is related to other Items. The relations are established during the creation of a new Item, and during the associating and reassociating of Items with each other (for example, the task can be re-assigned to another user, or linked to a second Product, or the Product can be moved to another Project or Family, etc.). If the current Item is not related to other Items, then, in step 430, the process performs the search for the autotag, autotag predicates and values for the current search words. Then, the process goes to step 455, wherein the found autotag, autotag predicates and autotag values for Items are stored in the RAM or in the Database. Note that steps 410, 430 and 440 are steps of the Item search process.
- the found autotag predicates and autotag values can be sent to an external application, for example, if the search request was sent by the external application or the search result needs to be processed by the external application (for example, for displaying to users after conversion to a displayable format). Also note that autotag predicates and autotag values can be converted to the external application format.
- if, in step 410, at least one related Item is determined, the process moves to step 440, wherein the autotag values and predicates for related Items are found. Then, the process moves to step 430. Note that the previously found autotag values and predicates are deleted if the user or system deletes the Item corresponding to them. If the Item is changed, autotag predicates and values must be re-found. Also, autotag predicates and values must be found again for related Items as well.
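The order of steps 410, 440, 430 and 455 described above can be sketched in Python. The function name `process_item`, the `related` dictionary, the `find_autotags` callback and the plain dictionary used as storage are all assumptions of this sketch and stand in for the modules of FIG. 3 and FIG. 4.

```python
def process_item(item, related, find_autotags, store):
    """Find autotag data for a changed or created Item and for its related Items."""
    found = {}
    # Step 410: determine whether the current Item is related to other Items.
    for other in related.get(item, ()):
        # Step 440: find autotag values and predicates for each related Item.
        found[other] = find_autotags(other)
    # Step 430: find autotag data for the current Item itself.
    found[item] = find_autotags(item)
    # Step 455: store the results (here a dict stands in for RAM or the database).
    store.update(found)
    return found


# Hypothetical inputs: one task related to one product.
related = {"task:1": ["product:A"]}
store = {}
result = process_item("task:1", related, lambda i: {"name": i}, store)
```

For an Item without relations the loop body never runs, so the sketch goes straight from step 410 to step 430, as in the flowchart.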
- a weight can be assigned to the autotag predicate. This weight is used to extend the Item search beyond the autotag predicates found through the previously described Item search method that uses autotag predicates.
- weights of autotag predicates and their usage are described.
- the plurality of predicates is used to describe an example of usage of weights of autotag predicates.
- the plurality of predicates describes ties within a family. For example, a man has a mother, a father, a sister, a cousin, and a second cousin.
- the "relationshipAutotag" is created to describe the kind of relationship between family members: "motherForManRelationship," "fatherForManRelationship," "cousinForManRelationship," "secondCousinForManRelationship" and "sisterForManRelationship" are predicates for this autotag.
- a mother, a father and a sister are close relatives, but a cousin is a distant relative, and a second cousin is the most distant relative.
- the distance to the Item in the family tree can be described by the weight of autotag predicates. The closer a given relative is to the Item, the higher the weight of the autotag predicate.
- the motherForManRelationship, fatherForManRelationship and sisterForManRelationship predicates have a larger weight (for example, a weight equal to 1) than the cousinForManRelationship predicate (a weight equal to 0.7), and the weight of the secondCousinForManRelationship predicate is the smallest, equal to 0.3.
- the sister, the cousin, the brother are Items.
- Weights described above can be used for the search of such Items - i.e. it is possible to find Items by autotag predicates with weights that are more than 0.5.
- the Item search is performed through the following autotag predicates: motherForManRelationship, fatherForManRelationship, sisterForManRelationship and cousinForManRelationship, while the secondCousinForManRelationship autotag predicate does not participate in this search.
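Selecting the autotag predicates that take part in a search by their weight can be sketched as follows. The weights are the ones given in the family example above; the dictionary `WEIGHTS` and the function name `predicates_for_search` are names introduced only for this sketch.

```python
# Weights of autotag predicates from the family example above.
WEIGHTS = {
    "motherForManRelationship": 1.0,
    "fatherForManRelationship": 1.0,
    "sisterForManRelationship": 1.0,
    "cousinForManRelationship": 0.7,
    "secondCousinForManRelationship": 0.3,
}


def predicates_for_search(weights, threshold):
    """Return the predicates whose weight is strictly greater than the threshold."""
    return sorted(name for name, weight in weights.items() if weight > threshold)


selected = predicates_for_search(WEIGHTS, 0.5)
```

With a threshold of 0.5 the second cousin predicate (weight 0.3) is left out, while the other four predicates participate in the search.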
- weights can be assigned to autotag predicates manually and stored (i.e., for example, they can be stored to a database by a user, a developer, a database administrator, etc.) during the process of the Item creation and adjusting.
- weights of autotag predicates can be calculated and assigned/added to them. The calculation of weights of autotag predicates can be performed by using rules and surrounding context, such as ontologies, current Item data and other Items data. For example, if the bug was created on Friday, the predicate weight for this bug can be higher. The calculated weight can be added to the autotag predicate.
- calculation of weight of the autotag predicate can result in adding, increasing, decreasing of the weight of the autotag predicate (including the weight reduction to zero, which means that there is no weight of the autotag predicate or it means that the weight is infinitely small).
- the heavyWeightMotherForManRelationShip predicate can have a weight equal to 1 ("heavyWeight" in the predicate name is associated with the unit weight), and averageWeightFriendForManRelationShip can have a weight equal to 0.5 ("averageWeight").
- the weight can be specified directly in the predicate name, for example, WeightOneMotherForManRelationShip, Weight1MotherForManRelationShip, WeightHalfFriendForManRelationShip, Weight0Dot5FriendForManRelationShip.
- the presence of certain words or phrases in the predicate name can affect the weight of autotag predicate. For example, words like mother, father, wife, sister, brother, important, interested, WOW, radiant, great, etc. can implement the higher weights for autotag predicates, while little, weak, flaccid, dark, cousin can implement the low weights.
- semantics (the meanings) of part of autotag predicates names can determine the weight of autotag predicates.
- the propagation can be used to specify at least one additional predicate, which also is involved in searching for Items.
- An autotag predicate can be used as propagation parameter (i.e., an additional descriptor).
- the autotag predicate "cmw:creator" can be complemented by the descriptor "propagWith", which means that the predicate is propagated through the predicates that follow it ("mainPredicate propagWith targetPredicates"), here through the "example:bugFixer" predicate.
- the weight for the propagation can be adjusted to avoid the weights calculations for duplicated relations with Items.
- the bug has a text field "same_as_that_bug," i.e. duplicated relations exist and are stored in the database, or are virtually presented as relations/links calculated on demand.
- Virtual relations in this context mean the relations that can be found by using rules, ontologies, etc.
- Bill changes same/identical bugs, so high weights will be assigned to Bill (if the propagation was not adjusted, i.e. the default propagation was used), because each bug affects another one.
- the duplicates can be ignored and not used in the Item search.
- the search system/search engine can consider such duplicated bugs as one bug, or the relations between these bugs can be considered as one relation.
- the weights of relations are considered rather than the weights of predicates, i.e., the supposed weight of the predicate is considered.
- small weights can be also assigned to such duplicates, for example, the weight value can be equal to 0.00001, so it has a negligible effect for the sorting of found Items.
- This search system is able to automatically distribute weights based on algorithms and/or previously received data.
- the Item weight (and position in the sorted list of found Items) depends on the autotag predicate, which was used during the Item search, i.e. the Items with small weights (for autotag predicates with small weights) are positioned closer to the end of the sorted list of Items than Items with large values of weights. Note that the weights of autotag predicates are also used to establish the depth of the Item search. The weight of the predicates of the autotags and the search parameters determine whether the particular Item will be found, or some other Item, or nothing at all will be found.
- a car and the car's parts are used as an example.
- the car is composed of the following parts: doors, bumpers, trunk, bonnet, fenders, etc. Each of them is an instance of the Class named Car's part, and they are Items.
- an Item's property is a Color. Say the Color is propagated through the car's parts.
- the Car 1 Item is an instance of the Car Class.
- the Red color can be used as an Item search condition, i.e. a user wants to find the Red Car.
- the colored parts of the car can determine the overall car color, i.e. if one of the car's parts is colored red, the car's color is red.
- a color can be propagated through at least one predicate, for example, a door predicate (a left_front_door predicate, a right_rear_door predicate), a bumper predicate, a trunk predicate, a bonnet predicate, a fender predicate, etc.
- the bumper is 5 percent of the entire car, so the weight of this predicate is equal to 0.05.
- the weight of a door predicate is equal to 0.1, of a trunk predicate 0.2, of a body predicate 0.5, etc.
- the DB contains the following data for a Car 1: front and rear bumpers have a Green color, a body is Green, a trunk is Red and a door is Black.
- Car 1 has 60 percent Green colored parts, 20 percent Red colored parts and 10 percent Black colored parts (the front left door has been replaced and has not been painted yet).
- Another Item, Car 2, can comprise 40 percent Green colored parts (a Green colored trunk and two Green colored front doors) and 50 percent Red colored parts (in this case the car body is Red), while 10 percent of the car is Black colored bumpers.
- the DB can store (in the form of ontologies and/or rules) the fact that the car is Green if the Green colored car's parts make up more than 50 percent (i.e., the total weight of the autotag predicates is more than 0.5), that the car is Red if the total weight of the autotag predicates comes to more than 0.4, and that the car is a Black and Red colored car if the predicates weigh more than 0.5 for the Red color and 0.9 for the Black color.
- during the search process, the search engine combines the autotag predicates of the car parts, so Car 1 is found as a Green colored car, while Car 2 is found as a Red car and as a Red and Black car.
- the weight of the autotag predicate is a part of the total value (in this case 0.1 is 10 % of the car color, 0.5 is 50 %, etc.).
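The way the weights of car part predicates add up to a color share can be sketched in Python. The part weights and the colors of Car 1 and Car 2 are taken from the example above; the names `PART_WEIGHTS`, `color_shares` and `car_colors` are hypothetical, and the sketch applies only the two single-color rules (Green above 0.5, Red above 0.4), leaving out the combined Black and Red rule.

```python
# Predicate weights from the example: each weight is the part's share of the car.
PART_WEIGHTS = {"bumper": 0.05, "door": 0.1, "trunk": 0.2, "body": 0.5}

# Single-color thresholds from the example (the combined rule is not modeled).
THRESHOLDS = {"Green": 0.5, "Red": 0.4}

# (part type, color) pairs as described for the two cars.
CAR_1 = [("bumper", "Green"), ("bumper", "Green"), ("body", "Green"),
         ("trunk", "Red"), ("door", "Black")]
CAR_2 = [("trunk", "Green"), ("door", "Green"), ("door", "Green"),
         ("body", "Red"), ("bumper", "Black"), ("bumper", "Black")]


def color_shares(parts, part_weights):
    """Sum the predicate weights of the parts for each color."""
    shares = {}
    for part_type, color in parts:
        shares[color] = shares.get(color, 0.0) + part_weights[part_type]
    return shares


def car_colors(shares, thresholds):
    """Return the colors whose total weight exceeds the corresponding threshold."""
    return sorted(color for color, limit in thresholds.items()
                  if shares.get(color, 0.0) > limit)
```

Under these two rules Car 1 (Green share about 0.6) is found as Green and Car 2 (Red share 0.5) is found as Red.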
- the introduction of weights and the operations with them can vary, since the weight may represent the importance of a particular predicate, a high priority, etc. So the propagation is an implicit/indirect relationship between Items.
- FIG. 7 illustrates an example of different business applications used in different departments of a company and data processing flow within them.
- Data from a database can be separated into the user data as axioms, facts, rules and the ontologies (presented as axioms as well, but which can be distinguished by N3 syntax).
- the ontologies contain information about what data and how data should be presented for the particular user in a particular business application.
- the facts and the ontologies are processed by the Core, and the result of processing is data in a particular context according to the ontologies. Thanks to the use of RDF (Resource Description Framework), it is possible to work with different data sources (e.g., databases and data storages that are local, located on a corporate network, or located on the Internet).
- RDF: Resource Description Framework
- Element 707 permits presenting contextualized business information for a particular application.
- "Global" is an entity for a representation of data in a particular context of a business application.
- Context can be thought of as an environment surrounding a particular event, an object or an individual that defines the interpretation of data and actions/operations on data in a particular situation.
- the context defines a method of processing data in a particular situation. For example, someone's email address can be treated as login information in one context and as contact information in the user's profile in another context. It can be used as a requester in a support ticket in a third context, all depending on the interpretation of data.
- the proposed architecture permits each individual group or department within a business to work with its own database/storage and its own server system, while the global server with engine can present a combined view of all the data from all the departments. This is done on-the-fly and without duplicating the data, which is particularly important from the perspective of information security, as well as from the point of view of maintaining up-to-date information (in this case, when the data on one storage changes, there is no need to change the data on other storages).
- the on-the-fly results of search representation based on the calculated autotags can be illustrated with the following example.
- ontology data, i.e., the data that describes the entities that are being manipulated
- Each one is stored in its own database (or, alternatively, in a common database), and they are combined at a higher level.
- using a logical core, it is possible to combine the various available data into different combinations and sets of ontologies, for example using specific ontologies.
- a developer can see the context of the representation of the bug submitted by QA as a task assigned to him. In other words, it is possible to track which task each particular bug relates to.
- OLAP is a conventional technique for generating reports and different statistical documents.
- the OLAP cubes are frequently used by analysts to quickly process complex database queries, and they are particularly commonly found in marketing reports, sales reports, data mining, and so on.
- the business layer requests data from the core
- the logical core collects the data from different sources
- the engine required for completing the business layer request is put together from compiled rules (for example, in C# code, although the invention is not limited to any particular programming language).
- the engine can have rules that have been collected and compiled previously, as well as new rules that have not been processed before. Therefore, the compilation needs to be performed for the new rules only.
- the core does not need to constantly work with the data, but only addresses the data in response to requests from the business layer;
- the core returns the requested data to the business layer.
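The point that only new rules need to be compiled can be sketched as a cache of compiled rules. The class `RuleEngine`, its `compile_rule` callback and the rule names used below are hypothetical; the text mentions C# as one possible target, while this sketch uses Python for consistency with the other examples.

```python
class RuleEngine:
    """Keep compiled rules between requests and compile only unseen rules."""

    def __init__(self, compile_rule):
        self._compile_rule = compile_rule   # hypothetical function: rule text -> compiled form
        self._compiled = {}                 # rules collected and compiled previously

    def prepare(self, rules):
        """Compile the rules not seen before and return the list of those rules."""
        new_rules = [rule for rule in rules if rule not in self._compiled]
        for rule in new_rules:
            self._compiled[rule] = self._compile_rule(rule)
        return new_rules


# Hypothetical usage: the "compiler" here just upper-cases the rule text.
engine = RuleEngine(lambda rule: rule.upper())
first = engine.prepare(["filter_by_user", "filter_by_project"])
second = engine.prepare(["filter_by_user", "generate_status"])
```

On the second request only "generate_status" is compiled, because "filter_by_user" was already prepared for the first request.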
- the filter type rule is frequently used in tracker cases, for example, for receiving a list of support tickets/bugs/tasks from a particular user and a particular project with particular attributes. Therefore, from the overall pool of tasks/bugs/tickets, the information needs to be filtered for the particular user and project, and presented in the form of separate axioms.
- Transformative rules relate to the same information that can be presented in a different manner.
- the IT department views people as users of the IT system.
- a project manager can view the same people as resources working on particular tasks.
- transformative rules can include the following: at the input, data (axioms) is received that describes a particular ticket (i.e., requests from a particular end user) and its data representation. The engine then produces, at the output, axioms in the form of a specific user interface, filled in with data.
- Another example of a type of rule is a generating rule. For example, in a particular project that has more than 50% critical bugs and more than 50% critical tasks, the system automatically generates the fact about project status (e.g., the status of the project is listed as "Critical").
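The generating rule from this example can be sketched as follows. The triple vocabulary (`type`, `project`, `severity`) and the function names are hypothetical; only the thresholds (more than 50% critical bugs and more than 50% critical tasks) and the resulting "Critical" status come from the text.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]


def critical_share(facts: Set[Triple], project: str, kind: str) -> float:
    """Fraction of the project's items of the given kind that are critical."""
    items = [s for s, p, o in facts
             if p == "type" and o == kind and (s, "project", project) in facts]
    if not items:
        return 0.0
    critical = [s for s in items if (s, "severity", "critical") in facts]
    return len(critical) / len(items)


def generate_status(facts: Set[Triple], project: str) -> List[Triple]:
    """Generating rule: emit a new status fact when both thresholds are exceeded."""
    if (critical_share(facts, project, "bug") > 0.5
            and critical_share(facts, project, "task") > 0.5):
        return [(project, "status", "Critical")]
    return []


facts: Set[Triple] = set()
for item, kind, severity in [
    ("b1", "bug", "critical"), ("b2", "bug", "critical"), ("b3", "bug", "minor"),
    ("t1", "task", "critical"), ("t2", "task", "critical"), ("t3", "task", "minor"),
]:
    facts |= {(item, "type", kind), (item, "project", "crm"),
              (item, "severity", severity)}

new_facts = generate_status(facts, "crm")
```

With two of three bugs and two of three tasks marked critical, the rule generates the single fact `("crm", "status", "Critical")`, which was not present in the input data.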
- axiom conjunction example is frequently the most easily understood one.
- the "bugs" axiom from the bug tracker: bugs identified by the testers, who are typically quality control engineers and who fill out bug tracking forms based on certain requirements
- the "result" axiom: results which, in essence, combine how many bugs are associated with a single functional requirement.
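The conjunction of axioms described above can be sketched as a simple join. This is an illustrative sketch only: the "requirements" axiom, the `relatesTo` and `bugCount` predicates, and the function `conjoin` are assumed names, chosen to show how a result axiom could count the bugs associated with each functional requirement.

```python
from collections import Counter
from typing import List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical input axioms: functional requirements and bugs from the tracker.
requirements: List[Triple] = [
    ("req1", "title", "Login"),
    ("req2", "title", "Export"),
]
bugs: List[Triple] = [
    ("bug1", "relatesTo", "req1"),
    ("bug2", "relatesTo", "req1"),
    ("bug3", "relatesTo", "req2"),
]


def conjoin(requirements: List[Triple], bugs: List[Triple]) -> List[Triple]:
    """Build the result axiom: number of bugs per known requirement."""
    known = {s for s, _, _ in requirements}
    counts = Counter(o for _, p, o in bugs if p == "relatesTo" and o in known)
    return [(req, "bugCount", str(counts[req])) for req in sorted(known)]


result = conjoin(requirements, bugs)
```

The `result` list holds one fact per requirement, here a count of 2 for `req1` and 1 for `req2`.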
- FIG. 8 illustrates the system in one embodiment of the invention.
- JSON/Ajax API 822 is a module that implements the protocol for calling API (Application Programming Interface) methods using the JSON representation format, with data transmitted over HTTP using Ajax requests.
- WCF API 824 is a module that implements the protocol for calling API methods using the XML/SOAP representation format, with data transmitted over HTTP.
- the API manager module 826:
- API manager module 826 uses N3-files 830 (which contain triples in the N3 format) for searching the method implementation 828, wherein the "api:Method" ontology (for example) lists all methods, argument types and return values.
- Implementation of the method is a program code. For example, it can save the data to a database, or it can close or open the task and other operations.
- a "CreateTask” API method is used for the task creation.
- the method accepts tasks data as an argument and returns the identifier for the created task.
- the JSON-handler gets the method name and arguments (the task data) in the JSON format when the method is invoked via the JSON protocol. Then, the arguments are converted to an internal in-memory representation and transmitted to the API manager.
- the API manager (which has a list of methods) can find the required "CreateTask" method by name. Then, the API manager validates the arguments (their number and types) and executes the "CreateTask" method. After the task has been created by the "CreateTask" method, the API manager transfers the result back to the JSON-handler.
- the JSON-handler converts the result to the JSON format and sends it back to the client (for example, to the MS Outlook client or to a mobile device application).
- API Manager loads the API specification and extension modules from the Database during the application start (MS Outlook add-in/plug-in 807).
- This specification can be requested by the Outlook plug-in 807 of MS Outlook client 806, either through a special Ajax request or as a schema in the WSDL format using the SOAP protocol.
- a Web Client 802 (for example, based on JavaScript, or on HTML5) or the Command Line Client 804 can be used instead of MS Outlook.
- The console client is a client application that can call API methods from the command line.
- a mobile application on a mobile device 801 can be used.
- The JavaScript client is a client application that is executed in the user's web browser and can call API methods using the JavaScript language.
- The Outlook add-in is a client application that is executed within the MS Outlook application and can call API methods using the WCF protocol.
- the Web Services Description Language is an XML-based interface description language that is used for describing the functionality offered by a web service.
- a WSDL description of a web service (also referred to as a WSDL file) provides a machine-readable description of how the service can be called, what parameters it expects, and what data structures it returns. Thus, it serves a purpose that corresponds roughly to that of a method signature in a programming language.
- client applications (801, 802, 804, 806) can make calls using Ajax requests in the JSON format or using the SOAP protocol.
- the main stages of processing the request are:
- The incoming request is processed by the HTTP server (or by the external server or MS Exchange Server). The JSON or SOAP payload is converted to the internal format.
- API manager 826 receives the input data and validates the input arguments to match the method description.
- API Manager 826 loads and prepares required data model and creates the snapshot of the model for isolation from the other requests and operations. The write transaction is opened if the operation changes the model data.
- if the operation is a modifying one, the transaction is closed, and security checks on the changes, conflict detection, and an update of the transaction history are performed.
- the result is serialized in the format required by the client and returned in the HTTP response.
- the business logic of the application 820 implements an object layer over the facts storage.
- Access to data is provided through the client API, which contains methods for reading/writing objects (e.g., object templates, business rules, etc.)
- Clients call the API methods through sessions that are created after client authorization.
- This layer contains a number of system ontologies, such as, for example, "the template of the user object” or "business-rule.” Ontologies are used in API for data serialization and validation.
- the data storage 842 provides a physical storage for the model data on the hard disk drive.
- the data is sent to the data storage 842 and back out in the form of the facts (triples).
- a fact is a triple, which is stored in the model. Also, the fact can be obtained by applying the rules or requests.
- a data storage consists of:
- the triples streaming store 858, which allows triples to be recorded to and queried from a special file format. The streaming store supports multiple types of queries on various triple components;
- the transaction and snapshots manager 854, which allows creation of:
- transactions, which are objects with an interface for atomic modification of the triples in the storage. The model can be changed only within such a transaction, which guarantees atomic modification of the stored triples (either all changes made within a transaction are committed, or none of them);
- snapshots, which are objects with an interface for consistent reads from the triple storage. It is guaranteed that none of the transactions committed during the existence of a snapshot affects its contents.
- Triples stored in the repository are simple, small objects (numbers, strings, names).
- the binary stream manager 856 is used to save large values (files, data streams) onto the storage.
- the stream is stored in a separate file, and a link to the stream is stored to this file;
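The transaction and snapshot guarantees listed above can be sketched with an in-memory store. This is a minimal sketch under stated assumptions: `TripleStore` and `Transaction` are hypothetical names, and an immutable `frozenset` stands in for the on-disk streaming store. A snapshot is an immutable view, and a transaction applies either all of its changes or none of them.

```python
from typing import FrozenSet, Set, Tuple

Triple = Tuple[str, str, str]


class TripleStore:
    """Hypothetical in-memory store with snapshots and atomic transactions."""

    def __init__(self) -> None:
        self._facts: FrozenSet[Triple] = frozenset()

    def snapshot(self) -> FrozenSet[Triple]:
        # The returned frozenset is immutable; later commits replace
        # self._facts with a new object and never alter this view.
        return self._facts

    def transaction(self) -> "Transaction":
        return Transaction(self)


class Transaction:
    """Buffers changes and applies them all at once on successful exit."""

    def __init__(self, store: TripleStore) -> None:
        self._store = store
        self._added: Set[Triple] = set()
        self._removed: Set[Triple] = set()

    def add(self, fact: Triple) -> None:
        self._added.add(fact)
        self._removed.discard(fact)

    def remove(self, fact: Triple) -> None:
        self._removed.add(fact)
        self._added.discard(fact)

    def __enter__(self) -> "Transaction":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        if exc_type is None:  # commit: all buffered changes in one step
            self._store._facts = (self._store._facts - self._removed) | self._added
        return False  # on error nothing was applied; re-raise the exception


store = TripleStore()
with store.transaction() as tx:
    tx.add(("task1", "status", "open"))

before = store.snapshot()
with store.transaction() as tx:
    tx.remove(("task1", "status", "open"))
    tx.add(("task1", "status", "closed"))
after = store.snapshot()
```

The `before` snapshot still shows the task as open even after the second transaction commits, while `after` reflects the committed change.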
- the data storage model 850 represents a set of interfaces for managing data storage 851. Such interfaces can include transactions, snapshots, the interface for requesting the facts (triples) from the snapshot and interface for writing the facts to the transaction.
- the semantic processor 840 contains a description of interfaces, such as name, facts (triples) and model rule.
- the N3-converter 849 allows for generation of a data model based on the content of N3-file 830. (Note that the triples can be stored in a database in any format discussed above).
- a connection to the data store is another method of forming a model.
- combined models can also be formed, in which multiple models are combined into one. A request to such a model leads to a request for the facts of each connected model, while data is still written to only one of the models.
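The behaviour of a combined model can be sketched as follows. The classes `Model` and `CombinedModel` and their `query`/`write` methods are hypothetical names used only to illustrate the two properties stated above: reads fan out to every connected model, and writes go to exactly one of them.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]


class Model:
    """Hypothetical single model holding a list of facts."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.facts: List[Triple] = []

    def query(self, predicate: str) -> List[Triple]:
        return [f for f in self.facts if f[1] == predicate]

    def write(self, fact: Triple) -> None:
        self.facts.append(fact)


class CombinedModel:
    """Reads from all connected models, writes to one designated model."""

    def __init__(self, models: List[Model], writable: Model) -> None:
        if writable not in models:
            raise ValueError("the writable model must be one of the combined models")
        self._models = models
        self._writable = writable

    def query(self, predicate: str) -> List[Triple]:
        # A request to the combined model becomes a request to each model.
        return [f for model in self._models for f in model.query(predicate)]

    def write(self, fact: Triple) -> None:
        self._writable.write(fact)


bugs_model = Model("bugs")
tasks_model = Model("tasks")
bugs_model.write(("bug1", "status", "open"))
tasks_model.write(("task1", "status", "closed"))

combined = CombinedModel([bugs_model, tasks_model], writable=tasks_model)
combined.write(("task2", "status", "open"))
statuses = combined.query("status")
```

After the write, `bugs_model` is unchanged, while `statuses` contains facts drawn from both models.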
- a business rules handler 844 is an optional add-on over the data model. After handler 844 is connected to the model, it allows derived facts to be computed based on the existing facts and rules in the model.
- the Data Model Interface 846 is a set of interfaces for requesting facts from the model, for writing to the model, and for creating transactions and model snapshots.
- the Ontologies Serializer 848 creates the queries to retrieve objects from the entire model based on the ontologies (description of the structure of objects stored in the model).
- Transactions and queries are isolated from one another. After a transaction is opened for writing or reading, it is completely isolated from other transactions: changes made to the data model by other transactions are not visible to it. Conflict detection and resolution are performed when a transaction that was opened for writing is closed. The so-called optimistic concurrency model is used. Conflicts are detected at the level of individual semantic facts. A conflict occurs when a fact has been modified by two transactions between the creation of the model snapshot and the closing of the transaction. An exception is generated when a conflict is detected. In this case, the user can update the saved changes and try to commit them again.
- OCC Optimistic concurrency control
- OCC is generally used in environments with low data contention.
- When conflicts are rare, transactions can be completed without the expense of managing locks and without having transactions wait for other transactions' locks to clear, leading to higher throughput than other concurrency control methods.
- However, if conflicts happen often, the cost of repeatedly restarting transactions hurts performance significantly, and other concurrency control methods perform better under these conditions.
- OCC transactions involve these phases:
- Begin: Record a timestamp marking the transaction's beginning.
- Validate: Check whether other transactions have modified data that this transaction has used (read or written). This includes transactions that had been completed after this transaction's start time, and optionally, transactions that are still active at validation time.
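The Begin and Validate phases can be illustrated with a small Python sketch. It is not the patented implementation: `VersionedStore`, `OccTransaction`, and `ConflictError` are hypothetical names, a logical counter stands in for timestamps, and the buffering of writes and the final commit step are assumptions added so that the example runs end to end.

```python
import itertools
from typing import Any, Dict, Set


class ConflictError(Exception):
    """Raised at validation when another transaction changed data we used."""


class VersionedStore:
    def __init__(self) -> None:
        self._clock = itertools.count(1)       # logical timestamps
        self._values: Dict[str, Any] = {}
        self._modified_at: Dict[str, int] = {}  # key -> time of last commit

    def begin(self) -> "OccTransaction":
        # Begin: record a timestamp marking the transaction's beginning.
        return OccTransaction(self, next(self._clock))


class OccTransaction:
    def __init__(self, store: VersionedStore, start_time: int) -> None:
        self._store = store
        self.start_time = start_time
        self._reads: Set[str] = set()
        self._writes: Dict[str, Any] = {}

    def read(self, key: str) -> Any:
        self._reads.add(key)
        if key in self._writes:
            return self._writes[key]
        return self._store._values.get(key)

    def write(self, key: str, value: Any) -> None:
        self._writes[key] = value  # buffered until commit; no locks taken

    def commit(self) -> None:
        store = self._store
        # Validate: has any key we read or wrote been committed by
        # another transaction after our start time?
        for key in self._reads | set(self._writes):
            if store._modified_at.get(key, 0) > self.start_time:
                raise ConflictError(key)
        commit_time = next(store._clock)
        for key, value in self._writes.items():
            store._values[key] = value
            store._modified_at[key] = commit_time


store = VersionedStore()
t1 = store.begin()
t2 = store.begin()

t1.write("task1", "open")
t1.commit()

t2.write("task1", "closed")
try:
    t2.commit()
    conflict_detected = False
except ConflictError:
    conflict_detected = True

# Retry in a fresh transaction that starts after t1's commit.
t3 = store.begin()
t3.write("task1", "closed")
t3.commit()
final_value = store.begin().read("task1")
```

The second transaction fails validation because `task1` was committed after it began; the retry succeeds because it starts later.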
- an exemplary system for implementing the invention includes a general purpose computing device in the form of a computer 20 or a server, including a processing unit 21, a system memory 22, and a system bus 23 that couples various system components including the system memory to the processing unit 21.
- the system bus 23 may be any of several types of bus structures including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of bus architectures.
- the system memory includes read-only memory (ROM) 24 and random access memory (RAM) 25.
- the computer 20 may further include a hard disk drive 27 for reading from and writing to a hard disk, not shown, a magnetic disk drive 28 for reading from or writing to a removable magnetic disk 29, and an optical disk drive 30 for reading from or writing to a removable optical disk 31 such as a CD-ROM, DVD-ROM or other optical media.
- the hard disk drive 27, magnetic disk drive 28, and optical disk drive 30 are connected to the system bus 23 by a hard disk drive interface 32, a magnetic disk drive interface 33, and an optical drive interface 34, respectively.
- the drives and their associated computer-readable media provide non-volatile storage of computer readable instructions, data structures, program modules and other data for the computer 20.
- Although the exemplary environment described herein employs a hard disk, a removable magnetic disk 29 and a removable optical disk 31, it should be appreciated by those skilled in the art that other types of computer readable media that can store data that is accessible by a computer, such as magnetic cassettes, flash memory cards, digital video disks, Bernoulli cartridges, random access memories (RAMs), read-only memories (ROMs) and the like may also be used in the exemplary operating environment.
- a number of program modules may be stored on the hard disk, magnetic disk 29, optical disk 31, ROM 24 or RAM 25, including an operating system 35.
- the computer 20 includes a file system 36 associated with or included within the operating system 35, one or more application programs 37, 37', other program modules 38 and program data 39.
- a user may enter commands and information into the computer 20 through input devices such as a keyboard 40 and pointing device 42.
- Other input devices may include a microphone, joystick, game pad, satellite dish, scanner or the like.
- These and other input devices are often connected to the processing unit 21 through a serial port interface 46 that is coupled to the system bus, but may be connected by other interfaces, such as a parallel port, game port or universal serial bus (USB).
- a monitor 47 or other type of display device is also connected to the system bus 23 via an interface, such as a video adapter 48.
- personal computers typically include other peripheral output devices (not shown), such as speakers and printers.
- the computer 20 may operate in a networked environment using logical connections to one or more remote computers 49.
- the remote computer (or computers) 49 may be another computer, a server, a router, a network PC, a peer device or other common network node, and typically includes many or all of the elements described above relative to the computer 20, although only a memory storage device 50 has been illustrated.
- the logical connections include a local area network (LAN) 51 and a wide area network (WAN) 52.
- When used in a LAN networking environment, the computer 20 is connected to the local network 51 through a network interface or adapter 53. When used in a WAN networking environment, the computer 20 typically includes a modem 54 or other means for establishing communications over the wide area network 52, such as the Internet.
- the modem 54, which may be internal or external, is connected to the system bus 23 via the serial port interface 46.
- program modules depicted relative to the computer 20, or portions thereof, may be stored in the remote memory storage device. It will be appreciated that the network connections shown are exemplary and other means of establishing a communications link between the computers may be used.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
RU2015112157A RU2707708C2 (ru) | 2015-04-03 | 2015-04-03 | System and method for data search in a graph database |
PCT/RU2015/000237 WO2016159819A1 (en) | 2015-04-03 | 2015-04-10 | System and method for data search in a graph database |
Publications (1)
Publication Number | Publication Date |
---|---|
EP3095052A1 (en) | 2016-11-23 |
Family
ID=54366492
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP15771496.5A Withdrawn EP3095052A1 (en) | 2015-04-03 | 2015-04-10 | System and method for data search in a graph database |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP3095052A1 (ru) |
RU (1) | RU2707708C2 (ru) |
WO (1) | WO2016159819A1 (ru) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11783131B2 (en) | 2020-09-10 | 2023-10-10 | International Business Machines Corporation | Knowledge graph fusion |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
RU2670781C9 (ru) * | 2017-03-23 | 2018-11-23 | Илья Николаевич Логинов | System and method for storing and processing data |
US10592557B2 (en) | 2017-03-31 | 2020-03-17 | Microsoft Technology Licensing, Llc | Phantom results in graph queries |
US20210357791A1 (en) * | 2018-08-31 | 2021-11-18 | Ilya Nikolaevich LOGINOV | System and method for storing and processing data |
WO2023121504A1 (ru) * | 2021-12-24 | 2023-06-29 | Общество С Ограниченной Ответственностью "Кейс Студио" | System and method for managing notifications |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7890518B2 (en) * | 2007-03-29 | 2011-02-15 | Franz Inc. | Method for creating a scalable graph database |
US8244772B2 (en) * | 2007-03-29 | 2012-08-14 | Franz, Inc. | Method for creating a scalable graph database using coordinate data elements |
US8606807B2 (en) * | 2008-02-28 | 2013-12-10 | Red Hat, Inc. | Integration of triple tags into a tagging tool and text browsing |
US8478766B1 (en) * | 2011-02-02 | 2013-07-02 | Comindware Ltd. | Unified data architecture for business process management |
RU2490702C1 (ru) * | 2012-05-02 | 2013-08-20 | Корпорация "САМСУНГ ЭЛЕКТРОНИКС Ко., Лтд." | Method for accelerating the processing of multiple SELECT-type queries to an RDF database using a graphics processor |
-
2015
- 2015-04-03 RU RU2015112157A patent/RU2707708C2/ru not_active IP Right Cessation
- 2015-04-10 WO PCT/RU2015/000237 patent/WO2016159819A1/en active Application Filing
- 2015-04-10 EP EP15771496.5A patent/EP3095052A1/en not_active Withdrawn
Non-Patent Citations (2)
Title |
---|
None * |
See also references of WO2016159819A1 * |
Also Published As
Publication number | Publication date |
---|---|
WO2016159819A1 (en) | 2016-10-06 |
RU2015112157A3 (ru) | 2018-09-11 |
RU2707708C2 (ru) | 2019-11-28 |
RU2015112157A (ru) | 2016-10-20 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
 | 17P | Request for examination filed | Effective date: 20151007 |
 | AK | Designated contracting states | Kind code of ref document: A1; Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
 | AX | Request for extension of the european patent | Extension state: BA ME |
 | RIN1 | Information on inventor provided before grant (corrected) | Inventor name: TSYPLIAEV, MAXIM VIKTOROVICH; Inventor name: VOLYNSKY, PETER EVGENIEVICH |
 | 17Q | First examination report despatched | Effective date: 20180314 |
 | DAV | Request for validation of the european patent (deleted) | |
 | DAX | Request for extension of the european patent (deleted) | |
 | STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN |
 | 18D | Application deemed to be withdrawn | Effective date: 20190727 |