EP2193457A1 - Detecting correlations between data representing information
- Publication number
- EP2193457A1 (EP07802083A)
- European Patent Office
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
- G06F16/907—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
Detection of connections between data representing information
The present invention relates to a method for detecting at least one connection between at least one datum, representing at least one item of information, of at least one data stock and at least one datum, representing at least one item of information, of a request for a connection (connection request) to the at least one datum of at least one data stock, by means of a data processing system having data representing information in at least one data stock accessible via at least one data source. The at least one connection is itself detected dynamically, as a datum representing at least one item of information in a data stock, in the form of a link between at least one datum of at least one data stock and at least one further datum of at least one data stock and/or at least one electronic use of at least one datum of at least one data stock, and is reproduced by an optical and/or acoustic display of the data processing system. The link is generated by at least one syntactic comparison and/or at least one semantic comparison of the at least one datum of a data stock with the at least one datum of the connection request.
Further, the present invention relates to a data processing system having data representing information in at least one data stock accessible via at least one data source, which is designed and/or set up to carry out a method according to the invention at least in part. The present invention also relates to a data processing device for the electronic processing of data, having at least a control and/or computing unit, an input unit and an output unit, which is designed and/or set up to carry out a method according to the invention at least in part, preferably using at least part of a data processing system according to the invention.
Methods, systems and apparatus for electronic processing of data are known in the art in numerous embodiments, in particular from WO 2005/050471 A2, the disclosures of which are hereby explicitly referenced.
Methods, data processing systems and/or data processing devices of the type mentioned are used in the context of web applications or routines, for example by operating systems and/or by so-called search engines, as well as in the organization, provision and/or delivery of information.
Usually, content is processed by machine as data, representing information, of a data stock, in order to be made available to users and/or to serve them as a technical aid for solving tasks. Data stocks for the purposes of the present invention are simple, universally usable, persistent data objects, in particular files and/or documents in operating systems or databases, which contain structure, content and, where necessary, management information.
In data processing systems and/or data processing devices, the data stocks are usually accessible to a data processing system and/or a data processing device via at least one data source, usually a medium that is present in the data processing system or can be connected to it via a communication network, such as a hard disk or a similar data recording medium.
Operating systems use a hierarchy of files, for example in their so-called file system. The files are filed as data stocks in a tree structure of directories. Navigation to a file generally follows an Aristotelian logic along the names of the individual folders. The navigation can comprise multiple steps and also entails the problem of unique assignment. In addition, in operating systems the management of files is separated from the management of data that are accessible only to application programs accessing databases or file-based data structures such as XML. Usually, the separation follows the technical implementation of the persistence of the respective data.
In database-driven applications, relational databases are still used, which manage static links between tables filled with data. Because of the static links in the tables managed by search engines, changes in the data stocks are not detected, or are detected only to a limited extent and with a delay. The evaluation or use of the data must be defined in advance.
Search engines normally allow users to search data stocks only for keywords or for a Boolean combination of keywords. Precise search queries, such as retrieving invoices from a certain period of time, are not possible.
Data processing systems also generally operate with a static, that is, permanently prescribed, hierarchical menu structure that offers the user a selection of possible functions for operating the data processing system. Operating systems such as Apple's Mac OS X additionally use so-called context menus. These too are hierarchically structured menus, but they are at least partially supplemented, depending on the installed application programs, with corresponding menu items for starting or calling individual application programs and/or functionalities. Apart from this extensibility, the menu structures remain statically and hierarchically structured in use. The static hierarchical menu structures found in data processing systems to date can therefore meet a user's needs or usage preferences only to a limited extent. Taking the context of use into account, in particular offering in a menu those items that are meaningful in the context of the situation or of the content accessed, is not possible, especially because of the static hierarchical structuring of the menus.
The synchronization of data representing information in data stocks between data processing devices that share those data stocks is an important part of data processing systems, especially given the increasing popularity of mobile data processing devices such as so-called PDAs (PDA: Personal Digital Assistant), and is integrated into data processing systems, for example in the form of so-called PIM systems (PIM: Personal Information Management). The synchronization functionality has so far been limited to a purely manual selection of the information to be synchronized. A user can only select, by manual input, whether or which part of the contact addresses managed on the data processing system is to be synchronized. A more detailed specification, in the sense of limiting the information to be synchronized by content, is not possible, and in particular not as an automated process that adapts to the user's needs.
In view of this prior art, the object of the present invention is to improve the detection of connections between data representing information while avoiding the disadvantages described, in particular with regard to the manner and scope of handling the detection of connections and the use of the detected connections, in particular taking into account the user's individual uses of the connections.
As a technical solution, the present invention proposes a method for detecting at least one connection between at least one datum, representing at least one item of information, of at least one data stock and at least one datum, representing at least one item of information, of a request for a connection (connection request) to the at least one datum of at least one data stock, by means of a data processing system having data representing information in at least one data stock accessible via at least one data source. The at least one connection is itself detected dynamically, as a datum representing at least one item of information in a data stock, in the form of a link between at least one datum of at least one data stock and at least one further datum of at least one data stock and/or at least one electronic use of at least one datum of at least one data stock, and is reproduced by an optical and/or acoustic display of the data processing system. The link is generated by at least one syntactic comparison and/or at least one semantic comparison of the at least one datum of a data stock with the at least one datum of the connection request. Further uses of at least one detected link that are initiated by a user are logged in a list (tracker).
A particularly preferred embodiment of the invention provides that the connection request is made as a query to a search engine. A search engine within the meaning of the present invention is in particular a program for searching documents and/or records, each of which is a datum, representing at least one item of information, of a data stock within the meaning of the present invention, that are stored on or made available via a computing device (also called a computer) or a computer network, in particular the Internet or an intranet. When the connection request is used with a search engine according to the invention, a keyword index is advantageously created for the document base, which is a result of a search or connection request in the context of the present invention, so that searches for keywords, each of which is a datum representing at least one item of information of the connection request, can be answered with a list of results ordered by relevance. After one or more search terms have been entered, the search engine delivers a list of references to potentially relevant documents or records.
The requests for a connection between contents of data representing information, that is, connection requests within the meaning of the present invention, are advantageously data representing information and/or actions. Accordingly, the present invention advantageously detects, in general terms, connections between two successive contents as data, representing information, of one or two data stocks.
The invention is based on the recognition that, by logging further uses of at least one detected link that are initiated by a user, connection requests can be optimized further, in particular because the user's usage behavior is fed into the connection detection according to the invention. The invention thus realizes a self-learning solution. The detection of connections between data representing information is thereby improved as a whole, in particular with regard to the manner and scope of handling the detection of connections and the use of the detected connections, in particular taking into account the user's individual uses of the connections. According to the invention, connections thus become manageable as knowledge.
Here, knowledge is implemented by the method according to the invention in that knowledge is created by linking content. Unlike encoding using term networks in a so-called topic map in accordance with ISO/IEC 13250, as created for example by consulting firms on a trade-specific basis, or verticalized knowledge management solutions from software vendors, the invention provides a self-learning solution that adapts itself to the needs and preferences of each user. Accordingly, the functionality of the method according to the invention can be integrated quickly and easily into existing solutions, in particular data processing systems and/or data processing devices. Complex and training-intensive introductions of a solution into existing or new projects can be dispensed with.
The solution according to the invention allows a user to get easily and quickly from one content to other contents linked to it through connections. Since all contents can advantageously be linked multiple times via connections, navigation as part of a connection request can also lead back to the starting point of the request, that is, to the connection request itself. The user can thus advantageously recognize connections, in particular to and between their uses, that are relevant, for example, to their current focus of interest. In a fixed, static hierarchical order of data processing systems, as is generally the case, for example, with the tree structures used by operating systems and application programs for selecting content, this is not possible, since such an order does not allow new ordering structures to be taken into account.
Whereas users of data processing systems otherwise enter data into input forms prescribed by the data processing system, the solution according to the invention advantageously allows data to be described according to the user's current need. The description of an address or a project is thus extremely flexible and individual for the user, thanks to the dynamic data management according to the invention and a correspondingly designed graphical user interface, especially since the description of an address or a project is not rigidly prescribed, as is otherwise usual.
In a further advantageous embodiment of the invention, requests for a connection (connection requests) and/or actual uses of results of connection requests are recorded as part of the logging of uses. Advantageously, connection requests and/or actual uses of their results are logged together with their time of use.
A further advantageous embodiment of the invention provides that, for a further connection request, the list (tracker) is used to check whether the further request matches a connection request already recorded in the list (tracker); if there is no match, the further connection request is detected, and if there is a match, the already detected connection is reproduced.
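The tracker check described above can be pictured as a lookup before detection. The following is only a minimal sketch under stated assumptions: the class name `Tracker`, the `resolve` method, the whitespace and case normalisation used to decide whether two requests "match", and the `detect` callback are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Tracker:
    """Hypothetical list (tracker) of logged connection requests and their results."""

    log: Dict[str, List[str]] = field(default_factory=dict)

    def resolve(self, request: str, detect: Callable[[str], List[str]]) -> List[str]:
        # Assumption: two requests match if they are equal after lower-casing
        # and collapsing whitespace. The patent does not define the match test.
        key = " ".join(request.lower().split())
        if key not in self.log:
            # No match in the tracker: detect the connection and log it.
            self.log[key] = detect(request)
        # Match (or freshly logged): reproduce the already detected connection.
        return self.log[key]
```

With this shape, a repeated request is answered from the log and the detection step runs only once per distinct request.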
Advantageously, uses of the results of a connection request that are logged in the list (tracker) are checked for at least one connection to a preceding connection request. Where the connection request is used according to the invention as a query to a search engine, it is examined here how long a search result was viewed. A preferred embodiment of the invention is characterized in that uses of results of a connection request that are logged in the list (tracker) are checked in pairs. In this check, the time interval between the times of use of the logged uses is determined, and the use logged at the earlier time is either ignored, depending on a preferably predefined maximum time interval between the times of use of the two logged uses, when the time interval between the times of use is smaller than and/or equal to the maximum time interval, or it is checked whether the result used already exists in a relation to another preceding connection request. Advantageously, the result used in a relation to another preceding connection request is weighted accordingly, either jointly with the detected connection or by being re-weighted.
The process control according to the invention (search optimization) can be carried out individually and/or in combination with the aforementioned embodiments of an optimized search. A system implementing or using a method according to the invention advantageously learns from a user's usage behavior to assess or evaluate the relevance of the results of a connection request (search).
In a particularly preferred embodiment of the invention, to further improve connection detection, particularly where a connection request is used as a query to a search engine, the number of links pointing to the at least one datum, representing at least one item of information, of at least one data stock is advantageously used to determine a so-called ranking for the detected connections, preferably by calculation. The invention advantageously makes use of the insight that information, such as a scientific publication, gains in importance and value when it is frequently cited. The value of a rank or ranking lies between 0 and 1, that is, between 0% and 100%. Three values are advantageously used to calculate the ranking value: a value calculated by Lucene, an open-source Java library for creating and searching text indexes; a value calculated for a search query resulting from the present connection detection; and a value calculated for the individual search result entries from the number of links. As part of the method according to the invention, it is then determined for each entry how many links exist for that entry. Each result is evaluated using the query of the connections. The result is then output. According to the invention, this procedure enables a faster and more accurate calculation of the rankings for a request.
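The text names three inputs to the ranking value but does not say how they are combined. The sketch below is therefore an illustration only: it assumes that the text-index value and the connection-detection value have already been scaled into the range 0 to 1, that the link count is normalised against a hypothetical maximum link count, and that the three values are averaged with equal weight. The function names `link_score` and `rank` are invented for this example.

```python
def link_score(link_count: int, max_link_count: int) -> float:
    """Map a link count to [0, 1]; normalising by a maximum is an assumption."""
    if max_link_count <= 0:
        return 0.0
    return min(max(link_count, 0), max_link_count) / max_link_count


def rank(text_score: float, context_score: float,
         link_count: int, max_link_count: int) -> float:
    """Combine three values into a ranking between 0 and 1.

    text_score:    value from the text index (assumed pre-scaled to [0, 1])
    context_score: value from connection detection for the query (assumed [0, 1])
    link_count:    number of links pointing to the result entry
    Equal-weight averaging is an assumption, not the patent's formula.
    """
    parts = (text_score, context_score, link_score(link_count, max_link_count))
    for part in parts:
        if not 0.0 <= part <= 1.0:
            raise ValueError("each component must lie in [0, 1]")
    return sum(parts) / len(parts)
```

Because every component lies between 0 and 1, the average does too, which matches the stated range of 0% to 100% for a ranking value.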
In a further advantageous embodiment of the invention, at least one word of a list (word list), created from a full text, of the words contained in that full text is used as the at least one datum, representing at least one item of information, of the request for a connection (connection request).
An advantageous embodiment of the invention provides that the words in the list (word list) are sorted in ascending order of the frequency of their occurrence in the full text, and that the sorted words are used, taking that frequency into account, as the at least one datum of the connection request.
In a further advantageous embodiment of the invention, a predetermined number of the words sorted by frequency of occurrence in the full text is recorded in a list (word list), and the words of that list are used as the at least one datum of the connection request. Advantageously, the list (sorted word list) contains the words with the lowest frequency of occurrence in the full text. In another embodiment of the invention, the words are advantageously sorted by their frequency of occurrence in the full text. The predetermined number is preferably limited to at most 32, in particular because the predetermined number determines how many attributes, preferably logically linkable with one another, can be used for a connection request and because, as has been found empirically, this allows the performance of the solution according to the invention to be optimized when used by data processing systems.
According to a further proposal of the invention, the words of the list are used at least partially, preferably completely (that is, the entire word list, preferably consisting of the 32 words of lowest frequency), in parallel as the at least one datum of the connection request. The words of the sorted word list are advantageously logically linked with one another, preferably with a Boolean OR operation, and the combination is used as the at least one datum of the connection request. As a result of the connection request, preferably a search engine then delivers a similarity value. The similarity value is advantageously a percentage. At a similarity value of 100%, all of the words from the sorted word list used as attributes for the request occur and/or are relevant in the requested connection, particularly preferably between two full texts or full-text documents. At a similarity value of 0%, none of the words from the sorted word list used as attributes for the request occurs in, or is relevant to, the requested connection, particularly preferably between two full texts or full-text documents.
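The word-list procedure above lends itself to a small worked sketch. It assumes a simple `\w+` tokenizer, alphabetical tie-breaking between equally frequent words, and a similarity value equal to the share of list words found in the candidate text; the patent only fixes the two end points (100% when all words occur, 0% when none does), so the linear scale in between is an assumption. The names `tokenize`, `rare_words`, `similarity` and `MAX_WORDS` are hypothetical.

```python
import re
from collections import Counter
from typing import List

MAX_WORDS = 32  # preferred upper bound named in the text


def tokenize(text: str) -> List[str]:
    # Assumed tokenizer: lower-cased runs of word characters.
    return re.findall(r"\w+", text.lower())


def rare_words(full_text: str, limit: int = MAX_WORDS) -> List[str]:
    """Return up to `limit` words, sorted ascending by frequency in the full text."""
    counts = Counter(tokenize(full_text))
    # Ties are broken alphabetically so the result is deterministic (assumption).
    ordered = sorted(counts, key=lambda word: (counts[word], word))
    return ordered[:limit]


def similarity(source_text: str, candidate_text: str, limit: int = MAX_WORDS) -> float:
    """Percentage of the source's rare words that also occur in the candidate.

    The rare words act as OR-combined attributes: any hit contributes,
    and the value is the share of attributes that were found.
    """
    attributes = rare_words(source_text, limit)
    if not attributes:
        return 0.0
    candidate_words = set(tokenize(candidate_text))
    hits = sum(1 for word in attributes if word in candidate_words)
    return 100.0 * hits / len(attributes)
```

Choosing the least frequent words keeps the request short while favouring terms that are distinctive for the full text, which is one plausible reading of why the list is capped at a small fixed size.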
An alternative and/or supplementary embodiment of the invention advantageously provides that the words of the list are used one after another (sequentially) as the at least one datum, representing at least one item of information, of the request for a connection (connection request).
Advantageously, the links are detected as, and preferably together with, the dynamically detected connections. Unlike in a relational database, in which links between tables filled with data are managed, in the solution according to the invention the links of the detected connections are created dynamically, preferably in an n-to-n relation, and the connections are advantageously determined and maintained, that is, in particular kept up to date, autonomously by the method. According to the invention, all data stocks present in the system and/or connected or connectable to it are advantageously indexed. From the user's perspective, this removes the separation of database and file system that otherwise exists, for example in search engines. The solution according to the invention thus enables a search for content independently of the technical realization of the persistence of the data. The invention provides a quasi-combination of indexes and databases that allows various ways of searching for connections, in particular an integration of precise search queries and full-text searches. Queries are analyzed and converted into an internal query of the data sources.
In addition, the solution according to the invention advantageously enables the integration of data sources without having to replicate data or use resources for redundancies. For this purpose, a data processing system according to the invention advantageously has a structure that allows data sources to be integrated in this way (plug-in structure of the data processing system). Data migration for the integration of data sources, which otherwise entails high development costs and requires large system resources, can thus advantageously be dispensed with.
According to another proposal of the invention, the solution also takes into account local data stocks, which, for example, different users can share with one another over a computer network, for example different employees in a corporate network, in particular a client-server network. Contents and their connections are then available in a corporate network for applications such as market research and/or competitive analysis.
In a further advantageous embodiment of the invention, the link is generated by a syntactic comparison of the at least one datum, representing at least one item of information, of the request for a connection (connection request) with the at least one datum of a data stock. The special feature of the technology according to the invention is a syntactic comparison on the basis of rules (keys) that can each be stored in insertable modules, so-called plug-ins. This has two advantages. First, generally valid, user-specified relationships (for example, business rules) can be defined and applied to the data stock, which at the same time permits a flexible data structure. Second, the rules can be used to establish relationships between information and actions (dynamic functions). Here, the present invention uses a solution in which information about actions by a user or an automated process is communicated to all system components (plug-ins) managing data sources, so that these can, where appropriate, themselves trigger actions or make changes to the data stock. Advantageously, the syntactic analysis is used in combination with the other analyses of the application in order to obtain rapid results and an overall system that is ready for immediate use.
A further advantageous embodiment of the invention provides that the link is generated by a semantic comparison of the at least one datum, representing at least one item of information, of the request for a connection (connection request) with the at least one datum of a data stock. In the context of the semantic comparison, a semantic network is advantageously generated from which connections can be detected.
In a particularly advantageous embodiment of the invention, the various ways of generating links are applied in combination. In contrast to neural networks, the solution according to the invention uses a combination of syntactic and/or semantic analysis processes in order to achieve a high learning speed.
According to an advantageous proposal of the invention, the link is generated by manual input, preferably by a selection input.
In a further particularly advantageous embodiment of the invention, further comparisons for link detection can be integrated into the data processing system according to the invention. The user can thus, for example, integrate further analysis methods in addition to the link detection options mentioned, according to individual requirements, and extend the data processing system accordingly.
A further advantageous proposal of the invention is characterized in that at least one electronic use of at least one datum, representing at least one item of information, of at least one data stock is used as the at least one datum of the request for a connection (connection request). A particularly advantageous embodiment of the invention is characterized in that electronic uses of at least one datum of at least one data stock are detected by time and/or frequency, and this detection is used as the at least one datum of the connection request. A further advantageous embodiment of the invention is characterized by at least one detection of a link as an electronic use of at least one datum of a data stock. Another embodiment of the invention is characterized by the use of at least one reference to at least one data stock comprising data representing information. With these measures, which can advantageously be applied individually or in combination, the use of a file with an application program, for example, can be detected as a connection. The scope and handling of the solution according to the invention can thus be improved and adapted further to the user's uses. For the analysis, the user's actions, their timing and sequence, and the results of data changes are advantageously logged according to the invention. The data changes are advantageously checked with a so-called crawler.
Advantageously, the connection detection is made weighted. Thus, the dynamics of the acquisition of interrelations advantageously further can be increased or diminishable. In a preferred embodiment of the invention, the connection weighting of at least one previously recognized as an at least one information representative of date in a data connection is changed in dependence on the weighting of a detected following relationship, preferably, the connection weighting increases (increments) or decreased (decremented). In a particularly preferred embodiment of the invention, the weighting is performed not only by a percentage hit accuracy of a query, but by means of the invention from the solution by dynamic detection semantic network formed which is optimized independently and continuously by numerous parameters, in particular by continuous updating of the acquired interrelations , Details of this iterative process comprising multiple bifurcations arise particularly in connection with the bottom still following description of the flow diagrams illustrated in the figures of the embodiments, in particular in FIG fourteenth
Advantageously, a detected connection is at least reproduced, preferably by an optical and/or acoustic display device of a data processing system or data processing device according to the invention. According to a further advantageous proposal of the invention, the weighting of the detected connection is reproduced together with it.
In a further embodiment of the invention, weightings are provided with an expiration time in order to devalue connections as a function of time. Since each user will use the solution according to the invention with different intensity, it is appropriate to measure age with a counter whose smallest unit of time is an action. An action in this case is the calling of a content item, preferably including editing and/or creating a new content item, in the data processing system. In a further embodiment of the invention, the period of time during which a user has used a content item, for example viewed it, is taken into account in determining the expiration time. Advantageously, the expiration time of the weighting of a link is extended the longer and/or more frequently the user has engaged with a content item, since its significance for the user must then be correspondingly higher.
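The action-based expiration described above can be pictured with a short sketch. It is only an illustration under assumptions: the names `ActionClock`, `WeightedLink`, `touch` and `is_expired`, and the extension of five actions per use, are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass


class ActionClock:
    """Counts user actions; one action is the smallest unit of time."""

    def __init__(self) -> None:
        self.count = 0

    def tick(self) -> int:
        self.count += 1
        return self.count


@dataclass
class WeightedLink:
    weight: float
    expires_at: int  # action count at which the weighting expires


def touch(link: WeightedLink, clock: ActionClock, extension: int = 5) -> None:
    """Extend the expiration whenever the user engages with the content (assumed rule)."""
    link.expires_at = max(link.expires_at, clock.count) + extension


def is_expired(link: WeightedLink, clock: ActionClock) -> bool:
    return clock.count >= link.expires_at


clock = ActionClock()
link = WeightedLink(weight=0.3, expires_at=3)
for _ in range(3):
    clock.tick()                           # three user actions pass
expired_before = is_expired(link, clock)   # the weighting has reached its expiry
touch(link, clock)                         # renewed use extends the expiry
expired_after = is_expired(link, clock)
```

Measuring age in actions rather than wall-clock time means that a link of a rarely active user does not expire merely because calendar time has passed.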
The present invention further provides a data processing system with data representing information in at least one data set accessible from at least one data source, which is designed and/or set up to execute a method of the invention at least partially.
A further advantageous embodiment of the invention is characterized by a graphical user interface for inputting and/or reproducing connection requests, links, connections and/or connection weightings. Advantageously, the graphical user interface is further designed and/or set up for entering, modifying and/or reproducing information-representing data of at least one data set. The user interface advantageously allows action-oriented navigation. This means that, instead of offering the user functions to select in hierarchical menu structures as is otherwise usual, the invention offers for selection those possible courses of action that make sense in the context of the situation or of the accessed content. According to a particularly preferred proposal of the invention, the action-oriented navigation uses a binary framework that allows input of content on the one hand and output of content on the other. Advantageously, this action-oriented binary navigation is implemented across devices, that is, independently of the respective data processing device.
In a particularly preferred embodiment of the invention, the graphical user interface divides the display region available for reproduction on an optical display device into three areas: a first area showing the result of a selection of information-representing data, a second area displaying an item of information selected from the selection in the first area, and a third area in which at least one connection is reproduced. The information selected in the first area is preferably reproduced in the second area as a preview or full view. The display region can be provided in the form of a window on the display device. A further proposal of the invention is characterized by a horizontal or vertical division. Advantageously, the size of the areas is switchable and/or adjustable.
In a further advantageous embodiment of the invention, reproduction is at least partially in selectable form; that is, the reproduced connections are themselves formed, for example, as a menu item for an action and/or as a hyperlink, and can be used accordingly by selecting them, for example by so-called "clicking".
Advantageously, content items or documents containing content are reproduced by the data processing system as a preview. The options for action available to the user in relation to connections and content are thereby further improved. Advantageously, the preview can be reproduced during navigation or control of data sets, so that the user can quickly gain an overview of connections. Advantageously, this preview contains a summary of the content, for example a website reduced in terms of the elements or components shown, or a summarized text. Restrictions otherwise imposed by operating systems, for example previews of documents as small images (thumbnails), as the beginnings of texts, or as program or document symbols (icons), can thereby be eliminated according to the invention.
In a particularly preferred embodiment of the invention, the data processing system according to the invention is used within software running on a computer for the dynamic organization of information and/or processes.
Advantageously, the data processing system according to the invention is part of a database application, or can at least be used with a database application.
The present invention further provides a data processing device for the electronic processing of data, comprising at least a control and/or computing unit, an input unit and an output unit, which is designed and/or set up to perform a method of the invention at least partially, preferably using at least part of a data processing system of the invention.
In a further advantageous embodiment of the invention, a data processing device for the electronic processing of data is provided with a control and/or computing unit, an input unit and an output unit, and is characterized by the use of a data processing system according to the invention.
In an advantageous embodiment of the invention, the data processing device is designed as a mobile device, preferably as a mobile terminal operable in mobile radio networks. A configuration of the data processing device as a mobile phone is particularly preferred.
Advantageously, the data processing system of the invention is designed to run on a Java VM, so that it is in principle available on all mobile devices. In a particularly preferred embodiment of the invention, the data processing device supports the special ergonomics of the data processing system. The solution according to the invention comprises and implements, in a preferred embodiment, the following methods, systems and/or devices for detecting correlations between data representing information:
Based on findings from linguistics, epistemology and neurology, interrelations between contents (data representing information) are detected. The interrelations correspond to neural patterns or associations in the human brain and are detected as dynamically modifiable n-to-n links (n: natural number) that can advantageously be strengthened or weakened.
Just as the human brain condenses or processes the stimuli of its sense organs from mental presentations into mental representations, for example deriving a moving object from synchronously activated stimuli by means of the visual perception system, the solution according to the invention detects interrelations between contents (data representing information). In iterative processes, information or content can be condensed into knowledge on the basis of the dynamic interrelations thus detected. Furthermore, the interrelations detected in this way can themselves be used as new content in the form of data representing information, for example offered to the user for selection as options for action, or incorporated by automated processes into process control or the like. Advantageously, the solution of the invention is pragmatic and self-organizing, so that no configuration by the user is required. The solution nevertheless remains open to control by the user.
For detecting connections, various methods for creating, modifying and dissolving dynamic n-to-n links are advantageously used in combination with one another, in particular links by manual input, links by syntactic comparison and/or links by semantic comparison. Mathematical and statistical analysis methods can also be used to detect connections. Advantageously, the connections detected through links are given a weighting. The weighting advantageously uses values, preferably between 0 and 1, corresponding to no connection (value 0) or a direct connection (value 1). Links entered manually by the user can be given the maximum weighting value, here 1; that is, the weighting of the connection receives the highest, fixed value. The ergonomics of the action-oriented logic described in more detail below are used here in particular. Alternatively, a smaller value can be set in order to incorporate the link into the semantic network.
For links by syntactic comparison, the contents (data representing information) of data sets (files and/or databases), or defined parts of those contents, are searched for matching words, word components or character strings that correspond to the content (information-representing data) of the connection request. This rapid and valid linking provides a kind of framework of relationships and accelerates the independent detection of connections, the so-called "self-learning" of the solution according to the invention.
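A syntactic comparison of this kind can be sketched as a plain word match between a connection request and the stored contents. This is a minimal illustration, not the method of the specification: the function names, the lower-casing and the whole-word tokenization are assumptions made for the example.

```python
import re
from typing import Dict, List


def tokens(text: str) -> set:
    """Split text into lower-case words (assumed tokenization)."""
    return set(re.findall(r"\w+", text.lower()))


def syntactic_links(request: str, contents: Dict[str, str]) -> Dict[str, List[str]]:
    """Return, per content id, the words it shares with the connection request."""
    wanted = tokens(request)
    links = {}
    for content_id, text in contents.items():
        shared = wanted & tokens(text)
        if shared:
            links[content_id] = sorted(shared)
    return links


contents = {
    "mail-1": "Invoice for Miller GmbH",
    "doc-2": "Project plan for the spring release",
}
links = syntactic_links("miller invoice", contents)
```

Here only `mail-1` shares words with the request, so it is the only content that receives a link. Matching on word components or arbitrary strings, which the text also mentions, would need substring search instead of whole-word sets.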
For links by semantic comparison, dynamic connections are advantageously detected on the basis of the sequence of contents (data representing information) of data sets (files and/or databases) and/or of the uses of contents of data sets. The solution according to the invention advantageously draws on findings from implicit semantics. The semantics are called implicit because they are not modeled explicitly by a network of terms (topic map) or by semantic rules of linguistics, but follow from the insight that connections between contents, like meanings in a language, arise not by definition but dynamically through use. Meanings in a language arise because the language is used, and connections between contents arise, according to the invention, because the contents are used. Semantics is thus not abstracted from language practice or from the use of the contents.
In the technical implementation of links by semantic comparison, each dynamic link is given a system-internal value between 0 and 1, where 0 indicates that no connection exists and 1 indicates a direct connection produced, for example, manually or by syntactic comparison. The solution logs all of the user's actions with the data processing system, that is, the sequence of all content called or used, such as edited content. Content is understood, as already explained, as information-representing data of any data set, such as a file and/or database, which may come in different data formats and from different data sources.
If two contents follow one another, a latent link is created. If this sequence occurs several times, the link is strengthened. Each link is advantageously also given a decay time in predetermined time units: the value of a link is attenuated in each time unit and, over many time units, ultimately tends toward 0.
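The interplay of strengthening and decay can be illustrated with a minimal sketch. The step size, the decay factor, the cut-off below which a link is dropped and the choice of undirected links are all assumptions for the example; the specification gives no concrete values.

```python
from typing import Dict, FrozenSet, List

Weights = Dict[FrozenSet[str], float]


def strengthen(weights: Weights, log: List[str], step: float = 0.1) -> None:
    """Create or strengthen a link for every pair of consecutive content calls."""
    for first, second in zip(log, log[1:]):
        if first != second:
            pair = frozenset((first, second))  # assumed: links are undirected
            weights[pair] = min(1.0, weights.get(pair, 0.0) + step)


def decay(weights: Weights, factor: float = 0.5, floor: float = 1e-3) -> None:
    """Attenuate every link by one time unit; drop links that have faded out."""
    for pair in list(weights):
        weights[pair] *= factor
        if weights[pair] < floor:
            del weights[pair]


weights: Weights = {}
strengthen(weights, ["a", "b", "a", "b"])  # the pair a/b occurs three times
after_use = weights[frozenset(("a", "b"))]
decay(weights)                             # one time unit passes
after_decay = weights[frozenset(("a", "b"))]
```

Repeated calls to `decay` without renewed use drive the value toward 0, at which point the sketch removes the link entirely.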
Beyond the direct sequence of two contents, the solution of the invention advantageously also detects groupings of content calls that form a pattern. A pattern is understood as a sequence of content calls that recur irrespective of their order. Patterns can differ in size, that is, in the number of contents per sequence, and advantageously form a so-called cluster, which at a semantic meta level amounts to a topic. Such a topic, named for example as a connection request, could in turn be detected by means of a syntactic comparison.
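One simple way to picture order-independent patterns is to slide a window over the log of content calls and count how often the same set of contents recurs. The window approach, the function name `recurring_groups` and the thresholds are assumptions chosen for illustration, not the procedure of the specification.

```python
from collections import Counter
from typing import FrozenSet, List, Set


def recurring_groups(log: List[str], size: int = 3, min_count: int = 2) -> Set[FrozenSet[str]]:
    """Find groups of `size` distinct contents that recur, regardless of order."""
    windows = (frozenset(log[i:i + size]) for i in range(len(log) - size + 1))
    counts = Counter(w for w in windows if len(w) == size)
    return {group for group, count in counts.items() if count >= min_count}


log = ["a", "b", "c", "x", "c", "a", "b"]
groups = recurring_groups(log)
```

In this log the contents a, b and c are called together twice, once as a-b-c and once as c-a-b, so they form one recurring group that could be treated as a cluster.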
Furthermore, the semantic comparison enables the solution according to the invention to determine the relevance of the detected connections for the respective user. The solution thus avoids an excessive flow of information and can answer the user's connection requests more precisely and in a more focused way. Advantageously, relevance can further be used for self-organizing maintenance of the data processing system, for example to remove unused, old and/or irrelevant data that would otherwise burden the data processing system.
Links by semantic comparison can, with appropriate design, advantageously be used to analyze content and/or uses of content according to a user's individual interests, hereinafter referred to as interest analysis or action analysis. An interest analysis examines which content categories the user preferably requires in a given connection. In this way, the solution learns user-specific rules for analyzing content. If, for example, the user needs invoices and orders when calling up an address record, these connections are preferably displayed by the data processing system. By content categories the solution means contents that are formally similar, for example e-mails, addresses, invoices, contracts, project plans and/or schedules. As with the semantic comparison, the results of the interest analysis are represented by a dynamic network whose relations can be strengthened or weakened.
Through this interest analysis, the user's interest in contents, uses of contents and/or aspects of a topic can be narrowed to a tighter focus. A topic is understood, as already explained, as a group of contents that has been identified and summarized in a semantic comparison or that results from a requested and detected connection. The connection request can advantageously be initiated in a variety of ways, for example by dynamically recorded uses of the data processing system by the user, or by a connection request entered manually in a search field of a graphical user interface for controlling the data processing system or, with integrated speech recognition and natural language interpretation, posed orally.
An action analysis examines which uses of content the user preferably requires in a given connection. It is thus not the contents but the actions associated with them that are captured and networked by the solution into courses of action, depending on variables such as content, content type or topic. For each content item called or each connection request, the solution can dynamically offer options for action that make sense for the user in that case or are particularly likely given the user's usual behavior. It makes sense, for example, that an invoice that has been created is recorded in a corresponding data processing system or similar application, or that an e-mail can be answered. To analyze content and/or its uses for patterns, a so-called Pattern Analyzer is advantageously used: a process that examines unstructured content for patterns that it recognizes as independent content or uses to summarize content.
The Pattern Analyzer can, for example, recognize an address or an image in a text and make this information available as standalone content. Internet pages can be used in this way for automated address research. The Pattern Analyzer advantageously draws on the concepts of the semantic analysis described above. Advantageously, selected content such as e-mails can be evaluated automatically and provided as structured information.
As already explained above, following the semantic analysis the Analyzer checks whether there are connections between the links and whether a meta-link can be established.
To create meta contents, that is, to identify topics, the Semantic Analyzer first checks how many links a content item has. If a critical number, which is defined dynamically, is reached and no meta content yet exists whose links contain the links of the examined content as a subset, a new meta content is created. For this purpose, the individual contents are combined and summarized. This can advantageously be done with the Pattern Analyzer or with other analysis methods.
If a topic (meta content) exists and the critical number of links is not reached, the meta content is deleted. Meta contents can advantageously be managed via an appropriate plug-in.
The solution according to the invention is in principle able to include an unlimited number of data sets in the form of files and/or databases of different formats and from different data sources in the data processing system. The data sources are advantageously neither imported nor modified, so that they remain functional and usable in their original system context. This protects existing investment and accelerates and supports the introduction of the solution. To this end, the data processing system advantageously has a central data management facility, referred to as the "repository", which contains references to the various data sets and data sources. Not the entire content is stored, only references to the corresponding data. This avoids duplication of data.
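The idea of a repository that holds references rather than copies can be sketched as follows. The class and method names (`ContentRef`, `Repository`, `register_source`, `resolve`) are hypothetical, and a dictionary stands in for an external data source.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class ContentRef:
    source: str      # name of the data source that holds the content
    content_id: str  # identifier of the content inside that source


class Repository:
    """Stores references only; the content itself stays in its source."""

    def __init__(self) -> None:
        self._refs: Dict[str, ContentRef] = {}
        self._loaders: Dict[str, Callable[[str], str]] = {}

    def register_source(self, source: str, loader: Callable[[str], str]) -> None:
        self._loaders[source] = loader

    def add(self, key: str, ref: ContentRef) -> None:
        self._refs[key] = ref

    def resolve(self, key: str) -> str:
        ref = self._refs[key]
        return self._loaders[ref.source](ref.content_id)


mailbox = {"42": "Hello"}                      # stand-in for an external source
repo = Repository()
repo.register_source("mail", mailbox.__getitem__)
repo.add("m42", ContentRef(source="mail", content_id="42"))
first_read = repo.resolve("m42")
mailbox["42"] = "Hello again"                  # the source changes in place
second_read = repo.resolve("m42")
```

Because the repository never copies the text, a change in the source is visible on the next read without any synchronization of content.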
As explained above, the solution according to the invention can integrate the contents of information-representing data from different data sets, both databases and files in different formats and from different data sources, for example via the Internet or a local company network, identify their structure, and determine the interrelations of the contents or of parts of the contents, for example the sender of an e-mail. The individual content components are advantageously related to the corresponding components of other contents. This measure accelerates the solution's searching of the contents of data sets, in particular text documents, for words, word components and/or character strings.
This process of searching the contents of data sets for words, word components and/or character strings, hereinafter called the "crawler", advantageously runs in the background of the application on a data processing system according to the invention and periodically checks for new and changed content, which is preferably stored in temporary files for analysis. Once the analysis is complete, the temporary data is deleted.
Further details, features and advantages of the invention will be explained below with reference to the description of the embodiments illustrated in the figures of the drawing. They show:
Fig. 1 is a block diagram showing the principal components of a data processing system according to the invention;
Fig. 2 shows a basic embodiment of communication between selected components of the data processing system of Fig. 1;
Fig. 3 is a block diagram of a further basic embodiment of communication between selected components of the data processing system of Fig. 1;
Fig. 4 shows a further schematic embodiment of communication between selected components of the data processing system of Fig. 1;
Fig. 5 is a block diagram of a basic embodiment of a program implementation of communication between selected components of the data processing system of Fig. 1;
Fig. 6 is a block diagram of a further schematic embodiment of a program implementation of communication between selected components of the data processing system of Fig. 1;
Fig. 7 shows basic details of the communication of Fig. 6;
Fig. 8 is a block diagram of a further schematic embodiment of a program implementation of communication between selected components of the data processing system of Fig. 1;
Fig. 9 is a block diagram of a further schematic embodiment of a program implementation of communication between selected components of the data processing system of Fig. 1;
Fig. 10 is a block diagram of a further schematic embodiment of a program implementation of communication between selected components of the data processing system of Fig. 1;
Fig. 11 is a flow chart of an embodiment of a detection according to the invention;
Fig. 12 is a flow chart of a further embodiment of a detection according to the invention;
Fig. 13 is a flow chart of a further embodiment of a detection according to the invention;
Fig. 14 is a flow chart of a further embodiment of a detection according to the invention;
Fig. 15 is a flow chart of a further embodiment of a detection according to the invention;
Fig. 16 is a flow chart of a further embodiment of a detection according to the invention;
Fig. 17 is a schematic representation of a basic embodiment of a graphical user interface according to the invention for using a data processing system according to the invention;
Fig. 18 is a schematic representation of a further basic embodiment of a graphical user interface according to the invention for using a data processing system according to the invention;
Fig. 19 is a flow chart of an embodiment of a detection of a user's usage behavior as part of a detection according to the invention;
Fig. 20 is a flow chart of a further embodiment of a connection detection according to the invention;
Fig. 21 is a flow chart of a further embodiment of a connection detection according to the invention;
Fig. 22 is a flow chart of an embodiment of a user-individualized connection detection according to the invention;
Fig. 23 is a flow chart of a further embodiment of the user-individualized connection detection of Fig. 22;
Fig. 24 is a flow chart of a further embodiment of a detection according to the invention;
Fig. 25 is a schematic representation of a basic embodiment of connected content according to the invention as per Fig. 24; and
Fig. 26 is a flow chart of a further embodiment of a detection according to the invention.
The hardware or software design that implements the detection of interrelations between information-representing data according to the invention, including the user interface, is evident in particular from the context-sensitive management of information and/or actions according to the invention, so that a demonstrable connection is given.
The data processing system consists of several components, which in turn contain further subcomponents. With reference to Figures 1 to 13, the main components are described below to give a general overview of the architecture of the data processing system. The rationale for the architecture shown and described here is that the components need not necessarily run within a single application; a division across different applications and systems is also possible. Both a so-called standalone application and a so-called client/server application are thereby supported.
As can be seen from Fig. 1, the data processing system has a user interface, a kernel and a so-called repository.
The user interface (hereinafter referred to as GUI) is the interface to the user. It is implemented and configured so that the ergonomics and the user's needs are fully met. In the present case the GUI is platform- or device-dependent and is therefore adapted to the capabilities of each platform or data processing device. In the present example, GUI implementations are provided for PCs, PDAs, HTML web applications, and mobile phone or WML/WAP applications, preferably realized by means of or as Java applications.
The kernel is the central core of the data processing system application, in which all components come together and are connected to one another. The kernel itself is divided into further subcomponents, called below IQser, Content Provider, Crawler, Logger or Tracker (not explicitly shown in Fig. 1) and Analyzer.
The kernel provides an interface for the graphical user interface (GUI), hereinafter called the IQser component. The method calls of the GUI are forwarded to the responsible components and, if required, processed accordingly before being returned to the GUI, as can be seen from the principle shown in Fig. 2. As shown in Fig. 2, the requested tasks are performed by the components provided for them. Fig. 3 shows an exemplary overview of the connections between the individual components.
Furthermore, the IQser component is the controller instance: it controls access to the repository or repositories and controls the crawler and Analyzer processes. In the present case the IQser component also takes on the task of integrating the respective content providers into the system.
The Content Provider is an abstract component. In the present case it is a framework that allows any data source to be integrated into the data processing system. The overall system is thus very flexible and can be integrated into the user's existing infrastructure.
The crawler component has the task of searching for new content objects or changes to existing objects. All changes and new items are synchronized with the repository so that the repository is always up to date. The process implemented by the crawler component runs in the background and is started at a user-definable interval.
The task of the logger or tracker component, not explicitly shown in Fig. 1, is to record every user activity with a content (hereinafter also called content item). This logging is needed so that the Analyzer can later detect certain processing or usage patterns of the user in relation to various content items and, where appropriate, create or delete less relevant connections (so-called "weak links") between objects.
The Analyzer component runs here as an independent process in the background of the data processing system and performs several tasks. For a conceptual comparison, the Analyzer component searches the activity logs of the logger or tracker component, that is, the user's uses, for patterns and creates or deletes the aforementioned relationships in the repository. The data processing system can thus recognize relationships between contents and their uses dynamically and independently and, in effect, learn. In a syntactic comparison, the Analyzer component examines the actual content of a data set (content object) for text fragments that point to other data sets (content objects).
The data processing system refers to external data sets and uses them for connection detection. The entries can be in external data sets, such as e-mails or addresses, or be documents and objects of the data processing system itself. If the user follows a connection reference, the record or external document opens in the application intended for it. If the user accesses a data set over a network connection, the document opens, for example, in a separate browser window after first being converted accordingly by the data processing system.
The repository is the interface for data storage. Connections and/or references and their respective links to the data sets (content objects) are managed here. Fig. 1 shows only one repository by way of example. However, several repositories can be integrated into the data processing system so that, for example, local and server-based connections and data sets can be integrated into the respective user system.
The data processing system further includes an interface for integrating data sets so that it can handle every conceivable type of content in the form of data representing information. Since not all possibilities and ways of integrating data sets can be determined a priori, the content provider component is implemented here as a programmable interface. This allows developers to program further content providers individually, which the data processing system can then use. This programming interface is hereinafter also called a plug-in and is shown in Fig. 4.
The plug-in (Fig. 4) consists of several objects to be implemented by the developer:
- Content Provider: The interface to the kernel. It provides the methods needed for processing contents.
- Content: The content object is the actual content. It is used to exchange content between the components.
- Content View: The interface to the GUI. The presentation and the possible actions or uses of the content are implemented here.
As can be seen from Fig. 5, plug-ins (plugins) are managed by two objects, the Plugin Manager and the Plugin. The Plugin Manager's task is to look for installed plugins when the IQser component of the data processing system starts, and to load and initialize the plugins it finds. The Plugin itself serves here as a data container. It contains an instance of the content provider and the configuration of the respective plugin.
The Content Provider, Content and Content View objects are to be implemented by the developer when developing new plugins. A plugin is configured via a file (in this case plugin.xml) that must be present in each plugin directory. If it is missing, the directory is not recognized as a plugin by the IQser component of the data processing system.
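The discovery rule, a directory counts as a plugin only if it contains plugin.xml, can be sketched in a few lines. The function name `find_plugin_dirs` and the directory layout are assumptions for illustration; the specification does not describe the format of plugin.xml, so the sketch only checks that the file exists.

```python
import tempfile
from pathlib import Path
from typing import List


def find_plugin_dirs(root: str) -> List[str]:
    """Return the names of subdirectories that contain a plugin.xml file."""
    return sorted(
        entry.name
        for entry in Path(root).iterdir()
        if entry.is_dir() and (entry / "plugin.xml").is_file()
    )


with tempfile.TemporaryDirectory() as root:
    (Path(root) / "mail").mkdir()
    (Path(root) / "mail" / "plugin.xml").write_text("<plugin/>")
    (Path(root) / "notes").mkdir()  # no plugin.xml, so it is ignored
    found = find_plugin_dirs(root)
```

Only `mail` is reported; the `notes` directory is skipped because its configuration file is missing.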
Fig. 6 illustrates by way of example the processes implemented by the crawler component, which run in the background of the data processing system application. The data sets are searched for new or modified content, which in this example is stored in a table for analysis. The processes are run here with low priority at periodic intervals.
As can be seen from Fig. 7, the crawler obtains the list of installed plugins from the Plugin Manager and processes them one after another. The content provider of each plugin is queried for new or changed content objects and returns a list of content IDs. The crawler then stores these lists in a cache table. The table is in turn processed by the analyzers.
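The crawler pass described above can be sketched as a loop over plugins that collects changed content IDs into a cache table. The method name `changed_content_ids`, the `Protocol` used to describe a content provider and the list of tuples standing in for the cache table are assumptions made for this illustration.

```python
from typing import Dict, List, Protocol, Tuple


class ContentProvider(Protocol):
    def changed_content_ids(self) -> List[str]:
        """Return the IDs of new or changed content objects (hypothetical method)."""
        ...


class FakeMailProvider:
    """Stand-in provider used only for this example."""

    def changed_content_ids(self) -> List[str]:
        return ["mail-7", "mail-9"]


class FakeFileProvider:
    def changed_content_ids(self) -> List[str]:
        return []


def crawl(plugins: Dict[str, ContentProvider], cache: List[Tuple[str, str]]) -> None:
    """Query each plugin's provider in turn and store the returned IDs."""
    for plugin_name, provider in plugins.items():
        for content_id in provider.changed_content_ids():
            cache.append((plugin_name, content_id))


cache_table: List[Tuple[str, str]] = []
crawl({"mail": FakeMailProvider(), "files": FakeFileProvider()}, cache_table)
```

The cache then holds one row per changed content object, tagged with the plugin it came from, ready for a later analysis step.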
The task of the logger or tracker component, shown by way of example in Fig. 8, is similar to that of the crawler component. The tracker or logger logs the user's actions on content objects. By creating this audit trail, the InterestAnalyser can later detect patterns in the processing and/or use of content and feed them to further use. Relationships between contents can thus be generated dynamically. These relationships are given a low weighting and, as soon as they have not been used for a longer, specified period of time, are in effect forgotten by being deleted. The period over which the tracker or logger monitors activities is configurable. The shorter the period, the faster the processing. The longer the period, the greater the chance of detecting processing or usage patterns.
The Analyzer shown in Fig. 9 is a process in the background of the data processing system that analyzes the content "found" by the crawler according to various criteria. The Analyzer itself is in turn divided here into the following subcomponents or processes:
- Index: Analyzes the content for keywords that are needed for connection detection.
- Semantic Analyzer: Analyzes the content according to semantic criteria in order to detect connections between contents.
- Syntax: Analyzes the content for syntactic links to other content.
- InterestAnalyser: Analyzes content for patterns in its processing or use by the user.
If at least two events have been registered in the log (tracker), an analysis can begin with the call of the analysis process. The Analyzer looks at the first two entries of the log. If the second entry is not an event triggered by calling up a content item (for example "display"), the syntax analysis starts (see Fig. 11). If it is such a call (hereinafter also called a selection), the syntax analysis is skipped. If both entries are events triggered by the user, both contents are passed to the Semantic Analyzer and the Interest Analyzer and examined. Further analysis steps then follow. Once all analysis steps have been performed, the first entry is deleted. The whole process is then repeated until only one entry remains in the log.
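The loop over the log can be sketched as follows. Several points are assumptions, because the text leaves them open: the `Event` fields, treating "display" as the only selection action, running the syntax analysis on the content of the second entry, and passing the analyzers in as plain callables.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Event:
    content_id: str
    action: str           # e.g. "display", "edit", "create" (assumed values)
    by_user: bool = True  # assumed flag for events triggered by the user


def run_analysis(
    log: List[Event],
    syntax: Callable[[str], None],
    semantic: Callable[[str, str], None],
    interest: Callable[[str, str], None],
    selection_actions: Tuple[str, ...] = ("display",),
) -> None:
    """Process the log pairwise until only one entry remains."""
    while len(log) >= 2:
        first, second = log[0], log[1]
        if second.action not in selection_actions:
            syntax(second.content_id)  # skipped for pure selections
        if first.by_user and second.by_user:
            semantic(first.content_id, second.content_id)
            interest(first.content_id, second.content_id)
        del log[0]                     # the first entry is done


syntax_calls: List[str] = []
semantic_calls: List[Tuple[str, str]] = []
interest_calls: List[Tuple[str, str]] = []
log = [Event("a", "display"), Event("b", "edit"), Event("c", "display")]
run_analysis(
    log,
    syntax_calls.append,
    lambda x, y: semantic_calls.append((x, y)),
    lambda x, y: interest_calls.append((x, y)),
)
```

After the run, the log holds only the last entry, the syntax analysis has been invoked once (for the edited content `b`), and the semantic and interest analyzers have seen the pairs a/b and b/c.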
Fig. 14 shows, by way of example, the process of a so-called Metaanalyzer. The Metaanalyzer checks whether the number of links to the content under examination exceeds a threshold n. If not, it checks whether a meta-content exists; if one exists, it must be deleted. If the threshold is reached, it likewise first checks whether a meta-content exists. In both of these cases, all linked content is merged and summarized. If a meta-content already existed, it is updated with the summary. If no meta-content existed, one is created and stored in a database.
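A minimal sketch of this decision is given below. It assumes that links and meta-contents are held in plain dictionaries and that the caller supplies the summarizing step. The names `meta_analyze`, `links`, `meta_store`, and `summarize` are hypothetical, and "exceeds" is read here as strictly greater than n.

```python
def meta_analyze(content_id, links, meta_store, n, summarize):
    """Create, update, or delete the meta-content for one content."""
    linked = links.get(content_id, [])
    if len(linked) <= n:
        # Threshold not exceeded: an existing meta-content is deleted.
        meta_store.pop(content_id, None)
        return None
    # Threshold exceeded: merge and summarize all linked content.
    # Assigning to the dictionary covers both "update" and "create".
    meta_store[content_id] = summarize(linked)
    return meta_store[content_id]
```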
The repository, shown in detail by way of example in Fig. 10, is the interface to the databases. It handles the storage of all data relevant to the kernel, in this case in particular:
- An index to all the content.
- Connections or links from content to other content.
- Temporary information about the order in which the user processed content (tracker).
- Temporary information about new and changed content (crawler).
- An index of all the key values of the content (weight).
At the start of the semantic analysis, it is checked whether a link between the two contents already exists. If there is no link, one is created with a low weighting greater than 0. If a link already exists, the Semantic Analyzer checks whether its weighting is less than 1. If the value is 1, the analysis is terminated; if it is less than 1, the analysis continues. The weighting of the link is now increased by a smallest value greater than 0. It is then checked whether the resulting weighting is greater than or equal to 1. If the value is less than 1, the semantic analysis is complete. If the value is greater than or equal to 1, the weighting is reduced to a largest value smaller than 1. Following this, all other existing links are retrieved and their weighting is reduced by a factor equal to the reduction applied to the link currently under examination. The Semantic Analyzer then terminates.
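The weighting rules of the semantic analysis can be illustrated with the following sketch. The constants `STEP` and `MAX_WEIGHT` stand in for the "smallest value greater than 0" and the "largest value smaller than 1"; their concrete values are assumptions. The reduction factor is interpreted here as the ratio of the capped weighting to the raised weighting, and the function name `semantic_update` is hypothetical.

```python
STEP = 0.05        # assumed "smallest value greater than 0"
MAX_WEIGHT = 0.95  # assumed "largest value smaller than 1"


def semantic_update(weights, pair):
    """Apply one semantic-analysis step to the link `pair`."""
    if pair not in weights:
        weights[pair] = STEP       # new link with a low weighting
        return weights
    if weights[pair] >= 1:
        return weights             # weighting is 1: analysis terminates
    raised = weights[pair] + STEP
    if raised < 1:
        weights[pair] = raised     # still below 1: analysis is complete
        return weights
    # Cap the link and scale every other link by the same factor.
    factor = MAX_WEIGHT / raised
    weights[pair] = MAX_WEIGHT
    for other in weights:
        if other != pair:
            weights[other] *= factor
    return weights
```

Scaling the other links by the same factor keeps the relative order of all weightings intact while preventing any dynamic link from reaching 1.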
Fig. 11 shows the iterative analysis process of relationship detection. As already explained, various analysis processes are applied in combination. In the course of relationship detection, the analysis process is started by an entry in the tracker (event log) and is completed when only one entry remains in the tracker. As explained above, the analyzer component has the task of identifying the relationships between the contents independently. For this purpose, the data processing system combines in particular a syntactic and a semantic analysis. Fig. 12 shows the flow of the processes already described above in the context of a semantic analysis. Fig. 13 shows the basic structure of the processes in the context of a syntactic analysis.
Fig. 11 illustrates the method steps implemented in principle by the analyzer component. When the analyzer component is called, the log entries are processed in the chronological order in which they were written. Entries processed in this way are deleted from the list, but at least the last action is retained. How long the list of actions in the log file or database table becomes therefore depends on how fast the Analyzer works, or on how much computing power is assigned to its secondary thread. Priority is always given to the thread of the IQser component, that is, to the actions performed by the user of the data processing system. As soon as a new action is written to the log file and the Analyzer is not yet active, the analysis for relationship detection begins. The following situations can lead to the invocation of the analyzer:
- At the start of the data processing system, at least two entries from the last session are in the log.
- The crawler has identified new content, for example through a new plug-in.
- The user creates new content.
- The user selects a content from a displayed list.
The syntax analyzer determines the rules for the syntax analysis by retrieving the "keys" for the content under examination. These keys are attributes that describe which information blocks (data fields) and data types are to be used for detecting relationships. From the keys, the Analyzer assembles a filter that triggers a search over the entire acquired data set. A link to each content found is finally created, which, depending on the implementation and the user's requirements, is weighted with 1, with a maximum value less than 1, or according to the accuracy of the matches.
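As a rough illustration, the key-based filter could look like the sketch below. It assumes that contents are dictionaries with an `id` field and that the filter matches on exactly equal field values; the text does not specify the matching rule, so this is only one possible reading. The function name `syntax_links` and the default weighting are hypothetical.

```python
def syntax_links(content, keys, dataset, weight=1.0):
    """Link `content` to every record sharing a value in one of its key fields."""
    # Build the filter from the key attributes that are actually filled in.
    wanted = {k: content[k] for k in keys if content.get(k)}
    links = {}
    for other in dataset:
        if other["id"] == content["id"]:
            continue  # never link a content to itself
        if any(other.get(k) == v for k, v in wanted.items()):
            links[other["id"]] = weight
    return links
```

With `keys=["email", "city"]`, for example, an address record would be linked to every other record that shares its e-mail address or its city.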
As explained above, any content can be linked in relationships with any number of other contents. Relationships are possible both across categories and within a category. For example, addresses can be linked with addresses, but addresses can also be linked with projects. There is no hierarchical order. The data processing system distinguishes between static and dynamic links for relationships and their detection. Static links are always displayed and can be created and edited by the user. Dynamic links are created automatically by the system and given a weighting. The following serve as criteria for the weighting:
- The frequency with which a content is called.
- The frequency with which the content is called in the context produced by the relationship.
- The age of the last call of the relationship or of the linked content.
In search results for a connection request, there is additionally a weighting according to the frequency of the searched content in the context.
As already explained, there is an internal threshold that determines whether or not a dynamic link is displayed. The value (weighting) varies as a function of user behavior, that is, of the evaluation of the log of all actions performed by the user with the data processing system.
In the data processing system, the relationships are represented as dynamically linked objects.
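The distinction between static and dynamic links and the display threshold can be sketched as follows. The `Link` class, its fields, and the function `visible_links` are hypothetical names chosen for illustration; how the weighting itself is computed from the criteria above is not specified in the text and is therefore left out.

```python
from dataclasses import dataclass


@dataclass
class Link:
    target: str
    weight: float
    static: bool = False  # static links are created by the user


def visible_links(links, threshold):
    """Static links are always shown; dynamic links only at or above the threshold."""
    return [link for link in links if link.static or link.weight >= threshold]
```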
Fig. 15 shows and describes by way of example how, for the identification of new content, the corresponding plug-in is queried by a crawler task. Technically, the subsystem realized in this synchronization process is a so-called class that queries the respective data sources of the plug-ins for new, deleted, or changed content.
Fig. 16 shows and describes by way of example the flow of an analysis of full texts as part of a relationship detection according to the invention. At least one word from a list of the words contained in a full text (word list) is used as the at least one datum representing information of the request for a relationship (connection request). The word list generated in the pattern analysis of a full text is sorted in ascending order according to the frequency of the words. The first 32 words of the resulting list are submitted as a search query to a full-text index of a search engine, that is, as a connection request. As a result, a list of similar documents is returned, each with a weighting determined by the search engine. The further analysis then determines for each search result whether a link between the full text and the search result already exists. If this is the case, the higher weighting is applied. If this is not the case, a link is created and the weighting of the search result is adopted.
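The word-list step and the merging of search results into existing links might be sketched as follows. The tokenization by a simple regular expression, the alphabetical tie-break between equally frequent words, and the function names `build_query` and `merge_results` are assumptions. The search engine itself is not modeled, and its results are assumed to arrive as a mapping from document ID to weighting.

```python
import re
from collections import Counter


def build_query(full_text, max_words=32):
    """Return the first `max_words` words, sorted ascending by frequency."""
    words = re.findall(r"\w+", full_text.lower())
    counts = Counter(words)
    # Ascending frequency; ties are broken alphabetically (an assumption).
    ordered = sorted(counts, key=lambda w: (counts[w], w))
    return ordered[:max_words]


def merge_results(links, results):
    """Keep the higher weighting for existing links, adopt it for new ones."""
    for doc_id, score in results.items():
        links[doc_id] = max(links.get(doc_id, score), score)
    return links
```

Sorting in ascending order places the rarest words of the text at the front of the list, so the query is built from the terms that are least common within the document.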
The data processing system according to the invention further provides a cross-device man-machine interface for displaying, processing, and controlling complex content or data sets and their interrelationships. It is intended in particular to meet the requirements for transparent control when modifying large data sets and their relationships, and to make such control understandable to the untrained user and easier than in other systems. Advantageously, the logic and ergonomics used by the man-machine interface are independent of the data processing devices used for or with the inventive data processing system, for example of their output devices such as monitors or displays.
Advantageously, the man-machine interface of the data processing system of the invention enables action-oriented control of the data processing system. This action-oriented control replaces the functional menu control generally used today in computer-based programs. A functional menu control provides a list of functions that are grouped according to abstract criteria and are accessible in menu trees. Examples of such abstract criteria are "File" and "Edit" in the Windows operating systems from Microsoft. In contrast, the action-oriented control of the invention, also called action-oriented navigation, works with context-sensitive options for action that are structured in a binary fashion in every situation of the system. The binary structure refers to an input of content on one side and an output of content on the other, or to writing or modifying data sets and displaying them. Advantageously, the binary action options are adapted to the respective action context, that is, new options may be added and/or other options removed depending on the action context. Advantageously, the binary action-oriented control can thus be applied equally to all output media, for example to small displays of mobile terminals or to voice input and/or output, for which the data processing device then advantageously comprises a microphone and a speaker as input and output units.
The man-machine interface of the data processing system of the invention advantageously realizes the control of, or navigation through, complex data sets and their interrelationships with a graphical user interface that serves for the input and/or reproduction of connection requests, links, relationships, and/or relationship weightings. The graphical user interface is designed for entering, modifying, and/or reproducing data representing information in at least one database.
The graphical user interface divides the display area available for reproduction by a display device into three regions. In a first region, data representing information from at least one data set accessible via at least one data source is reproduced. In a second region, at least one datum representing information of a connection request is reproduced. In at least a third region, a detected relationship between the at least one datum of the connection request and the at least one datum of the data set is reproduced.
The inventive division of the display area into three regions is hereinafter called the triadic window technique. Whereas the window techniques known from the graphical user interfaces of modern operating systems work with list, icon, or tree views to illustrate the hierarchical structure of the data and files of a computer system, the triadic window technique of the invention works with a horizontal or vertical three-way division of the available display area (window).
Fig. 17 and Fig. 18 each show a corresponding graphical user interface, here in the form of a basic example of a so-called Internet front end with a vertical division of the display area. The data processing system according to the invention can be used here in the form of a so-called Web browser (Fig. 17) or a so-called Java client (Fig. 18).
In the vertical division of the display area illustrated in Fig. 17 and Fig. 18, the upper third of the display area (window) contains, for example, the selection of content as a tabular list; the second third contains a detail view of a content selected from the list; and the last third shows all content linked with the selection, for example likewise as a list. The user can see at a glance which content belongs together and can navigate onward from there to search for content or derive insights from the relationships. The underlying epistemological idea is that knowledge is created by linking information. The data processing system of the invention thus makes not only information transparent but also knowledge. If the user selects a content from the list in the last third of the window, an animation advantageously starts that moves the last third upward, after which a "triadic window" with the same structure and logic is shown. The user can also navigate back, in which case the animation runs in reverse.
In a further vertical division of the display area, not shown in the figures, the upper third of the display area (window) contains, for example, the result of a selection in the form of a list of contents; the second third contains the links or relationships to a content selected from the list in the first region; and the last third contains a preview of the selected content (data representing information). The preview reproduced in the third region is often easier to read or view when the size ratio between the first two regions and the third region is variably adjustable, a so-called "Split Plane". Control or navigation is in this case advantageously limited to the first two regions of the display area, while the preview in the third region simply changes in place. The order of the panes can vary depending on the implementation.
An advantage of the graphical user interface of the data processing system of the invention is that the user can see at a glance which content belongs together and can simply continue to control or navigate in order to search for content or derive insights directly from the detected relationships. The invention thereby provides endless control or navigation.
The fields marked in Fig. 18 with the reference numerals 1 to 7 have the following functionality:
1: List entries can be dragged and dropped onto the list of references to create a link. A double-click opens the detail view in a new window;
2: Column headings: a sorting criterion can be selected with a mouse click;
3: Scrollbars for viewing in full those lists and detail lists that are not completely displayed in the window area;
4: The area for the list can be resized with the mouse and also completely "collapsed";
5: The detail view of a list entry has scrollbars for viewing in full those lists and detail lists that are not completely displayed in the pane;
6: Column entries with associated functions; for example, an e-mail address leads to a form for writing an e-mail; and
7: References can be deleted simply by dragging them out of the window area with the mouse.
The following describes, by way of example, processes that arise in the use of the data processing system and their implementation through the respective user interface:
- "Display a list"
In the user interface (GUI) of the application, the user chooses an item from the menu or the navigation in order to display content, for example to view all recent projects. The list of selectable menu items resembles the bookmark list of a browser. In the data processing system according to the invention, however, the bookmarks refer not to static pages or lists but to a dynamic context. Accordingly, this part of the navigation will look different from user to user. In the present case the data processing system is implemented as a Java client.
- "Add Attributes"
To add a new attribute to a content object, the user invokes the function from the edit mode of the content object. Next to the existing attributes, the user finds an empty field in which a new attribute can be entered or selected from an adjacent list. An address, for example, is an object for which IQser suggests attributes. Because the Java client does not distinguish between editing and viewing mode, it is advantageously proposed for this variant to add the "Add Attribute" option to the "New" navigation item. When working with external data sources, the additional attributes are stored in IQser.
If the user chooses the menu item "Block" (or another name), a new empty block is shown in the detail view. Its title is also empty. Advantageously, there is an additional selection list with the previously defined blocks. Advantageously, the data processing system also checks whether the term already exists for a block and, if necessary, corrects a typo or removes redundancies.
Fig. 19 shows and describes by way of example the sequence of an inventive detection of user behavior as part of a relationship detection. The connection request is made here as a query to a search engine. The data processing system according to the invention realizes a search optimization with a so-called Search Analyzer. In this process, the Search Analyzer automatically learns from the user's behavior which search hits are relevant to a query.
In the realization shown in Fig. 19, a log (tracker) records which search queries, that is, requests for a relationship (connection requests), were made and which results of those requests were used at which time. The log (tracker) is processed step by step in the course of the analysis. If a log entry is a search query, it is checked whether this query has occurred before. If this is not the case, the query is stored in a database. If the log entry is a call of a data object, that is, of a search result, it is checked whether the second log entry is also a call of a data object (search result). If so, it is checked whether a certain time difference has been exceeded. If this is not the case, the first log entry is ignored. If the time difference has been exceeded, it is checked whether the data object from the first log entry is already stored in the database in relation to the preceding query. If this is the case, the weighting of the object in relation to the query is increased by the value d (0 < d < 1). If this would give a total value of more than 1, the existing value is advantageously retained and the weighting of all other objects with respect to the query is lowered by the percentage d relative to a base of 1. If, on the other hand, the object is not yet stored in relation to the search query, it is stored with a weighting of d.
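The weighting rule for a single query can be illustrated with the sketch below. It covers only the update step, not the log handling or the time-difference check. The reading that the existing value is kept and the other objects are multiplied by (1 - d) when the sum would exceed 1 is an interpretation of the text. The function name `reinforce` and the dictionary layout are hypothetical.

```python
def reinforce(weights, obj_id, d):
    """Update the weighting of `obj_id` for one query; `weights` maps object ID to weight."""
    if not 0 < d < 1:
        raise ValueError("d must satisfy 0 < d < 1")
    if obj_id not in weights:
        weights[obj_id] = d          # first use: store with weighting d
        return weights
    if weights[obj_id] + d > 1:
        # Keep the existing value and lower all other objects by d percent.
        for other in weights:
            if other != obj_id:
                weights[other] *= 1 - d
    else:
        weights[obj_id] += d
    return weights
```

Lowering the competing objects instead of raising the favored one keeps every weighting within the range from 0 to 1 while still widening the gap between frequently used and rarely used results.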
Fig. 20 shows and describes by way of example a further improvement of the inventive process for relationship detection. Advantageously, the number of detected relationships referring to the at least one datum representing information of at least one data set is used to determine a so-called ranking, that is, a rank order, preferably by calculation. The invention advantageously makes use of the insight that information, such as a scientific publication, gains in importance and value when it is cited frequently. The value of a rank or ranking lies between 0 and 1, or correspondingly between 0% and 100%. In the present solution, three values are advantageously used to calculate the ranking value: a value calculated by Lucene, an open-source Java library for creating and searching text indexes; a value for a search query resulting from the Search Analyzer (cf. Fig. 19); and a value resulting from the number of links to an individual search result. As part of the process of the invention, the number of links available for each entry in the repository is determined at regular intervals. The result is then recorded in the repository. According to the invention, this procedure allows the ranking for a request (connection request) to be calculated faster.
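The text names the three inputs of the ranking but not how they are combined. The following sketch therefore uses an equal-weight mean purely for illustration and normalizes the link count by the largest count in the repository. Both choices are assumptions, as are the function names, and all three inputs are assumed to be already scaled to the range from 0 to 1.

```python
def link_scores(link_counts):
    """Precompute a 0..1 score per entry from its number of links (assumed normalization)."""
    top = max(link_counts.values(), default=0)
    return {k: (v / top if top else 0.0) for k, v in link_counts.items()}


def ranking(text_score, query_score, link_score):
    """Combine the three values; the equal-weight mean is an illustrative assumption."""
    return (text_score + query_score + link_score) / 3
```

Computing `link_scores` periodically and storing the result corresponds to the precomputation described above, so that only the combination step remains at query time.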
Fig. 21 shows and describes by way of example a further improvement of the inventive process for relationship detection. As part of the detection of relationships, suggestions for connection requests are offered to the user, preferably in the form of a keyword list, in order to narrow and thereby improve the search (connection request). The connection request is made here as a query to a search engine. According to the invention, this avoids the large and often confusing result lists commonly encountered with searches (connection requests) that use only a single keyword, which contain many hits that are not relevant to the searcher. The inventive method for search suggestions automatically offers the user keyword suggestions for narrowing the search. These suggestions are advantageously displayed in a list next to the search result. In a preferred embodiment of the invention, each term is provided with a link that leads to a narrowed search. The list of search suggestions results from a database query. In the present case, the Search Analyzer builds a database (table) in which the search queries are stored together with the favored search results. From this database (table), a request, preferably an SQL query, determines all search queries in which the keyword or keywords of the searching user occur. From the resulting list, the terms that do not match the user's keyword or keywords are then extracted. The result is a list of keyword suggestions that can be reproduced by the optical display device of a data processing system and/or a data processing device.
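A minimal sketch of the suggestion query is shown below, using Python's built-in `sqlite3` module. The table name `searches`, its columns, and the substring match via `LIKE` are assumptions made for illustration; the text only states that an SQL query finds all stored search queries containing the user's keywords.

```python
import sqlite3


def suggest(conn, keywords):
    """Return terms from stored queries that contain all keywords but are not keywords."""
    keywords = [k.lower() for k in keywords]
    if not keywords:
        return []
    where = " AND ".join("lower(query) LIKE ?" for _ in keywords)
    rows = conn.execute(
        f"SELECT query FROM searches WHERE {where} ORDER BY rowid",
        [f"%{k}%" for k in keywords],
    )
    suggestions = []
    for (query,) in rows:
        for term in query.lower().split():
            # Keep only terms that differ from the user's own keywords.
            if term not in keywords and term not in suggestions:
                suggestions.append(term)
    return suggestions
```

Each suggested term can then be rendered as a link that repeats the search with the original keywords plus that term.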
Fig. 21 and Fig. 22 show and describe by way of example a further improvement for a user-individualized relationship detection according to the invention. Advantageously, the so-called clickstream of an Internet user is evaluated with an inventive relationship detection. For the purposes of the present invention, the clickstream of an Internet user is understood to mean the behavior of visitors on websites as collected in a clickstream, in particular which web pages and/or sections of websites a user has viewed or visited, how often, and/or for how long. In the inventive method, the clickstream of an Internet user is evaluated in the context of a connection request, advantageously as follows:
1. A local search engine, in this case in particular an IQser application according to the invention executed by a computer, is given a data connector, here a plug-in, that reads the history of an Internet browser run on the computer (browser history) and automatically retrieves and indexes the corresponding web page for each entry. The result for the user is a personal, user-specific index of the Internet.
2. The user (Internet user) uses a virtual browser for which he has previously authenticated, for example by logging in with a user ID. All entries in the browser and all subsequent clicks are then sent to a server, which assigns the pages viewed to the corresponding user ID in a database and stores them. In addition, a connection request is sent to the server running the IQser application in order to learn the pages linked to each of them. The linked pages are likewise assigned to the above user ID. The result is a section of the Internet that is determined by the personal, user-specific interests of the user (Internet user).
In the process according to solution 1 and/or solution 2 above, a virtual browser appears in a web page and contains, among other things, an input field for an Internet address and a search box for personalized search. The virtual browser logs which pages the user calls. The respective process sequence is shown by way of example in the flowchart of Fig. 22 and applies to both solution 1 and solution 2.
The index of the search engine is complemented by a database of user IDs and associated content IDs with timestamps. The search results are iterated, and for each hit it is checked whether its content ID is stored for the respective user ID. The database of user IDs and associated content IDs with timestamps is advantageously checked regularly for entries that are older than a predetermined period or date, preferably one year. Such entries are deleted.
According to the invention, this also makes it possible to build a general Internet index when the solution is used by many users and/or applications.
The flowchart of Fig. 23 shows how log entries containing a URL are processed. A log entry comprises at least a user ID, a URL, and a timestamp.
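The personalized filtering of search results and the regular clean-up can be sketched as follows. The in-memory dictionary keyed by (user ID, content ID) stands in for the database, and the names `personal_hits`, `prune`, and `visits` are hypothetical. The one-year default follows the preferred period named in the text.

```python
from datetime import datetime, timedelta


def personal_hits(hits, user_id, visits):
    """Keep only the hits whose content ID is stored for this user."""
    return [content_id for content_id in hits if (user_id, content_id) in visits]


def prune(visits, now, max_age=timedelta(days=365)):
    """Drop entries older than `max_age`; `visits` maps (user, content) to a timestamp."""
    return {key: ts for key, ts in visits.items() if now - ts <= max_age}
```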
A further advantageous embodiment of the invention provides that information is enriched with semantic tags as part of a connection request according to the invention. Fig. 24 shows a flowchart of a corresponding procedure for the inventive enrichment of information by so-called tagging. Fig. 25 shows an example of terms linked by corresponding tagging. In the present case, a tag consists of at least one keyword that describes the meaning of the content. IQser has an architecture in which all integrated information is stored in a database with an ID and a URL. This makes it possible to assign tags to each content, manually or automatically. At the same time, the tags are added to the index under the attribute name "TAG". The user can later search explicitly for tags, with or without other search parameters. Furthermore, each tag leads the user to all other content that has been given the same tag. This facilitates a highly targeted search and allows an alternative networking of the content.
Tags can in turn be evaluated to produce a network of concepts. Fig. 25 shows an example of terms linked by tagging. For each tag, the context in which the term appears is queried. The terms found are then linked with one another and stored in a database. Each link is weighted by the number of joint occurrences. This relationship is represented in a network structure (see Fig. 25). The user can thus navigate to further terms and/or to the content behind them, or refine a search by linking multiple tags. When a network node is selected, for example by a mouse click, a search query is created that forms an intersection, returning the content that contains the common tags.
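The tag network and the intersection query could be sketched as follows. The sketch assumes that the tags of each content are available as a mapping from content ID to a list of tags and that "joint occurrence" means two tags assigned to the same content. The function names `tag_network` and `contents_with_all` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def tag_network(content_tags):
    """Weight each pair of tags by the number of contents they share."""
    edges = Counter()
    for tags in content_tags.values():
        for a, b in combinations(sorted(set(tags)), 2):
            edges[(a, b)] += 1
    return edges


def contents_with_all(content_tags, selected):
    """Intersection query: contents that carry every selected tag."""
    wanted = set(selected)
    return sorted(cid for cid, tags in content_tags.items() if wanted <= set(tags))
```

Selecting a node or an edge of the network then amounts to calling `contents_with_all` with the one or two tags involved.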
A further advantageous embodiment of the invention provides that information is automatically enriched with semantic tags as part of a connection request according to the invention. Fig. 26 shows a flowchart of a corresponding procedure for the inventive automatic enrichment of information by so-called tagging. Content is semantically enriched by means of tags. This is particularly important when large data and text collections are to be made accessible. According to the invention, content can be found quickly and effectively, and relationships to other content can be displayed via a semantic network of tags, as explained in connection with Fig. 24 and Fig. 25. IQser makes it possible here not only to create tags manually but also to detect them automatically. Large amounts of data can thus be indexed by machine. In the present case, this advantageously builds on the syntax analysis of IQser. If IQser's syntax analysis recognizes a relationship between two contents by means of a key attribute, the content in which the key attribute was found can be given a corresponding tag. A database of technical terms and concepts, preferably one that additionally defines the subject area, can advantageously be used to enrich content semantically with subject-specific key attributes. The inventive solution also advantageously offers a plug-in that integrates databases, preferably online encyclopedias such as Wikipedia and/or the like, so that the manual effort can be eliminated. According to the invention, the user is thus provided with a large number of terms that can be used for the semantic annotation of their own data.
The embodiments of the invention illustrated in the figures of the drawing and explained in connection with the description serve merely to illustrate the invention and do not limit it.
Priority Applications (1)
|Application Number||Priority Date||Filing Date||Title|
|PCT/EP2007/007666 WO2009030248A1 (en)||2007-09-03||2007-09-03||Detecting correlations between data representing information|
|Publication Number||Publication Date|
|EP2193457A1 (en)||2010-06-09|
Family Applications (1)
|Application Number||Title||Priority Date||Filing Date|
|EP07802083A Pending EP2193457A1 (en)||2007-09-03||2007-09-03||Detecting correlations between data representing information|
Country Status (3)
|US (1)||US9317604B2 (en)|
|EP (1)||EP2193457A1 (en)|
|WO (1)||WO2009030248A1 (en)|
Families Citing this family (3)
|Publication number||Priority date||Publication date||Assignee||Title|
|US8661403B2 (en) *||2011-06-30||2014-02-25||Truecar, Inc.||System, method and computer program product for predicting item preference using revenue-weighted collaborative filter|
|DE102012109096A1 (en)||2012-09-26||2014-03-27||Iqser Ip Ag||Method for sequential delivery of personalized data representing information, especially in the form of videos and the like, particularly for a personalized TV|
|US9507589B2 (en) *||2013-11-07||2016-11-29||Red Hat, Inc.||Search based content inventory comparison|
Family Cites Families (9)
|Publication number||Priority date||Publication date||Assignee||Title|
|AU2003210393A1 (en) *||2002-02-27||2003-09-09||Michael Rik Frans Brands||A data integration and knowledge management solution|
|US7346839B2 (en) *||2003-09-30||2008-03-18||Google Inc.||Information retrieval based on historical data|
|WO2005050471A2 (en) *||2003-11-22||2005-06-02||Wurzer Joerg||Data processing system and device|
|JP2005165958A (en) *||2003-12-05||2005-06-23||Ibm Japan Ltd||Information retrieval system, information retrieval support system and method therefor, and program|
|US7519595B2 (en) *||2004-07-14||2009-04-14||Microsoft Corporation||Method and system for adaptive categorial presentation of search results|
|JP2006099388A (en) *||2004-09-29||2006-04-13||Hitachi Software Eng Co Ltd||Text mining server and system|
|AT518197T (en) *||2006-02-23||2011-08-15||Netbreeze Gmbh||System and method for user-controlled multi-dimensional navigation and / or theme-based aggregation and / or monitoring of multimedia data|
|US8244730B2 (en) *||2006-05-30||2012-08-14||Honda Motor Co., Ltd.||Learning syntactic patterns for automatic discovery of causal relations from text|
|US8442972B2 (en) *||2006-10-11||2013-05-14||Collarity, Inc.||Negative associations for search results ranking and refinement|
Non-Patent Citations (1)
|See references of WO2009030248A1 *|
|17P||Request for examination filed||
Effective date: 20100324
|AK||Designated contracting states:||
Kind code of ref document: A1
Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC MT NL PL PT RO SE SI SK TR
|AX||Request for extension of the european patent to||
Countries concerned: AL BA HR MK RS
|DAX||Request for extension of the european patent (to any country) deleted|
|17Q||First examination report||
Effective date: 20160608