EP2810192A2 - Bearbeitung einer datenmenge - Google Patents
Bearbeitung einer Datenmenge (Processing of a data set)
- Publication number
- EP2810192A2 (application number EP13716973.6A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- search pattern
- data
- query
- search
- pattern
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/242—Query formulation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/242—Query formulation
- G06F16/2425—Iterative querying; Query formulation based on the results of a preceding query
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/242—Query formulation
- G06F16/2428—Query predicate definition using graphical user interfaces, including menus and forms
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Definitions
- the invention relates to a method and a device and a system for processing a (large) amount of data, in particular for finding at least one hit in the dataset.
- Information technologies produce with progressive
- Data mining refers to the systematic application of methods, usually with a statistical-mathematical basis, to a database with the aim of recognizing new patterns. This primarily concerns the processing of large amounts of data (which could no longer be processed manually), for which efficient methods are needed whose time complexity makes them suitable for such data volumes. The methods are, however, also used for smaller amounts of data.
- KDD: Knowledge Discovery in Databases
- an algorithm for anomaly detection requires in advance a definition of normal behavior or the provision of normal data by the user.
- This approach is also called "supervised machine learning" or "active learning".
- Visual Analytics (VA) is known as an interdisciplinary approach that combines the advantages of different research areas. The goal of the Visual Analytics method is to gain insights from large and complex data sets.
- the approach combines the strengths of automatic data analysis with the human ability to quickly grasp patterns or trends visually. Through suitable interaction mechanisms, data can be explored visually and insights obtained (see:
- The interaction with the graphical representation in known VA systems consists essentially of selecting interesting patterns that already exist in the data. The user is thus limited to existing patterns; no further flexibility is provided.
- the object of the invention is to avoid the above-mentioned disadvantages and in particular to provide an efficient way to search for information in large data sets.
- a method for processing a data set, in particular for searching hits in a large amount of data,
- This approach enables automatic pattern recognition based on an interactive visual query.
- the user can select an existing pattern, create or modify a search pattern based on existing data, or create a search pattern without a template and adapt it to his liking.
- the graphical search pattern provides the user with easy access to complex query structures, which he can easily capture and modify.
- the graphical pattern is translated into the at least one query, which is applied to the dataset; in this way, hits based on the graphical search pattern are found.
- the creation of the search pattern may also include a modification of existing data or an existing search pattern.
- One development is that the pattern is created through a graphical user interface.
- the graphical user interface may, for example, include a graphical editor.
- the search pattern is created by means of a two- or three-dimensional scanner and / or by means of at least one camera.
- movement or interaction of the user with the machine can be recorded and converted into a suitable modification of the search pattern.
- the user can virtually model data by means of a camera and / or by means of a scanner and thus adapt the graphic search pattern according to his ideas.
- the search pattern is created based on data of the data set or on other data and / or based on at least one other search pattern.
- search pattern is converted into at least one query by the graphical representation of the search pattern in rules, conditions
- the method is applied iteratively, with the search pattern being created (for example, varied) in each iteration step.
- the query is applied to a dataset and at least one hit in the dataset is determined.
- the hits may have a predetermined similarity, for example a minimum level of similarity, with the search pattern.
- the predetermined action comprises at least one message, display and/or alert.
- an automated alerting may occur in the detection of certain trends and constellations for detecting malfunctions of a network, or the like.
- One embodiment is that the query is applied to a data set and the hits that best match the search pattern are determined.
- the search pattern is converted into at least one query, wherein the search pattern is scaled and / or normalized.
- the normalization may include a compression and / or extension of the time interval.
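- as a minimal sketch of such scaling and normalization, assuming the search pattern is a plain list of equally spaced numeric samples: the pattern is first compressed or extended to a target number of points by linear interpolation and then scaled to zero mean and unit variance. The function names `resample`, `z_normalize` and `normalize_pattern` are hypothetical and do not come from the patent, which does not prescribe a particular normalization.

```python
from statistics import mean, pstdev


def resample(values, target_len):
    """Compress or extend a series to target_len points (linear interpolation)."""
    if target_len < 2 or len(values) < 2:
        raise ValueError("need at least two points")
    step = (len(values) - 1) / (target_len - 1)
    out = []
    for i in range(target_len):
        pos = i * step
        lo = int(pos)
        hi = min(lo + 1, len(values) - 1)
        frac = pos - lo
        out.append(values[lo] * (1 - frac) + values[hi] * frac)
    return out


def z_normalize(values):
    """Scale a series to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A flat series carries no shape information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def normalize_pattern(pattern, target_len):
    """Hypothetical helper: adapt the time interval, then the value scale."""
    return z_normalize(resample(pattern, target_len))
```

Normalizing both the pattern and each candidate section of the data in the same way would make the comparison independent of absolute level and duration; whether that is desirable depends on the question being asked.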
- a next embodiment is that properties of the search pattern are extracted, wherein these extracted properties are at least partially represented as variable parameters.
- the search pattern and/or the query is used as an objective function for a machine learning method.
- the search pattern can be used to search for similar patterns, to mark areas (clusters) in the data set (or a part thereof).
- the search pattern is convertible into at least one query
- the query is applicable to a dataset.
- the above object is also achieved by means of a system comprising at least one such device.
- the statements made above concerning the method also apply correspondingly to the device and the system.
- the presented solution further comprises a computer program product, directly loadable into a memory of a digital computer, comprising program code portions which are suitable to carry out steps of the method described herein.
- a computer-readable storage medium e.g. any memory comprising computer-executable instructions (e.g., in the form of program code) adapted for the computer to perform steps of the method described herein.
- FIG. 1A shows a data set indicating, over a time axis, the total power provided in megawatt hours (MWh), with a marked section;
- FIG. 1B shows the section of FIG. 1A;
- FIG. 1C shows a graphical search pattern modified by the user by means of a graphical interface, e.g. an editor;
- FIG. 2 shows a schematic flow diagram with steps of the method presented here.
- the present approach proposes creating a search pattern, based on similar patterns from existing data or according to the user's own specifications, e.g. through an editing process.
- This is preferably a graphical search pattern that is processed and / or created, for example, via a graphical editor.
- the specific (edited or adapted) search pattern may (but need not) be identical to the data that the user wants to find in a data set (that is, a variety of data, also referred to as a database).
- the user has the ability to define a subset of existing data by graphical editing (e.g., by dragging a frame).
- the search pattern is preferably a graphical pattern that is entered or modified e.g. via a graphical editor, by means of data points and/or in the form of a freehand drawing. Accordingly, different input options can be realized: for example, the graphical pattern can also be drawn on a sheet of paper and digitized. Two- or higher-dimensional patterns are possible and can be used by the user to make their own search in the data set more efficient.
- the search pattern is entered and/or modified by the user and stored. Such a stored search pattern may be used repeatedly or further modified.
- the input medium for the pattern may be the graphical editor, a graphical user interface, a two- or three-dimensional scanner, a camera, or the like.
- the search pattern can be used to track or achieve at least one of the following goals:
- a measure of a match between search pattern and found data may be determined and output;
- the hits in the data can be displayed sorted according to their match (ie the measure).
- a search pattern that is as representative as possible can be determined by automatically normalizing the search pattern with reference to the data set, in order to increase the frequency of hits within the data set.
- Such normalization may comprise, for example, a compression and/or extension of the time interval.
- Certain properties of the search pattern can be extracted. These extracted properties extend the interface between the machine and the user. The user can, for example, change the properties of the pattern (graphically) and is shown an update of the results.
- Search pattern can be used as an objective function for machine
- the search pattern can be used to search for similar patterns, to mark areas (clusters) in an existing data set.
- the search pattern can be used in monitoring applications (for example as a machine-readable search pattern). For example, automated alerting may take place when certain trends and constellations are identified, in order to detect malfunctions of a network or the like.
- An exemplary question could be: "When and where can solar energy displace existing nuclear energy?"
- FIG. 1A shows a section of the data set, which gives the total power provided in MWh over a time axis. Shown in FIG. 1A are the following forms of energy:
- Wind power 110
- The user himself can graphically determine a suitable search pattern. Alternatively, the user selects a section 109 from the diagram shown in FIG. 1A. This section 109 is shown again in FIG. 1B for illustrative purposes. The section 109 thus corresponds to a sample of the data set which serves as a starting basis for the search pattern.
- FIG. 1C shows a search pattern that has been graphically modified by the user, e.g. with an editor.
- the statement associated with the search pattern reads: "Solar energy 108 displaces nuclear power 102".
- This search pattern can now be used to find results in the dataset that correspond to the statement described above (with a prescribed minimum similarity), and thus to answer the aforementioned exemplary question.
- Not only hits that are identical to the search pattern can be found; partial data having a predetermined minimum similarity with the search pattern can also be identified as hits.
- FIG. 2 shows a schematic flow diagram with steps of the method presented here.
- a search pattern is created, for example based on existing data or existing search patterns.
- the search pattern can also be created without a template.
- the creation may involve a modification or rebuilding.
- a graphical user interface or a graphical input medium can be used for this purpose.
- the search pattern is converted into at least one query and, in a step 203, the at least one query is applied to the data set.
- the hit(s), possibly prioritized, can be output in a next step. Subsequently, the method branches back to step 201.
- in a step 204, it is determined whether (at least) one hit was found in the dataset.
- a hit can be part of the dataset that has a given (minimum) similarity to the search pattern. If this is the case, a predetermined action is performed in step 205, e.g. an alarm is triggered. Subsequently, it is possible to branch back to step 203 or, as shown in FIG. 2, to step 201.
- if no hit is determined in step 204, it is possible to branch to step 203.
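- the flow of FIG. 2 can be sketched roughly as a loop, under the simplifying assumption that every pass branches back to the creation of the search pattern (step 201). The driver `run_search_cycle` and its callable parameters are hypothetical placeholders for the creation of the pattern, its conversion into a query, the application of the query (step 203), the hit check (step 204) and the predetermined action (step 205); they are not named in the patent.

```python
def run_search_cycle(create_pattern, to_query, apply_query, on_hit,
                     max_iterations=3):
    """Hypothetical driver for the loop of FIG. 2 (a sketch, not the patented method)."""
    all_hits = []
    for iteration in range(max_iterations):
        pattern = create_pattern(iteration)  # step 201: create or vary the pattern
        query = to_query(pattern)            # convert the pattern into a query
        hits = apply_query(query)            # step 203: apply the query to the data
        if hits:                             # step 204: was at least one hit found?
            on_hit(hits)                     # step 205: predetermined action, e.g. alarm
            all_hits.extend(hits)
        # Simplification: always branch back to step 201 for the next iteration.
    return all_hits
```

For example, `create_pattern` could return a threshold that is lowered in each iteration, `to_query` could turn it into a predicate, and `on_hit` could append to an alarm log.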
- the interactive search suggested here can be used in conjunction with a visual analytics system.
- a sequence can, for example, look as follows, or comprise at least part of the following:
- evaluation of data and presentation of results; interaction with the evaluation / representation of the data.
- the data are based on the data set (database) or on existing search patterns.
- search pattern eg by means of graphical input.
- the search pattern can be entered via any graphical user interface.
- rules are defined, which are implemented in the form of a query by the machine, for example.
- the data or the search patterns are displayed in an appropriate form. The user can receive different interaction possibilities, eg for selecting, for changing the search pattern, for drawing new search patterns.
- the pattern is converted ("translated") by the machine into information describing the search pattern. This information is used to perform a machine search.
- a similarity measure is selected which allows data similar to the search pattern to be found in the dataset. Examples of similarity measures are: Pearson coefficient, cosine similarity, etc.
- the similarities are calculated (eg distances between search pattern and data volume data).
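- the two similarity measures named above can be written down in a few lines. The following is a sketch using only the Python standard library, assuming the search pattern and the data section are equal-length lists of numbers; the function names are illustrative. The Pearson coefficient is computed here as the cosine similarity of the mean-centered series, and a constant series is assigned a similarity of 0.0 by convention (an assumption of this sketch, since the coefficient is undefined in that case).

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (range -1..1)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention of this sketch for zero vectors
    return dot / (norm_a * norm_b)


def pearson(a, b):
    """Pearson correlation coefficient via cosine similarity of centered series."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return cosine_similarity([x - mean_a for x in a],
                             [y - mean_b for y in b])
```

The Pearson coefficient ignores offset and scale of the values, while cosine similarity ignores scale only; which behavior is wanted depends on whether the absolute level of the data is part of the pattern.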
- the N most similar hits can be grouped.
- the hits are sorted according to their quality (eg similarity); For example, the best matches will be displayed first.
- a threshold value (alarm value) can be specified.
- a distribution of the found similar hits is calculated.
- Processing the results:
  a) The hits (patterns) are displayed.
  b) Optionally, the groups (clusters) of the hits can be displayed.
  c) A distribution of the hits found in the relevant or in given dimensions is displayed.
  d) A ranking of the hits can be displayed as a heatmap, optionally using one color scale per grouping.
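- the ranking and threshold steps of the sequence above could look roughly as follows. This is a sketch under stated assumptions: the data set is a one-dimensional list, candidate hits are all sliding windows of the pattern's length, and similarity is derived from the Euclidean distance as 1 / (1 + distance). The names `rank_hits` and `alarm_needed` and this particular similarity formula are illustrative choices, not taken from the patent.

```python
from math import dist


def rank_hits(data, pattern, top_n=3, min_similarity=0.0):
    """Return up to top_n (start_index, similarity) pairs, best match first."""
    m = len(pattern)
    scored = []
    for start in range(len(data) - m + 1):
        window = data[start:start + m]
        # Distance 0 gives similarity 1.0; larger distances approach 0.
        similarity = 1.0 / (1.0 + dist(window, pattern))
        if similarity >= min_similarity:
            scored.append((similarity, start))
    # Sort by descending similarity; ties are broken by earlier position.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(start, similarity) for similarity, start in scored[:top_n]]


def alarm_needed(hits, alarm_value):
    """True if any hit reaches the specified threshold (alarm value)."""
    return any(similarity >= alarm_value for _, similarity in hits)
```

The returned start indices can be used to display the hits in the original diagram, and the similarity values could feed a heatmap-style ranking.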
- a heat map is a diagram for visualizing data in which the dependent values of a two-dimensional domain are represented as colors. Its purpose is to allow significant values to be grasped quickly and intuitively in a large amount of data (compare:
- One advantage is the improved interaction between the user and the system.
- the user has the option of determining a search query for a large amount of data using a visual search pattern.
- This allows the flexible implementation of a "visual query", which is automatically converted by the machine into a query on the data set.
- the user can flexibly and intuitively specify his ideas for the search query; the visual two-dimensional or three-dimensional (possibly also colored) description of the search pattern provides, for example, a powerful search tool.
- the presented approach is suitable for a large number of applications, e.g. monitoring large amounts of data, or raising an alarm when complex scenarios occur or before they actually occur.
- the search patterns may be space-time data of a geo-database.
- it is proposed to find at least one hit in a large amount of data based on a graphical search pattern, wherein the graphical search pattern is newly created or modified by a user, preferably via a graphical interface.
- the user can intuitively implement complex searches and specifically use a graphical representation of properties and / or relationships for the search.
- the invention can be used for example in data mining, in the monitoring of conditions or in automated alerting.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Evolutionary Computation (AREA)
- Computing Systems (AREA)
- Medical Informatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Artificial Intelligence (AREA)
- Human Computer Interaction (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| DE102012208999A DE102012208999A1 (de) | 2012-05-29 | 2012-05-29 | Bearbeitung einer Datenmenge |
| PCT/EP2013/056203 WO2013178376A2 (de) | 2012-05-29 | 2013-03-25 | Bearbeitung einer datenmenge |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| EP2810192A2 (de) | 2014-12-10 |
Family
ID=48139888
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP13716973.6A Ceased EP2810192A2 (de) | 2012-05-29 | 2013-03-25 | Bearbeitung einer datenmenge |
Country Status (6)
| Country | Link |
|---|---|
| US (1) | US10191938B2 (de) |
| EP (1) | EP2810192A2 (de) |
| CN (1) | CN104321770A (de) |
| DE (1) | DE102012208999A1 (de) |
| IN (1) | IN2014DN07790A (de) |
| WO (1) | WO2013178376A2 (de) |
Families Citing this family (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2015034759A1 (en) * | 2013-09-04 | 2015-03-12 | Neural Id Llc | Pattern recognition system |
| EP3001360A1 (de) * | 2014-09-29 | 2016-03-30 | Siemens Aktiengesellschaft | Rechenvorrichtung und verfahren zur verarbeitung grosser datenmengen |
Family Cites Families (16)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5608899A (en) * | 1993-06-04 | 1997-03-04 | International Business Machines Corporation | Method and apparatus for searching a database by interactively modifying a database query |
| US7469226B2 (en) * | 2001-12-11 | 2008-12-23 | Recognia Incorporated | Method of providing a financial event identification service |
| US7664830B2 (en) | 2002-03-29 | 2010-02-16 | Sony Corporation | Method and system for utilizing embedded MPEG-7 content descriptions |
| WO2003088085A1 (en) | 2002-04-04 | 2003-10-23 | Arizona Board Of Regents | Three-dimensional digital library system |
| JP4516957B2 (ja) * | 2003-01-25 | 2010-08-04 | パーデュー リサーチ ファンデーション | 3次元オブジェクトについて検索を行なうための方法、システムおよびデータ構造 |
| US7328220B2 (en) | 2004-12-29 | 2008-02-05 | Lucent Technologies Inc. | Sketch-based multi-query processing over data streams |
| EP1901241A1 (de) * | 2006-09-06 | 2008-03-19 | Kba-Giori S.A. | Verfahren zur Qualitätskontrolle eines gedruckten Dokuments auf der Basis von Mustererkennung |
| JP2008146603A (ja) * | 2006-12-13 | 2008-06-26 | Canon Inc | 文書検索装置、文書検索方法、プログラム及び記憶媒体 |
| CN101093559B (zh) | 2007-06-12 | 2010-06-23 | 北京科技大学 | 一种基于知识发现的专家系统构造方法 |
| US8165406B2 (en) * | 2007-12-12 | 2012-04-24 | Microsoft Corp. | Interactive concept learning in image search |
| JP5268595B2 (ja) * | 2008-11-28 | 2013-08-21 | ソニー株式会社 | 画像処理装置、画像表示方法及び画像表示プログラム |
| US9405772B2 (en) * | 2009-12-02 | 2016-08-02 | Google Inc. | Actionable search results for street view visual queries |
| US8640079B2 (en) * | 2010-03-02 | 2014-01-28 | Cadence Design Systems, Inc. | Method and system for searching and replacing graphical objects of a design |
| US9165029B2 (en) * | 2011-04-12 | 2015-10-20 | Microsoft Technology Licensing, Llc | Navigating performance data from different subsystems |
| US20120283574A1 (en) * | 2011-05-06 | 2012-11-08 | Park Sun Young | Diagnosis Support System Providing Guidance to a User by Automated Retrieval of Similar Cancer Images with User Feedback |
| US8553981B2 (en) * | 2011-05-17 | 2013-10-08 | Microsoft Corporation | Gesture-based visual search |
-
2012
- 2012-05-29 DE DE102012208999A patent/DE102012208999A1/de not_active Withdrawn
-
2013
- 2013-03-25 WO PCT/EP2013/056203 patent/WO2013178376A2/de not_active Ceased
- 2013-03-25 US US14/404,322 patent/US10191938B2/en not_active Expired - Fee Related
- 2013-03-25 CN CN201380028056.1A patent/CN104321770A/zh active Pending
- 2013-03-25 IN IN7790DEN2014 patent/IN2014DN07790A/en unknown
- 2013-03-25 EP EP13716973.6A patent/EP2810192A2/de not_active Ceased
Non-Patent Citations (1)
| Title |
|---|
| None * |
Also Published As
| Publication number | Publication date |
|---|---|
| WO2013178376A2 (de) | 2013-12-05 |
| IN2014DN07790A (de) | 2015-05-15 |
| US10191938B2 (en) | 2019-01-29 |
| CN104321770A (zh) | 2015-01-28 |
| US20150339345A1 (en) | 2015-11-26 |
| DE102012208999A1 (de) | 2013-12-05 |
| WO2013178376A3 (de) | 2014-06-26 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PUAI | Public reference made under article 153(3) EPC to a published international application that has entered the European phase | Free format text: ORIGINAL CODE: 0009012 |
| | 17P | Request for examination filed | Effective date: 20140905 |
| | AK | Designated contracting states | Kind code of ref document: A2. Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
| | AX | Request for extension of the European patent | Extension state: BA ME |
| | DAX | Request for extension of the European patent (deleted) | |
| | RAP1 | Party data changed (applicant data changed or rights of an application transferred) | Owner name: SIEMENS AKTIENGESELLSCHAFT |
| | 17Q | First examination report despatched | Effective date: 20180320 |
| | REG | Reference to a national code | Ref country code: DE. Ref legal event code: R003 |
| | STAA | Information on the status of an EP patent application or granted EP patent | Free format text: STATUS: THE APPLICATION HAS BEEN REFUSED |
| | 18R | Application refused | Effective date: 20191014 |