EP1118060A1 - Method and device for assigning an object to at least one class
- Publication number: EP1118060A1
- Application number: EP99955709A
- Authority: EP (European Patent Office)
- Prior art keywords: class, classes, measure, evaluation, threshold value
- Prior art date
- Legal status: Withdrawn (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
- G06F16/353—Clustering; Classification into predefined classes
Definitions
- the invention relates to a method and an arrangement for assigning an object to at least one class from a set of predetermined classes by a computer.
- a method and an arrangement for classifying a text are known from [1].
- an object is assigned to one or more classes (multi-classification system) by determining a membership measure for the object and comparing it with an associated threshold value for each class. If the membership measure for the respective class is greater than the threshold value of this class, the object is assigned to the class. It is disadvantageous here that the threshold values of all classes are predetermined globally and thus an inaccurate classification takes place.
- the object of the invention is to enable a classification, wherein specific threshold values are automatically determined for several classes.
- a method for assigning, by a computer, an object to at least one class from a set of predetermined classes is specified, in which a membership measure of the object to the class is determined for each class.
- a threshold value is calculated for each class from the set of classes by optimizing an evaluation measure under specified constraints.
- the object is assigned to a class from the set of specified classes if the membership measure is above the associated threshold value of the class.
- an object is not assigned to a class from the several predetermined classes if the membership measure is below the threshold value of the class.
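The assignment rule described above can be sketched as follows. This is a minimal illustration, not the implementation of the patent: the function name `assign_classes`, the dictionary layout, and the example scores and thresholds are assumptions made for this example.

```python
from typing import Dict, List


def assign_classes(membership: Dict[str, float],
                   thresholds: Dict[str, float]) -> List[str]:
    """Return every class whose membership measure lies above its own threshold."""
    assigned = []
    for cls, score in membership.items():
        # Each class is compared with its class-specific threshold,
        # not with one global value.
        if score > thresholds[cls]:
            assigned.append(cls)
    return assigned


# Hypothetical membership measures of one text and per-class thresholds.
scores = {"sports": 0.72, "politics": 0.41, "business": 0.55}
limits = {"sports": 0.60, "politics": 0.50, "business": 0.50}
print(assign_classes(scores, limits))  # ['sports', 'business']
```

Because every class has its own threshold, one object can end up in several classes or in none.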
- a further development consists in that the evaluation measure depends on the threshold values of the classes. In this case it is advantageous that the threshold values are included directly or indirectly in the evaluation measure.
- the evaluation measure comprises one of the following specifications: a) number of errors; b) recognition rate (recall); c) detection rate of the system (precision).
- the evaluation measure can be based on certain peculiarities of the object to be classified. In particular, it is useful to take the classification error into account in the evaluation measure and to optimize it with regard to certain specifications.
- the evaluation measure is provided with a condition that requires at least a predetermined value for the evaluation measure. In this way, one of the above specifications can be required to reach a certain value, and this requirement is taken into account when optimizing the evaluation measure under the specified secondary conditions.
- the secondary conditions are preferably formulated such that:
- N is the number of classes
- M denotes the number of threshold values T_j, and k_ij can only assume the values 0 or 1.
- One embodiment also consists in the fact that the evaluation measure is optimized under the secondary conditions by solving a linear optimization problem (here: the evaluation measure with the specified secondary conditions and possibly additional secondary conditions) by means of an LP solver (see [2]).
- One embodiment also consists in the fact that the method is used for text classification. Especially in text classification, it is customary to assign a given text to different (thematic) classes, so-called domains. Naturally, a given text can be assigned to several domains. The decision as to whether the assignment is made or not is made for each class by comparing the value determined for the text by means of an evaluation measure with the predetermined threshold value of the domain (class). For the text, there is a membership measure for each class; the assignment is made if the membership measure is above the threshold value of the respective class.
- the threshold values for the class have been predetermined in particular using the above-mentioned method.
- an optimal threshold value is determined for each class.
- a "microaveraged" evaluation is carried out on the basis of the threshold values determined for the classes. This will be discussed in detail within the scope of the exemplary embodiment.
- an arrangement for assigning an object to at least one class from a set of predetermined classes is also specified, in which a processor unit is provided which is set up in such a way that
- a) a membership measure of the object to the class can be determined for each class;
- b) a threshold value can be calculated for each class from the set of classes by optimizing an evaluation measure under specified secondary conditions;
- c) the object can be assigned to a class from the set of predetermined classes if the membership measure lies above the threshold value of the class;
- d) the object is not assigned to a class from the set of predetermined classes if the membership measure is below the threshold value of the class.
- Fig. 1 is a block diagram with steps of a method for assigning an object to at least one class from a set of predetermined classes
- a membership measure is determined for the object for each class. This should provide information about whether the object can be assigned to the respective class. The assignment is generally made when the membership measure exceeds a predefined threshold value (for the class).
- to determine a class-dependent threshold value, that is to say a threshold value for each class, an evaluation measure is optimized in step 102 under predetermined secondary conditions, the evaluation measure depending on the threshold values of the classes. The optimization results in threshold values for the classes, specifically one threshold value for each class.
- Step 103 checks whether the membership measure is greater than the respective class-specific threshold value.
- if so, the object is assigned to the respective class in step 104; otherwise no assignment to this class is made (cf. step 105).
- a measure of deviation can alternatively be used instead of the membership measure, the measure of deviation being understood simply as the negated formulation.
- Fig. 2 shows a table with dimensions for one
- Text classification can be used to assign the object to at least one class. Predefined texts are assigned to different classes (domains), with each class mostly belonging to one subject area. A concrete realization consists in the assignment of newspaper texts to one or more topics, e.g. sports, literature, politics and/or business. As mentioned above, the evaluation measure is optimized under given constraints. The evaluation measure itself can include certain specifications. Some possible specifications are explained in more detail below using the table in FIG. 2. Fields 201 to 204 show possible classification states. The field 201 "a" contains all automatically correctly hit by the system
- Field 202 "b" contains the number of all assignments classified as correct by the system which actually (according to
- Field 203 "c" indicates the number of classifications that the system has assigned as incorrect, but which in reality would have been correct.
- Field 204 "d" includes all incorrect assignments that the system has classified as incorrect.
- Recall rate is defined as the number of correct (recognized) assignments divided by the number of possible assignments:
- a detection rate is determined by the number of correct assignments divided by the number of all automatic assignments:
- a system failure (fallout) is determined by
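The formulas themselves did not survive in this text. Assuming the usual contingency-table reading of fields a to d (a: assignments made by the system that are correct, b: assignments made by the system that are actually incorrect, c: correct assignments that the system rejected, d: incorrect assignments that the system rejected), the verbal definitions above correspond to the expressions below. The original equation numbering is not reproduced, and the fallout expression is the standard definition rather than one recovered from the source.

```latex
\text{recall} = \frac{a}{a + c}, \qquad
\text{precision} = \frac{a}{a + b}, \qquad
\text{fallout} = \frac{b}{b + d}
```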
- the specifications given in equation (2) to equation (5) are suitable for specifying the classification quality in the form of a suitable evaluation measure Q.
- the evaluation measure can be determined directly across all classes ("microaveraged" evaluation measure):
- N denotes the number of classes k.
- the evaluation measure is first determined individually for each class and then averaged over all classes ("macroaveraged" evaluation measure):
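The difference between the two averaging schemes can be illustrated with recall as the evaluation measure. The sketch below rests on assumptions: the `Counts` tuple, the function names, and the example numbers are hypothetical, and recall is taken as a / (a + c) using the table fields a and c.

```python
from typing import List, NamedTuple


class Counts(NamedTuple):
    """Per-class contingency counts, named after table fields a to d."""
    a: int  # assigned by the system, actually correct
    b: int  # assigned by the system, actually incorrect
    c: int  # rejected by the system, actually correct
    d: int  # rejected by the system, actually incorrect


def recall(a: int, c: int) -> float:
    """Correct assignments divided by possible assignments (0.0 if none)."""
    return a / (a + c) if (a + c) else 0.0


def micro_recall(per_class: List[Counts]) -> float:
    """Pool the counts of all classes first, then evaluate once."""
    return recall(sum(k.a for k in per_class), sum(k.c for k in per_class))


def macro_recall(per_class: List[Counts]) -> float:
    """Evaluate each class separately, then average over the N classes."""
    return sum(recall(k.a, k.c) for k in per_class) / len(per_class)


# Hypothetical counts: one large class and one small class.
classes = [Counts(a=90, b=5, c=10, d=895), Counts(a=1, b=0, c=9, d=990)]
print(micro_recall(classes))  # 91 / 110
print(macro_recall(classes))  # (0.9 + 0.1) / 2
```

With these numbers the microaveraged value is dominated by the large class, while the macroaveraged value weights both classes equally.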
- threshold value ensures a sufficiently high quality of the assignment (classification) for all classes.
- equation (8) does not work. A set of threshold values is sought here
- the best set of threshold values is determined by formulating a linear optimization problem and solving it using linear programming (see LP solver). With a training set of objects whose classification is known, an evaluation is carried out with M different threshold values. In the following, the result of class k_i at a threshold value T_j enters the evaluation measure Q.
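The evaluation over M candidate threshold values on a training set can be sketched as below for a single class. The data layout (pairs of membership measure and true label), the helper name `counts_at_threshold`, and the candidate values are assumptions for illustration; the text does not specify them.

```python
from typing import List, Tuple


def counts_at_threshold(samples: List[Tuple[float, bool]],
                        threshold: float) -> Tuple[int, int, int, int]:
    """Count table fields (a, b, c, d) for one class at one threshold.

    Each sample is (membership measure, whether the object truly
    belongs to the class).
    """
    a = b = c = d = 0
    for score, belongs in samples:
        assigned = score > threshold
        if assigned and belongs:
            a += 1      # assigned and actually correct
        elif assigned:
            b += 1      # assigned, but actually incorrect
        elif belongs:
            c += 1      # rejected, but actually correct
        else:
            d += 1      # rejected and actually incorrect
    return a, b, c, d


# Hypothetical training data for one class and M = 3 candidate thresholds.
training = [(0.9, True), (0.7, True), (0.6, False), (0.4, True), (0.2, False)]
candidates = [0.3, 0.5, 0.8]
table = {t: counts_at_threshold(training, t) for t in candidates}
print(table)
```

Repeating this for every class gives one set of counts per class and candidate threshold, from which any of the evaluation measures can be computed.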
- the linear optimization problem to be solved is formulated as follows:
- the evaluation measure Q only receives one result per class (k x -
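The formulation itself is not legible in this text, so the following is only a sketch of the idea under stated assumptions. It assumes that exactly one candidate threshold is selected per class (0/1 selection variables that sum to 1 for each class), that the objective is a sum of per-class terms, and that the only additional condition is a per-class minimum on a second measure. Under those assumptions the problem decomposes per class and no LP solver is needed; conditions that couple the classes, such as a minimum on a microaveraged measure, would require the LP solver the text refers to. All names and numbers are hypothetical.

```python
from typing import Dict, List, Optional


def select_thresholds(q: Dict[str, List[float]],
                      r: Dict[str, List[float]],
                      thresholds: List[float],
                      r_min: float) -> Dict[str, Optional[float]]:
    """Pick one threshold per class.

    q[cls][j] is the measure to maximize and r[cls][j] the constrained
    measure for class cls at thresholds[j]. Choosing exactly one index j
    per class mirrors 0/1 selection variables that sum to 1 per class.
    """
    chosen: Dict[str, Optional[float]] = {}
    for cls in q:
        # Keep only candidates that satisfy the minimum-value condition.
        feasible = [j for j in range(len(thresholds)) if r[cls][j] >= r_min]
        if not feasible:
            chosen[cls] = None  # no candidate meets the condition
            continue
        best = max(feasible, key=lambda j: q[cls][j])
        chosen[cls] = thresholds[best]
    return chosen


# Hypothetical precision (q) and recall (r) values at three thresholds.
ts = [0.3, 0.5, 0.8]
precision = {"sports": [0.60, 0.80, 0.95], "politics": [0.50, 0.70, 0.90]}
recall = {"sports": [0.95, 0.85, 0.40], "politics": [0.90, 0.60, 0.30]}
print(select_thresholds(precision, recall, ts, r_min=0.8))
```

The result is one threshold per class, which may differ between classes, in contrast to a single globally fixed threshold.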
- a processor unit PRZE is shown in FIG.
- the processor unit PRZE comprises a processor CPU, a memory SPE and an input/output interface IOS, which is used in different ways via an interface IFC: via a graphical interface, an output is made visible on a monitor MON and/or output on a printer PRT. An input is made with a mouse MAS or a keyboard.
- the processor unit PRZE also has a data bus BUS, which ensures the connection of a memory MEM, the processor CPU and the input / output interface IOS.
- additional components can be connected to the data bus BUS, for example additional memory, data storage (hard disk) or scanner.
Abstract
An object is assigned to at least one class from a set of classes. For each class, a membership measure of the object to that class is determined. Class-specific threshold values are determined for the assignment by optimizing an evaluation measure under given secondary conditions. Comparing the membership measure with the threshold values of the individual classes yields the corresponding assignment of the object to at least one class.
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DE19844948 | 1998-09-30 | ||
DE19844948 | 1998-09-30 | ||
PCT/DE1999/002929 WO2000019335A1 (fr) | 1998-09-30 | 1999-09-14 | Method and device for assigning an object to at least one class |
Publications (1)
Publication Number | Publication Date |
---|---|
EP1118060A1 (fr) | 2001-07-25 |
Family
ID=7882866
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP99955709A Withdrawn EP1118060A1 (fr) | 1998-09-30 | 1999-09-14 | Method and device for assigning an object to at least one class |
Country Status (3)
Country | Link |
---|---|
US (1) | US20020007381A1 (fr) |
EP (1) | EP1118060A1 (fr) |
WO (1) | WO2000019335A1 (fr) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2013034450A1 (fr) | 2011-09-08 | 2013-03-14 | Continental Automotive Gmbh | Fuel injector and fuel injector assembly |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8850154B2 (en) * | 2007-09-11 | 2014-09-30 | 2236008 Ontario Inc. | Processing system having memory partitioning |
US8904400B2 (en) * | 2007-09-11 | 2014-12-02 | 2236008 Ontario Inc. | Processing system having a partitioning component for resource partitioning |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5299284A (en) * | 1990-04-09 | 1994-03-29 | Arizona Board Of Regents, Acting On Behalf Of Arizona State University | Pattern classification using linear programming |
US5675710A (en) * | 1995-06-07 | 1997-10-07 | Lucent Technologies, Inc. | Method and apparatus for training a text classifier |
US5765029A (en) * | 1996-05-08 | 1998-06-09 | Xerox Corporation | Method and system for fuzzy image classification |
US6246787B1 (en) * | 1996-05-31 | 2001-06-12 | Texas Instruments Incorporated | System and method for knowledgebase generation and management |
US6317509B1 (en) * | 1998-02-11 | 2001-11-13 | Analogic Corporation | Computed tomography apparatus and method for classifying objects |
US6192360B1 (en) * | 1998-06-23 | 2001-02-20 | Microsoft Corporation | Methods and apparatus for classifying text and for building a text classifier |
- 1999-09-14: EP application EP99955709A, published as EP1118060A1, status: withdrawn
- 1999-09-14: WO application PCT/DE1999/002929, published as WO2000019335A1, status: application discontinued
- 2001-03-30: US application US09/821,967, published as US20020007381A1, status: abandoned
Non-Patent Citations (1)
Title |
---|
See references of WO0019335A1 * |
Also Published As
Publication number | Publication date |
---|---|
US20020007381A1 (en) | 2002-01-17 |
WO2000019335A1 (fr) | 2000-04-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP3744068B1 (fr) | Method for automatically composing a phishing document addressed to a specific person | |
EP1665132B1 (fr) | Method and system for detecting data from several computer-readable documents | |
EP0040796B1 (fr) | Method for differentiating between image regions and text or graphic regions on printed originals | |
EP1902407B1 (fr) | System for transmitting data from a document application to a data application | |
DE10317234A1 (de) | Systems and methods for improved accuracy of extracted digital content | |
DE102019211656A1 (de) | Determination of a degree of wear of a tool | |
DE2435889A1 (de) | Method and device for distinguishing groups of characters | |
DE102018215590A1 (de) | Semiconductor device sorting system and semiconductor device | |
EP0788632B1 (fr) | Computerized conversion of tables | |
DE3026055C2 (de) | Circuit arrangement for machine character recognition | |
WO2000019335A1 (fr) | Method and device for assigning an object to at least one class | |
DE102012025350A1 (de) | Processing of an electronic document | |
WO2012017056A1 (fr) | Method and device for automatically processing data in a cell format | |
DE102012210482A1 (de) | Method and system for migrating business process instances | |
DE102019213061A1 (de) | Classification of AI modules | |
DE10034629A1 (de) | Method and system for interleaving OCR and ABL for automatic mail sorting | |
DE102011003156A1 (de) | Map data, storage medium and navigation device | |
DE3128794A1 (de) | Method for locating and delimiting letters and groups of letters or words in text areas of an original which, besides text areas, may also contain graphic and/or image areas | |
DE102020201383A1 (de) | Support system, storage medium and method for displaying relationships between elements | |
DE102014116117A1 (de) | Method and system for mining patterns in a data set | |
DE102009016588A1 (de) | Method for determining text information | |
DE102009053585A1 (de) | System for automatically creating task lists | |
DE102014016676A1 (de) | Method for the computer-aided selection of applicants from a plurality of applicants for a given requirement profile | |
EP4307121A1 (fr) | Computer-implemented method for configuring a virtual test system and training method | |
DE102022128157A1 (de) | Computer-implemented method for standardizing part names |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PUAI | Public reference made under Article 153(3) EPC to a published international application that has entered the European phase | Free format text: ORIGINAL CODE: 0009012 |
 | 17P | Request for examination filed | Effective date: 20010117 |
 | AK | Designated contracting states | Kind code of ref document: A1; Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE |
 | STAA | Information on the status of an EP patent application or granted EP patent | Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN |
 | 18D | Application deemed to be withdrawn | Effective date: 20030401 |