EP1118060A1 - Method and device for assigning an object to at least one class - Google Patents

Method and device for assigning an object to at least one class

Info

Publication number
EP1118060A1
EP1118060A1 EP99955709A EP99955709A EP1118060A1 EP 1118060 A1 EP1118060 A1 EP 1118060A1 EP 99955709 A EP99955709 A EP 99955709A EP 99955709 A EP99955709 A EP 99955709A EP 1118060 A1 EP1118060 A1 EP 1118060A1
Authority
EP
European Patent Office
Prior art keywords
class
classes
measure
evaluation
threshold value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP99955709A
Other languages
German (de)
English (en)
Inventor
Thomas BRÜCKNER
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Siemens AG
Original Assignee
Siemens AG
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Siemens AG
Publication of EP1118060A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35: Clustering; Classification
    • G06F16/353: Clustering; Classification into predefined classes

Definitions

  • The invention relates to a method and an arrangement for assigning an object to at least one class from a set of predetermined classes by means of a computer.
  • A method and an arrangement for classifying a text are known from [1].
  • An object is assigned to one or more classes (multi-classification system) by determining a membership measure of the object for each class and comparing it with the threshold value associated with that class. If the membership measure for a class is greater than the threshold value of that class, the object is assigned to the class. The disadvantage is that the threshold values of all classes are predetermined globally, which leads to an inaccurate classification.
  • The object of the invention is to enable a classification in which specific threshold values are determined automatically for several classes.
  • A method for assigning an object to at least one class from a set of predetermined classes by a computer is specified, in which a membership measure of the object is determined for each class.
  • A threshold value is calculated for each class from the set of classes by optimizing an evaluation measure under specified secondary conditions (constraints).
  • The object is assigned to a class from the set of predetermined classes if the membership measure is above the associated threshold value of the class.
  • The object is not assigned to a class from the set of predetermined classes if the membership measure is below the threshold value of the class.
  • A further development consists in the evaluation measure depending on the threshold values of the classes. It is advantageous here that the threshold values enter the evaluation measure directly or indirectly.
  • The evaluation measure comprises one of the following specifications: a) number of errors; b) recognition rate (recall); c) detection rate (precision).
  • The evaluation measure can be based on certain characteristics of the object to be classified. In particular, it is useful to take the classification error into account in the evaluation measure and to optimize it with regard to certain specifications.
  • The evaluation measure is provided with a condition that requires at least a predetermined value for the evaluation measure. In this way one of the above specifications can be required to reach a certain value, and this requirement is taken into account when optimizing the evaluation measure under the specified secondary conditions.
  • The secondary conditions are preferably formulated such that:
  • N denotes the number of classes,
  • M denotes the number of threshold values T_j, and k_ij can only assume the values 0 or 1.
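
The constraint equation itself does not appear in this text. As a hedged sketch only, assume a binary variable k_{ij} that equals 1 exactly when threshold value T_j is selected for class i (this index layout is an assumption, not taken from the source). A constraint consistent with the later statement that only one result per class enters the evaluation measure would then read:

```latex
\sum_{j=1}^{M} k_{ij} = 1 \quad (i = 1, \dots, N), \qquad k_{ij} \in \{0, 1\}
```

Under this reading, each class selects exactly one of the M candidate threshold values.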
  • One embodiment consists in optimizing the evaluation measure under the secondary conditions by solving a linear optimization problem (here: the evaluation measure with the specified secondary conditions and possibly additional secondary conditions) by means of an LP solver (see [2]).
  • One embodiment consists in using the method for text classification. In text classification in particular, it is customary to assign a given text to different (thematic) classes, so-called domains. Naturally, a given text can be assigned to several domains. Whether the assignment is made is decided for each class by comparing the value determined for the text by means of an evaluation measure with the predetermined threshold value of the domain (class). For the text, there is a membership measure for each class; the assignment is made if the membership measure is above the threshold value of the respective class.
  • The threshold values for the classes have been predetermined in particular using the above-mentioned method.
  • An optimal threshold value is determined for each class.
  • A "microaveraged" evaluation is carried out on the basis of the threshold values determined for the classes. This is discussed in detail in the exemplary embodiment.
  • An arrangement for assigning an object to at least one class from a set of predetermined classes is also specified, in which a processor unit is provided which is set up in such a way that
  • a) a membership measure of the object can be determined for each class;
  • b) a threshold value can be calculated for each class from the set of classes by optimizing an evaluation measure under specified secondary conditions;
  • c) the object can be assigned to a class from the set of predetermined classes if the membership measure lies above the threshold value of the class;
  • d) the object is not assigned to a class from the set of predetermined classes if the membership measure is below the threshold value of the class.
  • Fig. 1 is a block diagram with steps of a method for assigning an object to at least one class from a set of predetermined classes.
  • A membership measure of the object is determined for each class. It indicates whether the object can be assigned to the respective class. The assignment is generally made when the membership measure exceeds a predefined threshold value (for the class).
  • To determine a class-dependent threshold value, that is to say a threshold value for each class, an evaluation measure is optimized in step 102 under predetermined secondary conditions, the evaluation measure depending on the threshold values of the classes. The optimization yields the threshold values for the classes, specifically one threshold value per class.
  • Step 103 checks whether the membership measure is greater than the respective class-specific threshold value.
  • If so, the object is assigned to the respective class (step 104); otherwise no assignment to this class is made (cf. step 105).
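
The comparison in steps 103 to 105 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `assign_classes`, the topic names and all numeric values are hypothetical, and the thresholds are simply given rather than optimized.

```python
from typing import Dict, List


def assign_classes(membership: Dict[str, float],
                   thresholds: Dict[str, float]) -> List[str]:
    """Return every class whose membership measure exceeds its own threshold."""
    assigned = []
    for cls, score in membership.items():
        # Step 103: compare against the class-specific threshold.
        if score > thresholds[cls]:
            assigned.append(cls)  # step 104: assign the object to this class
        # Step 105: otherwise the object is not assigned to this class.
    return assigned


# Hypothetical membership measures and thresholds for one newspaper text.
membership = {"sports": 0.82, "politics": 0.40, "business": 0.55}
thresholds = {"sports": 0.70, "politics": 0.45, "business": 0.50}
print(assign_classes(membership, thresholds))  # ['sports', 'business']
```

Because every class carries its own threshold, the same membership value can lead to an assignment in one class and a rejection in another.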
  • Alternatively, a deviation measure can be used instead of the membership measure; the deviation measure is to be understood simply as the negated formulation.
  • Fig. 2 shows a table with dimensions for one
  • Text classification can serve as an example of assigning an object to at least one class. Predefined texts are assigned to different classes (domains), each class mostly corresponding to one subject area. A concrete realization consists in assigning newspaper texts to one or more topics, e.g. sports, literature, politics and/or business. As mentioned above, the evaluation measure is optimized under given secondary conditions. The evaluation measure itself can include certain specifications. Some possible specifications are explained in more detail below using the table in FIG. 2. Fields 201 to 204 show possible classification states. Field 201 "a" contains all assignments that the system has made automatically and correctly
  • Field 202 "b" contains the number of all assignments classified as correct by the system which actually (according to
  • Field 203 "c" indicates the number of assignments that the system has classified as incorrect, but which in reality would have been correct.
  • Field 204 "d" contains all incorrect assignments that the system has classified as incorrect.
  • The recall rate is defined as the number of correct (recognized) assignments divided by the number of possible assignments:
  • The detection rate (precision) is determined by the number of correct assignments divided by the number of all automatic assignments:
  • The fallout of the system is determined by
  • The specifications given in equations (2) to (5) are suitable for specifying the classification quality in the form of a suitable evaluation measure Q.
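
The equations themselves are not reproduced in this text. The sketch below computes the measures from the four table fields using their conventional definitions, which match the verbal definitions above for recall and precision. The fallout formula b / (b + d) and the error count b + c are the standard forms and are assumptions here, as are the class name `Fields`, the zero-denominator convention and the example counts.

```python
from dataclasses import dataclass


@dataclass
class Fields:
    a: int  # field 201: assigned by the system, and correct
    b: int  # field 202: assigned by the system, but incorrect
    c: int  # field 203: rejected by the system, but would have been correct
    d: int  # field 204: rejected by the system, and indeed incorrect


def _ratio(num: int, den: int) -> float:
    # Assumed convention: an empty denominator yields 0.0.
    return num / den if den else 0.0


def recall(n: Fields) -> float:
    """Correct assignments divided by all possible (true) assignments."""
    return _ratio(n.a, n.a + n.c)


def precision(n: Fields) -> float:
    """Correct assignments divided by all automatic assignments."""
    return _ratio(n.a, n.a + n.b)


def fallout(n: Fields) -> float:
    """Wrong assignments divided by all cases that should be rejected."""
    return _ratio(n.b, n.b + n.d)


def errors(n: Fields) -> int:
    """Number of errors: wrong assignments plus missed assignments."""
    return n.b + n.c


n = Fields(a=40, b=10, c=20, d=130)  # hypothetical counts
print(recall(n), precision(n), fallout(n), errors(n))
```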
  • The evaluation measure can be determined directly across all classes ("microaveraged" evaluation measure):
  • N denotes the number of classes k.
  • the evaluation measure is first determined individually for each class and then averaged over all classes ("macroaveraged" evaluation measure):
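
The difference between the two averaging schemes can be illustrated with precision as the evaluation measure. This is a sketch under assumptions: the source does not state which measure its averaging equations use, and the function names and counts below are hypothetical.

```python
from typing import Sequence, Tuple

# (a, b) per class: correct assignments and wrong assignments by the system.
ClassCounts = Tuple[int, int]


def micro_precision(counts: Sequence[ClassCounts]) -> float:
    """Pool the counts of all classes first, then compute one ratio."""
    total_a = sum(a for a, _ in counts)
    total_b = sum(b for _, b in counts)
    return total_a / (total_a + total_b)


def macro_precision(counts: Sequence[ClassCounts]) -> float:
    """Compute the ratio per class first, then average over the N classes."""
    per_class = [a / (a + b) for a, b in counts]
    return sum(per_class) / len(per_class)


# One large, well-classified class and one small, poorly classified class.
counts = [(90, 10), (1, 3)]
print(micro_precision(counts))  # 0.875
print(macro_precision(counts))
```

In this example the microaveraged value is dominated by the large class, while the macroaveraged value weights both classes equally and is therefore pulled down by the small one.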
  • threshold value ensures a sufficiently high quality of the assignment (classification) for all classes.
  • equation (8) does not work. A set of threshold values is sought here
  • The best set of threshold values is determined by formulating a linear optimization problem and solving it by linear programming (see LP solver). Using a training set of objects whose classification is known, an evaluation is carried out with M different threshold values. In the following, the result of class k_i at threshold value T_j enters into the evaluation measure Q.
  • The linear optimization problem to be solved is formulated as follows:
  • the evaluation measure Q only receives one result per class (k x -
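
The selection of one threshold per class can be sketched as below. Note the substitutions: the source solves the problem with an LP solver, whereas this sketch uses plain exhaustive search over all combinations, which is only practical for small N and M. The objective (total number of errors) and the constraint (a minimum microaveraged recall) are one possible choice and are assumptions, as are the function name and the training counts. Both quantities are linear in the binary selection variables, because the number of true members of each class does not depend on the threshold.

```python
from itertools import product
from typing import Optional, Sequence, Tuple

Counts = Tuple[int, int, int]  # (a, b, c) for one class at one candidate threshold


def select_thresholds(table: Sequence[Sequence[Counts]],
                      min_recall: float) -> Optional[Tuple[int, ...]]:
    """Pick one threshold index per class, minimising total errors b + c
    while keeping microaveraged recall at or above min_recall."""
    best_choice, best_errors = None, None
    # Each choice fixes exactly one threshold index j for every class i.
    for choice in product(*(range(len(row)) for row in table)):
        picked = [table[i][j] for i, j in enumerate(choice)]
        a = sum(p[0] for p in picked)
        c = sum(p[2] for p in picked)
        errors = sum(p[1] + p[2] for p in picked)
        if a + c == 0 or a / (a + c) < min_recall:
            continue  # violates the recall constraint
        if best_errors is None or errors < best_errors:
            best_choice, best_errors = choice, errors
    return best_choice


# Hypothetical training results for N = 2 classes at M = 3 thresholds.
candidates = [0.3, 0.5, 0.7]
table = [
    [(10, 8, 0), (8, 3, 2), (5, 0, 5)],     # class 0, 10 true members
    [(19, 6, 1), (16, 2, 4), (10, 1, 10)],  # class 1, 20 true members
]
choice = select_thresholds(table, min_recall=0.85)
print([candidates[j] for j in choice])  # [0.5, 0.3]
```

With these counts the two classes end up with different thresholds, which is the point of determining them per class rather than globally.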
  • A processor unit PRZE is shown in FIG.
  • The processor unit PRZE comprises a processor CPU, a memory SPE and an input/output interface IOS, which is used in different ways via an interface IFC: via a graphics interface, output is displayed on a monitor MON and/or printed on a printer PRT. Input is made via a mouse MAS or a keyboard.
  • The processor unit PRZE also has a data bus BUS, which connects a memory MEM, the processor CPU and the input/output interface IOS.
  • Additional components can be connected to the data bus BUS, for example additional memory, data storage (hard disk) or a scanner.

Abstract

An object is assigned to at least one class from a set of classes. For each class, a membership measure of the object with respect to that class is determined. Class-specific threshold values are determined for the assignment by optimizing an evaluation measure under given secondary conditions. Comparing the membership measure with the threshold values of the individual classes yields the corresponding assignment of the object to at least one class.
EP99955709A 1998-09-30 1999-09-14 Method and device for assigning an object to at least one class Withdrawn EP1118060A1 (fr)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
DE19844948 1998-09-30
PCT/DE1999/002929 WO2000019335A1 (fr) 1998-09-30 1999-09-14 Method and device for assigning an object to at least one class

Publications (1)

Publication Number Publication Date
EP1118060A1 2001-07-25

Family

ID=7882866

Family Applications (1)

Application Number Title Priority Date Filing Date
EP99955709A Withdrawn EP1118060A1 (fr) 1998-09-30 1999-09-14 Method and device for assigning an object to at least one class

Country Status (3)

Country Link
US (1) US20020007381A1 (fr)
EP (1) EP1118060A1 (fr)
WO (1) WO2000019335A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013034450A1 (fr) 2011-09-08 2013-03-14 Continental Automotive Gmbh Fuel injector and fuel injector assembly

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8850154B2 (en) * 2007-09-11 2014-09-30 2236008 Ontario Inc. Processing system having memory partitioning
US8904400B2 (en) * 2007-09-11 2014-12-02 2236008 Ontario Inc. Processing system having a partitioning component for resource partitioning

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5299284A (en) * 1990-04-09 1994-03-29 Arizona Board Of Regents, Acting On Behalf Of Arizona State University Pattern classification using linear programming
US5675710A (en) * 1995-06-07 1997-10-07 Lucent Technologies, Inc. Method and apparatus for training a text classifier
US5765029A (en) * 1996-05-08 1998-06-09 Xerox Corporation Method and system for fuzzy image classification
US6246787B1 (en) * 1996-05-31 2001-06-12 Texas Instruments Incorporated System and method for knowledgebase generation and management
US6317509B1 (en) * 1998-02-11 2001-11-13 Analogic Corporation Computed tomography apparatus and method for classifying objects
US6192360B1 (en) * 1998-06-23 2001-02-20 Microsoft Corporation Methods and apparatus for classifying text and for building a text classifier

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO0019335A1 *

Also Published As

Publication number Publication date
US20020007381A1 (en) 2002-01-17
WO2000019335A1 (fr) 2000-04-06

Legal Events

Date Code Title Description
PUAI Public reference made under Article 153(3) EPC to a published international application that has entered the European phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20010117

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20030401