WO2023227186A1 - Pattern recognition and classification

Pattern recognition and classification

Info

Publication number
WO2023227186A1
WO2023227186A1 (application PCT/EP2022/025352)
Authority
WO
WIPO (PCT)
Prior art keywords
pattern
data
items
item
contextual
Prior art date
Application number
PCT/EP2022/025352
Other languages
French (fr)
Inventor
Leo MUCKLEY
Adi Botea
Chahrazed Bouhini
Radhika Loomba
Original Assignee
Eaton Intelligent Power Limited
Priority date
Filing date
Publication date
Application filed by Eaton Intelligent Power Limited
Publication of WO2023227186A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F 16/284 Relational databases
    • G06F 16/285 Clustering or classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A practical item classification method is described. Data may be classified from a dataset of labelled data in the following way. A plurality of target classifications can be established. Data patterns and contextual data items can be obtained from each labelled data item in the labelled data, with a target classification assigned to each labelled data item. One or more pattern items are then established (22) for each labelled data item from the one or more data patterns, using a sliding window to identify pattern items smaller in size than the associated data pattern, and an association between the established pattern item, one or more contextual data items, and a target classification is established (23). The target classification is associated with one or more sets, wherein each set comprises one or more pattern items and one or more contextual data items. Unlabelled items may then be classified. One or more pattern items are established (26) from the one or more data patterns, using a sliding window to identify pattern items smaller in size than the associated data pattern. Established pattern items are matched (27) against a pattern item database comprising stored pattern items and associated contextual data items - matching established pattern items comprises determining a matching accuracy for the established pattern item, wherein determining the matching accuracy comprises determining the presence or absence of contextual items associated with the stored pattern item. If the matching accuracy exceeds a predetermined threshold value, the unlabelled data item is classified according to the classification of the relevant stored pattern item. Suitable systems for implementing such methods are also described.

Description

PATTERN RECOGNITION AND CLASSIFICATION
TECHNICAL FIELD
The disclosure relates to pattern recognition and classification.
BACKGROUND TO DISCLOSURE
In a number of contexts, it is desirable to classify objects in a collection on the basis of features - typically a set of features - characterizing the object. One such context is the organization of objects in a warehouse to enable effective automation of the warehouse (for example, by use of mobile robots). In this case, it may be particularly desirable to organise inventory in such a way that objects with a similar classification will be placed in the same area of the warehouse. There are, however, a wide variety of other contexts in which the same classification problem arises, extending from management of labelled objects to classification of disease.
While machine learning can be used for classification on the basis of a set of features, this is prone to inaccuracy. Insufficient training data, noisy data, and a suboptimal selection of the features and the representation of the contents inside a given feature can all lead to ineffective classification. Often a significant contribution to the complexity of the problem is the amount of data present that is not, in any given case, necessary for classification of that specific item.
It would be desirable to perform classification on the basis of a set of features in a more effective way, in particular by focussing more effectively on the data needed for classification in any given case.
SUMMARY OF DISCLOSURE
In a first aspect, the disclosure provides a computer implemented method of classifying data from a dataset of labelled data comprising: establishing a plurality of target classifications; obtaining one or more data patterns and one or more contextual data items from each labelled data item, and assigning a target classification to each labelled data item; establishing one or more pattern items for each labelled data item from the one or more data patterns, using a sliding window to identify pattern items smaller in size than its associated data pattern, and determining an association between the established pattern item, one or more contextual data items, and a target classification; and associating the target classification with one or more sets wherein each set comprises one or more pattern items and one or more contextual data items. Using such a sliding window approach, discrete pattern items can be recognised effectively in pattern data and associated with contextual data items, resulting in contextual patterns that can be used in a practical classifier.
In embodiments, each data pattern may be a string, and each pattern item a substring. In such a case, using the sliding window to identify pattern items may comprise, for a plurality of windows, establishing a start position and a size for the window, and determining pattern items corresponding to that window. This plurality of windows may consist of all available windows for a specific data pattern, but contextual data items may instead be used to select the available (or useful) windows from all possible windows for the specific data pattern.
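By way of illustration, this window enumeration over a string pattern can be sketched as follows (a minimal Python sketch; the function name, the minimum window size and the sample part number are illustrative assumptions, not part of the disclosure):

    def enumerate_pattern_items(pattern: str, min_size: int = 2):
        """Yield (start, size, substring) for every window smaller than the pattern."""
        for size in range(min_size, len(pattern)):          # strictly smaller than the full pattern
            for start in range(0, len(pattern) - size + 1):
                yield start, size, pattern[start:start + size]

    # Example: all length-3 pattern items of a hypothetical part number
    items = [s for _, size, s in enumerate_pattern_items("AB123-X9") if size == 3]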
In embodiments, the data pattern may be transformed by a predetermined transformation before using the sliding window to establish an alternative or an additional pattern item for the sliding window.
In embodiments, associating the target classification with one or more sets may comprise determining that the likelihood that the association is correct is above a predetermined confidence level.
In a second aspect, the disclosure provides a computer implemented method of classifying an unlabelled data item, wherein the unlabelled data item comprises one or more data patterns and one or more contextual data items, comprising: establishing one or more pattern items from the one or more data patterns, using a sliding window to identify pattern items smaller in size than its associated data pattern; matching established pattern items against a pattern item database comprising stored pattern items and associated contextual data items, wherein matching established pattern items comprises determining a matching accuracy for the established pattern item, wherein determining the matching accuracy comprises determining the presence or absence of contextual items associated with the stored pattern item; and if the matching accuracy exceeds a predetermined threshold value, classifying the unlabelled data item according to the classification of the relevant stored pattern item.
In embodiments, each data pattern may be a string, and each pattern item a substring. In such a case, using the sliding window to identify pattern items may comprise, for a plurality of windows, establishing a start position and a size for the window, and determining pattern items corresponding to that window. This plurality of windows may consist of all available windows for a specific data pattern, but contextual data items may instead be used to select the available (or useful) windows from all possible windows for the specific data pattern. In embodiments, the data pattern may be transformed by a predetermined transformation before using the sliding window to establish an alternative or an additional pattern item for the sliding window.
In a third aspect, the disclosure provides a computer-implemented data classification system, comprising one or more processors and one or more memories, wherein the one or more processors are adapted to: provide a data classifier by performing the method of the first aspect above; and use the data classifier to classify data by performing the method of the second aspect above.
In a fourth aspect, the disclosure provides a method of classifying an unclassified good in a warehouse, comprising: establishing a data classifier for warehouse goods as data items using the method of the first aspect; and using the data classifier to classify an unclassified good using the method of the second aspect.
BRIEF DESCRIPTION OF FIGURES
Embodiments of the disclosure will now be described, by way of example, with reference to the following figures, of which:
Figure 1 illustrates a system suitable for implementing an embodiment of the disclosure in the context of warehouse automation;
Figure 2 illustrates a process for the classification of data according to embodiments of the disclosure;
Figure 3 illustrates in detail a method of using labelled data to establish a data classification according to an embodiment of the disclosure;
Figure 4 shows use of a sliding window for pattern recognition in the method of Figure 3;
Figure 5 illustrates in detail a method of using the data classification established in Figure 3 to classify unlabelled data according to an embodiment of the disclosure; and
Figure 6 shows use of a pattern matching process for use in the unlabelled data classification process of Figure 5.
Figure 1 shows an overall architecture for a system for implementing an embodiment of the disclosure in the context of warehouse automation. A first cloud environment 1, informed by data input sources 2, establishes an inventory data repository 3 that is used for model training. Models are developed by a suitably programmed processor 4, with developed classification models and other related information (context models, training data) stored in a classification database 5 to provide a classification service. The first cloud environment provides access to the classification models by a classification service API 6. In the context of warehouse information, this may for example be used by a warehouse management system 10 in a warehouse 11. Here, this warehouse management system 10 is implemented as an edge server in communication with mobile robots 12 and warehouse worker devices 13. The warehouse management system 10 here uses a cloud warehouse organisation service 14 in a second cloud environment 15 to perform necessary calculations, with the warehouse management system 10 communicating with the second cloud environment 15 through a further API 16. The cloud warehouse organisation service 14 accesses the classification models through the classification service API 6.
It should be noted that this architecture is exemplary and that other arrangements are possible. For example, the first cloud environment and the second cloud environment could be the same environment, or the whole process could be located within the warehouse in which case there would be no need to access cloud systems. However, if the disclosure is implemented in this way to provide a cloud-based classification service, this can of course be used for warehouse automation in any geography, or for purposes other than warehouse automation (as discussed further below).
Figure 2 illustrates a process for the classification of data according to embodiments of the disclosure. This has two main stages, and is in effect two processes, each reflecting a different aspect of the disclosure. The first process 19 is the establishment of the classification. Firstly, labelled data is obtained 21 - each data item is assigned a target classification and it contains data patterns and contextual data. The data patterns are then mined 22 to establish one or more pattern items for the labelled data item - as will be described in detail below, this is achieved by using a sliding window to identify pattern items smaller in size than the data pattern. Associations between pattern items, contextual data and target classifications are established 23 in the form of contextual patterns associated with targets, and these are stored 24 in a contextual pattern database to form a classifying model.
In the second process 20, the classifying model is used to classify unlabelled data. The unlabelled data again comprises data patterns and contextual data, and each data item is input 25 to the classifying model. The data patterns are mined 26 to establish pattern items and contextual data for a data item. The pattern items and contextual data are matched 27 against data in the contextual pattern database to determine whether they can be matched with a classification to a required degree of accuracy. If so, then the unlabelled data item is classified 28 accordingly. Key aspects of these processes are establishment of a classification through pattern recognition, and use of the classification on unlabelled data to classify it. A method to establish a classification is illustrated in Figure 3, and a method of using such a classification on unlabelled data is illustrated in Figure 5.
Figure 3 illustrates the establishment of a classification, and in particular the use of a sliding window pattern recognition process in doing so. The purpose here is to achieve accurate pattern recognition by finding a subset of a sequence (preferably a minimum subset) that can represent that sequence, the representation of the sequence by the subset being validated by training using labelled data.
After the start 300 of the process, the first stage is initialization 320, which can be divided into the following sub-stages. First of all, the labelled data is read 322. Generally, the labelled data comprises one or more pattern features, other relevant features (generally described here as “contextual features”), and a target feature which is to be used as a basis for classification. For example, in the case of warehouse inventory, the pattern features could comprise one or more product codes or serial numbers. Contextual features could be branding information, a vendor name, or product description language. The target feature may be a product type, for example one identified in an existing product classification. In initialisation, a feature set X is defined 324 to contain contextual features X_c 326 and pattern features X_p 328, with the target y being identified 330 for each labelled data item.
For practical training purposes, the labelled data could for example be historical invoice data for warehouse inventory where the classification of the spend has been labelled/verified by a domain expert/user. In this scenario, the most pertinent features for classification could be identified by the user. For instance, the pattern feature/s could be identified as a particular column in the invoice data which contains recurring and inherent patterns, such as a part number or serial number, which can aid in future automated classifications. Additionally, the contextual feature/s could be identified as the column/s which need to be used together with pattern feature/s to make a classification (e.g., patterns that can be recognised as such only when a certain vendor/ledger name is present). Therefore, a relationship can be learned/identified in a single transaction which contains a given pattern feature, a contextual feature and a pre-defined target label in a given invoice.
Contextual features may be used for more than complementing pattern items in identifying a target classification. They may also be used to assist in determination of where the pattern items may be found - in the sliding window method discussed below, contextual data may be used to limit the search space for the pattern items. For example, the function that controls the size and the positions of the sliding window may itself result from learned information. For example, such a function could recommend, in a given context, to limit the search for a pattern to the first half of a string - this may result from detection of vendor information as contextual data, along with observed or otherwise learned knowledge that the first half of the string provided classification data whereas the remainder was random or a counter. The window selection step could be based on a heuristic from a domain expert (e.g., the first half of a part number contains the pattern; the last two digits of a serial number are redundant, etc.) and as such, this could motivate the choice of where to start/end the window or the size of the initial window. Other possibilities are that the window selection could be based on historical classification statistics (e.g., which index of a pattern number has the greatest/lowest variability, etc.) or it could be learned using a learning algorithm (e.g., clustering algorithm).
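A context-driven window selection function of this kind might be sketched as follows (Python; the vendor name and the first-half heuristic are hypothetical examples of learned or expert-supplied rules, not taken from the disclosure):

    def candidate_windows(pattern: str, context: dict):
        """Yield (start, size) pairs, restricting the search space by context.

        Hypothetical heuristic: for vendor "ACME" only the first half of the
        string is informative, so windows are confined to that region.
        """
        limit = len(pattern) // 2 if context.get("vendor") == "ACME" else len(pattern)
        for size in range(2, limit + 1):
            for start in range(0, limit - size + 1):
                yield start, size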
The next stage is the pattern recognition and learning process 340. As noted above, this uses a dynamic sliding window to recognise pattern items within the pattern features, whereupon these pattern items can be associated with contextual data items to form a contextual pattern that can be used for classification. The first step in the process is to initialize 342 a window with a specific (fixed) size and position. This is followed by pattern item identification and grouping with contextual data 344, which is described in more detail below with reference to Figure 4. Grouped features that meet 346 a threshold for utility in classification for a target are added 348 to a contextual pattern storage database 400 (which can be considered as part of the classification database 5 shown in Figure 1). After these have been added, the window can be incremented in size and/or moved in position 350 to work through the pattern item identification and grouping loop again for the different window arrangement. This continues until all plausible windows have been evaluated - in embodiments this may comprise evaluation over all possible windows, but it may also be limited, for example by contextual data, so that windows which are already known not to contain useful pattern items are not considered. For example, where it has been identified that for a particular context useful data will only be contained in the first half of a string, the window options may be established so that only windows “looking” at the first half of the string will be searched. Once this process is complete, there is likely to be some redundancy in the contextual pattern storage database, and a selection process 352 may follow to reduce the contextual patterns so that only those with higher confidence are retained - preferably in such a way that lower confidence options are removed where there is a higher confidence option available for the same data, but not such that input data is left without a classification option. After this selection process, the classifier is established 360. Specifically, this is a contextualized pattern database containing mappings from grouped pattern and contextual features to a target label with a corresponding measure of confidence. This contextualized pattern database is then used for pattern classification/inference on unlabelled data, as is described further below with reference to Figures 5 and 6.
Before this, the pattern item identification and grouping loop is considered in more detail with reference to Figure 4. For a given window, the process begins 410 with the initialization of a hash table d, which acts as a contextual pattern dictionary. After this, each possible input function is considered for this window - additional inputs may also be added to ensure that all relevant forms of the pattern are considered, for example by use of a transformation function (which may, for example, reverse the pattern), to add extra scope to the search. For each input in the (potentially expanded) pattern set 412, the window w is applied to the patterns of the pattern set X_p to extract 414 a pattern item p_i in each case - this will here be a substring based on the window’s size and starting position.
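One pass of this loop for a fixed window might look like the following sketch (the flat (pattern, context, target) row layout and the function name are assumptions made for illustration):

    from collections import defaultdict

    def extract_for_window(rows, start, size):
        """Apply one fixed window to every labelled row and group the results.

        rows: iterable of (pattern_string, context, target) where context is a
        hashable tuple of contextual features. The hash table d maps each
        (pattern item, context) group to counts per target label.
        """
        d = defaultdict(lambda: defaultdict(int))
        for pattern, context, target in rows:
            p_i = pattern[start:start + size]       # pattern item for this window
            d[(p_i, context)][target] += 1          # group with contextual features
        return d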
Once all patterns p_i have been extracted, they can be grouped 416 to form a reduced size dataset, and may also be associated with contextual features from the contextual feature set X_c. The next step is to determine whether any of these resulting groups is a useful identifier for a target - this is done by scoring 418 grouped features against their association with a specific target feature y to determine whether the grouped features provide an indication 420 of a specific target to a particular confidence level if that grouping of features is found. Taking one grouping of pattern structures as an example, this may involve iteration through the reduced size dataset matching each pattern with the contextual features. These contextual features could be a selection of features which are highly correlated to the pattern feature, or they could be a grouping of features based on their context (i.e., grouped to preserve inherent clustering between features to allow comparison between context groups). In any event, in this way, only the most pertinent information is retained for use.
It is significant for this process that in the presence of the relevant features, it is possible to map the pattern/contextual feature group pairings to the corresponding labels in the target feature. A label need not be unique to a given pairing in all embodiments, in which case each pairing of pattern/contextual feature group may have various mappings to the target feature. A confidence measure may be assigned to each mapping. The type of confidence measure to be used should be selected depending on the data type and domain of the problem. Once calculated, this confidence measure is assigned to each pattern/contextual feature mapping. The hash table d is updated 422 with grouped features and aggregated scores, ending the loop. If the confidence measure is above a certain threshold, the mappings of the pattern/contextual features to the classification result can be added 424 to the contextual pattern storage database 400. The mappings can be added because a valid pattern, in the context of the feature group, has been recognized in relation to the target feature. Following this, the window starting index position can be incremented. If the confidence measure falls below the threshold, the mappings will not be considered a valid pattern, and the window can increase in size, with the bigger window then used to consider the validity of the next pattern. Adding the dynamic sliding window element to the pattern recognition aims at giving sufficient coverage of the input to detect the presence of inherent patterns within a feature without applying brute-force techniques.
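The scoring and thresholding step can be sketched as follows, continuing the hash table d above; relative frequency stands in here for the domain-specific confidence measure, which is an assumption rather than a prescribed choice:

    def score_and_store(d, database, threshold=0.9):
        """Keep only (pattern item, context) -> target mappings whose
        empirical confidence clears the threshold."""
        for key, target_counts in d.items():
            total = sum(target_counts.values())
            target, count = max(target_counts.items(), key=lambda kv: kv[1])
            confidence = count / total              # relative frequency as confidence
            if confidence >= threshold:
                database[key] = (target, confidence)    # contextual pattern storage entry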
In certain embodiments, the selection of the contextual features may be learned. For instance, the relevance of each of the input features to the specific problem could be learned by the dynamic sliding window. This could be achieved by analysing the output of the dynamic sliding window and the corresponding updates to the database, and then choosing the grouped features with the largest confidence scores. The subset of features which were selected via learning could then be utilized for future contextual pattern recognition and classification instead of utilizing the full feature set.
A pattern may be as simple as a substring corresponding to the window at hand. However, depending on the embodiment, a pattern can be generalized to allow so-called wildcards (i.e., elements of regular expressions such as - for instance - “any character” (?), “any capital letter” [A-Z], and “any digit” [0-9]). In one sample embodiment, the method starts with learning (pattern, context, score) records where patterns are fully instantiated substrings. Then, in a post-processing step after the learning, a subset of records (substring, context, score) may be merged into a more compact representation, assuming that the substrings match a generalized pattern with wildcards, the contexts are similar, and the scores are similar.
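Such a post-processing merge might be sketched as follows (a simple positional generalization; real merging would also check that contexts and scores are similar, which is omitted here):

    import re

    def generalize(substrings):
        """Merge equal-length, fully instantiated substrings into one wildcard
        pattern: positions that agree stay literal, all-digit positions become
        [0-9], anything else becomes '.' (any character)."""
        out = []
        for chars in map(set, zip(*substrings)):
            if len(chars) == 1:
                out.append(re.escape(chars.pop()))
            elif all(c.isdigit() for c in chars):
                out.append("[0-9]")
            else:
                out.append(".")
        return "".join(out)

    # generalize(["AB1", "AB7"]) returns "AB[0-9]"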
Once the training data has been used to reduce the original data to a reduced data subset as indicated above, as each element of the reduced data subset is associated with a target classification, this can be used as the basis for a classifier tool. In the arrangement of Figure 1, this is provided as a cloud-based service. The main elements of the classification method used by the classifier are shown in Figure 5, with Figure 6 providing detailed consideration of a pattern item identification loop complementary to that shown in Figure 4 for classifier development. After the start 500 of the process, the first stage is initialization 520, which can be divided into the following sub-stages. First of all, the unlabelled data for classification is read 522. As for the labelled data, this will typically comprise one or more pattern features and contextual features, but now the target feature needs to be inferred through the classification process. To assist in this, known contextual patterns are read 524 from the contextual pattern storage database 400.
The next stage is pattern classification by inference 540. As before, a hash table s is initialized 542 to contain classification candidates and classification scores. A window size and position is then defined 544 for use in a classification candidate determination loop 546 involving interrogation of the contextual pattern database. At the end of a loop, it is then determined whether 546 different positions or sizes of windows need to be evaluated, and if so, the window is adjusted accordingly 548. This continues until all legitimate candidate windows have been considered. It should be noted that as the windowing is dependent on the pattern structure, this part of the process can be run in parallel as a plurality of jobs, each assigned a different window size or position, which will then query the respective pattern structure within the storage system - this is shown in the alternative approach, involving determining 554 all window sizes and positions to be considered, and performing the candidate determination loop for each in parallel 556. For example, if there are N pattern structure look-up tables in the storage system, then one choice is to consider M look-up tables, where M < N, resulting in a subset of the available pattern structures which would result in M pattern recognition jobs run in parallel. The classification candidate determination loop is described in more detail in Figure 6 below. In the parallelised job described here, the output of this stage will be an assemblage of classification candidates, each with an associated score.
The next stage is to return 560 a target classification. To output the final classification based on the M pattern recognition activities, a weighted methodology is applied 562. Using such a weighted methodology, the M classifications may be considered and the classification determined by considering a weighted combination of the classifications in relation to the corresponding measures of confidence. Using this weighted combination, a final classification for each input can be provided 564 as an output. The final classification, once determined, may then be used to update the pattern/contextual feature mapping in the database with a new measure of confidence reflecting this recognition step which will then be utilized for future classifications.
In an alternative embodiment, the weighted methodology could be based on an ensemble/aggregation method to combine the scores and provide a final classification. An example aggregation method could be the calculation of a statistic (e.g., mean, median, etc.) based on the scores. Such an ensemble-based method could involve use of a learning-based model (e.g. a machine learning process) where the input to the model would be the scores.
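A confidence-weighted vote over the returned candidates can be sketched as follows (the triple layout matches the hash table s described below; as suggested above, a mean or median statistic, or a learned model, could replace the weighted sum):

    from collections import defaultdict

    def aggregate(candidates):
        """Pick a final target from (index, target, score) candidate triples
        by weighting each vote with its confidence score."""
        weights = defaultdict(float)
        for _index, target, score in candidates:
            weights[target] += score                # weight each vote by its confidence
        return max(weights, key=weights.get) if weights else None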
Figure 6 shows the classification candidate determination loop of Figure 5 in detail. As previously noted in Figure 5, for each instance, the loop begins 600 with the initialisation of the window with fixed size and position (this can be termed the “index”). This process consists of considering 610 every contextual pattern and the pattern items that they contain - these can be accessed in the form of a look-up table. Each input i in the unlabelled data set X is evaluated 620, and the window function is applied 630 to the pattern data X_p in this data set to extract the associated pattern item p_i. If this pattern item exists in the look-up table, the next step is to check for the presence of the relevant contextual features which will enable a classification to be returned - if the contextual features X_c_i in the item are found 640 in the contextual pattern, this can be considered a candidate for a particular target mapping. The next step is to determine 650 whether the confidence of the mapping is above a particular threshold value T. If the features can be queried and the confidence of the mapping is above a certain threshold, the hash table s is updated 660 with the index, target and score for use in the final scoring step and classification output, but after any updating the loop ends 670. Once all M pattern classification jobs have been completed, for each input the algorithm will have returned at most M classifications (and possibly zero classifications). These will be used to provide one improved classification. As indicated above, some or all of these M pattern classification jobs may be carried out in parallel.
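The loop and its parallel execution might be sketched together as follows (the flat tuple layout of the look-up tables, the exact-match context test and the names are simplifying assumptions; the text above allows looser context matching):

    from concurrent.futures import ThreadPoolExecutor

    def window_job(pattern, context, table, start, size, threshold=0.9):
        """One candidate-determination job for a fixed window: extract the
        pattern item, require the stored contextual features, and keep the
        mapping only if its confidence clears the threshold T."""
        p_i = pattern[start:start + size]
        entry = table.get((p_i, context))           # look-up table query
        if entry is not None and entry[1] >= threshold:
            return [((start, size), entry[0], entry[1])]   # (index, target, score)
        return []

    def classify_parallel(pattern, context, jobs):
        """Run the M jobs in parallel; jobs is a list of (table, start, size)."""
        with ThreadPoolExecutor() as pool:
            futures = [pool.submit(window_job, pattern, context, t, s, z)
                       for t, s, z in jobs]
            return [c for f in futures for c in f.result()]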
Various modifications to this approach are possible. For example, if contextual features are not present for the pattern indicated by the pattern item, a transformation step (e.g., reversal of the string; change of the starting index, etc.) can be applied to the input in the pattern feature, which will allow for the possibility of more patterns to be queried in the relevant database. Alternatively, transformation may be carried out specifically to search for pattern items that are indicated by particular contextual features, but are not found with them. For instance, a contextual feature such as Vendor Name may map to one Normalised Vendor Name (i.e., ABC UK, ABC USA and ABC LTD are normalized to ABC INC). Therefore one transformation could be to apply a normalization to a contextual feature. Similarly, if on matching the confidence value is below the threshold, manipulation of the input data (adjustment of the starting index, certain transformations) may be made to optimise the match.
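These transformations might be collected as simple functions over the (pattern, context) input, as in the following sketch (the vendor normalisation table is hypothetical example data):

    def transformations(normalised_vendors):
        """Return transformations to try when a first match fails: reverse the
        pattern string, shift its starting index, or normalise a contextual
        feature such as the vendor name."""
        return [
            lambda p, c: (p[::-1], c),              # reversal of the string
            lambda p, c: (p[1:], c),                # change of the starting index
            lambda p, c: (p, {**c, "vendor":
                              normalised_vendors.get(c.get("vendor"), c.get("vendor"))}),
        ]

    # e.g. transformations({"ABC UK": "ABC INC", "ABC USA": "ABC INC", "ABC LTD": "ABC INC"})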
In the warehouse scenario depicted in Figure 1, the development of a cloud-based classifier accessible via the classifier service API 6 enables classification to be used practically in the warehouse by individual users such as robots 12 or store workers using their computing devices 13. Such a user can input details of an item; this then constitutes unlabelled data, which is provided through the edge server 10 and the warehouse service API 16 to the cloud warehouse service 14. The cloud warehouse service 14 uses the classifier service API to obtain a classification result for the item, and this result is fed back to the individual user in the warehouse for use in a warehousing task (such as storage of the classified item in an appropriate warehouse location).
Embodiments of the disclosure may also be used for other applications. One example is medical diagnosis - for example, where the goal is to determine the likelihood of a patient being diagnosed with a hereditary disease. Here, the user (for example, a medical service) has a labelled database of historical patient information which indicates whether each patient was diagnosed with a disease, together with a set of patient-related features. In this problem, the feature with inherent patterns could be a DNA strand, which is a string of characters (e.g., ACGTAAC), and the contextual features could be information related to the patient (such as the patient's age, sex, etc.). In this situation, storing full DNA information for each patient would be infeasible, as the storage requirements would be excessive - so identifying inherent patterns in the DNA strands, coupled with the contextual patient features, would allow the most pertinent information to be stored and may enable use of the patient information for effective practical analysis.
The grouping of the resulting patterns would allow for interaction analysis between the different patient features based on a specific pattern. For a new patient, the pattern classification module would then search for inherent patterns within the new DNA strand, based on a set of pre-defined pattern structures (e.g., sub-string of length 4; the next 5 characters after T, etc.) in the contextualised database. Then, for each pattern and set of patient-related features matched in the database, the likelihood of the patient having a specific hereditary disease could be returned based on this information.
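As a purely illustrative sketch, the two pre-defined pattern structures mentioned above could be extracted from a strand as follows (the function name and structure choices are assumed, not part of the disclosure):

```python
def dna_pattern_items(strand):
    """Extract pattern items under two assumed pattern structures:
    every sub-string of length 4, and the 5 characters following
    each 'T' (where the strand is long enough)."""
    items = set()
    for i in range(len(strand) - 3):       # sub-strings of length 4
        items.add(strand[i:i + 4])
    for i, base in enumerate(strand):      # next 5 characters after a T
        if base == "T" and i + 5 < len(strand):
            items.add(strand[i + 1:i + 6])
    return items

# e.g., dna_pattern_items("ACGTAAC") yields {"ACGT", "CGTA", "GTAA", "TAAC"};
# this strand is too short for a full 5-character run after its 'T'.
```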
The skilled person will appreciate that the embodiments described above are exemplary, and that still further embodiments in different use contexts may be provided within the spirit and scope of the disclosure.

Claims

1. A computer implemented method of classifying data from a dataset of labelled data comprising: establishing a plurality of target classifications; obtaining one or more data patterns and one or more contextual data items from each labelled data item, and assigning a target classification to each labelled data item; establishing one or more pattern items for each labelled data item from the one or more data patterns, using a sliding window to identify pattern items smaller in size than its associated data pattern, and determining an association between the established pattern item, one or more contextual data items, and a target classification; and associating the target classification with one or more sets wherein each set comprises one or more pattern items and one or more contextual data items.
2. The method of claim 1, wherein each data pattern is a string, and wherein each pattern item is a substring.
3. The method of claim 2, wherein using the sliding window to identify pattern items comprises, for a plurality of windows, establishing a start position and a size for the window, and determining pattern items corresponding to that window.
4. The method of claim 3, wherein the plurality of windows consists of all available windows for a specific data pattern.
5. The method of claim 4, wherein contextual data items are used to select all available windows from all possible windows for the specific data pattern.
6. The method of any of claims 2 to 5, further comprising transforming the data pattern by a predetermined transformation before using the sliding window to establish an alternative or an additional pattern item for the sliding window.
7. The method of any preceding claim, wherein associating the target classification with one or more sets comprises determining that the likelihood that the association is correct is above a predetermined confidence level.
8. A computer implemented method of classifying an unlabelled data item, wherein the unlabelled data item comprises one or more data patterns and one or more contextual data items, comprising: establishing one or more pattern items from the one or more data patterns, using a sliding window to identify pattern items smaller in size than its associated data pattern; matching established pattern items against a pattern item database comprising stored pattern items and associated contextual data items, wherein matching established pattern items comprises determining a matching accuracy for the established pattern item, wherein determining the matching accuracy comprises determining the presence or absence of contextual items associated with the stored pattern item; and if the matching accuracy exceeds a predetermined threshold value, classifying the unlabelled data item according to the classification of the relevant stored pattern item.
9. The method of claim 8, wherein each data pattern is a string, and wherein each pattern item is a substring.
10. The method of claim 9, wherein using the sliding window to establish pattern items comprises, for a plurality of windows, establishing a start position and a size for the window, and determining pattern items corresponding to that window.
11. The method of claim 10, wherein the plurality of windows consists of all available windows for a specific data pattern.
12. The method of claim 11, wherein contextual data items are used to select all available windows from all possible windows for the specific data pattern.
13. The method of any of claims 9 to 12, further comprising transforming the data pattern by a predetermined transformation before using the sliding window to establish an alternative or an additional pattern item for the sliding window.
14. A computer-implemented data classification system, comprising one or more processors and one or more memories, wherein the one or more processors are adapted to: provide a data classifier by performing the method of any of claims 1 to 7; and use the data classifier to classify data by performing the method of any of claims 8 to 13.
15. A method of classifying an unclassified good in a warehouse, comprising: establishing a data classifier for warehouse goods as data items using the method of any of claims 1 to 7; and using the data classifier to classify an unclassified good using the method of any of claims 8 to 13.
PCT/EP2022/025352 2022-05-24 2022-07-26 Pattern recognition and classification WO2023227186A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN202211029836 2022-05-24
IN202211029836 2022-05-24

Publications (1)

Publication Number Publication Date
WO2023227186A1 true WO2023227186A1 (en) 2023-11-30

Family

ID=83005828

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2022/025352 WO2023227186A1 (en) 2022-05-24 2022-07-26 Pattern recognition and classification

Country Status (1)

Country Link
WO (1) WO2023227186A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170161336A1 (en) * 2015-12-06 2017-06-08 Xeeva, Inc. Systems and/or methods for automatically classifying and enriching data records imported from big data and/or other sources to help ensure data integrity and consistency
US20180322327A1 (en) * 2017-05-02 2018-11-08 Techcyte, Inc. Machine learning classification and training for digital microscopy cytology images
US20200285912A1 (en) * 2017-06-05 2020-09-10 Umajin Inc. Hub-and-spoke classification system and methods

Similar Documents

Publication Publication Date Title
US11704494B2 (en) Discovering a semantic meaning of data fields from profile data of the data fields
US11537820B2 (en) Method and system for generating and correcting classification models
Kumar et al. An efficient k-means clustering filtering algorithm using density based initial cluster centers
US8949158B2 (en) Cost-sensitive alternating decision trees for record linkage
US20200110842A1 (en) Techniques to process search queries and perform contextual searches
US9183285B1 (en) Data clustering system and methods
CA3059414A1 (en) Hybrid approach to approximate string matching using machine learning
US20220398857A1 (en) Document analysis architecture
CN110019474B (en) Automatic synonymy data association method and device in heterogeneous database and electronic equipment
CN107844533A (en) A kind of intelligent Answer System and analysis method
US20200134537A1 (en) System and method for generating employment candidates
US11379665B1 (en) Document analysis architecture
US11507901B1 (en) Apparatus and methods for matching video records with postings using audiovisual data processing
JP5585472B2 (en) Information collation apparatus, information collation method, and information collation program
WO2021252419A1 (en) Document analysis architecture
US9442901B2 (en) Resembling character data search supporting method, resembling candidate extracting method, and resembling candidate extracting apparatus
CN112100202B (en) Product identification and product information completion method, storage medium and robot
CN114253990A (en) Database query method and device, computer equipment and storage medium
US20220050884A1 (en) Utilizing machine learning models to automatically generate a summary or visualization of data
US11922326B2 (en) Data management suggestions from knowledge graph actions
CN115099832B (en) Abnormal user detection method and device, equipment, medium and product thereof
WO2023227186A1 (en) Pattern recognition and classification
JP5890413B2 (en) Method and search engine for searching a large number of data records
US11893065B2 (en) Document analysis architecture
US11893505B1 (en) Document analysis architecture

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22757849

Country of ref document: EP

Kind code of ref document: A1