CN116964603A - Systems, methods, and computer program products for multi-domain ensemble learning based on multivariate time series data

Info

Publication number: CN116964603A
Application number: CN202280011256.5A
Authority: CN
Inventors: 何林芸; S·阿格拉瓦尔; 林郁珊; 吴宇航; I·宾德利什; C·切蒂亚; 王飞
Original assignee: Visa International Service Association
Current assignee: Visa International Service Association
Priority date: 2021-10-20
Filing date: 2022-10-20
Publication date: 2023-10-27

Abstract

Systems, methods, and computer program products for multi-domain ensemble learning based on multivariate time series data are provided. A method may include receiving multivariate sequence data. At least a portion of the multivariate sequence data may be input into a plurality of anomaly detection models to generate a plurality of scores. The multivariate sequence data may be combined with the plurality of scores to generate combined intermediate data. The combined intermediate data may be input into a combined integration model to generate an output score. In response to determining that the output score meets a threshold, at least one of the following may occur: an alert may be communicated to a user device, the multivariate sequence data may be input into a feature domain integration model to generate a feature importance vector, or at least one of a model domain integration model, a time domain integration model, the feature domain integration model, or the combined integration model may be updated.

Description

Systems, methods, and computer program products for multi-domain ensemble learning based on multivariate time series data

Cross Reference to Related Applications

The present application claims priority to U.S. Provisional Patent Application No. 63/257,737, filed on October 20, 2021, and to U.S. Provisional Patent Application No. 63/358,317, filed on day 7, 2022, the disclosures of which are incorporated herein by reference in their entirety.

Technical Field

The present disclosure relates generally to multi-domain ensemble learning based on multivariate time series data and, in non-limiting embodiments or aspects, to systems, methods, and computer program products for multi-domain ensemble learning based on multivariate time series data.

Background

There are many different types of models that can be used for anomaly detection. For example, certain models may detect anomalies based on a time series of one or more variables (e.g., features).

However, for a given input (e.g., a given input of multivariate time series data), the outputs of different models and/or different types of models may differ. For example, some (types of) models may detect an anomaly based on such an input, while others may not. Additionally, it may be difficult to consider (e.g., combine, integrate, etc.) the scores of a plurality of different models. For example, it may be difficult to determine which feature(s) are important to each model (e.g., contribute significantly to the output of each model) and/or how to balance the scores from each of the models. Moreover, labels may be sparse and/or available for only some of the input data (e.g., only confirmed anomalies may be labeled while the remaining data is not), which may cause supervised learning to suffer from class imbalance. Furthermore, when considering anomaly detection scores from multiple models, false positives may be frequent (e.g., if data is marked as anomalous whenever one or a small number of models detect an anomaly), which may waste resources or lead to frustration (e.g., for users whose legitimate activities are marked as anomalous, for investigators who examine detected anomalies to determine whether they are true or false positives, etc.).

Disclosure of Invention

It is therefore an object of the present disclosure to provide systems, methods, and computer program products for multi-domain ensemble learning based on multivariate time series data that overcome some or all of the above-identified drawbacks.

According to a non-limiting embodiment or aspect, a computer-implemented method for multi-domain integrated learning is provided. The method may include receiving multivariate sequence data comprising a plurality of vectors. Each respective vector of the plurality of vectors may include elements based on a time series of respective variables of the plurality of variables. At least a portion of the multivariate sequence data may be input into each respective anomaly detection model of the plurality of anomaly detection models to generate a plurality of scores comprising a respective score for each respective anomaly detection model. The multivariate sequence data may be combined with a plurality of scores to generate combined intermediate data. The combined intermediate data may be input into a combined integration model to generate an output score. The combined integration model may be based on a model domain integration model, a time domain integration model, and a feature domain integration model. It may be determined whether the output score meets a threshold. In response to determining that the output score meets a threshold, at least one of: an alert may be communicated to the user device, multivariate sequence data may be input into the feature domain integration model to generate a feature importance vector comprising a feature importance score for each of the plurality of variables, or parameters of at least one of the combined integration model, the model domain integration model, the time domain integration model, or the feature domain integration model may be updated.
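The flow described above (per-model scores, combination with the input data, a combined model, and a threshold check) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from the disclosure: the two toy detectors `zscore_detector` and `range_detector`, the layout of `Z` as one row per variable, the concatenation used to form the combined intermediate data, and the plain weight vector that stands in for the combined integration model.

```python
import numpy as np

def zscore_detector(Z):
    """Hypothetical detector: largest absolute z-score of the newest time step."""
    history, latest = Z[:, :-1], Z[:, -1]
    mu = history.mean(axis=1)
    sigma = history.std(axis=1) + 1e-9  # guard against division by zero
    return float(np.max(np.abs((latest - mu) / sigma)))

def range_detector(Z):
    """Hypothetical detector: how far the newest step lies outside the past range."""
    history, latest = Z[:, :-1], Z[:, -1]
    excess = np.maximum(latest - history.max(axis=1), history.min(axis=1) - latest)
    return float(np.max(np.maximum(excess, 0.0)))

def score_sequence(Z, detectors, weights, threshold):
    """Z has shape (num_variables, num_time_steps): one time series per row."""
    # One score per anomaly detection model.
    scores = np.array([detect(Z) for detect in detectors])
    # Assumed combination step: flatten the data and append the scores.
    combined = np.concatenate([Z.ravel(), scores])
    # Assumed stand-in for the combined integration model: a linear weighting.
    output_score = float(combined @ weights)
    return output_score, output_score >= threshold
```

A caller could pass `[zscore_detector, range_detector]` as `detectors` and, when the returned flag is true, send an alert, request feature importances, or trigger a model update, mirroring the three alternatives listed above.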

In some non-limiting embodiments or aspects, each anomaly detection model of the plurality of anomaly detection models may include at least one of a classifier model or a score generation model.

In some non-limiting embodiments or aspects, each anomaly detection model of the plurality of anomaly detection models may include at least one of a Bayesian model, a Kullback-Leibler importance estimation procedure (KLIEP) model, a ChangeFinder model, or a cumulative sum (CUSUM) model.
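As an illustration of one of the listed model types, the following is a minimal sketch of a two-sided tabular CUSUM score for a single variable. The parameter names `target` (the expected level) and `slack` (the allowance subtracted at each step) are assumptions of this sketch; the disclosure does not specify how its CUSUM model is parameterized.

```python
def cusum_scores(values, target, slack):
    """Return a per-step CUSUM score for a univariate series."""
    upper = 0.0  # accumulated evidence of an upward shift
    lower = 0.0  # accumulated evidence of a downward shift
    scores = []
    for value in values:
        upper = max(0.0, upper + (value - target - slack))
        lower = max(0.0, lower + (target - value - slack))
        scores.append(max(upper, lower))
    return scores
```

The score stays at zero while the series remains within `slack` of `target` and grows once the series shifts, so it can serve as one of the per-model scores that are later combined.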

In some non-limiting embodiments or aspects, the method may further include determining, based on the output score, whether to tag or communicate at least a portion of the multivariate sequence data. In response to that determination, one of the following may occur: the at least a portion of the multivariate sequence data may be tagged based on the output score, or the at least a portion of the multivariate sequence data and the output score may be communicated to the user device.

In some non-limiting embodiments or aspects, updating parameters of at least one of the integrated models may include initializing the combined integrated model. Additionally or alternatively, for each time step less than the maximum time step: a first temporary variable may be determined based on the combined intermediate data and the current time-step version of the combined integrated model; a second temporary variable may be determined based on a ratio of the label to the first temporary variable; the feature domain integration model may be back-propagated based on the transpose of the combined intermediate data and the second temporary variable; and a next time-step version of the combined integrated model may be determined based on the back-propagation of the feature domain integration model and the current time-step version of the combined integrated model.
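One possible reading of this loop is a multiplicative update of the kind used for nonnegative models under a Kullback-Leibler-type objective. The sketch below follows that reading and is an interpretation only: the disclosure does not give the exact formulas, and the names `X` (combined intermediate data, one row per sample), `y` (labels), `h` (parameters of the combined model), and the normalization by column sums are all assumptions.

```python
import numpy as np

def update_combined_model(X, y, max_steps=200, eps=1e-12):
    """Multiplicative update for nonnegative X and y (illustrative only)."""
    h = np.ones(X.shape[1])            # initialize the combined model
    col_sums = X.sum(axis=0) + eps     # assumed normalization term
    for _ in range(max_steps):
        tmp1 = X @ h + eps             # first temporary variable: current prediction
        tmp2 = y / tmp1                # second temporary variable: label-to-prediction ratio
        signal = X.T @ tmp2            # propagated back through the transposed data
        h = h * signal / col_sums      # next version from the current version
    return h
```

When the labels are exactly reproducible by some nonnegative `h`, the ratio in `tmp2` approaches one and the update leaves `h` unchanged, which is the fixed point of the iteration.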

In some non-limiting embodiments or aspects, updating parameters of at least one of the integrated models may further comprise, for each time step less than the maximum time step: determining a third temporary variable according to a first Khatri-Rao product based on the current time-step version of the time domain integration model and the current time-step version of the feature domain integration model; determining a current time-step version of the model domain integration model based on the third temporary variable, the model domain integration model, and the mode-0 unfolding of the combined integration model; determining a fourth temporary variable according to a second Khatri-Rao product based on the current time-step version of the model domain integration model and the current time-step version of the feature domain integration model; determining an updated current time-step version of the time domain integration model based on the fourth temporary variable, the time domain integration model, and the mode-1 unfolding of the combined integration model; determining a fifth temporary variable according to a third Khatri-Rao product based on the current time-step version of the model domain integration model and the updated current time-step version of the time domain integration model; and determining an updated current time-step version of the feature domain integration model based on the fifth temporary variable, the feature domain integration model, and the mode-2 unfolding of the combined integration model.
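The sequence of Khatri-Rao products and mode-wise tensor unfoldings (matricizations) described above resembles one sweep of a CP-style tensor factorization. The sketch below shows such a sweep under stated assumptions: `H` is treated as a three-way array indexed by (model, time, feature), `M`, `T`, and `F` are nonnegative factor matrices with a shared number of columns, and the multiplicative update rule in `update_factor` is a standard nonnegative-factorization rule chosen for illustration, not a formula given in the disclosure.

```python
import numpy as np

def khatri_rao(B, C):
    """Column-wise Kronecker product; row j*K + k equals B[j, :] * C[k, :]."""
    J, R = B.shape
    K = C.shape[0]
    return (B[:, None, :] * C[None, :, :]).reshape(J * K, R)

def unfold(X, mode):
    """Mode-n unfolding: move the chosen axis first, flatten the rest (C order)."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def update_factor(factor, kr, unfolded, eps=1e-12):
    """Assumed multiplicative rule for one nonnegative factor matrix."""
    return factor * (unfolded @ kr) / (factor @ (kr.T @ kr) + eps)

def sweep(H, M, T, F):
    """One pass over the model, time, and feature factors, in that order."""
    M = update_factor(M, khatri_rao(T, F), unfold(H, 0))  # third temporary variable
    T = update_factor(T, khatri_rao(M, F), unfold(H, 1))  # fourth temporary variable
    F = update_factor(F, khatri_rao(M, T), unfold(H, 2))  # fifth temporary variable
    return M, T, F

def reconstruct(M, T, F):
    """Rebuild the three-way array from its factor matrices."""
    return np.einsum("ir,jr,kr->ijk", M, T, F)
```

The row ordering in `khatri_rao` is chosen to match the column ordering produced by `unfold`, so that `unfold(H, 0)` equals `M @ khatri_rao(T, F).T` whenever `H` is exactly `reconstruct(M, T, F)`; the same holds for the other two modes.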

In some non-limiting embodiments or aspects, the loss function of the combined integration model may be based on a model domain integration model, a time domain integration model, and a feature domain integration model.

In some non-limiting embodiments or aspects, the loss function of the combined integrated model is based on the following equation:

where M is a model domain integration model, T is a time domain integration model, F is a feature domain integration model, H is a combined integration model, l is a feature importance vector, and Z is multivariate sequence data.

According to a non-limiting embodiment or aspect, a system for multi-domain integrated learning is provided. The system may include at least one processor and at least one non-transitory computer-readable medium comprising one or more instructions that, when executed by the at least one processor, direct the at least one processor to receive multivariate sequence data comprising a plurality of vectors. Each respective vector of the plurality of vectors may include elements based on a time series of respective variables of the plurality of variables. At least a portion of the multivariate sequence data may be input into each respective anomaly detection model of the plurality of anomaly detection models to generate a plurality of scores comprising a respective score for each respective anomaly detection model. The multivariate sequence data may be combined with a plurality of scores to generate combined intermediate data. The combined intermediate data may be input into a combined integration model to generate an output score. The combined integration model may be based on a model domain integration model, a time domain integration model, and a feature domain integration model. It may be determined whether the output score meets a threshold. In response to determining that the output score meets a threshold, at least one of: an alert may be communicated to the user device, multivariate sequence data may be input into the feature domain integration model to generate a feature importance vector comprising a feature importance score for each of the plurality of variables, or parameters of at least one of the combined integration model, the model domain integration model, the time domain integration model, or the feature domain integration model may be updated.

In some non-limiting embodiments or aspects, each anomaly detection model of the plurality of anomaly detection models includes at least one of a classifier model or a score generation model.

In some non-limiting embodiments or aspects, each anomaly detection model of the plurality of anomaly detection models includes at least one of a Bayesian model, a Kullback-Leibler importance estimation procedure (KLIEP) model, a ChangeFinder model, or a cumulative sum (CUSUM) model.

In some non-limiting embodiments or aspects, the one or more instructions may further direct the at least one processor to determine whether to tag or communicate at least a portion of the multivariate sequence data based on the output score. In response to determining whether to tag or communicate at least a portion of the multivariate sequence data, one of the at least a portion of the multivariate sequence data may be tagged based on the output score, or the at least a portion of the multivariate sequence data and the output score may be communicated to the user device.

In some non-limiting embodiments or aspects, the one or more instructions may further direct the at least one processor to initialize the combined integrated model when updating parameters of at least one of the combined integrated model, the model domain integrated model, the time domain integrated model, or the feature domain integrated model. For each time step less than the maximum time step: a first temporary variable may be determined based on the combined intermediate data and a current time-step version of the combined integrated model; a second temporary variable may be determined based on a ratio of the label to the first temporary variable; the feature domain integration model may be back-propagated based on the transpose of the combined intermediate data and the second temporary variable; and a next time-step version of the combined integrated model may be determined based on the back-propagation of the feature domain integrated model and the current time-step version of the combined integrated model.

In some non-limiting embodiments or aspects, the one or more instructions may further direct the at least one processor to, when updating parameters of at least one of the combined integrated model, the model domain integrated model, the time domain integrated model, or the feature domain integrated model, for each time step less than the maximum time step: determine a third temporary variable according to a first Khatri-Rao product based on the current time-step version of the time domain integration model and the current time-step version of the feature domain integration model; determine a current time-step version of the model domain integration model based on the third temporary variable, the model domain integration model, and the mode-0 unfolding of the combined integration model; determine a fourth temporary variable according to a second Khatri-Rao product based on the current time-step version of the model domain integration model and the current time-step version of the feature domain integration model; determine an updated current time-step version of the time domain integration model based on the fourth temporary variable, the time domain integration model, and the mode-1 unfolding of the combined integration model; determine a fifth temporary variable according to a third Khatri-Rao product based on the current time-step version of the model domain integration model and the updated current time-step version of the time domain integration model; and determine an updated current time-step version of the feature domain integration model based on the fifth temporary variable, the feature domain integration model, and the mode-2 unfolding of the combined integration model.

In some non-limiting embodiments or aspects, the loss function of the combined integrated model may be based on the following equation:

According to a non-limiting embodiment or aspect, a computer program product for multi-domain integrated learning is provided. The computer program product may include at least one non-transitory computer-readable medium comprising one or more instructions that, when executed by at least one processor, cause the at least one processor to receive multivariate sequence data comprising a plurality of vectors. Each respective vector of the plurality of vectors may include elements based on a time series of respective variables of the plurality of variables. At least a portion of the multivariate sequence data may be input into each respective anomaly detection model of the plurality of anomaly detection models to generate a plurality of scores comprising a respective score for each respective anomaly detection model. The multivariate sequence data may be combined with a plurality of scores to generate combined intermediate data. The combined intermediate data may be input into a combined integration model to generate an output score. The combined integration model may be based on a model domain integration model, a time domain integration model, and a feature domain integration model. It may be determined whether the output score meets a threshold. In response to determining that the output score meets a threshold, at least one of: an alert may be communicated to the user device, multivariate sequence data may be input into the feature domain integration model to generate a feature importance vector comprising a feature importance score for each of the plurality of variables, or parameters of at least one of the combined integration model, the model domain integration model, the time domain integration model, or the feature domain integration model may be updated.

In some non-limiting embodiments or aspects, the one or more instructions may further cause the at least one processor to determine whether to tag or communicate at least a portion of the multivariate sequence data based on the output score. In response to determining whether to tag or communicate at least a portion of the multivariate sequence data, one of the at least a portion of the multivariate sequence data may be tagged based on the output score, or the at least a portion of the multivariate sequence data and the output score may be communicated to the user device.

In some non-limiting embodiments or aspects, the one or more instructions may further cause the at least one processor to initialize the combined integrated model when updating parameters of at least one of the combined integrated model, the model domain integrated model, the time domain integrated model, or the feature domain integrated model. For each time step less than the maximum time step: a first temporary variable may be determined based on the combined intermediate data and a current time-step version of the combined integrated model; a second temporary variable may be determined based on a ratio of the label to the first temporary variable; the feature domain integration model may be back-propagated based on the transpose of the combined intermediate data and the second temporary variable; and a next time-step version of the combined integrated model may be determined based on the back-propagation of the feature domain integrated model and the current time-step version of the combined integrated model.

In some non-limiting embodiments or aspects, the one or more instructions may further cause the at least one processor to, when updating parameters of at least one of the combined integrated model, the model domain integrated model, the time domain integrated model, or the feature domain integrated model, for each time step less than the maximum time step: determine a third temporary variable according to a first Khatri-Rao product based on the current time-step version of the time domain integration model and the current time-step version of the feature domain integration model; determine a current time-step version of the model domain integration model based on the third temporary variable, the model domain integration model, and the mode-0 unfolding of the combined integration model; determine a fourth temporary variable according to a second Khatri-Rao product based on the current time-step version of the model domain integration model and the current time-step version of the feature domain integration model; determine an updated current time-step version of the time domain integration model based on the fourth temporary variable, the time domain integration model, and the mode-1 unfolding of the combined integration model; determine a fifth temporary variable according to a third Khatri-Rao product based on the current time-step version of the model domain integration model and the updated current time-step version of the time domain integration model; and determine an updated current time-step version of the feature domain integration model based on the fifth temporary variable, the feature domain integration model, and the mode-2 unfolding of the combined integration model.

According to a non-limiting embodiment or aspect, a system for multi-domain integrated learning is provided. The system may include at least one processor; and at least one non-transitory computer-readable medium comprising one or more instructions that, when executed by the at least one processor, direct the at least one processor to perform any of the methods described herein.

According to a non-limiting embodiment or aspect, a computer program product for multi-domain integrated learning is provided. The computer program product may include at least one non-transitory computer-readable medium comprising one or more instructions that, when executed by at least one processor, cause the at least one processor to perform any of the methods described herein.

Other non-limiting embodiments or aspects will be set forth in the following numbered clauses:

Clause 1: a computer-implemented method, comprising: receiving, with at least one processor, multivariate sequence data comprising a plurality of vectors, each respective vector of the plurality of vectors comprising elements based on a time series of a respective variable of a plurality of variables; inputting, with at least one processor, at least a portion of the multivariate sequence data into each respective anomaly detection model of a plurality of anomaly detection models to generate a plurality of scores comprising a respective score for each respective anomaly detection model; combining, with at least one processor, the multivariate sequence data with the plurality of scores to generate combined intermediate data; inputting, with at least one processor, the combined intermediate data into a combined integrated model to generate an output score, the combined integrated model being based on a model domain integrated model, a time domain integrated model, and a feature domain integrated model; determining, with at least one processor, that the output score meets a threshold; and in response to determining that the output score meets the threshold, at least one of: communicating, with at least one processor, an alert to a user device; inputting, with at least one processor, the multivariate sequence data into the feature domain integrated model to generate a feature importance vector comprising a feature importance score for each of the plurality of variables; or updating, with at least one processor, parameters of at least one of the combined integrated model, the model domain integrated model, the time domain integrated model, or the feature domain integrated model.

Clause 2: the method of clause 1, wherein each anomaly detection model of the plurality of anomaly detection models comprises at least one of a classifier model or a score generation model.

Clause 3: the method of clause 1 or clause 2, wherein each anomaly detection model of the plurality of anomaly detection models comprises at least one of a Bayesian model, a Kullback-Leibler importance estimation procedure (KLIEP) model, a ChangeFinder model, or a cumulative sum (CUSUM) model.

Clause 4: the method of any of clauses 1-3, further comprising: determining, with at least one processor, whether to tag or communicate at least a portion of the multivariate sequence data based on the output score; and in response to determining whether to tag or communicate at least a portion of the multivariate sequence data, one of: tagging, with at least one processor, the at least a portion of the multivariate sequence data based on the output score; or communicating, with at least one processor, the at least a portion of the multivariate sequence data and the output score to the user device.

Clause 5: the method of any of clauses 1 to 4, wherein updating parameters of at least one of the combined integrated model, the model domain integrated model, the time domain integrated model, or the feature domain integrated model comprises: initializing, with at least one processor, the combined integrated model; and for each time step less than the maximum time step: determining, with the at least one processor, a first temporary variable based on the combined intermediate data and a current time-step version of the combined integrated model; determining, with the at least one processor, a second temporary variable based on a ratio of the label to the first temporary variable; back-propagating, with the at least one processor, the feature domain integrated model based on the transpose of the combined intermediate data and the second temporary variable; and determining, with the at least one processor, a next time-step version of the combined integrated model based on the back-propagation of the feature domain integrated model and the current time-step version of the combined integrated model.

Clause 6: the method of any of clauses 1 to 5, wherein updating parameters of at least one of the combined integrated model, the model domain integrated model, the time domain integrated model, or the feature domain integrated model further comprises: for each time step less than the maximum time step: determining, with the at least one processor, a third temporary variable according to a first Khatri-Rao product based on the current time-step version of the time domain integration model and the current time-step version of the feature domain integration model; determining, with the at least one processor, a current time-step version of the model domain integration model based on the third temporary variable, the model domain integration model, and the mode-0 unfolding of the combined integration model; determining, with the at least one processor, a fourth temporary variable according to a second Khatri-Rao product based on the current time-step version of the model domain integration model and the current time-step version of the feature domain integration model; determining, with the at least one processor, an updated current time-step version of the time domain integration model based on the fourth temporary variable, the time domain integration model, and the mode-1 unfolding of the combined integration model; determining, with the at least one processor, a fifth temporary variable according to a third Khatri-Rao product based on the current time-step version of the model domain integration model and the updated current time-step version of the time domain integration model; and determining, with the at least one processor, an updated current time-step version of the feature domain integration model based on the fifth temporary variable, the feature domain integration model, and the mode-2 unfolding of the combined integration model.

Clause 7: the method of any of clauses 1-6, wherein the loss function of the combined integration model is based on a model domain integration model, a time domain integration model, and a feature domain integration model.

Clause 8: the method of any of clauses 1-7, wherein the loss function of the combined integration model is based on the following equation:

Clause 9: a system, comprising: at least one processor; and at least one non-transitory computer-readable medium comprising one or more instructions that, when executed by the at least one processor, direct the at least one processor to: receive multivariate sequence data comprising a plurality of vectors, each respective vector of the plurality of vectors comprising elements based on a time series of a respective variable of a plurality of variables; input at least a portion of the multivariate sequence data into each respective anomaly detection model of a plurality of anomaly detection models to generate a plurality of scores comprising a respective score for each respective anomaly detection model; combine the multivariate sequence data with the plurality of scores to generate combined intermediate data; input the combined intermediate data into a combined integrated model to generate an output score, the combined integrated model being based on a model domain integrated model, a time domain integrated model, and a feature domain integrated model; determine that the output score meets a threshold; and in response to determining that the output score meets the threshold, at least one of: communicate an alert to a user device; input the multivariate sequence data into the feature domain integrated model to generate a feature importance vector comprising a feature importance score for each of the plurality of variables; or update parameters of at least one of the combined integrated model, the model domain integrated model, the time domain integrated model, or the feature domain integrated model.

Clause 10: the system of clause 9, wherein each anomaly detection model of the plurality of anomaly detection models comprises at least one of a classifier model or a score generation model.

Clause 11: the system of clause 9 or clause 10, wherein each anomaly detection model of the plurality of anomaly detection models comprises at least one of a Bayesian model, a Kullback-Leibler importance estimation procedure (KLIEP) model, a ChangeFinder model, or a cumulative sum (CUSUM) model.

Clause 12: the system of any of clauses 9 to 11, wherein the one or more instructions further direct the at least one processor to: determine whether to tag or communicate at least a portion of the multivariate sequence data based on the output score; and in response to determining whether to tag or communicate at least a portion of the multivariate sequence data, one of: tag the at least a portion of the multivariate sequence data based on the output score; or communicate the at least a portion of the multivariate sequence data and the output score to the user device.

Clause 13: the system of any of clauses 9 to 12, wherein the one or more instructions further direct the at least one processor to, when updating parameters of at least one of the combined integrated model, the model domain integrated model, the time domain integrated model, or the feature domain integrated model: initialize the combined integrated model; and for each time step less than the maximum time step: determine a first temporary variable based on the combined intermediate data and a current time-step version of the combined integrated model; determine a second temporary variable based on a ratio of the label to the first temporary variable; back-propagate the feature domain integrated model based on the transpose of the combined intermediate data and the second temporary variable; and determine a next time-step version of the combined integrated model based on the back-propagation of the feature domain integrated model and the current time-step version of the combined integrated model.

Clause 14: the system of any of clauses 9 to 13, wherein the one or more instructions further direct the at least one processor to, when updating parameters of at least one of the combined integrated model, the model domain integrated model, the time domain integrated model, or the feature domain integrated model: for each time step less than the maximum time step: determine a third temporary variable according to a first Khatri-Rao product based on the current time-step version of the time domain integration model and the current time-step version of the feature domain integration model; determine a current time-step version of the model domain integration model based on the third temporary variable, the model domain integration model, and the mode-0 unfolding of the combined integration model; determine a fourth temporary variable according to a second Khatri-Rao product based on the current time-step version of the model domain integration model and the current time-step version of the feature domain integration model; determine an updated current time-step version of the time domain integration model based on the fourth temporary variable, the time domain integration model, and the mode-1 unfolding of the combined integration model; determine a fifth temporary variable according to a third Khatri-Rao product based on the current time-step version of the model domain integration model and the updated current time-step version of the time domain integration model; and determine an updated current time-step version of the feature domain integration model based on the fifth temporary variable, the feature domain integration model, and the mode-2 unfolding of the combined integration model.

Clause 15: the system of any of clauses 9 to 14, wherein the loss function of the combined integration model is based on a model domain integration model, a time domain integration model, and a feature domain integration model.

Clause 16: the system of any of clauses 9 to 15, wherein the loss function of the combined ensemble model is based on the following equation:

Clause 17: a computer program product comprising at least one non-transitory computer-readable medium containing one or more instructions that, when executed by at least one processor, cause the at least one processor to: receiving multivariate sequence data comprising a plurality of vectors, each respective vector of the plurality of vectors comprising elements based on a time series of respective variables of the plurality of variables; inputting at least a portion of the multivariate sequence data into each respective anomaly detection model of the plurality of anomaly detection models to generate a plurality of scores comprising a respective score for each respective anomaly detection model; combining the multivariate sequence data with a plurality of scores to generate combined intermediate data; inputting the combined intermediate data into a combined integrated model to generate an output score, the combined integrated model being based on a model domain integrated model, a time domain integrated model and a feature domain integrated model; determining that the output score meets a threshold; and in response to determining that the output score meets a threshold, at least one of: communicating an alert to a user device; inputting the multivariate sequence data into a feature domain integration model to generate a feature importance vector comprising a feature importance score for each of the plurality of variables; or updating parameters of at least one of the combined integration model, the model domain integration model, the time domain integration model, or the feature domain integration model.

Clause 18: the computer program product of clause 17, wherein the one or more instructions further cause the at least one processor to: determining whether to tag or communicate at least a portion of the multivariate sequence data based on the output score; and in response to determining whether to tag or communicate at least a portion of the multivariate sequence data, one of: marking at least a portion of the multivariate sequence data based on the output score; or communicating at least a portion of the multivariate sequence data and the output score to the user device.

Clause 19: the computer program product of clause 17 or 18, wherein, when updating parameters of at least one of the combined integrated model, the model domain integrated model, the time domain integrated model, or the feature domain integrated model, the one or more instructions further cause the at least one processor to: initialize the combined integrated model; and, for each time step less than the maximum time step: determine a first temporary variable based on the combined intermediate data and a current time step version of the combined integrated model; determine a second temporary variable based on a ratio of the label to the first temporary variable; back-propagate the feature domain integration model based on the transpose of the combined intermediate data and the second temporary variable; and determine a next time step version of the combined integrated model based on the back-propagation of the feature domain integrated model and the current time step version of the combined integrated model.

Clause 20: the computer program product of any of clauses 17 to 19, wherein, when updating parameters of at least one of the combined integrated model, the model domain integrated model, the time domain integrated model, or the feature domain integrated model, the one or more instructions further cause the at least one processor to, for each time step less than the maximum time step: determine a third temporary variable according to a first Khatri-Rao product based on the current time step version of the time domain integration model and the current time step version of the feature domain integration model; determine a current time step version of the model domain integration model based on the third temporary variable, the model domain integration model, and the mode-0 tensor unfolding of the combined integration model; determine a fourth temporary variable according to a second Khatri-Rao product based on the current time step version of the model domain integration model and the current time step version of the feature domain integration model; determine an updated current time step version of the time domain integration model based on the fourth temporary variable, the time domain integration model, and the mode-1 tensor unfolding of the combined integration model; determine a fifth temporary variable according to a third Khatri-Rao product based on the current time step version of the model domain integration model and the updated current time step version of the time domain integration model; and determine an updated current time step version of the feature domain integration model based on the fifth temporary variable, the feature domain integration model, and the mode-2 tensor unfolding of the combined integration model.

Clause 21: a system, comprising: at least one processor; and at least one non-transitory computer-readable medium containing one or more instructions that, when executed by the at least one processor, direct the at least one processor to perform the method of any of clauses 1-8.

Clause 22: a computer program product comprising at least one non-transitory computer-readable medium comprising one or more instructions that, when executed by at least one processor, cause the at least one processor to perform the method of any of clauses 1-8.
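
By way of illustration only, clauses 14 and 20 refer to Khatri-Rao products and to mode-0, mode-1, and mode-2 tensor unfoldings. The following is a minimal sketch of those two standard operations, not of the claimed update rule. It assumes plain nested Python lists; the helper names `khatri_rao` and `unfold`, and the row-major flattening order used in `unfold`, are assumptions made for this example and do not appear in the disclosure.

```python
def khatri_rao(a, b):
    """Column-wise Kronecker product of a (I x R) and b (J x R) -> (I*J x R)."""
    rank = len(a[0])
    if len(b[0]) != rank:
        raise ValueError("both matrices need the same number of columns")
    # Row (i * J + j) holds the element-wise product of a[i] and b[j].
    return [[a_row[r] * b_row[r] for r in range(rank)]
            for a_row in a for b_row in b]


def unfold(tensor, mode):
    """Mode-n unfolding of a 3-way tensor stored as tensor[i][j][k].

    The row index is the index along `mode`; the two remaining indices are
    flattened in their original order (one common convention, assumed here).
    """
    dims = (len(tensor), len(tensor[0]), len(tensor[0][0]))
    others = [m for m in range(3) if m != mode]
    rows = []
    for n in range(dims[mode]):
        row = []
        for p in range(dims[others[0]]):
            for q in range(dims[others[1]]):
                idx = [0, 0, 0]
                idx[mode], idx[others[0]], idx[others[1]] = n, p, q
                row.append(tensor[idx[0]][idx[1]][idx[2]])
        rows.append(row)
    return rows
```

Under this flattening convention, the mode-0 unfolding of a tensor built from factor matrices A, B, and C equals A multiplied by the transpose of `khatri_rao(B, C)`, which is why the two operations tend to appear together in factor-by-factor updates.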

These and other features and characteristics of the present disclosure, as well as the methods of operation and functions of the related elements of structure and the combinations of parts and economies of manufacture, will become more apparent upon consideration of the following description and the appended claims with reference to the accompanying drawings, all of which form a part of this specification, wherein like reference numerals designate corresponding parts in the various figures. It is to be expressly understood, however, that the drawings are for the purpose of illustration and description only and are not intended as a definition of the limits of the invention.

Drawings

Additional advantages and details are explained in more detail below with reference to the non-limiting exemplary embodiments shown in the schematic drawings in which:

FIG. 1 is a schematic diagram of a system for multi-domain integrated learning based on multivariate time series data according to some non-limiting embodiments or aspects;

FIG. 2 is a flow chart of a method for multi-domain ensemble learning based on multivariate time series data, according to some non-limiting embodiments or aspects;

FIG. 3 is a diagram of an exemplary environment in which the methods, systems, and/or computer program products described herein may be implemented, according to some non-limiting embodiments or aspects;

FIG. 4 is a schematic diagram of example components of one or more of the devices of FIG. 1 and/or FIG. 3, according to some non-limiting embodiments or aspects;

FIG. 5 is a schematic diagram of an example embodiment of a system for multi-domain integrated learning based on multivariate time series data according to some non-limiting embodiments or aspects;

FIG. 6 is a schematic diagram of an example embodiment of an integrated model for multi-domain integrated learning based on multivariate time series data according to some non-limiting embodiments or aspects; and

FIG. 7 is a schematic diagram of an example implementation of an integrated model for multi-domain integrated learning based on multivariate time series data according to some non-limiting embodiments or aspects.

Detailed Description

For purposes of the following description, the terms "end," "upper," "lower," "right," "left," "vertical," "horizontal," "top," "bottom," "cross," "longitudinal," and derivatives thereof shall relate to the embodiments as oriented in the figures. However, it is to be understood that the embodiments may assume various alternative variations and step sequences, except where expressly specified to the contrary. It is also to be understood that the specific devices and processes illustrated in the attached drawings, and described in the following specification are simply exemplary embodiments or aspects of the invention. Accordingly, specific dimensions and other physical characteristics relating to the embodiments or aspects disclosed herein are not to be considered as limiting.

No aspect, component, element, structure, act, step, function, instruction, or the like used herein should be construed as critical or essential unless explicitly described as such. Also, as used herein, the article "a" is intended to include one or more items, and is used interchangeably with "one or more" and "at least one". Furthermore, as used herein, the term "set" is intended to include one or more items (e.g., related items, unrelated items, a combination of related and unrelated items, etc.), and is used interchangeably with "one or more" or "at least one". Where only one item is desired, the terms "a" and "an" or similar language are used. Also, as used herein, the term "having" and the like are intended to be open-ended terms. In addition, unless explicitly stated otherwise, the phrase "based on" is intended to mean "based, at least in part, on".

As used herein, the term "acquirer mechanism" may refer to an entity licensed and/or approved by a transaction service provider to initiate a transaction (e.g., a payment transaction) using a payment device associated with the transaction service provider. The transaction that the acquirer mechanism may initiate may include a payment transaction (e.g., a purchase, an Original Credit Transaction (OCT), an Account Funds Transaction (AFT), etc.). In some non-limiting embodiments or aspects, the acquirer mechanism may be a financial institution, such as a bank. As used herein, the term "acquirer system" may refer to one or more computing devices operated by or on behalf of an acquirer mechanism, such as a server computer executing one or more software applications.

As used herein, the term "account identifier" may include one or more Primary Account Numbers (PANs), tokens, or other identifiers associated with customer accounts. The term "token" may refer to an identifier that serves as a substitute or replacement identifier for an original account identifier, such as a PAN. The account identifier may be an alphanumeric number or any combination of characters and/or symbols. The token may be associated with a PAN or other primary account identifier in one or more data structures (e.g., one or more databases, etc.) such that the token may be used to conduct transactions without directly using the primary account identifier. In some instances, a primary account identifier, such as a PAN, may be associated with multiple tokens for different individuals or purposes.

As used herein, the term "communication" may refer to the receipt, admission, transmission, transfer, provision, etc., of data (e.g., information, signals, messages, instructions, commands, etc.). Communication of one element (e.g., a device, system, component of a device or system, combination thereof, etc.) with another element means that the one element is capable of directly or indirectly receiving information from and/or transmitting information to the other element. This may refer to a direct or indirect connection (e.g., direct communication connection, indirect communication connection, etc.) that is wired and/or wireless in nature. In addition, although the transmitted information may be modified, processed, relayed, and/or routed between the first unit and the second unit, the two units may also be in communication with each other. For example, a first unit may communicate with a second unit even though the first unit passively receives information and does not actively send information to the second unit. As another example, if at least one intermediate unit processes information received from a first unit and transmits the processed information to a second unit, the first unit may communicate with the second unit.

As used herein, the term "computing device" may refer to one or more electronic devices configured to process data. In some examples, a computing device may include the necessary components to receive, process, and output data, such as processors, displays, memory, input devices, network interfaces, and the like. The computing device may be a mobile device. By way of example, mobile devices may include cellular telephones (e.g., smartphones or standard cellular telephones), portable computers, wearable devices (e.g., watches, glasses, lenses, clothing, etc.), personal digital assistants (PDAs), and/or other similar devices. The computing device may also be a desktop computer or other form of non-mobile computer.

As used herein, the terms "electronic wallet" and "electronic wallet application" refer to one or more electronic devices and/or software applications configured to initiate and/or conduct payment transactions. For example, the electronic wallet may include a mobile device executing an electronic wallet application, and may also include server-side software and/or databases for maintaining and providing transaction data to the mobile device. An "electronic wallet provider" may include an entity that provides and/or maintains electronic wallets for customers, e.g., Google, Android, Apple, Samsung, and/or other similar electronic payment systems. In some non-limiting examples, the issuer bank may be an electronic wallet provider.

As used herein, the term "issuer" may refer to one or more entities, such as banks, that provide customers with an account for conducting transactions (e.g., payment transactions), such as initiating credit and/or debit payments. For example, the issuer may provide an account identifier, such as a PAN, to the customer that uniquely identifies one or more accounts associated with the customer. The account identifier may be implemented on a portable financial device, such as an entity financial instrument (e.g., a payment card), and/or may be electronic and used for electronic payment. The term "issuer system" refers to one or more computer devices operated by or on behalf of an issuer, such as a server computer executing one or more software applications. For example, the issuer system may include one or more authorization servers for authorizing transactions.

As used herein, the term "merchant" may refer to a person or entity that provides goods and/or services to a customer or access to goods and/or services based on a transaction, such as a payment transaction. The term "merchant" or "merchant system" may also refer to one or more computer systems operated by or on behalf of a merchant, such as a server computer executing one or more software applications.

As used herein, a "point-of-sale (POS) device" may refer to one or more devices that may be used by a merchant to conduct transactions (e.g., payment transactions) and/or to process transactions. For example, the POS device may include one or more client devices. Additionally or alternatively, POS devices may include peripherals, card readers, scanning devices (e.g., code scanners), communication receivers, near-field communication (NFC) receivers, radio frequency identification (RFID) receivers and/or other contactless transceivers or receivers, contact-based receivers, payment terminals, etc. As used herein, a "point-of-sale (POS) system" may refer to one or more client devices and/or peripheral devices used by a merchant to conduct transactions. For example, the POS system may include one or more POS devices and/or other similar devices that may be used to conduct payment transactions. In some non-limiting embodiments or aspects, a POS system (e.g., a merchant POS system) may include one or more server computers programmed or configured to process online payment transactions through web pages, mobile applications, etc.

As used herein, the terms "client" and "client device" may refer to one or more client-side devices or systems (e.g., at a remote location of a transaction service provider) for initiating or facilitating a transaction (e.g., a payment transaction). As an example, a "client device" may refer to one or more POS devices used by a merchant, one or more acquirer host computers used by an acquirer, one or more mobile devices used by a user, and so forth. In some non-limiting embodiments or aspects, the client device may be an electronic device configured to communicate with one or more networks and initiate or facilitate transactions. For example, the client devices may include one or more computers, portable computers, laptop computers, tablet computers, mobile devices, cellular telephones, wearable devices (e.g., watches, glasses, lenses, clothing, etc.), PDAs, and the like. Further, "client" may also refer to an entity (e.g., merchant, acquirer, etc.) that owns, utilizes, and/or operates a client device for initiating a transaction (e.g., for initiating a transaction with a transaction service provider).

As used herein, the term "payment device" may refer to a payment card (e.g., credit or debit card), gift card, smart media, payroll card, healthcare card, wristband, machine-readable medium containing account information, key fob device or fob, RFID transponder, retailer discount or membership card, cellular telephone, electronic wallet mobile application, personal digital assistant (PDA), pager, security card, computing device, access card, wireless terminal, transponder, and the like. In some non-limiting embodiments or aspects, the payment device may include volatile or non-volatile memory to store information (e.g., account identifier, account holder name, etc.).

As used herein, the term "payment gateway" may refer to an entity (e.g., a merchant service provider, a payment service provider contracted with an acquirer, a payment aggregator (payment aggregator), etc.) that provides payment services (e.g., transaction service provider payment services, payment processing services, etc.) to one or more merchants and/or a payment processing system operated by or on behalf of such entity. The payment service may be associated with use of the portable financial device managed by the transaction service provider. As used herein, the term "payment gateway system" may refer to one or more computer systems, computer devices, servers, server groups, etc., operated by or on behalf of a payment gateway.

As used herein, the term "server" may refer to or include one or more computing devices operated by or facilitating communication and processing by multiple parties in a network environment, such as the internet, but it should be understood that communication may be facilitated through one or more public or private network environments, and that various other arrangements are possible. In addition, multiple computing devices (e.g., servers, POS devices, mobile devices, etc.) that communicate directly or indirectly in a network environment may constitute a "system". As used herein, reference to a "server" or "processor" may refer to the previously described servers and/or processors, different servers and/or processors, and/or combinations of servers and/or processors that were stated as performing the previous steps or functions. For example, as used in the specification and claims, a first server and/or a first processor stated as performing a first step or function may refer to the same or different server and/or processor stated as performing a second step or function.

As used herein, the term "transaction service provider" may refer to an entity that receives a transaction authorization request from a merchant or other entity and in some cases provides payment assurance through an agreement between the transaction service provider and an issuer. For example, the transaction service provider may include a payment network or any other entity that handles transactions. The term "transaction processing system" may refer to one or more computer systems operated by or on behalf of a transaction service provider, such as a transaction processing server executing one or more software applications. The transaction processing server may include one or more processors and, in some non-limiting embodiments or aspects, may be operated by or on behalf of a transaction service provider.

Non-limiting embodiments or aspects of the disclosed subject matter relate to systems, methods, and computer program products for ensemble learning, including, but not limited to, multi-domain ensemble learning based on multivariate time series data. For example, non-limiting embodiments or aspects of the disclosed subject matter provide: inputting at least a portion of the multivariate sequence data into each respective anomaly detection model of the plurality of anomaly detection models to generate a plurality of scores comprising a respective score for each respective anomaly detection model; combining the multivariate sequence data with the scores to generate combined intermediate data; inputting the combined intermediate data into a combined integration model (based on the model domain integration model, the time domain integration model, and the feature domain integration model) to generate an output score; and, in response to determining that the output score meets a threshold, at least one of: communicating an alert, generating a feature importance vector, or updating parameters of at least one of the combined integrated model, the model domain integrated model, the time domain integrated model, or the feature domain integrated model. These embodiments or aspects provide techniques and systems that improve anomaly detection by utilizing scores from multiple different models to create a single combined score that is more accurate and less likely to lead to false positives. Additionally, by identifying feature importance and/or updating the integrated model, the disclosed techniques and/or systems may actively learn and improve even if labels (e.g., the actual expected outputs) are sparse or unavailable.

FIG. 1 depicts an example system 100 for multi-domain ensemble learning based on multivariate time series data in accordance with some non-limiting embodiments or aspects. The system 100 may include a time series data database 102, an anomaly detection model system 104, an integrated model system 106, and/or user devices 108.

The time series data database 102 may include one or more devices capable of receiving information from and/or communicating information to the anomaly detection model system 104, the integrated model system 106, and/or the user devices 108. For example, the time series data database 102 may include computing devices, such as a server, a group of servers, and/or other similar devices. In some non-limiting embodiments or aspects, the time series data database 102 may be in communication with a data storage device, which may be local or remote to the time series data database 102. In some non-limiting embodiments or aspects, the time series data database 102 may be capable of receiving information from, storing information in, communicating information to, or searching information stored in a data storage device. In some non-limiting embodiments or aspects, the time series data database 102 may store and/or receive multivariate sequence data comprising a plurality of vectors. For example, each respective vector of the plurality of vectors may include elements based on a time series of a respective variable of the plurality of variables.

Anomaly detection model system 104 can include one or more devices capable of receiving information from and/or communicating information to time series data database 102, integrated model system 106, and/or user devices 108. For example, anomaly detection model system 104 may include a computing device, such as a computer, a set of computers, a server, a set of servers, and/or other similar devices. In some non-limiting embodiments or aspects, each anomaly detection model system 104 may include at least one anomaly detection model of a plurality of anomaly detection models. For example, each anomaly detection model system 104 can input at least a portion of the multivariate sequence data (from the time series data database 102) into at least one anomaly detection model to generate at least one score for each respective anomaly detection model.

The integrated model system 106 may include one or more devices capable of receiving information from and/or communicating information to the time series data database 102, the anomaly detection model system 104, and/or the user devices 108. For example, the integrated model system 106 may include a computing device, such as a computer, a set of computers, a server, a set of servers, and/or other similar devices. In some non-limiting embodiments or aspects, each integrated model system 106 may include at least one of a model domain integrated model, a time domain integrated model, a feature domain integrated model, and/or a combined integrated model (which may be based on the model domain integrated model, the time domain integrated model, and/or the feature domain integrated model) for generating the output score. In some non-limiting embodiments or aspects, the integrated model system 106 may combine the multivariate sequence data (from the time series data database 102) with multiple scores (from the anomaly detection model system 104) to generate combined intermediate data, as described herein. In some non-limiting embodiments or aspects, the integrated model system 106 may input the combined intermediate data (or the multivariate sequence data and/or the multiple scores) into the combined integrated model (or into at least one of the model domain integrated model, the time domain integrated model, and/or the feature domain integrated model) to generate the output score, as described herein. In some non-limiting embodiments or aspects, the integrated model system 106 may determine whether the output score meets a threshold, as described herein. In response to determining that the output score meets the threshold, the integrated model system 106 may communicate an alert to the user device 108, input the multivariate sequence data into the feature domain integration model to generate a feature importance vector comprising a feature importance score for each of the plurality of variables, or update parameters of at least one of the combined integration model, the model domain integration model, the time domain integration model, or the feature domain integration model, as described herein.

User device 108 may include one or more devices capable of receiving information from and/or communicating information to time series data database 102, anomaly detection model system 104, and/or integrated model system 106. For example, user device 108 may include a computing device, such as a computer, mobile device, and/or other similar device. In some non-limiting embodiments or aspects, the user device 108 may communicate multivariate sequence data (e.g., to the time series data database 102 and/or the integrated model system 106), as described herein. In some non-limiting embodiments or aspects, the user device 108 may receive output scores and/or alerts from the integrated model system 106, as described herein.

The number and arrangement of systems and/or devices shown in FIG. 1 are provided as examples. Additional systems and/or devices, fewer systems and/or devices, different systems and/or devices, and/or systems and/or devices arranged in a different manner than those shown in FIG. 1 may be present. Furthermore, two or more of the systems or apparatuses shown in FIG. 1 may be implemented within a single system and/or apparatus, or a single system or apparatus shown in FIG. 1 may be implemented as multiple distributed systems or apparatuses. Additionally or alternatively, a set of systems (e.g., one or more systems) and/or a set of devices (e.g., one or more devices) of system 100 may perform one or more functions described as being performed by another set of systems or another set of devices of system 100.

Referring now to FIG. 2, an example process 200 for multi-domain ensemble learning based on multivariate time series data is shown in accordance with some non-limiting embodiments or aspects. The steps shown in FIG. 2 are for illustration purposes only. It will be appreciated that additional steps, fewer steps, different steps, and/or a different order of steps may be used in non-limiting embodiments or aspects. In some non-limiting embodiments or aspects, one or more of the steps of process 200 may be performed (e.g., entirely, partially, etc.) by integrated model system 106 (e.g., one or more devices of integrated model system 106). In some non-limiting embodiments or aspects, one or more of the steps of process 200 may be performed (e.g., entirely, partially, etc.) by another system, another device, another set of systems, or another set of devices separate from or including integrated model system 106 (such as time series data database 102, anomaly detection model system 104, and/or user device 108).

As shown in FIG. 2, at step 202, process 200 may include receiving multivariate time series data. For example, the time series data database 102, the integrated model system 106, and/or the anomaly detection model system 104 may receive multivariate sequence data from the user devices 108. Additionally or alternatively, the time series data database 102 may store multivariate sequence data, and/or the anomaly detection model system 104 and/or the integrated model system 106 may receive the multivariate sequence data from the time series data database 102.

In some non-limiting embodiments or aspects, the multivariate sequence data can comprise a plurality of vectors. For example, each respective vector of the plurality of vectors may include elements based on a time series of respective variables of the plurality of variables.

In some non-limiting embodiments or aspects, the multivariate time series data may be time domain downsampled (e.g., by the anomaly detection model system 104 and/or the integrated model system 106). For example, a mask may be applied to the multivariate time series data to downsample it (e.g., to select a subset of time steps for use as input, for training, etc.) according to an exponential probability decay over the time domain. In some non-limiting embodiments or aspects, the length of the mask may be less (e.g., much less) than the total number of time steps and/or the current time step. In some non-limiting embodiments or aspects, by using time domain downsampling, fewer records (e.g., time steps) of the multivariate sequence data may be considered (e.g., used for training), thereby saving computational resources (e.g., memory), reducing training time, and improving efficiency while maintaining or improving performance (e.g., in terms of accuracy, area under the curve (AUC), recall, etc.). Table 1 shows the accuracy, precision, recall, and AUC for training the models described herein based on the entire multivariate time series dataset (raw), batch training (batch), random downsampling, and the time domain downsampling described herein, and Table 2 shows the memory usage, training time, and AUC of batch training, random downsampling, and time domain downsampling relative to training on the entire multivariate time series dataset (raw).

TABLE 1

Compared with raw	Memory (%)	Time (%)	AUC (%)
Batch training	-83.00	23.00	5.61
Random downsampling	-79.83	-76.00	6.51
Time domain downsampling	-97.97	-98.26	26.04

TABLE 2
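The time domain downsampling described above can be illustrated with a minimal sketch. It assumes that the probability of keeping a time step decays exponentially with the age of that step, and that steps are drawn without replacement; the function name `time_domain_mask` and the parameters `mask_len`, `decay`, and `seed` are hypothetical and do not come from the disclosure.

```python
import numpy as np

def time_domain_mask(num_steps, mask_len, decay=0.01, seed=0):
    """Pick `mask_len` time steps, favouring recent ones.

    The selection probability of step t decays exponentially with its
    age (num_steps - 1 - t). `decay` is a hypothetical rate parameter.
    """
    rng = np.random.default_rng(seed)
    age = np.arange(num_steps)[::-1]      # newest step has age 0
    prob = np.exp(-decay * age)
    prob /= prob.sum()                    # normalise to a distribution
    idx = rng.choice(num_steps, size=mask_len, replace=False, p=prob)
    return np.sort(idx)                   # keep chronological order

# Toy data: 1,000 time steps of 5 variables (synthetic, for illustration).
series = np.random.default_rng(1).normal(size=(1000, 5))
mask = time_domain_mask(num_steps=len(series), mask_len=50)
sampled = series[mask]                    # 50 records instead of 1,000
```

Because the mask is much shorter than the series, only the selected records need to be held in memory and passed to training, which is the source of the savings the text describes.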

As shown in FIG. 2, at step 204, process 200 may include inputting data into an anomaly detection model to generate a score. For example, the anomaly detection model system 104 may input at least a portion of the multivariate sequence data into each respective anomaly detection model of the plurality of anomaly detection models to generate a plurality of scores, which may include a respective score for each respective anomaly detection model.

In some non-limiting embodiments or aspects, each anomaly detection model of the plurality of anomaly detection models may include at least one of a Bayesian model, a Kullback-Leibler importance estimation procedure (KLIEP) model, a ChangeFinder model, a cumulative sum (CUSUM) model, or any combination thereof.
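As an illustration of how one such model can turn a series into scores, the following is a minimal sketch of a two-sided CUSUM statistic for a single variable. It is not taken from the disclosure: the standardisation against an initial reference window, the function name `cusum_scores`, and the parameters `ref_len` and `drift` are assumptions made for the example.

```python
import numpy as np

def cusum_scores(x, ref_len=20, drift=0.5):
    """Two-sided CUSUM score per time step for a 1-D series.

    The series is standardised with the mean and standard deviation of
    its first `ref_len` points (an assumption). `drift` is a
    hypothetical slack term that absorbs small deviations.
    """
    x = np.asarray(x, dtype=float)
    ref = x[:ref_len]
    z = (x - ref.mean()) / (ref.std() + 1e-12)
    pos = np.zeros(len(z))                # accumulates upward shifts
    neg = np.zeros(len(z))                # accumulates downward shifts
    for t in range(1, len(z)):
        pos[t] = max(0.0, pos[t - 1] + z[t] - drift)
        neg[t] = max(0.0, neg[t - 1] - z[t] - drift)
    return np.maximum(pos, neg)

# Synthetic series with a level shift halfway through.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 1, 50), rng.normal(5, 1, 50)])
scores = cusum_scores(x)
```

The score stays small while the series behaves like the reference window and grows steadily after the shift, so a downstream model can treat it as one per-time-step input among several.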

As shown in fig. 2, at step 206, process 200 may include combining the data and the score to generate combined data. For example, the anomaly detection model system 104 and/or the integrated model system 106 can combine the multivariate sequence data with a plurality of scores to generate combined intermediate data.

In some non-limiting embodiments or aspects, the anomaly detection model system 104 can communicate the combined intermediate data to the integrated model system 106. In some non-limiting embodiments or aspects, the anomaly detection model system 104 can communicate the plurality of scores to the integrated model system 106, and/or the time series data database 102 can communicate the multivariate sequence data to the integrated model system 106. In response to receiving the multivariate sequence data and the plurality of scores, the integrated model system 106 may combine the multivariate sequence data with the plurality of scores to generate combined intermediate data.
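The text does not specify the layout of the combined intermediate data. One simple possibility, shown here purely as an assumption, is to append the per-model scores to the raw variables as extra columns at each time step. The function name `combine` and the array shapes are hypothetical.

```python
import numpy as np

def combine(series, scores):
    """Append per-model scores to the raw variables, step by step.

    series: (T, V) array, T time steps of V variables.
    scores: (T, M) array, one column per anomaly detection model.
    Returns a (T, V + M) array (an assumed layout, not the source's).
    """
    series = np.asarray(series, dtype=float)
    scores = np.asarray(scores, dtype=float)
    if series.shape[0] != scores.shape[0]:
        raise ValueError("series and scores must cover the same time steps")
    return np.hstack([series, scores])

# Four time steps, three variables, two anomaly detection models.
combined = combine(np.zeros((4, 3)), np.ones((4, 2)))
```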

As shown in fig. 2, at step 208, process 200 may include inputting the combined data into at least one integrated model to generate an output score. For example, the integrated model system 106 may input combined intermediate data into the combined integrated model to generate the output score. In some non-limiting embodiments, the combined integration model may be based on at least two (e.g., all three) of the model domain integration model, the time domain integration model, and/or the feature domain integration model.

As shown in fig. 2, at step 210, process 200 may include determining whether the output score meets a threshold. For example, the integrated model system 106 may determine that the output score meets a threshold.

As shown in fig. 2, at step 212, process 200 may include taking an action in response to determining whether the output score meets a threshold. For example, the integrated model system 106 may take action based on the determination. In some non-limiting embodiments or aspects, taking action may include at least one of the integrated model system 106 communicating an alert to the user device 108, generating a feature importance vector, or updating parameters of the integrated model. Additionally or alternatively, the integrated model system 106 may determine whether to label or communicate at least a portion of the multivariate sequence data.

In some non-limiting embodiments or aspects, in response to determining that the output score meets a threshold, the integrated model system 106 may communicate an alert to the user device 108.

In some non-limiting embodiments or aspects, in response to determining that the output score meets a threshold, the integrated model system 106 may input multivariate sequence data into the feature domain integrated model to generate a feature importance vector. For example, the feature importance vector may include a feature importance score for each of the plurality of variables.

In some non-limiting embodiments or aspects, in response to determining that the output score meets a threshold, the integrated model system 106 may update parameters of at least one of a combined integrated model, a model domain integrated model, a time domain integrated model, a feature domain integrated model, or any combination thereof.
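The threshold check and the follow-up actions can be sketched as a small dispatcher. This is only an illustration: "meets a threshold" is assumed to mean greater than or equal to, and the function `act_on_score` and the stand-in actions are hypothetical.

```python
def act_on_score(output_score, threshold, actions):
    """Run every follow-up action if the score meets the threshold.

    "Meets" is taken to mean >= (an assumption). `actions` is a
    hypothetical list of callables, e.g. send an alert, compute feature
    importance, or update model parameters.
    """
    if output_score < threshold:
        return []
    return [action(output_score) for action in actions]

# Stand-in actions that only describe what they would do.
results = act_on_score(
    0.93,
    threshold=0.8,
    actions=[
        lambda s: f"alert user device: score={s:.2f}",
        lambda s: "generate feature importance vector",
        lambda s: "update model parameters",
    ],
)
```

Passing the actions as a list reflects the "at least one of" wording: any subset of the three responses can be enabled without changing the check itself.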

In some non-limiting embodiments or aspects, updating the integrated model may include the integrated model system 106 initializing a combined integrated model. Additionally or alternatively, for each time step less than the maximum time step, the integrated model system 106 may: determine a first temporary variable based on the combined intermediate data and the current time step version of the combined integrated model; determine a second temporary variable based on a ratio of the label to the first temporary variable; back-propagate the feature domain integration model based on the transpose of the combined intermediate data and the second temporary variable; and/or determine a next time step version of the combined integrated model based on the back-propagation of the feature domain integration model and the current time step version of the combined integrated model.

In some non-limiting embodiments or aspects, updating the integration model may include, for each time step less than the maximum time step, the integrated model system 106: determining a third temporary variable from a first Khatri-Rao product based on the current time step version of the time domain integration model and the current time step version of the feature domain integration model; determining a current time step version of the model domain integration model based on the third temporary variable, the model domain integration model, and the mode-0 tensor unfolding of the combined integration model; determining a fourth temporary variable from a second Khatri-Rao product based on the current time step version of the model domain integration model and the current time step version of the feature domain integration model; determining an updated current time step version of the time domain integration model based on the fourth temporary variable, the time domain integration model, and the mode-1 tensor unfolding of the combined integration model; determining a fifth temporary variable from a third Khatri-Rao product based on the current time step version of the model domain integration model and the updated current time step version of the time domain integration model; and determining an updated current time step version of the feature domain integration model based on the fifth temporary variable, the feature domain integration model, and the mode-2 tensor unfolding of the combined integration model.
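The sequence of Khatri-Rao products and mode-wise unfoldings above has the same shape as one sweep of alternating least squares for a three-way CP decomposition. The sketch below shows that standard technique under stated assumptions; it is not claimed to be the exact update rule of the disclosure. The names `khatri_rao`, `unfold`, and `als_sweep`, and the symbols `A` (model domain), `B` (time domain), `C` (feature domain), and `X` (the three-way tensor), are hypothetical.

```python
import numpy as np

def khatri_rao(P, Q):
    """Column-wise Kronecker product; P's row index varies slowest."""
    return (P[:, None, :] * Q[None, :, :]).reshape(-1, P.shape[1])

def unfold(X, mode):
    """Mode-n unfolding (C order), consistent with `khatri_rao` above."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def als_sweep(X, A, B, C):
    """One least-squares pass over the model, time and feature factors."""
    kr = khatri_rao(B, C)                 # plays the role of the third temporary variable
    A = np.linalg.lstsq(kr, unfold(X, 0).T, rcond=None)[0].T
    kr = khatri_rao(A, C)                 # fourth temporary variable
    B = np.linalg.lstsq(kr, unfold(X, 1).T, rcond=None)[0].T
    kr = khatri_rao(A, B)                 # fifth temporary variable
    C = np.linalg.lstsq(kr, unfold(X, 2).T, rcond=None)[0].T
    return A, B, C

def reconstruct(A, B, C):
    """Rebuild the tensor from its three factor matrices."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)

# Synthetic rank-2 tensor: 4 models x 6 time steps x 5 features.
rng = np.random.default_rng(0)
A0, B0, C0 = (rng.normal(size=(n, 2)) for n in (4, 6, 5))
X = reconstruct(A0, B0, C0)

# Start from random factors and run a few sweeps, tracking the error.
A, B, C = (rng.normal(size=(n, 2)) for n in (4, 6, 5))
errors = [np.linalg.norm(X - reconstruct(A, B, C))]
for _ in range(5):
    A, B, C = als_sweep(X, A, B, C)
    errors.append(np.linalg.norm(X - reconstruct(A, B, C)))
```

Each factor is refit while the other two are held fixed, and the most recently updated factors are reused in the next product, which mirrors the ordering in the paragraph above. The reconstruction error cannot increase from one sweep to the next, although convergence to the exact factors is not guaranteed.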

In some non-limiting embodiments or aspects, the loss function of the combined integration model may be based on a model domain integration model, a time domain integration model, and a feature domain integration model. For example, the loss function of the combined integration model may be based on the following equation:

In some non-limiting embodiments or aspects, the integrated model system 106 may determine whether to label or communicate at least a portion of the multivariate sequence data based on the output score. For example, in response to determining to label, the integrated model system 106 may label at least a portion of the multivariate sequence data based on the output score. Additionally or alternatively, in response to determining to communicate, the integrated model system 106 may communicate at least a portion of the multivariate sequence data and the output score to the user device 108.

Referring now to fig. 3, fig. 3 is a diagram of an exemplary environment 300 in which systems, products, and/or methods as described herein may be implemented, according to some non-limiting embodiments or aspects. As shown in fig. 3, environment 300 includes a transaction service provider system 302, an issuer system 304, a client device 306, a merchant system 308, an acquirer system 310, and a communication network 312. In some non-limiting embodiments or aspects, each of the time series data database 102, the anomaly detection model system 104, the integrated model system 106, and/or the user device 108 may be implemented by (e.g., be a portion of) the transaction service provider system 302. In some non-limiting embodiments or aspects, at least one of the time series data database 102, the anomaly detection model system 104, the integrated model system 106, and/or the user device 108 may be implemented by (e.g., be a portion of) another system, another device, another set of systems, or another set of devices separate from or including the transaction service provider system 302 (such as the issuer system 304, the merchant system 308, the acquirer system 310, etc.).

Transaction service provider system 302 may include one or more devices capable of receiving information from and/or transmitting information to issuer system 304, client device 306, merchant system 308, and/or acquirer system 310 via communication network 312. For example, the transaction service provider system 302 may include computing devices, such as servers (e.g., transaction processing servers, etc.), groups of servers, and/or other similar devices. In some non-limiting embodiments or aspects, the transaction service provider system 302 may be associated with a transaction service provider, as described herein. In some non-limiting embodiments or aspects, the transaction service provider system 302 may communicate with a data storage device, which may be local or remote to the transaction service provider system 302. In some non-limiting embodiments or aspects, the transaction service provider system 302 can receive information from, store information in, transmit information to, or search information stored in a data storage device.

Issuer system 304 may include one or more devices capable of receiving information from and/or transmitting information to transaction service provider system 302, client device 306, merchant system 308, and/or acquirer system 310 via communication network 312. For example, the issuer system 304 may include computing devices, such as servers, groups of servers, and/or other similar devices. In some non-limiting embodiments or aspects, the issuer system 304 may be associated with an issuer, as described herein. For example, issuer system 304 may be associated with an issuer that issues credit accounts, debit accounts, credit cards, debit cards, and the like to users associated with client device 306.

Client device 306 may include one or more devices capable of receiving information from and/or transmitting information to transaction service provider system 302, issuer system 304, merchant system 308, and/or acquirer system 310 via communication network 312. Additionally or alternatively, each client device 306 may include a device capable of receiving information from and/or transmitting information to other client devices 306 via communication network 312, another network (e.g., a temporary network, a local network, a private network, a virtual private network, etc.), and/or any other suitable communication technology. For example, client device 306 may include a client device or the like. In some non-limiting embodiments or aspects, client device 306 may or may not be capable of receiving information (e.g., from merchant system 308 or from another client device 306) via a short-range wireless communication connection (e.g., an NFC communication connection, an RFID communication connection, etc.), and/or communicating information (e.g., to merchant system 308) via a short-range wireless communication connection.

Merchant system 308 may include one or more devices capable of receiving information from and/or transmitting information to transaction service provider system 302, issuer system 304, client device 306, and/or acquirer system 310 via communication network 312. Merchant system 308 may also include a device capable of receiving information from client device 306 via communication network 312, a communication connection with client device 306 (e.g., an NFC communication connection, an RFID communication connection, etc.), and/or the like, and/or transmitting information to client device 306 via communication network 312, the communication connection, and/or the like. In some non-limiting embodiments or aspects, merchant system 308 may include a computing device, e.g., a server group, a client device group, and/or other similar devices. In some non-limiting embodiments or aspects, merchant system 308 may be associated with a merchant, as described herein. In some non-limiting embodiments or aspects, merchant system 308 may include one or more client devices. For example, merchant system 308 may include a client device that allows a merchant to communicate information to transaction service provider system 302. In some non-limiting embodiments or aspects, merchant system 308 may include one or more devices, such as computers, computer systems, and/or peripheral devices, that can be used by a merchant to conduct transactions with users. For example, merchant system 308 may include a POS device and/or a POS system.

The acquirer system 310 may include one or more devices capable of receiving information from and/or transmitting information to the transaction service provider system 302, the issuer system 304, the client device 306, and/or the merchant system 308 via the communication network 312. For example, the acquirer system 310 may include computing devices, servers, server groups, and the like. In some non-limiting embodiments or aspects, the acquirer system 310 can be associated with an acquirer, as described herein.

The communication network 312 may include one or more wired and/or wireless networks. For example, the communication network 312 may include a cellular network (e.g., a Long Term Evolution (LTE) network, a third generation (3G) network, a fourth generation (4G) network, a fifth generation (5G) network, a Code Division Multiple Access (CDMA) network, etc.), a Public Land Mobile Network (PLMN), a Local Area Network (LAN), a Wide Area Network (WAN), a Metropolitan Area Network (MAN), a telephone network (e.g., a Public Switched Telephone Network (PSTN)), a private network (e.g., a private network associated with a transaction service provider), a temporary network, an intranet, the internet, a fiber-based network, a cloud computing network, etc., and/or a combination of these or other types of networks.

In some non-limiting embodiments or aspects, processing the transaction may include generating and/or transmitting at least one transaction message (e.g., an authorization request, an authorization response, any combination thereof, etc.). For example, a client device (e.g., client device 306, a POS device of merchant system 308, etc.) may initiate a transaction, such as by generating an authorization request. Additionally or alternatively, a client device (e.g., client device 306, at least one device of merchant system 308, etc.) may transmit an authorization request. For example, the client device 306 may communicate the authorization request to the merchant system 308 and/or a payment gateway (e.g., a payment gateway of the transaction service provider system 302, a third party payment gateway separate from the transaction service provider system 302, etc.). Additionally or alternatively, merchant system 308 (e.g., its POS device) may communicate the authorization request to acquirer system 310 and/or the payment gateway. In some non-limiting embodiments or aspects, the acquirer system 310 and/or the payment gateway may communicate the authorization request to the transaction service provider system 302 and/or the issuer system 304. Additionally or alternatively, the transaction service provider system 302 may communicate the authorization request to the issuer system 304. In some non-limiting embodiments or aspects, the issuer system 304 may determine an authorization decision (e.g., authorization, denial, etc.) based on the authorization request. For example, the authorization request may cause the issuer system 304 to determine an authorization decision based on the authorization request. In some non-limiting embodiments or aspects, the issuer system 304 may generate an authorization response based on the authorization decision. Additionally or alternatively, the issuer system 304 may transmit an authorization response. 
For example, the issuer system 304 may communicate the authorization response to the transaction service provider system 302 and/or the payment gateway. Additionally or alternatively, transaction service provider system 302 and/or payment gateway may communicate the authorization response to acquirer system 310, merchant system 308, and/or client device 306. Additionally or alternatively, acquirer system 310 may communicate the authorization response to merchant system 308 and/or the payment gateway. Additionally or alternatively, the payment gateway may transmit an authorization response to merchant system 308 and/or client device 306. Additionally or alternatively, merchant system 308 may transmit an authorization response to client device 306. In some non-limiting embodiments or aspects, merchant system 308 may receive an authorization response (e.g., from acquirer system 310 and/or a payment gateway). Additionally or alternatively, merchant system 308 may complete the transaction (e.g., provide, ship, and/or deliver goods and/or services associated with the transaction; fulfill orders associated with the transaction; any combination thereof, etc.) based on the authorization response.

For purposes of illustration, processing the transaction may include generating a transaction message (e.g., an authorization request, etc.) based on an account identifier of the customer (e.g., associated with the client device 306, etc.) and/or transaction data associated with the transaction. For example, the merchant system 308 (e.g., a client device of the merchant system 308, a POS device of the merchant system 308, etc.) may initiate the transaction, for example, by generating an authorization request (e.g., in response to receiving an account identifier from a portable financial device of a customer, etc.). Additionally or alternatively, merchant system 308 may transmit an authorization request to acquirer system 310. Additionally or alternatively, the acquirer system 310 may communicate the authorization request to the transaction service provider system 302. Additionally or alternatively, the transaction service provider system 302 may communicate the authorization request to the issuer system 304. The issuer system 304 may determine an authorization decision (e.g., authorization, denial, etc.) based on the authorization request, and/or the issuer system 304 may generate an authorization response based on the authorization decision and/or the authorization request. Additionally or alternatively, the issuer system 304 may communicate an authorization response to the transaction service provider system 302. Additionally or alternatively, the transaction service provider system 302 may transmit an authorization response to the acquirer system 310, which may transmit the authorization response to the merchant system 308.

For purposes of illustration, clearing and/or settlement of the transaction may include generating a message (e.g., a clearing message, a settlement message, etc.) based on an account identifier of the customer (e.g., associated with the client device 306, etc.) and/or transaction data associated with the transaction. For example, merchant system 308 may generate at least one clearing message (e.g., a plurality of clearing messages, a batch of clearing messages, etc.). Additionally or alternatively, merchant system 308 may transmit a clearing message to acquirer system 310. Additionally or alternatively, the acquirer system 310 may communicate the clearing message to the transaction service provider system 302. Additionally or alternatively, the transaction service provider system 302 may communicate the clearing message to the issuer system 304. Additionally or alternatively, the issuer system 304 may generate at least one settlement message based on the clearing message. Additionally or alternatively, issuer system 304 may communicate the settlement message and/or funds to transaction service provider system 302 (and/or a settlement banking system associated with transaction service provider system 302). Additionally or alternatively, the transaction service provider system 302 (and/or the settlement banking system) may communicate settlement messages and/or funds to the acquirer system 310, which may communicate the settlement messages and/or funds to the merchant system 308 (and/or an account associated with the merchant system 308).

The number and arrangement of systems, devices, and/or networks shown in fig. 3 are provided as examples. There may be additional systems, devices, and/or networks; fewer systems, devices, and/or networks; different systems, devices, and/or networks; and/or systems, devices, and/or networks arranged differently than those shown in fig. 3. Furthermore, two or more of the systems or devices shown in fig. 3 may be implemented within a single system or device, or a single system or device shown in fig. 3 may be implemented as multiple distributed systems or devices. Additionally or alternatively, a set of systems (e.g., one or more systems) and/or a set of devices (e.g., one or more devices) of environment 300 may perform one or more functions described as being performed by another set of systems or another set of devices of environment 300.

Referring now to fig. 4, a diagram of example components of a device 400 according to a non-limiting embodiment or aspect is shown. As an example, the device 400 may correspond to at least one of the time series data database 102, the anomaly detection model system 104, the integrated model system 106, and/or the user device 108 of fig. 1, and/or at least one of the transaction service provider system 302, the issuer system 304, the client device 306, the merchant system 308, and/or the acquirer system 310 of fig. 3. In some non-limiting embodiments or aspects, such systems or devices may include at least one device 400 and/or at least one component of device 400. The number and arrangement of components shown are provided as examples. In some non-limiting embodiments or aspects, the device 400 may include additional components, fewer components, different components, or components arranged in a different manner than those shown in fig. 4. Additionally or alternatively, a set of components (e.g., one or more components) of device 400 may perform one or more functions described as being performed by another set of components of device 400.

As shown in fig. 4, device 400 may include a bus 402, a processor 404, a memory 406, a storage component 408, an input component 410, an output component 412, and a communication interface 414. Bus 402 may include components that permit communication among the components of device 400. In some non-limiting embodiments or aspects, the processor 404 may be implemented in hardware, firmware, or a combination of hardware and software. For example, processor 404 may include a processor (e.g., a central processing unit (CPU), a graphics processing unit (GPU), an accelerated processing unit (APU), etc.), a microprocessor, a digital signal processor (DSP), and/or any processing component that may be programmed to perform functions (e.g., a field-programmable gate array (FPGA), an application-specific integrated circuit (ASIC), etc.). Memory 406 may include random access memory (RAM), read-only memory (ROM), and/or another type of dynamic or static storage device (e.g., flash memory, magnetic memory, optical memory, etc.) that stores information and/or instructions for use by processor 404.

With continued reference to fig. 4, the storage component 408 can store information and/or software related to operation and use of the device 400. For example, the storage component 408 may comprise a hard disk (e.g., magnetic disk, optical disk, magneto-optical disk, solid state disk, etc.) and/or another type of computer-readable medium. Input component 410 can include components that permit device 400 to receive information, such as through user input (e.g., a touch screen display, keyboard, keypad, mouse, buttons, switches, microphone, etc.). Additionally or alternatively, the input component 410 can include a sensor (e.g., a Global Positioning System (GPS) component, accelerometer, gyroscope, actuator, etc.) for sensing information. Output component 412 can include components (e.g., a display, a speaker, one or more light-emitting diodes (LEDs), etc.) that provide output information from device 400. The communication interface 414 may include transceiver-like components (e.g., transceivers, separate receivers and transmitters, etc.) that enable the device 400 to communicate with other devices, such as through a wired connection, a wireless connection, or a combination of wired and wireless connections. Communication interface 414 may permit device 400 to receive information from and/or provide information to another device. For example, communication interface 414 may include an Ethernet interface, an optical interface, a coaxial interface, an infrared interface, a radio frequency (RF) interface, a universal serial bus (USB) interface, a cellular network interface, etc.

Device 400 may perform one or more of the processes described herein. Device 400 may perform these processes based on the processor 404 executing software instructions stored by a computer-readable medium such as the memory 406 and/or the storage component 408. The computer-readable medium may include any non-transitory memory device. A memory device includes memory space located within a single physical storage device or memory space spread across multiple physical storage devices. The software instructions may be read into memory 406 and/or storage component 408 from another computer-readable medium or from another device via communication interface 414. The software instructions stored in the memory 406 and/or the storage component 408, when executed, may cause the processor 404 to perform one or more processes described herein. Additionally or alternatively, hardwired circuitry may be used in place of or in combination with software instructions to perform one or more processes described herein. Thus, embodiments described herein are not limited to any specific combination of hardware circuitry and software. The term "programmed or configured" as used herein refers to an arrangement of software, hardware circuitry, or any combination thereof on one or more devices.

Referring now to fig. 5, a diagram of an example embodiment 500 of a system for multi-domain integrated learning based on multivariate time series data is shown, according to some non-limiting embodiments or aspects. As shown in fig. 5, embodiment 500 may include multivariate time series data 501, scores 503, anomaly detection models 504, time series integration model (TSEM) 506, user device 508, model domain integration model 510, time domain integration model 512, feature domain integration model 514, detection model 516, alert system 518, feature contributor system 520, model interpretation system 522, policy system 524, and/or TSEM update system 526. In some non-limiting embodiments or aspects, the embodiment 500 may be the same as, similar to, and/or part of the system 100. In some non-limiting embodiments or aspects, the multivariate time series data 501 may be stored in a database (e.g., the time series data database 102, etc.). In some non-limiting embodiments or aspects, the anomaly detection models 504 may be the same as, similar to, and/or part of the anomaly detection model system 104. In some non-limiting embodiments or aspects, scores 503 may be outputs of the anomaly detection models 504, as described herein. In some non-limiting embodiments or aspects, the TSEM 506 may be the same as, similar to, and/or part of the integrated model system 106. In some non-limiting embodiments or aspects, the user device 508 may be the same as or similar to the user device 108. In some non-limiting embodiments or aspects, the model domain integration model 510, the time domain integration model 512, the feature domain integration model 514, and/or the detection model 516 may be part of the TSEM 506 and/or may be the same as, similar to, and/or part of the integrated model system 106.
In some non-limiting embodiments or aspects, the alert system 518, the feature contributor system 520, the model interpretation system 522, the policy system 524, and/or the TSEM update system 526 may be the same as, similar to, and/or part of the integrated model system 106. In some non-limiting embodiments or aspects, at least one of the alert system 518, the feature contributor system 520, the model interpretation system 522, the policy system 524, and/or the TSEM update system 526 may be another system separate from the integrated model system 106 and/or may be part of another system separate from or including the integrated model system 106. The number and arrangement of components (e.g., data, scores, models, devices, and/or systems) of the illustrated embodiment 500 are provided as examples. In some non-limiting embodiments or aspects, embodiment 500 may include additional components, fewer components, different components, or components arranged in a different manner than those shown in fig. 5. Additionally or alternatively, a set of components (e.g., one or more components) of embodiment 500 may perform one or more functions described as being performed by another set of components of embodiment 500.

In some non-limiting embodiments or aspects, the anomaly detection model 504 can receive multivariate sequence data 501, as described herein. Additionally or alternatively, TSEM 506 may receive multivariate sequence data 501, as described herein. In some non-limiting embodiments or aspects, the multivariate sequence data 501 can comprise a plurality of vectors each comprising elements based on a time series of corresponding variables of the plurality of variables, as described herein.

In some non-limiting embodiments or aspects, at least a portion of the multivariate sequence data 501 may be input into each respective anomaly detection model 504 to generate a plurality of scores 503, as described herein. For example, scores 503 may include a respective score for each respective anomaly detection model 504. In some non-limiting embodiments or aspects, the anomaly detection models 504 may include at least one change point model. Additionally or alternatively, the scores 503 may include at least one change point score of the respective change point model.

In some non-limiting embodiments or aspects, the multivariate sequence data 501 may be combined with the score 503 (e.g., by TSEM 506, etc.) to generate combined intermediate data, as described herein.

In some non-limiting embodiments or aspects, the combined intermediate data may be input into the TSEM 506 (which may include a combined integration model) to generate an output score, as described herein. For example, TSEM 506 may include a combined integration model based on model domain integration model 510, time domain integration model 512, and feature domain integration model 514, as described herein.

In some non-limiting embodiments or aspects, the detection model 516 may determine that the output score meets a threshold, as described herein.

In some non-limiting embodiments or aspects, in response to determining that the output score meets a threshold, the alert system 518 can communicate an alert to at least one user device 508, as described herein. For example, alert system 518 may communicate an alert to policy system 524, which may communicate the alert to user device 508. In some non-limiting embodiments or aspects, the alert system 518 and/or the policy system 524 may communicate to the user device 508 (e.g., with an alert) a record selected from the multivariate sequence data 501 (e.g., a record of the multivariate sequence data 501 that results in the detection model 516 detecting an anomaly based on the output score meeting a threshold). In some non-limiting embodiments or aspects, the user device 508 may receive input (e.g., from a user of the user device 508 viewing the alert and/or the selected record) indicating the true label of the selected record (e.g., a label indicating whether the record is actually an anomaly, as confirmed by a human reviewer). The user device 508 may communicate the input and/or the true label based on the input to the TSEM update system 526. In some non-limiting embodiments or aspects, TSEM 506 and/or policy system 524 may determine a predicted label for the selected record (e.g., a label indicating whether the record is predicted by TSEM 506 to be anomalous). TSEM 506 and/or policy system 524 may communicate the predicted label to TSEM update system 526. In some non-limiting embodiments or aspects, TSEM update system 526 may update parameters (e.g., based on true labels and/or predicted labels) of at least one of TSEM 506 (e.g., a combined integration model), model domain integration model 510, time domain integration model 512, and/or feature domain integration model 514.

In some non-limiting embodiments or aspects, the feature contributor system 520 may generate feature importance vectors, as described herein. For example, the feature contributor system 520 may input the multivariate sequence data 501 into the feature domain integration model 514 to generate a feature importance vector, which may include a feature importance score for each of the plurality of variables. In some non-limiting embodiments or aspects, the feature contributor system 520 may communicate feature importance vectors to the model interpretation system 522. In some non-limiting embodiments or aspects, the feature contributor system 520 and/or the model interpretation system 522 may communicate at least one communication based on the feature importance vector (e.g., a communication indicating the importance of each variable, such as an importance score, an importance score-based ranking, etc.) to the user device 508 (e.g., via the policy system 524).
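By way of non-limiting illustration only, a communication based on an importance-score ranking might be assembled as in the following sketch. The function name `rank_features` and the variable names are hypothetical and are not taken from the disclosure; the feature importance vector is assumed to be a one-dimensional array with one score per variable.

```python
import numpy as np

def rank_features(importance, names):
    """Return (name, score) pairs sorted from most to least important.

    `importance` is assumed to hold one importance score per variable,
    in the same order as `names` (both are illustrative inputs).
    """
    importance = np.asarray(importance, dtype=float)
    order = np.argsort(importance)[::-1]  # indices in descending score order
    return [(names[i], float(importance[i])) for i in order]

ranking = rank_features([0.1, 0.7, 0.2], ["amount", "count", "latency"])
```

Such a ranking could be sent as-is or truncated to the top few variables before being communicated.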

In some non-limiting embodiments or aspects, policy system 524 may receive (e.g., from user device 508, etc.) and/or store a threshold (e.g., for detecting anomalies by detection model 516). Additionally or alternatively, policy system 524 may receive (e.g., from user device 508, etc.) and/or store conditions under which communications described herein below are to be communicated to user device 508.

Referring now to fig. 6, a diagram of an example embodiment 600 of a system for multi-domain integrated learning based on multivariate time series data is shown, according to some non-limiting embodiments or aspects. In some non-limiting embodiments or aspects, the embodiment 600 may be used for unsupervised learning. In some non-limiting embodiments or aspects, the embodiment 600 may be implemented (e.g., fully, partially, etc.) by the ensemble model system 106 (e.g., one or more devices of the ensemble model system 106) and/or the TSEM 506. In some non-limiting embodiments or aspects, the change point score (x) 603 may be the same as or similar to the score 503. In some non-limiting embodiments or aspects, the multivariate sequence (z) 601 may be the same as or similar to the multivariate sequence data 501. The number and arrangement of components (e.g., tensors, matrices, vectors, values, scores, data, and/or operations) of the illustrated embodiment 600 are provided as examples. In some non-limiting embodiments or aspects, embodiment 600 may include additional components, fewer components, different components, or components arranged in a different manner than those shown in fig. 6. Additionally or alternatively, a set of components (e.g., one or more components) of embodiment 600 may perform one or more functions described as being performed by another set of components of embodiment 600.

In some non-limiting embodiments or aspects, the change point score (x) 603 may include a respective score from each respective model (m) (e.g., anomaly detection model, change point model, etc.) for each respective feature (f) at each respective time (t) (e.g., time step). Thus, the change point score (x) 603 may include tensors with three dimensions: time (t), feature (f) and model (m).
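By way of non-limiting illustration only, the following sketch shows how a score tensor with time, feature, and model dimensions might be assembled. The two scoring functions (a one-sided CUSUM statistic and a rolling z-score), the `drift` and `window` parameters, and all names are assumptions chosen for illustration; they stand in for whichever anomaly detection or change point models are actually used.

```python
import numpy as np

def cusum_score(series, drift=0.5):
    """One-sided CUSUM statistic on a standardized series (illustrative)."""
    std = series.std()
    z = (series - series.mean()) / (std if std > 0 else 1.0)
    scores = np.zeros(len(z), dtype=float)
    running = 0.0
    for i, value in enumerate(z):
        running = max(0.0, running + value - drift)  # reset at zero, accumulate upward shifts
        scores[i] = running
    return scores

def zscore_score(series, window=5):
    """Absolute deviation from the preceding window's mean, in units of its std."""
    scores = np.zeros(len(series), dtype=float)
    for i in range(window, len(series)):
        past = series[i - window:i]
        std = past.std()
        scores[i] = abs(series[i] - past.mean()) / (std if std > 0 else 1.0)
    return scores

def build_score_tensor(z, scorers):
    """Stack per-model, per-feature scores into shape (time, feature, model)."""
    n_steps, n_features = z.shape
    x = np.zeros((n_steps, n_features, len(scorers)))
    for m, scorer in enumerate(scorers):
        for f in range(n_features):
            x[:, f, m] = scorer(z[:, f])
    return x

rng = np.random.default_rng(0)
z = rng.normal(size=(40, 3))   # toy multivariate sequence: 40 time steps, 3 features
z[25:, 1] += 4.0               # inject a level shift into feature 1
x = build_score_tensor(z, [cusum_score, zscore_score])
```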

In some non-limiting embodiments or aspects, a time dimension reduction operation (p (t)) 612 may be applied to the change point score (x) 603 to reduce the time dimension to provide a matrix 603a having two dimensions: feature (f) and model (m). For example, the time dimension reduction operation (p (t)) 612 may include an exponential decay function in the time dimension (e.g., time axis) such that the closer the timestamp is to the current time (t), the greater the weight the time step has.
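By way of non-limiting illustration only, an exponential-decay reduction along the time axis might be sketched as follows. The decay rate, the normalization of the weights to sum to one, and the convention that the last index along the time axis is the current time step are assumptions, not details given in the disclosure.

```python
import numpy as np

def reduce_time(x, decay=0.5):
    """Collapse the time axis of a (time, feature, model) tensor.

    Weights decay exponentially with the age of each time step, so the
    most recent step (assumed to be the last index) weighs the most.
    """
    n_steps = x.shape[0]
    age = np.arange(n_steps - 1, -1, -1)      # 0 for the most recent step
    weights = np.exp(-decay * age)
    weights = weights / weights.sum()         # normalize (an illustrative choice)
    return np.einsum("t,tfm->fm", weights, x)  # shape (feature, model)
```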

In some non-limiting embodiments or aspects, a model dimension reduction operation (p (m)) 610 may be applied to matrix 603a to reduce the model dimension, providing vector 603b with one dimension: feature (f). For example, the model dimension reduction operation (p (m)) 610 may include aggregation across each model (e.g., by averaging the scores of each model).
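By way of non-limiting illustration only, aggregation across models might be sketched as follows. Uniform averaging is the example given above; the optional `model_weights` argument is a hypothetical extension added for illustration.

```python
import numpy as np

def reduce_models(scores_fm, model_weights=None):
    """Collapse the model axis of a (feature, model) matrix.

    With no weights supplied, each feature's score is the plain average
    of the scores from all models.
    """
    scores_fm = np.asarray(scores_fm, dtype=float)
    n_models = scores_fm.shape[1]
    if model_weights is None:
        model_weights = np.full(n_models, 1.0 / n_models)
    return scores_fm @ np.asarray(model_weights, dtype=float)  # shape (feature,)
```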

In some non-limiting embodiments or aspects, a feature dimension reduction operation (p (f)) 614 may be applied to vector 603b to reduce the feature dimension to provide feature weights 603c for each feature (f) (e.g., at the current time (t)). For example, the feature dimension reduction operation (p (f)) 614 may include a weighted sum of the change point score (x) 603 (e.g., represented by vector 603 b) for the feature and the original value of the feature (e.g., from the multivariate sequence (z) 601) in a selected number (k) of past time steps.
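By way of non-limiting illustration only, one way to form such a weighted sum is sketched below. The mixing coefficient `alpha` and the choice to summarize the raw values of the last k time steps by their mean absolute value are assumptions; the disclosure does not specify either.

```python
import numpy as np

def feature_weights(score_f, z, k=3, alpha=0.7):
    """Blend per-feature change point scores with recent raw values.

    score_f : per-feature scores after time and model reduction, shape (feature,)
    z       : raw multivariate sequence, shape (time, feature)
    k       : number of past time steps of raw values to use
    alpha   : hypothetical mixing coefficient between scores and raw values
    """
    score_f = np.asarray(score_f, dtype=float)
    z = np.asarray(z, dtype=float)
    recent = np.abs(z[-k:, :]).mean(axis=0)   # summary of the last k raw values
    return alpha * score_f + (1.0 - alpha) * recent
```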

Referring now to fig. 7, a diagram of an example embodiment 700 of a system for multi-domain integrated learning based on multivariate time series data is shown, according to some non-limiting embodiments or aspects. As shown in fig. 7, an embodiment 700 may include a multivariate sequence (z) 701, a change point score (x) 703, a change point algorithm 704, a joint input (P) 705, a TSEM 706, an anomaly score 707, a model domain integration model 710, a time domain integration model 712, a feature domain integration model 714, a combined integration model 716a, a detection model 716b, an alert communication 718, and/or a feature output 719. In some non-limiting embodiments or aspects, the embodiment 700 may be used for semi-supervised learning. In some non-limiting embodiments or aspects, embodiment 700 may be utilized in addition to embodiment 600 (e.g., after embodiment 600, such that semi-supervised learning follows unsupervised learning). In some non-limiting embodiments or aspects, embodiment 700 may be used independently of (e.g., in lieu of) embodiment 600. In some non-limiting embodiments or aspects, the embodiment 700 may be implemented (e.g., fully, partially, etc.) by the ensemble model system 106 (e.g., one or more devices of the ensemble model system 106) and/or the TSEM 506. In some non-limiting embodiments or aspects, the multivariate sequence (z) 701 may be the same as or similar to the multivariate sequence data 501 and/or the multivariate sequence (z) 601. In some non-limiting embodiments or aspects, the change point score (x) 703 may be the same as or similar to the score 503 and/or the change point score (x) 603. In some non-limiting embodiments or aspects, the change point algorithm 704 may be the same as or similar to the anomaly detection model 504. In some non-limiting embodiments or aspects, TSEM 706 may be the same as or similar to TSEM 506.
In some non-limiting embodiments or aspects, model domain integration model 710 may be the same as or similar to model domain integration model 510. In some non-limiting embodiments or aspects, time domain integration model 712 may be the same as or similar to time domain integration model 512. In some non-limiting embodiments or aspects, the feature domain integration model 714 may be the same as or similar to the feature domain integration model 514. In some non-limiting embodiments or aspects, the combined integration model 716a and/or the detection model 716b may be the same as or similar to the detection model 516. In some non-limiting embodiments or aspects, the alert communication 718 may be the same as, similar to, and/or implemented by (e.g., communicated by) the alert system 518. In some non-limiting embodiments or aspects, the feature output 719 may be the same as, similar to, and/or implemented by (e.g., determined by and/or communicated by) the feature contributor system 520 and/or the model interpretation system 522. The number and arrangement of components of the illustrated embodiment 700 are provided as examples. In some non-limiting embodiments or aspects, embodiment 700 may include additional components, fewer components, different components, or components arranged in a different manner than those shown in fig. 7. Additionally or alternatively, a set of components (e.g., one or more components) of embodiment 700 may perform one or more functions described as being performed by another set of components of embodiment 700.

In some non-limiting embodiments or aspects, a multivariate sequence (z) 701 may be received, as described herein. For example, the change point algorithm 704 and/or the TSEM 706 may receive the multivariate sequence (z) 701. In some non-limiting embodiments or aspects, the multivariate sequence (z) 701 can comprise a plurality of vectors, and/or each respective vector of the plurality of vectors can comprise an element (e.g., a value, etc.) based on a time sequence (e.g., a time step in a time (t) dimension) of a respective feature (f) (e.g., a variable) of the plurality of features, as described herein.

In some non-limiting embodiments or aspects, at least a portion of the multivariate sequence (z) 701 may be input into each respective one of the m change point algorithms 704 to generate a change point score (x) 703 (e.g., comprising a respective change point score for each respective anomaly detection model), as described herein. For example, the change point algorithm 704 may receive as input a multivariate sequence (z) 701. In some non-limiting embodiments or aspects, the change point score (x) 703 may include a tensor having three dimensions: time (t), feature (f) and model (m).

In some non-limiting embodiments or aspects, the multivariate sequence (z) 701 can be combined (e.g., by TSEM 706, etc.) with the change point score (x) 703 to generate a joint input (P) 705 (e.g., combined intermediate data), as described herein. For example, the joint input (P) 705 may be determined based on the following equation:

P = XZ^T

where X is the set (e.g., tensor) of all change point scores (x) 703, Z is the set (e.g., matrix) of all multivariate sequences (z) 701, and the superscript T denotes the transpose operation.

In some non-limiting embodiments or aspects, the joint input (P) 705 may be input into the combined integration model 716a to generate an anomaly score (y) 707 (e.g., an output score), as described herein. In some non-limiting embodiments or aspects, the combined integration model 716a may be based on the model domain integration model 710, the time domain integration model 712, and the feature domain integration model 714. For example, the combined integration model 716a may be generated based on the following equation:

H = MTF,

where H is a combined integration model (e.g., a parameter set thereof), M is a model domain integration model (e.g., a set of model weights for each of the m models), T is a time domain integration model (e.g., a set of time weights for each of the k time steps), and F is a feature domain integration model (e.g., a set of feature weights for each of a plurality of features).
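By way of non-limiting illustration only, one possible reading of the product H = MTF is a CP-style (sum of outer products) combination of three factor matrices, which would be consistent with the Khatri-Rao products and mode-wise tensor expansions recited in the claims. The factor shapes, the shared rank dimension, and the function name `combine` are assumptions; the disclosure does not state the exact form of the product.

```python
import numpy as np

def combine(M, T, F):
    """Build H from three factor matrices that share a rank dimension r.

    M : (n_models,   r) model-domain weights
    T : (n_steps,    r) time-domain weights
    F : (n_features, r) feature-domain weights
    Returns H with H[m, k, f] = sum_r M[m, r] * T[k, r] * F[f, r].
    """
    return np.einsum("mr,kr,fr->mkf", M, T, F)

# Rank-1 toy example: 2 models, 2 time steps, 3 features.
M = np.array([[1.0], [2.0]])
T = np.array([[1.0], [10.0]])
F = np.array([[1.0], [2.0], [3.0]])
H = combine(M, T, F)
```

Under this reading, each entry of H is the weight assigned to one (model, time step, feature) combination.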

In some non-limiting embodiments or aspects, the detection model 716b may determine whether the anomaly score (y) 707 meets a threshold λ. For example, if the anomaly score (y) 707 is greater than a threshold λ, the threshold λ may be met and, therefore, the detection model 716b may detect an anomaly.
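By way of non-limiting illustration only, the threshold comparison might be sketched as follows. The function name `detect_anomalies` is hypothetical, and the strict "greater than" comparison follows the example above.

```python
import numpy as np

def detect_anomalies(scores, threshold):
    """Flag records whose anomaly score is strictly greater than the threshold.

    Returns a boolean mask and the indices of the flagged records.
    """
    flags = np.asarray(scores, dtype=float) > threshold
    return flags, np.flatnonzero(flags)

flags, flagged_indices = detect_anomalies([0.2, 0.9, 0.5, 0.7], threshold=0.5)
```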

In some non-limiting embodiments or aspects, at least one action may be taken (e.g., by TSEM 706, etc.) in response to determining that the output score meets a threshold. For example, alert communication 718 (e.g., from TSEM 706 to a user device, etc.) may be communicated as described herein. Additionally or alternatively, the multivariate sequence (z) 701 may be input into the feature domain integration model 714 to generate a feature output (L) 719 (e.g., a feature importance vector comprising a feature importance score for each of a plurality of features), as described herein. Additionally or alternatively, parameters (e.g., H, M, T and/or F) of at least one of combined integration model 716a, model domain integration model 710, time domain integration model 712, and/or feature domain integration model 714 may be updated (e.g., by TSEM 706, etc.). For example, the parameters of the foregoing models may be updated based on the following algorithm:

Algorithm 1:

In some non-limiting embodiments or aspects, as shown in Algorithm 1, at line 1, the name of the procedure may be TSEMParameterInformance. As shown in Algorithm 1, at lines 2 and 11, comments may be added, for example, to indicate a first portion of the algorithm (e.g., step 1, which may include iteratively updating the parameters H of combined integration model 716a) and a second portion of the algorithm (e.g., step 2, which may include updating the model weights M of model domain integration model 710, the time weights T of time domain integration model 712, and/or the feature weights F of feature domain integration model 714).

In some non-limiting embodiments or aspects, as shown in Algorithm 1, at line 3, the parameters H of the combined integration model 716a may be initialized. As shown in Algorithm 1, at line 4, the relevant portion of the joint input (P) 705 may be determined. As shown in Algorithm 1, at lines 5-8, for each time step t that is less than the maximum time step maxT: a first temporary variable a^(t) may be determined based on the joint input (P) 705 and the current time-step version H^(t) of the parameters of the combined integration model 716a; a second temporary variable b^(t) may be determined based on a label (e.g., a true label indicating whether a record of the multivariate sequence (z) 701 is anomalous) and the first temporary variable a^(t); the feature weights f^(t) may be back-propagated based on the transpose of the joint input (P) 705 and the second temporary variable b^(t); and the next time-step version H^(t+1) of the parameters of the combined integration model 716a may be determined based on the feature weights f^(t), the current time-step version H^(t) of the parameters of the combined integration model 716a, and the change point score (X) 703.
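By way of non-limiting illustration only, the sequence of steps above (a prediction from P and H, a ratio of the label to that prediction, a back-propagated term from the transpose of P, and a multiplicative refresh of H) resembles the well-known multiplicative update for minimizing a generalized Kullback-Leibler divergence. The sketch below swaps in that standard update as an assumption; it is not confirmed to be Algorithm 1. It treats P as a non-negative matrix of shape (records, parameters) and H as a flat parameter vector, and it omits the dependence on the change point score (X) 703 mentioned above.

```python
import numpy as np

def multiplicative_update(P, y, n_iter=50, eps=1e-12):
    """Iteratively refine a non-negative parameter vector h so that P @ h tracks y."""
    h = np.ones(P.shape[1])              # initialization (illustrative choice)
    col_sums = P.sum(axis=0) + eps       # normalizer of the multiplicative rule
    for _ in range(n_iter):
        a = P @ h + eps                  # first temporary variable: current prediction
        b = y / a                        # second temporary variable: label / prediction
        back = P.T @ b                   # back-propagated term from the transpose of P
        h = h * back / col_sums          # next version of the parameters
    return h

def generalized_kl(y, a):
    """Generalized KL divergence D(y || a), with 0 * log(0) treated as 0."""
    mask = y > 0
    return float(np.sum(y[mask] * np.log(y[mask] / a[mask])) - y.sum() + a.sum())
```

Under these assumptions the divergence between the labels and the predictions does not increase from one iteration to the next, which is the property that makes this kind of update attractive when all quantities are non-negative.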

In some non-limiting embodiments or aspects, as shown in Algorithm 1, at line 12, the time weights T of the time domain integration model 712 and the feature weights F of the feature domain integration model 714 may be initialized. As shown in Algorithm 1, at lines 13-19, for each time step t that is less than the maximum time step maxT: a third temporary variable c^(t) may be determined based on the current time-step version T^(t) of the time weights of the time domain integration model 712 and the current time-step version F^(t) of the feature weights of the feature domain integration model 714; the current time-step version M^(t) of the model weights of the model domain integration model 710 may be determined based on the third temporary variable c^(t), the model weights M of the model domain integration model 710, and the tensor expansion in mode 0 of the parameters H of the combined integration model 716a; a temporary variable (e.g., a fourth temporary variable, the third temporary variable reused, etc.) may be determined based on the current time-step version M^(t) of the model weights of the model domain integration model 710 and the current time-step version F^(t) of the feature weights of the feature domain integration model 714; an updated current time-step version T^(t) of the time weights of the time domain integration model 712 may be determined based on the (fourth) temporary variable, the time weights T of the time domain integration model 712, and the tensor expansion in mode 1 of the parameters H of the combined integration model 716a; a temporary variable (e.g., a fifth temporary variable, the third temporary variable reused, etc.) may be determined based on the current time-step version M^(t) of the model weights of the model domain integration model 710 and the updated current time-step version T^(t) of the time weights of the time domain integration model 712; and an updated current time-step version F^(t) of the feature weights of the feature domain integration model 714 may be determined based on the (fifth) temporary variable, the feature weights F of the feature domain integration model 714, and the tensor expansion in mode 2 of the parameters H of the combined integration model 716a.
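By way of non-limiting illustration only, the alternating pattern above (a Khatri-Rao product of two factors, followed by an update of the third factor from a mode-wise expansion, or unfolding, of H) has the same structure as a standard CP alternating least squares sweep. The sketch below implements that standard technique as a stand-in; the least-squares update rule, the factor shapes, and all function names are assumptions, and the update actually used in Algorithm 1 may differ.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n expansion: move axis `mode` to the front and flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def khatri_rao(A, B):
    """Column-wise Kronecker product, shape (rows_A * rows_B, rank)."""
    rank = A.shape[1]
    return np.einsum("ir,jr->ijr", A, B).reshape(-1, rank)

def update_factor(unfolded, A, B):
    """Least-squares update of one factor given the other two (A before B)."""
    kr = khatri_rao(A, B)                 # temporary variable
    gram = (A.T @ A) * (B.T @ B)          # equals kr.T @ kr
    return unfolded @ kr @ np.linalg.pinv(gram)

def als_sweep(H, M, T, F):
    """One sweep over the model, time, and feature factors of H (model, time, feature)."""
    M = update_factor(unfold(H, 0), T, F)   # third temporary variable: T and F
    T = update_factor(unfold(H, 1), M, F)   # fourth: updated M and F
    F = update_factor(unfold(H, 2), M, T)   # fifth: updated M and updated T
    return M, T, F
```

In this sketch, the order of the two arguments to `khatri_rao` must match the row-major flattening used by `unfold`; with a different unfolding convention the arguments would be swapped.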

In some non-limiting embodiments, TSEM for unsupervised learning (TSEM-un) and/or TSEM for semi-supervised learning (TSEM-semi), as described herein, may perform better than individual change point algorithms (e.g., Bayesian Change Point Detection (BCPD), ChangeFinder (CF), KLIEP, CUSUM) and/or integrated models that combine these change point algorithms without applying the techniques described herein. For example, Table 3 shows the performance of each of these techniques on the Server Machine Dataset (SMD) in terms of recall, precision, F-1 score, and AUC (Su et al., "Robust Anomaly Detection for Multivariate Time Series through Stochastic Recurrent Neural Network," KDD '19: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (2019), the disclosure of which is incorporated herein by reference in its entirety):

Technique	Recall (%)	Precision (%)	F-1 score (%)	AUC (%)
BCPD	35.32	0.21	0.42	62.53
CF	19.52	0.31	0.61	57.37
KLIEP	11.93	0.22	0.43	54.25
CUSUM	21.78	0.14	0.28	56.02
Integrated model	35.55	0.25	0.49	62.43
TSEM-un	53.61	0.65	1.28	73.34
TSEM-semi	84.06	0.79	1.56	87.22

Table 3

Although embodiments have been described in detail for the purpose of illustration, it is to be understood that such detail is solely for that purpose and that the disclosure is not limited to the disclosed embodiments or aspects, but, on the contrary, is intended to cover modifications and equivalent arrangements that are within the spirit and scope of the appended claims. For example, it is to be understood that the present disclosure contemplates that, to the extent possible, one or more features of any embodiment or aspect can be combined with one or more features of any other embodiment or aspect.

Claims

1. A computer-implemented method, comprising:

receiving, with at least one processor, multivariate sequence data comprising a plurality of vectors, each respective vector of the plurality of vectors comprising elements based on a time series of respective variables of the plurality of variables;

inputting, with at least one processor, at least a portion of the multivariate sequence data into each respective anomaly detection model of a plurality of anomaly detection models to generate a plurality of scores comprising a respective score for each respective anomaly detection model;

combining, with at least one processor, the multivariate sequence data with the plurality of scores to generate combined intermediate data;

inputting, with at least one processor, the combined intermediate data into a combined integration model to generate an output score, the combined integration model based on a model domain integration model, a time domain integration model, and a feature domain integration model;

determining, with at least one processor, that the output score meets a threshold; and

in response to determining that the output score meets the threshold, at least one of:

communicating, with at least one processor, an alert to a user device;

inputting, with at least one processor, the multivariate sequence data into the feature domain integration model to generate a feature importance vector comprising a feature importance score for each of the plurality of variables; or

updating, with at least one processor, parameters of at least one of the combined integration model, the model domain integration model, the time domain integration model, or the feature domain integration model.

2. The method of claim 1, wherein each anomaly detection model of the plurality of anomaly detection models comprises at least one of a classifier model or a score generation model.

3. The method of claim 1, wherein each anomaly detection model of the plurality of anomaly detection models comprises at least one of a Bayesian model, a Kullback-Leibler importance estimation procedure (KLIEP) model, a ChangeFinder model, or a cumulative sum (CUSUM) model.

4. The method of claim 1, further comprising:

determining, with at least one processor, whether to tag or communicate the at least a portion of the multivariate sequence data based on the output score; and

in response to determining whether to tag or communicate the at least a portion of the multivariate sequence data, one of:

tagging, with at least one processor, the at least a portion of the multivariate sequence data based on the output score; or

communicating, with at least one processor, the at least a portion of the multivariate sequence data and the output score to the user device.

5. The method of claim 1, wherein updating the parameters of at least one of the combined integration model, the model domain integration model, the time domain integration model, or the feature domain integration model comprises:

initializing the combined integrated model with at least one processor;

for each time step less than the maximum time step:

determining, with at least one processor, a first temporary variable based on the combined intermediate data and a current time step version of the combined integrated model;

determining, with at least one processor, a second temporary variable based on a ratio of a label to the first temporary variable;

back-propagating, with at least one processor, the feature domain integration model based on the transpose of the combined intermediate data and the second temporary variable; and

determining, with at least one processor, a next time-step version of the combined integrated model based on a back-propagation of the feature domain integrated model and the current time-step version of the combined integrated model.

6. The method of claim 5, wherein updating the parameters of at least one of the combined integrated model, the model domain integrated model, the time domain integrated model, or the feature domain integrated model further comprises:

for each time step less than the maximum time step:

determining, with the at least one processor, a third temporary variable from a first Khatri-Rao product based on a current time-step version of the time-domain integration model and a current time-step version of the feature-domain integration model;

determining, with at least one processor, a current time-step version of the model domain integration model based on tensor expansion in mode 0 of the third temporary variable, the model domain integration model, and the combined integration model;

determining, with at least one processor, a fourth temporary variable from a second Khatri-Rao product based on the current time-step version of the model domain integration model and the current time-step version of the feature domain integration model;

determining, with at least one processor, an updated current time-step version of the time-domain integration model based on tensor expansion in mode 1 of the fourth temporary variable, the time-domain integration model, and the combined integration model;

determining, with at least one processor, a fifth temporary variable from a third Khatri-Rao product based on the current time-step version of the model domain integration model and an updated current time-step version of the time domain integration model; and

determining, with at least one processor, an updated current time-step version of the feature domain integration model based on the fifth temporary variable, the feature domain integration model, and a tensor expansion in mode 2 of the combined integration model.

7. The method of claim 1, wherein a loss function of the combined integration model is based on the model domain integration model, the time domain integration model, and the feature domain integration model.

8. The method of claim 7, wherein the loss function of the combined integration model is based on the following equation:

9. A system, comprising:

at least one processor; and

at least one non-transitory computer-readable medium comprising one or more instructions that, when executed by at least one processor, direct the at least one processor to:

receiving multivariate sequence data comprising a plurality of vectors, each respective vector of the plurality of vectors comprising elements based on a time series of respective variables of the plurality of variables;

inputting at least a portion of the multivariate sequence data into each respective anomaly detection model of a plurality of anomaly detection models to generate a plurality of scores comprising a respective score for each respective anomaly detection model;

combining the multivariate sequence data with the plurality of scores to generate combined intermediate data;

inputting the combined intermediate data into a combined integration model to generate an output score, the combined integration model being based on a model domain integration model, a time domain integration model, and a feature domain integration model;

determining that the output score meets a threshold; and

communicating an alert to a user device;

inputting the multivariate sequence data into the feature domain integration model to generate a feature importance vector comprising a feature importance score for each of the plurality of variables; or

updating parameters of at least one of the combined integration model, the model domain integration model, the time domain integration model, or the feature domain integration model.

10. The system of claim 9, wherein each anomaly detection model of the plurality of anomaly detection models comprises at least one of a classifier model or a score generation model.

11. The system of claim 9, wherein each anomaly detection model of the plurality of anomaly detection models comprises at least one of a Bayesian model, a Kullback-Leibler importance estimation procedure (KLIEP) model, a ChangeFinder model, or a cumulative sum (CUSUM) model.

12. The system of claim 9, wherein the one or more instructions further direct the at least one processor to:

determining whether to tag or communicate the at least a portion of the multivariate sequence data based on the output score; and

tagging the at least a portion of the multivariate sequence data based on the output score; or

communicating the at least a portion of the multivariate sequence data and the output score to the user device.

13. The system of claim 9, wherein the one or more instructions further direct the at least one processor when updating the parameters of at least one of the combined integrated model, the model domain integrated model, the time domain integrated model, or the feature domain integrated model:

initializing the combined integrated model;

for each time step less than the maximum time step:

determining a first temporary variable based on the combined intermediate data and a current time step version of the combined integrated model;

determining a second temporary variable based on a ratio of a label to the first temporary variable;

back-propagating the feature domain integration model based on the transpose of the combined intermediate data and the second temporary variable; and

determining a next time-step version of the combined integrated model based on a back-propagation of the feature domain integrated model and the current time-step version of the combined integrated model.

14. The system of claim 13, wherein the one or more instructions further direct the at least one processor when updating the parameters of at least one of the combined integrated model, the model domain integrated model, the time domain integrated model, or the feature domain integrated model:

for each time step less than the maximum time step:

determining a third temporary variable according to a first Khatri-Rao product based on the current time-step version of the time-domain integration model and the current time-step version of the feature-domain integration model;

determining a current time-step version of the model domain integration model based on the third temporary variable, the model domain integration model, and a tensor expansion in mode 0 of the combined integration model;

determining a fourth temporary variable from a second Khatri-Rao product based on the current time-step version of the model domain integration model and the current time-step version of the feature domain integration model;

determining an updated current time step version of the time domain integration model based on tensor expansion in mode 1 of the fourth temporary variable, the time domain integration model, and the combined integration model;

determining a fifth temporary variable from a third Khatri-Rao product based on the current time-step version of the model domain integration model and an updated current time-step version of the time-domain integration model; and

determining an updated current time-step version of the feature domain integration model based on the fifth temporary variable, the feature domain integration model, and a tensor expansion in mode 2 of the combined integration model.

15. The system of claim 9, wherein a loss function of the combined integration model is based on the model domain integration model, the time domain integration model, and the feature domain integration model.

16. The system of claim 15, wherein the loss function of the combined integration model is based on the following equation:

17. A computer program product comprising at least one non-transitory computer-readable medium containing one or more instructions that, when executed by at least one processor, cause the at least one processor to:

determining that the output score meets a threshold; and

communicating an alert to a user device;

18. The computer program product of claim 17, wherein the one or more instructions further cause the at least one processor to:

19. The computer program product of claim 17, wherein the one or more instructions, when updating the parameters of at least one of the combined integrated model, the model domain integrated model, the time domain integrated model, or the feature domain integrated model, further cause the at least one processor to:

initializing the combined integrated model;

for each time step less than the maximum time step:

20. The computer program product of claim 19, wherein the one or more instructions, when updating the parameters of at least one of the combined integrated model, the model domain integrated model, the time domain integrated model, or the feature domain integrated model, further cause the at least one processor to:

for each time step less than the maximum time step: