WO2018194707A1 - System and computer-implemented method for generating synthetic production data for use in testing and modeling

Publication number
WO2018194707A1
Authority
WO
WIPO (PCT)
Prior art keywords
production data
system
transaction
data
behaviors
Prior art date
Application number
PCT/US2017/055120
Other languages
French (fr)
Inventor
Eric James Gieseke
Kenneth John CHENIS
Original Assignee
Aci Worldwide Corp.
Priority date
Filing date
Publication date
Priority to US201762487741P
Priority to US62/487,741
Application filed by Aci Worldwide Corp.
Publication of WO2018194707A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/57 Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/30 Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F21/31 User authentication
    • G06F21/316 User authentication by observing the pattern of computer usage, e.g. typical user behaviour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computer systems using knowledge-based models
    • G06N5/04 Inference methods or devices
    • G06N5/045 Explanation of inference steps
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06Q DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q20/00 Payment architectures, schemes or protocols
    • G06Q20/38 Payment protocols; Details thereof
    • G06Q20/40 Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check credit lines or negative lists
    • G06Q20/401 Transaction verification
    • G06Q20/4016 Transaction verification involving fraud or risk level assessment in transaction processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2221/00 Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/21 Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/2141 Access rights, e.g. capability lists, access control lists, access tables, access matrices

Abstract

A system and computer-implemented method for modeling transactional behaviors in a financial or other transaction system to create and use realistic synthetic production data without the need to seek permissions or risk data loss. An artificial intelligence system is trained on real production data to create component models that represent relevant normal and abnormal behaviors, such as non-fraudulent and fraudulent activities, of a transaction stream. A consolidated model combines the component models and represents data states that include both the relevant normal and abnormal behaviors in a relative time base. A transaction generator is used in combination with the consolidated model to create the synthesized production data that realistically mimics the real production data, but which is not coupled to the real production data, and from which the real production data is not derivable. The synthetic production data may then be used as input for model systems.

Description

SYSTEM AND COMPUTER-IMPLEMENTED METHOD FOR GENERATING SYNTHETIC PRODUCTION DATA FOR USE IN TESTING AND MODELING

FIELD

[0001] The present invention relates to systems and methods for creating synthesized data, and more particularly, embodiments concern a system and computer-implemented method for creating and using realistic synthetic data for testing, modeling, and other applications without the need to seek permissions or risk data loss.

BACKGROUND

[0002] Financial institutions desire to protect their customers from fraud, protect themselves from losses due to financial crimes, and comply with national and international regulations and mandates. However, financial institutions face increasing challenges with detecting and preventing fraud and other undesirable activities. Financial crime management solutions have been developed to detect and manage unusual activity based upon known patterns in accounts or transaction activities within institutions. These solutions help to shield customers from fraud, minimize losses and improve efficiency, and comply with government regulations.

[0003] In order to develop and test these solutions, test data is needed to accurately model the behaviors of financial transaction systems. Two different approaches have been used to obtain this data. In the first approach, the data is generated in a quasi-random fashion, and in the second approach, the data is collected from "real" production data and then scrubbed to mask identifying information. The first approach yields data that is sufficient for performance testing systems, and even for some functional testing, but many specialized systems require "real" data to fully exercise the capabilities of the systems. In particular, there are patterns and nuances to real production data that are not readily simulated by standard tools for generating test data, which is why the second approach is often used to achieve more "life-like" test data.

[0004] The challenge with the second approach is that regulations on using real data are becoming increasingly restrictive. Currently, that data, even when scrubbed, is the property of the customer who originated it. Obtaining permission to use the data continues to be challenging, even when scrubbing or other protection mechanisms are applied. Further, the protection of the data itself is coming into question even with the best scrubbing methods. Recent research has shown that with very little outside data added, scrubbed data can reveal sensitive information simply because of the patterns exhibited in the data itself. Because these patterns are the very details for which the real data is needed, it is not feasible to remove or alter them in order to protect the real data.

[0005] This background discussion is intended to provide information related to the present invention which is not necessarily prior art.

SUMMARY

[0006] Embodiments of the present invention solve the above-described and other problems and limitations by providing a system and computer-implemented method for creating and using realistic synthetic data for testing, modeling, and other uses without the need to seek permissions or risk data loss.

[0007] In a first embodiment of the present invention, a system may be provided for modeling behaviors in a transaction system and creating and using realistic synthetic production data, and may broadly comprise an artificial intelligence system, one or more component models, a consolidated model, and a transaction generator. The artificial intelligence system may be trained on real production data from the transaction system to create the component models that represent relevant normal behaviors and relevant abnormal behaviors of a transaction stream. The consolidated model may combine the component models and represent data states that include both the relevant normal behaviors and the relevant abnormal behaviors in a relative time base. The transaction generator may be used in combination with the consolidated model to create the synthesized production data that realistically mimics the relevant normal behaviors and the relevant abnormal behaviors of the real production data, wherein the synthesized production data is not coupled to the real production data, and the real production data is not derivable from the synthesized production data.

[0008] Various implementations of the first embodiment may include any one or more of the following additional features. The artificial intelligence system may be a machine learning system. The transaction stream may be a financial transaction stream. The consolidated model may represent data states that include both non-fraudulent activity and fraudulent activities. The component models may be at least periodically modified to better represent patterns of activity in the transaction stream and the real production data. The component models may create new patterns of activity not exhibited in the transaction stream and the real production data. The transaction generator may adhere to certain patterns of activity while changing a volume of transactions. The system may further include one or more models of a financial transaction system which are tested with the synthesized production data.

[0009] In a second embodiment of the present invention, a computer-implemented method may be provided for improving the functionality of a computer for modeling behaviors in a transaction system to create and use realistic synthetic production data. The computer-implemented method may broadly comprise the following steps. An artificial intelligence system may be trained on real production data from the transaction system to create one or more component models that represent relevant normal behaviors and relevant abnormal behaviors of a transaction stream. The component models may be combined in a consolidated model that represents data states that include both the relevant normal behaviors and the relevant abnormal behaviors in a relative time base. The consolidated model may be combined with a transaction generator to create synthesized production data that realistically mimics the relevant normal behaviors and the relevant abnormal behaviors of the real production data, wherein the synthesized production data is not coupled to the real production data, and the real production data is not derivable from the synthesized production data. The synthetic production data may be used in a model of a system.

[0010] Various implementations of the second embodiment may include any one or more of the following additional features. The artificial intelligence system may be a machine learning system. The transaction stream may be a financial transaction stream. The consolidated model may represent data states that include both non-fraudulent activity and fraudulent activities in the relative time base. The component models may be at least periodically modified to better represent patterns of activity in the transaction stream and the real production data. The component models may create new patterns of activity not exhibited in the transaction stream and the real production data. The transaction generator may adhere to certain patterns of activity while changing a volume of transactions. The model of the system may be a fraud detection model.

[0011] This summary is not intended to identify essential features of the present invention, and is not intended to be used to limit the scope of the claims. These and other aspects of the present invention are described below in greater detail.

DRAWINGS

[0012] Embodiments of the present invention are described in detail below with reference to the attached drawing figures, wherein:

[0013] FIG. 1 is a high-level block diagram of an embodiment of a system for creating and using realistic synthetic production data; and

[0014] FIG. 2 is a high-level flowchart of steps in a computer-implemented method for creating and using realistic synthetic production data.

[0015] The figures are not intended to limit the present invention to the specific embodiments they depict. The drawings are not necessarily to scale.

DETAILED DESCRIPTION

[0016] The following detailed description of embodiments of the invention references the accompanying figures. The embodiments are intended to describe aspects of the invention in sufficient detail to enable those with ordinary skill in the art to practice the invention. Other embodiments may be utilized and changes may be made without departing from the scope of the claims. The following description is, therefore, not limiting. The scope of the present invention is defined only by the appended claims, along with the full scope of equivalents to which such claims are entitled.

[0017] In this description, references to "one embodiment", "an embodiment", or "embodiments" mean that the feature or features referred to are included in at least one embodiment of the invention. Separate references to "one embodiment", "an embodiment", or "embodiments" in this description do not necessarily refer to the same embodiment and are not mutually exclusive unless so stated. Specifically, a feature, structure, act, etc. described in one embodiment may also be included in other embodiments, but is not necessarily included. Thus, particular implementations of the present invention can include a variety of combinations and/or integrations of the embodiments described herein.

[0018] Broadly characterized, the present invention provides a system and computer-implemented method for creating and using realistic synthetic data for testing, modeling, and other uses without the need to seek permissions or risk data loss. More particularly, embodiments may allow for creating realistic, or "life-like," synthetic production data for internal or external use in the development and testing of models, such as financial transaction models and fraud detection models, without requiring permissions or risking data leakage. The created data may accurately model real production data and yet be owned by the creator, which makes the data eligible for sharing or exploratory offerings without violating tenant data agreements. Further, the created data can be shared with clients or other third parties, or they may be given access to it to show them "trend" data and allow them to explore it, without actually giving the clients or other third parties access to real production data. The present technology may be used to create data sets of any size with any characteristics desired, including normal and abnormal behaviors and/or other nuances present in actual transactions.

[0019] As used herein, "artificial intelligence" may be broadly defined as non-human decision-making processes, and may include deep learning, representation learning, automated reasoning, and/or machine learning, or any combination thereof, and/or substantially any other suitable form or characterization of artificial intelligence. In one embodiment, artificial intelligence may examine real production data generated by actual transactions occurring within a transaction system, and generate a model and/or a series of models that accurately represent the subtle patterns and nuances present in the actual transactions. This may be accomplished by examining the behaviors of the originators of the transactions and modeling those behaviors. The behavior models may be combined with other data pattern models to accurately represent the normal behaviors and the abnormal behaviors that create the subtle patterns and nuances present in real production data, such as fraudulent activity. The machine learning system may then create and exercise a set of features and a combination of models until it can accurately represent the identified patterns and nuances.

[0020] The system may then use the input models in a combinatory fashion to feed a transaction generation subsystem. The subsystem may create individual transactions that follow a combination of the normal and abnormal behaviors identified by the machine learning system and represented in the models. Further, these representative models may be tuned or adjusted to create new patterns of behavior to explore how the corresponding transactions would be generated. The transaction generation subsystem may use a virtual time base defined by the models to create a set of synthetic transaction data or other data that appears to have been created over the same time range as the modeled set of data. The resulting set of synthetic data may look and behave very similar to a set of "real" transaction data, exhibiting the same subtleties important to the system being tested, but may have no direct connection to the real data.

[0021] Unlike simple generated data, the synthesized data may exhibit the same characteristics as real data, including characteristic buying patterns, geo distributions, merchant code amounts, and even fraudulent activity, which are relevant to the models or other systems being tested. Unlike scrubbed real data, the synthesized data has no direct connection to real data and should not be subject to any protection clauses and should not be considered sensitive data under any regulations. It should be very difficult or impossible to reverse engineer or decode the synthesized data to expose any real data, thereby minimizing or avoiding any possibility of data loss or liability risk.
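By way of a non-limiting illustration only, the virtual time base described above might be sketched as follows. The hourly weights, the function name synth_timestamps, and the 30-day window are hypothetical values chosen for the sketch and do not appear in the specification. Timestamps are drawn as offsets relative to an arbitrary start, so the output follows a learned activity profile without reusing any real timestamp.

```python
import random
from datetime import datetime, timedelta

# Hypothetical relative time base: activity weight for each hour of the day,
# as a component model might learn it. Values are invented for illustration.
HOURLY_WEIGHTS = [1, 1, 1, 1, 1, 2, 4, 6, 8, 9, 10, 12,
                  12, 10, 9, 9, 10, 11, 9, 7, 5, 3, 2, 1]


def synth_timestamps(n, start, days, rng):
    """Draw n timestamps inside a window of `days` days beginning at `start`.

    Each timestamp is an offset from `start`, so the window can be placed
    anywhere in time while keeping the same hourly activity profile.
    """
    stamps = []
    for _ in range(n):
        day = rng.randrange(days)                     # uniform over days
        hour = rng.choices(range(24), weights=HOURLY_WEIGHTS, k=1)[0]
        second = rng.randrange(3600)                  # uniform within the hour
        stamps.append(start + timedelta(days=day, hours=hour, seconds=second))
    return sorted(stamps)


start = datetime(2017, 1, 1)
stamps = synth_timestamps(1000, start, days=30, rng=random.Random(7))
```

A day-of-week or seasonal profile could be layered on in the same way; the uniform choice of day here is a simplification.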

[0022] For the purpose of illustration, the present technology may be herein described with reference to financial transactions, but it should be understood that the present technology has application in a large variety of different contexts involving sensitive data and/or privacy concerns, such as medical data or genetic data.

[0023] Referring to FIG. 1, an embodiment of a system 10 is shown for creating and using realistic synthetic data for testing, modeling, and other uses without the need to seek permissions or risk data loss. The system 10 and an exemplary operating environment may broadly comprise a transaction system 12 supporting a transaction stream 14 generating real production data; a production processing system 16; a real data database 18; an artificial intelligence system 20; one or more component models 22; a consolidated model 24; a transaction generator 26; a synthetic data database 28; and one or more model systems 30. These components may be stored on, implemented on, or otherwise facilitated by one or more electronic memory elements and one or more electronic processing elements, with data communications occurring over one or more communications networks. Access to the real production data, including components involved in the generation and storage of the real production data, may be relatively more restricted. Access to the synthetic production data, including components involved in the generation and storage of the synthetic production data, may be unrestricted or relatively less restricted.

[0024] The transaction system 12, transaction stream 14, and real production data may be substantially any suitable system, stream, and data, such as, for example, a financial transaction system, stream, and data. In one implementation, the transaction system 12 may facilitate a variety of financial transactions between buyers and sellers, and the transaction stream 14 may comprise these transactions. In order to facilitate the transactions, the transaction system 12 may send and/or receive information related to a buyer and/or a seller for each transaction. The transaction system 12 may facilitate each transaction by accepting payment by substantially any suitable payment method, such as credit cards, debit cards, and/or bank transfers. Some or all of the aforementioned transaction-related information may be included in the transaction stream 14. The production processing system 16 may be configured to process the real production data as desired or needed. The real data database 18 may be configured to store the real production data on one or more electronic memory elements. In one implementation, the real data database 18 may contain raw and/or processed information relevant to the real financial transactions that were facilitated by the transaction system 12. For example, this information may include dates and times of transactions, amounts of money exchanged in the transactions, and details about the parties involved in the transactions.

[0025] The artificial intelligence system 20, which may be a machine learning system, may be configured to examine the real production data and generate the one or more component models 22 that accurately represent the subtle patterns and nuances present in the real production data. The consolidated model 24 may be configured to combine the one or more component models 22, and provide input to or otherwise guide operation of the transaction generator 26. The transaction generator 26 may be configured to create individual transactions that follow the behaviors identified by the artificial intelligence system 20 and represented in the component models 22 and therefore also in the consolidated model 24. The transaction generator 26 may use a virtual time base defined by the consolidated model 24 to create synthetic production data that appears to have been created over the same time range as the modeled real production data. The synthesized data database 28 may store the synthesized production data until needed. In one application, the synthesized data may be used as input for testing the one or more model systems 30.
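As a minimal sketch only, the data flow from component models through a consolidated model to a transaction generator might look like the following. The dictionary layout, the function names consolidate and generate, and all probabilities are assumptions made for illustration; the specification does not prescribe a representation. Each dimension is sampled independently here, which is a simplification of the multi-dimensional behavior the description contemplates.

```python
import random

# Hypothetical component models: each maps one dimension to a distribution.
# Probabilities are invented for illustration.
COMPONENT_MODELS = {
    "mcc": {"5411": 0.5, "5812": 0.3, "5541": 0.2},
    "region": {"NE": 0.6, "IA": 0.4},
    "fraud": {False: 0.98, True: 0.02},
}


def consolidate(component_models):
    """Combine component models into one structure the generator can sample."""
    return {dim: (list(dist), list(dist.values()))
            for dim, dist in component_models.items()}


def generate(consolidated, n, rng):
    """Create n synthetic transactions by sampling every dimension."""
    txns = []
    for i in range(n):
        txn = {"id": f"SYN-{i:06d}"}          # identifier has no real-world source
        for dim, (values, weights) in consolidated.items():
            txn[dim] = rng.choices(values, weights=weights, k=1)[0]
        txns.append(txn)
    return txns


synthetic = generate(consolidate(COMPONENT_MODELS), 5000, random.Random(0))
fraud_rate = sum(t["fraud"] for t in synthetic) / len(synthetic)
```

Because the generator reads only the consolidated distributions, no individual real record is available to it in this sketch.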

[0026] Referring also to FIG. 2, the system 10 may broadly function substantially as follows, with additional and/or alternative functionality described below in the discussion of the computer-implemented method. The machine learning system 20 may be trained on the real production data stored in the real data database 18 to establish the one or more component models 22 that represent both relevant normal and relevant abnormal behaviors of the transaction stream 14, as shown in 112. The component models 22 may be combined in the consolidated model 24 that represents data states that include both the normal and abnormal behaviors in a relative time base, as shown in 114. The consolidated model 24 may be combined with the transaction generator 26 to create the synthesized production data that, as a whole, realistically mimics the normal and abnormal behaviors of the real production data, as shown in 116. The synthesized production data may not be coupled to the real production data, and the real production data may not be derivable from the synthesized production data. The synthesized production data may be stored in the synthesized data database 28. The synthesized production data may then be used internally or externally for testing and/or modeling, including for testing models of financial, transactional, or other systems 30, as shown in 118. For example, the synthesized production data may be used to test financial transaction models and fraud detection models.

[0027] The system 10 may include more, fewer, or alternative components and/or perform more, fewer, or alternative actions, including those discussed elsewhere herein, and particularly those discussed below in describing the computer-implemented method.

[0028] Referring again to FIG. 2, an embodiment is shown of a computer-implemented method 110 for creating and using realistic synthetic data for testing, modeling, and other uses without the need to seek permissions or risk data loss. The computer-implemented method 110 may be a corollary to the functionality of the system 10 of FIG. 1, and may be similarly implemented using the various components of the system 10 within the above-described exemplary operating environment. The computer-implemented method 110 may broadly comprise the following steps.

[0029] An artificial intelligence system 20, such as a machine learning system, may be trained on real production data from the transaction system 12 to establish one or more component models 22, or a series of component models, that represent both relevant normal and relevant abnormal behaviors of a transaction stream 14, as shown in 112. For example, the transaction stream 14 may be a financial transaction stream. The component models 22 may be combined in a consolidated model 24 that represents data states that include both the normal and abnormal behaviors in a relative time base, as shown in 114. For example, the normal and abnormal behaviors may include non-fraudulent and fraudulent activities. The consolidated model 24 may be combined with a transaction generator 26 to create synthesized production data that, as a whole, realistically mimics the normal and abnormal behaviors of the real production data, as shown in 116. The synthesized production data may not be coupled to the real production data, and the real production data may not be derivable from the synthesized production data. The synthesized production data may be stored in a synthesized data database 28. The synthesized production data may then be used internally or externally for testing and/or modeling, including for testing models of financial, transactional, or other systems 30, as shown in 118. For example, the synthetic production data may be used to test financial transaction models and fraud detection models.

[0030] Additionally or alternatively, the component models 22 and/or the consolidated model 24 may be periodically or continuously tuned or otherwise modified to better represent trends or other patterns of activity in the transaction stream 14 and the real production data, as shown in 120. Additionally or alternatively, the component models 22 and/or the consolidated model 24 may be periodically or continuously tuned or otherwise modified to better represent new or otherwise changing trends or other patterns of activity not yet being exhibited in the transaction stream 14 or the real production data, as shown in 122, thereby allowing for exploring or otherwise examining the transaction system 12 under the new patterns of activity. Additionally or alternatively, the transaction generator 26 may be tuned or otherwise modified to adhere to certain trends or other patterns of activity while changing (i.e., increasing or decreasing) the volume of transactions, as shown in 124, thereby allowing for characteristic load testing over projections of growth in the volume of transactions.
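For illustration only, the tuning of a pattern (122) and the scaling of volume under a fixed pattern (124) might be sketched as follows. The merchant category shares, the function names tune and generate_counts, and the fivefold growth factor are hypothetical and are not taken from the specification.

```python
import random

# Hypothetical learned pattern: share of traffic per merchant category code.
MCC_SHARE = {"5411": 0.5, "5812": 0.3, "5541": 0.2}


def tune(pattern, key, factor):
    """Scale one category's weight by `factor`, then renormalize to sum to 1."""
    scaled = {k: v * (factor if k == key else 1.0) for k, v in pattern.items()}
    total = sum(scaled.values())
    return {k: v / total for k, v in scaled.items()}


def generate_counts(pattern, volume, rng):
    """Draw `volume` transactions and count them per category."""
    draws = rng.choices(list(pattern), weights=list(pattern.values()), k=volume)
    return {k: draws.count(k) for k in pattern}


rng = random.Random(1)
baseline = generate_counts(MCC_SHARE, 10_000, rng)
projected = generate_counts(MCC_SHARE, 50_000, rng)  # 5x volume, same pattern
what_if = tune(MCC_SHARE, "5812", 2.0)               # pattern not yet observed
```

The projected run keeps the category proportions of the baseline while changing only the transaction count, which is the property a characteristic load test would rely on.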

[0031] The computer-implemented method 110 may include more, fewer, or alternative actions, including those discussed elsewhere herein, and particularly those discussed above in describing the system 10.

[0032] In exemplary use, the system 10 and/or computer-implemented method 110 may be employed as follows. Given a very large data set of real financial transactions, dimensions of the data may be chosen to reveal characteristics of the consumers, such as locations, merchant category codes, and average transaction amount. The machine learning system 20 may create and train component models 22 that can accurately predict the distribution of this type of transaction from these consumers. This process may be repeated to create several component models 22 and dimension characteristics along with distribution information that accurately represents all the dimensional characteristics of the real production data in the transaction system 12 (even including fraudulent transactions). The component models 22 may then be combined in the consolidated model 24, and the consolidated model 24 may be combined with the transaction generator 26 to create a synthetic transaction stream and generate synthetic production data.
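As a simplified sketch of the exemplary use above, component models for the chosen dimensions might be estimated as follows. Plain frequency counting and summary statistics are substituted here for a trained machine learning system, and the four records, the field names, and the function names are invented for illustration.

```python
from collections import Counter
from statistics import mean, pstdev

# Toy stand-in for real production data; every value is invented.
REAL_TRANSACTIONS = [
    {"location": "NE", "mcc": "5411", "amount": 40.0, "fraud": False},
    {"location": "NE", "mcc": "5411", "amount": 60.0, "fraud": False},
    {"location": "IA", "mcc": "5812", "amount": 20.0, "fraud": False},
    {"location": "NE", "mcc": "5541", "amount": 80.0, "fraud": True},
]


def fit_categorical(records, field):
    """Component model for a categorical dimension: value -> relative frequency."""
    counts = Counter(r[field] for r in records)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}


def fit_amount(records):
    """Component model for transaction amount: mean and population std. dev."""
    amounts = [r["amount"] for r in records]
    return {"mean": mean(amounts), "stdev": pstdev(amounts)}


def fit_component_models(records):
    """Repeat the fitting step for each chosen dimension of the data."""
    models = {f: fit_categorical(records, f) for f in ("location", "mcc", "fraud")}
    models["amount"] = fit_amount(records)
    return models


component_models = fit_component_models(REAL_TRANSACTIONS)
```

Only aggregate distributions leave this step; the individual records are not carried into the component models.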

[0033] More specifically, the transaction generator 26 may access dimensional data repositories filled with synthesized information mimicking the attribute characteristics of the original production data (e.g., fictitious names with their addresses, phone numbers, etc.) that follow the same pattern of dimensional analysis and distribution into the synthesized production data. The transaction generator 26 may then choose from these data elements to create a synthetic transaction stream that mimics the original transaction stream 14 based on the behavioral model patterns (this may be for each dimension, or it may take a multi-dimensional approach for each fictitious consumer). When the same models are run against the newly created synthetic production data, they should yield the same result or similar result as the real production data.
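One way to check the expectation stated above, offered only as a sketch, is to compute the same distribution on both data sets and compare the results. Total variation distance is a metric chosen for this illustration and is not named in the specification; the records, field names, and function names are likewise hypothetical. The overlap count is a basic sanity check and does not by itself establish that real data is not derivable from synthetic data.

```python
from collections import Counter


def distribution(records, field):
    """Relative frequency of each value of `field` in `records`."""
    counts = Counter(r[field] for r in records)
    total = sum(counts.values())
    return {k: n / total for k, n in counts.items()}


def total_variation(p, q):
    """Total variation distance between two discrete distributions (0 = identical)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def overlap(real, synthetic, key_fields):
    """Count synthetic records that match a real record on all key fields."""
    real_keys = {tuple(r[f] for f in key_fields) for r in real}
    return sum(tuple(s[f] for f in key_fields) in real_keys for s in synthetic)


# Toy records; names and codes are invented for illustration.
real = [
    {"name": "A. Real", "mcc": "5411"}, {"name": "B. Real", "mcc": "5411"},
    {"name": "C. Real", "mcc": "5812"}, {"name": "D. Real", "mcc": "5541"},
]
synthetic = [
    {"name": "Fictitious 1", "mcc": "5411"}, {"name": "Fictitious 2", "mcc": "5812"},
    {"name": "Fictitious 3", "mcc": "5411"}, {"name": "Fictitious 4", "mcc": "5541"},
]

tvd = total_variation(distribution(real, "mcc"), distribution(synthetic, "mcc"))
leaked = overlap(real, synthetic, ["name"])
```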

[0034] Thus, embodiments advantageously allow for creating and using realistic synthetic production data for testing, modeling, and other uses without the need to seek permissions or risk data loss, and wherein the synthetic data more accurately represents normal and abnormal behaviors and/or other nuances of real production data.

[0035] Although the invention has been described with reference to the one or more embodiments illustrated in the figures, it is understood that equivalents may be employed and substitutions made herein without departing from the scope of the invention as recited in the claims.

[0036] Having thus described one or more embodiments of the invention, what is claimed as new and desired to be protected by Letters Patent includes the following:

Claims

CLAIMS:
1. A system for modeling behaviors in a transaction system and creating and using realistic synthetic production data, the system comprising:
an artificial intelligence system trained on real production data from the transaction system to create one or more component models that represent relevant normal behaviors and relevant abnormal behaviors of a transaction stream;
a consolidated model which combines the one or more component models and represents data states that include both the relevant normal behaviors and the relevant abnormal behaviors in a relative time base; and
a transaction generator used in combination with the consolidated model to create the synthesized production data that realistically mimics the relevant normal behaviors and the relevant abnormal behaviors of the real production data,
wherein the synthesized production data is not coupled to the real production data, and the real production data is not derivable from the synthesized production data.
2. The system as set forth in claim 1, wherein the artificial intelligence system is a machine learning system.
3. The system as set forth in claim 1, wherein the transaction stream is a financial transaction stream.
4. The system as set forth in claim 1, wherein the consolidated model represents data states that include both non-fraudulent activity and fraudulent activities.
5. The system as set forth in claim 1, wherein the one or more component models are at least periodically modified to better represent patterns of activity in the transaction stream and the real production data.
6. The system as set forth in claim 1, wherein the one or more component models create new patterns of activity not exhibited in the transaction stream and the real production data.
7. The system as set forth in claim 1, wherein the transaction generator adheres to certain patterns of activity while changing a volume of transactions.
8. The system as set forth in claim 1, wherein the system further comprises one or more models of a financial transaction system tested with the synthesized production data.
9. A system for modeling financial transactional behaviors in a financial transaction system and creating and using realistic synthetic production data, the system comprising:
a machine learning system trained on real production data from the financial transaction system to create one or more component models that represent relevant normal behaviors and relevant abnormal behaviors of a financial transaction stream;
a consolidated model which combines the one or more component models and represents data states that include both the relevant normal behaviors and the relevant abnormal behaviors, including both non-fraudulent activity and fraudulent activities, in a relative time base; and
a transaction generator used in combination with the consolidated model to create synthesized production data that realistically mimics the relevant normal behaviors and the relevant abnormal behaviors of the real production data,
wherein the synthesized production data is not coupled to the real production data, and the real production data is not derivable from the synthesized production data, and
wherein the one or more component models are at least periodically modified to better represent patterns of activity in the financial transaction stream and the real production data.
10. The system as set forth in claim 9, wherein the one or more component models create new patterns of activity not exhibited in the transaction stream and the real production data.
11. The system as set forth in claim 9, wherein the transaction generator adheres to certain patterns of activity while changing a volume of transactions.
12. The system as set forth in claim 9, wherein the system further comprises one or more models of the financial transaction system tested with the synthesized production data.
13. A computer-implemented method for improving the functionality of a computer for modeling behaviors in a transaction system to create and use realistic synthetic production data, the computer-implemented method comprising:
training an artificial intelligence system on real production data from the transaction system to create one or more component models that represent relevant normal behaviors and relevant abnormal behaviors of a transaction stream;
combining the one or more component models in a consolidated model that represents data states that include both the relevant normal behaviors and the relevant abnormal behaviors in a relative time base;
combining the consolidated model with a transaction generator to create synthesized production data that realistically mimics the relevant normal behaviors and the relevant abnormal behaviors of the real production data,
wherein the synthesized production data is not coupled to the real production data, and the real production data is not derivable from the synthesized production data; and
using the synthetic production data in a model of a system.
14. The computer-implemented method as set forth in claim 13, wherein the artificial intelligence system is a machine learning system.
15. The computer-implemented method as set forth in claim 13, wherein the transaction stream is a financial transaction stream.
16. The computer-implemented method as set forth in claim 13, wherein the consolidated model represents data states that include both non-fraudulent activity and fraudulent activities in the relative time base.
17. The computer-implemented method as set forth in claim 13, further including at least periodically modifying the one or more component models to better represent patterns of activity in the transaction stream and the real production data.
18. The computer-implemented method as set forth in claim 13, wherein the one or more component models create new patterns of activity not exhibited in the transaction stream and the real production data.
19. The computer-implemented method as set forth in claim 13, wherein the transaction generator adheres to certain patterns of activity while changing a volume of transactions.
20. The computer-implemented method as set forth in claim 13, wherein the model of the system is a fraud detection model.
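
The arrangement recited in claims 1 and 13 (component models, a consolidated model on a relative time base, and a transaction generator) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the specification: the names `ComponentModel`, `fit_component`, `consolidate`, and `generate` are hypothetical, simple summary statistics stand in for the trained artificial intelligence system, a Poisson arrival process and log-normal amounts stand in for the learned behaviors, and the `real` records are made-up toy values. The sketch does not by itself show that real data is not derivable from the output.

```python
import math
import random
import statistics
from dataclasses import dataclass


@dataclass
class ComponentModel:
    """Summary of one behavior class (hypothetical structure)."""
    label: str
    rate: float       # transactions per second
    log_mu: float     # mean of log(amount)
    log_sigma: float  # standard deviation of log(amount)


def fit_component(label, records):
    """Fit one component model from (timestamp_seconds, amount) records."""
    times = sorted(t for t, _ in records)
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    logs = [math.log(amount) for _, amount in records]
    return ComponentModel(label, 1.0 / statistics.mean(gaps),
                          statistics.mean(logs), statistics.pstdev(logs))


def consolidate(components):
    """Combine the component models into one model keyed by behavior label."""
    return {c.label: c for c in components}


def generate(model, count, volume_scale=1.0, seed=0):
    """Sample `count` synthetic transactions on a relative time base (t starts at 0)."""
    rng = random.Random(seed)
    labels = list(model)
    rates = [model[k].rate for k in labels]
    total_rate = volume_scale * sum(rates)  # scaling changes volume only
    clock, out = 0.0, []
    for _ in range(count):
        clock += rng.expovariate(total_rate)      # gap to the next event
        c = model[rng.choices(labels, rates)[0]]  # class mix follows the rates
        amount = rng.lognormvariate(c.log_mu, c.log_sigma)
        out.append({"t": clock, "label": c.label, "amount": round(amount, 2)})
    return out


# Made-up stand-in for real production data: (timestamp_seconds, amount).
real = {
    "normal": [(0, 20.0), (60, 35.5), (130, 12.0), (200, 48.0), (260, 25.0)],
    "fraud": [(100, 900.0), (400, 1500.0)],
}
model = consolidate(fit_component(k, v) for k, v in real.items())
synthetic = generate(model, count=1000, volume_scale=2.0)
```

In this sketch the generator emits only values sampled from the fitted statistics, with timestamps counted from zero rather than copied from the input records. The `volume_scale` parameter loosely mirrors claims 7, 11, and 19: doubling it halves the gaps between transactions while leaving the normal-to-fraud mix and the amount distributions unchanged.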

Priority Applications (2)

Application Number Priority Date Filing Date Title
US201762487741P 2017-04-20 2017-04-20
US62/487,741 2017-04-20

Publications (1)

Publication Number Publication Date
WO2018194707A1 (en) 2018-10-25

Family

ID=63856781

Family Applications (1)

Application Number Publication Number Priority Date Filing Date Title
PCT/US2017/055120 WO2018194707A1 (en) 2017-04-20 2017-10-04 System and computer-implemented method for generating synthetic production data for use in testing and modeling

Country Status (1)

Country Link
WO (1) WO2018194707A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120109821A1 (en) * 2010-10-29 2012-05-03 Jesse Barbour System, method and computer program product for real-time online transaction risk and fraud analytics and management
US20130325608A1 (en) * 2009-01-21 2013-12-05 Truaxis, Inc. Systems and methods for offer scoring
WO2015099870A1 (en) * 2013-12-23 2015-07-02 Citibank, N.A. Quantitative assessment of behavior in financial entities and transactions
US20150309919A1 (en) * 2014-04-25 2015-10-29 Wal-Mart Stores, Inc. System and method for generating synthetic data for software testing purposes
US20160267483A1 (en) * 2014-03-12 2016-09-15 Facebook, Inc. Systems and methods for identifying illegitimate activities based on historical data

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 17905979; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase in: Ref country code: DE