WO2016094472A1 - System and method for enabling tracking of data usage


Info

Publication number
WO2016094472A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
policy
policies
usage
methods
Application number
PCT/US2015/064612
Other languages
French (fr)
Inventor
Adam Jonathan TOWVIM
Robert Anthony MCDONALD
Daniel Jacob WEITZNER
Original Assignee
Trustlayers, Inc.
Application filed by Trustlayers, Inc. filed Critical Trustlayers, Inc.
Publication of WO2016094472A1


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 - Commerce
    • G06Q30/02 - Marketing; Price estimation or determination; Fundraising
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 - Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 - Protecting data
    • G06F21/62 - Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 - Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 - Protecting personal data, e.g. for financial or medical purposes
    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H - HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00 - ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/60 - ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records

Definitions

  • the present application generally relates to a system and method for improved handling of data usage limitations.
  • the present application relates to a system and methods for improving the handling and tracking of data usage and enabling improved compliance with data usage policies of various kinds.
  • Such methods and systems include a rules language, a real-time reasoning engine that monitors data usage, and a dashboard that, among other capabilities, reports on monitored data usage as it occurs throughout an enterprise.
  • the language may be described as a policy language and/or a declarative rules language.
  • Such methods and systems further include a scalable engine for processing data usage policies, monitoring data usage, and storing information about data usage (along with reasoning about why usage was proper), including internal policies of organizations (so as to protect brand equity), regulatory policies, and the like.
  • Methods and systems disclosed herein include a policy language that captures policies of all types, such as organizational policies, external laws and regulations, rules, and the like.
  • the methods and systems include pre-built, customizable modules that enable users to create and use policies while working in plain English.
  • the methods and systems allow enterprises to attest to (and prove) proper usage, such as by logging usages along with the reasoning that led to usage and to discover issues, such as usage that is risky or out of compliance.
  • the methods and systems disclosed herein include a policy engine that analyzes transactions captured in logs and assesses the transactions for compliance with policies.
  • the policy engine implements the policies within the applications and data uses of an enterprise to track and record data usage.
  • the policy language and customizable modules constructed using the language allow a user to identify what data to watch in a convenient fashion, and the policy engine, when implemented in or applied to data usage of an enterprise, tracks and records data usage relative to those policies and performs reasoning, applying the policies to the data usage.
  • the methods and systems disclosed herein also include the ability to provide policy-related output into dashboards, alerts and the like, optionally in real time or near real time, such as within 10 seconds, three seconds, or one second of the time of a data usage or a proposed data usage.
  • a system comprises a device adapted to execute a data transmitting application, a server adapted to receive transmitted data from the application and a reasoning engine adapted to receive metadata describing the transmitted data and to generate one or more conclusions indicative of a state of transmission of the transmitted data.
  • a method comprises receiving metadata describing the transmission of data by a device to a server and applying one or more predetermined rules to generate one or more conclusions indicative of a state of transmission of the transmitted data.
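  • As an illustrative, non-limiting sketch (in Scala, with hypothetical field names and a hypothetical consent rule), the metadata and conclusions described above could be modeled as follows:

```scala
// Hypothetical sketch: modeling transmission metadata and a predetermined rule
// that yields a conclusion about the state of the transmission.
case class TransmissionMetadata(
  userId: String,
  dataCategory: String,   // e.g., "blood_pressure"
  destination: String,    // server receiving the transmitted data
  userConsented: Boolean
)

sealed trait Conclusion
case class Compliant(reason: String) extends Conclusion
case class NonCompliant(reason: String) extends Conclusion

object TransmissionRules {
  // An assumed predetermined rule: health data may be transmitted only with user consent.
  def evaluate(meta: TransmissionMetadata): Conclusion =
    if (meta.dataCategory == "blood_pressure" && !meta.userConsented)
      NonCompliant(s"User ${meta.userId} did not consent to transmission of ${meta.dataCategory}")
    else
      Compliant(s"Transmission of ${meta.dataCategory} to ${meta.destination} satisfied the consent rule")
}

object TransmissionRulesDemo extends App {
  val meta = TransmissionMetadata("user-42", "blood_pressure", "health-tracker.example.com", userConsented = true)
  println(TransmissionRules.evaluate(meta)) // Compliant(...)
}
```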
  • Policy should be understood to encompass a wide range of internal policies of an enterprise that can relate to storage or use of data, as well as regulatory policies, laws, and regulations of all kinds that regulate storage or use of data, including, without limitation, policies that apply to PII, health care data, financial data, educational data, and many other types.
  • Data should be understood to refer to all kinds of data, such as data about which there is user sensitivity (e.g., personal information) and/or data about which a policy exists.
  • "Real time" and "near real time" mean events occurring substantially simultaneously with each other, such as within one minute, ten seconds, three seconds, or under a second, as context may indicate.
  • the overall methods and systems described herein provide significant enhancement of trust within and outside organizations, including trust by employees, executives, regulators, and consumers that sensitive data can be used while still complying with applicable regulations and policies.
  • the methods and systems, components thereof, and the host entity or entities that enable them are referred to in some cases, according to context, as “TrustLayers,” or “trust layers,” and such references should be understood to refer to the same as such context indicates.
  • Fig. 1 depicts an embodiment of an architecture and enabling components for a system for monitoring how data is used in real time, which includes connections to enterprise user directories and incorporation of said user directory changes into a policy engine according to an exemplary and non-limiting embodiment.
  • FIG. 2 depicts an architecture and enabling components for a system for monitoring how data is used in real time according to an exemplary and non-limiting embodiment.
  • Fig. 3 depicts a dashboard for indicating in real time the extent to which data usage is in compliance with policies according to an exemplary and non-limiting embodiment.
  • Fig. 4 depicts the preservation of original sources, such as allowing the ability to reference original scanned documents used to create a policy or to which a policy is applied according to an exemplary and non-limiting embodiment.
  • Fig. 5 depicts a rule writing interface for creating rules or policies in plain language that can be applied to data usage in real time according to an exemplary and non-limiting embodiment.
  • Fig. 6 depicts a policy prepared using a rule writing interface according to an exemplary and non-limiting embodiment.
  • Fig. 7 depicts a transaction log captured by the systems disclosed herein according to an exemplary and non-limiting embodiment.
  • Fig. 8 depicts capturing data for an after-the-fact review of application of a policy to a usage of data according to an exemplary and non-limiting embodiment.
  • Fig. 9 depicts an interface for real-time usage control of data in response to determining a risk of non-compliance with a policy according to an exemplary and non-limiting embodiment.
  • Fig. 10 depicts an embodiment of an architecture and related components for real time monitoring of usage of data in a distributed platform.
  • Fig. 11 depicts an architecture and related components for a policy engine according to an exemplary and non-limiting embodiment.
  • Fig. 12 depicts an architecture for real time monitoring of data usage in a distributed platform, including communication of data usage limitations among distributed elements according to an exemplary and non-limiting embodiment.
  • Fig. 1 illustrates an architecture for the methods and systems disclosed herein, according to an exemplary and non-limiting embodiment.
  • the architecture may include a policy engine 102 that may connect via various connectors 104 to enterprise data resources 106.
  • the connectors 104 may for example include any type of programming module that may use or track usage of data such as brokers, connectors, bridges, and other tools for extraction, transformation and loading (ETL) of data or auditing or logging thereof.
  • the data resources of an enterprise may include applications that may use data, data logs, application programming interfaces that may exchange and use data, data exchanged throughout a social graph of an enterprise, and other such resources.
  • the policy engine 102 may be aware of users of the enterprise, such as by an enterprise user directory 108, so that its operation may be user-aware in addition to being aware of policies and data.
  • the policy engine 102 may ingest data from the enterprise and its users, as well as policies that may apply to usage of such data.
  • the architecture may further include policy modules 110 by which policies may be created.
  • the policy modules 110 may use a policy definition language and environment that may allow creation of plain-language, re-usable policies that may correspond to typical enterprise policies and external policies that apply to data usage.
  • the policy engine 102 may apply the policies to data in real time as the data is being used, such as immediately prior to, during or after usage by the enterprise, as reported to it through the connectors 104.
  • the policy engine 102 may analyze data usage against the policies that apply to the usage, may track the reasoning that leads to a conclusion that a particular data usage is, or is not, compliant with a policy, and may store information about the data usage event, the policy, and the reasoning.
  • the policy engine 102 may have one or more services layers that may interact with enterprise data sources 106, the enterprise user directory 108, the policy module(s) 110, and one or more systems for providing alerts (e.g., about non-compliance with policies), for providing reporting about data usage and compliance 114 (e.g., to a work flow system and/or to one or more Personal Information Balance Sheet (PIBS) dashboards 118 that may allow tracking, reporting, and analytics on policy usage).
  • Fig. 2 shows an embodiment of technology architecture for executing the methods and systems depicted in Fig. 1.
  • the architecture may include a multithreaded, JVM-based (Scala) trust engine 202 that may optionally apply forward chaining reasoning to apply a policy to a proposed or actual data usage.
  • Various data sources 204 within the architecture may be accessed via the connectors 104, and various actions and reporting may occur through the connectors 104.
  • the trust engine 202 may take data usage requests and queries, such as via a RESTful API 208, apply policies, and send output, such as showing reasoning and justification for data usage, to a database 210, which may store justifications for later access.
  • the database 210 may include a justification database 212 and an application database 214.
  • the architecture may include a PIBS app server 218 that may allow tracking, reporting, and analytics on policy usage through the one or more PIBS dashboards 118.
  • the tracked output and reporting may be communicated through actions and alerts 220 via email, SMS, web service, new trust layers, or any other mode or request.
  • the methods and systems disclosed herein enable embodying policy as a first class object against which computation can be performed on data in flow for any type of rule set.
  • the rule sets are customized based on the application of the policy engine. Such sets are optionally defined as a separate layer so that all the accountability and data usage tracking-related functions may be performed without affecting the data layer.
  • the systems disclosed herein may thus be seen as a tool for applying policy rather than as a representation or warranty of the policy itself.
  • the systems disclosed herein are capable of applying various privacy and compliance rules to data in use, as compared to data at rest.
  • such rules are applied proactively as the use is about to be activated.
  • the methods and systems disclosed herein provide a set of policies that define a signature for proper and/or improper data usage and that indicate how data should behave when in use. These policies are based on the application of the underlying data. For example, in the case of healthcare patient data, the policy sets may be defined based on the Health Insurance Portability and Accountability Act (HIPAA). The various compliance rules required under HIPAA may be incorporated in the policy sets, including but not limited to rules pertaining to access of data by patients and third parties, consent requirements for the use of data for marketing, rules pertaining to the filing of formal complaints for privacy violations, and so on.
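  • As a minimal, non-limiting sketch, a HIPAA-inspired policy set of the kind described above might be expressed as declarative rules over a proposed use; the field names, purposes, and rule wording here are illustrative assumptions, not the claimed policy language or the statute itself:

```scala
// Hypothetical sketch of a HIPAA-inspired policy set expressed as declarative rules.
case class ProposedUse(
  purpose: String,          // e.g., "treatment", "marketing", "record_access"
  requesterRole: String,    // e.g., "patient", "third_party"
  patientConsented: Boolean
)

// Each rule returns None when satisfied, or Some(violation) when breached.
case class PolicyRule(name: String, check: ProposedUse => Option[String])

object HipaaInspiredPolicySet {
  val rules: Seq[PolicyRule] = Seq(
    PolicyRule("Consent required for marketing use of ePHI",
      u => if (u.purpose == "marketing" && !u.patientConsented)
             Some("marketing use of ePHI without explicit patient consent") else None),
    PolicyRule("Third-party access to records requires consent",
      u => if (u.requesterRole == "third_party" && !u.patientConsented)
             Some("third-party access to records without consent") else None)
  )

  def assess(use: ProposedUse): Seq[String] =
    rules.flatMap(r => r.check(use).map(violation => s"${r.name}: $violation"))
}

// HipaaInspiredPolicySet.assess(ProposedUse("marketing", "third_party", patientConsented = false))
// reports both violations for this proposed use.
```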
  • the system disclosed herein thus abstracts the policy for data usage as a 'policy layer' separate from the application layer. This ability to abstract out the policy layer enables an enterprise to change their policy profile without digging into application code or data.
  • the methods and systems disclosed herein enable real-time compliance reporting, thereby attesting that the data in motion has been validated against an agreed-upon rule set. Such a report may be presented to a regulator through a dashboard depicting the various rule sets and a compliance report against each set.
  • transparent, enterprise integration architecture seamlessly layers on top of major enterprise application frameworks, accessing data through common log file formats without disrupting the data usage.
  • policy analytics may also be fed back into established application workflow as real-time user or supervisor alerts using standard APIs. Whether deployed for after-the-fact audit or real-time user alerting, there is no processing burden placed on existing data analytics tools.
  • a dashboard 302 may provide information about the extent to which an enterprise is complying with various data usage policies in real time, providing a "Good Housekeeping" -like seal of approval for proper usage.
  • the dashboard allows confirmation of proper usage to enable growth of data usage without running afoul of policies.
  • a Personal Information Balance Sheet (PIBS) feature may balance public accountability with protection for sensitive and classified data.
  • the balance sheet function produces provably-correct, consistent and complete summaries of data usage for review by regulators, overseers or the general public without needing to disclose classified or confidential details.
  • This feature may track changes in policies and data over time, thus the analysis can be traced back to both the original data and the policies in effect at the time the analysis was made. This is a significant advantage over traditional access log solutions that may be able to reveal who accessed data, but cannot track whether the data was used according to the rules.
  • the methods and systems disclosed herein enable the capacity for users to create custom "what if?" scenarios by adjusting time, policies and contextual use for data.
  • the methods and systems may allow scalability, for example by monitoring data use across billions of transactions, allowing real-time response at scale and audit trails for non-compliant use of data. For example, auditing the usage of clinical trial data, based on patient consent, may be efficiently scaled by minimizing the storage space required for output.
  • the methods and systems disclosed herein may track exceptions, as well as overrides to policy exceptions.
  • the methods and systems may prevent audit trail tampering.
  • the methods and systems may employ industry-grade encryption of output data.
  • the methods and systems may control access in on-premises and single or multi-tenant cloud installations.
  • Fig. 4 depicts the preservation of original sources, such as allowing the ability to reference original scanned documents used to create a policy or to which a policy is applied.
  • Fig. 5 depicts a rule writing interface for creating rules or policies in plain language that can be applied to data usage in real time.
  • Referring to Fig. 5, there is illustrated an exemplary and non-limiting embodiment of a user interface 500.
  • Interface 500 enables a rule or policy to be captured in plain language form, allowing the reasoning engine to ingest and apply the rule to a particular data usage instance.
  • a plain English rule name is provided descriptive of the operation of the rule.
  • the rule name is "Experiment should not use nonprofit sample for for-profit research."
  • data constructs comprised of various variables, constants and literals are available for selection by a user as well as entity data structures.
  • entities include a sample entity and a researcher entity.
  • Rule filter 502 may be populated with data constructs and utilized to define rule parameters using Boolean logic as illustrated. Once defined, the rule is summarized in rule preview 504 and may be saved for later application and use.
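  • As a non-limiting sketch, the rule captured in the interface above could be encoded as a Boolean filter over the sample and researcher entities; the entity fields are hypothetical stand-ins for the selected data constructs:

```scala
// Hypothetical encoding of the rule shown in Fig. 5,
// "Experiment should not use nonprofit sample for for-profit research".
case class Sample(id: String, collectedByNonprofit: Boolean)
case class Researcher(id: String, forProfit: Boolean)
case class Experiment(sample: Sample, researcher: Researcher)

object NonprofitSampleRule {
  // Boolean rule filter: flag an experiment that pairs a nonprofit sample
  // with a for-profit researcher.
  def violates(e: Experiment): Boolean =
    e.sample.collectedByNonprofit && e.researcher.forProfit

  // Example: this combination would be reported as a violation.
  val flagged: Boolean = violates(
    Experiment(Sample("S-1", collectedByNonprofit = true), Researcher("R-9", forProfit = true)))
}
```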
  • the system may facilitate plain-language capabilities, such as plain-English capabilities, and may involve shareable policy modules.
  • the system may allow policy analysis over graph models.
  • the system may allow context-specific policies, ontology integration, and an adaptable architecture.
  • Fig. 6 depicts an example of a policy prepared using a rule writing interface as shown in Fig. 5.
  • rules may be combined from many parts of an organization or across enterprise boundaries, allowing real-time and retrospective assessment of policy compliance as data is shared around.
  • Unified policy analytics provides customized "what-if" analysis of the impact of new rules before they are fully implemented.
  • Scalable reasoning is provided for analysis of data use with respect to even the most complex of policies, scaling up to many millions of queries across thousands of rules. Unlike other reasoning approaches, which require hard-coding rules into specific software platforms, the system allows rules to change over time without costly software redesign.
  • the methods and systems disclosed herein provide a marketplace in shareable, pre-built policies. This enables reuse of policy sets defined for one given application for other applications after relevant modifications.
  • the marketplace model enables companies and policy writers to publish and share policy sets on a platform to other potential users. This buying, selling and exchange of policies through a marketplace model with supporting data structures, interfaces, and transaction environment will allow for policy reuse and support a range of business models around the marketplace.
  • Fig. 7 depicts an embodiment of a transaction log captured by the systems disclosed herein.
  • audit information is tightly coupled with a human readable explanation of the policy analysis. Even the most non-technical of users can understand automatically generated plain-English explanations of the system's conclusions. When the expert user needs to drill in more deeply, the explanations are tied to a provably correct execution trace, revealing the full decision making process.
  • Fig. 8 depicts an embodiment of capturing data for an after-the-fact review of application of a policy to a usage of data.
  • Another example involves policies in state law enforcement, such as those relating to usage of sensitive data in connection with predictive policing.
  • the methods and systems disclosed herein may increase accountability of predictive policing efforts, such as in the wake of class action lawsuits that question law enforcement usage of data, such as alleging improper profiling.
  • the system may improve uncertain gang member classification and manage limited review of arrest records by applying policy-based reasoning to possible usage of arrest record data.
  • the system may provide automated third party review and classification of the usage of record data. This may allow state police to settle lawsuits, increase efficiency and objectivity, and to comply with state law.
  • exemplary real world policies may include a proposed reasoning approach, embodied in a policy, for documenting whether a person is a gang member.
  • a criminal street gang member may be a person who participates in or acts in concert with a criminal street gang as defined in a law, and the person's participation is more than nominal, passive, inactive, or purely technical.
  • a person may be documented as a gang member if, within the preceding five years: 1. the person admits his gang membership; or 2. whether in custody or not, two of the following criteria are met: a. the person has been arrested in the commission of a crime where the criminal associates are documented gang members; b. the person has been identified as a gang member through the use of a reliable confidential informant, a parent or guardian of the suspect, or other documented gang members; c. the person has known and identifiable gang tattoos; or d. the person wears clothing that can be identified as gang specific, either in the clothing itself or in the manner in which the clothing is worn.
  • the reasoning chain and the data to which it has been applied may be preserved, such as to enable legal review of the determination of whether a particular individual is (or is suspected to be) a gang member.
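  • A minimal, non-limiting sketch of such a determination, preserving which criteria fired so that the reasoning chain can be reviewed, might look as follows (field names are hypothetical and the five-year window is omitted for brevity):

```scala
// Hypothetical sketch of the documentation policy above, recording the satisfied
// criteria so the reasoning chain can be reviewed later.
case class PersonRecord(
  admitsMembership: Boolean,
  arrestedWithDocumentedMembers: Boolean,
  identifiedByReliableSource: Boolean,
  hasIdentifiableGangTattoos: Boolean,
  wearsGangSpecificClothing: Boolean
)

case class Determination(documented: Boolean, satisfiedCriteria: Seq[String])

object GangDocumentationPolicy {
  def evaluate(p: PersonRecord): Determination = {
    val secondary = Seq(
      "arrested with documented gang members" -> p.arrestedWithDocumentedMembers,
      "identified by a reliable source"       -> p.identifiedByReliableSource,
      "known and identifiable gang tattoos"   -> p.hasIdentifiableGangTattoos,
      "gang-specific clothing"                -> p.wearsGangSpecificClothing
    ).collect { case (label, true) => label }

    if (p.admitsMembership) Determination(documented = true, Seq("admits gang membership"))
    else if (secondary.size >= 2) Determination(documented = true, secondary)
    else Determination(documented = false, secondary)
  }
}
```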
  • the methods and systems disclosed herein can embody real time accountability for terms of service for app developers in shareable/reusable policies and rules. For example, terms of service that are required for developers of software applications that are certified by certain vendors may be modeled. The system can then look at the information flow and help figure out whether one is complying with those terms or not.
  • Fig. 9 depicts an embodiment of an interface for real-time usage control of data in response to determining a risk of non-compliance with a policy.
  • Fig. 10 depicts an embodiment of an architecture and related components for real time monitoring of usage of data in a distributed platform with one or many endpoints (an endpoint optionally being a mobile device, a physical device with embedded data, or any other kind of device that may use data, the usage of which an enterprise or other user may wish to monitor).
  • the methods and systems disclosed herein provide an architecture for monitoring real time data usage by end point (e.g. mobile) devices and applications, referred to herein in some cases as the "distributed" or “mobile” architecture.
  • the "mobile” architecture encompasses various types of distributed endpoints, including ones that may not typically be mobile devices.
  • data is often used on endpoint devices, such as in connection with devices that are associated with large numbers of apps that come from app stores.
  • a user may have a smart phone or similar device, such as an iPhone 6+®, which may have the capability to store health data, such as the user's blood pressure.
  • a third party app could access that blood pressure reading, such as by using a health care SDK for that device.
  • a computing platform company, mobile device company, mobile network operator or other app provider can permit or enable the third party app to operate on the device, but also use a data compliance tracking system, such as by using or having application developers use a set of tools, interfaces and capabilities referred to in the figures as the "TrustLayers SDK," so that there is reporting to Trust Layers servers whenever the health data stored by the device for health apps is used.
  • the app developer can arrange to have a pixel fired to the tracking party's servers, or the device itself can send a tracking signal when health care data is used, triggering application of the methods and systems disclosed throughout this disclosure.
  • the compliance tracking (or TrustLayers) code can run on the device; the company enabling the app (e.g., the app platform company, mobile device provider or mobile operating company) can confirm that the app has it running, such as from the TrustLayers SDK.
  • the SDK looks at data columns from the health care/health kit SDK. It captures the context of usage, such as based on the nature of the application that is using it. For example, some apps let you communicate with your doctor. Such an app might access blood pressure data only to improve medical services, so it would be within the terms of use of a health care policy.
  • the compliance tracking system, or TrustLayers system may take data from a field from the app that indicates the nature or context of the use (e.g., in a self-reporting mechanism) to confirm that the app is compliant.
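  • A minimal, non-limiting sketch of such a self-reporting check might pair a usage report with a list of permitted purposes; the purpose names are assumptions rather than the actual health kit terms of use:

```scala
// Hypothetical sketch of the self-reporting mechanism: the app reports what data
// was accessed and the context of use, and a simple check confirms the context is
// within the permitted purposes.
case class UsageReport(appId: String, dataField: String, contextOfUse: String)

object HealthDataUsageCheck {
  val permittedPurposes = Set("improve_medical_services", "share_with_doctor")

  def isCompliant(report: UsageReport): Boolean =
    report.dataField != "blood_pressure" || permittedPurposes.contains(report.contextOfUse)

  // The SDK would transmit both the report and the conclusion to the tracking servers.
  val example = UsageReport("health-tracker-app", "blood_pressure", "improve_medical_services")
  val exampleCompliant = isCompliant(example) // true for this context of use
}
```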
  • a user operates or otherwise interacts with a device 1104 that runs an App 1102.
  • the app, such as a health tracking or health kit app 1102, transmits data to the first party server 1108; such data is comprised of a user id, a date, blood pressure data, and a confirmation of user consent for the use of the blood pressure data.
  • Metadata describing the data transmitted to the first party server 1108 is transmitted by the App 1102 to the TrustLayers SDK 1110.
  • TrustLayers SDK 1110 proceeds, in real or near real-time, to generate data conclusions 1114. Specifically, information indicative of the proper transmission of the heart data is transmitted to a personal data access log 1112, a data-use dashboard 1118 and may be used to generate one or more data-use alerts 1120.
  • a data conclusion 1114 is captured and reported that the data is being used properly (or is not being used properly in other cases).
  • Data in the figure is going to health tracker's own servers 1108 (the one taking the blood pressure reading).
  • TrustLayers 1110 is the third party that allows the health tracking, or "Health Tracker," app 1102 to confirm use of data in compliance with policies. TrustLayers 1110 says access was compliant with the health kit terms of use, recording compliance on servers and notifying if someone is using data in a non-compliant fashion.
  • the TrustLayers system preserves an audit trail there, so if the enterprise wants to do audits, it can request the audit trail and match it up. For example, if there is evidence that an enterprise made 1,000,000 requests and only logged about 50,000 requests in Trust Layers 1110 (a big discrepancy), there is a suggestion that some usage may be improper. At another level of granularity, particular usage can be tracked, and justifications logged, as in other embodiments described throughout this disclosure.
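  • A minimal, non-limiting sketch of the discrepancy check described above might compare requests made against usages logged and flag a large gap; the threshold is an assumption:

```scala
// Hypothetical sketch of the audit-trail discrepancy check: flag a large gap
// between requests made and usages logged as a sign that some usage may be improper.
object AuditDiscrepancy {
  def suspicious(requestsMade: Long, usagesLogged: Long, minLoggedRatio: Double = 0.5): Boolean =
    usagesLogged.toDouble / requestsMade.toDouble < minLoggedRatio

  // Example from the text: 1,000,000 requests but only about 50,000 logged usages.
  val bigDiscrepancy = suspicious(1000000L, 50000L) // true: only 5% of requests were logged
}
```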
  • the reasoning is occurring on the endpoint (e.g., mobile) device itself.
  • TrustLayers 1110 monitors the usage on the device, while still preserving all security, and records that usage on the TrustLayers servers.
  • the methods and systems disclosed herein provide a reasoner module of the policy engine that operates on data, which may be linked data.
  • the ability to operate within a linked data architecture enables policy extensibility across any arbitrary set of applications and/or data sources. It further provides for modularity and composability of policies.
  • the abstraction of the connectors provides a way to layer the reasoning engine on top of any arbitrary set of data sources and is enabled, in part, because of the linked nature of the data. Using linked data enables the provision of all of the data in a common form into the trust engine.
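  • A minimal, non-limiting sketch of supplying facts to the trust engine in a common, linked form might represent each fact as a subject-predicate-object triple; the identifiers are illustrative:

```scala
// Hypothetical sketch of linked facts in a common form, independent of which
// connector produced them.
case class Triple(subject: String, predicate: String, obj: String)

object LinkedUsageFacts {
  val facts = Seq(
    Triple("transaction:123", "usesData",    "dataset:patientRecords"),
    Triple("transaction:123", "purpose",     "marketing"),
    Triple("transaction:123", "requestedBy", "user:alice")
  )

  // A reasoner can query the facts uniformly, e.g., find the purpose of a transaction.
  def purposeOf(tx: String): Option[String] =
    facts.collectFirst { case Triple(`tx`, "purpose", p) => p }
}
```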
  • the methods and systems disclosed herein may employ a reasoning technology inside a trust engine that makes use of a non-linked data model, such as a relational data model.
  • such a model may be of particular use at very high volume transaction rates.
  • unstructured data such as identifiable information in a .PDF format document, could be monitored, such as by tagging identifiable sensitive information within the .PDF document for monitoring by the reasoning module.
  • the methods and systems disclosed herein may employ a reasoning technology inside a trust engine that makes use of a hybrid data model, such as including any combination of linked data, relational data, and unstructured data.
  • the methods and systems disclosed herein provide a forward chaining algorithm approach with regard to the rules engine when operating in accordance with a request to analyze.
  • the trust engine is not a pure rules engine that merely computes "yes” or "no.”
  • the system is enabled to handle conflicts as well.
  • forward chaining with goal direction operates to derive conclusions from the applicable rules and facts.
  • in addition to forward chaining, backward chaining may be employed.
  • the choice between forward chaining and backward chaining depends, at least in part, on the need for performance optimization.
  • both forward chaining and backward chaining may be used, in a hybrid form of chaining algorithm.
  • the methods and systems disclosed herein operate to select a type of chaining for optimization of a rules engine that applies policy to data such as "in use” or "in flight” data. This optimization may happen on a per-rule or per-usage basis, such that the type of chaining is optimized to the type of rule, the type of data, the type of usage, or the like.
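  • A minimal, non-limiting sketch of a forward-chaining step (not the engine's actual algorithm) might repeatedly apply rules whose premises are satisfied until no new conclusions are derived:

```scala
// Minimal, illustrative forward-chaining loop over string-labeled facts.
case class InferenceRule(premises: Set[String], conclusion: String)

object ForwardChainer {
  def run(initialFacts: Set[String], rules: Seq[InferenceRule]): Set[String] = {
    var known = initialFacts
    var changed = true
    while (changed) {
      val derived = rules.collect {
        case r if r.premises.subsetOf(known) && !known.contains(r.conclusion) => r.conclusion
      }
      changed = derived.nonEmpty
      known = known ++ derived
    }
    known
  }

  // Example: deriving that a usage is permitted from two established facts.
  val rules = Seq(InferenceRule(Set("consent_given", "purpose_is_treatment"), "usage_permitted"))
  val conclusions = run(Set("consent_given", "purpose_is_treatment"), rules) // contains "usage_permitted"
}
```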
  • justification output refers not merely to an outputted "Yes" or "No," or to a finding of compliance or non-compliance, but rather to how one arrives at a decision.
  • justification output (i) optionally enables an explanation of the results in human readable form and (ii) from an audit perspective, allows one to go back and ask why that whole decision was made in the first place (which includes showing conditional reasoning based on alternative beliefs). For example, "If you believe X, this is all true. If you believe Y, then another thing is all true.”
  • a policy might require that the use of an IP address for targeting ads is permissible if the IP address is not "personally identifiable,” using the standard that the IP address cannot be linked to a particular person.
  • a second policy might state that the use of an IP address for targeting ads is permissible if you believe an IP address is not personally identifiable, using the standard that the IP address cannot be linked to a group of people smaller than 10.
  • a conflict potentially arises, because a data usage that can be linked to any one of five people, but not a specific individual, would not be "personally identifiable” under the first policy, but it would be "personally identifiable” under the second policy. Normally, one has to resolve such a conflict in advance, or an irresolvable contradiction might arise.
  • one rule may preserve and perform reasoning in the alternative, i.e., represent both possibilities: that the IP address is or is not personally identifiable, depending on the standard used for determining what is personally identifiable.
  • one rule may indicate that after a period of twenty years, data collected from samples may be used for any purpose.
  • a second rule may indicate that data collected from samples may not be used if the patient can be identified by name.
  • a logical conflict potentially exists after twenty years, because one rule affirmatively permits use of the sample data in a blanket fashion, while the other rule prohibits use in certain cases.
  • the reasoner of the policy engine may identify the conflict (such as by recognizing the conflicting results "permitted” and “not permitted” when a particular usage is proposed), resolve the conflict (such as by recognizing precedence of one rule over the other or by allowing an individual to actively determine a resolution) and record the outcome, including preserving the justification tree that led to the outcome.
  • Resolution of a conflict might take various forms, such as having a certain outcome (e.g., "not permitted") always trump another type of outcome, by recognizing hierarchies of rules (e.g., federal rules may preempt state rules), by escalating conflicts to decision makers, or the like.
  • a justification tree may preserve forms of reasoning, such as inferences based on association. For example, if one thinks that John is a gang member and one thinks that Dan is a gang member, one may reason that Bob, who has an association with each of them, is a gang member by virtue of the association with at least two believed gang members. The reasoning may be preserved in the justification tree to allow one to revisit how the conclusion was reached.
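  • A minimal, non-limiting sketch of a justification tree that preserves reasoning in the alternative, using the IP address example above, might look as follows; the type and field names are hypothetical:

```scala
// Hypothetical sketch of a justification tree that records both alternative
// branches rather than resolving them in advance.
sealed trait Justification
case class Premise(statement: String) extends Justification
case class Derived(conclusion: String, from: Seq[Justification]) extends Justification
case class Alternative(assumption: String, branch: Justification) extends Justification

object IpAddressExample {
  val tree = Derived(
    conclusion = "use of IP address for ad targeting",
    from = Seq(
      Alternative("standard: not linkable to a particular person",
        Derived("permitted", Seq(Premise("the IP address links to a group of five people")))),
      Alternative("standard: not linkable to a group smaller than 10",
        Derived("not permitted", Seq(Premise("the IP address links to a group of five people"))))
    )
  )
  // A conflict is visible when sibling alternatives derive opposite conclusions; it can
  // then be resolved by precedence or escalated, with the full tree preserved for audit.
}
```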
  • the methods and systems disclosed herein preserve the ability to show reasoning in the alternative in a rules engine for applying policies to uses of data, particularly data that is in use or in flow.
  • the methods and systems disclosed herein provide a human readable presentation of the reasoning process in a data privacy compliance product, automatically and in real time as data is being used or is proposed to be used.
  • log data may be utilized. For example, one may take three months of logs and visualize the data usage. In the enterprise today, almost everything is being logged, so analysis does not have to be ad hoc. One may develop policies and understand how data is used relative to those policies. Data from a wide range of logs may be analyzed by the methods and systems disclosed herein, including firewall data logs, logs of patient data, logs of consumer data, and many others.
  • the methods and systems disclosed herein provide a user interface for a trust engine applicable to data in flow. Such an interface may be used to author policies through a graphical representation.
  • Rule writing requires making logical assertions over data structures.
  • the rule-writing interface is populated with data structures that are derived from connectors.
  • the methods and systems disclosed herein provide a general rule writing engine that can customize itself to an enterprise's unique data structure. For example, a rule may say that you can't use personally identifiable information to geo-target people, or that you can't use location information to make employment decisions. For such a rule to be expressed correctly, it needs to be written with reference to the schema of that organization's systems.
  • the rule interface draws the data structure in through the connectors. The system may learn about the data structure in the enterprise, and when someone sits down to write rules, they have a menu of schema elements (a data dictionary) that they can use to compose the rules with.
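  • A minimal, non-limiting sketch of deriving such a data dictionary from connectors might look as follows; the connector and field names are hypothetical:

```scala
// Hypothetical sketch of populating the rule-writing interface with schema elements
// (a data dictionary) learned through the connectors.
case class FieldDescriptor(source: String, name: String, fieldType: String)

trait SchemaConnector {
  def describeSchema(): Seq[FieldDescriptor]
}

// A stub standing in for, e.g., a database or log connector.
object HrDatabaseConnector extends SchemaConnector {
  def describeSchema(): Seq[FieldDescriptor] = Seq(
    FieldDescriptor("hr_db", "employee_location", "string"),
    FieldDescriptor("hr_db", "hiring_decision",   "string")
  )
}

object RuleAuthoringMenu {
  // The rule author composes rules from this menu of schema elements.
  val dataDictionary: Seq[FieldDescriptor] =
    Seq[SchemaConnector](HrDatabaseConnector).flatMap(_.describeSchema())
}
```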
  • the methods and systems disclosed herein provide data compliance alerts based on the presence of justifications for data usage in a data compliance offering. Because the system preserves justification threads, one may create various alerts 1120. For example, one may alert based, at least in part, on the fact that one is seeing particular justifications. For example, an analysis of a large number of transactions may be based, at least in part, on knowledge one gets from the rules that say "watch out" because there is risk in how these particular rules are being implemented. Other embodiments may incorporate probabilistic (Bayesian or frequentist) presentation of justifications and alerts 1120.
  • the methods and systems disclosed herein provide grouping/tagging of rules into categories of higher and lower risk in a data usage compliance offering. There may be higher and lower risk rules. For example, out of 20 rules, one might find that 3 are "bet the company" rules and the other 17 are less important; 5 of those may have such impact on business performance that you want to watch them closely.
  • the methods and systems disclosed herein provide user-written functions. Users may write a function and implement it into the engine. For example, a user may write a piece of Java code to send a piece of image data to an image recognition system to say "yes or no" as to whether something is a gang tattoo or not.
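  • A minimal, non-limiting sketch of registering a user-written function with the engine might look as follows; the registry API and the image-recognition stub are hypothetical:

```scala
// Hypothetical sketch of plugging a user-written predicate into the engine by name,
// so rules can invoke it on a piece of data. The image-recognition call is a stand-in;
// no real recognition service is invoked here.
object UserFunctionRegistry {
  private var functions = Map.empty[String, Array[Byte] => Boolean]

  def register(name: String, fn: Array[Byte] => Boolean): Unit =
    functions = functions.updated(name, fn)

  def invoke(name: String, data: Array[Byte]): Option[Boolean] =
    functions.get(name).map(fn => fn(data))
}

object UserFunctionDemo {
  // A user-supplied stand-in; a real function would call out to an image recognition system.
  UserFunctionRegistry.register("isGangTattoo", _ => false)

  val verdict = UserFunctionRegistry.invoke("isGangTattoo", "image-bytes".getBytes) // Some(false)
}
```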
  • a policy engine 1200 is provided within which resides and operates a reasoner module 1202.
  • a chaining algorithm 1204 forming a part of the reasoner module 1202 operates, at least in part, on linked data structures 1206 as described more fully herein.
  • the policy engine 1200 can be run in a hosted environment, such as a web service or other cloud environment, or on premises of an enterprise or other entity.
  • a user operates the device 1104 on which operates the health tracking app 1102.
  • the app such as a health tracking or health kit app 1102 transmits health related data to first party server 1108, as illustrated by the path 1210.
  • data is comprised of a user identifier, a date, blood pressure data, a transaction identifier, and an indication of user consent for the use of the heart data.
  • Metadata describing the data transmitted to the first party server 1108 is transmitted by the app 1102 to the TrustLayers SDK 1110.
  • TrustLayers SDK 1110 proceeds, in real or near real-time, to relay data usage limitations via the path 1202 to the first party server 1108.
  • the TrustLayers SDK 1110 may also generate data conclusions 1114 and/or relay either intended data usage or data usage conclusions by the path 1212 to the TrustLayers Server 1114.
  • information indicative of the proper transmission of the blood pressure data is transmitted to the TrustLayers Server 1114, and in turn can be sent to a data-use dashboard 1118 and may be used to generate one or more data-use alerts 1120.
  • the methods and systems disclosed herein provide visual presentation of a proof tree.
  • a dashboard may summarize how the data is being used and how it is compliant or non-compliant. One may highlight individual events based on notification filters that are currently hard-coded but that may be made more customizable.
  • the methods and systems disclosed herein provide a Personal Information Balance Sheet (PIBS).
  • people may share a balance sheet of some form (with regulators, business partners, or internally) with a "total performance" score, a trend, a number or frequency of transactions at risk, the time period over which there have been analyzed items in a log file, etc.
  • the PIBS may share logical analysis of various rules against data, such as transaction data.
  • the PIBS may preserve the provenance of usage decisions against the transaction data and summarize these decisions, optionally without needing to show the underlying transaction data.
  • the methods and systems disclosed herein provide a policy heat map that can show where a policy has been impacted most frequently, such that one may drill down regarding a particular policy that has been highlighted (e.g., 15% of the transactions related to a particular policy were at risk, or "of 457 transactions, 200 are OK, 50 are unknown, some are unresolved, etc."). For a compliance officer, this is a mechanism to get at which policies are problem areas. There is a distribution over time.
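  • A minimal, non-limiting sketch of the per-policy summary behind such a heat map might group assessments by policy and outcome; the outcome labels are illustrative:

```scala
// Hypothetical sketch of the summary behind a policy heat map: per-policy counts of
// transaction outcomes, suitable for drill-down without exposing the underlying data.
case class Assessment(policy: String, outcome: String) // e.g., "ok", "at_risk", "unknown"

object PolicyHeatMap {
  def summarize(assessments: Seq[Assessment]): Map[String, Map[String, Int]] =
    assessments.groupBy(_.policy).map { case (policy, rows) =>
      policy -> rows.groupBy(_.outcome).map { case (outcome, xs) => outcome -> xs.size }
    }
  // For a policy with 457 assessed transactions, the summary might read
  // Map("ok" -> 200, "unknown" -> 50, ...), which the dashboard renders for drill-down.
}
```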
  • methods and systems disclosed herein may include various data, communications and reporting features, such as to enable real time reporting of compliance with data usage policies.
  • Embodiments of the methods and systems disclosed herein may include various repositories for storage of data, such as a justification database for storing justification or reasoning with respect to a particular instance of data usage under a particular policy.
  • Alerts may be enabled by various technologies, such as by a JVM-based PIBS app server, which may involve a multi-threaded architecture. Alerts may be by JSON or similar mechanisms that feed a dashboard or similar system. Various data visualizations may be enabled, such as policy heat maps and the like.
  • Many data sources may be monitored for policy-sensitive data, such as logs, electronic medical records (EMR) and other electronic records, databases of many types (on premises and in the cloud), and sources such as Splunk and JDBC.
  • Actions and alerts may take many forms, such as email, SMS, web services, and proprietary requests.
  • Methods and systems disclosed herein may be embodied in a wide range of products.
  • One embodiment may be a policy repository; a host company of the methods and systems disclosed herein could hire people to enter policy information into the repository.
  • Other embodiments may include a policy workbench for developing and modifying shareable and reusable policies, such as for use by a legal department of an enterprise.
  • the methods and systems disclosed herein by allowing data to be used properly, and allowing confirmation of proper use based on reasoning under various policies, enables benefits for a number of different types of users.
  • the system may create a common language for board members and executive teams, for privacy and big data teams, legal and marketing teams, and for compliance and product groups, and the like.
  • the Privacy or Legal or Compliance team of an enterprise may get real-time validation of data usage through real-time monitoring, as well as immediate alerts as to risky or improper usage, with auditable reasoning stored for after-the-fact review.
  • the system may respond to board priorities and external requests, such as from law enforcement, consumer protection regulators, auditors, media, consumer advocates, and the like.
  • the system may serve as a tool for checking responsibility if things go wrong.
  • the system may enable big data users to be more creative with work, such as by monitoring proposed usages of data sources in real-time, so that novel and creative big data uses can be tried, or modified, without undue concern about violation of policies.
  • the system and methods described herein may be utilized in a plurality of exemplary situations and contexts including, but not limited to the pharmaceutical industry.
  • Biotech and pharmaceutical corporations and entities typically mine genomics/trial data for new discoveries.
  • Various applicable laws require the change, removal, or obfuscation of personal data when consents are altered.
  • Other rules govern the use of clinical trial samples, the dissemination of prescription data, and many other uses of patient, medication, diagnosis, insurance, and other data types.
  • Current solutions for compliance with disparate rules are costly and require rewriting of software code, modifications of databases, rewriting or modification to processes, and the like, each time a rule changes or a new rule arises.
  • In accordance with the systems and methods described herein, compliance may be attested (e.g., "Company A was compliant in its purpose for using genome data for cancer research"), allowing for more efficient use of resources via automated review and preventing incorrect data usage.
  • Drug Retailers collect consumer data on prescriptions from independent pharmacies. Laws limit consumer health data use in a marketing context. Safeguards to ensure appropriate use and prevent exploitation of data are currently lacking. In accordance with the system and methods described herein, safeguards required by law and company policy may be implemented. Monitoring for timely responses to breaches may be implemented and compliance with event history may be demonstrated.
  • Another exemplary context involves resorts which build profiles on patron activities and product preferences.
  • Personal profiles may only be used for certain purposes (e.g. marketing).
  • the potential for misuse and negative impact arises (e.g., affecting patron credit, or releasing patron information only to certain parties ["what happens in Vegas should stay in Vegas"]).
  • analytic tools may be leveraged to ensure data is accessed only for legal uses. Data misuse may be discovered in a timely fashion, followed by an appropriate response. Marketing campaigns' data access logs are reviewed by the system and validated against behavioral marketing and other relevant guidelines.
  • Another exemplary context involves automatic license plate readers (ALPR) and other sources of sensitive data, where the need arises to ensure appropriate use in various technologies, including emerging technologies (e.g., the Internet of Things).
  • the system platform is utilized to monitor use of Sensitive Data. Specifically, an app pulls data from an App Platform's API, a tracking code is added, and, as the app uses App Platform's Sensitive Data, the app sends data usage information (what data was accessed, what was context of use, who/what was recipient of Sensitive Data) to the system.
  • the system may then aggregate usage data, assess proper use of data with respect to policies, and notify the App Platform in cases of misuse or notable proper use, providing case-by-case reports to the company to track downstream data use and preserve accountability conclusions (reasons why data usage was or was not proper).
  • the system may further constantly track transmission and access of data.
  • licensed data may be encapsulated into a system "wrapper", whereby a data licensee accesses licensed data through the system wrapper, and the system tracks data access against the licensee's signed terms of service.
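  • A minimal, non-limiting sketch of such a wrapper might log each access and check the declared purpose against the licensee's terms of service; the field and purpose names are hypothetical:

```scala
// Hypothetical sketch of a licensed-data wrapper: the licensee reaches the data
// only through the wrapper, which logs each access and checks the declared purpose.
case class AccessRecord(licensee: String, field: String, purpose: String, allowed: Boolean)

class LicensedDataWrapper(data: Map[String, String], permittedPurposes: Set[String]) {
  private var log = Vector.empty[AccessRecord]

  def read(licensee: String, field: String, purpose: String): Option[String] = {
    val allowed = permittedPurposes.contains(purpose)
    log = log :+ AccessRecord(licensee, field, purpose, allowed)
    if (allowed) data.get(field) else None
  }

  def accessLog: Vector[AccessRecord] = log
}

object WrapperDemo {
  val wrapper = new LicensedDataWrapper(Map("zip_code" -> "02139"), Set("analytics"))
  val denied = wrapper.read("licensee-1", "zip_code", "marketing") // None, and the attempt is logged
}
```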
  • a sensitive database provides a holistic picture of potential targets. Queries of potential persons of interest may result in illegal/inappropriate analysis (e.g. American citizens). The intelligence community needs the ability to attest to FISA courts/general public about proper classification of individuals into target categories. Rogue investigators may pursue searches on database of personal interest.
  • Another exemplary context involves financial companies.
  • an employer may hire a company, "FinCo,” to administer a 401k Plan wherein a contract defines ownership of Employee info in a 401k Plan Data as belonging to the employer.
  • Use limitations of employee data may, for example, prevent FinCo personal investing representatives from soliciting or determining an employee's 529 college savings plan eligibility (e.g., using data related to income, number of children, etc.).
  • the employee may then walk into a FinCo retail center for advice on 401k and 529 plans, at which point the company data containing personal information is transmitted to the investment agency.
  • the investment agency must adhere to company policy while pursuing marketing to clients.
  • compliance may be determined based, at least in part, on a context of use.
  • the system accomplishes this by comparing data access requests against the stated policies in the contract between FinCo and the employer and notifies the employer in cases of breach of contract.
  • IT policy defines the proper use between multiple systems. While security systems can detect anomalies (violations of projected behavior based on historical use), no solution compares security log data to defined policies.
  • policies can be easily adapted as proper use changes based on use of the system by authorized users, and/or different uses of data by same user, and/or different uses of data in different countries.
  • such systems and methods may be used by various healthcare users and providers, such as hospitals, doctors, nurses, healthcare regulators, research organizations, insurance companies, and federal and state health associations, whose handling of data may be subject to regulations such as the Health Insurance Portability and Accountability Act (HIPAA) and the Health Information Technology for Economic and Clinical Health (HITECH) Act.
  • HIPAA includes policies such as: patients are provided with a right to access their own records and the right to request corrections of errors; patients must have prior knowledge pertaining to how their data will be used; explicit consent is required from the involved individuals before Electronic Protected Health Information (ePHI) can be used for marketing; and individuals have the right to ask and expect health organizations to take reasonable steps to ensure that communications between the individual and the organization remain confidential.
  • policies related to privacy, consent, complaints, etc., may be encoded by the policy engine to provide a customized solution for the various users and providers in the healthcare industry.
  • In the financial industry, applicable policies and standards may include the Payment Card Industry Data Security Standards (PCI-DSS), the Fair Credit Reporting Act (FCRA), the Gramm-Leach-Bliley Act (GLBA), and BASEL II.
  • For example, PCI-DSS requirements include protection of the Primary Account Number (PAN) and of sensitive authentication data (full magnetic stripe data, CVV2, PIN), strong cryptography to render cardholder data unreadable, and servers and payment card storage systems located inside secure, access-controlled rooms. It will be apparent that the methods and systems disclosed herein could be utilized to design a customized solution for various users of financial data depending on the application of such data.
  • In the education industry, an applicable standard is the Family Educational Rights and Privacy Act (FERPA).
  • the standard includes rule sets like: parents or eligible students having the right to inspect and review the student's education records maintained by the school; and the right to request that a school correct records which they believe to be inaccurate or misleading.
  • such policies could be encoded and built into a specialized solution for the education industry using the methods and systems disclosed herein.
  • Methods and systems disclosed herein include various uses and applications, exemplary embodiments of which are provided below. Such uses serve various types of users, roles or "personas" within organizations. Among interesting applications are ones for personas where a part of the organization wants to hear a favorable answer about use of data, rather than having data locked down for fear of non-compliance with a regulation. Examples include chief marketing officers, big data analysts, data scientists, and more generally members of marketing groups. For example, a large body of pharmacy data may contain information that is quite sensitive at the individual level, such as prescription records, but that data, in the aggregate might be very useful, such as in organizing marketing campaigns targeted to people who walk into stores.
  • a campaign might suggest, for example, giving over the counter antihistamine coupons to people in Boston during a particular season, due to recognition from aggregate data that allergies are high during that time. Such a use could be cleared by the policy engine as it would not violate a policy against using individual records.
  • a compliance, legal or privacy professional such as working for a Chief Privacy Officer may be part of a big data team. More and more those people are forced to say "no," and there is no common language.
  • the methods and systems provided herein enable a common language and let users put questions into the engine in a way that can be understood by various professionals applying various policies to various kinds of data.
  • an entity may be looking to hire individuals that require training and investment by the enterprise.
  • personality tests may prove a better indicator of future performance.
  • Policy analytics can run alongside these data analytics, to confirm that use of the data, such as personality testing data, is permitted.
  • for a big data analyst, when one runs a predictive model, one is out on a ledge. The methods and systems disclosed herein provide confidence to get out onto that ledge.
  • Examples of areas that use predictive models include predictive policing; correlation of data sets for marketing (e.g., between mail order shopping data and use of ER services for patient care, such as relating to medical insurance decisions about people); and many others. Methods and systems disclosed herein validate data usage and preserve the provenance of those decisions for future analysis.
  • Mutual fund companies have information about individual 401(k) plans, and their customers may want to know about saving for education.
  • the mutual fund company can't touch the 401(k) information (it belongs to the employer) until the customer asks about something like a 529 plan.
  • the methods and systems disclosed herein allow unlocking of the data when the appropriate circumstances permit, on a case-by-case basis, under automatic action of the policy engine 102.
  • Other examples include answering whether traders are making improper trades (such as accessing improper data sets).
  • API data usage can be processed by the policy engine 102 in real-time to confirm that a particular use of data presented by or passed through an API is in compliance with the terms and conditions of the API and applicable policies and regulations.
  • federal law enforcement requests information about a suspect, a criminal, or a witness from a state agency.
  • the state fusion centers were designed to help share that information.
  • a policy analyst processes a request for information from a federal agent: what is it regarding (pick the name); choose a policy (Massachusetts Fusion Center policy); and she gets an analysis from the policy engine 102 indicating whether there is compliance. Most rules engines would end there (up or down as to compliance).
  • the reasoner may expand out a usage justification, such as providing a traditional "IRAC" (issue, rule, analysis, conclusion) reasoning, such as: Issue: to determine whether a usage complies with a rule; Rule: a rule or policy is determined that applies to a usage; Analysis: RDF triplets are used to pull the relevant rules or policies (analysis optionally spells out in plain language, such as there was a request for dissemination of information to a federal law enforcement agency) and may designate the transaction with a name/number; and Conclusion (e.g., "the data usage is valid").
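  • A minimal, non-limiting sketch of such an IRAC-structured justification might look as follows; the strings are illustrative of the expanded output rather than generated from actual RDF triplets:

```scala
// Hypothetical sketch of an IRAC-structured justification emitted for a transaction.
case class IracJustification(
  transactionId: String,
  issue: String,
  rule: String,
  analysis: String,
  conclusion: String
)

object IracRenderer {
  def render(j: IracJustification): String =
    s"[${j.transactionId}] Issue: ${j.issue}. Rule: ${j.rule}. " +
      s"Analysis: ${j.analysis}. Conclusion: ${j.conclusion}."

  val example = render(IracJustification(
    transactionId = "TX-0001",
    issue         = "Determine whether a usage complies with a rule",
    rule          = "Massachusetts Fusion Center policy on dissemination to federal law enforcement",
    analysis      = "There was a request for dissemination of information to a federal law enforcement agency",
    conclusion    = "The data usage is valid"
  ))
}
```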
  • the methods and systems described herein may be deployed in part or in whole through a machine that executes computer software, program codes, and/or instructions on a processor.
  • the processor may be part of a server, client, network infrastructure, mobile computing platform, stationary computing platform, or other computing platform.
  • a processor may be any kind of computational or processing device capable of executing program instructions, codes, binary instructions and the like.
  • the processor may be or include a signal processor, digital processor, embedded processor, microprocessor or any variant such as a co-processor (math co-processor, graphic co-processor, communication co-processor and the like) and the like that may directly or indirectly facilitate execution of program code or program instructions stored thereon.
  • the processor may enable execution of multiple programs, threads, and codes. The threads may be executed simultaneously to enhance the performance of the processor and to facilitate simultaneous operations of the application.
  • the processor may include memory that stores methods, codes, instructions and programs as described herein and elsewhere.
  • the processor may access a storage medium through an interface that may store methods, codes, and instructions as described herein and elsewhere.
  • the storage medium associated with the processor for storing methods, programs, codes, program instructions or other type of instructions capable of being executed by the computing or processing device may include but may not be limited to one or more of a CD- ROM, DVD, memory, hard disk, flash drive, RAM, ROM, cache and the like.
  • a processor may include one or more cores that may enhance speed and performance of a multiprocessor.
  • the processor may be a dual-core processor, quad-core processor, or other chip-level multiprocessor and the like that combines two or more independent cores on a single chip (called a die).
  • the methods and systems described herein may be deployed in part or in whole through a machine that executes computer software on a server, client, firewall, gateway, hub, router, or other such computer and/or networking hardware.
  • the software program may be associated with a server that may include a file server, print server, domain server, Internet server, intranet server and other variants such as secondary server, host server, distributed server and the like.
  • the server may include one or more of memories, processors, computer readable media, storage media, ports (physical and virtual), communication devices, and interfaces capable of accessing other servers, clients, machines, and devices through a wired or a wireless medium, and the like.
  • the methods, programs or codes as described herein and elsewhere may be executed by the server.
  • other devices required for execution of methods as described in this application may be considered as a part of the infrastructure associated with the server.
  • the server may provide an interface to other devices including, without limitation, clients, other servers, printers, database servers, print servers, file servers, communication servers, distributed servers and the like. Additionally, this coupling and/or connection may facilitate remote execution of a program across the network. The networking of some or all of these devices may facilitate parallel processing of a program or method at one or more locations without deviating from the scope.
  • any of the devices attached to the server through an interface may include at least one storage medium capable of storing methods, programs, code and/or instructions.
  • a central repository may provide program instructions to be executed on different devices.
  • the remote repository may act as a storage medium for program code, instructions, and programs.
  • the software program may be associated with a client that may include a file client, print client, domain client, Internet client, intranet client and other variants such as secondary client, host client, distributed client and the like.
  • the client may include one or more of memories, processors, computer readable media, storage media, ports (physical and virtual), communication devices, and interfaces capable of accessing other clients, servers, machines, and devices through a wired or a wireless medium, and the like.
  • the methods, programs or codes as described herein and elsewhere may be executed by the client.
  • other devices required for execution of methods as described in this application may be considered as a part of the infrastructure associated with the client.
  • the client may provide an interface to other devices including, without limitation, servers, other clients, printers, database servers, print servers, file servers, communication servers, distributed servers and the like. Additionally, this coupling and/or connection may facilitate remote execution of a program across the network. The networking of some or all of these devices may facilitate parallel processing of a program or method at one or more locations without deviating from the scope.
  • any of the devices attached to the client through an interface may include at least one storage medium capable of storing methods, programs, applications, code and/or instructions.
  • a central repository may provide program instructions to be executed on different devices.
  • the remote repository may act as a storage medium for program code, instructions, and programs.
  • the methods and systems described herein may be deployed in part or in whole through network infrastructures.
  • the network infrastructure may include elements such as computing devices, servers, routers, hubs, firewalls, clients, personal computers, communication devices, routing devices and other active and passive devices, modules and/or components as known in the art.
  • the computing and/or non-computing device(s) associated with the network infrastructure may include, apart from other components, a storage medium such as flash memory, buffer, stack, RAM, ROM and the like.
  • the processes, methods, program codes, instructions described herein and elsewhere may be executed by one or more of the network infrastructural elements.
  • the methods, program codes, and instructions described herein and elsewhere may be implemented on a cellular network having multiple cells.
  • the cellular network may be either a frequency division multiple access (FDMA) network or a code division multiple access (CDMA) network.
  • the cellular network may include mobile devices, cell sites, base stations, repeaters, antennas, towers, and the like.
  • the cellular network may be a GSM, GPRS, 3G, EVDO, mesh, or other network type.
  • the methods, program codes, and instructions described herein and elsewhere may be implemented on or through mobile devices.
  • the mobile devices may include navigation devices, cell phones, mobile phones, mobile personal digital assistants, laptops, palmtops, netbooks, pagers, electronic book readers, music players and the like. These devices may include, apart from other components, a storage medium such as a flash memory, buffer, RAM, ROM and one or more computing devices.
  • the computing devices associated with mobile devices may be enabled to execute program codes, methods, and instructions stored thereon.
  • the mobile devices may be configured to execute instructions in collaboration with other devices.
  • the mobile devices may communicate with base stations interfaced with servers and configured to execute program codes.
  • the mobile devices may communicate on a peer-to-peer network, mesh network, or other communications network.
  • the program code may be stored on the storage medium associated with the server and executed by a computing device embedded within the server.
  • the base station may include a computing device and a storage medium.
  • the storage device may store program codes and instructions executed by the computing devices associated with the base station.
  • the computer software, program codes, and/or instructions may be stored and/or accessed on machine readable media that may include: computer components, devices, and recording media that retain digital data used for computing for some interval of time; semiconductor storage known as random access memory (RAM); mass storage typically for more permanent storage, such as optical discs, forms of magnetic storage like hard disks, tapes, drums, cards and other types; processor registers, cache memory, volatile memory, non-volatile memory; optical storage such as CD, DVD; removable media such as flash memory (e.g. USB sticks or keys), floppy disks, magnetic tape, paper tape, punch cards, standalone RAM disks, Zip drives, removable mass storage, off-line, and the like; other computer memory such as dynamic memory, static memory, read/write storage, mutable storage, read only, random access, sequential access, location addressable, file addressable, content addressable, network attached storage, storage area network, bar codes, magnetic ink, and the like.
  • the methods and systems described herein may transform physical and/or intangible items from one state to another.
  • the methods and systems described herein may also transform data representing physical and/or intangible items from one state to another.
  • the methods and/or processes described above, and steps thereof, may be realized in hardware, software or any combination of hardware and software suitable for a particular application.
  • the hardware may include a general purpose computer and/or dedicated computing device or specific computing device or particular aspect or component of a specific computing device.
  • the processes may be realized in one or more microprocessors, microcontrollers, embedded microcontrollers, programmable digital signal processors or other programmable device, along with internal and/or external memory.
  • the processes may also, or instead, be embodied in an application specific integrated circuit, a programmable gate array, programmable array logic, or any other device or combination of devices that may be configured to process electronic signals. It may further be appreciated that one or more of the processes may be realized as a computer executable code capable of being executed on a machine readable medium.
  • the computer executable code may be created using a structured programming language, an object oriented programming language, or any other high-level or low-level programming language that may be stored, compiled or interpreted to run on one of the devices described above.
  • each method described above and combinations thereof may be embodied in computer executable code that, when executing on one or more computing devices, performs the steps thereof.
  • the methods may be embodied in systems that perform the steps thereof, and may be distributed across devices in a number of ways, or all of the functionality may be integrated into a dedicated, standalone device or other hardware.
  • the means for performing the steps associated with the processes described above may include any of the hardware and/or software described above. All such permutations and combinations are intended to fall within the scope of the present disclosure.

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Theoretical Computer Science (AREA)
  • Finance (AREA)
  • Strategic Management (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Game Theory and Decision Science (AREA)
  • General Health & Medical Sciences (AREA)
  • General Business, Economics & Management (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • Marketing (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Storage Device Security (AREA)

Abstract

A method includes embodying one or more policies as an object and performing computation on data in flow against the one or more policies.

Description

SYSTEM AND METHOD FOR ENABLING TRACKING OF DATA USAGE
[0001] CROSS-REFERENCE TO RELATED APPLICATIONS
[0002] This application claims the benefit of U.S. provisional patent application Ser. No. 62/089,634 filed December 9, 2014, which is hereby incorporated by reference in its entirety.
[0003] BACKGROUND
[0004] Field of the Invention
[0005] The present application generally relates to a system and method for improved handling of data usage limitations. In particular, the present application relates to a system and methods for improving the handling and tracking of data usage and enabling improved compliance with data usage policies of various kinds.
Description of the Related Art
[0006] There is significant pain for enterprises of various types associated with how they use various kinds of data, such as personally identifiable information (PII), health care data, educational records, financial data, and other data. Data misuse presents major brand and organizational risks for enterprises, as consumers are increasingly sensitive to such misuse. There are numerous policies that have emerged at state, national and international levels, imposing many complex constraints on use of data, but solutions to deal with such constraints are very limited and generally ineffective. Meanwhile, there is increasing demand for data usage, such as for improved customer relationship management, better targeting of advertising, leveraging of big data, and the like. A growing tension exists between the potential for large scale analytics, or "big data," solutions and the constraints on proper data usage. A need and opportunity exists for methods and systems that provide improved handling of data usage limitations.
SUMMARY
[0007] Provided herein are methods and systems for dramatically improving the handling and tracking of data usage and enabling improved compliance with data usage policies of various kinds. Such methods and systems include a rules language, a real-time reasoning engine that monitors data usage, and a dashboard that, among other capabilities, reports on monitored data usage as it occurs throughout an enterprise. As applicable, the language may be described as a policy language and/or a declarative rules language. Such methods and systems further include a scaleable engine for processing data usage policies, monitoring data usage and storing information about data usage (along with reasoning about why usage was proper), including internal policies of organizations (so as to protect brand equity), regulatory policies, and the like.
[0008] Methods and systems disclosed herein include a policy language that captures policies of all types, such as organizational policies, external laws and regulations, rules, and the like. The methods and systems include and enable users to take advantage of pre-built, customizable modules that allow the creation and use of policies while working in plain English. The methods and systems allow enterprises to attest to (and prove) proper usage, such as by logging usages along with the reasoning that led to usage and to discover issues, such as usage that is risky or out of compliance.
[0009] The methods and systems disclosed herein include a policy engine that analyzes transactions captured in logs and assesses the transactions for compliance with policies. The policy engine implements the policies within the applications and data uses of an enterprise to track and record data usage.
[0010] The policy language and customizable modules constructed using the language allow a user to identify what data to watch in a convenient fashion, and the policy engine, when implemented in or applied to data usage of an enterprise, tracks and records data usage relative to those policies and performs reasoning, applying the policies to the data usage.
[0011] The methods and systems disclosed herein also include the ability to provide policy-related output into dashboards, alerts and the like, optionally in real time or near real time, such as within 10 seconds, three seconds, or one second of the time of a data usage or a proposed data usage.
[0012] In accordance with an exemplary and non-limiting embodiment, a system comprises a device adapted to execute a data transmitting application, a server adapted to receive transmitted data from the application and a reasoning engine adapted to receive metadata describing the transmitted data and to generate one or more conclusions indicative of a state of transmission of the transmitted data.
[0013] In accordance with an exemplary and non-limiting embodiment, a method comprises receiving metadata describing the transmission of data by a device to a server and applying one or more predetermined rules to generate one or more conclusions indicative of a state of transmission of the transmitted data.
[0014] "Policy," as referred to herein, should be understood to encompass a wide range of internal policies of an enterprise that can relate to storage or use of data, as well as regulatory policies, laws, and regulations of all kinds that regulate storage or use of data, including, without limitation, policies that apply to PII, health care data, financial data, educational data, and many other types.
[0015] "Data" should be understood to refer to all kinds of data, such as data about which there is user sensitivity (e.g., personal information) and/or data about which a policy exists.
[0016] "Real time" and "near real time" means events occurring substantially simultaneously with each other, such as within one minute, ten seconds, three seconds, or under a second, as context may indicate.
[0017] The overall methods and systems described herein provide significant enhancement of trust within and outside organizations, including trust by employees, executives, regulators, and consumers that sensitive data can be used while still complying with applicable regulations and policies. The methods and systems, components thereof, and the host entity or entities that enable them are referred to in some cases, according to context, as "TrustLayers," or "trust layers," and such references should be understood to refer to the same as such context indicates.
BRIEF DESCRIPTION OF THE FIGURES
[0018] Fig. 1 depicts an embodiment of an architecture and enabling components for a system for monitoring how data is used in real time, which includes connections to enterprise user directories and incorporation of said user directory changes into a policy engine according to an exemplary and non-limiting embodiment.
[0019] Fig. 2 depicts an architecture and enabling components for a system for monitoring how data is used in real time according to an exemplary and non-limiting embodiment.
[0020] Fig. 3 depicts a dashboard for indicating in real time the extent to which data usage is in compliance with policies according to an exemplary and non-limiting embodiment.
[0021] Fig. 4 depicts the preservation of original sources, such as allowing the ability to reference original scanned documents used to create a policy or to which a policy is applied according to an exemplary and non-limiting embodiment.
[0022] Fig. 5 depicts a rule writing interface for creating rules or policies in plain language that can be applied to data usage in real time according to an exemplary and non-limiting embodiment.
[0023] Fig. 6 depicts a policy prepared using a rule writing interface according to an exemplary and non-limiting embodiment.
[0024] Fig. 7 depicts a transaction log captured by the systems disclosed herein according to an exemplary and non-limiting embodiment.
[0025] Fig. 8 depicts capturing data for an after-the-fact review of application of a policy to a usage of data according to an exemplary and non-limiting embodiment.
[0026] Fig. 9 depicts an interface for real-time usage control of data in response to determining a risk of non-compliance with a policy according to an exemplary and non-limiting embodiment.
[0027] Fig. 10 depicts an embodiment of an architecture and related components for real time monitoring of usage of data in a distributed platform.
[0028] Fig. 11 depicts an architecture and related components for a policy engine according to an exemplary and non-limiting embodiment.
[0029] Fig. 12 depicts an architecture for real time monitoring of data usage in a distributed platform, including communication of data usage limitations among distributed elements according to an exemplary and non-limiting embodiment.
DETAILED DESCRIPTION
[0030] Fig. 1 illustrates architecture for the methods and systems disclosed herein, according to an exemplary and non-limiting embodiment. The architecture may include a policy engine 102 that may connect via various connectors 104 to enterprise data resources 106. The connectors 104 may for example include any type of programming module that may use or track usage of data, such as brokers, connectors, bridges, and other tools for extraction, transformation and loading (ETL) of data or auditing or logging thereof. The data resources of an enterprise may include applications that may use data, data logs, application programming interfaces that may exchange and use data, data exchanged throughout a social graph of an enterprise, and other such resources. The policy engine 102 may be aware of users of the enterprise, such as through an enterprise user directory 108, so that its operation may be user-aware in addition to being aware of policies and data. The policy engine 102 may ingest data from the enterprise and its users, as well as policies that may apply to usage of such data. In an example, the architecture may further include policy modules 110 so that policies may be created by the policy modules 110. The policy modules 110 may use a policy definition language and environment that may allow creation of plain-language, re-usable policies that may correspond to typical enterprise policies and external policies that apply to data usage. The policy engine 102 may apply the policies to data in real time as the data is being used, such as immediately prior to, during or after usage by the enterprise, as reported to it through the connectors 104. In particular, the policy engine 102 may analyze data usage versus the policies that may apply to the usage, track the reasoning that may lead to a conclusion that a particular data usage is, or is not, compliant with a policy, and store information about the data usage event, the policy, and the reasoning. The policy engine 102 may have one or more services layers that may interact with enterprise data sources 106, the enterprise user directory 108, the policy module(s) 110, and one or more systems for providing alerts (e.g., about non-compliance with policies) and for providing reporting about data usage and compliance 114 (e.g., to a work flow system and/or to one or more Personal Information Balance Sheet (PIBS) dashboards 118 that may allow tracking, reporting, and analytics on policy usage).
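For illustration only, the following minimal Scala sketch shows one way the components just described could be organized in code. The names used here (Connector, Policy, PolicyEngine, UsageEvent, and so on) are hypothetical and are not taken from the disclosed implementation; this is a sketch of the described architecture, not the product itself.

```scala
// Hypothetical sketch of the Fig. 1 architecture: connectors feed usage
// events to a policy engine, which evaluates applicable policies and emits
// results that a dashboard or alerting system could consume.
case class UsageEvent(user: String, dataSet: String, purpose: String, timestamp: Long)

case class PolicyResult(policyName: String, compliant: Boolean, reasoning: List[String])

// A connector abstracts any enterprise data resource (logs, APIs, ETL feeds).
trait Connector {
  def nextEvents(): Seq[UsageEvent]
}

// A policy is a named, reusable rule applied to a usage event.
trait Policy {
  def name: String
  def evaluate(event: UsageEvent): PolicyResult
}

class PolicyEngine(connectors: Seq[Connector], policies: Seq[Policy]) {
  // Apply every policy to every incoming usage event and return the results,
  // which could then be stored, reported to a dashboard, or turned into alerts.
  def run(): Seq[(UsageEvent, PolicyResult)] =
    for {
      connector <- connectors
      event     <- connector.nextEvents()
      policy    <- policies
    } yield (event, policy.evaluate(event))
}
```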
[0031] Fig. 2 shows an embodiment of technology architecture for executing the methods and systems depicted in Fig. 1. The architecture may include a multithreaded, JVM-based (Scala) trust engine 202 that may optionally apply forward chaining reasoning to apply a policy to a proposed or actual data usage. Various data sources 204 within the architecture may be accessed via the connectors 104, and various actions and reporting may occur through the connectors 104. The trust engine 202 may take data usage requests and queries, such as via a RESTful API 208, apply policies, and send output, such as showing reasoning and justification for data usage, to a database 210, which may store justifications for later access. The database 210 may include a justification database 212 and an application database 214. The architecture may include a PIBS app server 218 that may allow tracking, reporting, and analytics on policy usage through the one or more PIBS dashboards 118. The tracked output and reporting may be communicated through actions and alerts 220 via email, SMS, web service, new trust layers, or any other mode or request.
[0032] The methods and systems disclosed herein enable embodying policy as a first class object against which computation can be performed on data in flow for any type of rule set. The rule sets are customized based on the application of the policy engine. Such sets are optionally defined as a separate layer so that all of the accountability and data usage tracking-related functions may be performed without affecting the data layer. The systems disclosed herein may thus be seen as a tool for applying policy rather than as a representation or warranty of policy.
[0033] In accordance with exemplary and non-limiting embodiments, the systems disclosed herein are capable of applying various privacy and compliance rules to data in use, as compared to data at rest. In one embodiment, such rules are applied proactively, as the use is about to be activated.
[0034] The methods and systems disclosed herein provide a set of policies that define a signature for proper and/or improper data usage and that indicate how data should behave when in use. These policies are based on the application of the underlying data. For example, in the case of healthcare patient data, the policy sets may be defined based on the Health Insurance Portability and Accountability Act (HIPAA). The various compliance rules required under HIPAA may be incorporated in the policy sets, including but not limited to rules pertaining to access of data by patients and third parties, consent requirements for the use of data for marketing, rules pertaining to the filing of formal complaints for privacy violations, and so on.
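As a hedged illustration of what one such rule might look like as a first-class object, the sketch below encodes a simplified, HIPAA-style consent requirement (marketing use of patient data requires documented consent). The class and field names are assumptions for illustration, not legal text and not the disclosed policy language.

```scala
// Hypothetical, simplified encoding of one HIPAA-style rule from a policy
// set: using patient data for marketing requires documented patient consent.
case class PatientDataUse(patientId: String, purpose: String, hasMarketingConsent: Boolean)

object MarketingConsentPolicy {
  // Returns a compliance decision together with a plain-language reason.
  def evaluate(use: PatientDataUse): (Boolean, String) =
    if (use.purpose != "marketing")
      (true, s"Use for '${use.purpose}' is outside the marketing-consent rule.")
    else if (use.hasMarketingConsent)
      (true, s"Marketing use of patient ${use.patientId} is permitted: consent is on file.")
    else
      (false, s"Marketing use of patient ${use.patientId} is not permitted: no consent on file.")
}

// Example: an attempted marketing use without consent would be flagged.
// MarketingConsentPolicy.evaluate(PatientDataUse("p-123", "marketing", hasMarketingConsent = false))
```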
[0035] The system disclosed herein thus abstracts the policy for data usage as a 'policy layer' separate from the application layer. This ability to abstract out the policy layer enables an enterprise to change its policy profile without digging into application code or data.
[0036] The methods and systems disclosed herein enable real-time compliance reporting, thereby attesting that the data in motion has been validated against an agreed-upon rule set. Such a report may be presented to the regulator through a dashboard depicting the various rule sets and a compliance report against each set.
[0037] According to some exemplary embodiments, a transparent, enterprise integration architecture seamlessly layers on top of major enterprise application frameworks, accessing data through common log file formats without disrupting the data usage. One need not touch the sensitive data that is being queried; one requires access only to transaction metadata, data schema and/or ontology. As needed, policy analytics may also be fed back into established application workflow as real-time user or supervisor alerts using standard APIs. Whether deployed for after-the-fact audit or real-time user alerting, there is no processing burden placed on existing data analytics tools.
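The sketch below illustrates, under stated assumptions, the log-based integration idea: only transaction metadata is read, never the sensitive payload. The pipe-delimited log layout and field names are assumptions for illustration, not a format defined by the disclosure.

```scala
// Hypothetical sketch: the system reads transaction metadata from a common
// log format rather than touching the sensitive data itself. The log layout
// shown here (pipe-delimited fields) is an assumption for illustration.
case class TransactionMetadata(timestamp: String, userId: String, table: String, purpose: String)

object LogConnector {
  // Example line: "2015-12-09T10:15:00|analyst42|patient_records|marketing"
  def parse(line: String): Option[TransactionMetadata] =
    line.split('|') match {
      case Array(ts, user, table, purpose) => Some(TransactionMetadata(ts, user, table, purpose))
      case _                               => None // malformed lines are skipped, not guessed at
    }
}
```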
[0038] Referring to Fig. 3, a dashboard 302 may provide information about the extent to which an enterprise is complying with various data usage policies in real time, providing a "Good Housekeeping"-like seal of approval for proper usage. The dashboard allows confirmation of proper usage to enable growth of data usage without running afoul of policies.
[0039] In one exemplary embodiment, a Personal Information Balance Sheet (PIBS) feature may balance public accountability with protection for sensitive and classified data. The balance sheet function produces provably-correct, consistent and complete summaries of data usage for review by regulators, overseers or the general public without needing to disclose classified or confidential details. This feature may track changes in policies and data over time; thus, the analysis can be traced back to both the original data and the policies in effect at the time the analysis was made. This is a significant advantage over traditional access log solutions, which may be able to reveal who accessed data but cannot track whether the data was used according to the rules.
[0040] Various features may be enabled by the methods and systems disclosed herein. For example, in embodiments, the methods and systems disclosed herein enable the capacity for users to create custom "what if?" scenarios by adjusting time, policies and contextual use for data. In embodiments, the methods and systems may allow scalability, for example by monitoring data use across billions of transactions, allowing real-time response at scale and audit trails for non-compliant use of data. For example, auditing the usage of clinical trial data, based on patient consent, may be efficiently scaled by minimizing the storage space required for output.
[0041] In embodiments, the methods and systems disclosed herein may track exceptions, as well as overrides to policy exceptions. The methods and systems may prevent audit trail tampering. The methods and systems may employ industry-grade encryption of output data. The methods and systems may control access in on-premises and single or multi-tenant cloud installations.
[0042] Fig. 4 depicts the preservation of original sources, such as allowing the ability to reference original scanned documents used to create a policy or to which a policy is applied.
[0043] Fig. 5 depicts a rule writing interface for creating rules or policies in plain language that can be applied to data usage in real time. With reference to Fig. 5, there is illustrated an exemplary and non-limiting embodiment of a user interface 500. Interface 500 enables a rule or policy to be captured in plain language form, allowing the reasoning engine to ingest and apply the rule to a particular data usage instance.
[0044] As illustrated, a plain English rule name is provided that is descriptive of the operation of the rule. In the illustrated example, the rule name is "Experiment should not use nonprofit sample for for-profit research." As pictured, data constructs comprising various variables, constants and literals, as well as entity data structures, are available for selection by a user. In the present example, entities include a sample entity and a researcher entity.
[0045] Rule filter 502 may be populated with data constructs and utilized to define rule parameters using Boolean logic as illustrated. Once defined, the rule is summarized in rule preview 504 and may be saved for later application and use.
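For illustration, the rule named above could be expressed as Boolean logic over the sample and researcher entities, as in the sketch below. The entity fields shown are assumptions; in the described system the available fields would come from the data dictionary drawn through the connectors, and the rule would be authored in the plain-language interface rather than in code.

```scala
// Hypothetical encoding of the rule shown in the rule-writing interface:
// "Experiment should not use nonprofit sample for for-profit research."
case class Sample(id: String, fromNonprofitSource: Boolean)
case class Researcher(id: String, isForProfit: Boolean)
case class Experiment(sample: Sample, researcher: Researcher)

object NonprofitSampleRule {
  // The experiment violates the rule when a nonprofit-sourced sample
  // is used by a for-profit researcher.
  def isViolation(e: Experiment): Boolean =
    e.sample.fromNonprofitSource && e.researcher.isForProfit

  def isCompliant(e: Experiment): Boolean = !isViolation(e)
}
```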
[0046] In embodiments, the system may facilitate plain-language capabilities, such as plain-English capabilities, and may involve shareable policy modules. The system may allow policy analysis over graph models. The system may allow context-specific policies, ontology integration, and an adaptable architecture.
[0047] Fig. 6 depicts an example of a policy prepared using a rule writing interface as shown in Fig. 5. In accordance with exemplary and non-limiting embodiments utilizing composable policy modules, rules may be combined from many parts of an organization or across enterprise boundaries, allowing real-time and retrospective assessment of policy compliance as data is shared around. Unified policy analytics provides customized "what-if" analysis of the impact of new rules before they are fully implemented. Scalable reasoning is provided for analysis of data use with respect to even the most complex of policies, scaling up to many millions of queries across thousands of rules. Unlike other reasoning approaches which require hard-coding rules into specific software platforms, the system allows rules to change over time without costly software re-design.
[0048] In accordance with exemplary and non-limiting embodiments, the methods and systems disclosed herein provide a marketplace in shareable, pre-built policies. This enables reuse of policy sets defined for one given application for other applications after relevant modifications. The marketplace model enables companies and policy writers to publish and share policy sets on a platform to other potential users. This buying, selling and exchange of policies through a marketplace model with supporting data structures, interfaces, and transaction environment will allow for policy reuse and support a range of business models around the marketplace.
[0049] Fig. 7 depicts an embodiment of a transaction log captured by the systems disclosed herein. In accordance with some exemplary embodiments, audit information is tightly coupled with a human readable explanation of the policy analysis. Even the most non-technical of users can understand automatically generated plain-English explanations of the system's conclusions. When the expert user needs to drill in more deeply, the explanations are tied to a provably correct execution trace, revealing the full decision making process.
[0050] Fig. 8 depicts an embodiment of capturing data for an after-the-fact review of application of a policy to a usage of data. Certain capabilities of the methods and systems disclosed herein may be understood by reference to an exemplary embodiment involving policies involved in state law enforcement, such as relating to usage of sensitive data in connection with predictive policing. In embodiments, the methods and systems disclosed herein may increase accountability of predictive policing efforts, such as in the wake of class action lawsuits that question law enforcement usage of data, such as alleging improper profiling. For example, the system may improve uncertain gang member classification and manage limited review of arrest records by applying policy-based reasoning to possible usage of arrest record data. In embodiments, the system may provide automated third party review and classification of the usage of record data. This may allow state police to settle lawsuits, increase efficiency and objectivity, and comply with state law.
[0051] As depicted in Fig. 8, exemplary real world policies may include a proposed reasoning approach, embodied in a policy, for documenting whether a person is a gang member. A criminal street gang member may be a person who participates in or acts in concert with a criminal street gang as defined in a law, and the person's participation is more than nominal, passive, inactive, or purely technical. In an example, a person may be documented as a gang member if, within the preceding five years: 1. The person admits his gang membership; 2. Whether in custody or not, two of the following criteria are met: a. The person has been arrested in the commission of a crime where the criminal associates are documented gang members; b. The person has been identified as a gang member through the use of a reliable confidential informant, parent or guardian of the suspect, or other documented gang members; c. The person has known and identifiable gang tattoos; or d. The person wears clothing that can be identified as gang specific, either in the clothing itself, or the manner in which the clothing is worn. The reasoning chain and the data to which it has been applied may be preserved, such as to enable legal review of the determination whether a particular individual is (or is suspected to be) a gang member.
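A hedged sketch of this documentation logic appears below. It assumes the two numbered conditions are alternatives (an admission, or at least two of the four listed criteria, within the preceding five years); the text above does not state the connective explicitly, and the field names are illustrative only. In the described system, this policy would be authored in the plain-language rule interface and its application preserved with a justification tree, not hard-coded.

```scala
// Hypothetical sketch of the documentation policy described above.
case class PersonRecord(
  admitsMembership: Boolean,
  arrestedWithDocumentedMembers: Boolean,
  identifiedByReliableSource: Boolean,
  hasKnownGangTattoos: Boolean,
  wearsGangSpecificClothing: Boolean,
  yearsSinceMostRecentEvidence: Int
)

object GangDocumentationPolicy {
  // Count how many of the four enumerated criteria (a) through (d) are met.
  def criteriaMet(p: PersonRecord): Int =
    Seq(
      p.arrestedWithDocumentedMembers,
      p.identifiedByReliableSource,
      p.hasKnownGangTattoos,
      p.wearsGangSpecificClothing
    ).count(identity)

  // Assumed reading: evidence within five years, and either an admission
  // or at least two of the four criteria.
  def mayBeDocumented(p: PersonRecord): Boolean =
    p.yearsSinceMostRecentEvidence <= 5 &&
      (p.admitsMembership || criteriaMet(p) >= 2)
}
```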
[0052] In accordance with exemplary and non-limiting embodiments, the methods and systems disclosed herein can embody real time accountability for terms of service for app developers in shareable/reusable policies and rules. For example, terms of service that are required for developers of software applications that are certified by certain vendors may be modeled. The system can then look at the information flow and help figure out whether one is complying with those terms or not.
[0053] Fig. 9 depicts an embodiment of an interface for real-time usage control of data in response to determining a risk of non-compliance with a policy.
[0054] Fig. 10 depicts an embodiment of an architecture and related components for real time monitoring of usage of data in a distributed platform with one or many endpoints (an endpoint optionally being a mobile device, a physical device with embedded data, or any other kind of device that may use data, the usage of which an enterprise or other user may wish to monitor). In accordance with exemplary and non-limiting embodiments, the methods and systems disclosed herein provide an architecture for monitoring real time data usage by endpoint (e.g., mobile) devices and applications, referred to herein in some cases as the "distributed" or "mobile" architecture. It should be understood that except where context indicates otherwise, the "mobile" architecture encompasses various types of distributed endpoints, including ones that may not typically be mobile devices. Today, policies, data usage, and the like generally all happen within the four walls of the enterprise. In embodiments, data is often used on endpoint devices, such as in connection with devices that are associated with large numbers of apps that come from app stores. For example, it can be useful to be able to say, such as to a regulator, a user, or other party, that while data is on a user's device, it never leaves the device, at least as long as its usage is managed by the methods and systems disclosed herein.
[0055] In some cases it may be desirable to have methods and systems that provide control over personal health data usage, so that desirable applications, such as for promoting the user's health, are permitted to use the data, but other possible uses, such as for targeting advertisements to the user, can be blocked. By way of example, a user may have a smart phone or similar device, such as an iPhone 6+®, which may have the capability to store health data, such as the user's blood pressure. On the device, one or more third party apps could access that blood pressure reading, such as using a health care SDK for that device. In embodiments, a computing platform company, mobile device company, mobile network operator or other app provider can permit or enable the third party app to operate on the device, but also use a data compliance tracking system, such as by using or having application developers use a set of tools, interfaces and capabilities referred to in the figures as the "TrustLayers SDK," so that there is reporting to Trust Layers servers whenever the health data stored by the device for health apps is used. For example, the app developer can arrange to have a pixel fired to the tracking party's servers, or the device itself can send a tracking signal when health care data is used, triggering application of the methods and systems disclosed throughout this disclosure.
[0056] What may happen is that the compliance tracking, or TrustLayers, code can run on the device. Either the company enabling the app (e.g., the app platform company, mobile device provider or mobile network operator) has it running natively on the device, or the app has it running, such as from the TrustLayers SDK. The SDK looks at data columns from the health care/health kit SDK. It captures the context of usage, such as based on the nature of the application that is using it. For example, some apps let you communicate with your doctor. Such an app might access blood pressure data only to improve medical services, so it would be within the terms of use of a health care policy. The compliance tracking system, or TrustLayers system, may take data from a field from the app that indicates the nature or context of the use (e.g., in a self-reporting mechanism) to confirm that the app is compliant.
[0057] Referring to Fig. 10, there is illustrated an architecture and related components for real time monitoring of usage of data in a distributed platform according to an exemplary and non-limiting embodiment. As illustrated, a user operates or otherwise interacts with a device 1104 that runs an App 1102. During normal operation, as illustrated, the app, such as a health tracking or health kit app 1102, transmits health related data to a first party server 1108. In the example illustrated, such data comprises a user id, a date, blood pressure data and a confirmation of user consent for the use of the blood pressure data. Metadata describing the data transmitted to the first party server 1108 is transmitted by the App 1102 to the TrustLayers SDK 1110. The TrustLayers SDK 1110 proceeds, in real or near real-time, to generate data conclusions 1114. Specifically, information indicative of the proper transmission of the blood pressure data is transmitted to a personal data access log 1112, a data-use dashboard 1118 and may be used to generate one or more data-use alerts 1120.
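As a hedged sketch of the device-side reporting just described, the code below shows an app sending only metadata about the transmission (no reading itself) to a compliance SDK. The method name and metadata fields are assumptions for illustration, not the actual TrustLayers SDK API.

```scala
// Hypothetical sketch of device-side reporting as in Fig. 10: the app sends
// the health data to the first-party server, and only metadata about that
// transmission to the compliance SDK.
case class UsageMetadata(
  userId: String,
  date: String,
  dataType: String,      // e.g. "blood_pressure" -- no actual reading is included
  userConsented: Boolean,
  usageContext: String   // e.g. "communicate_with_doctor", self-reported by the app
)

object TrustLayersSdkSketch {
  // Produce a plain-language conclusion that could be logged or sent upstream.
  def reportUsage(meta: UsageMetadata): String =
    if (meta.userConsented)
      s"Usage of ${meta.dataType} for ${meta.usageContext} recorded as compliant."
    else
      s"Usage of ${meta.dataType} flagged: no user consent recorded."
}

// Example call made alongside the app's normal transmission to its own server:
// TrustLayersSdkSketch.reportUsage(
//   UsageMetadata("user-1", "2015-12-09", "blood_pressure", userConsented = true, usageContext = "communicate_with_doctor"))
```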
[0058] With continued reference to Fig. 10, the only thing leaving the device 1104 is the data, but along with any sending of data, a data conclusion 1114 is captured and reported indicating that the data is being used properly (or is not being used properly in other cases). Data in the figure is going to the health tracker's own servers 1108 (the party taking the blood pressure reading). TrustLayers 1110 is the third party that allows the health tracking, or "Health Tracker," app 1102 to confirm use of data in compliance with policies. TrustLayers 1110 says access was compliant with the health kit terms of use, recording compliance on servers and notifying if someone is using data in a non-compliant fashion. The TrustLayers system preserves an audit trail there, so if the enterprise wants to do audits, it can request the audit trail and match it up. For example, if there is evidence that an enterprise made 1,000,000 requests and only logged about 50,000 requests in TrustLayers 1110 (a big discrepancy), there is a suggestion that some usage may be improper. At another level of granularity, particular usage can be tracked, and justifications logged, as in other embodiments described throughout this disclosure.
[0059] In embodiments, the reasoning occurs on the endpoint (e.g., mobile) device itself. To generalize this, there is a first party operator of the app 1102 whose data is being used, downstream, in other apps. TrustLayers 1110 monitors the usage on the device, while still preserving all security, and records that usage on the TrustLayers servers.
[0060] First party health data are often used downstream. For example, if Pfizer wants to run an ad campaign on PC browsers and the like, it can trace the usage all the way back and show that sensitive data was properly used in those situations.
[0061] In accordance with exemplary and non-limiting embodiments, the methods and systems disclosed herein provide a reasoner module of the policy engine that operates on data, which may be linked data. The ability to operate within a linked data architecture enables policy extensibility across any arbitrary set of applications and/or data sources. It further provides for modularity and composability of policies. The abstraction of the connectors provides a way to layer the reasoning engine on top of any arbitrary set of data sources and is enabled, in part, because of the linked nature of the data. Using linked data enables the provision of all of the data in a common form into the trust engine.
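The sketch below illustrates the linked-data idea in the simplest possible terms: facts about a transaction expressed as subject-predicate-object triples, so that data from any connector reaches the trust engine in one common form. The predicate names here are assumptions, not a published ontology.

```scala
// Hypothetical linked-data sketch: transaction facts as triples, with a
// policy condition phrased as a query over those triples.
object LinkedDataSketch {
  case class Triple(subject: String, predicate: String, obj: String)

  val transactionFacts: Seq[Triple] = Seq(
    Triple("txn:1001", "hasUser", "user:analyst42"),
    Triple("txn:1001", "usesDataSet", "data:patientRecords"),
    Triple("txn:1001", "hasPurpose", "purpose:marketing"),
    Triple("user:analyst42", "memberOf", "group:researchTeam")
  )

  // A rule can look up the purpose of a transaction without knowing which
  // connector or source format the fact originally came from.
  def purposeOf(txn: String, facts: Seq[Triple]): Option[String] =
    facts.collectFirst { case Triple(`txn`, "hasPurpose", p) => p }
}
```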
[0062] In accordance with other exemplary and non-limiting embodiments, the methods and systems disclosed herein may employ a reasoning technology inside a trust engine that makes use of a non-linked data model, such as a relational data model. Such a model may be of particular use at very high volume transaction rates. In other embodiments, unstructured data, such as identifiable information in a .PDF format document, could be monitored, such as by tagging identifiable sensitive information within the .PDF document for monitoring by the reasoning module.
[0063] In accordance with other exemplary and non-limiting embodiments, the methods and systems disclosed herein may employ a reasoning technology inside a trust engine that makes use of a hybrid data model, such as including any combination of linked data, relational data, and unstructured data.
[0064] In accordance with exemplary and non-limiting embodiments, the methods and systems disclosed herein provide a forward chaining algorithm approach with regard to the rules engine when operating in accordance with a request to analyze. In some embodiments, the trust engine is not a pure rules engine that merely computes "yes" or "no." The system is enabled to handle conflicts as well. In some embodiments, forward chaining with goal direction operates to derive the justifications. In other embodiments, backward chaining may be employed. Using forward chaining versus backward chaining depends, at least in part, on the need for performance optimization. In other embodiments, both forward chaining and backward chaining may be used, in a hybrid form of chaining algorithm.
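The following minimal sketch illustrates the general forward-chaining technique named above: rules fire on known facts, new facts accumulate, and each derived fact keeps the rule and premises that justified it. The representation is an assumption for illustration and does not depict the trust engine itself.

```scala
// Minimal, hypothetical forward-chaining sketch with justification capture.
object ForwardChainingSketch {
  case class Fact(statement: String)
  case class Justification(rule: String, from: Set[Fact])
  case class Rule(name: String, premises: Set[Fact], conclusion: Fact)

  // Repeatedly apply rules whose premises are satisfied until no new facts
  // can be derived; record a justification for every derived fact.
  def run(rules: Seq[Rule], initial: Set[Fact]): Map[Fact, Justification] = {
    var facts = initial
    var justifications = Map.empty[Fact, Justification]
    var changed = true
    while (changed) {
      changed = false
      for (r <- rules if r.premises.subsetOf(facts) && !facts.contains(r.conclusion)) {
        facts += r.conclusion
        justifications += r.conclusion -> Justification(r.name, r.premises)
        changed = true
      }
    }
    justifications
  }
}
```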
[0065] In accordance with exemplary and non-limiting embodiments, the methods and systems disclosed herein operate to select a type of chaining for optimization of a rules engine that applies policy to data such as "in use" or "in flight" data. This optimization may happen on a per-rule or per-usage basis, such that the type of chaining is optimized to the type of rule, the type of data, the type of usage, or the like.
[0066] In accordance with various exemplary and non-limiting embodiments, generation of a justification tree is important. For example, when the reasoner gets a request to analyze transactions against some policies, the reasoner will give a true or false answer but may also preserve the whole proof tree that led to the answer. As used herein, "justification output" refers not merely to the outputted "Yes" or "No," or to compliance or non-compliance, but, rather, to how one arrives at a decision. In real time, justification output (i) optionally enables an explanation of the results in human readable form and (ii) from an audit perspective, allows one to go back and ask why that whole decision was made in the first place (which includes showing conditional reasoning based on alternative beliefs). For example, "If you believe X, this is all true. If you believe Y, then another thing is all true."
[0067] By way of example, a policy might require that the use of an IP address for targeting ads is permissible if the IP address is not "personally identifiable," using the standard that the IP address cannot be linked to a particular person. A second policy might state that the use of an IP address for targeting ads is permissible if you believe an IP address is not personally identifiable, using the standard that the IP address cannot be linked to a group of people smaller than 10. Under these rules, a conflict potentially arises, because a data usage that can be linked to any one of five people, but not a specific individual, would not be "personally identifiable" under the first policy, but it would be "personally identifiable" under the second policy. Normally, one has to resolve such a conflict in advance, or an irresolvable contradiction might arise. With the present system, one may preserve and perform reasoning in the alternative, i.e., represent both possibilities: that the IP address is or is not personally identifiable, depending on the standard used for determining what is personally identifiable. In another example, relating to use of samples collected in clinical medical trials, one rule may indicate that after a period of twenty years, data collected from samples may be used for any purpose. A second rule may indicate that data collected from samples may not be used if the patient can be identified by name. A logical conflict potentially exists after twenty years, because one rule affirmatively permits use of the sample data in a blanket fashion, while the other rule prohibits use in certain cases. The reasoner of the policy engine may identify the conflict (such as by recognizing the conflicting results "permitted" and "not permitted" when a particular usage is proposed), resolve the conflict (such as by recognizing precedence of one rule over the other or by allowing an individual to actively determine a resolution) and record the outcome, including preserving the justification tree that led to the outcome. Resolution of a conflict might take various forms, such as having a certain outcome (e.g., "not permitted") always trump another type of outcome, by recognizing hierarchies of rules (e.g., federal rules may preempt state rules), by escalating conflicts to decision makers, or the like.
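A hedged sketch of the IP-address example follows: both standards of "personally identifiable" are evaluated and both conclusions are preserved, each with its own justification, instead of forcing a single answer in advance. The thresholds mirror the example above; the class names are illustrative only.

```scala
// Hypothetical sketch of reasoning in the alternative for the IP-address example.
object AlternativeReasoningSketch {
  case class IpUsage(smallestLinkableGroupSize: Int) // 1 means a specific person

  case class Conclusion(standard: String, personallyIdentifiable: Boolean, justification: String)

  def evaluate(u: IpUsage): Seq[Conclusion] = Seq(
    Conclusion(
      standard = "linkable to a particular person",
      personallyIdentifiable = u.smallestLinkableGroupSize == 1,
      justification = s"Smallest linkable group has ${u.smallestLinkableGroupSize} member(s)."
    ),
    Conclusion(
      standard = "linkable to a group smaller than 10",
      personallyIdentifiable = u.smallestLinkableGroupSize < 10,
      justification = s"Smallest linkable group has ${u.smallestLinkableGroupSize} member(s)."
    )
  )

  // A conflict exists when the alternative standards disagree; the engine
  // could record it, escalate it, or resolve it by rule precedence.
  def hasConflict(conclusions: Seq[Conclusion]): Boolean =
    conclusions.map(_.personallyIdentifiable).distinct.size > 1

  // For the five-person example, evaluate(IpUsage(5)) yields "not identifiable"
  // under the first standard and "identifiable" under the second.
}
```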
[0068] In yet another example, a justification tree may preserve forms of reasoning, such as inferences based on association. For example, if one thinks that John is a gang member and one thinks that Dan is a gang member, one may reason that Bob, who has an association with each of them, is a gang member by virtue of the association with at least two believed gang members. The reasoning may be preserved in the justification tree to allow one to revisit how the conclusion was reached.
[0069] In accordance with exemplary and non-limiting embodiments, the methods and systems disclosed herein preserve the ability to show reasoning in the alternative in a rules engine for applying policies to uses of data, particularly data that is in use or in flow.
[0070] In accordance with exemplary and non-limiting embodiments, the methods and systems disclosed herein provide a human readable presentation of the reasoning process in a data privacy compliance product. In other embodiments, there is presented a human readable presentation automatically and in real time as data is being used or being proposed to be used. In some embodiments, log data may be utilized. For example, one may take three months of logs and visualize the data usage. In the enterprise today, most everything is being logged. Analysis therefore doesn't have to be ad hoc. One may develop policies and understand how data is used relative to those policies. Data from a wide range of logs may be analyzed by the methods and systems disclosed herein, including firewall data logs, logs of patient data, logs of consumer data, and many others.
[0071] In accordance with exemplary and non-limiting embodiments, the methods and systems disclosed herein provide a user interface for a trust engine applicable to data in flow. Such an interface may be used to author policies through a graphical representation. Rule writing requires making logical assertions over data structures. In some embodiments, the rule-writing interface is populated with data structures that are derived from connectors. In some embodiments, there is provided a general rule writing engine that can customize itself to an enterprise's unique data structure. For example, a rule says you can't use personally identifiable information to geo target people. Or you can't use location information to make employment decisions. For that rule to be expressed correctly, it needs to be written with reference to the schema of that organization's systems. The rule interface draws the data structure in through the connectors. The system may learn about the data structure in the enterprise, and when someone sits down to write rules, they have a menu of schema elements (a data dictionary) that they can use to compose the rules with.
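For illustration only, the sketch below shows one way a rule-writing interface could expose a data dictionary drawn from the enterprise schema through connectors. The schema elements listed are assumptions; in practice they would be discovered, not hard-coded.

```scala
// Hypothetical sketch: connectors report the enterprise's schema, and the
// rule-writing interface exposes it as a data dictionary for composing rules.
object DataDictionarySketch {
  case class SchemaElement(entity: String, field: String, fieldType: String)

  // In practice this would be discovered through the connectors.
  val dictionary: Seq[SchemaElement] = Seq(
    SchemaElement("Customer", "ipAddress", "string"),
    SchemaElement("Customer", "location", "geo"),
    SchemaElement("Employee", "hiringDecision", "enum")
  )

  // The menu of schema elements the rule author sees, grouped by entity.
  def menu: Map[String, Seq[String]] =
    dictionary.groupBy(_.entity).map { case (e, elems) => e -> elems.map(_.field) }
}
```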
[0072] In accordance with exemplary and non-limiting embodiments, the methods and systems disclosed herein provide data compliance alerts based on the presence of justifications for data usage in a data compliance offering. Because the system preserves justification threads, one may create various alerts 1120. For example, one may alert based, at least in part, on the fact that one is seeing particular justifications. For example, an analysis of a large number of transactions may be based, at least in part, on knowledge one gets from the rules that say "watch out" because there is risk in how these particular rules are being implemented. Other embodiments may incorporate probabilistic (Bayesian or frequentist) presentation of justifications and alerts 1120.
[0073] In accordance with exemplary and non-limiting embodiments, the methods and systems disclosed herein provide grouping/tagging of rules into categories of higher and lower risk in a data usage compliance offering. There may be higher and lower risk rules. For example, out of 20 rules, one might find that 3 are "bet the company" rules and the other 17 are less important. Five of the rules may have such an impact on business performance that one wants to watch them closely.
[0074] In accordance with exemplary and non-limiting embodiments, the methods and systems disclosed herein provide user-written functions. Users may write a function and implement it into the engine. For example, a user may write a piece of Java code to send a piece of image data to an image recognition system to say "yes or no" as to whether something is a gang tattoo or not.
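The paragraph above mentions a user-written piece of Java code; the hedged sketch below shows the same idea of a pluggable user function, written in Scala (also JVM-based) for consistency with the other examples here. The interface name is hypothetical, and the image-recognition call is a stand-in: no real recognition service or API is implied.

```scala
// Hypothetical sketch of a user-written function plugged into the engine.
object UserFunctionSketch {
  // The engine would call a user-supplied predicate with a piece of data
  // (here, raw image bytes) and use the boolean answer inside a rule.
  trait UserFunction {
    def name: String
    def apply(imageBytes: Array[Byte]): Boolean
  }

  val isGangTattoo: UserFunction = new UserFunction {
    val name = "isGangTattoo"
    // A real implementation would send the bytes to an image-recognition
    // system and interpret its response; this placeholder always says no.
    def apply(imageBytes: Array[Byte]): Boolean = false
  }
}
```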
[0075] With reference to Fig. 11, there is illustrated an exemplary and non-limiting embodiment of a policy engine 1200 within which resides and operates a reasoner module 1202. As illustrated, a chaining algorithm 1204, forming a part of the reasoner module 1202, operates, at least in part, on linked data structures 1206 as described more fully herein. In embodiments, the policy engine 1200 can be run in a hosted environment, such as a web service or other cloud environment, or on premises of an enterprise or other entity.
[0076] Referring to Fig. 12, there is illustrated an architecture and related components for real time monitoring of usage of data in a distributed environment, according to an exemplary and non-limiting embodiment. As illustrated, a user operates the device 1104 on which operates the health tracking app 1102. During normal operation, as illustrated in connection with Fig. 10 above and in Fig. 12, the app, such as a health tracking or health kit app 1102, transmits health related data to the first party server 1108, as illustrated by the path 1210. In the example illustrated, such data comprises a user identifier, a date, blood pressure data, a transaction identifier, and an indication of user consent for the use of the blood pressure data. Metadata describing the data transmitted to the first party server 1108 is transmitted by the app 1102 to the TrustLayers SDK 1110. The TrustLayers SDK 1110 proceeds, in real or near real-time, to relay data usage limitations via the path 1202 to the first party server 1108. The TrustLayers SDK 1110 may also generate data conclusions 1114 and/or relay either intended data usage or data usage conclusions by the path 1212 to the TrustLayers Server 1114. Specifically, information indicative of the proper transmission of the blood pressure data is transmitted to the TrustLayers Server 1114, and in turn can be sent to a data-use dashboard 1118 and may be used to generate one or more data-use alerts 1120.
[0077] With continued reference to Fig. 10 and Fig. 12, the only thing leaving the device 1104 is the data, but along with any sending of data, a data conclusion 1114 is captured and reported, such as that the data is being used properly (or is not being used properly in other cases).
[0078] In accordance with exemplary and non-limiting embodiments, the methods and systems disclosed herein provide visual presentation of a proof tree.
[0079] In accordance with exemplary and non-limiting embodiments, the methods and systems disclosed herein provide a Real Time Dashboard. In some embodiments, a dashboard may summarize how the data is being used and how it is compliant or non-compliant. One may highlight individual events based on notification filters that are currently hard coded but that may be made more customizable.
[0080] In accordance with exemplary and non-limiting embodiments, the methods and systems disclosed herein provide a Personal Information Balance Sheet (PIBS). In much the same way as a company releases a balance sheet to Wall Street, people may share a balance sheet of some form (with regulators, business partners, or internally) with a "total performance" score, a trend, a number or frequency of transactions at risk, the time period over which items in a log file have been analyzed, etc. As an example, one may show the number of transactions or customer records at risk. Further, the PIBS may share logical analysis of various rules against data, such as transaction data. Among other important elements, the PIBS may preserve the provenance of usage decisions against the transaction data and summarize these decisions, optionally without needing to show the underlying transaction data.
[0081] In accordance with exemplary and non-limiting embodiments, the methods and systems disclosed herein provide a policy heat map that can show where a policy has been impacted most frequently, such that one may drill down regarding a particular policy that has been highlighted (e.g., 15% of the transactions related to a particular policy were at risk, or "of 457 transactions, 200 are OK, 50 are unknown, some are unresolved, etc."). For a compliance officer, this is a mechanism to get at which policies are problem areas. There is a distribution over time.
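The sketch below illustrates, under stated assumptions, the kind of aggregation behind such a heat map: per-policy transaction outcomes rolled up into counts and an at-risk percentage. The status labels are assumptions for illustration.

```scala
// Hypothetical sketch of the aggregation behind a policy heat map
// (e.g. "of 457 transactions, 200 are OK, 50 are unknown").
object HeatMapSketch {
  case class Outcome(policyName: String, status: String) // "ok", "at-risk", "unknown", ...

  // Roll per-policy outcomes up into counts by status.
  def summarize(outcomes: Seq[Outcome]): Map[String, Map[String, Int]] =
    outcomes
      .groupBy(_.policyName)
      .map { case (policy, os) =>
        policy -> os.groupBy(_.status).map { case (s, group) => s -> group.size }
      }

  // Percentage of a policy's transactions that were at risk.
  def atRiskPercent(outcomes: Seq[Outcome], policy: String): Double = {
    val relevant = outcomes.filter(_.policyName == policy)
    if (relevant.isEmpty) 0.0
    else 100.0 * relevant.count(_.status == "at-risk") / relevant.size
  }
}
```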
[0082] In various embodiments, methods and systems disclosed herein may include various data, communications and reporting features, such as to enable real time reporting of compliance with data usage policies.
[0083] Embodiments of the methods and systems disclosed herein may include various repositories for storage of data, such as a justification database for storing justification or reasoning with respect to a particular instance of data usage under a particular policy.
[0084] Alerts may be enabled by various technologies, such as by a JVM-based PIBS app server, which may involve a multi-threaded architecture. Alerts may be delivered as JSON or via similar mechanisms that feed a dashboard or similar system. Various data visualizations may be enabled, such as policy heat maps and the like.
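As a non-limiting illustration, an alert fed to a dashboard might be serialized as JSON along the following lines; the field names shown are assumptions, not a defined wire format:

    # Illustrative only; the alert schema shown is an assumption, not a defined wire format.
    import json

    alert = {
        "alert_id": "alrt-0042",
        "severity": "high",
        "policy": "marketing-consent",
        "transaction_id": "txn-0001",
        "conclusion": "improper",
        "reason": "intended purpose 'marketing' is not among the consented purposes",
        "timestamp": "2015-12-09T14:30:00Z",
    }

    print(json.dumps(alert, indent=2))  # the payload as it might be pushed to a dashboard feed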
[0085] Many data sources may be monitored for policy-sensitive data, such as logs, electronic medical records (EMR) and other electronic records, databases of many types (on premises and in the cloud), and sources such as Splunk and JDBC.
[0086] Actions and alerts may take many forms, such as email, SMS, web services, and proprietary requests.

[0087] Methods and systems disclosed herein may be embodied in a wide range of products. One embodiment may be a policy repository. A host company of the methods and systems disclosed herein could hire people to enter such policy information into the repository.

[0088] Other embodiments may include a policy workbench for developing and modifying shareable and reusable policies, such as for use by a legal department of an enterprise.
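By way of non-limiting illustration, policies stored in such a repository might be represented as shareable, reusable objects; the following minimal Python sketch uses hypothetical class and method names that are not part of this disclosure:

    # Illustrative sketch; the Policy structure and repository interface are assumptions.
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Policy:
        policy_id: str
        description: str
        allowed_purposes: frozenset

    class PolicyRepository:
        """A minimal in-memory store of shareable, reusable policies."""
        def __init__(self):
            self._policies = {}

        def publish(self, policy: Policy) -> None:
            self._policies[policy.policy_id] = policy

        def get(self, policy_id: str) -> Policy:
            return self._policies[policy_id]

    repo = PolicyRepository()
    repo.publish(Policy("ephi-marketing-consent",
                        "ePHI may be used for marketing only with explicit consent",
                        frozenset({"treatment", "payment", "operations"})))
    print(repo.get("ephi-marketing-consent").description)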
[0089] The methods and systems disclosed herein, by allowing data to be used properly, and by allowing confirmation of proper use based on reasoning under various policies, enable benefits for a number of different types of users. The system may create a common language for board members and executive teams, for privacy and big data teams, for legal and marketing teams, for compliance and product groups, and the like. For example, the privacy, legal, or compliance team of an enterprise may get real-time validation of data usage through real-time monitoring, as well as immediate alerts as to risky or improper usage, with auditable reasoning stored for after-the-fact review. The system may respond to board priorities and external requests, such as from law enforcement, consumer protection regulators, auditors, media, consumer advocates, and the like. The system may serve as a tool for determining responsibility if things go wrong. The system may enable big data users to be more creative in their work, such as by monitoring proposed usages of data sources in real time, so that novel and creative big data uses can be tried, or modified, without undue concern about violation of policies.
[0090] The system and methods described herein may be utilized in a plurality of exemplary situations and contexts including, but not limited to, the pharmaceutical industry. Biotech and pharmaceutical corporations and entities typically mine genomics and trial data for new discoveries. Various applicable laws require the change, removal, or obfuscation of personal data when consents are altered. Other rules govern the use of clinical trial samples, the dissemination of prescription data, and many other uses of patient, medication, diagnosis, insurance, and other data types. Current solutions for compliance with disparate rules are costly and require rewriting of software code, modification of databases, rewriting or modification of processes, and the like, each time a rule changes or a new rule arises.
[0091] In accordance with the system and methods described herein, there is provided the means to show compliant personal data use in a dashboard, to remove costly code/database/process changes (30% of trial costs) and to accelerate the speed and flexibility of trials and peer collaboration with confidence.
[0092] In another exemplary context, involving a genomic data aggregation/analysis platform for 100+ research partners, researchers' queries must respect consent form restrictions such as "Use my genes only for cancer research" and "Can send my genes to non-profit and for-profit researchers." Institutional Review Boards (IRBs) require review of database queries for compliance against consent form restrictions, and current approaches lack objectivity and scalability.
[0093] In accordance with the system and methods described herein, there is provided the means to validate queries with an objective, automated approach, to allow for preservation of the provenance of the compliance assessment (e.g., "Big Pharma Company A was compliant in its purpose for using the genome for cancer research"), to allow for more efficient use of resources via automated review, and to prevent incorrect data usage.
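As a non-limiting illustration, a consent-aware query check of the kind described above might be sketched as follows in Python; the consent representation and researcher attributes are assumptions:

    # Illustrative only; the consent representation and researcher attributes are assumptions.
    def validate_query(consent, researcher_type, research_purpose):
        """Check a researcher's query against a participant's consent-form restrictions."""
        reasons = []
        if research_purpose not in consent["allowed_purposes"]:
            reasons.append(f"purpose '{research_purpose}' is not permitted by the consent form")
        if researcher_type not in consent["allowed_recipients"]:
            reasons.append(f"recipient type '{researcher_type}' is not permitted by the consent form")
        return {"compliant": not reasons, "reasons": reasons or ["query matches the consent restrictions"]}

    consent = {
        "allowed_purposes": {"cancer research"},
        "allowed_recipients": {"non-profit", "for-profit"},
    }
    print(validate_query(consent, "for-profit", "cancer research"))
    print(validate_query(consent, "for-profit", "diabetes research"))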
[0094] In another exemplary context involving state law enforcement, states classify citizens into gangs to prevent gang crime. Laws and regulations present clear guidelines on gang identification. The system validates an analyst's assessment of gang classification, while the analyst can override the system's determination of gang classification. In accordance with the system and methods described herein, an objective approach to gang classification may be ensured. Algorithms for categorization may be created directly from the law, with the process fully automated and documented.
[0095] In another exemplary context involving nationwide drugstore retailers, drug retailers collect consumer data on prescriptions from independent pharmacies. Laws limit the use of consumer health data in a marketing context. Safeguards to ensure appropriate use and prevent exploitation of data are currently lacking. In accordance with the system and methods described herein, safeguards required by law and company policy may be implemented. Monitoring for timely responses to breaches may be implemented, and compliance may be demonstrated with an event history.
[0096] Another exemplary context involves resorts, which build profiles on patron activities and product preferences. Personal profiles may only be used for certain purposes (e.g., marketing). The potential for misuse and negative impact arises (e.g., affecting patron credit, or releasing patron information to parties other than certain permitted parties ["what happens in Vegas should stay in Vegas"]). In accordance with the system and methods described herein, analytic tools may be leveraged to ensure data is accessed only for legal uses. Data misuse may be discovered in a timely manner, followed by an appropriate response. Marketing campaigns' data access logs are reviewed by the system and validated against behavioral marketing and other relevant guidelines.
[0097] In another exemplary context involving automatic license plate readers (ALPRs), sensitive personal data comprising location, date, and time are collected and stored. Such data are used by law enforcement and allowed for certain purposes. Such data may have an expiration date yet still be stored in a database. It is necessary to ensure that data being accessed has not expired.
[0098] In accordance with the system and methods described herein, the determination of legality may be automated to ensure compliance with the law and reliability.
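As a non-limiting illustration, an expiration check of the kind described above might be sketched as follows; the retention period and record fields are assumptions, not values drawn from this disclosure:

    # Illustrative only; the retention period and record fields are assumptions.
    from datetime import datetime, timedelta, timezone

    RETENTION = timedelta(days=90)  # hypothetical retention period set by policy

    def record_accessible(record, now=None):
        """Deny access to ALPR records whose retention period has expired."""
        now = now or datetime.now(timezone.utc)
        expired = now - record["captured_at"] > RETENTION
        return {
            "plate": record["plate"],
            "accessible": not expired,
            "reason": "within retention period" if not expired else "retention period expired",
        }

    record = {"plate": "ABC-1234", "captured_at": datetime(2015, 6, 1, tzinfo=timezone.utc)}
    print(record_accessible(record, now=datetime(2015, 12, 9, tzinfo=timezone.utc)))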
[0099] In another exemplary context involving healthcare, patient/wellness/medical data is gathered by an App Platform. Downstream transmission of sensitive data to third party companies is becoming more prevalent. Monitoring of third party developers' use of data, including health, medical, and wellness data ("Sensitive Data"), is required, and the need arises to ensure appropriate use in various technologies, including emerging technologies (e.g., the Internet of Things).

[00100] In accordance with the system and methods described herein, the system platform is utilized to monitor use of Sensitive Data. Specifically, an app pulls data from an App Platform's API, a tracking code is added, and, as the app uses the App Platform's Sensitive Data, the app sends data usage information (what data was accessed, what was the context of use, and who or what was the recipient of Sensitive Data) to the system. The system may then aggregate usage data, assess proper use of data with respect to policies, and then notify the App Platform in cases of misuse or notable proper use, providing case-by-case reports to the company to track downstream data use and preserving accountability conclusions (reasons why data usage was or was not proper). The system may further constantly track transmission and access of data.
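By way of non-limiting illustration, an instrumented app might report a downstream usage event along the following lines; the endpoint and event fields are hypothetical assumptions:

    # Illustrative only; the reporting endpoint and event fields are assumptions.
    import json

    def report_usage(data_accessed, context_of_use, recipient,
                     endpoint="https://monitor.example.com/usage"):
        """Package a downstream data-usage event as an instrumented app might report it."""
        event = {
            "data_accessed": data_accessed,    # which Sensitive Data fields were read
            "context_of_use": context_of_use,  # why the app used the data
            "recipient": recipient,            # who or what received the data downstream
        }
        # A real tracking code might POST this; printing keeps the sketch self-contained.
        print(f"POST {endpoint}\n{json.dumps(event, indent=2)}")

    report_usage(["heart_rate", "sleep_hours"], "in-app wellness coaching", "analytics-partner.example.com")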
[00101] In another exemplary context involving data licensing, sensitive personal data is stored on massive databases. In such instances, ensuring compliance in each scenario is not viable via a manual audit. Monitoring downstream data use is not feasible with current technology.
[00102] In accordance with the system and methods described herein, licensed data may be encapsulated into a system "wrapper," whereby a data licensee accesses licensed data through the system wrapper, and the system tracks data access against the licensee's signed terms of service.

[00103] In another exemplary context involving the intelligence community, a sensitive database provides a holistic picture of potential targets. Queries of potential persons of interest may result in illegal or inappropriate analysis (e.g., of American citizens). The intelligence community needs the ability to attest to FISA courts and the general public about proper classification of individuals into target categories. Rogue investigators may pursue searches of personal interest on the database.
[00104] In accordance with the system and methods described herein, there is provided a cross-check of the database against applicable laws to create a shortlist of targets to investigate manually. Employee monitoring may be automated to allow a workforce to focus on matters of national interest and to prevent inappropriate use of regulated databases.
[00105] Another exemplary context involves financial companies. For example, an employer may hire a company, "FinCo," to administer a 401k Plan wherein a contract defines ownership of Employee info in a 401k Plan Data as belonging to the employer. Use limitations of employee data may, for example, prevent FinCo personal investing representatives from soliciting or determining an employee's 529 college savings plan eligibility (e.g., using data related to income, number of children, etc.). The employee may then walk into a FinCo retail center for advice on 401k and 529 plans, at which point the company data containing personal information is transmitted to the investment agency. The investment agency must adhere to company policy while pursuing marketing to clients. In accordance with the system and methods described herein, compliance may be determined based, at least in part, on a context of use. The system accomplishes this by comparing data access requests against the stated policies in the contract between FinCo and the employer and notifies the employer in cases of breach of contract.
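As a non-limiting illustration, a context-of-use check against contractual use limitations might be sketched as follows in Python; the contract representation, request fields, and permission logic are assumptions:

    # Illustrative only; the contract representation, request fields, and permission logic are assumptions.
    def check_request(contract, request):
        """Compare a data-access request against use limitations stated in the employer contract."""
        prohibited = request["purpose"] in contract["prohibited_uses"]
        # Context of use matters: an inquiry the customer initiates may be allowed
        # even where unsolicited use of the same data would not be.
        allowed = (not prohibited) or request.get("customer_initiated", False)
        if not prohibited:
            reason = "purpose not restricted by the contract"
        elif allowed:
            reason = "restricted purpose permitted because the customer initiated the inquiry"
        else:
            reason = "purpose prohibited by the contract; employer notified"
        return {"allowed": allowed, "reason": reason}

    contract = {"prohibited_uses": {"529 solicitation", "529 eligibility scoring"}}
    print(check_request(contract, {"purpose": "529 solicitation"}))
    print(check_request(contract, {"purpose": "529 solicitation", "customer_initiated": True}))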
[00106] In another exemplary context involving network security, IT policy defines proper use among multiple systems. While security systems can detect anomalies (violations of projected behavior based on historical use), no existing solution compares security log data to defined policies.
[00107] In accordance with the system and methods described herein, policies can be easily adapted as proper use changes, based on use of the system by authorized users, different uses of data by the same user, and/or different uses of data in different countries.
[00108] It will be apparent that the methods and systems disclosed herein may have applications in a wide range of industries including but not limited to Healthcare, Banking, Financial Services and Insurance (BFSI), Government, Education and Academia, Biotech and Pharmaceuticals, Construction and Engineering, Telecom and IT, Mining and Natural Resources, Energy and Utilities, Transportation and Logistics, Manufacturing, and Retail and Consumer Goods.
[00109] For example, such systems and methods may be used by various healthcare users and providers, such as hospitals, doctors, nurses, health care regulators, research organizations, insurance companies, federal and state health associations, governmental organizations like the CDC and NIH, and non-governmental organizations like UNESCO and the Red Cross, for managing data in compliance with different healthcare regulation standards such as HIPAA (Health Insurance Portability and Accountability Act) and HITECH (Health Information Technology for Economic and Clinical Health). Such standards include a wide variety of data usage policies that must be complied with by these users of healthcare data. For example, HIPAA includes policies such as: patients are provided with a right to access their own records and the right to request corrections of errors; patients must have prior knowledge pertaining to how their data will be used; explicit consent is required from the involved individuals before Electronic Protected Health Information (ePHI) can be used for marketing; and individuals have the right to ask and expect health organizations to take reasonable steps to ensure that communications between the individual and the organization are kept private, and the right to file formal privacy-related complaints in case of a data breach. It will be understood that such policies related to privacy, consent, complaints, etc., may be encoded by the policy engine to provide a customized solution for the various users and providers in the healthcare industry.
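By way of non-limiting illustration, a single HIPAA-style rule of the kind listed above (explicit consent before ePHI is used for marketing) might be encoded as a simple predicate; the following Python sketch is an assumption about one possible encoding, not a complete ruleset:

    # Illustrative only; this encodes a single HIPAA-style rule as a predicate, not a complete ruleset.
    def ephi_marketing_rule(usage):
        """ePHI may be used for marketing only with the individual's explicit consent."""
        if usage["purpose"] != "marketing":
            return {"compliant": True, "reason": "rule applies only to marketing uses"}
        if usage.get("explicit_consent"):
            return {"compliant": True, "reason": "explicit consent is on record"}
        return {"compliant": False, "reason": "marketing use of ePHI without explicit consent"}

    print(ephi_marketing_rule({"purpose": "marketing", "explicit_consent": False}))
    print(ephi_marketing_rule({"purpose": "treatment"}))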
[00110] Similarly, in the financial services industry, various users, including banks, securities firms, credit card companies, insurance companies, credit score organizations, e-commerce corporations, consumer loan and mortgage corporations, asset management companies, and governmental organizations like the IRS, routinely deal with consumer personal financial data related to credit card transactions, mortgages and loans, taxation, investments, and the like. Such organizations and users need to comply with various standards such as PCI-DSS (Payment Card Industry Data Security Standard), FCRA (Fair Credit Reporting Act), GLBA (Gramm-Leach-Bliley Act), BASEL II, etc. These standards and norms require compliance with various policies such as: the Primary Account Number (PAN) is to be rendered unreadable anywhere it is stored; sensitive authentication data (full magnetic stripe data, CVV2, PIN) is not to be stored; strong cryptography is to be used to render cardholder data unreadable; and servers and payment card storage systems are to be located inside secure, access-controlled rooms. It will be apparent that the methods and systems disclosed herein could be utilized to design a customized solution for various users of financial data depending on the application of such data.
[00111] As another example, various users of student record data, such as public and private schools, universities, test administration organizations, athletic organizations, and recruitment firms that deal with data for applications related to college admissions, student loans, scholarships and grants, recruiting, grading, and test administration, need to comply with FERPA (Family Educational Rights and Privacy Act). The standard includes rule sets such as: parents or eligible students have the right to inspect and review the student's education records maintained by the school; they have the right to request that a school correct records which they believe to be inaccurate or misleading; and written permission from the parent or eligible student is required in order to release any information from a student's education record. As before, such policies could be encoded and built into a specialized solution for the education industry using the methods and systems disclosed herein.
[00112] It will be apparent that, in addition to the above representative examples, the methods and systems disclosed herein could be applicable to a wide range of applications including export control compliance, Sarbanes-Oxley (SOX) compliance, GxP (GMP, GCP, GLP) compliance, and so on.
[00113] Methods and systems disclosed herein include various uses and applications, exemplary embodiments of which are provided below. Such uses serve various types of users, roles or "personas" within organizations. Among interesting applications are ones for personas where a part of the organization wants to hear a favorable answer about use of data, rather than having data locked down for fear of non-compliance with a regulation. Examples include chief marketing officers, big data analysts, data scientists, and, more generally, members of marketing groups. For example, a large body of pharmacy data may contain information that is quite sensitive at the individual level, such as prescription records, but that data, in the aggregate, might be very useful, such as in organizing marketing campaigns targeted to people who walk into stores. A campaign might suggest, for example, giving over-the-counter antihistamine coupons to people in Boston during a particular season, due to recognition from aggregate data that allergies are high during that time. Such a use could be cleared by the policy engine as it would not violate a policy against using individual records. In another example, a compliance, legal, or privacy professional, such as one working for a Chief Privacy Officer, may be part of a big data team. More and more, such professionals are forced to say "no," and there is no common language. The methods and systems provided herein enable a common language and let users put questions into the engine in a way that can be understood by various professionals applying various policies to various kinds of data. Many kinds of questions can be answered for these professionals, such as "can I use PHI in marketing?" or "can I give this person a bank account and reach into the database to find out that person's affiliation with party X or Y?" A user may have some correlation drawn from a massive data set, and there may be some catalyst for putting that to use. When that predictive model is going to be used, the usage can be addressed by the policy engine to validate it against the policies of the enterprise.
[00114] In another example, an entity may be looking to hire individuals that require training and investment by the enterprise. In some cases, there is little clear correlation with experience of individuals on a resume, but personality tests may prove a better indicator of future performance. For example, for call center representatives, personality tests tend to work (creative types tend to stick around, while inquisitive types don't). Policy analytics can run alongside these data analytics, to confirm that use of the data, such as personality testing data, is permitted. In general, as a big data analyst, when one runs a predictive model, one is out on a ledge. The methods and systems disclosed herein provide confidence to get out onto that ledge.
[00115] Examples of areas that use predictive models include predictive policing; correlation of data sets for marketing (e.g., between mail order shopping data and use of ER services for patient care, such as relating to medical insurance decisions about people); and many others. Methods and systems disclosed herein validate data usage and preserve the provenance of those decisions for future analysis.
[00116] Financial services companies may also benefit from the methods and systems disclosed herein. Mutual fund companies have information about individual 401(k) plans, and their customers may want to know about saving for education. The mutual fund company can't touch the 401(k) information (it belongs to the employer) until the customer asks about something like a 529 plan. The methods and systems disclosed herein allow unlocking of the data when the appropriate circumstances permit, on a case-by-case basis, under automatic action of the policy engine 102. Other examples include answering whether traders are making improper trades (such as accessing improper data sets).
[00117] In the medical field, many predictions (such as those relating to outcomes in sports) depend on sensitive medical data. Usage can be compared to policy to verify which elements can be used in prediction models and which are prohibited, on a use-by-use basis in real time.
[00118] In the intelligence and law enforcement world, analysts may wish to look at data obtained by other analysts. The permission to use data for particular purposes can be evaluated against policies on a case-by-case basis in real time.
[00119] Application programming interfaces use many kinds of data in an automated fashion and are typically associated with their own data usage policies as well as being subject to enterprise policies and regulatory constraints. API data usage can be processed by the policy engine 102 in real time to confirm that a particular use of data presented by or passed through an API is in compliance with the terms and conditions of the API and applicable policies and regulations.
[00120] Almost any highly regulated industry can benefit from the methods and systems disclosed herein. In medical research, for example, a researcher may have samples and wish to know whether they can be used. For example, if an individual has been deceased for more than 70 years, the individual's genome may be used for research under certain circumstances that can be embodied in the policy engine. Such circumstances may vary from country to country and may be embodied in a reasoning engine that walks through the applicable policies and usage on a case-by-case basis, capturing the reasoning and underlying data for future analysis.
[00121] Government sectors have policy analysts frequently looking to share data (e.g., between state and federal agencies). The personal information balance sheet described above allows validation of certain usage while confirming that other uses are not occurring. This may change the dialog with the public from who has access to the data to how the data is being used.
[00122] In an embodiment, federal law enforcement requests information about a suspect, a criminal, or a witness from a state agency. The state fusion centers were designed to help share that information. A policy analyst (the sender) receives a request for information from a federal agent. The analyst indicates what the request is regarding (picking the name), chooses a policy (e.g., the Massachusetts Fusion Center policy), and receives an analysis from the policy engine 102 indicating whether there is compliance. Most rules engines would end there (up or down as to compliance). In one embodiment, related to legal reasoning, the reasoner may expand out a usage justification, such as providing a traditional "IRAC" (issue, rule, analysis, conclusion) line of reasoning, such as: Issue: determining whether a usage complies with a rule; Rule: a rule or policy that applies to the usage is determined; Analysis: RDF triplets are used to pull the relevant rules or policies (the analysis optionally spells out the facts in plain language, such as that there was a request for dissemination of information to a federal law enforcement agency) and may designate the transaction with a name/number; and Conclusion (e.g., "the data usage is valid").
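As a non-limiting illustration, an IRAC-style justification of the kind described above might be assembled as follows; the structure and the triple format shown are assumptions about one possible encoding:

    # Illustrative only; the IRAC structure and the triple format are assumptions about one encoding.
    def irac_justification(transaction_id, usage, rule):
        """Expand a compliance determination into issue/rule/analysis/conclusion form."""
        permitted = usage["purpose"] in rule["permitted_purposes"]
        analysis = [
            (transaction_id, "requestedBy", usage["requester"]),   # RDF-style triples backing the analysis
            (transaction_id, "hasPurpose", usage["purpose"]),
            (rule["name"], "permitsPurpose", ", ".join(sorted(rule["permitted_purposes"]))),
        ]
        return {
            "issue": f"whether transaction {transaction_id} complies with {rule['name']}",
            "rule": rule["text"],
            "analysis": analysis,
            "conclusion": "the data usage is valid" if permitted else "the data usage is not valid",
        }

    rule = {"name": "Fusion Center Policy",
            "text": "Information may be disseminated to federal law enforcement for criminal investigations.",
            "permitted_purposes": {"criminal investigation"}}
    usage = {"requester": "federal agent", "purpose": "criminal investigation"}
    print(irac_justification("txn-0001", usage, rule))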
[00123] The methods and systems described herein may be deployed in part or in whole through a machine that executes computer software, program codes, and/or instructions on a processor. The processor may be part of a server, client, network infrastructure, mobile computing platform, stationary computing platform, or other computing platform. A processor may be any kind of computational or processing device capable of executing program instructions, codes, binary instructions and the like. The processor may be or include a signal processor, digital processor, embedded processor, microprocessor or any variant such as a co-processor (math co-processor, graphic co-processor, communication co-processor and the like) and the like that may directly or indirectly facilitate execution of program code or program instructions stored thereon. In addition, the processor may enable execution of multiple programs, threads, and codes. The threads may be executed simultaneously to enhance the performance of the processor and to facilitate simultaneous operations of the application. By way of implementation, methods, program codes, program
instructions and the like described herein may be implemented in one or more threads. The thread may spawn other threads that may have assigned priorities associated with them; the processor may execute these threads based on priority or any other order based on instructions provided in the program code. The processor may include memory that stores methods, codes, instructions and programs as described herein and elsewhere. The processor may access a storage medium through an interface that may store methods, codes, and instructions as described herein and elsewhere. The storage medium associated with the processor for storing methods, programs, codes, program instructions or other types of instructions capable of being executed by the computing or processing device may include but may not be limited to one or more of a CD-ROM, DVD, memory, hard disk, flash drive, RAM, ROM, cache and the like.

[00124] A processor may include one or more cores that may enhance the speed and performance of a multiprocessor. In embodiments, the processor may be a dual core processor, a quad core processor, another chip-level multiprocessor or the like that combines two or more independent cores (called a die).
[00125] The methods and systems described herein may be deployed in part or in whole through a machine that executes computer software on a server, client, firewall, gateway, hub, router, or other such computer and/or networking hardware. The software program may be associated with a server that may include a file server, print server, domain server, Internet server, intranet server and other variants such as secondary server, host server, distributed server and the like. The server may include one or more of memories, processors, computer readable media, storage media, ports (physical and virtual), communication devices, and interfaces capable of accessing other servers, clients, machines, and devices through a wired or a wireless medium, and the like. The methods, programs or codes as described herein and elsewhere may be executed by the server. In addition, other devices required for execution of methods as described in this application may be considered as a part of the
infrastructure associated with the server.
[00126] The server may provide an interface to other devices including, without limitation, clients, other servers, printers, database servers, print servers, file servers, communication servers, distributed servers and the like. Additionally, this coupling and/or connection may facilitate remote execution of a program across the network. The networking of some or all of these devices may facilitate parallel processing of a program or method at one or more locations without deviating from the scope. In addition, any of the devices attached to the server through an interface may include at least one storage medium capable of storing methods, programs, code and/or instructions. A central repository may provide program instructions to be executed on different devices. In this implementation, the remote repository may act as a storage medium for program code, instructions, and programs.
[00127] The software program may be associated with a client that may include a file client, print client, domain client, Internet client, intranet client and other variants such as secondary client, host client, distributed client and the like. The client may include one or more of memories, processors, computer readable media, storage media, ports (physical and virtual), communication devices, and interfaces capable of accessing other clients, servers, machines, and devices through a wired or a wireless medium, and the like. The methods, programs or codes as described herein and elsewhere may be executed by the client. In addition, other devices required for execution of methods as described in this application may be considered as a part of the infrastructure associated with the client.
[00128] The client may provide an interface to other devices including, without limitation, servers, other clients, printers, database servers, print servers, file servers, communication servers, distributed servers and the like. Additionally, this coupling and/or connection may facilitate remote execution of a program across the network. The networking of some or all of these devices may facilitate parallel processing of a program or method at one or more locations without deviating from the scope. In addition, any of the devices attached to the client through an interface may include at least one storage medium capable of storing methods, programs, applications, code and/or instructions. A central repository may provide program instructions to be executed on different devices. In this implementation, the remote repository may act as a storage medium for program code, instructions, and programs.
[00129] The methods and systems described herein may be deployed in part or in whole through network infrastructures. The network infrastructure may include elements such as computing devices, servers, routers, hubs, firewalls, clients, personal computers, communication devices, routing devices and other active and passive devices, modules and/or components as known in the art. The computing and/or non-computing device(s) associated with the network infrastructure may include, apart from other components, a storage medium such as flash memory, buffer, stack, RAM, ROM and the like. The processes, methods, program codes, instructions described herein and elsewhere may be executed by one or more of the network infrastructural elements.
[00130] The methods, program codes, and instructions described herein and elsewhere may be implemented on a cellular network having multiple cells. The cellular network may be either a frequency division multiple access (FDMA) network or a code division multiple access (CDMA) network. The cellular network may include mobile devices, cell sites, base stations, repeaters, antennas, towers, and the like. The cell network may be a GSM, GPRS, 3G, EVDO, mesh, or other network type.
[00131] The methods, program codes, and instructions described herein and elsewhere may be implemented on or through mobile devices. The mobile devices may include navigation devices, cell phones, mobile phones, mobile personal digital assistants, laptops, palmtops, netbooks, pagers, electronic book readers, music players and the like. These devices may include, apart from other components, a storage medium such as a flash memory, buffer, RAM, ROM and one or more computing devices. The computing devices associated with mobile devices may be enabled to execute program codes, methods, and instructions stored thereon.
Alternatively, the mobile devices may be configured to execute instructions in collaboration with other devices. The mobile devices may communicate with base stations interfaced with servers and configured to execute program codes. The mobile devices may communicate on a peer-to-peer network, mesh network, or other communications network. The program code may be stored on the storage medium associated with the server and executed by a computing device embedded within the server. The base station may include a computing device and a storage medium. The storage device may store program codes and instructions executed by the computing devices associated with the base station.
[00132] The computer software, program codes, and/or instructions may be stored and/or accessed on machine readable media that may include: computer components, devices, and recording media that retain digital data used for computing for some interval of time; semiconductor storage known as random access memory (RAM); mass storage typically for more permanent storage, such as optical discs, forms of magnetic storage like hard disks, tapes, drums, cards and other types; processor registers, cache memory, volatile memory, non-volatile memory; optical storage such as CD, DVD; removable media such as flash memory (e.g. USB sticks or keys), floppy disks, magnetic tape, paper tape, punch cards, standalone RAM disks, Zip drives, removable mass storage, off-line, and the like; other computer memory such as dynamic memory, static memory, read/write storage, mutable storage, read only, random access, sequential access, location addressable, file addressable, content addressable, network attached storage, storage area network, bar codes, magnetic ink, and the like.
[00133] The methods and systems described herein may transform physical and/or intangible items from one state to another. The methods and systems described herein may also transform data representing physical and/or intangible items from one state to another.
[00134] The elements described and depicted herein, including in flow charts and block diagrams throughout the figures, imply logical boundaries between the elements. However, according to software or hardware engineering practices, the depicted elements and the functions thereof may be implemented on machines through computer executable media having a processor capable of executing program instructions stored thereon as a monolithic software structure, as standalone software modules, or as modules that employ external routines, code, services, and so forth, or any combination of these, and all such implementations may be within the scope of the present disclosure. Examples of such machines may include, but may not be limited to, personal digital assistants, laptops, personal computers, mobile phones, other handheld computing devices, medical equipment, wired or wireless
communication devices, transducers, chips, calculators, satellites, tablet PCs, electronic books, gadgets, electronic devices, devices having artificial intelligence, computing devices, networking equipment, servers, routers and the like. Furthermore, the elements depicted in the flow chart and block diagrams or any other logical component may be implemented on a machine capable of executing program instructions. Thus, while the foregoing drawings and descriptions set forth functional aspects of the disclosed systems, no particular arrangement of software for implementing these functional aspects should be inferred from these descriptions unless explicitly stated or otherwise clear from the context. Similarly, it may be appreciated that the various steps identified and described above may be varied, and that the order of steps may be adapted to particular applications of the techniques disclosed herein. All such variations and modifications are intended to fall within the scope of this disclosure. As such, the depiction and/or description of an order for various steps should not be understood to require a particular order of execution for those steps, unless required by a particular application, or explicitly stated or otherwise clear from the context.
[00135] The methods and/or processes described above, and steps thereof, may be realized in hardware, software or any combination of hardware and software suitable for a particular application. The hardware may include a general purpose computer and/or dedicated computing device or specific computing device or particular aspect or component of a specific computing device. The processes may be realized in one or more microprocessors, microcontrollers, embedded microcontrollers,
programmable digital signal processors or other programmable device, along with internal and/or external memory. The processes may also, or instead, be embodied in an application specific integrated circuit, a programmable gate array, programmable array logic, or any other device or combination of devices that may be configured to process electronic signals. It may further be appreciated that one or more of the processes may be realized as a computer executable code capable of being executed on a machine readable medium.
[00136] The computer executable code may be created using a structured
programming language such as C, an object oriented programming language such as C++, or any other high-level or low-level programming language (including assembly languages, hardware description languages, and database programming languages and technologies) that may be stored, compiled or interpreted to run on one of the above devices, as well as heterogeneous combinations of processors, processor architectures, or combinations of different hardware and software, or any other machine capable of executing program instructions.
[00137] Thus, in one aspect, each method described above and combinations thereof may be embodied in computer executable code that, when executing on one or more computing devices, performs the steps thereof. In another aspect, the methods may be embodied in systems that perform the steps thereof, and may be distributed across devices in a number of ways, or all of the functionality may be integrated into a dedicated, standalone device or other hardware. In another aspect, the means for performing the steps associated with the processes described above may include any of the hardware and/or software described above. All such permutations and combinations are intended to fall within the scope of the present disclosure.
[00138] While the methods and systems described herein have been disclosed in connection with certain preferred embodiments shown and described in detail, various modifications and improvements thereon may become readily apparent to those skilled in the art. Accordingly, the spirit and scope of the methods and systems described herein is not to be limited by the foregoing examples, but is to be understood in the broadest sense allowable by law.
[00139] All documents referenced herein are hereby incorporated by reference.

CLAIMS

What is claimed is:
1. A method comprising:
embodying one or more policies as an object; and
performing computation on data in flow against the one or more policies.
2. The method of claim 1 wherein the object is a first class object.
3. The method of claim 1 wherein the performing computation is performed for at least one of a plurality of rule sets.
4. The method of claim 3 wherein the computation is performed by a policy engine that compares intended data uses with policies that govern said uses in real time.
5. The method of claim 4 wherein at least one of the plurality of rule sets is customized based, in part, on an application of the policy engine.
6. The method of claim 3 wherein at least one of the plurality of rule sets is defined as a layer separate from a data layer.
7. A system comprising:
a device adapted to execute a data transmitting application;
a server adapted to receive transmitted data from the application; and
a reasoning engine adapted to receive metadata describing the transmitted data and to generate one or more conclusions indicative of a state of transmission of the transmitted data.
8. The system of claim 7 wherein the reasoning engine is further adapted to transmit the one or more conclusions to at least one of a personal data log and a data-use dashboard.
9. The system of claim 7 wherein the reasoning engine is further adapted to transmit an alert based, at least in part, upon the one or more conclusions.
10. The system of claim 7 wherein the reasoning engine is adapted to apply one or more predetermined rules to generate the one or more conclusions.
11. The system of claim 10 further comprising a rule writing interface adapted to facilitate the generation of the one or more predetermined rules.
12. The system of claim 11 wherein the one or more predetermined rules comprise a policy.
13. The system of claim 12 further comprising a marketplace server on which is stored at least one policy.
14. The system of claim 13 wherein the marketplace server is adapted to provide the at least one policy to the reasoning engine.
15. The system of claim 7 wherein the reasoning engine is further adapted to analyze the metadata in accordance with one or more policies to produce at least one justification output.
16. The system of claim 15 wherein the justification output comprises a human readable form.
17. The system of claim 15 wherein the justification output preserves each of a plurality of results forming one or more sub-steps of the analysis.
18. A method comprising:
receiving metadata describing the transmission of data by a device to a server; and
applying one or more predetermined rules to generate one or more conclusions indicative of a state of transmission of the transmitted data.
19. The method of claim 18 further comprising transmitting the one or more conclusions to at least one of a personal data log and a data-use dashboard.
20. The method of claim 18 further comprising transmitting an alert based, at least in part, upon the one or more conclusions.
21. The method of claim 18 further comprising utilizing a rule writing interface to facilitate the generation of the one or more predetermined rules.
22. The method of claim 21 wherein the one or more predetermined rules comprise a policy.
23. The method of claim 22 further comprising accessing a marketplace server for receiving at least one policy.
24. The method of claim 18 further comprising analyzing the metadata in accordance with one or more policies to produce at least one justification output.
25. The method of claim 24 wherein the justification output comprises a human readable form.
26. The method of claim 25 wherein the justification output preserves each of a plurality of results forming one or more sub-steps of the analysis.