WO2016028442A1 - Systems and methods for detecting sensitive user data on the internet - Google Patents

Systems and methods for detecting sensitive user data on the internet

Info

Publication number
WO2016028442A1
WO2016028442A1 (PCT/US2015/042216)
Authority
WO
WIPO (PCT)
Prior art keywords
information
candidates
verified
online
data format
Prior art date
Application number
PCT/US2015/042216
Other languages
English (en)
Inventor
Garth Shoemaker
Ivan Medvedev
Russell Owen
Nima MOUSAVI
Emily LUTHRA
Heather GUTNIK
Original Assignee
Google Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Google Inc. filed Critical Google Inc.
Publication of WO2016028442A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q20/00Payment architectures, schemes or protocols
    • G06Q20/38Payment protocols; Details thereof
    • G06Q20/40Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check credit lines or negative lists
    • G06Q20/401Transaction verification
    • G06Q20/4016Transaction verification involving fraud or risk level assessment in transaction processing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9535Search customisation based on user profiles and personalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55Detecting local intrusion or implementing counter-measures
    • G06F21/552Detecting local intrusion or implementing counter-measures involving long-term monitoring or reporting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55Detecting local intrusion or implementing counter-measures
    • G06F21/556Detecting local intrusion or implementing counter-measures involving covert channels, i.e. data leakage between processes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/57Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
    • G06F21/577Assessing vulnerabilities and evaluating computer system security
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q20/00Payment architectures, schemes or protocols
    • G06Q20/30Payment architectures, schemes or protocols characterised by the use of specific devices or networks
    • G06Q20/34Payment architectures, schemes or protocols characterised by the use of specific devices or networks using cards, e.g. integrated circuit [IC] cards or magnetic cards
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/14Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1433Vulnerability analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2221/00Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/21Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/2141Access rights, e.g. capability lists, access control lists, access tables, access matrices

Definitions

  • the subject matter discussed herein relates generally to detecting sensitive user data on the Internet and, more particularly, to systems and methods for identifying leakage of sensitive user data on the Internet.
  • the Internet is used for online transactions that may involve sensitive user data. For example, a user may make an online purchase, manage information associated with their accounts, and communicate information to other users. These online transactions may require the user to provide sensitive information. For example, a user may need to provide credit card information to make an online purchase, a government issued identification number to renew a license, or a bank account number to transfer funds electronically.
  • sensitive user data is placed on the Internet in a manner that is inconsistent with a user's intent. This may occur due to an unintentional act by the owner of the data, such as inadvertently sharing a document or incorrectly setting a permission level on a website, such that the data accidentally becomes public on the Internet.
  • a third party, such as a business, a government, or another individual, may also unintentionally post this information on the Internet. Further, the information may be posted to the Internet by a malicious actor.
  • the leaked sensitive user data may remain publicly available on the Internet.
  • a malicious actor may locate the leaked sensitive user data on the Internet and attempt to exploit it.
  • a malicious actor may engage in credit card fraud, identity theft, or other activity. After the malicious activity has occurred and the user becomes aware of the problem, an anti-fraud mechanism may close the leak, but the damage may be difficult to remedy.
  • a malicious actor may purchase merchandise with leaked credit card information, or may use a Social Security number to obtain a driver's license, open a bank account, or conduct other malicious activity.
  • aspects of the example implementations are directed to a method for identifying leakage of sensitive user data when the leakage occurs, and before the leakage may be identified and/or exploited by malicious actors.
  • the subject matter includes computer-implemented methods for detecting information, comprising: performing an online search of an index of online sites for information having a prescribed data format, and generating a list of one or more candidates having the prescribed data format; applying a filter to the one or more candidates having the prescribed data format based on at least one of a plurality of rules, and determining whether each of the filtered candidates meets one or more criteria indicative of the information being compromised; verifying the one or more filtered candidates that meet the one or more criteria based on the determining, by comparison to verified information associated with the filtered candidates, to generate a list of one or more verified candidates; validating the one or more verified candidates based on identification data related to the verified information, to generate a list of one or more validated candidates; and taking an action with respect to the one or more validated candidates.
  • the methods are implemented using one or more computing devices and/or systems. The methods may be stored in computer-readable media.
  • FIG. 1 shows an example environment suitable for some example implementations.
  • FIG. 2 shows a system according to an example implementation.
  • FIG. 3 shows a first user interface according to an example implementation.
  • FIG. 4 shows a second user interface according to an example implementation.
  • FIG. 5 shows a third user interface according to an example implementation.
  • FIG. 6 shows a machine learning classifier, according to an example implementation.
  • FIG. 7 shows an example of a process implementation.
  • FIG. 8 shows an example computing environment with an example computing device suitable for use in some example implementations.
  • the examples shown below are directed to structures and functions for implementing systems and methods for detecting sensitive user data on the internet.
  • the example implementations are directed to detection of the leakage of sensitive user data on the Internet, with the permission of the owner of the sensitive user data.
  • the sensitive user data may include, but is not limited to, credit card numbers, bank account numbers, government issued numbers such as Social Security, passport, license or other information, as well as other user data that would be understood to be sensitive, and subject to potential leakage and damage due to malicious activity, as would be understood by those skilled in the art.
  • action may be taken based on the results, and the information may be incorporated into measures to prevent future occurrences of leakage, if the owner of the sensitive user data consents.
  • a search may be performed of an index, such as a web index, for data that appears to have one or more characteristics of the sensitive user data, if the owner of the sensitive user data has consented to the search.
  • a search of a web index may be conducted for a set of numbers that appear to be credit card numbers.
  • the result of the search is a list of candidates that match the one or more characteristics of the sensitive user data.
  • known patterns associated with credit card numbers may be searched. For example, but not by way of limitation, a number of digits, formatting of digits (e.g., 1234-5678-9012-3456), the specific starting digit or starting sequence in a string of digits, or other rules associated with the conventions for assigning credit card numbers may be applied to the search.
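As an illustrative sketch of such a format-based scan over page text, the pattern below matches digit groups shaped like common card-number formats; the regular expression is a hypothetical example, not one prescribed by the patent:

```python
import re

# Matches four groups of four digits, optionally separated by a hyphen or
# space (e.g. "1234-5678-9012-3456" or 16 contiguous digits).
CARD_PATTERN = re.compile(r"\b(?:\d{4}[- ]?){3}\d{4}\b")

def find_candidates(page_text: str) -> list[str]:
    """Return substrings whose shape matches a common card-number format."""
    return [m.group(0) for m in CARD_PATTERN.finditer(page_text)]
```

A shape match alone says nothing about whether the digits are a real card number; it only produces the initial candidate list that the later filtering, verification, and validation stages narrow down.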
  • the known pattern may include information associated with an international bank account number (IBAN) under ISO 13616, which includes a prescribed country code, followed by check digits, and a basic bank account number (BBAN).
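The ISO 13616 structure above supports a standard mod-97 integrity check. A minimal plain-Python sketch (the patent describes the format but prescribes no implementation):

```python
def iban_valid(iban: str) -> bool:
    """ISO 13616 check: move the first four characters to the end, map
    letters to numbers (A=10 ... Z=35), and verify the value mod 97 == 1."""
    s = iban.replace(" ", "").upper()
    if len(s) < 5 or not s.isalnum() or not s[:2].isalpha() or not s[2:4].isdigit():
        return False
    rearranged = s[4:] + s[:4]
    # int(c, 36) maps '0'-'9' to 0-9 and 'A'-'Z' to 10-35.
    numeric = "".join(str(int(c, 36)) for c in rearranged)
    return int(numeric) % 97 == 1
```

Such a check cheaply discards digit strings that merely look like account numbers before any heavier filtering is applied.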
  • an algorithm to validate an identification number such as a Luhn check may be performed.
  • the example implementation is not limited to a Luhn check, and other algorithms may be substituted therefor without departing from the scope of the inventive concept.
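The patent names the Luhn check but, as noted, other checksum algorithms may be substituted. A minimal sketch of the Luhn algorithm in Python:

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    digits = [int(d) for d in number if d.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    # From the right, double every second digit; subtract 9 if the result
    # exceeds 9, then sum everything. Valid numbers sum to a multiple of 10.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0
```

Applied to the candidate list, this removes most random digit strings while passing well-formed card numbers.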
  • a filter is applied to the candidates on the list.
  • Application of the filter may involve the execution of one or more deterministic rules, as well as the possible application of a machine learning classifier.
  • the deterministic rules may include, but are not limited to, the following:
  • the filter may include a machine learning classifier.
  • the machine learning classifier may be trained to recognize sites or webpages that may include credit card numbers.
  • the classifier may use various features, such as words that appear on the page, properties of credit card numbers that are found on the page, data associated with the prescribed data format (e.g., credit card numbers may occur with more frequency on webpages that include the phrase "credit card"), and other metadata associated with the page. Further details of the machine learning classifier are discussed below with respect to FIG. 6.
  • the properties may include the number of credit card numbers found on the site, the number of unique credit card numbers, patterns associated with the arrangement of the credit card numbers on the site, or the like.
  • the metadata associated with the page may include information such as page rank, anchors, or other relevant metadata.
  • the filtered candidates may be scored based on the results of the filtering, and optionally based on the results of the initial search. If the score associated with a filtered candidate exceeds a threshold score, there is a high degree of likelihood that the filtered candidate is sensitive user data.
  • a threshold score may be set based on a degree of confidence or other calculation, or manually set. The threshold score may be fixed, dynamic, or adjustable based on a configurable parameter.
  • the scoring may be based on a presence and/or distribution of strings of characters in a webpage, such as keywords or phrases (e.g., the score may be incremented for each occurrence of a string of characters that includes a word such as "credit card", "bank", "social security", or the like).
  • the score may also be incremented if information present on the webpage has been previously identified as being potentially associated with compromised account information (e.g., credit card numbers or other financial information that has been reported as potentially leaked or stolen). More specifically, for an account that has been reported as potentially compromised, the score may increment if associated information (e.g., account holder name, phone number, email address, physical address, expiration date, or the like) is present on the webpage.
  • the score may also be based on one or more machine learning models. For example, the score may be based on a machine learning model that is based on manually selected training sets that can be constructed to predict information associated with a webpage that may have a high probability of indicating a true positive (e.g., actual instance of leakage).
  • the information associated with the webpage may include, but is not limited to, hypertext markup language (HTML) constructs (e.g., tables), prescribed fonts, page structure, element repetitions or other information that a machine learning algorithm may determine to be predictive of an actual instance of leakage, as would be understood by those skilled in the art. Accordingly, a score may be incremented when the predictive information is determined to be associated with the webpage.
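The keyword-counting and leak-hint scoring described above might be sketched as follows; the keyword list, weights, and threshold are illustrative assumptions, not values from the patent:

```python
# Hypothetical keyword list; the patent gives these phrases only as examples.
KEYWORDS = ("credit card", "bank", "social security")

def score_page(page_text: str, known_leak_hints=frozenset()) -> int:
    """Increment the score for each keyword occurrence, and once for each
    hint (e.g. an account holder name tied to a reported leak) found on
    the page."""
    text = page_text.lower()
    score = sum(text.count(k) for k in KEYWORDS)
    score += sum(1 for hint in known_leak_hints if hint.lower() in text)
    return score

def exceeds_threshold(score: int, threshold: int = 3) -> bool:
    """The threshold may be fixed, dynamic, or set via a configurable
    parameter; 3 here is an arbitrary illustrative default."""
    return score > threshold
```

A score exceeding the threshold indicates a high likelihood that the filtered candidate is leaked sensitive user data, and only such candidates proceed to verification.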
  • validation may be performed, by searching for other information associated with the sensitive user data on the site that contains the leaked sensitive user data.
  • other credit card information, such as the name of the owner or an expiration date of the credit card, may be the subject of the search in the validation process. Accordingly, further removal of false positives may occur as a result of validation.
  • the validation may be performed without revealing the verified data to the entity that does the comparison (e.g., separation of a data verifier and a data custodian).
  • the sensitive user data is characterized as having been leaked, and the owner of the leaked sensitive user data is considered to have been accurately identified. Accordingly, action is taken to notify the user, based on the preference of the owner of the sensitive user data. Further, assistance may be provided to the user, such as helping the user to obtain a new credit card or account, or to cancel the existing account, or both.
  • the issuer of the credit card may also be notified, with the permission of the owner of the account.
  • the information on the leakage may be provided to an anti-fraud engine, with the permission of the user.
  • the database of credit card numbers may be used as a basis to search for each of the actual credit card numbers on the web index, and where a match exists, perform the foregoing remaining processes to identify the owner of the credit card.
  • the foregoing example implementation may be executed at varying frequencies. For example, but not by limitation, the foregoing example implementation may be executed daily, hourly, or at any other periodic time interval. Alternatively, it may be executed based on specific triggering events, such as an update performed on the web index, which may include changes in the information of the web index. The timing or frequency of the execution of the foregoing example implementation may be prioritized or varied.
  • the example implementation is not limited to a text search of websites. For example, image recognition or recognition of images in videos may also be performed. Further, the information that may be considered sensitive user data is not limited to the foregoing, and the example of a credit card number is merely for illustrative purposes and not intended to be limiting.
  • the foregoing example implementation may be adapted to search on other types of sensitive user data with the permission of the owner of the sensitive data, such as location information, health information, or any other information that a user would consider sensitive, and that may appear on the Internet.
  • FIG. 1 shows an example environment suitable for some example implementations.
  • Environment 100 includes devices 105-145, and each of the devices is communicatively connected to at least one other device via, for example, network 160 (e.g., by wired and/or wireless connections). Some devices may be communicatively connected to one or more storage devices 130 and 145.
  • An example of one or more devices 105-145 may be computing device 805 described below in FIG. 8.
  • Devices 105-145 may include, but are not limited to, a computer 105 (e.g., a laptop computing device), a mobile device 110 (e.g., smartphone or tablet), a television 115, a device associated with a vehicle 120, a server computer 125, computing devices 135-140, and storage devices 130 and 145.
  • devices 105-120 may be considered user devices
  • Devices 125-145 may be devices associated with service providers including websites that may contain sensitive user data (e.g., used by service providers to provide services and/or store data, such as webpages, text, text portions, images, image portions, audios, audio segments, videos, video segments, and/or information thereabout).
  • FIG. 2 illustrates a system 200 for detecting sensitive user information that may have been leaked according to an example implementation.
  • a user such as the owner of sensitive data, may interact with the system via a user device, such as a mobile terminal 201, a fixed or desktop terminal 203, a laptop or tablet 205, a user device associated with a vehicle such as an automobile 207, or any other user device that may engage in online communication, as would be understood by those skilled in the art.
  • a server 209 is provided that is communicatively coupled to the user device.
  • the server 209 may include a search engine 211 that is configured to perform the search of an index 227, such as a web index, for the sensitive user data, and generate a list of candidates that may be the leaked sensitive user data.
  • the owner of the sensitive data may choose whether to consent to the search (e.g., in account settings).
  • a filter 213 is provided to apply a filter, which may include one or more deterministic rules or a machine learning classifier, in a rule base 225, as explained above.
  • a verification engine 215 and a validation engine 217 are also provided, to perform the verification and validation of the example implementations based on actual account data 223, as explained above. As also explained above, the validation may be performed by the validation engine 217 without revealing the verified data that is generated by the verification engine 215, to an entity that performs the comparison during the validation.
  • an action determining unit 219 is provided, to execute an action, as described above. The action determining unit 219 may be associated with an action engine 221, such as an anti-fraud engine or the like, as discussed above.
  • the server 209 is connected to a network 229 such as the Internet.
  • Sensitive user data may be posted on the Internet at various sites 231, 233, 235, 237.
  • the sensitive user information may be posted unintentionally by the user themself, or may be posted by another party, such as an online vendor or a malicious actor 1, 2, 3.
  • malicious actor 1 may post or attempt to abuse sensitive user data A, B at a site 231.
  • malicious actor 2 may post or attempt to abuse sensitive data C at site 233 as well as sensitive data D at site 235.
  • an online vendor 3 may unintentionally release sensitive data E at its site 237.
  • the above described activities at the network 229 may be detected prior to a successful attempt by a malicious actor to abuse the sensitive user data.
  • FIG. 3 illustrates an example user interface 300 according to the example implementation. While the example user interface 300 illustrates an online payment system, the user interface is not limited thereto, and may also include other sites, portals or applications that permit a user to interact with a system associated with the management of the sensitive user information.
  • account options are provided, such as sending payment, performing transactions, selecting a method of payment, considering offers, making transactions, a security center or others as would be understood by those skilled in the art.
  • the user is provided with a notification that a credit card has been discovered on a public website. For example, a notice may be provided to the user, along with an identification of the credit card associated with the sensitive user data. The user may be informed that the system checks for credit card information on the Internet, and provides a notification when such information is found.
  • the user is presented with options. By selecting an object such as a button, the user may invoke a procedure to go to the security center and take further action (e.g., cancel credit card and obtain new credit card). Alternatively, a user may select to use or not use this feature.
  • FIG. 4 illustrates an example user interface 400 according to the example implementation.
  • various account options may be provided.
  • the user is provided with a notification of a security alert. More specifically, the user is notified that the system is checking for the public presence of the user's credit card number, and that the credit card number has been discovered on a public website. The notification explains the risk to the user, and provides the suggestion to contact the issuer of the credit card and request a new credit card.
  • At 405, information associated with the sensitive user data that has been found on the public website is displayed. More specifically, identifying information of the credit card number, expiry date, and other information that was found on the site is displayed. Further, the number of websites on which the data was found is also displayed. The user is also presented with options to contact the issuer of the credit card, such as ABC and XYZ. Although the example implementation is not limited thereto, options are provided for the user to call a phone number associated with the credit card issuer, or complete an online form, which may include canceling the existing credit card number, requesting a new credit card number, or both.
  • FIG. 5 illustrates an example user interface 500 according to the example implementation.
  • account options 501 are provided.
  • the example user interface 500 provides the user with an opportunity to adjust settings associated with the online system. For example, the user may choose the language, whether information such as offers, invitations and news are sent to the user, whether the identity of the user has been verified, name and address information, and setting options for the security center.
  • the setting options for the security center may include, for example but not by way of limitation, a setting that permits the online payment system to check the public web for the credit card data of the user, as well as options for the user to receive security center alerts.
  • Security center alerts may be received, for example, via intelligent personal assistant, e-mail address, mobile device or any other manner of alert delivery, as would be understood by those skilled in the art. Accordingly, the owner of the sensitive user data maintains control over participation in the checking of the public websites on behalf of the owner to protect the sensitive user data, and the manner of being informed.
  • FIG. 6 illustrates a machine learning classifier, according to an example implementation.
  • a data input 601 provides input data, such as the URLs and the data on the sites, to a data processing tool 603.
  • the data processing tool 603 receives the input data and generates as its output information associated with sites that can potentially have one or more credit card numbers.
  • the output of the data processing tool 603 is unlabeled data that is stored in a database 605.
  • a labeling user interface 611 is provided that permits the operator to manage the process of reading the unlabeled data from the database 605, and writing the labels back into the database 605, indicative of the credit card information in the site that is associated with the URL. Particularly, the labeling user interface 611 retrieves only the unlabeled data from the database 605 that is determined to be useful for training the classifier, and sends that data to the operator for labeling.
  • FIG. 7 shows an example of a process implementation according to the example implementation.
  • a search of an index is performed for information having a prescribed data format.
  • the search may be an online search of a web index of online sites for credit card numbers, and a list may be generated of those candidates having the prescribed data format.
  • the prescribed data format may refer to a number of digits, a sequence of digits, a specific format of the digits, or other identification information as would be known by those skilled in the art.
  • a Luhn check or other similar algorithm may be performed.
  • the application of the filter is based on at least one of a plurality of rules, such as deterministic rules or machine learning, as described above. Further, a score is generated that is indicative of a degree of confidence that the candidate is sensitive user data that has been leaked on the Internet. The score may be generated based on the result of the application of the rules associated with the filter, the results of the search in 701, or both.
  • the determination may be based on the score exceeding a threshold, and the threshold may be determined based on a level or degree of confidence, above which the sensitive user data is deemed to be associated with an actual leakage.
  • the determination at 705 is not limited to a determination based on a score exceeding a threshold, and other determination processes may be substituted therefor, or used in combination therewith.
  • a Boolean test may be applied, in which the existence or non-existence of the suspected sensitive user data on the website may constitute the criteria that is used in the determination at 705.
  • 707 and 709 are optional steps as indicated by the broken lines in FIG. 7. Accordingly, in alternate example implementations, the process may proceed to 711 without performing 707 and/or 709.
  • the flow of the process may be 705->707->711, 705->709->711, or 705- >711, instead of 705->707->709->711 as shown in FIG. 7.
  • the filtered candidates having a score that exceeds the threshold are verified against verified information, such as actual sensitive user data.
  • this verification may be performed against a database of actual credit card numbers, such as those provided by the credit card issuer, or stored in some other secure location.
  • validation is performed for the candidate, based on information that is related to the verified information. For example, as explained above with respect to the credit card example implementation, online site content of a site on which the credit card number has been found is searched for other identifying information associated with the credit card account. For example, if a name or expiry date associated with the credit card account also appears on the site, further validation is provided. As also explained above, the validation at 709 may be performed without revealing the verified data generated during the verification 707, to an entity that performs the comparison on the website during the validation. For example but not by way of limitation, this may be performed via one-way transformation (e.g., hash), or other technique as would be understood by those skilled in the art.
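The one-way transformation mentioned above can be sketched as follows; the use of a keyed HMAC and the function names are illustrative assumptions, since the patent only names a hash as one possible technique:

```python
import hashlib
import hmac

def digest(value: str, key: bytes) -> str:
    """One-way transform of a sensitive value under a shared key."""
    return hmac.new(key, value.encode(), hashlib.sha256).hexdigest()

def matches_verified(found_value: str, verified_digests: set, key: bytes) -> bool:
    """True if a value found on a site matches a verified record.

    The comparing entity holds only digests, never the plaintext verified
    data, preserving the separation of data verifier and data custodian."""
    return digest(found_value, key) in verified_digests
```

The data custodian would publish only the digest set; the entity scanning sites hashes what it finds and checks membership, so the verified data is never revealed to it.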
  • action may be taken, such as notifying the owner of the sensitive user information, the issuer of the sensitive user information, denying a transaction associated with the one or more validated candidates, or some other action as consented to by the user.
  • process 700 may be implemented with different, fewer, or more blocks.
  • Process 700 may be implemented as computer executable instructions, which can be stored on a medium, loaded onto one or more processors of one or more computing devices, and executed as a computer- implemented method.
  • FIG. 8 shows an example computing environment with an example computing device suitable for use in some example implementations.
  • Computing device 805 in computing environment 800 can include one or more processing units, cores, or processors 810, memory 815 (e.g., RAM, ROM, and/or the like), internal storage 820 (e.g., magnetic, optical, solid state storage, and/or organic), and/or I/O interface 825, any of which can be coupled on a communication mechanism or bus 830 for communicating information or embedded in the computing device 805.
  • Computing device 805 can be communicatively coupled to input/user interface 835 and output device/interface 840. Either one or both of input/user interface 835 and output device/interface 840 can be a wired or wireless interface and can be detachable.
  • Input/user interface 835 may include any device, component, sensor, or interface, physical or virtual, that can be used to provide input (e.g., buttons, touch-screen interface, keyboard, a pointing/cursor control, microphone, camera, braille, motion sensor, optical reader, and/or the like).
• Output device/interface 840 may include a display, television, monitor, printer, speaker, braille, or the like. In some example implementations, input/user interface 835 and output device/interface 840 can be embedded with or physically coupled to the computing device 805.
• Other computing devices may function as or provide the functions of input/user interface 835 and output device/interface 840 for computing device 805.
  • Examples of computing device 805 may include, but are not limited to, highly mobile devices (e.g., smartphones, devices in vehicles and other machines, devices carried by humans and animals, and the like), mobile devices (e.g., tablets, notebooks, laptops, personal computers, portable televisions, radios, and the like), and devices not designed for mobility (e.g., desktop computers, other computers, information kiosks, televisions with one or more processors embedded therein and/or coupled thereto, radios, and the like).
  • Computing device 805 can be communicatively coupled (e.g., via I/O interface 825) to external storage 845 and network 850 for communicating with any number of networked components, devices, and systems, including one or more computing devices of the same or different configuration.
• Computing device 805 or any connected computing device can function as, provide the services of, or be referred to as a server, client, thin server, general machine, special-purpose machine, or another label.
• The I/O interface 825 may include wireless communication components. The wireless communication components may include an antenna system with one or more antennae, a radio system, a baseband system, or any combination thereof.
  • Radio frequency (RF) signals may be transmitted and received over the air by the antenna system under the management of the radio system.
• I/O interface 825 can include, but is not limited to, wired and/or wireless interfaces using any communication or I/O protocols or standards (e.g., Ethernet, 802.11x, Universal Serial Bus, WiMax, modem, a cellular network protocol, and the like) for communicating information to and/or from at least all the connected components, devices, and networks in computing environment 800.
  • Network 850 can be any network or combination of networks (e.g., the Internet, local area network, wide area network, a telephonic network, a cellular network, satellite network, and the like).
  • Computing device 805 can use and/or communicate using computer-usable or computer-readable media, including transitory media and non-transitory media.
  • Transitory media include transmission media (e.g., metal cables, fiber optics), signals, carrier waves, and the like.
  • Non-transitory media include magnetic media (e.g., disks and tapes), optical media (e.g., CD ROM, digital video disks, Blu-ray disks), solid state media (e.g., RAM, ROM, flash memory, solid-state storage), and other non-volatile storage or memory.
  • Computing device 805 can be used to implement techniques, methods, applications, processes, or computer-executable instructions in some example computing environments.
  • Computer-executable instructions can be retrieved from transitory media, and stored on and retrieved from non-transitory media.
• The executable instructions can originate from one or more of any programming, scripting, and machine languages (e.g., C, C++, C#, Java, Visual Basic, Python, Perl, JavaScript, and others).
  • Processor(s) 810 can execute under any operating system (OS) (not shown), in a native or virtual environment.
  • One or more applications can be deployed that include logic unit 860, application programming interface (API) unit 865, input unit 870, output unit 875, search unit 880, filter unit 885, verification and validation unit 890, action unit 897, and inter-unit communication mechanism 895 for the different units to communicate with each other, with the OS, and with other applications (not shown).
• Search unit 880, filter unit 885, verification and validation unit 890, and action unit 897 may implement one or more processes shown in FIGs. X and Y.
• The described units and elements can be varied in design, function, configuration, or implementation and are not limited to the descriptions provided.
• When information or an execution instruction is received by API unit 865, it may be communicated to one or more other units (e.g., logic unit 860, input unit 870, output unit 875, search unit 880, filter unit 885, and verification and validation unit 890).
• For example, input unit 870 may use API unit 865 to provide the received information to search unit 880.
  • Search unit 880 may, via API unit 865, interact with the filter unit 885 to apply the filter based on the rules, and to score the candidates.
• Filter unit 885 may interact with the verification and validation unit 890 to verify and validate that the sensitive user data is associated with the user.
• Action unit 897 may interact with the verification and validation unit 890 to take or recommend an action.
• Logic unit 860 may be configured to control the information flow among the units and direct the services provided by API unit 865, input unit 870, output unit 875, search unit 880, filter unit 885, verification and validation unit 890, and action unit 897 in some example implementations described above.
• The flow of one or more processes or implementations may be controlled by logic unit 860 alone or in conjunction with API unit 865.
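The flow among the units described above can be sketched as follows. The class and method names mirror the described units (search 880, filter 885, verification and validation 890, action 897, logic 860) but are illustrative assumptions, and the toy search, filter, and validation rules stand in for the real logic:

```python
class SearchUnit:
    def search(self, index):
        # Return raw candidates matching the prescribed data format.
        return [c for c in index if c.isdigit()]

class FilterUnit:
    def apply(self, candidates):
        # Apply rules and keep candidates that meet the criteria.
        return [c for c in candidates if len(c) == 16]

class VerificationValidationUnit:
    def __init__(self, verified):
        self.verified = set(verified)
    def verify_and_validate(self, candidates):
        # Compare filtered candidates against verified information.
        return [c for c in candidates if c in self.verified]

class ActionUnit:
    def act(self, validated):
        # E.g., notify the owner or issuer, or deny a transaction.
        return [f"notify owner about {c[-4:]}" for c in validated]

class LogicUnit:
    """Controls the information flow among the units, as logic unit 860 does."""
    def run(self, index, verified):
        candidates = SearchUnit().search(index)
        filtered = FilterUnit().apply(candidates)
        validated = VerificationValidationUnit(verified).verify_and_validate(filtered)
        return ActionUnit().act(validated)

result = LogicUnit().run(
    index=["hello", "4111111111111111", "123"],
    verified=["4111111111111111"],
)
print(result)  # ['notify owner about 1111']
```

In the described design, the units would exchange these calls through API unit 865 rather than invoking one another directly.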
• A component may be a stand-alone software package, or it may be a software package incorporated as a "tool" in a larger software product. It may be downloadable from a network, for example, a website, as a stand-alone product or as an add-in package for installation in an existing software application. It may also be available as a client-server software application, as a web-enabled software application, and/or as a mobile application.
• The users may be provided with an opportunity to control whether programs or features collect user information (e.g., information about a user's social network, social actions or activities, profession, a user's preferences, or a user's current location), or to control whether and/or how to receive content from the content server that may be more relevant to the user.
• Certain data may be treated in one or more ways before it is stored or used, so that personally identifiable information is removed.
• A user's identity may be treated so that no personally identifiable information can be determined for the user, or a user's geographic location may be generalized where location information is obtained (such as to a city, ZIP code, or state level), so that a particular location of a user cannot be determined.
• The user may have control over how information is collected about the user and used by a content server or the like.
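The data treatment described above, removing personally identifiable information and generalizing precise locations before storage, can be sketched as follows. The record field names and the city-level generalization rule are illustrative assumptions:

```python
import hashlib

def anonymize(record: dict) -> dict:
    """Treat a record before storage: replace the identity with an
    opaque token and generalize the location to city level."""
    treated = dict(record)
    # No personally identifiable information can be determined from the token.
    treated["user_id"] = hashlib.sha256(record["user_id"].encode()).hexdigest()[:12]
    treated.pop("name", None)
    # Generalize "street, city, region" to just the city.
    treated["location"] = record["location"].split(",")[-2].strip()
    return treated

record = {
    "user_id": "alice@example.com",
    "name": "Alice",
    "location": "1600 Amphitheatre Pkwy, Mountain View, CA",
}
print(anonymize(record))
```

A ZIP-code or state-level rule would follow the same shape, trading precision for stronger generalization.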

Abstract

Systems and methods for detecting sensitive user data on the Internet are provided, comprising: performing an online search of an index of online sites for information having a prescribed data format, and generating a list of one or more candidates having the prescribed data format; applying a filter to the one or more candidates having the prescribed data format based on at least one of a plurality of rules, and determining whether each of the filtered candidates meets criteria indicative of compromised information; verifying the one or more filtered candidates that meet the criteria based on the determination, by comparison against verified information associated with the filtered candidates, to generate a list of one or more verified candidates; validating the one or more verified candidates based on identifying data related to the verified information, to generate a list of one or more validated candidates; and taking an action.
PCT/US2015/042216 2014-08-18 2015-07-27 Systems and methods for detecting sensitive user data on the Internet WO2016028442A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201414462354A 2014-08-18 2014-08-18
US14/462,354 2014-08-18

Publications (1)

Publication Number Publication Date
WO2016028442A1 true WO2016028442A1 (fr) 2016-02-25

Family

ID=53777039

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2015/042216 WO2016028442A1 (fr) 2014-08-18 2015-07-27 Systems and methods for detecting sensitive user data on the Internet

Country Status (1)

Country Link
WO (1) WO2016028442A1 (fr)


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060100990A1 (en) * 2004-10-21 2006-05-11 Aaron Jeffrey A Methods, systems, and computer program products for discreetly monitoring a communications network for sensitive information
US8407766B1 (en) * 2008-03-24 2013-03-26 Symantec Corporation Method and apparatus for monitoring sensitive data on a computer network
US8108370B1 (en) * 2008-03-28 2012-01-31 Symantec Corporation High-accuracy confidential data detection

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017196967A1 (fr) 2016-05-10 2017-11-16 Allstate Insurance Company Presence monitoring and assessment for cybersecurity
EP3455777A4 (fr) * 2016-05-10 2019-10-23 Allstate Insurance Company Surveillance et évaluation de présence pour la cybersécurité
US11019080B2 (en) 2016-05-10 2021-05-25 Allstate Insurance Company Digital safety and account discovery
US11539723B2 (en) 2016-05-10 2022-12-27 Allstate Insurance Company Digital safety and account discovery
US11606371B2 (en) 2016-05-10 2023-03-14 Allstate Insurance Company Digital safety and account discovery
US11895131B2 (en) 2016-05-10 2024-02-06 Allstate Insurance Company Digital safety and account discovery
WO2018122049A1 (fr) * 2016-12-30 2018-07-05 British Telecommunications Public Limited Company Personal data breach detection
US11582248B2 (en) 2016-12-30 2023-02-14 British Telecommunications Public Limited Company Data breach protection
US11611570B2 (en) 2016-12-30 2023-03-21 British Telecommunications Public Limited Company Attack signature generation
US11658996B2 (en) 2016-12-30 2023-05-23 British Telecommunications Public Limited Company Historic data breach detection
WO2022165510A1 (fr) * 2021-01-29 2022-08-04 Capital One Services, Llc Systems and methods for data breach detection using virtual card numbers
US11694205B2 (en) 2021-01-29 2023-07-04 Capital One Services, Llc Systems and methods for data breach detection using virtual card numbers

Similar Documents

Publication Publication Date Title
WO2016028442A1 (fr) Systems and methods for detecting sensitive user data on the Internet
US20230269243A1 (en) Browser extension for limited-use secure token payment
US11822694B2 (en) Identity breach notification and remediation
JP6609047B2 (ja) アプリケーション情報リスクマネジメントのための方法及びデバイス
US11671448B2 (en) Phishing detection using uniform resource locators
CN106789939B (zh) 一种钓鱼网站检测方法和装置
US11381598B2 (en) Phishing detection using certificates associated with uniform resource locators
CN109257321B (zh) 安全登录方法和装置
CN112313653A (zh) 隐藏文本中的敏感信息
CN112651841B (zh) 线上业务办理方法、装置、服务器及计算机可读存储介质
US11593517B1 (en) Systems and methods for a virtual fraud sandbox
CN110765451B (zh) 风险识别方法和装置、电子设备
US9124623B1 (en) Systems and methods for detecting scam campaigns
US20200120113A1 (en) Automated detection of phishing campaigns via social media
CN110351672B (zh) 信息推送方法、装置及电子设备
US20210192523A1 (en) Techniques to improve fraud detection at financial terminals
US20140351902A1 (en) Apparatus for verifying web site and method therefor
Krupp et al. An analysis of web tracking domains in mobile applications
CN112836612B (zh) 一种用户实名认证的方法、装置及系统
Rai et al. Security and Auditing of Smart Devices: Managing Proliferation of Confidential Data on Corporate and BYOD Devices
US20210203691A1 (en) Malware and phishing detection and mediation platform
KR101192803B1 (ko) 사용자 단말의 기기 정보 인증을 통한 개인정보 제공 서비스 방법, 장치 및 시스템
CN111767544A (zh) 多频重放攻击漏洞确定方法、装置、设备及可读存储介质
CN114245889A (zh) 用于基于行为生物测定数据认证交易的系统、方法和计算机程序产品
US9485242B2 (en) Endpoint security screening

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15745740

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15745740

Country of ref document: EP

Kind code of ref document: A1