GB2542369A - Apparatus and method for connection-based anomaly detection

Publication number
GB2542369A
Authority
GB
United Kingdom
Prior art keywords
network
entities
sub
domain
behaviour pattern
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
GB1516426.2A
Other versions
GB201516426D0 (en
Inventor
Hu Bo
Mendes Rodrigues Eduarda
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujitsu Ltd
Priority to GB1516426.2A
Publication of GB201516426D0
Publication of GB2542369A
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04 Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Marketing (AREA)
  • Tourism & Hospitality (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Educational Administration (AREA)
  • Game Theory and Decision Science (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Technology Law (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

An anomaly detection method, for detecting anomalous behaviour associated with a particular element of a domain of interest with which domain human beings interact, comprises: constructing in respect of a set of entities, comprising human beings and/or human institutions which are associated or potentially associated with the domain, a network representing inter-entity and intra-entity connections; performing statistical analysis of a population, including the human beings or human institutions in the set of entities, on the basis of known behavioural characteristic data, to determine a typical behaviour pattern for the population in respect of the particular element on which the anomaly detection method is focused; extracting from the constructed connection network an ego-centric sub-network (social network) of entities associated with the focal element; and correlating an actual behaviour pattern of the entities in the sub-network with the determined typical behaviour pattern to find behavioural abnormalities in the actual behaviour pattern. The method further comprises providing an output identifying one or more abnormalities in the actual behaviour pattern of the entities in the sub-network found from such correlation. In particular, the method is applied to detecting anomalous behaviour in stocks and trading networks.

Description

Title of the Invention
APPARATUS AND METHOD FOR CONNECTION-BASED ANOMALY DETECTION
Background of the Invention
The present invention relates to an apparatus and a method for connection-based anomaly detection in a domain with which human beings interact.
In many domains in which human beings, either as individuals or as part of an organisation, play a part, the behaviour of some human beings may not conform to the norm, i.e. may be anomalous, and in some cases this can be a cause of concern, for example because it is indicative of illegal activity. Detection of such anomalies has come to rely increasingly on data from multiple sources, portraying different aspects of a subject in a domain of interest. Such data are aggregated to reveal abnormal data patterns that seem worthy of further investigation.
However, it is desirable to improve anomaly detection, for example by making it easier to detect a greater number of abnormal data patterns and/or to detect abnormal data patterns more quickly.
Brief Summary of the Invention
According to an embodiment of a first aspect of the present invention there is provided an anomaly detection method for detecting anomalous behaviour associated with a particular element of a domain of interest with which domain human beings interact, the method comprising: constructing in respect of a set of entities, comprising human beings and/or human institutions which are associated or potentially associated with the domain, a network representing inter-entity and intra-entity connections; performing statistical analysis of a population, including the human beings or human institutions in the set of entities, on the basis of known behavioural characteristic data, to determine a typical behaviour pattern for the population in respect of the particular element on which the anomaly detection method is focussed; extracting from the constructed connection network an ego-centric sub-network of entities associated with the focal element; correlating an actual behaviour pattern of the entities in the sub-network with the determined typical behaviour pattern to find behavioural abnormalities in the actual behaviour pattern; and providing an output identifying one or more abnormalities in the actual behaviour pattern of the entities in the sub-network found from such correlation.
According to an embodiment of a second aspect of the present invention there is provided anomaly detection apparatus configured to detect anomalous behaviour associated with a particular element of a domain of interest with which domain human beings interact, which apparatus comprises: network construction means configured to construct, in respect of a set of entities comprising human beings and/or human institutions which are associated or potentially associated with the domain, a network representing inter-entity and intra-entity connections; analysis means configured to perform statistical analysis of a population, including the human beings or human institutions in the set of entities, on the basis of known behavioural characteristic data, to determine a typical behaviour pattern for the population in respect of the particular element on which the anomaly detection method is focussed; sub-network extraction means configured to extract from the constructed connection network an ego-centric sub-network of entities associated with the focal element; correlation means configured to correlate an actual behaviour pattern of the entities in the sub-network with the determined typical behaviour pattern to find behavioural abnormalities in the actual behaviour pattern; and abnormality outputting means configured to provide an output identifying one or more abnormalities in the actual behaviour pattern of the entities in the sub-network found from such correlation.
According to an embodiment of a third aspect of the present invention there is provided a computer program which, when run on a computer, causes that computer to carry out a method embodying the first aspect of the present invention.
The element may for example be one of: a person or group of people sharing one or more common characteristics; an institution or group of related institutions; an item or system with which a person or institution is potentially associated. Connections may for example comprise one or more of: personal relationships; social acquaintanceships; business and/or legal interactions; legal associations. More than one data source may be used in order to construct the connection network.
Brief Description of the Drawings
Figure 1A is a flowchart illustrating an anomaly detection method embodying the present invention;
Figure 1B is a diagram illustrating system architecture suitable for carrying out the method of Figure 1A;
Figure 2 is a flowchart illustrating a method embodying the present invention;
Figure 3 is a flowchart illustrating a first sub-process of the method of Figure 2;
Figure 4 is a flowchart illustrating a second sub-process of the method of Figure 2;
Figure 5 is a flowchart illustrating a third sub-process of the method of Figure 2; and
Figure 6 is a block diagram of a computing device suitable for carrying out a method embodying the present invention.
Detailed Description of the Invention
The following patterns can be useful when one tries to differentiate “normal” and “abnormal” patterns in a network of intertwined entities:
1. Discrepancies between individual behaviours and population norms
2. Discrepancies between behaviours in a specific time window and the cross-time average (i.e. normal behaviour observed over a relatively long period of time)
Such discrepancy patterns can be exploited to detect abnormal behaviour, e.g. fraudulent activities, in various domains.
An embodiment of the present invention aims to enrich information used conventionally to detect abnormal data patterns with social network data for better understanding of the interconnections among different anomalous patterns based on the above two categories of discrepancies.
In this regard, a graphical representation of personal, social and business interactions among companies, legal entities, and people who may or may not be explicitly involved in trading can provide a good insight into the relationships among the involved entities. When combined with other relevant data, such relationships can reveal anomalies that are not easily detected with the other data alone. In this document, a method is proposed in which statistics-based patterns are projected over a social network graph model (e.g. an ego-centric network of traders and/or a public company) to correlate abnormal behaviours.
An embodiment of an anomaly detection method for detecting anomalous behaviour associated with a particular element of a domain of interest with which domain human beings interact is shown in the flowchart of Figure 1A. The element may, for example, be one of: a person or group of people sharing one or more common characteristics; an institution or group of related institutions; an item or system with which a person or institution is potentially associated. In Step 1 a network representing inter-entity and intra-entity connections is constructed in respect of a set of entities, comprising human beings and/or human institutions which are associated or potentially associated with the domain. The connections may for example comprise one or more of: personal relationships; social acquaintanceships; business and/or legal interactions; legal associations. It is desirable to use more than one data source in order to construct the connection network. In Step 2 statistical analysis of a population, including the human beings or human institutions in the set of entities, is performed on the basis of known behavioural characteristic data, to determine a typical behaviour pattern for the population in respect of the particular element on which the anomaly detection method is focussed. In Step 3 an ego-centric sub-network of entities associated with the focal element is extracted from the constructed social connection network. In Step 4 an actual behaviour pattern of the entities in the sub-network is correlated with the determined typical behaviour pattern to find behavioural abnormalities in the actual behaviour pattern. In Step 5 an output identifying one or more abnormalities in the actual behaviour pattern of the entities in the sub-network found from such correlation is provided.
Architecture of a system suitable for carrying out a method embodying the present invention is shown in Figure 1B. Key components of the system are:
Connection network construction module 1 (network construction means) - this module is operable to take as input external data sources and build a social connection network of entities identified as having an association or potential association with a particular element of a domain of interest. The entities in such a social connection network may, for example, comprise individual persons (e.g. key persons involved in stock trading activities, where the domain is stock trading), institutions such as companies or other organisations, etc.;
Population pattern detection module 2 (analysis means) - this module is operable to perform statistical analysis to determine the typical behaviour patterns of a population, including the persons and/or institutions in the social connection network, either as a whole or in a stratified manner;
Sub-net extraction module 3 (sub-network extraction means) - this module is operable to produce an ego-centric network of a focal entity, the focal entity being, for example, a particular individual or group of individuals sharing the same characteristics, a particular company (or a subsidiary), or a particular item or system with which an individual or institution is or may be associated (such as, when detecting insider trading, a particular stock in which some “abnormalities” are suspected);
Matching module 4 (correlation means) - this module is operable to correlate (compare) the behaviour pattern of the population and the behaviour pattern of the focal network to discover any discrepancies, i.e. abnormal behaviour, in the focal network which is worthy of further investigation; and
Abnormality output module 5 (abnormality output means) - this module provides an output identifying one or more abnormalities identified by the matching module.
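The way the five modules hand data to one another can be illustrated with a minimal sketch. It is not the implementation of the described system: all function names are hypothetical, the "typical behaviour pattern" is reduced to a mean and standard deviation, and the matching step uses a simple z-score test chosen only for illustration.

```python
from statistics import mean, pstdev


def construct_network(edges):
    """Module 1: build an undirected adjacency map from (a, b) pairs."""
    net = {}
    for a, b in edges:
        net.setdefault(a, set()).add(b)
        net.setdefault(b, set()).add(a)
    return net


def population_pattern(activity):
    """Module 2: summarise the population by mean and standard deviation
    (an assumed, deliberately simple stand-in for a behaviour pattern)."""
    values = list(activity.values())
    return mean(values), pstdev(values)


def extract_subnet(net, focal):
    """Module 3: the focal element plus its direct neighbours."""
    return {focal} | net.get(focal, set())


def match(subnet, activity, pattern, z_limit=2.0):
    """Module 4: flag sub-network members far from the population mean.
    z_limit is a hypothetical threshold."""
    mu, sigma = pattern
    if sigma == 0:
        return []
    return sorted(
        e for e in subnet
        if e in activity and abs(activity[e] - mu) / sigma > z_limit
    )


def report(abnormal):
    """Module 5: produce a human-readable output."""
    return [f"{e}: worth further investigation" for e in abnormal]
```

Each function corresponds to one module, so any of them could be replaced by a richer technique without changing the overall flow.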
An example of how a connection network might be constructed by the network construction module will now be described.
Firstly, data relating to the domain of interest are gathered and examined to identify one or more individuals and/or institutions associated with the domain in a general or specific way. In the domain of stock market trading, for example, stock market trading data are, by law, publicly available for major markets or through commercial data aggregation services. For instance, trading data can be obtained directly from market regulators and exchanges, e.g. the US Securities and Exchange Commission for NYSE data or the API of the Shanghai Stock Exchange, as well as from finance portal sites, e.g. Yahoo! Finance and Sina Finance. Such databases are normally structured or semi-structured to facilitate information extraction. Details in such trading data include, for example: trading volume, trading type (share or option), trading price, etc. Reporting of trades by insiders is mandated via SEC Form 4 for all the companies listed on the NYSE.
Using the identified individual(s) and/or institution(s) as a basis, one or more connection networks may then be constructed. One suitable type of connection network is a person-person social network, which as discussed further below may be constructed using open data from social media sites based on public APIs or using existing data-crawling and data-extraction tools. Another suitable type of connection network is a person/resource-organisation network, which as also discussed further below may be constructed through information extraction from data in the public domain, official/government registries, and regulatory bodies. For unstructured data sources, relationships may be extracted using conventional natural language processing methods.
Nowadays, it is hard to keep social relationships off the Internet entirely. Traces can be discovered from social media sites such as LinkedIn.com, Xing.com, Sina’s weibo.com, Facebook.com and Twitter.com. How to construct a social network graph based on “who knows whom” or “who interacts with whom” has been studied extensively. More specifically, inter-person relationships can be extracted from explicit networks or user-generated content in various social network sites (e.g. by detecting the “following/@” activities in twitter.com and weibo.com; by following the comment thread of a particular discussion group; or by analysing images of posted photos). Corporate hierarchies can be composed from open data sources, such as company registration, DBpedia, etc. Person-organisation connections can be extracted from news stories, corporate announcements/press releases, corporate financial reports, etc.
In this proposal, it is assumed that such a network can be constructed from one or more public data sources using existing techniques and tools. Overlapping different data sources together in a single network may provide the following benefits:
1. Revealing connections that may not be present in a single source
2. Accurately quantifying the strength of connections among individuals
3. Validating the connections through redundancy identified in the information extracted from the multiple data sources.
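As an illustration of overlapping several data sources, the sketch below merges edge lists into one weighted network and treats an edge as validated when it appears in at least two sources. The source names, the additive strength rule and the `min_sources` cut-off are assumptions made for the example, not part of the described method.

```python
from collections import defaultdict


def merge_sources(sources):
    """Combine edge lists from several sources into one weighted network.

    sources: dict mapping a source name to an iterable of
             (entity_a, entity_b, strength) tuples.
    Returns a dict keyed by the undirected pair (a frozenset).
    """
    merged = defaultdict(lambda: {"strength": 0.0, "sources": set()})
    for name, edges in sources.items():
        for a, b, strength in edges:
            key = frozenset((a, b))          # undirected connection
            merged[key]["strength"] += strength
            merged[key]["sources"].add(name)
    return dict(merged)


def validated(merged, min_sources=2):
    """Connections confirmed by redundancy across sources."""
    return {k for k, v in merged.items() if len(v["sources"]) >= min_sources}


# Hypothetical example data
sources = {
    "social_media": [("alice", "bob", 1.0)],
    "company_registry": [("bob", "alice", 0.5), ("bob", "acme_ltd", 1.0)],
}
network = merge_sources(sources)
```

Here the alice-bob connection is seen in both sources, so its strength accumulates and it counts as validated, while bob-acme_ltd is revealed by the registry alone.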
Standard social network analysis (SNA) techniques may be applied to:
1. Define the scope of an entity’s ego-centric network, for example discovering with whom a focal individual is connected, and explicating the connections among those individuals (e.g. triangle closures)
2. Define the tie strength among individuals to signify the relevance of the connections among different people
3. Define the strength of the connections along the shortest path that connects two seemingly distant entities.
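A minimal sketch of the first item, assuming the network is held as a symmetric adjacency map: a breadth-first search collects everyone within a chosen number of hops of the focal individual, and a second helper lists pairs of the focal individual's contacts who also know each other (closed triangles). The function names and the `radius` parameter are hypothetical.

```python
from collections import deque


def ego_network(adj, focal, radius=1):
    """Return (nodes, edges) within `radius` hops of the focal node."""
    dist = {focal: 0}
    queue = deque([focal])
    while queue:
        node = queue.popleft()
        if dist[node] == radius:
            continue                      # do not expand beyond the radius
        for nb in adj.get(node, ()):
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    nodes = set(dist)
    edges = {
        frozenset((u, v))
        for u in nodes
        for v in adj.get(u, ())
        if v in nodes
    }
    return nodes, edges


def closed_triangles(adj, focal):
    """Pairs of the focal node's neighbours that are connected to each other."""
    nbs = sorted(adj.get(focal, ()))
    return [
        (u, v)
        for i, u in enumerate(nbs)
        for v in nbs[i + 1:]
        if v in adj.get(u, ())
    ]
```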
The weights of links can be propagated and refined using graph metrics such as “edge betweenness” (Girvan-Newman algorithm).
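Edge betweenness counts, for each link, how many shortest paths between pairs of entities pass through it. The sketch below computes that metric only, using a Brandes-style accumulation for an unweighted, undirected graph; it does not perform the full Girvan-Newman community detection, and how the resulting scores would be folded back into link weights is left open. It assumes every node appears as a key of a symmetric adjacency map.

```python
from collections import deque


def edge_betweenness(adj):
    """Unnormalised edge betweenness for an unweighted, undirected graph."""
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma)
        dist, sigma, preds = {s: 0}, {s: 1.0}, {s: []}
        order, queue = [], deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    sigma[w] = 0.0
                    preds[w] = []
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back towards s
        delta = {v: 0.0 for v in order}
        for w in reversed(order):
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += c
                delta[v] += c
    # Every unordered pair was visited from both of its endpoints
    return {e: b / 2.0 for e, b in score.items()}
```

In a network made of two tightly knit groups joined by a single link, that link receives the highest score, which is why the metric is useful for spotting connections that bridge otherwise separate groups.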
The social connection network so constructed may then be stratified in order to arrive at a general sub-network, or an ego-centric network focusing on one or multiple subjects (elements) of the domain of interest. Stratification is carried out to categorise entities in the social connection network into different sub-groups based on criteria such as gender, age, ethnic background, education, economic background, jobs, etc. (for individuals) or industrial sector, revenue, etc. (for companies). Such stratification helps to reveal patterns by increasing data homogeneity. The assumption is that people belonging to the same stratum tend to behave similarly (or at least more similarly in comparison to those from other strata) owing to similar education and social backgrounds; a similar assumption is made for institutions in the same stratum.
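Stratification of this kind amounts to grouping entities by a tuple of attribute values. A minimal sketch follows, in which the attribute names and sample entities are invented for illustration:

```python
from collections import defaultdict


def stratify(entities, keys):
    """Group entity names by the values of the chosen attributes.

    entities: dict mapping an entity name to a dict of attributes.
    keys:     attribute names that define a stratum.
    """
    strata = defaultdict(list)
    for name, attrs in entities.items():
        stratum = tuple(attrs.get(k) for k in keys)
        strata[stratum].append(name)
    return dict(strata)


# Hypothetical individuals, stratified by job and education
people = {
    "ann": {"job": "broker", "education": "degree"},
    "bob": {"job": "broker", "education": "degree"},
    "cat": {"job": "teacher", "education": "degree"},
}
groups = stratify(people, ["job", "education"])
```

Population statistics can then be computed per stratum rather than over the whole population.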
In the proposed method, statistical behaviour patterns are laid over social networks. In general, this technique can be applied to many areas where two such types of data can work in complement with one another. For example, fraudulent benefit claims detection can be enhanced when social patterns are used. This approach can also be used in domains other than forensic accounting. For instance, the combination of these two types of data can be used to provide targeted advertisement.
An embodiment of the present invention will be described in the context of investigating possible insider trading on the stock market. Although the present invention is certainly not limited to this application, it is particularly suited to this field owing to the data availability and the distinctive characteristics of this domain. Some unique characteristics of the domain include: (1) illegal insider trading is normally associated with well-defined points in time, e.g. acquisition or merger announcements, new product launch, press release, release of financial reports, etc. that offer investigators a well-defined starting point; and (2) the population norm has been well studied by economists and finance analysts, which helps to identify abnormal behaviours in this domain more easily.
Detecting fraudulent insider trading is a challenging task. The current practice is mostly retrospective and time consuming. Regulatory bodies normally observe stock and options trading patterns and correlate such patterns with major events to detect potential fraudulent insider trading activities. For instance, the SONAR system at FINRA monitors news and trading behaviour to highlight anomalies in the stock market. However, in many cases stock trading data alone are not sufficient to pin down fraudulent behaviours.
Insider trading is not easy to predict, but the proposed method can better facilitate the investigation of potential fraud cases. If such discrepancies can be satisfactorily identified ahead of the announcement of major events, investigative resources can then be directed to such highlighted activities to avoid wasting time and effort.
Stock market trading behaviour is influenced by different factors, the majority of which are perfectly legitimate. Common trading behaviours can be established as distribution models, for example:
1. For a particular stock, establish the distribution model of buy/sell across different strata of the network population
2. For a particular individual of concern, establish the distribution model of buy/sell across different shares
Possible anomalous behaviour can be detected and quantified on the assumption that a stratum of the population will demonstrate similar trading behaviours. In this embodiment a power law distribution is assumed to be a reasonably accurate model of population-based behaviour. The reasoning for this assumption is that few traders exhibit a high volume of trading activity, while the majority of traders exhibit a very low volume of trading activity. In fact, the power law is a widely observed phenomenon in economics. Simply speaking, it presents a relation of $Y = kX^{a}$, where $Y$ and $X$ are variables of interest, $a$ is called the power law exponent, and $k$ is a constant for fine tuning. Many economic activities take the form of power laws, in particular the distribution of income, wealth, size of cities and firms, and the distribution of financial variables such as returns and trading.
In the insider trading use case, with sufficient data one should be able to compute $k$ and $a$, and $\varepsilon$ for the deviation factors, in $Y = kX^{a} + \varepsilon$. Such parameters can be computed over a fixed period of time ($w$) defined by the end users of the method.
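One simple way to estimate the parameters is to note that a power law becomes a straight line after taking logarithms, $\log Y = \log k + a \log X$, and to fit that line by ordinary least squares. The sketch below does this and returns the residuals as a stand-in for the deviation term. This estimator is an assumption chosen for brevity; the text does not prescribe a fitting method, and other estimators may be preferable on real trading data.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares fit of log Y = log k + a * log X.

    Returns (k, a, residuals), where residuals[i] = ys[i] - k * xs[i] ** a.
    All xs and ys must be strictly positive.
    """
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x, mean_y = sum(lx) / n, sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    a = sxy / sxx                          # slope = power law exponent
    k = math.exp(mean_y - a * mean_x)      # intercept gives the constant k
    residuals = [y - k * x ** a for x, y in zip(xs, ys)]
    return k, a, residuals


# Synthetic data generated from Y = 3 * X^(-1.5), for illustration only
xs = [1, 2, 4, 8, 16]
ys = [3.0 * x ** -1.5 for x in xs]
k, a, residuals = fit_power_law(xs, ys)
```

Restricting `xs` and `ys` to the observations inside the chosen time window gives the per-window parameters described above.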
Computing trading behaviour against each stratum is sometimes not feasible, owing to the sheer size of the data to be processed in real time. In this case some heuristic knowledge is helpful to improve the system performance, such as:
1. Grouping by profession: individuals involved in the trading of stocks can be grouped by their professions. In general, one can differentiate professional stock brokers and the general public.
2. Grouping by industrial sector: trading activities can be grouped against industries. The assumption is that trading behaviours of stocks may show smaller differences in k, a, and ε values within an industrial sector than across industrial sectors.
For the identified strata, the distribution of trading activities can be modelled.
Based on existing research (e.g. Xavier Gabaix, Power Laws in Economics and Finance, 2009, Annual Review of Economics), the power law of the stock market is determined to be characterised by a = -1.5 for trading volume and a = -3.4 for number of trades. The value of a for a specific stock market may deviate slightly from these values. For an industry-wide distribution, two different trading pattern models will be constructed, namely $M_P$ for purchasing and $M_S$ for selling. This is further defined as follows:
$M_P$ for purchasing volume and $M_S$ for selling volume, each following the established power law distributions. The distribution can be normalised against the median volume of the stocks. The advantage of splitting purchasing and selling activities is that fraudulent traders may only acquire or dispose of shares abnormally and defer the corresponding liquidation activities beyond the observed time windows.
Apart from the above established parameters, models for a stratified population can be learnt using existing fitting algorithms and tools/libraries, provided that sufficient data can be acquired.
Based on above established normal population behaviour patterns, outlier behaviour patterns can be discovered by comparing the normal behaviour pattern with the behaviour pattern of the focal network. An example will now be described with reference to identifying an abnormal distribution of a stock’s trading volume. A widely-used measure to compute the discrepancy of two distributions is the Kullback-Leibler divergence measure. This measure can be applied to determine the overall deviation from the normal power-law distribution with respect to a pair of variables. Variables following the power-law relation can be normalised as follows:
The normalised distributions can then be compared in terms of the Kullback-Leibler divergence for discovering the disagreement between two distributions. The assumption is that when a behaviour (for example, characterised by a stock trading activity distribution) deviates too much away from the “norm”, it is likely that this deviation is worthy of further investigation. In the present example let
be the trading pattern of stock st within a given period of time Δt,
where X is either P for purchasing or S for selling.
When the divergence is sufficiently large (e.g. greater than a threshold), the focal stock can be flagged as “worth further investigation”. The identified stocks are high risk ones, denoted as Ηχο<:Μ.
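The comparison can be sketched with the standard definition of the Kullback-Leibler divergence, $D_{KL}(P \| Q) = \sum_i p_i \ln(p_i / q_i)$, applied to binned trading-volume counts that are first normalised to sum to one. The binning, the function names, the small constant guarding against empty bins and the threshold value are all assumptions made for the example.

```python
import math


def normalise(counts):
    """Turn non-negative counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q, eps=1e-12):
    """D_KL(P || Q) in nats; eps guards against empty bins in q."""
    return sum(
        pi * math.log(pi / max(qi, eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def flag_stocks(stock_hists, norm_hist, threshold):
    """Return stocks whose distribution diverges from the norm by more
    than `threshold` (a hypothetical, user-chosen value)."""
    q = normalise(norm_hist)
    return sorted(
        stock
        for stock, hist in stock_hists.items()
        if kl_divergence(normalise(hist), q) > threshold
    )


# Hypothetical binned volumes: [low, medium, high]
norm = [70, 20, 10]
stocks = {"AAA": [70, 20, 10], "BBB": [10, 20, 70]}
high_risk = flag_stocks(stocks, norm, threshold=0.5)
```

Running the comparison separately on purchasing and selling volumes keeps the two models distinct.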
From the above measure, the search can be narrowed down to a selective set of stocks. Graph algorithms can be applied to the previously-constructed social connection network to explicate the connections among the people involved in the trading of the identified stock. Note that the individuals involved in the trading of a focal stock over a given period of time can be considered as a trading network.
a. For each pair of individuals involved in trading a particular stock, compute the disagreement/agreement of their trading activities against stock st and the entire market, denoted as $D_{st}$ and $D_M$ respectively. Such discrepancies can be computed as follows:
where either $x(t) \neq 0$ or $y(t) \neq 0$ are the trading activities in terms of volume at a particular time point for different individuals (i.e. x and y).
Alternatively, when the trading volume is not significant, and the fact of whether or not different individuals are trading at the same time and in the same stock is deemed more significant, the kappa coefficient can be used. Kappa measures the (dis)agreement of two judgements as follows:
In both cases, the disagreement of the trading behaviours of two individuals with respect to a specific stock can be computed.
Basically, for a given time period Δt, this distance gives an indication of the degree to which the individuals are acting in accord.
$D_M$ is the average of the discrepancies of two individuals across multiple stocks within the given period of time. $D_M$ provides a good reference for correlated behaviours of the focal individuals.
b. Using the results of step (a), a trading affinity graph/network can be constructed with an edge between each pair of individuals whose trading behaviours highly agree with each other. $-\ln D_X$ can be the weight of the corresponding edge.
c. The trading affinity graph and the social connection network can be correlated to reveal sub-network(s) that are correlated in both graphs.
d. The assumption is that such connections can help to reveal whether the focal individuals are also socially connected. The following patterns may highlight cases that are worthy of further investigation:
1. Highly correlated trading patterns from socially disconnected individuals
2. Highly correlated trading patterns on the focal stock among tightly socially connected individuals
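Steps (a) to (d) can be sketched end to end for the case where only the timing of trades matters. The example uses the standard Cohen's kappa, $\kappa = (p_o - p_e)/(1 - p_e)$, on 0/1 "traded in this time slot" sequences, maps it to a distance D in [0, 1] by $(1 - \kappa)/2$, and weights affinity edges by $-\ln D$. The mapping from kappa to D, the `max_d` cut-off and all names are assumptions made for illustration; the volume-based discrepancy is not reproduced here.

```python
import math
from itertools import combinations


def cohen_kappa(x, y):
    """Cohen's kappa for two equal-length 0/1 sequences."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("sequences must be non-empty and of equal length")
    observed = sum(1 for a, b in zip(x, y) if a == b) / n
    px, py = sum(x) / n, sum(y) / n
    expected = px * py + (1 - px) * (1 - py)
    if expected == 1.0:
        return 1.0   # convention: both sequences constant and identical
    return (observed - expected) / (1 - expected)


def disagreement(x, y):
    """Hypothetical mapping of kappa in [-1, 1] to a distance in [0, 1]."""
    return (1.0 - cohen_kappa(x, y)) / 2.0


def affinity_graph(trades, max_d=0.25, eps=1e-9):
    """Edge between each pair with disagreement <= max_d, weight -ln D."""
    graph = {}
    for u, v in combinations(sorted(trades), 2):
        d = disagreement(trades[u], trades[v])
        if d <= max_d:
            graph[(u, v)] = -math.log(max(d, eps))  # eps avoids log(0)
    return graph


def split_by_social_tie(graph, social_adj):
    """Separate affinity edges into socially connected and disconnected."""
    connected, disconnected = [], []
    for u, v in graph:
        target = connected if v in social_adj.get(u, ()) else disconnected
        target.append((u, v))
    return connected, disconnected
```

The two lists returned by `split_by_social_tie` correspond to the two patterns listed above: agreeing traders with a known social tie, and agreeing traders with none.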
The highlighted entities/individuals can then be subject to further investigation to confirm whether there has been illegal insider trading.
The basic concept of the above-described fraud detection approach can be summarised as follows:
1. Identify candidate stocks that demonstrate abnormal patterns suggesting that they are involved in “abnormal” trading activities
2. For each stock, find individuals who tend to trade in agreement with each other. This helps to create a trading agreement network
3. Correlate the trading agreement network with a constructed social network to identify one or more groups of people whose behaviour should be investigated further.
This entire process can be broken down into several sub-processes, as shown in Figure 2. In the first sub-process of Step 21, abnormally-traded stock is identified using a statistical model of the trading data. In the second sub-process of Step 22, a subnetwork (a “trading network”) of companies or individuals involved in trading the abnormally-traded stock is identified, using a previously-constructed social network. In the third sub-process Step 23, the trading network is correlated with the social network, from which abnormal trading patterns are identified in Step 24.
The first sub-process of Step 21 is described in more detail with reference to Figure 3.
In Steps 31 to 33, the market average trading patterns are compared against those of each individual stock. In many cases, the market average can be refined against individual sectors. In Step 34 any discrepancies between the trading patterns are computed. If the discrepancy exceeds a predetermined threshold (Step 35), the stock is identified as having been abnormally traded (Step 36).
As shown in Figure 4, the stock which is identified as having been abnormally traded is then used, with the constructed social network, in the second sub-process (Step 22 of Figure 2) to produce a trading correlation network by identifying people with similar trading activities of the identified “abnormal” stock(s) (Steps 41 and 42).
Finally, as shown in Figure 5, in the third sub-process (Step 23 of Figure 2) the trading correlation network and the social network are compared (“overlapped”) and used in conjunction with data on the abnormally-traded stock to detect and highlight a group of people that are worthy of further investigation (Step 51).
As mentioned previously, an embodiment of the present invention can usefully be applied to many other domains, not just stock market trading. Furthermore, it should be noted that an embodiment of the present invention does not provide a definite conclusion of problematic or illegal behaviour, but instead brings to light “abnormal” patterns of behaviour that might warrant further investigation. The proposed approach, which is based on long-term distribution and correlation between social and activity graphs, can provide the following benefits:
1. Early detection of abnormal behaviour - for example, potentially fraudulent trading activities can be highlighted by detecting abnormal trends in stock volume fluctuation while the trading happens, instead of during retrospective investigation.
2. Enriching and overlapping the social network and the behaviour affinity graph allows better detection.
Figure 6 is a block diagram of a computing device, such as a data storage server, which embodies the present invention, and which may be used to implement a method of an embodiment and perform the tasks of apparatus of an embodiment. The computing device comprises a computer processing unit (CPU) 993, memory, such as Random Access Memory (RAM) 995, and storage, such as a hard disk, 996. Optionally, the computing device also includes a network interface 999 for communication with other such computing devices of embodiments. For example, an embodiment may be composed of a network of such computing devices. Optionally, the computing device also includes Read Only Memory 994, one or more input mechanisms such as keyboard and mouse 998, and a display unit such as one or more monitors 997. The components are connectable to one another via a bus 992.
The CPU 993 is configured to control the computing device and execute processing operations. The RAM 995 stores data being read and written by the CPU 993. The storage unit 996 may be, for example, a non-volatile storage unit, and is configured to store data.
The display unit 997 displays a representation of data stored by the computing device and displays a cursor and dialog boxes and screens enabling interaction between a user and the programs and data stored on the computing device. The input mechanisms 998 enable a user to input data and instructions to the computing device.
The network interface (network I/F) 999 is connected to a network, such as the Internet, and is connectable to other such computing devices via the network. The network I/F 999 controls data input/output from/to other apparatus via the network. Other peripheral devices such as a microphone, speakers, printer, power supply unit, fan, case, scanner, trackerball, etc. may be included in the computing device.
Methods embodying the present invention may be carried out on a computing device such as that illustrated in Figure 6. Such a computing device need not have every component illustrated in Figure 6, and may be composed of a subset of those components. A method embodying the present invention may be carried out by a single computing device in communication with one or more data storage servers via a network. The computing device may itself be a data storage server storing at least a portion of the data. A method embodying the present invention may be carried out by a plurality of computing devices operating in cooperation with one another. One or more of the plurality of computing devices may be a data storage server storing at least a portion of the data.
Embodiments of the present invention may be implemented in hardware, or as software modules running on one or more processors, or on a combination thereof. That is, those skilled in the art will appreciate that a microprocessor or digital signal processor (DSP) may be used in practice to implement some or all of the functionality described above.
The invention may also be embodied as one or more device or apparatus programs (e.g. computer programs and computer program products) for carrying out part or all of the methods described herein. Such programs embodying the present invention may be stored on computer-readable media, or could, for example, be in the form of one or more signals. Such signals may be data signals downloadable from an Internet website, or provided on a carrier signal, or in any other form.

Claims (9)

1. An anomaly detection method for detecting anomalous behaviour associated with a particular element of a domain of interest with which domain human beings interact, the method comprising: constructing in respect of a set of entities, comprising human beings and/or human institutions which are associated or potentially associated with the domain, a network representing inter-entity and intra-entity connections; performing statistical analysis of a population, including the human beings or human institutions in the set of entities, on the basis of known behavioural characteristic data, to determine a typical behaviour pattern for the population in respect of the particular element on which the anomaly detection method is focussed; extracting from the constructed connection network an ego-centric sub-network of entities associated with the focal element; correlating an actual behaviour pattern of the entities in the sub-network with the determined typical behaviour pattern to find behavioural abnormalities in the actual behaviour pattern; and providing an output identifying one or more abnormalities in the actual behaviour pattern of the entities in the sub-network found from such correlation.
2. A method as claimed in claim 1, wherein the element is one of: a person or group of people sharing one or more common characteristics; an institution or group of related institutions; an item or system with which a person or institution is potentially associated.
3. A method as claimed in claim 1 or 2, wherein connections comprise one or more of: personal relationships; social acquaintanceships; business and/or legal interactions; legal associations.
4. A method as claimed in any preceding claim, wherein more than one data source is used in order to construct the connection network.
5. Anomaly detection apparatus configured to detect anomalous behaviour associated with a particular element of a domain of interest with which domain human beings interact, which apparatus comprises: network construction means configured to construct, in respect of a set of entities comprising human beings and/or human institutions which are associated or potentially associated with the domain, a network representing inter-entity and intra-entity connections; analysis means configured to perform statistical analysis of a population, including the human beings or human institutions in the set of entities, on the basis of known behavioural characteristic data, to determine a typical behaviour pattern for the population in respect of the particular element on which the anomaly detection method is focussed; sub-network extraction means configured to extract from the constructed connection network an ego-centric sub-network of entities associated with the focal element; correlation means configured to correlate an actual behaviour pattern of the entities in the sub-network with the determined typical behaviour pattern to find behavioural abnormalities in the actual behaviour pattern; and abnormality outputting means configured to provide an output identifying one or more abnormalities in the actual behaviour pattern of the entities in the sub-network found from such correlation.
6. Apparatus as claimed in claim 5, wherein the element is one of: a person or group of people sharing one or more common characteristics; an institution or group of related institutions; an item or system with which a person or institution is potentially associated.
7. Apparatus as claimed in claim 5 or 6, wherein connections comprise one or more of: personal relationships; social acquaintanceships; business and/or legal interactions; legal associations.
8. Apparatus as claimed in any one of claims 5 to 7, wherein the network construction means are operable to use more than one data source in order to construct the connection network.
9. A computer program which, when run on a computer, causes that computer to carry out the method of any of claims 1 to 4.

Priority Applications (1)

Application Number Priority Date Filing Date Title
GB1516426.2A GB2542369A (en) 2015-09-16 2015-09-16 Apparatus and method for connection-based anomaly detection

Publications (2)

Publication Number Publication Date
GB201516426D0 GB201516426D0 (en) 2015-10-28
GB2542369A true GB2542369A (en) 2017-03-22

Family

ID=54363270

Family Applications (1)

Application Number Title Priority Date Filing Date
GB1516426.2A Withdrawn GB2542369A (en) 2015-09-16 2015-09-16 Apparatus and method for connection-based anomaly detection

Country Status (1)

Country Link
GB (1) GB2542369A (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109409948B (en) * 2018-10-12 2022-09-16 深圳前海微众银行股份有限公司 Transaction abnormity detection method, device, equipment and computer readable storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
None *

Also Published As

Publication number Publication date
GB201516426D0 (en) 2015-10-28

Legal Events

Date Code Title Description
WAP Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)