CN107527144A - A method for detecting connected transactions in the financial field - Google Patents

Publication number
CN107527144A
Authority
CN
China
Prior art keywords
investor
sequence
transaction
tape symbol
commission
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710715883.8A
Other languages
Chinese (zh)
Inventor
周水庚
王俊杰
关佶红
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fudan University
Original Assignee
Fudan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fudan University
Priority to CN201710715883.8A
Publication of CN107527144A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0635 Risk analysis of enterprise or organisation activities
    • G06Q30/00 Commerce
    • G06Q30/018 Certifying business or products
    • G06Q30/0185 Product, service or business identity fraud
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04 Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Development Economics (AREA)
  • Accounting & Taxation (AREA)
  • Physics & Mathematics (AREA)
  • Finance (AREA)
  • General Physics & Mathematics (AREA)
  • Marketing (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Technology Law (AREA)
  • Educational Administration (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The invention belongs to the technical field of financial big data mining and specifically relates to a method for detecting connected transactions in the financial field. The method comprises: using the signed order quantity as the characteristic variable of an investor's trading behavior and establishing a signed order quantity sequence; establishing uniformly aggregated signed order quantity sequences of investors' trading behavior; computing the trading-behavior similarity of any two investors and establishing a correlation coefficient matrix over multiple investors; building a single-day weighted graph from the correlation coefficient matrix of one trading day and merging multiple single-day weighted graphs into a combined weighted graph, in which the set of investors corresponding to a connected subgraph is a potential connected transaction group. The method allows preventive measures to be taken before market risk forms, so that it does not develop into a major risk event. Front-line regulators of exchange operations can use it to develop a fast, intuitive tool for monitoring and discovering connected transaction groups, for market supervision and risk management.

Description

A method for detecting connected transactions in the financial field
Technical field
The invention belongs to the technical field of financial big data mining and specifically relates to a method for detecting various connected transactions in financial markets.
Background art
In financial markets, trading behavior refers to the operations and actions by which investors buy and sell financial investment products on an exchange platform. Although compliant trading is the mainstream of the market, abnormal trading behavior, such as price manipulation and insider trading, also occurs frequently, especially in emerging financial markets. Abnormal behavior in financial markets not only affects the operating mechanism of the market and distorts the trading prices of investment products, but also threatens the security of the market system and the fairness of trading, and harms the interests of honest investors. In recent years, a new kind of abnormal trading behavior has been appearing more and more often. In pursuit of maximum personal gain, some experienced traders form a small clique of connected traders who cooperate to manipulate the price movements of certain financial products, mislead other investors, and reap exorbitant profits. Connected transaction activity is becoming a dangerous and covert type of market manipulation. For market supervisors and risk managers, digging hidden connected transactions out of a vast number of market participants and a huge volume of trading data is a difficult task. In recent years this challenge has attracted the close attention of a growing number of market supervisors and researchers.
Empirical observation and practical analysis of market participants' buying and selling operations reveal some clues for detecting connected transactions. Members of a connected transaction group trade in extremely similar ways, but quite differently from traders outside the group. Similar trading behavior means that group members trade a given investment product at almost the same or nearby points in time (that is, they buy together and sell together), and that their order quantities are correlated, rising and falling together. By contrast, the trading behavior of ordinary traders is almost entirely uncorrelated, and their buying and selling times and quantities show no clearly regular coordinated variation. Admittedly, some "clever" traders may adopt different trading strategies to offset the correlated trends in their activity, making their trading behavior look ordinary and normal in order to create a smoke screen and evade monitoring. But successful camouflage not only requires superb short-term trading skill; it may also require executing unfavorable operations to reduce the correlation of trading behavior, and thus incurring extra cost. The present invention focuses on detecting the first kind of connected trading behavior, in which traders exhibit similar trading behavior patterns.
At present, no work has studied connected transaction detection in futures markets on the basis of trading-behavior similarity. However, some work has studied abnormal trading in financial markets from different perspectives. Price manipulation, for example, is an important form of trading fraud and has been studied with different methods, including methods based on pattern recognition (Palshikar in COMAD 2000), behavioral statistical models (Khwaja in J.Finan.Econ 2005, Hansen in Appl.Econ.Lett. 2004, Aggarwal in J.Business 2006), the rational expectations theory of corners and short squeezes (Allen in Rev.Finan. 2006), and domain-driven data mining (Ou in PRICAI 2008).
With the growing incidence of abnormal trading in financial markets, connected trading activity among investors has drawn the attention of researchers, who have explored it from different directions, one purpose being to explain market manipulation. To identify non-compliant trading patterns among compliant trading operations, Franke et al. (GfKl 2007) proposed a detection method based on spectral clustering. They constructed a trader network to describe traders' behavior and characterize the market. If actual behavior in the market deviates from the trading behavior that the rules allow, it is regarded as abnormal activity and reported. However, that study used data from an experimental stock market. Palshikar et al. (Data Min.Knowl.Discov 2008) proposed a graph clustering algorithm for detecting groups of connected traders. They assume that trading volume among the members of a connected group is very large, while volume with traders outside the group is very small. To detect such groups, they used simulated trading data to construct a stock flow graph describing the trading relationships between traders, and found connected groups with a graph clustering method. Cao et al. (KDD 2010) hold that market manipulation arises from the trading of a group of behind-the-scenes manipulators: the manipulators act in concert and carefully plan the price, quantity, and timing of their trades so as to manipulate three trading sequences in the market: the buy order sequence, the sell order sequence, and the executed trade sequence. They proposed a coupled hidden Markov model (HMM) method to describe the interactions among the members of such a group, and then detected abnormal manipulative trading behavior from detailed stock order data.
In fact, the detection and study of collusive behavior has also been carried out in other fields, including online auction systems (Trevathan in ITNG 2007, Trevathan in J.Comput. 2007), online recommender systems (Lam in WWW 2004, Su in WWW 2005, Chirita in WIDM 2005, Zhang in KDD 2006), online rating systems (Zhang in WAW 2004, Wang in Expert Syst.Appl. 2008, Liu in Proceeding of Asilomar Conference on Signals, Systems and Computers 2008), and peer-to-peer file-sharing networks (Feldman in Proceedings of EC 2004, Lian in ICDCS 2007). The solutions for these systems are each very effective in their specific environments and can find collusive behavior, but because operations and interaction mechanisms in financial markets are different, none of these methods is applicable to detecting connected trading behavior in financial markets. There are three specific reasons:
(1) trading in financial markets is extremely complex;
(2) on an ordinary trading day, millions to tens of millions of orders are sent to the exchange's electronic trading system, a data volume unheard of in online auction and rating systems;
(3) the behavior evaluation schemes in those fields, and the interaction patterns between two behind-the-scenes collaborators, are not suited to describing and detecting the connected trading behavior behind the high-frequency order flow in financial markets.
Summary of the invention
To address the difficulty of identifying connected trading behavior in financial markets, the present invention proposes a new method for detecting connected transactions in the financial field, which can effectively dig hidden connected trading behavior out of a large number of market participants and a huge volume of trading data.
The method proposed by the present invention is a connected transaction detection method based on trading-behavior similarity. Its steps are as follows:
(1) use the signed order quantity as the characteristic variable of an investor's trading behavior (this variable truthfully and reliably reflects the investor's trading intent), and establish a signed order quantity sequence;
(2) establish uniformly aggregated signed order quantity sequences of investors' trading behavior, to eliminate the noise caused by differences in the times at which investors place their orders;
(3) for any two investors, compute their trading-behavior similarity, that is, the correlation coefficient of any two aggregated signed order quantity sequences that meet the length requirement; then, for multiple investors, establish a correlation coefficient matrix;
(4) discover connected transaction groups: from the correlation coefficient matrix of one trading day, build a single-day weighted graph, in which an edge exists when the corresponding correlation coefficient exceeds a predetermined value; merge multiple single-day weighted graphs into a combined weighted graph, in which the weight of each edge is the number of single-day weighted graphs in which it appears, and discard any edge whose weight is below a predetermined value; the set of investors corresponding to a connected subgraph of the combined weighted graph is a potential connected transaction group.
Experimental results show that the method can effectively find groups that trade illegally or in violation of the rules within a large number of traders and a huge volume of trading data.
The method can also be applied to study the similarity of other investor behaviors, for example the similarity of daily changes in investors' positions, which can help study connected transactions and find connected transaction groups at a coarser time granularity.
The characteristic variable chosen by the method involves no price information and is entirely independent of the order price. It therefore applies not only to limit orders but also to market orders and to orders with other price attributes.
The technical details of each step of the method are described further below:
In step (1), the signed order quantity is used as the characteristic variable of an investor's trading behavior and a signed order quantity sequence is established. This is done as follows:
First, the content of a limit order in the trading process is analyzed.
A limit order is an order by which an investor buys or sells a financial product at a specified price rather than at the market price. A limit order contains basic information such as the trade direction, the order price, and the order quantity. Which of these items can be used to describe an investor's trading intent? To answer this question, the items are analyzed one by one.
The trade direction is a key item in an order. It indicates whether the investor is buying or selling the product at a limit price, and so specifies the investor's wish to acquire or give up an asset.
The order price is a specified price at which the investor expects the order to be matched. Usually the order price is close to the latest market trade price at that moment, and the order prices that investors submit within a short period are almost all the same. This price therefore depends on the market conditions at that moment and does not express the investor's intent.
The order quantity reflects the amount of the asset that an investor intends to buy or sell.
Based on this analysis of the items in a limit order, the present invention merges the trade direction and the order quantity into a signed order quantity, which serves as a reasonable expression of the investor's trading intent, that is, as the characteristic variable of trading behavior. A positive sign denotes the quantity of a buy order and a negative sign the quantity of a sell order. Recording the signed order quantities within a trading session forms a signed order quantity sequence, in which buy quantities are positive and sell quantities are negative. A signed order quantity describes the trading event at the moment a trader submits an order, and the discrete sequence of trading events within a trading session naturally characterizes that trader's trading behavior.
For an investor (for example, a futures investor), v(t_i) denotes the signed order quantity of the order that he submits at time t_i (i = 1, 2, ..., N). N is the length of the sequence {t_i}; its value differs from investor to investor, and the time points in different investors' sequences may also differ. The signed order quantity sequence {v(t_i)} is therefore an unevenly spaced event sequence.
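The construction of the signed sequence can be sketched as follows. This is a minimal illustration, not the patent's implementation: the record layout `(time in seconds, side, quantity)` and the side labels `"buy"` and `"sell"` are assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical order record: (timestamp in seconds, side, quantity).
Order = Tuple[float, str, int]

def signed_sequence(orders: List[Order]) -> List[Tuple[float, int]]:
    """Return [(t_i, v(t_i))] sorted by time; buys positive, sells negative."""
    seq = []
    for t, side, qty in orders:
        if side not in ("buy", "sell"):
            raise ValueError(f"unknown side: {side!r}")
        seq.append((t, qty if side == "buy" else -qty))
    seq.sort(key=lambda point: point[0])
    return seq

# Three orders from one investor, given out of time order on purpose.
orders = [(32405.0, "sell", 3), (32400.5, "buy", 10), (32460.2, "buy", 2)]
sequence = signed_sequence(orders)
print(sequence)
```

The order price is deliberately absent from the record, in line with the analysis that price reflects market conditions rather than the investor's intent.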
In step (2), uniformly aggregated signed order quantity sequences of investors' trading behavior are established. The procedure is as follows:
Although the signed order quantity event sequence has been established, it is not suitable for computing the behavioral similarity between two investors, for two reasons:
(1) Even when two investors in a connected transaction group intend to follow the same order submission strategy, their actions cannot be precisely synchronized in practice, because throughout the trading process some factors, such as network speed or the exchange's queuing rules, usually cause small delays in order submission.
(2) Active speculators, such as short-term scalpers, generate a large number of orders every day. This produces very long event sequences and makes computing the behavioral similarity between traders more complex.
To solve these two problems, an aggregated time series is introduced to replace the original signed order quantity time series in expressing an investor's behavior pattern.
Define a time window size δ_t. A signed order quantity sequence is divided into a series of consecutive window slices of length δ_t. Each window is labeled with a time index, a non-negative integer starting from 0: the first window is labeled 0, the second 1, and so on. The time index of the i-th window is denoted s_i, and the window covers the time interval [s_i δ_t, (s_i + 1) δ_t]. The signed order quantities of the orders in each window are accumulated into a single value. Specifically, for the i-th window, the aggregated value V(s_i) is the sum of all signed order quantities in the time interval [s_i δ_t, (s_i + 1) δ_t]:
V(s_i) = \sum_{t_j \in [s_i \delta_t,\ (s_i + 1)\delta_t]} v(t_j)    (1)
In the formula, v(t_j) is the signed order quantity of the order with timestamp t_j. A signed order quantity sequence can thus be converted into an aggregated index sequence, denoted {(s_i, V(s_i))} (s_i = 0, 1, 2, ...). In addition, aggregation points whose aggregated value is zero, V(s_i) = 0, are discarded. The result is the aggregated signed order quantity sequence, which is an aggregated index sequence.
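The window aggregation described above can be sketched as follows. Two points are assumptions of this example rather than statements from the source: times are measured in seconds from the start of the session, and windows are treated as half-open, [s_i δ_t, (s_i + 1) δ_t), so that an order lying exactly on a boundary is counted once.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

def aggregate(seq: Iterable[Tuple[float, int]], delta_t: float) -> Dict[int, int]:
    """Map window index s_i -> V(s_i), dropping windows whose sum is zero."""
    if delta_t <= 0:
        raise ValueError("delta_t must be positive")
    sums: Dict[int, int] = defaultdict(int)
    for t, v in seq:
        # Floor division assigns each timestamp to exactly one window.
        sums[int(t // delta_t)] += v
    return {s: total for s, total in sorted(sums.items()) if total != 0}

# Signed quantities at 5 s, 30 s, 61 s, 90 s and 125 s into the session.
seq = [(5.0, 10), (30.0, -3), (61.0, 4), (90.0, -4), (125.0, -7)]
aggregated = aggregate(seq, delta_t=60)
print(aggregated)
```

Window 1 contains a buy of 4 and a sell of 4, which cancel, so that aggregation point is discarded and the resulting sequence has indices 0 and 2 only.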
In fact, if no order event falls in a time window, that window has no aggregated data, so the aggregated index sequence is unevenly spaced. The time window size δ_t determines the granularity of aggregation and the length of the aggregated sequence. If the window is enlarged, buy and sell quantities within a window may offset each other and the aggregated values shrink, which reduces the accuracy of the result. Conversely, if the window is reduced, the noise caused by timing differences grows and the computational cost increases. A suitable window size is therefore vital to the detection results and efficiency for connected transaction groups. A suitable size can be selected by analyzing historical market data; see the embodiment section.
To guide or influence the market, connected traders need to submit orders frequently, which tends to make them active traders in the market. Investors who place almost no orders have no high correlation with active traders and will most likely fall outside the detection range of potential connected transaction groups. To reduce computational complexity and improve efficiency, investors with only a small number of orders are filtered out before correlation coefficients are computed. Specifically, the length of each aggregated index sequence is compared with an empirical threshold δ_L; sequences whose length is not less than the threshold are retained for further processing. These are called qualified aggregated signed order quantity time series. Only qualified sequences are used to compute correlation coefficients and detect potential connected transaction groups.
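The length filter can be illustrated with a short sketch. The dictionary layout (investor ID mapped to an aggregated index sequence) and the threshold value used here are assumptions for the example; the source does not state a value for δ_L at this point.

```python
from typing import Dict

AggSeq = Dict[int, int]  # window index -> aggregated signed quantity

def filter_qualified(agg: Dict[str, AggSeq], delta_l: int) -> Dict[str, AggSeq]:
    """Keep investors whose aggregated sequence length is not less than delta_l."""
    return {inv: seq for inv, seq in agg.items() if len(seq) >= delta_l}

# Hypothetical aggregated sequences for three investors.
agg = {
    "V1": {0: 5, 1: -3, 4: 2},
    "V2": {7: 1},
    "V3": {0: 4, 2: -4},
}
qualified = filter_qualified(agg, delta_l=2)
print(sorted(qualified))
```

Filtering before the pairwise step matters because the number of correlation coefficients grows with the square of the number of investors kept.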
In step (3), for any two investors, their trading-behavior similarity is computed, that is, the correlation coefficient of any two aggregated signed order quantity sequences that meet the length requirement; then, for multiple investors, a correlation coefficient matrix is established. The procedure is as follows:
The trading-behavior similarity of two investors can be evaluated by the strength of association between their aggregated index sequences, which is usually measured with a correlation coefficient. In statistics, a correlation coefficient expresses the degree to which one event (or phenomenon) is associated with another, or the degree to which one event can be predicted from another. It also indicates the strength of the linear relationship between two variables. Correlation coefficients are widely used in financial research. Here the correlation coefficient is chosen as the measure of the trading-behavior similarity of two investors.
Suppose there are two investors A and B whose aggregated index sequences of signed order quantity are V_A and V_B. Both sequences are unevenly spaced and discrete. Let the time indices of V_A and V_B be s^A and s^B respectively; for the same subscript i, s_i^A and s_i^B are not necessarily equal. Methods designed for evenly spaced time series therefore cannot be applied directly to these aggregated index sequences, and the sequences must first be unified.
Merge the time index sets s^A and s^B into one unified time index set s, that is, s = s^A ∪ s^B. Unified aggregated index sequences are defined on this unified set. For investor A, the unified aggregated index sequence U_A of his signed order quantity is defined as follows:
U_A(s_k) = \begin{cases} V_A(s_k), & s_k \in s^A \\ 0, & \text{otherwise} \end{cases}    (2)
Similarly, the unified aggregated index sequence U_B of investor B's signed order quantity can be defined:
U_B(s_k) = \begin{cases} V_B(s_k), & s_k \in s^B \\ 0, & \text{otherwise} \end{cases}    (3)
In formulas (2) and (3) above, s_k is a time index value in s.
The correlation coefficient of two unified aggregated index sequences is now computed. For two unified aggregated index sequences U_A and U_B, the correlation coefficient r_AB is defined as follows:
r_{AB} = \frac{\langle U_A U_B \rangle - \langle U_A \rangle \langle U_B \rangle}{\sqrt{\left(\langle U_A^2 \rangle - \langle U_A \rangle^2\right)\left(\langle U_B^2 \rangle - \langle U_B \rangle^2\right)}}    (4)
In the formula, the angle brackets ⟨...⟩ denote the average over all aggregation events (aggregation points) in the time series. The correlation coefficient ranges from -1 to 1. A positive value indicates positive correlation between U_A and U_B, and a negative value indicates negative correlation. A value of zero indicates no linear correlation between the two time series. In the detection of connected transaction groups, negative correlation means that the two investors trade in opposite ways, which is irrelevant to the goal of detecting connected transaction groups and can be ignored. Only positive correlation is meaningful for detecting connected transaction groups.
Suppose there are N investors. For any two investors i and j, let the correlation coefficient of their unified aggregated index sequences U_i and U_j be r_ij. The pairwise correlation coefficients of the N investors then form a correlation coefficient matrix R. Because the correlation of U_i with U_j equals that of U_j with U_i, that is, r_ij = r_ji, R is a symmetric matrix. Each diagonal element is the correlation of an investor's unified aggregated index sequence with itself, and equals 1.
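Step (3) can be sketched end to end as follows. The example makes two assumptions that should be read as such: a window index missing from one investor's sequence contributes a value of 0 in the unified sequence, and the coefficient takes the usual Pearson form with the angle-bracket averages taken over the unified index set. A constant sequence, for which the coefficient is undefined, is mapped to 0 here as a convenience.

```python
from math import sqrt
from typing import Dict, List

AggSeq = Dict[int, float]  # window index s_k -> aggregated value V(s_k)

def correlation(va: AggSeq, vb: AggSeq) -> float:
    """Correlation of two sequences over the unified index set s = s^A ∪ s^B."""
    s = sorted(set(va) | set(vb))
    if not s:
        return 0.0
    # Assumption: a window absent from a sequence counts as 0.
    ua = [va.get(k, 0.0) for k in s]
    ub = [vb.get(k, 0.0) for k in s]
    n = len(s)
    mean_a, mean_b = sum(ua) / n, sum(ub) / n
    cov = sum(a * b for a, b in zip(ua, ub)) / n - mean_a * mean_b
    var_a = sum(a * a for a in ua) / n - mean_a ** 2
    var_b = sum(b * b for b in ub) / n - mean_b ** 2
    if var_a <= 0 or var_b <= 0:
        return 0.0  # constant sequence: undefined, treated as 0 in this sketch
    return cov / sqrt(var_a * var_b)

def correlation_matrix(seqs: List[AggSeq]) -> List[List[float]]:
    """Symmetric matrix R with r_ij = r_ji and ones on the diagonal."""
    n = len(seqs)
    R = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            R[i][j] = R[j][i] = correlation(seqs[i], seqs[j])
    return R

seqs = [
    {0: 10, 2: -5, 3: 4},    # investor A
    {0: 20, 2: -10, 3: 8},   # investor B: same pattern at twice the size
    {0: -10, 2: 5, 3: -4},   # investor C: opposite direction
]
R = correlation_matrix(seqs)
for row in R:
    print([round(x, 3) for x in row])
```

Only the upper triangle is computed and then mirrored, which uses the symmetry r_ij = r_ji to halve the work.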
In step (4), connected transaction groups are discovered. The procedure is as follows:
Once the correlation coefficient matrix is obtained, a weighted graph is built from it. Nodes represent investors in the market. If the correlation coefficient of the unified aggregated index sequences of two investors exceeds a user-set threshold δ_w, an edge connects the two corresponding nodes, and the weight of the edge is that correlation coefficient. The weighted graph has no self-loops and no multiple edges between any two nodes. It need not be connected; most likely it consists of some isolated nodes and some connected components (subgraphs). An isolated node is connected to no other node, is irrelevant to detecting connected transaction groups, and can be deleted. A connected component may well be a complete graph, which shows that the trading behavior of the investors corresponding to all its nodes is highly similar, but dissimilar to that of investors corresponding to nodes outside the component. Clearly, the connected components of the weighted graph meet the criterion for potential connected traders.
Clearly, the size of the user-defined correlation coefficient threshold δ_w affects the number of connected components in the weighted graph. When the threshold increases, the number of connected components decreases and the detection results become more reliable, but some suspicious traders may be filtered out. Conversely, when the threshold decreases, the number of connected components increases and so does the number of false connected transaction groups (noise), which lowers the precision of the results. A suitable threshold δ_w is therefore very important for the efficiency and effectiveness of detection. It can be selected by analyzing historical market data; see the embodiment section.
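Building the single-day graph and extracting its connected components can be sketched as follows. The investor IDs, the matrix values and the threshold 0.7 are made up for illustration; the depth-first search is one common way to find components and is not claimed to be the search used in the patent.

```python
from typing import Dict, List, Set, Tuple

def daily_graph(ids: List[str], R: List[List[float]],
                delta_w: float) -> Dict[Tuple[str, str], float]:
    """Edges (i, j) with i before j whose correlation exceeds delta_w."""
    edges = {}
    for i in range(len(ids)):
        for j in range(i + 1, len(ids)):  # skip the diagonal: no self-loops
            if R[i][j] > delta_w:
                edges[(ids[i], ids[j])] = R[i][j]
    return edges

def connected_components(edges) -> List[Set[str]]:
    """Components of the graph; isolated nodes never appear, as they have no edge."""
    adj: Dict[str, Set[str]] = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    seen: Set[str] = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node] - comp)
        seen |= comp
        components.append(comp)
    return components

ids = ["A", "B", "C", "D"]
R = [
    [1.0, 0.9, 0.3, 0.1],
    [0.9, 1.0, 0.8, 0.2],
    [0.3, 0.8, 1.0, 0.0],
    [0.1, 0.2, 0.0, 1.0],
]
edges = daily_graph(ids, R, delta_w=0.7)
components = connected_components(edges)
print(edges, components)
```

Investor D has no coefficient above the threshold, so it is an isolated node and is dropped. A and C end up in the same component through B even though their own coefficient is low, which shows that a component need not be a complete graph.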
In reality, nobody knows how many connected transaction groups exist in the market or which trader belongs to which group. Fortunately, some common sense and practical observation help find a solution. Specifically, the members of a connected transaction group will not collude for just one connected transaction; they cooperate repeatedly and collude many times to manipulate trading. Therefore, if a suspicious connected transaction group appears repeatedly over multiple trading days, it can reasonably be regarded as a potential connected group. Consider multiple consecutive trading days: build a weighted graph for each trading day, then merge the connected components of these daily weighted graphs to obtain a combined weighted graph. The weight of an edge in this graph is the number of daily weighted graphs in which it appears. Finally, delete edges whose weight is below a predetermined value δ_f; the remaining connected subgraphs are the potential connected transaction groups. The value δ_f can be selected by analyzing historical market data; see the embodiment section.
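The multi-day merge can be sketched as follows. Edges are counted across the daily graphs directly, which matches merging connected components because every edge of a daily graph lies inside one of its components. The daily edge lists, the investor IDs and the value of δ_f are invented for the example, and a small union-find is used here to group nodes; that is a choice of this sketch.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set, Tuple

Edge = Tuple[str, str]

def combine(daily_edges: Iterable[Iterable[Edge]], delta_f: int) -> Dict[Edge, int]:
    """Edge weight = number of daily graphs containing it; drop weights below delta_f."""
    counts: Counter = Counter()
    for edges in daily_edges:
        # Normalize direction so (A, B) and (B, A) are the same undirected edge.
        counts.update({tuple(sorted(e)) for e in edges})
    return {e: c for e, c in counts.items() if c >= delta_f}

def groups(edges: Iterable[Edge]) -> List[Set[str]]:
    """Connected subgraphs of the remaining edges, found with union-find."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)
    out: Dict[str, Set[str]] = {}
    for node in list(parent):
        out.setdefault(find(node), set()).add(node)
    return sorted(out.values(), key=lambda g: sorted(g))

days = [
    [("A", "B"), ("B", "C"), ("D", "E")],
    [("B", "A"), ("B", "C")],
    [("A", "B"), ("C", "B"), ("E", "F")],
]
combined = combine(days, delta_f=3)
print(combined, groups(combined))
```

The pairs D-E and E-F each appear on a single day only, so they fall below δ_f and are removed, leaving one potential group made of A, B and C.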
Compared with the prior art, the method has three distinguishing features:
(1) The invention applies to asset trading markets such as stock markets and futures markets, whereas existing research is based on abnormal activity in stock markets;
(2) The invention builds weighted graphs from the trading behavior of market investors to characterize the interactions between investors, which differs from existing research methods that are based on graphs alone;
(3) The idea of the invention derives from empirical observation and analysis of real trading data and has been verified on a real order data record set, whereas existing research (except the work of Cao et al.) evaluates detection methods on simulated data.
The method allows preventive measures to be taken before market risk forms, so that it does not develop into a major risk event. Front-line regulators of exchange operations can use it to develop a fast, intuitive tool for monitoring and discovering connected transaction groups, for market supervision and risk management. The invention is an efficient and widely applicable connected transaction detection method.
The method can effectively dig hidden connected trading behavior out of a large number of market participants and a huge volume of trading data, ensuring that the market operates in a healthy and normal way.
Brief description of the drawings
Fig. 1 is the combined weighted graph of the copper futures contract.
Fig. 2 is the combined weighted graph of the fuel oil futures contract.
Fig. 3 is the combined weighted graph of the natural rubber futures contract. The edge weights are omitted for the largest connected subgraph; the average weight of all its edges is 3.28.
Embodiment
The invention is further described below with reference to the drawings and specific embodiments. Here, an order data set from a futures market is selected as the test sample data.
(1) Algorithm implementation
Implementing the connected transaction detection method based on trading-behavior similarity involves two main stages:
Compute the unified aggregated index sequence of the signed order quantity of each investor, and compute the correlation coefficient matrix from all qualified unified aggregated index sequences.
Build multiple daily weighted graphs, merge their connected components into a combined weighted graph, and identify potential connected transaction groups from it.
Two algorithms were developed to carry out the tasks of these two stages.
Algorithm 1: computing the correlation matrix
Algorithm 1 aggregates the signed order volumes of an investor within each time window of a trading day into a single value, filters out the aggregated time series that are too short, and then computes the correlation coefficient of every pair of unified aggregated time series. The input of the algorithm is the preprocessed order record set of one futures contract on one trading day. Each order record contains the investor's virtual ID, the signed order volume, and a timestamp converted from colon-separated format into absolute seconds.
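The record format described above can be sketched as follows. This is only an illustration under stated assumptions: the names `OrderRecord`, `to_seconds` and `make_record` are hypothetical, the buy/sell encoding as the strings 'buy' and 'sell' is assumed, and "absolute seconds" is taken here to mean seconds since midnight, which the source does not specify.

```python
from typing import NamedTuple


class OrderRecord(NamedTuple):
    investor_id: int    # anonymized (virtual) investor ID
    signed_volume: int  # positive for buy orders, negative for sell orders
    timestamp: int      # assumed: seconds since midnight


def to_seconds(clock: str) -> int:
    """Convert a colon-separated 'HH:MM:SS' timestamp into seconds since midnight."""
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def make_record(investor_id: int, side: str, quantity: int, clock: str) -> OrderRecord:
    """Build one preprocessed record; `side` is 'buy' or 'sell' (hypothetical encoding)."""
    if side not in ("buy", "sell"):
        raise ValueError(f"unknown side: {side!r}")
    sign = 1 if side == "buy" else -1
    return OrderRecord(investor_id, sign * quantity, to_seconds(clock))
```

For instance, `make_record(1680, "sell", 10, "09:01:00")` would yield a record with a signed volume of -10.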
Algorithm 2: detecting potential connected transaction groups
Using the correlation matrices, Algorithm 2 detects potential connected transaction groups. The algorithm first builds multiple daily weighted graphs and then merges the connected components of these graphs into one comprehensive weighted graph. After the edges whose weight is less than the threshold $\delta_f$ are deleted from the comprehensive weighted graph, each remaining connected subgraph corresponds to a potential connected transaction group.
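The merging step can be illustrated with a minimal sketch. It assumes each daily weighted graph is already given as a list of undirected edges between investor IDs, and it takes an edge's weight in the comprehensive graph to be the number of daily graphs in which the edge occurs. The function names and the union-find helper are illustrative choices, not the patent's own implementation.

```python
from collections import Counter


def merge_daily_graphs(daily_edges, delta_f):
    """Count in how many daily graphs each edge occurs; keep edges with count >= delta_f."""
    counts = Counter()
    for edges in daily_edges:
        # Normalize each undirected edge so (a, b) and (b, a) are counted together,
        # and use a set so an edge is counted at most once per day.
        counts.update({tuple(sorted(edge)) for edge in edges})
    return {edge: w for edge, w in counts.items() if w >= delta_f}


def connected_groups(edges):
    """Return the connected components of an undirected edge set as sorted lists."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)
    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), []).append(node)
    return sorted(sorted(group) for group in groups.values())


# Toy example with made-up investor IDs over three trading days.
days = [[(1, 2), (2, 3)], [(2, 1), (4, 5)], [(3, 2), (1, 2)]]
merged = merge_daily_graphs(days, delta_f=2)
print(connected_groups(merged))
```

In the toy example the edge between investors 4 and 5 occurs on one day only, so with `delta_f=2` it is dropped and a single group of three investors remains.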
(2) Determining the time window size $\delta_t$
When the time series of signed order volume is aggregated, the time window size $\delta_t$ is an important parameter that directly affects the computation of the correlation coefficient. To determine $\delta_t$, two investors are selected from the fuel oil transaction data of September 25, 2008, and their signed order volume time series are computed. The two time series are then aggregated with different time window sizes. With $\delta_t = 60$ seconds, the aggregated time series have fewer data points but still retain the overall shape of the original series. As the time window size increases from 1 second to 200 seconds, the correlation coefficient of the two aggregated time series is computed for each window size. It can be observed that the correlation coefficient increases as the window grows and gradually approaches a stable value at around 60 seconds. A time window size of 60 seconds is therefore a reasonable choice.
(3) Determining the length threshold $\delta_L$ of the aggregated time series
Over the whole test data set, the cumulative distribution function $F(L) = P(L' < L)$ of the aggregated time series length $L$ is computed. About 90% of the time series have a length below 15; these series are excluded from the correlation computation because they are too short. Only about 10% of investors remain within the detection range of connected transaction groups, which greatly reduces the computational complexity. The empirical length threshold $\delta_L$ is therefore set to 15, and time series shorter than this value are filtered out and discarded. In other words, an investor must have submitted orders in at least 15 time windows to be included in the detection of connected transaction groups. This choice is consistent with the long-term supervisory experience and operating practice of the front-line regulators of the futures exchange.
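As a small illustration of how such a threshold might be examined, the sketch below computes the empirical value of $F(L) = P(L' < L)$ for a list of series lengths and applies the length filter. The function names and the toy lengths are assumptions for illustration only and are not taken from the patent's data.

```python
def length_cdf(lengths, L):
    """Empirical F(L) = P(L' < L): fraction of series strictly shorter than L."""
    return sum(1 for n in lengths if n < L) / len(lengths)


def filter_by_length(series_by_investor, delta_L=15):
    """Keep only the aggregated series whose length is at least delta_L."""
    return {inv: s for inv, s in series_by_investor.items() if len(s) >= delta_L}


# Toy lengths (made-up values).
lengths = [3, 5, 20, 14, 15, 1, 2, 8, 9, 30]
print(length_cdf(lengths, 15))
```

A series of length exactly 15 is kept, matching the statement that an investor needs orders in at least 15 time windows.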
(4) Determining the correlation coefficient threshold $\delta_w$
The order record data of the copper futures contract on September 18, 2008 is taken as an example. After the signed order volume time series have been aggregated and filtered, 819 aggregated time series remain and are used to compute the correlation matrix $M_c$. Four correlation coefficient thresholds are chosen: 0.80, 0.85, 0.90 and 0.95. From the matrix $M_c$, four corresponding weighted graphs can be built. The numbers of connected components in the four weighted graphs are 10, 8, 6 and 4 respectively, so the number of connected components decreases as the threshold increases. Note that the connected component with 6 nodes is a complete graph in the first three weighted graphs ($\delta_w$ = 0.80, 0.85, 0.90) and almost a complete graph (missing only one edge) in the last one ($\delta_w$ = 0.95), because any two nodes in this component are highly similar. In practice, the exchange's regulators can choose different thresholds according to on-site supervision needs in order to observe suspicious investors at different levels.
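The effect of the threshold on the number of connected components can be reproduced on a toy matrix. The sketch below is an illustration under assumptions: the correlation values are made up, an edge is taken to exist when the coefficient is strictly greater than the threshold, and only components containing at least one edge are counted.

```python
def threshold_edges(corr, delta_w):
    """Edges (i, j), i < j, whose correlation coefficient exceeds delta_w."""
    n = len(corr)
    return [(i, j) for i in range(n) for j in range(i + 1, n) if corr[i][j] > delta_w]


def count_components(corr, delta_w):
    """Number of connected components that contain at least one edge."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in threshold_edges(corr, delta_w):
        parent[find(a)] = find(b)
    return len({find(x) for x in list(parent)})


# Toy 5-investor correlation matrix (made-up values, not data from the patent).
corr = [[0.1] * 5 for _ in range(5)]
for i in range(5):
    corr[i][i] = 1.0
for (i, j), c in {(0, 1): 0.96, (0, 2): 0.91, (1, 2): 0.86, (3, 4): 0.82}.items():
    corr[i][j] = corr[j][i] = c

for delta_w in (0.80, 0.85, 0.90, 0.95):
    print(delta_w, count_components(corr, delta_w))
```

Raising the threshold removes the weakest edges first, so the component count can only stay the same or fall once the surviving edges are exhausted.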
(5) Determining the predetermined value $\delta_f$
The larger this parameter, the fewer results are output; the smaller it is, the more results are output. In practice, start with a small value: if the number of output results exceeds what the user can handle, increase $\delta_f$; otherwise decrease it. The value is increased or decreased in steps of 1.
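The adjustment procedure can be written as a short loop. This is a hedged sketch: `count_groups` is a hypothetical callable that returns the number of detected groups for a given threshold and is assumed not to increase as the threshold grows, and `max_delta` (for example the number of trading days) is an assumed upper bound that keeps the loop finite.

```python
def tune_delta_f(count_groups, max_groups, max_delta, start=1):
    """Adjust delta_f in steps of 1 until the number of output groups is acceptable."""
    delta_f = start
    # Too many results: raise the threshold one step at a time.
    while count_groups(delta_f) > max_groups and delta_f < max_delta:
        delta_f += 1
    # Room to spare: lower the threshold while the output stays acceptable.
    while delta_f > 1 and count_groups(delta_f - 1) <= max_groups:
        delta_f -= 1
    return delta_f


# Made-up group counts per threshold, for illustration only.
toy_counts = {1: 40, 2: 18, 3: 4, 4: 1}
print(tune_delta_f(lambda d: toy_counts.get(d, 0), max_groups=20, max_delta=9))
```

With the toy counts, starting from either a small or a large value leads to the same threshold, the smallest one whose output the user can still handle.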
(6) Detection results
The following parameter values are used for the connected transaction group detection method: $\delta_t$ = 60 seconds, $\delta_L$ = 15, $\delta_w$ = 0.90, $\delta_f$ = 2. Daily weighted graphs are built for three futures contracts over 9 consecutive trading days, and the edges that appear in at least 2 daily weighted graphs are merged into the comprehensive weighted graph. Figs. 1-3 show the merged comprehensive weighted graphs of the three futures contracts (copper, fuel oil and natural rubber).
In Figs. 1-3, the comprehensive weighted graphs of the three futures contracts contain 18 connected subgraphs in total. Except for the two subgraphs {22069, 12633, 1680, 33473, 3956} in Fig. 2 and {24139, 21244, 29020} in Fig. 3, all remaining subgraphs are complete graphs. Most subgraphs appear only 2 times within the 9 trading days, while four subgraphs, namely {12509, 21255, 11668} in Fig. 1, {24686, 28000} in Fig. 3, {1680, 3203, 4324, 10032, 12633, 17891, 22069} in Fig. 2 and the largest component, appear at least 3 times. The connected subgraphs in the comprehensive weighted graphs are regarded as potential connected transaction groups, and the four subgraphs that appear at least 3 times are even more likely to be such groups.
A closer examination of the comprehensive weighted graphs of the three futures contracts shows that the investor set {1680, 12633, 22069, 4324, 3203, 17891} is a connected subgraph in Fig. 1 and Fig. 3, and its subset {1680, 12633, 22069} appears in Fig. 2. The two investor sets {3956, 33473} and {4162, 4937, 4987} appear in both Fig. 2 and Fig. 3. The two investor sets {3956, 33473} and {1680, 12633, 22069} are separate in natural rubber futures but associated in fuel oil futures: Fig. 2 shows that they merge into a single subgraph. These investor sets are therefore very likely to be potential connected transaction groups, and they can be further confirmed using relevant regulatory background information and the experience of business experts.
The experimental results for the three futures contracts are summarized in Table 1. The average number of eligible aggregated time series per trading day is far smaller than the corresponding number of investors, because a large number of short aggregated time series are discarded after filtering by the threshold $\delta_L$, so that only active investors remain for further processing. Table 1 also shows that many connected components appear only once within the 9 trading days (i.e. $N_c \gg N_s$). Although the detection method of the invention does not count them as potential connected transaction groups, the supervision department of the exchange can still actively watch and track them on future trading days. The detected potential connected transaction groups should of course be further investigated and confirmed through the handling process of the exchange's supervision system. In practice, these potential connected transaction groups can be added to the blacklist of the exchange's market surveillance system even if they have not been confirmed.
Finally, based on the judgment of senior business experts of the exchange, 17 of the detected potential connected transaction groups were confirmed by the experts as genuine connected transaction groups. The numbers of confirmed groups in the three futures contracts (column $N_t$ of Table 1) are 4, 4 and 9 respectively. In addition, by tracking the order records of the members of these groups and comparing the records in detail, it was reconfirmed that the identified groups are genuine connected transaction groups. Only one detected potential connected transaction group, belonging to the fuel oil contract, was not confirmed, because sufficient evidence was not found. For reasons of privacy protection, more detailed information on the detected groups cannot be provided.
Table 1: Summary of the experimental results for the three futures contracts.
Legend: the average number of eligible aggregated time series over the 9 trading days; $N_c$: the number of connected components in all daily weighted graphs; $N_s$: the number of connected subgraphs in the comprehensive weighted graph, i.e. the number of detected potential connected transaction groups; $N_t$: the number of potential connected transaction groups confirmed according to supervision records and the experience of senior business experts.

Claims (5)

1. A detection method for connected transactions in the financial field, characterized by comprising the following steps:
(1) Using the signed order volume as the characteristic variable of investor trading, establish the signed order volume sequence;
(2) Establish the uniformly aggregated signed order volume sequence of investor trading, in order to eliminate the noise caused by differences in the times at which investors place orders;
(3) For any two investors, compute the similarity of their trading behavior, i.e. compute the correlation coefficient of any two aggregated signed order volume sequences that satisfy the length requirement; then, for multiple investors, establish the correlation matrix;
(4) Discovery of connected transaction groups: from the correlation matrix of one trading day, build a single-day weighted graph, in which an edge exists when the corresponding correlation coefficient is greater than a predetermined value; multiple single-day weighted graphs are merged into one comprehensive weighted graph, in which the weight of each edge is the number of times it appears in the single-day weighted graphs, and an edge whose weight is less than a predetermined value is discarded; the set of investors corresponding to a connected subgraph of the comprehensive weighted graph is a potential connected transaction group.
2. The detection method for connected transactions in the financial field according to claim 1, characterized in that in step (1) the signed order volume is used as the characteristic variable of investor trading and the signed order volume sequence is established, specifically as follows:
First, the content of the limit orders in the trading process is analyzed, and the buy/sell direction and the order quantity of each limit order are combined into a signed order quantity, which serves as the characteristic variable of trading; a positive sign denotes the volume of a buy order and a negative sign the volume of a sell order; recording the signed order volumes within one trading session forms a signed order volume sequence;
For an investor, let $v(t_i)$ denote the signed order volume of the order event he submits at time $t_i$, $i = 1, \dots, n$, where $n$ is the length of the sequence $V = \{v(t_i)\}$; the signed order volume sequence $V$ is an unevenly spaced event sequence.
3. The detection method for connected transactions in the financial field according to claim 2, characterized in that in step (2) the uniformly aggregated signed order volume sequence of investor trading is established as follows:
Set a time window size $\delta_t$; a signed order volume sequence $V = \{v(t_i)\}$ is divided into a series of consecutive window slices of length $\delta_t$. Each window is labeled with a time index, a non-negative integer starting from 0: the first window is labeled 0, the second 1, and so on. The $k$-th window has time index $T_k$ and covers the time interval $[T_k\delta_t, (T_k+1)\delta_t)$. The signed order volumes of the orders in each window are accumulated into a single value, computed as follows: for the $k$-th window, the aggregated value $V(T_k)$ is the sum of all signed order volumes within the time interval $[T_k\delta_t, (T_k+1)\delta_t)$:
$$V(T_k) = \sum_{T_k\delta_t \le t_i < (T_k+1)\delta_t} v(t_i) \qquad (1)$$
In the formula, $v(t_i)$ is the signed order volume of the order with timestamp $t_i$. A signed order volume sequence is thus converted into an aggregated time series $\{V(T_k)\}$. In addition, the aggregation points whose aggregated value equals zero are discarded, which finally yields the aggregated signed order volume sequence, an aggregated time series;
Then, investors with only a few order requests are filtered out: the length of each aggregated time series is compared with an empirical threshold $\delta_L$, and the series whose length is not less than the threshold are retained for further processing; these are called eligible aggregated signed order volume time series.
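The window aggregation described in this claim can be sketched in a few lines. This is an illustration under assumptions: events are given as (timestamp in seconds, signed volume) pairs, timestamps are measured from a reference time `t0` (a hypothetical parameter, for example the session start), and the function name `aggregate` is not from the patent.

```python
from collections import defaultdict


def aggregate(events, delta_t=60, t0=0):
    """Sum signed volumes per time window and drop windows whose sum is zero.

    events: iterable of (timestamp_seconds, signed_volume) pairs.
    Returns a dict mapping window index -> aggregated signed volume.
    """
    sums = defaultdict(int)
    for t, v in events:
        sums[(t - t0) // delta_t] += v  # window k covers [k*delta_t, (k+1)*delta_t)
    return {k: s for k, s in sorted(sums.items()) if s != 0}


# Toy events (made-up values): window 1 sums to zero and is discarded.
events = [(0, 5), (30, -2), (59, 1), (60, 3), (61, -3), (125, -7)]
print(aggregate(events))
```

The length of the returned dict is the length of the aggregated time series that is compared with the threshold $\delta_L$.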
4. The detection method for connected transactions in the financial field according to claim 3, characterized in that in step (3), for any two investors, the similarity of their trading behavior is computed, i.e. the correlation coefficient of any two aggregated signed order volume sequences that satisfy the length requirement is computed, and then, for multiple investors, the correlation matrix is established; the specific flow is as follows:
Suppose two investors $A$ and $B$ have aggregated signed order volume time series $V_A$ and $V_B$ respectively; the two aggregated time series are uneven and discrete. The time index sets of $V_A$ and $V_B$ are $T_A$ and $T_B$ respectively; for the same subscript $k$, the $k$-th time indices in $T_A$ and $T_B$ are not necessarily the same;
The aggregated time series are unified as follows:
The time index sets $T_A$ and $T_B$ are merged into one unified time index set $T_U$, i.e. $T_U = T_A \cup T_B$. Based on the unified time index set, the unified aggregated time series is defined. For investor A, the unified aggregated time series $V_A^U$ of his signed order volume is defined as follows:
$$V_A^U(T_k) = \begin{cases} V_A(T_k), & T_k \in T_A \\ 0, & T_k \notin T_A \end{cases} \qquad T_k \in T_U \qquad (2)$$
Similarly, the unified aggregated time series $V_B^U$ of the signed order volume of investor B is defined as:
$$V_B^U(T_k) = \begin{cases} V_B(T_k), & T_k \in T_B \\ 0, & T_k \notin T_B \end{cases} \qquad T_k \in T_U \qquad (3)$$
In formulas (2) and (3), $T_k$ denotes a time index value;
Compute the correlation coefficient of the two unified aggregated time series: for two unified aggregated time series $V_A^U$ and $V_B^U$, their correlation coefficient $C_{A,B}$ is defined as follows:
$$C_{A,B} = \frac{\langle V_A^U V_B^U \rangle - \langle V_A^U \rangle \langle V_B^U \rangle}{\sqrt{\left(\langle (V_A^U)^2 \rangle - \langle V_A^U \rangle^2\right)\left(\langle (V_B^U)^2 \rangle - \langle V_B^U \rangle^2\right)}} \qquad (4)$$
In the formula, the angle brackets $\langle \cdot \rangle$ denote the average over all aggregation events in the time series. The correlation coefficient ranges from -1 to 1: a positive value indicates a positive correlation between $V_A^U$ and $V_B^U$, a negative value indicates a negative correlation, and zero indicates no correlation, i.e. the two time series are independent of each other. In the detection of connected transaction groups, a negative correlation means that the trading behaviors of the two investors are opposite, which is not meaningful for the goal of detecting connected transaction groups and is ignored; only positive correlation is meaningful for the detection of connected transaction groups;
Suppose there are $N$ investors. For any two investors $i$ and $j$, the correlation coefficient of their unified aggregated time series $V_i^U$ and $V_j^U$ is $C_{i,j}$; the pairwise correlation coefficients of the $N$ investors then form a correlation matrix $M_c$. Since the correlation of $V_i^U$ with $V_j^U$ equals the correlation of $V_j^U$ with $V_i^U$, i.e. $C_{i,j} = C_{j,i}$, $M_c$ is a symmetric matrix. The elements on the diagonal of the matrix are the autocorrelation coefficients of each investor's unified aggregated time series and are equal to 1.
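The unification and correlation steps of this claim can be sketched as follows. This is a minimal illustration under assumptions: each aggregated series is a dict from window index to aggregated value, missing windows are filled with zero when two series are placed on their common index set, and a constant series is treated as uncorrelated (a choice made here to avoid division by zero, not stated in the source). All function names are hypothetical.

```python
from math import sqrt


def unify(series_a, series_b):
    """Zero-fill two aggregated series onto the union of their window indices."""
    index = sorted(set(series_a) | set(series_b))
    return [series_a.get(k, 0) for k in index], [series_b.get(k, 0) for k in index]


def mean(xs):
    return sum(xs) / len(xs)


def correlation(series_a, series_b):
    """Correlation coefficient of two unified series, using averages over all events."""
    a, b = unify(series_a, series_b)
    cov = mean([x * y for x, y in zip(a, b)]) - mean(a) * mean(b)
    var_a = mean([x * x for x in a]) - mean(a) ** 2
    var_b = mean([y * y for y in b]) - mean(b) ** 2
    if var_a <= 0 or var_b <= 0:
        return 0.0  # assumption: a constant series is treated as uncorrelated
    return cov / sqrt(var_a * var_b)


def correlation_matrix(series_by_investor):
    """Symmetric matrix of pairwise coefficients with ones on the diagonal."""
    ids = sorted(series_by_investor)
    n = len(ids)
    matrix = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            c = correlation(series_by_investor[ids[i]], series_by_investor[ids[j]])
            matrix[i][j] = matrix[j][i] = c
    return ids, matrix
```

Each pair of investors is unified on its own joint index set, following the pairwise definition in the claim, rather than on one index set shared by all investors.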
5. The detection method for connected transactions in the financial field according to claim 4, characterized in that in step (4) the detection of connected transaction groups proceeds as follows:
After the correlation matrix is obtained, a weighted graph is built from it. Investors in the market are represented by nodes; if the correlation coefficient of the unified aggregated time series of two investors is greater than the threshold $\delta_w$ set by the user, there is an edge between the two nodes corresponding to these two investors, and the weight of the edge is the correlation coefficient value;
In the case of multiple consecutive trading days: a weighted graph is built for each trading day, and the connected components of these daily weighted graphs are then merged to obtain one comprehensive weighted graph, in which the weight of an edge is the number of times the edge appears in the daily weighted graphs. Finally, the edges whose weight is less than the predetermined value $\delta_f$ are deleted, and the remaining connected subgraphs are the potential connected transaction groups.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710715883.8A CN107527144A (en) 2017-08-21 2017-08-21 A kind of detection method of financial field connected transaction

Publications (1)

Publication Number Publication Date
CN107527144A true CN107527144A (en) 2017-12-29

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20171229