CN108268624A - User data visualization method and system - Google Patents
User data visualization method and system
- Publication number
- CN108268624A CN201810022133.7A
- Authority
- CN
- China
- Prior art keywords
- group
- user
- data
- decision
- decision tree
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
- G06F16/2246—Trees, e.g. B+trees
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/248—Presentation of query results
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/06—Buying, selling or leasing transactions
- G06Q30/0601—Electronic shopping [e-shopping]
- G06Q30/0609—Buyer or seller confidence or verification
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Business, Economics & Management (AREA)
- Accounting & Taxation (AREA)
- Finance (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Computational Linguistics (AREA)
- Development Economics (AREA)
- Economics (AREA)
- Marketing (AREA)
- Strategic Management (AREA)
- General Business, Economics & Management (AREA)
- User Interface Of Digital Computer (AREA)
Abstract
The application provides a user data visualization method and system. The method includes: obtaining the data set of a group, where the data features of the data set include user information, IP address, event type, event initiation source, event responder, and event occurrence time, and where the data features are assigned different decision priorities; and displaying a decision tree graph that represents the attribute detection process for all users in the group. In the display, the root node of the decision tree graph shows the first-priority data feature and its decision value range; each leaf node represents the final attribute of at least one user; each non-leaf node represents the current attribute of multiple users together with the data feature of the current priority and its decision value range; and the decision paths leaving the root node or each non-leaf node are drawn as lines of different colors, shapes, or thicknesses.
Description
Technical field
This application relates to the field of computer processing technology, and in particular to a user data visualization method and system.
Background
Online fraud is a well-known dark side of today's Internet, and it causes immeasurable losses worldwide every year. In 2015, the Internet Crime Complaint Center received fraud complaints from around the world numbering on the order of millions, and online fraud also causes heavy economic losses worldwide each year. Fraudulent users are usually paid for helping to promote specific goods or for spreading spam. In Internet finance, fraudulent users apply for loans under false identities, buy goods with stolen credit cards, and even carry out illegal activities such as money laundering. Finding a suitable anti-fraud algorithm has therefore become increasingly critical in Internet business scenarios, and the demand for one keeps growing.
Although many methods now exist for identifying fraud on the Internet, the fraud detection systems built on them are limited: the credibility of the data for the fraud suspects they flag still has to be verified by extensive manual follow-up, for example by platform supervisors who investigate each case one by one. As a result, tasks such as revising algorithm parameters, designing the priority of data features, and selecting the algorithm model in a fraud detection system require not only software design by algorithm experts but, even more, the participation of domain experts. Improving the transparency of the fraud identification algorithm can therefore effectively improve detection accuracy, and how to visualize the data is a problem that urgently needs to be solved in this field.
Summary of the invention
In view of the above shortcomings of the prior art, the purpose of this application is to provide a user data visualization method and system that address the lack of visualization of fraud identification algorithms in the prior art.
To achieve the above and other related purposes, a first aspect of the application provides a user data visualization method applied in a fraud detection system. The visualization method includes the following steps: obtaining the data set of a group, where the data features of the data set include user information, IP address, event type, event initiation source, event responder, and event occurrence time, and where the data features are assigned different decision priorities; and displaying a decision tree graph that represents the attribute detection process for all users in the group, which includes: displaying the first-priority data feature of the root node of the decision tree graph and its decision value range; displaying the final attribute of at least one user represented by each leaf node of the decision tree graph; displaying the current attribute of multiple users, the data feature of the current priority, and its decision value range represented by each non-leaf node of the decision tree graph; and displaying the decision paths corresponding to the root node or each non-leaf node in the decision tree graph, the decision paths being represented by lines of different colors, shapes, or thicknesses.
A second aspect of the application provides a computer device including: a processor; and a presentation engine executed on the processor, the presentation engine being configured to perform the user data visualization method of any of the above.
A third aspect of the application provides a user data visualization system including: an acquisition module for obtaining the data set of a group, where the data features of the data set include user information, IP address, event type, event initiation source, event responder, and event occurrence time, and where the data features are assigned different decision priorities; and a display module for displaying a decision tree graph that represents the attribute detection process for all users in the group. The display module displays the first-priority data feature of the root node of the decision tree graph and its decision value range; displays the final attribute of at least one user represented by each leaf node of the decision tree graph; displays the current attribute of multiple users, the data feature of the current priority, and its decision value range represented by each non-leaf node of the decision tree graph; and displays the decision paths corresponding to the root node or each non-leaf node in the decision tree graph, the decision paths being represented by lines of different colors, shapes, or thicknesses.
In a fourth aspect, the application provides a client connected to a server over a network, characterized in that the client, by sending a request to log in to the server, performs the steps of the user data visualization method of any of the above.
In a fifth aspect, the application provides a server connected to a client over a network, characterized in that the server, based on an operation requested by the client, executes the user data visualization method of any of the above and sends the execution result to the client, which displays it.
In a sixth aspect, the application provides a browser connected to a server over a network, characterized in that the browser, by sending a request to log in to the server, performs the steps of the user data visualization method of any of the above.
In a seventh aspect, the application provides a computer-readable storage medium storing a data visualization computer program, characterized in that the data visualization computer program, when executed, implements the steps of the user data visualization method of any of the above.
As described above, the user data visualization method and system of the application have the following beneficial effects: by presenting the user grouping process, the distribution of data features, and list views for the groups determined during fraud detection, the groups identified during fraud detection are shown through several relationship interfaces, which helps domain experts and algorithm experts to assess and revise the detection algorithm of the fraud detection system.
Brief description of the drawings
Fig. 1 is a flowchart of the user data visualization method of the application in one embodiment.
Fig. 2 is a flowchart of obtaining the data set of a group in one embodiment of the application.
Fig. 3 shows an interface containing multiple groups, as displayed by the application in one embodiment.
Fig. 4 is a schematic diagram of the decision tree graph for the users of a group, as displayed by the application in one embodiment.
Fig. 5 shows a display interface in which the decision tree graph additionally includes the number of users classified to each node, in one embodiment.
Fig. 6 is a schematic diagram of an interface that shows a target user's operation log on a timeline on the left and the group's decision tree graph on the right, in one embodiment.
Fig. 7 is a schematic diagram of a list interface for the data set of a group, in one embodiment.
Fig. 8 is a schematic diagram of an interface showing the feature distribution, within the network cluster, of the information entropy of the registration time dimension for a group, in one embodiment.
Fig. 9 is a flowchart of displaying the feature distribution interface of the group's data set, in one embodiment.
Fig. 10 is a flowchart of the steps for displaying the distribution of multiple groups in the cluster, in one embodiment.
Fig. 11 shows an interface displaying the distribution of multiple groups in the cluster, in one embodiment.
Fig. 12 is an architecture diagram of the computer device of the application in one embodiment.
Fig. 13 is a schematic diagram of the module structure of the user data visualization system provided by the application.
Detailed description
Embodiments of the application are described below through specific examples; those skilled in the art can readily understand other advantages and effects of the application from the content disclosed in this specification.
The following description refers to the accompanying drawings, which describe several embodiments of the application. It should be understood that other embodiments may also be used, and that changes in mechanical composition, structure, electrical configuration, and operation may be made without departing from the spirit and scope of the application. The following detailed description should not be considered limiting, and the scope of the embodiments of the application is limited only by the claims of the issued patent. The terminology used herein is for describing particular embodiments only and is not intended to limit the application.
Furthermore, as used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context indicates otherwise. It should be further understood that the terms "comprising" and "including" indicate the presence of the stated features, steps, operations, elements, components, items, types, and/or groups, but do not exclude the presence, occurrence, or addition of one or more other features, steps, operations, elements, components, items, types, and/or groups. The terms "or" and "and/or" as used herein are to be interpreted as inclusive, meaning any one or any combination. Thus "A, B or C" or "A, B and/or C" means any of the following: A; B; C; A and B; A and C; B and C; A, B and C. An exception to this definition arises only when a combination of elements, functions, steps, or operations is inherently mutually exclusive in some way.
In fraud detection technology, domain experts contribute experience in classifying data and requirements on the accuracy of classification results to the core technology of fraud identification, but they do not know the algorithm framework itself or the parameters in the algorithm. Because domain experts have no way of knowing how the fraud detection system classifies data during detection, when they obtain detection results from the system they can verify those results but have no basis for judging their accuracy.
To improve the accuracy of a fraud detection system, the application provides a user data visualization method applied to a fraud detection system. It presents the groups obtained by classification in the fraud detection system, together with their data sets, to algorithm experts and domain experts in visual form, so that different domain experts or algorithm experts can explore various kinds of fraud through multiple means of interaction and can flexibly modify the fraud detection algorithm according to the characteristics of the fraud.
The user data visualization method is mainly performed by a computer device. The computer device may be any suitable computer device, such as a handheld computer device, a tablet computer device, a notebook computer, a desktop computer, or a server. The computer device includes a display, an input device, input/output (I/O) ports, one or more processors, memory, a non-volatile storage device, a network interface, a power supply, and so on. These components may include hardware elements (such as chips and circuits), software elements (such as a tangible non-transitory computer-readable medium storing instructions), or a combination of hardware and software elements. Note also that the components may be combined into fewer components or separated into additional components; for example, the memory and the non-volatile storage device may be included in a single component. The computer device may perform the visualization method on its own or in cooperation with other computer devices. In some embodiments, the computer device both performs the visualization method and displays the corresponding visualization interface. For example, the computer device includes a processor and a display; a presentation engine (or display engine) executes on the processor, performs the user data visualization method, and shows the result on the display. Here the presentation engine includes, but is not limited to, software and hardware that parse interface-display code developed in a programming language, such as XML, HTML scripts, or C. In other embodiments, one computer device performs the visualization method and provides the corresponding visualization interface to another computer device for display. For example, based on a user's operation, the client sends a request to the server and logs in to it; the server performs the visualization method to generate the corresponding interface data and returns that data to the client, where a browser or a custom application displays the corresponding diagram from the interface data.
The visualization method can be applied to a fraud detection system. The fraud detection system may include software and hardware in one or more computer devices. To answer the domain experts' question of why a set of users is treated as a fraud group, and the algorithm experts' question of "whether users in the same group have the same behavioral habits", the application provides a visualization method that starts from the grouping process of the users in a group. Refer to Fig. 1, a flowchart of the user data visualization method of the application in one embodiment. As shown in the figure, the user data visualization method includes the following steps:
In step S11, the data set of a group is obtained. The data features of the data set include at least user information, IP address, event type, event initiation source, event responder, and event occurrence time. User information is information that identifies a user, for example a user ID, a unique user nickname, or a certificate number. User information may also include a mobile phone number, e-mail address, ID number, gender, the number of the device the user uses, registration time, and so on. The IP address is the IP address, IP address segment, or IP address group of the computer device used on the network when an event is generated under the same user information. The event type is the type of user behavior event recorded in the network operation log, including but not limited to at least one of: social behaviors between network users such as following, liking, commenting, and gifting; and operation behaviors of a network user such as logging in, logging out, updating status, registering, and modifying information. The same user information may correspond to at least one event type, and each event type has a corresponding event initiation source, event responder, and event occurrence time. For example, the same user information may correspond to multiple like events, each with its own event initiation source, event responder, and event occurrence time. The event initiation source is, for example, the user information of the user who initiated an event of a given type. The event responder includes, for example, the user information of the target of the initiated event.
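As an illustration of the data features listed for step S11, the record below is a minimal sketch of one row of a group's data set. The class name `EventRecord`, its field names, and the helper `events_by_user` are hypothetical choices made for this example; the patent names the features but does not prescribe a storage format.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class EventRecord:
    """One logged event; the field names are illustrative assumptions."""
    user_id: str           # user information (here reduced to a user ID)
    ip_address: str        # IP address used when the event was generated
    event_type: str        # e.g. "like", "follow", "login"
    initiator: str         # event initiation source
    responder: str         # event responder (target of the event)
    occurred_at: datetime  # event occurrence time


def events_by_user(records):
    """Group records so that one user maps to all of that user's events."""
    grouped = {}
    for record in records:
        grouped.setdefault(record.user_id, []).append(record)
    return grouped
```

The grouping helper reflects the statement that the same user information may correspond to several events, each with its own initiation source, responder, and occurrence time.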
Here, the grouping of the collected users into groups is determined from the data features of the collected cluster members. A detection algorithm (such as an unsupervised detection algorithm) for detecting the group members who take part in a fraud event is preset in the fraud detection system. To classify group members accurately, the detection algorithm classifies all members hierarchically according to the decision priorities of all the data features of the collected members. Different kinds of fraud correspond to detection algorithms with different decision priorities.
In some embodiments, the detection algorithm makes its classification decisions according to the similarity of the data features of all members. Specifically, refer to Fig. 2, a flowchart of obtaining the data set of a group in one embodiment of the application. As shown in the figure, step S11 further includes:
Step S111: obtain the operation logs of a cluster made up of multiple network users. In various embodiments, the cluster consists of all the network users that can be obtained. The network users in the cluster may come from the same website, from different websites, or from different network channels, for example the Internet, one or more intranets, a local area network (LAN), a wide area network (WAN), a storage area network (SAN), any appropriate combination of these, or a mobile phone communication network.
Step S112: determine at least one data feature from the operation logs of the multiple network users, and analyze the similarity of at least one group of data features in the operation logs to determine the group. In a specific embodiment, online fraud inevitably leaves traces in the data that users generate on the network. The fraud detection system collects the operation logs of multiple network users from at least one website and, by analyzing the similarity of at least one data feature in those logs, groups the users who generated the logs, obtaining the groups and the data set of each group from the operation logs.
In some embodiments, the data set of a group includes, but is not limited to, at least two of the following data features: user information, IP address, event type, event initiation source, event responder, and event occurrence time. User information includes, for example, mobile phone number, e-mail address, ID number, identity card number, gender, the number of the device the user uses, and registration time. The same user information may correspond to at least one event type, and each event type has a corresponding event initiation source, event responder, and event occurrence time. Event types include, but are not limited to, at least one of: social behaviors between network users such as following, liking, commenting, and gifting (also called giving presents); and operation behaviors of a network user such as logging in, logging out, updating status, registering, and modifying information. For example, the same user information may correspond to multiple like events, each with its own event initiation source, event responder, and event occurrence time.
Step S113: obtain the data set of the group. In some embodiments, the data set may be obtained from a database that stores each group and its data set. The database is configured, for example, in a remote storage server or in a storage device of the local computer device, and the data set of a group is extracted from the database based on the user's input. For example, the fraud detection system obtains multiple groups using an unsupervised detection algorithm, the user selects one of the groups through a selection interface, and the data set of that group is then obtained.
Specifically, the fraud detection system first calculates, over all the data in the operation logs, the similarity of data features of the same class. The similarity can be measured by information entropy. For example, the fraud detection system uses user information to calculate the information entropy of the IP usage or maximum IP usage dimension, uses event type to calculate the information entropy of the operation type dimension, and calculates the information entropy of the registration time dimension or, from operation time, of the bad-operation dimension. After these calculations, an unsupervised detection method is applied to the resulting information entropies to divide the users into multiple groups. Examples of the unsupervised detection method include algorithms based on dense subgraphs and algorithms based on vector spaces. The groups presented by the visualization method of the application reflect the shared resources, user relationships, and so on used in fraud, so that users of the fraud detection system can determine more clearly whether the classification strategy of the unsupervised detection algorithm is reasonable. Shared resources include, but are not limited to, shared IP addresses and mailboxes; user relationships include, but are not limited to, follow and interaction relationships between users.
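The text says that similarity can be measured by information entropy but gives no formula. The sketch below assumes the standard Shannon entropy over the empirical distribution of one feature dimension; the function name `shannon_entropy` and the bucketing of values (for example registration times rounded to the hour) are assumptions made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of `values`.

    A low value means the members concentrate on a few feature values
    (for example, many accounts registered in the same hour).
    """
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # p * log2(1 / p) summed over all observed values
    return sum((c / total) * math.log2(total / c) for c in counts.values())
```

Under this reading, a group whose members all share one registration hour has entropy 0 on that dimension, while members spread evenly over four hours give an entropy of 2 bits.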
In one embodiment, the visualization method further includes a step of displaying at least one group interface, in which the size of a group is represented by the size of the geometric figure shown for it. Refer to Fig. 3, which shows an interface containing multiple groups in one embodiment of the application. As shown in the figure, 11 groups are displayed in the interface, and each group is represented by a circle. All 11 groups lie inside the largest dashed circle, which represents, for example, a cluster made up of N network users. The group labeled 0 is, for example, a normal group. Inside a smaller dashed circle are 10 groups of different sizes, labeled 1 to 10; these are, for example, abnormal groups. The size of each circle is proportional to the number of members in the group: a large circle indicates a group with many members, and a small circle a group with few. In other embodiments, the geometric figure of a group may be of any shape. The color of a geometric figure may be set randomly, or may be related to the number of groups or to the number of members in the group. For example, N colors are preset and the fraud detection system assigns them randomly to the geometric figures of the groups. As another example, the fraud detection system assigns colors to the geometric figures of the groups in a preset color order, following the ascending order of member count. When the user selects a geometric figure in the display interface, the fraud detection system obtains the data set of the corresponding group.
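The sizing and coloring rules for the group interface can be sketched as follows. The text only says that circle size is proportional to member count; this example assumes that "size" means area, so the radius grows with the square root of the count. The function names, the `area_per_member` parameter, and the tie-break by group label are hypothetical.

```python
import math


def circle_radius(member_count, area_per_member=1.0):
    """Radius of a circle whose area is proportional to the member count."""
    if member_count < 0:
        raise ValueError("member_count must be non-negative")
    return math.sqrt(member_count * area_per_member / math.pi)


def assign_colors(group_sizes, palette):
    """Assign palette colors in preset order, by ascending member count.

    `group_sizes` maps a group label to its member count; ties are broken
    by label so the result is deterministic.
    """
    ordered = sorted(group_sizes, key=lambda g: (group_sizes[g], g))
    return {g: palette[i % len(palette)] for i, g in enumerate(ordered)}
```

Scaling the area rather than the radius keeps a group with four times the members from looking sixteen times larger; if the figure in the patent scales the radius directly, only `circle_radius` would change.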
In a preferred embodiment, the group interface may also include an information bar that displays group information. When the user selects a group in the group interface, the basic information of that group is shown at the side of the interface as a table or text box. The basic information includes, for example, the group number, the number of members, the highest-priority data feature used to determine the group, and the group attribute (such as normal group or abnormal group).
To show the decision process behind the groups, the fraud detection system performs step S12 after grouping: it displays, as a tree structure, the grouping process by which the detection algorithm classified the users of a group according to the decision priorities of the data features, so that domain experts and/or algorithm experts can use the visualization interface to find and address deficiencies and defects in the detection algorithm.
In step S12, a decision tree graph is displayed to represent the attribute detection process for all users in the group. The user attributes may include normal user (Normal) and abnormal user (Abnormal), or may include normal user (Normal), fraud role A (Abnormal A), fraud role B (Abnormal B), and so on. In the display interface, this step uses a tree structure, running from the root node through decision paths and non-leaf nodes to each leaf node, to represent the process by which the fraud detection system's detection algorithm classifies the users of a group level by level, from the highest priority to the lowest, to obtain each user's attribute. The display interface shows: the first-priority data feature of the root node of the decision tree graph and its decision value range; the final attribute of at least one user represented by each leaf node; the current attribute of multiple users, the data feature of the current priority, and its decision value range represented by each non-leaf node; and the decision paths of the root node or each non-leaf node, represented by lines of different colors, shapes, or thicknesses. As the decision tree graph shows, users assigned to a leaf node have a determined final attribute (normal user or abnormal user), while users assigned to a non-leaf node must be classified further until they reach a leaf node, which determines their final attribute.
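The root, non-leaf, and leaf nodes described for step S12 can be modeled with a small tree structure. This is a minimal sketch under stated assumptions: the class `Node`, its field names, and the function `trace` are hypothetical, and each decision is assumed to be a single "value <= threshold" test, which is the form the thresholds in the described figure take.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """A decision tree node; a leaf carries only `final_attribute`."""
    feature: Optional[str] = None          # data feature of this priority
    threshold: Optional[float] = None      # decision value for the feature
    below: Optional["Node"] = None         # child when value <= threshold
    above: Optional["Node"] = None         # child when value > threshold
    final_attribute: Optional[str] = None  # e.g. "Normal" or "Abnormal"

    @property
    def is_leaf(self):
        return self.final_attribute is not None


def trace(node, user):
    """Walk from `node` to a leaf and return (decision path, final attribute).

    `user` maps feature names to that user's raw values.
    """
    path = []
    while not node.is_leaf:
        go_below = user[node.feature] <= node.threshold
        path.append((node.feature, node.threshold, "<=" if go_below else ">"))
        node = node.below if go_below else node.above
    return path, node.final_attribute
```

The returned path lists one entry per decision edge, which is the information a renderer would need to pick a line color, shape, or thickness for each edge.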
The decision result of the decision tree is obtained by analyzing, step by step, the relation between the original value of each user data feature and the corresponding decision threshold. For example, for certain selected users in the fraud detection system, after the decision tree is pruned, all users are grouped by decisions made in order of priority from high to low on the maximum IP usage, the user's out-degree in the social network (out_degree), and the IP usage calculated from user information. Examples of the unsupervised detection algorithm include algorithms based on dense subgraphs and algorithms based on vector spaces. The groups presented by the visualization method of the application reflect the shared resources, user relationships, and so on used in fraud, so that users of the fraud detection system can determine more clearly whether the classification strategy of the unsupervised detection algorithm is reasonable. Shared resources include, but are not limited to, shared IP addresses and mailboxes; user relationships include, but are not limited to, follow and interaction relationships between users.
Refer to Fig. 4, which is a schematic diagram of the decision tree graph of the users of a group. As shown in the figure, the root node of the decision tree graph indicates that the data feature with the highest priority is the maximum IP usage amount, which is used to classify the attributes of the group's users: when the maximum IP usage amount (max_IP_used_be_used_amount) of a user is ≤ 80.5, the user is routed along the "blue" decision path to the first non-leaf node; otherwise, the user is routed along the "yellow" decision path to the first leaf node. The first non-leaf node continues to classify the users it receives, using the event initiation source as the data feature of the second priority: when the out-degree (out_degree) in the social network of the event initiation source corresponding to a user is ≤ 711.0, the user is routed along the "blue" decision path to the second leaf node; otherwise, the user is routed along the "yellow" decision path to the second non-leaf node. The second non-leaf node continues to classify the users it receives, using the IP usage amount as the data feature of the third priority: when the IP usage amount (IP_used_amount) of a user is ≤ 870.0, the user is routed along the "blue" decision path to the third leaf node; otherwise, the user is routed along the "yellow" decision path to the fourth leaf node. Here, "blue" denotes a decision path on which the attribute assigned at the current priority is "normal user", and "yellow" denotes a decision path on which the attribute assigned at the current priority is "abnormal user". The values shown in the non-leaf nodes, such as 80.5, 711.0 and 870.0 in the figure, are the decision thresholds of the data features at the corresponding priorities.
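The decision path described for Fig. 4 can be sketched as a chain of three threshold tests. This is only an illustrative sketch: the thresholds 80.5, 711.0 and 870.0 come from the description above, while the `User` type, the `classify` function and the leaf names are hypothetical and are not part of the application.

```python
from dataclasses import dataclass

# Thresholds taken from the Fig. 4 description; everything else is illustrative.
MAX_IP_THRESHOLD = 80.5
OUT_DEGREE_THRESHOLD = 711.0
IP_USED_THRESHOLD = 870.0


@dataclass
class User:
    user_id: str
    max_ip_used_be_used_amount: float
    out_degree: float
    ip_used_amount: float


def classify(user: User) -> tuple[str, list[str]]:
    """Return (leaf name, colours of the decision paths followed)."""
    path = []
    # Priority 1: maximum IP usage amount (root node).
    if user.max_ip_used_be_used_amount > MAX_IP_THRESHOLD:
        path.append("yellow")
        return "leaf_1", path
    path.append("blue")
    # Priority 2: out-degree in the social network (first non-leaf node).
    if user.out_degree <= OUT_DEGREE_THRESHOLD:
        path.append("blue")
        return "leaf_2", path
    path.append("yellow")
    # Priority 3: IP usage amount (second non-leaf node).
    if user.ip_used_amount <= IP_USED_THRESHOLD:
        path.append("blue")
        return "leaf_3", path
    path.append("yellow")
    return "leaf_4", path
```

Returning the path colours alongside the leaf keeps the information a renderer would need to draw the "blue" and "yellow" edges.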
In various embodiments, different decision paths may also be distinguished by lines of different shapes. For example, a solid line indicates that the decided user attribute is normal user and a dashed line indicates abnormal user; or a straight line indicates normal user and a curved line indicates abnormal user; or lines of different thickness are used, for example a thin line for normal user and a thick line for abnormal user.
To show more clearly the number of users received by each non-leaf node and leaf node, when a decision tree graph is displayed to characterize the attribute tests of all users in the group, the root node of the decision tree graph also displays the number of users in the group (i.e. the sample size given to the root node), and each non-leaf node of the decision tree graph also displays the number of users having the current attribute (i.e. the sample size received by that non-leaf node). Refer to Fig. 5, which shows a display interface in which the decision tree graph further includes the number of users classified to each node: the sample_size shown in the root node is the sample size given to the root node, i.e. the total number of group members; the sample_size shown in each other non-leaf node is the sample size received by that node; and the sample_size shown in a leaf node is the number of users classified to that node from the level above.
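The sample_size values of Fig. 5 can be obtained by routing every group member through the tree and counting arrivals at each node. The following is a minimal sketch under stated assumptions: the `TREE` table, the node names and the dictionary keys `max_ip`, `out_degree` and `ip_used` are hypothetical; only the thresholds come from the Fig. 4 description.

```python
from collections import Counter

# node -> (feature, threshold, child if value <= threshold, child otherwise).
# Node names and feature keys are illustrative.
TREE = {
    "root": ("max_ip", 80.5, "node_1", "leaf_1"),
    "node_1": ("out_degree", 711.0, "leaf_2", "node_2"),
    "node_2": ("ip_used", 870.0, "leaf_3", "leaf_4"),
}


def sample_sizes(users):
    """Count how many users reach each node of the tree."""
    counts = Counter()
    for user in users:
        node = "root"
        while True:
            counts[node] += 1
            if node not in TREE:  # a leaf has no entry in TREE
                break
            feature, threshold, le_child, gt_child = TREE[node]
            node = le_child if user[feature] <= threshold else gt_child
    return dict(counts)
```

The count at `root` equals the total number of group members, and the count at any other node equals the number of users passed down from its parent.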
It should be noted that, depending on the type of fraud and the design of the unsupervised detection algorithm, the priority of each data feature, the decision threshold at each priority, the relationship between adjacent priorities, the decision paths at each level, and so on may all differ in the decision classification of each group detected from the operation log. Moreover, to obtain the group decision result of each user in the operation log more quickly, the unsupervised detection algorithm may prune the selected data features according to convergence during training: once the trained detection algorithm has reached the convergence condition, the remaining data features are pruned, and pruned data features are not displayed in the display interface of the decision tree graph. Alternatively, if all users in the acquired group have already been classified within the first few levels of the detection algorithm, the remaining data features may be pruned, and the display module displays only a decision tree graph comprising all decision paths and the nodes they connect. When the classification decision process of a group is displayed using the visualization method described herein, domain experts and algorithm experts can more easily evaluate the accuracy of the detection algorithm.
In the display interface of the decision tree graph, or in another display interface reached on the basis of an acquired operation instruction, the visualization method further includes the steps of: determining a user in the group as a target user; and displaying a time axis beside the decision tree graph and presenting the operation log of the target user on the time axis.
Here, when a domain expert or algorithm expert clicks a leaf node and selects a user link in the pop-up window of the leaf node, the operation log of that user is displayed on a time axis beside the decision tree graph. Refer to Fig. 6, which is a schematic diagram of an interface showing the operation log of the target user on a time axis on the left and the group decision tree graph on the right. As shown, the time axis marks the time points of the operation log in chronological order from top to bottom, and each time point displays, from the operation log at the corresponding time, the event type (e.g. event_type), the event occurrence time (e.g. timestamp), the user information (e.g. user_id), the IP address (e.g. the full IP address or an IP segment), the event responder (e.g. target_user), and the event content (e.g. comment_id, comment_lenth, amount, object_id, target_video). By showing the operation history of each user in the group on a time axis, domain experts and algorithm experts can examine in detail the accuracy of the user attributes detected within the same group, as well as the commonalities between the normal users and abnormal users of the same group, and thereby identify the deficiencies and defects of the detection algorithm.
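Preparing the data for such a time axis amounts to filtering the operation log for the target user and sorting it chronologically. The sketch below assumes, purely for illustration, that each log record is a dictionary with the field names user_id, timestamp and event_type mentioned above and that timestamps are ISO 8601 strings; the `build_timeline` helper is hypothetical.

```python
from datetime import datetime


def build_timeline(log, target_user_id):
    """Return the target user's log records sorted by time, oldest first."""
    records = [r for r in log if r["user_id"] == target_user_id]
    # Assumes ISO 8601 timestamps such as "2017-08-05T10:00:00".
    return sorted(records, key=lambda r: datetime.fromisoformat(r["timestamp"]))
```

Each returned record would then become one time point on the axis, rendered from top to bottom.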
In other embodiments, domain experts and algorithm experts are concerned not only with the attribute classification process of group members but also with whether the assigned groups are reasonable. This requires that they be able to inspect the detailed data features in each group and, from another dimension, the priority order of the data features used to classify the group. The visualization method may therefore include a step of displaying an interface of the data set of a group. The data set is displayed as a list, presenting the user with the details of the data features within the same group. To improve the classification accuracy of the group data set, the list in the interface displays the data features of the group column by column according to the classification priority used by the fraud detection system. For example, refer to Fig. 7, which is a schematic diagram of a list interface of the data set of a group displayed in one embodiment of the application. In the list interface, the data set of the group is sorted by data feature similarity in descending order of priority: when records have the same similarity on the data feature of the first priority, they are sorted by the data feature of the second priority. In the embodiment shown in Fig. 7, the priorities from high to low are: IP address, event initiation source (source), event responder (target), event type (event_type) and event occurrence time (timestamp). In this embodiment, the table header encodes the importance of the different columns: the more concentrated the values of a feature, the more important the feature. In the embodiments provided in the application, the fraud detection system expresses this property by calculating the information entropy of each feature; a lower information entropy means higher consistency. The fraud detection system then sorts the features in ascending order of information entropy, so that the column header with the lowest information entropy comes first to draw the user's attention. Of course, in other implementations the column headers of the displayed table may also be colour-coded: for example, the column header with the lowest information entropy is rendered in the darkest colour to indicate that the data feature of that column is the most important, and the other columns are coloured accordingly, yielding the data set list interface shown in the figure. The list interface may follow the step of displaying the multiple-group interface, or come before or after step S12, and is displayed based on the user's selection operation.
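The entropy-based column ordering can be illustrated with a short sketch. It uses the standard Shannon entropy of the empirical value distribution of each column; the function names `entropy` and `order_columns` and the row layout (one dictionary per record) are assumptions made for the example, not details given in the application.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of values."""
    total = len(values)
    counts = Counter(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def order_columns(rows, columns):
    """Return columns sorted by ascending entropy, most concentrated first."""
    scores = {col: entropy([row[col] for row in rows]) for col in columns}
    return sorted(columns, key=lambda col: scores[col])
```

A column whose values are all identical has entropy 0 and is placed first, which matches the statement that more concentrated features are treated as more important.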
In certain embodiments, to further characterize whether the acquired data set of a group reflects the characteristics of fraud, the data set also needs to be displayed from other dimensions. For example, the accuracy of the detected fraud can be further confirmed by comparing the network operation data of normal users with the group data set. To this end, the visualization method further includes a step of displaying a feature distribution interface of the data set of the group. The feature distribution interface can show the distribution of each data type in the overall network. The overall network is a relative notion: for example, when multiple network users form a cluster, the interface can display the distribution, within the cluster, of a data feature of a given group. Referring to Fig. 3, the largest dashed circle in Fig. 3 represents a cluster formed by multiple network users; the cluster contains 11 groups, numbered 0-10, from which one group is selected for information display.
In some embodiments, the data types that the feature distribution interface can display include, for example: the information entropy of the average operation interval dimension (average operation interval entropy), the information entropy of the IP address usage amount dimension (IP used amount entropy), the information entropy of the gender dimension (sex entropy), the information entropy of the e-mail dimension (email entropy), the information entropy of the registration time dimension (reg time entropy), the information entropy of the number-of-operations dimension (operation times entropy), the information entropy of the number-of-devices dimension (device amount entropy), the information entropy of the operation type dimension (operation type entropy), and the information entropy of the maximum amount by which a used IP is used by others (max IP used be used amount entropy). In the embodiment shown in Fig. 8, the information entropy of the registration time dimension is taken as the data feature to be displayed; that is, Fig. 8 shows the feature distribution, within the network cluster, of the information entropy of the registration time (registration period) dimension of a group. To compare effectively the feature distribution of the acquired group data set with that of the network operation data of normal users, refer to Fig. 9, which is a flow chart of displaying the interface of the feature distribution of the data set of the group. As shown in the figure, it includes the following steps:
In step S211, a group is selected, and at least one data feature is determined from the data set of the group. In one embodiment, for example, the group labelled 2 in Fig. 3 is selected, and one data feature belonging to the user information, for example the registration time, is determined from the data set of the group labelled 2.
In step S212, the feature distribution of the determined at least one data feature in the group and in the cluster is computed. In this embodiment, the feature distribution of the registration time data feature in the group and the feature distribution of the registration time data feature in the entire cluster are computed.
In step S213, the histogram of the feature distribution is displayed together with a distribution comparison of that histogram against the histogram of the entire cluster. In this embodiment, based on the encoding of the data feature, a histogram of the feature distribution of the registration time in the group and a histogram of the feature distribution of the registration time in the entire cluster are displayed. As shown in Fig. 8, in interface D, panel (a) is a thumbnail of the feature distribution of the registration time in the selected group labelled 2, and panel (d) at the bottom of interface D is the enlargement of that thumbnail. The enlargement shows that, in the one month from August 1 to August 31, the registration operations of the group's users are concentrated on five days: August 5, August 6, August 11, August 12 and August 16. Panel (c) of interface D is the histogram of the time distribution of registration operations by registered users of the cluster in August; it shows that the registrations of the cluster's users in August follow a certain regularity. Panel (b) of interface D overlays panel (d) and panel (c) to show the difference between the registration time data feature in the entire cluster and in the selected group. To let users understand the differences and connections between different features, the embodiments provided in the application present this histogram in three layers: after the user clicks one of the thumbnails, the page scrolls to the normalized distribution comparison chart. Of course, in specific applications there may be multiple thumbnails of data features, each representing a different data feature.
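Steps S211 to S213 can be illustrated by computing a normalized histogram of one feature for the selected group and for the whole cluster and then comparing the two bin by bin. This is a minimal sketch: the helper names `normalized_histogram` and `compare_distributions` are hypothetical, and the bins are assumed to be exact category labels such as the day of the month of registration.

```python
from collections import Counter


def normalized_histogram(values, bins):
    """Fraction of values falling in each bin (bins are exact labels)."""
    counts = Counter(values)
    total = len(values)
    return [counts.get(b, 0) / total if total else 0.0 for b in bins]


def compare_distributions(group_values, cluster_values, bins):
    """Per-bin rows of (bin, group share, cluster share, difference)."""
    group_hist = normalized_histogram(group_values, bins)
    cluster_hist = normalized_histogram(cluster_values, bins)
    return [(b, g, c, g - c) for b, g, c in zip(bins, group_hist, cluster_hist)]
```

Normalizing both histograms makes a small group comparable with a much larger cluster; a large positive difference in a bin marks a day on which the group's registrations are unusually concentrated.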
In some embodiments, the feature distribution of a data feature in the group and in the entire cluster may also be distinguished or emphasized by colour-coding the histogram, or by dynamic display (such as flashing).
In some embodiments, to further analyse the differences between multiple groups in a network cluster, the user data visualization method further includes a step of displaying an interface of the feature distribution of the data sets of multiple groups. Refer to Fig. 10 and Fig. 11: Fig. 10 is a flow chart of the steps of displaying the distribution of multiple groups in the cluster in one embodiment of the application, and Fig. 11 shows the interface E displaying the distribution of multiple groups in the cluster in one embodiment of the application. As shown in the figures, the steps include:
In step S311, multiple groups are determined in the cluster formed by multiple network users, and the groups are distinguished by different shapes, icons, labels and/or colours. In one embodiment, for example, the three groups labelled 0, 1 and 2 in Fig. 3 are selected, where the group labelled 0 is shown in "green", the group labelled 1 is shown in "red", and the group labelled 2 is shown in "blue".
In step S312, at least one data feature is determined from the data sets of the multiple groups. In this embodiment, one data feature, for example the IP address, is determined from the data sets of the three groups.
In step S313, based between each two network user at least one data characteristics analysis respectively group
Relative Entropy as the similarity degree between measuring each two network user;In the present embodiment, based on the IP
Relative Entropy (the letter of IP usage amount dimensions in 3 groups of adress analysis label 0,1 and 2 between each two network user
Cease entropy, IP used amount entropy) as the similarity degree measured between each two network user.For example, it adopts
It is used as with the method t-SNE of Data Dimensionality Reduction (t- distribution neighborhoods embedded mobile GIS) and with the relative entropy between two users and measures this
The index of a little network user's distances.
In step S314, a display interface is output in which network users are represented by shapes, icons and/or labels, the different groups are distinguished by different colours, and the similarity between two network users in each group is represented by the displayed distance between them. In this embodiment, in the interface E shown in Fig. 11, network users are represented by dots: "green" denotes the group labelled 0, "red" denotes the group labelled 1, and "blue" denotes the group labelled 2. The users of the "blue" group labelled 2 lie close together and form a cluster-shaped distribution; the users of the "red" group labelled 1 also lie close together and form a cluster-shaped distribution; "green" shows the distribution of randomly sampled normal users, who lie far apart and are more scattered. It can therefore be considered that the denser the cluster formed by a group, the more likely the group is a fraud group. In the embodiment shown in Fig. 11, for example, the group shown in "green" has a scattered distribution, which indicates that the "green" group is a normal group and the users represented by its "green" dots are normal users. Conversely, the group shown in "red" (the group labelled 1) and the group shown in "blue" (the group labelled 2) are distributed in clusters, which indicates that the "red" and "blue" groups are abnormal groups and the users represented by the "red" and "blue" dots are abnormal users. In one embodiment, a user of the visualization system can interactively hover the mouse to inspect the specific information and feature values of the users in each group.
In other embodiments, the network users in the output interface may also be represented by shapes, icons and/or labels: the shape is, for example, a geometric figure such as a triangle or rectangle; the icon is, for example, a smiling face, a crying face, a skull avatar or a bandit avatar; and the label is, for example, text or a clearly distinguishable symbol.
The user data visualization method of the application presents the group user grouping process, the data feature distributions, the lists and so on determined during fraud detection, so that the groups divided during fraud detection are displayed through interfaces of multiple relationship types, which helps domain experts and algorithm experts evaluate and revise the detection algorithm of the fraud detection system.
The application also provides a computer device. The computer device may be any suitable computer device, such as a handheld computer device, tablet computer device, notebook computer, desktop computer, or server. The computer device includes a display, an input device, input/output (I/O) ports, one or more processors, memory, a non-volatile storage device, a network interface, a power supply, and so on. These components may include hardware elements (such as chips and circuits), software elements (such as a tangible non-transitory computer-readable medium storing instructions), or a combination of hardware and software elements. It should also be noted that the components may be combined into fewer components or separated into additional components; for example, the memory and the non-volatile storage device may be included in a single component. The computer device may perform the visualization method alone or in cooperation with other computer devices.
Refer to Fig. 12, which is an architecture diagram of the computer device of the application in one embodiment. As shown in the figure, in this embodiment the computer device 1 includes one or more processors 11 and a presentation engine 12 executed on the processor 11, to perform the above visualization method and display the corresponding visualization interface. For example, the computer device includes a processor 11, a display, and a presentation engine 12 (or display engine) executed on the processor 11; the presentation engine 12 performs the user data visualization method described in the above embodiments and displays the result on the display. For the implementation of the user data visualization method, refer to the description of Fig. 1 to Fig. 11. In a specific implementation, the presentation engine is stored, for example, in the memory of the local computer device or on a remote storage server, and includes but is not limited to software and hardware for interface display that parses content developed in a programming language, such as XML, HTML scripts, or C. In still other embodiments, one computer device performs the visualization method and provides the corresponding visualization interface to another computer device for display. For example, a client initiates a request to a server and logs in to the server based on the user's request operation; the server performs the visualization method to form the corresponding interface data and feeds the interface data back to the client, where the client's browser or a custom application displays the corresponding graphics according to the interface data.
The application also provides a client, which connects to a server through a network. In this embodiment, the client is, for example, a web client, and the server is, for example, a web server. The web client sends a web service request to log in to the web server, performs the user data visualization method described in the above embodiments, and displays the result on a display. For the implementation of the user data visualization method, refer to the description of Fig. 1 to Fig. 11.
The application also provides a server, which connects to a client through a network. In this embodiment, the client is, for example, a web client, and the server is, for example, a web server. Based on the request operation of the web client, the web server performs the user data visualization method described in the above embodiments and sends the result to the client, where it is displayed on a display. For the implementation of the user data visualization method, refer to the description of Fig. 1 to Fig. 11.
The application also provides a browser, which connects to a server through a network. The browser sends a request to log in to the server, performs the user data visualization method described in the above embodiments, and displays the result on a display. For the implementation of the user data visualization method, refer to the description of Fig. 1 to Fig. 11. In this embodiment, the browser is, for example, a web browser, including but not limited to QQ Browser, Internet Explorer, Firefox, Safari, Opera, Google Chrome, Baidu Browser, Sogou Browser, Cheetah Browser, 360 Browser, UC Browser, Maxthon, TheWorld Browser, and so on.
The application also provides a user data visualization system. The user data visualization system may include software and hardware in one or more computer devices, and visualizes the data sets of the groups detected by the fraud detection system. To answer, group by group, the domain experts' question of what a group did to be regarded as a fraud group, and the algorithm experts' question of "whether users in the same group have the same behavioural habits", the application provides a user data visualization system based on group member relationships. Refer to Fig. 13, which is a schematic diagram of the module structure of the user data visualization system provided by the application. As shown in the figure, the user data visualization system 3 includes an acquisition module 31 and a display module 32.
The acquisition module 31 is used to obtain the data set of a group. The data features of the data set include at least user information, IP address, event type, event initiation source, event responder and event occurrence time. The user information is information capable of characterizing user identity, for example a user ID, a unique user nickname, or a certificate number. The user information further includes: mobile phone number, mailbox, ID number, gender, number of user devices used by the user, registration time, and so on. The IP address is the IP address, IP address segment or IP address grouping of the computer device used when the same user information generates an event in the network. The event type is the type of user behaviour event recorded in the network operation log, including but not limited to at least one of: social behaviours between network users such as following, liking, commenting and gifting (also called giving presents), or operation behaviours of a network user such as logging in, logging out, updating status, registering and modifying information. The same user information may correspond to at least one event type, and each event type corresponds to an event initiation source, an event responder and an event occurrence time. For example, the same user information may correspond to multiple like events, each with its own event initiation source, event responder and event occurrence time. The event initiation source is the user information of the user who initiates an event of a given type, and so on. The event responder includes the information of the target user of the initiated event, and so on.
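The data features listed above suggest a simple record layout for one entry of the operation log. The sketch below is an assumption made for illustration: the `LogRecord` class, its field names and the `events_by_type` helper are hypothetical and are not defined in the application.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRecord:
    user_id: str      # user information identifying the user
    ip_address: str   # full address, address segment, or address grouping
    event_type: str   # e.g. "follow", "like", "comment", "login"
    source: str       # user who initiated the event
    target: str       # user the event was directed at (may be empty)
    timestamp: str    # event occurrence time


def events_by_type(records, user_id):
    """Group one user's records by event type."""
    grouped = defaultdict(list)
    for record in records:
        if record.user_id == user_id:
            grouped[record.event_type].append(record)
    return dict(grouped)
```

Grouping by event type reflects the statement that one user can correspond to several events of the same type, each with its own source, responder and occurrence time.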
Here, the groups of the collected users are determined from the collected cluster according to data features. A detection algorithm (such as an unsupervised detection algorithm) for the group members participating in a fraud event is preset in the fraud detection system. To classify group members accurately, the detection algorithm classifies all members hierarchically based on the decision priorities of the data features of all collected members. Different frauds correspond to unsupervised detection algorithms with different decision priorities.
In some embodiments, the detection algorithm performs decision classification according to the similarity of the data features of all members. Specifically, refer to Fig. 2, which is a flow chart of obtaining a group data set in one embodiment provided by the application. As shown in the figure, the acquisition module may obtain the data set of one group from the multiple group data sets obtained through the following steps:
Step S111: obtain the operation log of a cluster formed by multiple network users. In various embodiments, the cluster is the cluster of all network users that can be obtained; the network users in the cluster come from the same website or from different websites, or from different Internet channels, which may be, for example, the Internet, one or more intranets, a local area network (LAN), a wide area network (WLAN), a storage area network (SAN), or an appropriate combination thereof, or a mobile phone communication network.
Step S112: determine at least one data feature from the operation logs of the multiple network users, and analyse the similarity of the at least one data feature in the operation logs to determine the groups. In a specific embodiment, because online fraud inevitably leaves characteristic traces in the data that users generate in the network, the fraud detection system collects the operation logs of multiple network users from at least one website and, by analysing the similarity of at least one data feature in the operation logs, groups the users who generated the corresponding operation logs, obtaining the groups in the operation logs and the data set of each group.
In certain embodiments, the data set of a group includes but is not limited to at least two of the following data features: user information, IP address, event type, event initiation source, event responder and event occurrence time. The user information is, for example, a mobile phone number, mailbox, ID number, identity card number, gender, number of user devices used by the user, or registration time. The same user information may correspond to at least one event type, and each event type corresponds to an event initiation source, an event responder and an event occurrence time. The event type includes but is not limited to at least one of: social behaviours between network users such as following, liking, commenting and gifting, or operation behaviours of a network user such as logging in, logging out, updating status, registering and modifying information. For example, the same user information may correspond to multiple like events, each with its own event initiation source, event responder and event occurrence time.
Step S113: obtain the data set of the group. In some embodiments, the data set may be obtained from a database storing the groups and their data sets. The database is configured, for example, in a remote storage server or in a storage device of the local computer device, and the data set of a group is extracted from the database based on the user's input operation. For example, the fraud detection system obtains multiple groups using the unsupervised detection algorithm, the user selects one of the groups through a selection interface, and the data set of the selected group is then obtained.
Specifically, the fraud detection system first calculates the similarity of data features of the same class across all data in the operation log, where the similarity can be measured by information entropy. For example, the fraud detection system uses the user information to calculate the information entropy of the IP usage amount dimension or the maximum IP usage amount dimension, uses the event type to calculate the information entropy of the operation type dimension, and uses the registration time or operation time to calculate the information entropy of the registration time dimension or of the bad operation dimension. The information entropies obtained from these calculations are then passed to an unsupervised detection method, which divides the users into multiple groups. Examples of the unsupervised detection method include algorithms based on dense subgraphs and algorithms based on vector spaces. Each group presented by the visualization method provided herein reflects the shared resources, user relationships, and the like used in the fraudulent behavior, so that a user of the fraud detection system can determine more clearly whether the classification strategy of the unsupervised detection algorithm is reasonable. The shared resources include, but are not limited to, shared IP addresses and mailboxes; the user relationships include, but are not limited to, user follows and interactions.
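As a minimal sketch of the similarity measure described above, the information entropy of one data feature can be computed from the empirical distribution of its values. The function name `information_entropy` and the sample IP values are hypothetical and do not come from the patent; the patent does not state the logarithm base, so base 2 is assumed here.

```python
import math
from collections import Counter

def information_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    total = len(values)
    if total == 0:
        return 0.0
    counts = Counter(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical IP values taken from the operation logs of two sets of users
ips_shared = ["10.0.0.1"] * 4                                  # everyone on one IP
ips_mixed = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]   # all different

entropy_shared = information_entropy(ips_shared)  # values fully concentrated
entropy_mixed = information_entropy(ips_mixed)    # values fully spread out
```

A lower entropy means the values of the feature are more concentrated, which in this setting suggests a shared resource such as a common IP address.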
In one embodiment, the display module 32 of the user data visualization system can display at least one group interface, in which the size of each group is represented by the size of the displayed geometric figure. Referring to Fig. 3, which shows an interface containing multiple groups in one embodiment of the present application: 11 groups are displayed in the interface, the geometric figures representing the groups are circles, and all 11 groups lie within one largest dashed circle. The largest dashed circle represents, for example, a cluster made up of N network users. The group labeled 0 is, for example, a normal group. Within a smaller dashed circle there are 10 groups of different sizes labeled 1-10, which are, for example, abnormal groups. The size of each circle is proportional to the number of group members; that is, a large circle indicates a group with more members and a small circle indicates a group with fewer members. In various embodiments, the geometric figure of a group may be any shape. The color of a geometric figure may be set randomly, or may be related to the number of groups or to the number of members of the group. For example, N colors are preset and the fraud detection system randomly assigns different colors to the geometric figures representing the groups. As another example, the fraud detection system assigns colors from a preset color sequence to the geometric figures of the groups in ascending order of member count. When the user operates the display interface to select a geometric figure, the fraud detection system obtains the data set of the corresponding group.
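The sizing and coloring rules above can be sketched as follows. This is only an illustration: the helper names, the palette, and the member counts are hypothetical, and mapping the member count to circle area (rather than radius) is an assumption, since the patent only says that circle size is proportional to the number of members.

```python
import math

def circle_radius(member_count, scale=1.0):
    """Radius of a circle whose AREA is proportional to member_count.

    Assumption: "size" is interpreted as area, so the radius grows with
    the square root of the member count.
    """
    return scale * math.sqrt(member_count / math.pi)

def assign_colors(group_sizes, palette):
    """Map group id -> color, walking the palette in ascending member count."""
    ordered = sorted(group_sizes, key=lambda gid: (group_sizes[gid], gid))
    return {gid: palette[i % len(palette)] for i, gid in enumerate(ordered)}

# Hypothetical member counts for three groups and a light-to-dark palette
group_sizes = {1: 50, 2: 10, 3: 200}
palette = ["#fee5d9", "#fb6a4a", "#a50f15"]
colors = assign_colors(group_sizes, palette)
```

With this choice the smallest group receives the first color in the preset sequence and the largest group the last one.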
In a preferred embodiment, the at least one displayed group interface may further include an information bar that displays group information. When the user selects a group in the group interface, the basic information of that group is shown at the side of the interface in the form of a table or text box. The basic information is, for example, the group code, the number of members, the most preferred data feature used to determine the group, and the group attribute (for example, normal group or abnormal group).
To show the decision process behind the grouping, after the fraud detection system has performed the grouping, the display module 32 displays, in the form of a tree structure, the process by which the corresponding detection algorithm classified the users of a group according to data features ordered by decision priority. Domain experts and/or algorithm experts can then use the visualization interface to identify and address the deficiencies and defects of the detection algorithm.
The display module 32 is configured to display a decision tree graph that represents the attribute test process of all users in the group. The attribute of a user may include normal user (Normal) and abnormal user (Abnormal), or may include normal user (Normal), fraud role A (Abnormal A), fraud role B (Abnormal B), and so on. In the display interface, the display module 32 traces the decision tree from the root node along the decision paths through each non-leaf node to each leaf node, thereby representing the process by which the detection algorithm of the fraud detection system classifies the users of one group level by level, from the highest priority to the lowest priority, to obtain the attribute of each user. The display interface shows the following: the data feature of the first priority at the root node of the decision tree graph and its decision value range; the final attribute of the at least one user represented by each leaf node; the current attribute of the multiple users represented by each non-leaf node, together with the data feature of the current priority and its decision value range; and the decision paths leading from the root node or each non-leaf node, where the decision paths are distinguished by lines of different colors, shapes, or thicknesses. As the decision tree graph shows, users assigned to a leaf node have been given a final attribute of normal user or abnormal user, whereas users assigned to a non-leaf node must be classified further until they reach a leaf node that determines their final attribute.
The decision result of the decision tree is obtained by analyzing, level by level, the relationship between the original value of each data feature of a user and the corresponding decision threshold. For example, for a selected group of users in the fraud detection system, after decision tree pruning, all users are classified using, in descending order of priority, the maximum IP usage amount, the out-degree (out_degree) of the user in the social network, and the IP usage amount calculated from the user information. Examples of the unsupervised detection algorithm include algorithms based on dense subgraphs and algorithms based on vector spaces. Each group presented by the visualization method provided herein reflects the shared resources, user relationships, and the like used in the fraudulent behavior, so that a user of the fraud detection system can determine more clearly whether the classification strategy of the unsupervised detection algorithm is reasonable. The shared resources include, but are not limited to, shared IP addresses and mailboxes; the user relationships include, but are not limited to, user follows and interactions.
Referring to Fig. 4, which is a schematic diagram of the decision tree graph for the users of a group. As shown, the root node of the decision tree graph indicates that the data feature of highest priority is the maximum IP usage amount, which is used to classify the attributes of the users of the group: when the maximum IP usage amount (max_IP_used_be_used_amount) of a user is ≤ 80.5, the user is routed along the "blue" decision path to the first non-leaf node; otherwise, the user is routed along the "yellow" decision path to the first leaf node. The first non-leaf node continues the classification using the event initiation source as the data feature of second priority: when the out-degree (out_degree) of a user as an event initiation source in the social network is ≤ 711.0, the user is routed along the "blue" decision path to the second leaf node; otherwise, the user is routed along the "yellow" decision path to the second non-leaf node. The second non-leaf node continues the classification using the IP usage amount as the data feature of third priority: when the IP usage amount (IP_used_amount) of a user is ≤ 870.0, the user is routed along the "blue" decision path to the third leaf node; otherwise, the user is routed along the "yellow" decision path to the fourth leaf node. Here "blue" denotes a decision path on which users are classified as "normal user" at the current priority, and "yellow" denotes a decision path on which users are classified as "abnormal user" at the current priority. The values against which each non-leaf node tests the corresponding feature, such as 80.5, 711.0, and 870.0 in the figure, are the decision value ranges of the data features at the respective priorities.
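The Fig. 4 walk-through can be sketched as a small function that follows the three thresholds in priority order. The feature names and thresholds are taken from the description above; the function name, the dictionary layout of a user record, and the rule that a leaf's final attribute follows the color of the last path taken are assumptions made for illustration.

```python
def classify_user(user):
    """Return (final_attribute, path) for one user in the Fig. 4 example tree.

    `path` lists the decision-path colors taken from the root to the leaf.
    Assumption: blue means "normal" and yellow means "abnormal" for the
    leaf that the path ends in.
    """
    path = []
    # Root node: first-priority feature, maximum IP usage amount
    if user["max_IP_used_be_used_amount"] > 80.5:
        path.append("yellow")
        return "abnormal", path   # first leaf node
    path.append("blue")
    # First non-leaf node: second-priority feature, out-degree
    if user["out_degree"] <= 711.0:
        path.append("blue")
        return "normal", path     # second leaf node
    path.append("yellow")
    # Second non-leaf node: third-priority feature, IP usage amount
    if user["IP_used_amount"] <= 870.0:
        path.append("blue")
        return "normal", path     # third leaf node
    path.append("yellow")
    return "abnormal", path       # fourth leaf node

# Hypothetical user record
example = {"max_IP_used_be_used_amount": 12, "out_degree": 900, "IP_used_amount": 1000}
attribute, path = classify_user(example)
```

Returning the path alongside the attribute mirrors what the display shows: not only the result, but the route through the tree that produced it.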
In various embodiments, the decision paths can also be distinguished by lines of different shapes. For example, a solid line indicates that the decided user attribute is normal user and a dashed line indicates abnormal user; alternatively, a straight line indicates normal user and a curved line indicates abnormal user; or lines of different thickness distinguish the decision paths, for example a thin line indicating normal user and a thick line indicating abnormal user.
To show more clearly the number of users obtained by each non-leaf node and leaf node, when the decision tree graph is displayed to represent the attribute test process of all users in the group, the root node of the decision tree graph also shows the number of users in the group (that is, the sample size given to the root node), and each non-leaf node also shows the number of users having the current attribute (that is, the sample size obtained by that non-leaf node). Referring to Fig. 5, which shows a display interface in which the decision tree graph also includes the number of users classified to each node: the sample_size shown at the root node is the sample size given to the root node, that is, the total number of group members; the sample_size shown at each other non-leaf node is the sample size obtained by that node; and the sample_size shown at a leaf node is the number of users classified into that node from the level above.
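The per-node sample_size values could be derived by counting how many users pass through each node. The sketch below encodes the thresholds of the Fig. 4 example as a hypothetical lookup table; the node names, the table layout, and the four sample users are illustrative assumptions, not part of the patent.

```python
from collections import Counter

# Hypothetical encoding: node -> (feature, threshold, child_if_le, child_if_gt)
TREE = {
    "root": ("max_IP_used_be_used_amount", 80.5, "nonleaf_1", "leaf_1"),
    "nonleaf_1": ("out_degree", 711.0, "leaf_2", "nonleaf_2"),
    "nonleaf_2": ("IP_used_amount", 870.0, "leaf_3", "leaf_4"),
}

def sample_sizes(users):
    """Count the users that reach each node (the sample_size of that node)."""
    sizes = Counter()
    for user in users:
        node = "root"
        sizes[node] += 1
        while node in TREE:  # names missing from TREE are leaf nodes
            feature, threshold, child_le, child_gt = TREE[node]
            node = child_le if user[feature] <= threshold else child_gt
            sizes[node] += 1
    return dict(sizes)

# Four hypothetical users, one ending in each leaf
users = [
    {"max_IP_used_be_used_amount": 100, "out_degree": 0, "IP_used_amount": 0},
    {"max_IP_used_be_used_amount": 10, "out_degree": 5, "IP_used_amount": 0},
    {"max_IP_used_be_used_amount": 10, "out_degree": 800, "IP_used_amount": 100},
    {"max_IP_used_be_used_amount": 10, "out_degree": 800, "IP_used_amount": 900},
]
sizes = sample_sizes(users)
```

The count at the root equals the total number of group members, and the count at every other node equals the number of users handed down from the level above.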
It should be noted that, depending on the type of fraud and the design of the unsupervised detection algorithm, the priority of each data feature, the decision value range at each priority, the relationship between adjacent priorities, the decision paths at each level, and so on may all differ in the decision classification process for each group detected from the operation log. Furthermore, to obtain the grouping decision for each user in the operation log more quickly, the unsupervised detection algorithm may prune the selected data features according to convergence during training: once the detection algorithm being trained reaches its convergence condition, the remaining data features are pruned, and pruned data features are not shown in the display interface of the decision tree graph. Alternatively, if all users in the obtained group have already been assigned an attribute within the first few levels of classification by the detection algorithm, the remaining data features can be pruned, and the display module 32 displays only a decision tree graph containing all decision paths and the nodes they connect. When the user data visualization system described herein displays the classification decision process of a group, domain experts and algorithm experts can more easily evaluate the accuracy of the detection algorithm.
In the display interface of the decision tree graph, or in another display interface reached by a jump based on an obtained operation instruction, the display module 32 is further configured to determine a user in the group as a target user, and to display a time axis at one side of the decision tree graph so as to present the operation log of the target user on the time axis.
Here, when a domain expert or algorithm expert clicks a leaf node and selects a user link in the pop-up window of that leaf node, the operation log of that user is displayed on a time axis beside the decision tree graph. Referring to Fig. 6, which is a schematic diagram of an interface showing the operation log of the target user on a time axis on the left and the group decision tree graph on the right. As shown, the time axis marks the sequential nodes of the operation log from top to bottom in chronological order, and each node shows the following from the operation log at the corresponding point in time: event type (e.g., event_type), event occurrence time (e.g., timestamp), user information (e.g., user_id), IP address (e.g., a complete IP address or an IP segment), event responder (e.g., target_user), and event content (e.g., comment_id, comment_lenth, amount, object_id, target_video). By showing the operation history of each user in the group on a time axis, domain experts and algorithm experts can check in detail the accuracy of the attributes detected for users in the same group, as well as the commonalities between normal users and abnormal users within the same group, and thereby confirm the deficiencies and defects of the detection algorithm.
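Preparing the data for such a time axis amounts to filtering the operation log by the target user and ordering the events chronologically. The following is a minimal sketch; the field names follow the examples given above (user_id, timestamp, event_type, target_user), while the function name and the log entries are hypothetical.

```python
def timeline_for(operation_log, user_id):
    """Return the events of one user, ordered by timestamp (oldest first)."""
    events = [event for event in operation_log if event["user_id"] == user_id]
    return sorted(events, key=lambda event: event["timestamp"])

# Hypothetical operation log entries
operation_log = [
    {"user_id": "u1", "timestamp": 1502000300, "event_type": "comment", "target_user": "u9"},
    {"user_id": "u2", "timestamp": 1502000100, "event_type": "login", "target_user": None},
    {"user_id": "u1", "timestamp": 1502000000, "event_type": "register", "target_user": None},
]
timeline = timeline_for(operation_log, "u1")
```

Each element of the returned list corresponds to one sequential node on the time axis, drawn from top to bottom.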
In other embodiments, domain experts and algorithm experts care not only about the process of classifying the attributes of group members but also about whether the assigned groups are reasonable. This requires that they can inspect the detailed data features within each group and, from another dimension, examine the preferred order of the data features used to build the groups. The visualization method is therefore also used to display an interface of the data set of a group. The data set is shown as a list, which presents the user with the details of the data features within one group. To improve the accuracy of classifying the data set of the group, the list shown in the interface may present the data features of a group row by row according to the classification priority used by the fraud detection system. For example, refer to Fig. 7, which is a schematic diagram of the list interface of the data set of a group in one embodiment of the present application. In this list interface, the data set of the group is sorted by the similarity of the data features in descending order of priority. When the similarity of the first-priority data feature is identical, sorting proceeds by the second-priority data feature. In the embodiment shown in Fig. 7, the descending order of priority is: IP address (a segment or grouping of the IP address), event initiation source (source), event responder (target), event type (event_type), and event occurrence time (timestamp). In this embodiment, the table header encodes the importance of the different columns: the more concentrated the values of a feature, the more important the feature. In one embodiment provided by the present application, the fraud detection system expresses this property by calculating the information entropy of each feature; lower information entropy means higher consistency. The fraud detection system then sorts the features in ascending order of information entropy, so that the column headers with the lowest information entropy come first to draw the attention of the user. Of course, in other implementations, the column headers of the displayed table can also be color-rendered, for example rendering the header with the lowest information entropy in the darkest color to indicate that the data feature represented by that column is the most important, and rendering the other columns correspondingly, which yields the data set list interface shown in the figure. The list interface may follow the step of displaying the multiple group interfaces, or may come before or after step S12, and is displayed based on the operation by which the user selects the list interface.
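The column ordering described above, lowest information entropy first, can be sketched as follows. The helper names and the sample rows are hypothetical, and base-2 entropy is an assumption because the patent does not state the logarithm base.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    total = len(values)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in Counter(values).values())

def order_columns(rows, columns):
    """Sort column names by ascending entropy of their values across rows."""
    return sorted(columns, key=lambda col: entropy([row[col] for row in rows]))

# Hypothetical data set of one group
rows = [
    {"ip": "10.0.0.1", "event_type": "like", "timestamp": 1},
    {"ip": "10.0.0.1", "event_type": "like", "timestamp": 2},
    {"ip": "10.0.0.1", "event_type": "comment", "timestamp": 3},
    {"ip": "10.0.0.1", "event_type": "comment", "timestamp": 4},
]
ordered = order_columns(rows, ["timestamp", "event_type", "ip"])
```

In this sample the shared IP column has zero entropy and therefore moves to the front, while the timestamp column, whose values are all different, moves to the end. The same entropy values could also drive the header color intensity.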
In certain embodiments, to further characterize whether the data set of the obtained group reflects the characteristics of fraud, the data set also needs to be displayed from other dimensions, for example by comparing the network operation data of normal users with the data set of the group to further confirm the accuracy of the detected fraud. To this end, the display module 32 is further configured to display a feature distribution interface for the data set of the group. The feature distribution interface can show the distribution of each data type within the overall network. The overall network is a relative notion: for example, if multiple network users form a cluster, the interface can show how a data feature of a given group is distributed within that cluster. Referring to Fig. 3, the largest dashed circle in Fig. 3 represents a cluster formed by multiple network users; the cluster contains 11 groups, numbered 0-10, from which one group is selected for information display.
In some embodiments, the data types that the feature distribution interface can display include, for example: the information entropy of the average operation interval dimension (average operation interval entropy), the information entropy of the IP address usage amount dimension (IP used amount entropy), the information entropy of the gender dimension (sex entropy), the information entropy of the email dimension (email entropy), the information entropy of the registration time dimension (reg time entropy), the information entropy of the operation count dimension (operation times entropy), the information entropy of the device count dimension (device amount entropy), the information entropy of the operation type dimension (operation type entropy), and the information entropy of the maximum number of times a used IP is used by others (max IP used be used amount entropy). In the embodiment shown in Fig. 8, the information entropy of the registration time dimension is taken as the data feature for display; that is, Fig. 8 shows the feature distribution, within the network cluster, of the information entropy of the registration time (registration period) in a group. To compare effectively the feature distribution of the obtained group data set against that of the network operation data of normal users, refer to Fig. 9, which is a flow chart for displaying the feature distribution interface of the data set of the group. As shown, the user data visualization system performs the following steps so that the display module 32 displays each generated diagram in the corresponding interface:
In step S211, a group is selected, and at least one data feature is determined from the data set of the group. In one embodiment, for example, the group labeled 2 in Fig. 3 is selected, and a data feature that is user information, for example the registration time, is determined from the data set of the group labeled 2.
In step S212, the feature distribution of the determined at least one data feature is computed both within the group and within the cluster. In this embodiment, the feature distribution of the registration time data feature within the group and the feature distribution of the registration time data feature within the entire cluster are computed.
In step S213, the histogram of the feature distribution is displayed, together with a distribution comparison chart that places this histogram against the histogram of the entire cluster. In this embodiment, based on the encoding of the data feature, a histogram of the feature distribution of the registration time within the group and a histogram of the feature distribution of the registration time within the entire cluster are displayed. As shown in Fig. 8, in interface D, panel (a) is a thumbnail of the feature distribution of the registration time in the selected group labeled 2, and the enlarged view of this thumbnail is panel (d) at the bottom of interface D. The enlarged view shows that, over the month from August 1 to August 31, the times at which the users of the group registered are concentrated on five days: August 5, 6, 11, 12, and 16. Panel (c) in interface D is a histogram of the time distribution of registrations by registered users of the cluster in August; it shows that registrations in the cluster follow a certain regular pattern across August. Panel (b) in interface D overlays panels (d) and (c) to show the difference between the registration time data feature in the entire cluster and in the selected group. To let users understand the differences and connections between features, the embodiment provided by the present application presents this histogram view in three layers; after the user clicks one of the thumbnails, the page scrolls to the normalized distribution comparison chart. Of course, in a specific application there may be multiple thumbnails, each representing a different data feature.
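Steps S211 to S213 can be sketched as computing two normalized histograms over the same bins, one for the selected group and one for the whole cluster, so that they can be overlaid in a comparison chart. The function name and the registration days below are hypothetical sample data, not values read from Fig. 8.

```python
from collections import Counter

def normalized_histogram(values, bins):
    """Fraction of `values` falling on each bin; fractions sum to 1 when
    every value lies in `bins` and `values` is not empty."""
    counts = Counter(values)
    total = len(values)
    return [counts.get(b, 0) / total if total else 0.0 for b in bins]

days_of_august = list(range(1, 32))

# Hypothetical registration days (day of month)
group_days = [5, 5, 6, 11, 12, 16, 16, 16]   # concentrated on a few days
cluster_days = list(range(1, 32))            # spread evenly over the month

group_hist = normalized_histogram(group_days, days_of_august)
cluster_hist = normalized_histogram(cluster_days, days_of_august)

# Per-day difference, usable for an overlay such as panel (b)
difference = [g - c for g, c in zip(group_hist, cluster_hist)]
```

Normalizing both histograms makes a small group comparable with a much larger cluster, which is presumably why the description refers to a normalized comparison chart.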
In some embodiments, the histograms can also be color-rendered, or displayed dynamically (for example, by flashing), to distinguish or emphasize the feature distribution of a given data feature within the group and within the entire cluster.
In some embodiments, to further analyze the differences among multiple groups in a network cluster, the display module 32 is further configured to display an interface of the feature distribution of the data sets of multiple groups. Refer to Fig. 10 and Fig. 11: Fig. 10 is a flow chart of the steps by which the display module 32 displays the distribution of multiple groups within the cluster in one embodiment, and Fig. 11 shows interface E, in which the present application displays the distribution of multiple groups within the cluster in one embodiment. As shown, the steps include:
In step S311, multiple groups are determined in a cluster made up of multiple network users, and the groups are distinguished from one another by different shapes, icons, labels, and/or colors. In one embodiment, for example, the three groups labeled 0, 1, and 2 in Fig. 3 are selected, where the group labeled 0 is shown in "green", the group labeled 1 in "red", and the group labeled 2 in "blue".
In step S312, at least one data feature is determined from the data sets of the multiple groups. In this embodiment, one data feature, for example the IP address, is determined from the data sets of the three groups.
In step S313, based on the at least one data feature, the relative entropy between every two network users in each group is analyzed and used as a measure of the similarity between those two users. In this embodiment, based on the IP address, the relative entropy of the IP usage amount dimension (IP used amount entropy) between every two network users in the three groups labeled 0, 1, and 2 is analyzed and used as the measure of similarity between every two network users. For example, the dimensionality reduction method t-SNE (t-distributed stochastic neighbor embedding) is applied, with the relative entropy between two users serving as the index of distance between those network users.
In step S314, a display interface is output in which network users are represented by shapes, icons, and/or labels, the multiple groups are distinguished by different colors, and the similarity between two network users in each group is represented by the displayed distance between them. In this embodiment, in interface E shown in Fig. 11, network users are represented by dots: "green" denotes the group labeled 0, "red" the group labeled 1, and "blue" the group labeled 2. The users of the group labeled 2, shown in "blue", are close to one another and form a clustered distribution; the users of the group labeled 1, shown in "red", are likewise close to one another and form a clustered distribution; and "green" shows the distribution of randomly sampled normal users, who are farther apart and more dispersed. It can therefore be inferred that the denser the cluster formed by a group, the more likely the group is a fraud group. In the embodiment shown in Fig. 11, the "green" group is relatively dispersed, which indicates that it is a normal group and that the users represented by the "green" dots are normal users. Conversely, the group shown in "red" (the group labeled 1) and the group shown in "blue" (the group labeled 2) are clustered, which indicates that these are abnormal groups and that the users represented by the "red" and "blue" dots are abnormal users. In one embodiment, a user of the visualization system can interactively inspect the specific information and feature values of the users in each group by hovering the mouse.
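The relative entropy used in step S313 is the Kullback-Leibler divergence between two probability distributions. The sketch below assumes that each user is described by a discrete distribution over the same set of categories (for example, the share of operations per IP segment); that representation, the smoothing constant `eps`, and the symmetrization into a distance are assumptions of this sketch, since the patent does not specify them.

```python
import math

def relative_entropy(p, q, eps=1e-9):
    """KL divergence D(p || q) in bits for two aligned probability lists.

    Zero entries of q are replaced by `eps` to avoid division by zero;
    terms where p is zero contribute nothing.
    """
    return sum(pi * math.log2(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)

def symmetric_distance(p, q):
    """Symmetrized relative entropy, usable as a pairwise distance."""
    return relative_entropy(p, q) + relative_entropy(q, p)

# Hypothetical per-user distributions over three IP segments
user_a = [0.7, 0.2, 0.1]
user_b = [0.1, 0.2, 0.7]
user_c = [0.7, 0.2, 0.1]

distance_ab = symmetric_distance(user_a, user_b)  # dissimilar users
distance_ac = symmetric_distance(user_a, user_c)  # identical distributions
```

Relative entropy is not symmetric on its own, so summing both directions is one common way to obtain a value that behaves like a distance. A matrix of such pairwise values could then be handed to a t-SNE implementation that accepts precomputed distances in order to place the users in two dimensions.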
In other examples, in the output interface the network users can also be represented by shapes, icons, and/or labels. The shapes are, for example, geometric figures such as triangles and rectangles; the icons are, for example, a smiling face or crying face, a skull avatar, or a robber avatar; and the labels are, for example, text or symbols that allow clear distinction.
The user data visualization system of the present application presents the grouping process, the data feature distribution, the list view, and other aspects of the user groups determined during fraud detection, so that the groups identified during fraud detection are displayed through multiple relationship interfaces. This helps domain experts and algorithm experts evaluate and revise the detection algorithm of the fraud detection system.
It should be noted that all modules of the user data visualization system can be configured on a single computer device. Alternatively, the modules of the user data visualization system are arranged separately on a client at the user side and a server at the network side, with the client connected to the server over a network. For example, the acquisition module of the user data visualization system is installed on the server and the display module is installed on the client. The client logs in to the server by sending a request, and the server runs the user data visualization system based on the operation requests executed by the client and displays the corresponding interfaces to the user through the client. The client includes, but is not limited to, a browser or dedicated client software interface configured on the user terminal, and the hardware used to run the display interface program.
It should also be noted that, from the above description of the embodiments, those skilled in the art can clearly understand that part or all of the present application can be implemented by software in combination with a necessary general-purpose hardware platform. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product may include one or more machine-readable media on which machine-executable instructions are stored; when executed by one or more machines such as a computer, a computer network, or other electronic devices, these instructions cause the one or more machines to perform operations according to the embodiments of the present application. Machine-readable media may include, but are not limited to, floppy disks, optical discs, CD-ROMs (compact disc read-only memory), magneto-optical disks, ROM (read-only memory), RAM (random access memory), EPROM (erasable programmable read-only memory), EEPROM (electrically erasable programmable read-only memory), magnetic or optical cards, flash memory, or other types of media/machine-readable media suitable for storing machine-executable instructions.
The present application can be used in numerous general-purpose or special-purpose computing system environments or configurations, for example: personal computers, server computers, handheld or portable devices, laptop devices, multiprocessor systems, microprocessor-based systems, set-top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, and distributed computing environments that include any of the above systems or devices.
The present application can be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types. The present application can also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.
It should be noted that those skilled in the art will understand that some of the above components may be programmable logic devices, including one or more of: programmable array logic (Programmable Array Logic, PAL), generic array logic (Generic Array Logic, GAL), field-programmable gate array (Field-Programmable Gate Array, FPGA), and complex programmable logic device (Complex Programmable Logic Device, CPLD). The present application places no specific limitation on this.
The above embodiments merely illustrate the principles and effects of the present application and are not intended to limit it. Anyone familiar with this technology may modify or change the above embodiments without departing from the spirit and scope of the present application. Therefore, all equivalent modifications or changes completed by those of ordinary skill in the art without departing from the spirit and technical ideas disclosed herein shall still be covered by the claims of the present application.
Claims (25)
1. A user data visualization method, applied in a fraud detection system, characterized by comprising the following steps:
obtaining a data set of a group, the data features of the data set including user information, IP address, event type, event initiator, event responder, and event occurrence time; wherein the data features of the data set are assigned different decision priorities;
displaying a decision tree graphic to characterize the attribute test process of all users in the group, wherein:
the data feature of the first priority and its decision value range are displayed at the root node of the decision tree graphic;
the final attribute of at least one user characterized by each leaf node of the decision tree graphic is displayed;
the current attribute of the multiple users characterized by each non-leaf node of the decision tree graphic, the data feature of the current priority, and its decision value range are displayed; and
the decision paths of the root node or of each non-leaf node in the decision tree graphic are displayed, the decision paths being characterized by lines of different colors, shapes, or thicknesses.
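The structure recited in claim 1 can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented implementation: the priority order in `PRIORITY`, the `label` field used as the user attribute, the majority-vote rule for the current attribute, the text-based rendering, and the style names in `EDGE_STYLES` are all assumptions introduced here.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional

# Assumed priority order of data features (highest first); the claim only
# states that the features carry different decision priorities.
PRIORITY = ["event_type", "ip_address"]
# Assumed line styles used to tell decision paths apart.
EDGE_STYLES = ["solid/red", "dashed/blue", "dotted/green"]


@dataclass
class Node:
    users: list                                   # user records reaching this node
    feature: Optional[str] = None                 # feature tested here; None for a leaf
    children: dict = field(default_factory=dict)  # decision value -> child Node

    @property
    def attribute(self) -> str:
        # Current (or final) attribute: majority label of the users at this node.
        return Counter(u["label"] for u in self.users).most_common(1)[0][0]


def build(users, depth=0):
    """Split users by the feature of the current priority until labels are pure."""
    node = Node(users)
    if depth == len(PRIORITY) or len({u["label"] for u in users}) == 1:
        return node  # leaf node
    node.feature = PRIORITY[depth]
    groups = {}
    for u in users:
        groups.setdefault(u[node.feature], []).append(u)
    node.children = {value: build(g, depth + 1) for value, g in groups.items()}
    return node


def render(node, indent=0, lines=None):
    """Return text lines: tested feature, value range, attribute, styled paths."""
    lines = [] if lines is None else lines
    pad = "  " * indent
    if node.feature is None:
        lines.append(f"{pad}leaf | attribute={node.attribute} | n={len(node.users)}")
        return lines
    domain = sorted(node.children)
    lines.append(f"{pad}test {node.feature} in {domain} | "
                 f"attribute={node.attribute} | n={len(node.users)}")
    for i, value in enumerate(domain):
        style = EDGE_STYLES[i % len(EDGE_STYLES)]
        lines.append(f"{pad}  -[{style}]-> {node.feature}={value}")
        render(node.children[value], indent + 2, lines)
    return lines


# Made-up sample records for illustration only.
users = [
    {"user": "u1", "event_type": "login", "ip_address": "10.0.0.1", "label": "fraud"},
    {"user": "u2", "event_type": "login", "ip_address": "10.0.0.2", "label": "normal"},
    {"user": "u3", "event_type": "comment", "ip_address": "10.0.0.1", "label": "normal"},
]
tree = build(users)
print("\n".join(render(tree)))
```

A graphical front end would replace `render` with actual drawing code; the point of the sketch is only that each node carries the tested feature, its value range, and an attribute, and that each outgoing path gets its own line style.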
2. The user data visualization method according to claim 1, characterized in that, in displaying the decision tree graphic to characterize the attribute test process of all users in the group, the root node of the decision tree graphic also displays the number of users in the group, and each non-leaf node of the decision tree graphic also displays the number of users having the current attribute.
3. The user data visualization method according to claim 1 or 2, characterized in that the displayed decision tree graphic is a pruned decision tree graphic.
4. The user data visualization method according to claim 1, characterized by further comprising the following steps:
determining a user in the group as a target user;
displaying a time axis at one side of the decision tree graphic, and presenting the operation log of the target user on the time axis.
5. The user data visualization method according to claim 1, characterized in that the step of obtaining the data set of a group comprises:
obtaining the operation logs of a cluster made up of multiple network users;
determining at least one data feature from the operation logs of the multiple network users, and analyzing the similarity of at least one set of data features in the operation logs to determine the group; and
obtaining the data set of the group.
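One way to picture the grouping step of claim 5 is sketched below. The claim does not name a similarity measure or a grouping rule, so Jaccard similarity over each user's set of (IP address, event type) pairs, the 0.5 threshold, and the union-find merging are assumptions chosen purely for illustration; the log records are made up.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_groups(features_by_user, threshold=0.5):
    """Merge users whose feature sets are at least `threshold` similar."""
    parent = {u: u for u in features_by_user}

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path compression
            u = parent[u]
        return u

    for a, b in combinations(features_by_user, 2):
        if jaccard(features_by_user[a], features_by_user[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for u in features_by_user:
        groups.setdefault(find(u), set()).add(u)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical operation log: (user, IP address, event type).
log = [
    ("u1", "10.0.0.1", "login"), ("u1", "10.0.0.1", "like"),
    ("u2", "10.0.0.1", "login"), ("u2", "10.0.0.1", "like"),
    ("u3", "10.0.0.9", "comment"),
]
features_by_user = {}
for user, ip, event in log:
    features_by_user.setdefault(user, set()).add((ip, event))

groups = find_groups(features_by_user)
print(groups)
```

Here `u1` and `u2` share the same IP address and event types and so fall into one group, while `u3` stays alone; the data set of a group is then simply the log rows of its members.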
6. The user data visualization method according to claim 1 or 5, characterized by further comprising the step of displaying at least one group interface, wherein the size of each group in the group interface is characterized by the size of a displayed geometric figure.
7. The user data visualization method according to claim 1 or 5, characterized by further comprising the step of displaying an interface of the data set of a group, wherein the data features of the data set of the group include at least two of user information, IP address, event type, event initiator, event responder, and event occurrence time, and, in the interface of the group data set, the group data set is grouped and then displayed in sorted order.
8. The user data visualization method according to claim 1 or 5, characterized by further comprising the step of displaying an interface of the feature distribution of the data set of the group:
selecting a group, and determining at least one data feature from the data set of the group;
counting the feature distribution of the determined at least one data feature in the group and in the cluster; and
displaying a histogram of the feature distribution and a profile comparison chart of the corresponding histogram within the histogram of the entire cluster.
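The counting step of claim 8 amounts to computing the same histogram twice, once over the group and once over the whole cluster, and placing the two side by side. The sketch below assumes relative frequencies and a plain-text bar rendering; the function names and sample records are hypothetical.

```python
from collections import Counter


def feature_histogram(records, feature):
    """Relative frequency of each value of `feature` among `records`."""
    counts = Counter(r[feature] for r in records)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}


def compare(group, cluster, feature):
    """Rows of (value, share in group, share in cluster), sorted by value."""
    g = feature_histogram(group, feature)
    c = feature_histogram(cluster, feature)
    return [(value, g.get(value, 0.0), c[value]) for value in sorted(c)]


# Made-up cluster of four records; the group is the two login records.
cluster = [
    {"user": "u1", "event_type": "login"},
    {"user": "u2", "event_type": "login"},
    {"user": "u3", "event_type": "comment"},
    {"user": "u4", "event_type": "like"},
]
group = cluster[:2]

rows = compare(group, cluster, "event_type")
for value, in_group, in_cluster in rows:
    # One text bar per distribution; '#' is the group, '-' is the cluster.
    print(f"{value:8s} {'#' * round(in_group * 20):20s} {'-' * round(in_cluster * 20)}")
```

A group whose bars differ sharply from the cluster's, as with the login-only group here, is the kind of contrast the comparison chart is meant to expose.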
9. The user data visualization method according to claim 1 or 5, characterized by further comprising the step of displaying an interface of the feature distribution of the data sets of multiple groups:
determining multiple groups in a cluster made up of multiple network users, and characterizing the differences among the multiple groups with different shapes, icons, labels, and/or colors;
determining at least one data feature from the data sets of the multiple groups;
analyzing, based on the at least one data feature, the relative entropy between every two network users in each group as a measure of the degree of similarity between the two network users; and
outputting a display interface in which the network users are characterized by shapes, icons, and/or labels, the differences among the multiple groups are characterized by different colors, and the degree of similarity between two network users in each group is characterized by the displayed distance.
10. The user data visualization method according to claim 1, characterized in that the event type includes at least one of a network user's following, liking, commenting, gifting, logging in, publishing, updating status, registering, and modifying information.
11. A computer device, characterized by comprising:
a processor;
a presentation engine executed on the processor, the presentation engine being configured to execute the user data visualization method according to any one of claims 1-10.
12. A user data visualization system, characterized by comprising:
an acquisition module for obtaining a data set of a group, the data features of the data set including user information, IP address, event type, event initiator, event responder, and event occurrence time; wherein the data features of the data set are assigned different decision priorities; and
a display module for displaying a decision tree graphic to characterize the attribute test process of all users in the group, wherein: the data feature of the first priority and its decision value range are displayed at the root node of the decision tree graphic; the final attribute of at least one user characterized by each leaf node of the decision tree graphic is displayed; the current attribute of the multiple users characterized by each non-leaf node of the decision tree graphic, the data feature of the current priority, and its decision value range are displayed; and the decision paths of the root node or of each non-leaf node in the decision tree graphic are displayed, the decision paths being characterized by lines of different colors, shapes, or thicknesses.
13. The user data visualization system according to claim 12, characterized in that the display module is further configured to display the number of users in the group at the root node of the decision tree graphic and to display the number of users having the current attribute at each non-leaf node of the decision tree graphic.
14. The user data visualization system according to claim 12 or 13, characterized in that the decision tree graphic displayed by the display module is a pruned decision tree graphic.
15. The user data visualization system according to claim 12 or 13, characterized in that the display module is further configured to display a time axis at one side of the decision tree graphic and to present the operation log of a target user on the time axis, the target user being determined by an input operation.
16. The user data visualization system according to claim 12, characterized in that the group is determined from the operation logs of multiple network users obtained by the acquisition module, by the processing module analyzing the similarity of at least one set of data features in the operation logs.
17. The user data visualization system according to claim 12 or 16, characterized in that the display module is further configured to display at least one group interface, wherein the size of each group in the group interface is characterized by the size of a displayed geometric figure.
18. The user data visualization system according to claim 12 or 16, characterized in that the display module is further configured to display an interface of the data set of a group, wherein the data features of the data set of the group include at least two of user information, IP address, event type, event initiator, event responder, and event occurrence time, and, in the interface of the group data set, the group data set is grouped and then displayed in sorted order.
19. The user data visualization system according to claim 12 or 16, characterized in that the display module is further configured to display an interface of the feature distribution of the data set of the group, showing a histogram of the feature distribution and a profile comparison chart of the corresponding histogram within the histogram of the entire cluster.
20. The user data visualization system according to claim 12 or 16, characterized in that the display module is further configured to display an interface in which the network users are characterized by shapes, icons, and/or labels, the differences among the multiple groups are characterized by different colors, and the degree of similarity between two network users in each group is characterized by the displayed distance.
21. The user data visualization system according to claim 12, characterized in that the event type includes at least one of a network user's following, liking, commenting, gifting, logging in, publishing, updating status, registering, and modifying information.
22. A client connected to a server through a network, characterized in that the client, on the basis of sending a request to log in to the server, performs the steps of the user data visualization method according to any one of claims 1-10.
23. A server connected to a client through a network, characterized in that the server, based on an operation request executed by the client, sends the process of the user data visualization method according to any one of claims 1-10 to the client, and the execution result is displayed by the client.
24. A browser connected to a server through a network, characterized in that the browser, on the basis of sending a request to log in to the server, performs the steps of the user data visualization method according to any one of claims 1-10.
25. A computer-readable storage medium storing a data visualization computer program, characterized in that the data visualization computer program, when executed, implements the steps of the user data visualization method according to any one of claims 1-10.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810022133.7A CN108268624B (en) | 2018-01-10 | 2018-01-10 | User data visualization method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108268624A (en) | 2018-07-10 |
CN108268624B (en) | 2020-04-24 |
Family
ID=62773340
Family Applications (1)
Application Number | Status | Priority Date | Filing Date | Title |
---|---|---|---|---|
CN201810022133.7A, CN108268624B (en) | Active | 2018-01-10 | 2018-01-10 | User data visualization method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108268624B (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6278464B1 (en) * | 1997-03-07 | 2001-08-21 | Silicon Graphics, Inc. | Method, system, and computer program product for visualizing a decision-tree classifier |
CN105408894A (en) * | 2014-06-25 | 2016-03-16 | 华为技术有限公司 | Method and device for determining user identity category |
CN105956122A (en) * | 2016-05-03 | 2016-09-21 | 无锡雅座在线科技发展有限公司 | Object attribute determining method and device |
CN107438050A (en) * | 2016-05-26 | 2017-12-05 | 北京京东尚科信息技术有限公司 | Identify the method and system of the potential malicious user of website |
CN106295702A (en) * | 2016-08-15 | 2017-01-04 | 西北工业大学 | A kind of social platform user classification method analyzed based on individual affective behavior |
Non-Patent Citations (2)
Title |
---|
丁爽斯: "基于大数据的互联网金融欺诈行为识别研究", 《中国优秀硕士学位论文全文数据库经济与管理科学辑》 * |
王博: "基于行为分析的恶意代码分类与可视化", 《中国优秀硕士学位论文全文数据库信息科技辑》 * |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109063131A (en) * | 2018-08-02 | 2018-12-21 | 陶雷 | A kind of system and method carrying out content output based on structural data processing |
CN109213904A (en) * | 2018-08-02 | 2019-01-15 | 陶雷 | A kind of system and method that presentation data are handled based on structured schemes |
CN109063131B (en) * | 2018-08-02 | 2021-09-28 | 陶雷 | System and method for outputting content based on structured data processing |
CN109213904B (en) * | 2018-08-02 | 2021-09-28 | 陶雷 | System and method for processing presentation data based on structured scheme |
CN109767269A (en) * | 2019-01-15 | 2019-05-17 | 网易(杭州)网络有限公司 | A kind for the treatment of method and apparatus of game data |
CN109767269B (en) * | 2019-01-15 | 2022-02-22 | 网易(杭州)网络有限公司 | Game data processing method and device |
CN111125658A (en) * | 2019-12-31 | 2020-05-08 | 深圳市分期乐网络科技有限公司 | Method, device, server and storage medium for identifying fraudulent users |
CN111125658B (en) * | 2019-12-31 | 2024-03-22 | 深圳市分期乐网络科技有限公司 | Method, apparatus, server and storage medium for identifying fraudulent user |
CN112347343A (en) * | 2020-09-25 | 2021-02-09 | 北京淇瑀信息科技有限公司 | Customized information pushing method and device and electronic equipment |
CN113806594A (en) * | 2020-12-30 | 2021-12-17 | 京东科技控股股份有限公司 | Business data processing method, device, equipment and storage medium based on decision tree |
Also Published As
Publication number | Publication date |
---|---|
CN108268624B (en) | 2020-04-24 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | TA01 | Transfer of patent application right | Effective date of registration: 20181024. Address after: 100084 10 floor 1009-1, 3 building, 1 Zhongguancun East Road, Haidian District, Beijing. Applicant after: Hua Ching Qing Chiao information technology (Beijing) Co., Ltd. Address before: 100084 Tsinghua Yuan, Beijing, Haidian District. Applicant before: Tsinghua University |
 | GR01 | Patent grant | |