WO2020177450A1 - Information merging method, transaction query method, device, computer, and storage medium - Google Patents

Information merging method, transaction query method, device, computer, and storage medium

Info

Publication number
WO2020177450A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
unicom
feature
association
subgraph
Prior art date
Application number
PCT/CN2019/127178
Other languages
English (en)
French (fr)
Inventor
周石磊
Original Assignee
京东数字科技控股有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 京东数字科技控股有限公司
Publication of WO2020177450A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q20/00 Payment architectures, schemes or protocols
    • G06Q20/38 Payment protocols; Details thereof
    • G06Q20/40 Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check credit lines or negative lists
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q20/00 Payment architectures, schemes or protocols
    • G06Q20/38 Payment protocols; Details thereof
    • G06Q20/40 Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check credit lines or negative lists
    • G06Q20/401 Transaction verification
    • G06Q20/4016 Transaction verification involving fraud or risk level assessment in transaction processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04 Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange

Definitions

  • the embodiments of the present application relate to data processing technology, in particular to an information merging method, transaction query method, device, computer, and storage medium.
  • the following methods are often used to make judgments.
  • first, matching is performed on business data according to fixed judgment rules, and whether different accounts belong to the same user is judged from the matching results.
  • for example, accounts that share the same ID number and registered mobile phone number are determined to be accounts of the same user;
  • second, a feature vector corresponding to each account is determined from the user's basic data, the account feature vectors are clustered with an unsupervised clustering method, and accounts that fall into the same cluster are determined to be similar accounts.
  • the inventor found that there are at least the following technical problems in the prior art:
  • For the first judgment method, data may be missing, in which case no judgment can be made.
  • ID information is not a required field, so the ID field is missing for a large number of accounts.
  • in addition, most of the ID cards, mobile phone numbers, and bank cards that black-market (illicit industry) users use to pass real-name authentication are purchased on the black market, so the accuracy of the information cannot be guaranteed.
  • the unsupervised clustering algorithm can assign user information to a specific group, but when a group is large (containing a large number of accounts), it cannot quantify the degree of similarity between two accounts on non-numeric attributes, so the accuracy of the judgment is poor.
  • This application provides an information merging method, transaction query method, device, computer and storage medium to improve the accuracy of information merging.
  • an embodiment of the present application provides an information merging method, including:
  • an embodiment of the present application also provides a transaction query method, including:
  • an embodiment of the present application also provides an information merging device, including:
  • An information extraction module configured to obtain data to be processed based on at least two data sources, and extract feature information and feature associated information in the data to be processed;
  • An information association graph generating module configured to generate an information association graph according to the extracted feature information and the feature association information;
  • the information merging module is configured to divide the information association graph into connected subgraphs, generating at least one connected subgraph, and perform information merging on the to-be-processed data according to the at least one connected subgraph.
  • an embodiment of the present application also provides a transaction query device, including:
  • the first target connected subgraph determining module is configured to obtain known risk user information, perform matching in at least one connected subgraph based on the risk user information, and determine a target connected subgraph matching the known risk user information, wherein the at least one connected subgraph is determined according to the information merging method provided in any embodiment of the present application;
  • An associated user information determining module configured to extract associated user information in the target connected subgraph;
  • the risk transaction determination module is configured to determine the current transaction associated with the associated user information, and determine that transaction as a risk transaction.
  • an embodiment of the present application also provides a computer device, including a memory, a processor, and a computer program stored in the memory and capable of running on the processor; when the processor executes the program, the information merging method provided by any embodiment of the present application is implemented.
  • an embodiment of the present application also provides a computer-readable storage medium on which a computer program is stored, and when the program is executed by a processor, the information merging method as provided in any embodiment of the present application is implemented.
  • an embodiment of the present application also provides a computer device, including a memory, a processor, and a computer program stored in the memory and capable of running on the processor; when the processor executes the program, the transaction query method provided by any embodiment of the present application is implemented.
  • an embodiment of the present application further provides a computer-readable storage medium on which a computer program is stored, and when the program is executed by a processor, the transaction query method as provided in any embodiment of the present application is implemented.
  • the technical solution provided by the embodiments of the present application builds an information association graph from the feature information in the data to be processed and the association relationships between the feature information, and divides the information association graph based on the connectivity of its feature nodes to obtain multiple mutually independent connected subgraphs. Feature information is then merged based on the connected subgraphs. Expressing the merge through a graph makes it convenient and intuitive, solves the problem that data association relationships cannot be clearly determined in the massive data of a database, and improves the efficiency of information merging.
  • FIG. 1 is a method flowchart of an information merging method provided in Embodiment 1 of this application;
  • FIG. 2 is a schematic diagram of an information association diagram provided by Embodiment 1 of the present application.
  • FIG. 3 is a method flowchart of a transaction query method provided in the second embodiment of the present application.
  • FIG. 4 is a schematic structural diagram of an information merging device provided in Embodiment 3 of the present application.
  • FIG. 5 is a schematic structural diagram of a transaction query device provided in the fourth embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of a computer device provided in Embodiment 5 of this application.
  • FIG. 7 is a schematic structural diagram of a computer device provided in Embodiment 7 of this application.
  • Fig. 1 is a flowchart of an information merging method provided in the first embodiment of this application. This embodiment is applicable to merging information across a large amount of data. The method can be executed by the information merging device provided in the embodiments of this application, and includes the following steps:
  • S110 Acquire data to be processed based on at least two data sources, and extract feature information and feature associated information in the data to be processed.
  • the data source is used to provide different data to be processed.
  • the data to be processed can be real-time data transmitted by the data source, or can be data to be processed in a preset time period stored by the data source.
  • the feature identifier is determined according to the data to be merged, and the corresponding feature information is extracted from the data to be processed according to the feature identifier, and the association relationship between the feature information is extracted.
  • the feature identifier may be the name of the feature information or a character or character string used to characterize the feature information.
  • the feature identifier may be predetermined or obtained by filtering from the data to be processed according to the data to be merged.
  • the determined feature identifier may be an information identifier related to the account, such as account name, account registered user, account registered mobile phone number, etc.
  • extracting feature information and feature association information in the data to be processed includes: matching in the data to be processed according to a preset feature identifier, and determining the feature information corresponding to the preset feature identifier; determining, for the data to be processed, the association relationship between any two pieces of feature information; and determining the association relationship as the feature association information.
  • feature identifiers are preset, and the preset feature identifiers are matched one by one in the data to be processed to obtain feature information corresponding to each preset feature identifier and the association relationship between any two feature information.
  • the data source includes the user behavior data source of the e-commerce platform.
  • the user behavior data source of the e-commerce platform may include, but is not limited to, the registration information table, member information table, card binding information table, payment information table, real-name authentication information table, order information table and payment information table.
  • the data to be merged can be an account number.
  • the feature information extracted from the data to be processed can include, but is not limited to, account number, member information, bank card information, credentials information, mobile phone number information, device information and WIFI information.
  • the user behavior data source of the e-commerce platform is matched based on the preset feature identifier to obtain the feature information corresponding to the preset feature identifier and the association relationship between the feature information.
  • the data source that matches each preset feature identifier and each association relationship can be determined in advance, which makes feature information extraction more targeted and avoids unnecessary data matching.
  • Table 1 is the correspondence between the feature information and the data source, and Table 2 is the correspondence between the feature association information and the data source.
  • Table 1

    Feature           | Name                            | Data source
    node_type_pin     | Account number                  | Registration information table
    node_type_pay     | Member information              | Member information table
    node_type_card    | Bank card information           | Card binding information table, payment information table
    node_type_idcard  | Identity information            | Real-name authentication information table
    node_type_phone   | Mobile phone number information | Registration information table, order information table
    node_type_eid     | Device information              | Registration information table, payment information table

  • Table 2

    Relationship type         | Feature information association                | Data source
    register_with_pin_phone   | Account and registered mobile phone number     | Registration information table
    register_on_pin_eid       | Account and registered device                  | Registration information table
    bind_to_pin_pay           | Account and member ID                          | Member information table
    login_on_pin_eid          | Account and login device                       | Login information table
    auth_by_pin_idcard        | Account and certificate information            | Real-name authentication information table
    reserve_with_card_phone   | Bank card and cardholder's mobile phone number | User card binding information table
    owns_id_idcard_card       | Bank card and cardholder ID                    | User card binding information table
    bind_to_pin_card          | Account and bound bank card                    | User card binding information table
    consignee_with_pin_phone  | Account and consignee's mobile phone number    | Order information table
    trade_with_pin_card       | Account and payment bank card                  | Payment information table
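  • As a minimal sketch of the extraction step, assuming each data source is available as a list of row dictionaries, the preset relationship types can drive which tables and fields are scanned. The relationship type names follow Table 2; the table keys, column names, and sample rows below are hypothetical and only illustrate the idea.

```python
from typing import Dict, List, Set, Tuple

Node = Tuple[str, str]          # (node type, value), e.g. ("pin", "acct_A")
Edge = Tuple[str, Node, Node]   # (relationship type, left node, right node)

# Subset of Table 2: relationship type -> (source table, left column, right column).
# The table keys and column names are illustrative assumptions.
RELATION_SPECS: Dict[str, Tuple[str, str, str]] = {
    "register_with_pin_phone": ("registration", "pin", "phone"),
    "register_on_pin_eid": ("registration", "pin", "eid"),
    "trade_with_pin_card": ("payment", "pin", "card"),
}


def extract(tables: Dict[str, List[Dict[str, str]]]) -> Tuple[Set[Node], Set[Edge]]:
    """Scan each source table and emit feature nodes plus typed association edges."""
    nodes: Set[Node] = set()
    edges: Set[Edge] = set()
    for rel_type, (table, left, right) in RELATION_SPECS.items():
        for row in tables.get(table, []):
            # Skip rows where either field is missing or empty.
            if not row.get(left) or not row.get(right):
                continue
            a, b = (left, row[left]), (right, row[right])
            nodes.update((a, b))
            edges.add((rel_type, a, b))
    return nodes, edges


# Hypothetical sample data.
tables = {
    "registration": [
        {"pin": "acct_A", "phone": "13800000000", "eid": "dev_C"},
        {"pin": "acct_B", "phone": "", "eid": "dev_C"},
    ],
    "payment": [{"pin": "acct_A", "card": "card_1"}],
}
nodes, edges = extract(tables)
```

  Restricting each relationship type to its own source table mirrors the idea of determining the matching data source in advance, so that rows from unrelated tables are never scanned.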
  • S120 Generate an information association graph according to the extracted characteristic information and the characteristic association information.
  • the information association graph displays the characteristic information and the characteristic association information in the form of graphics.
  • the characteristic information having the association relationship may be connected graphically to obtain the association graphics between all the characteristic information.
  • generating an information association graph according to the extracted feature information and the feature association information includes: setting feature nodes according to the feature information; and, according to the feature association information, setting an association edge between two feature nodes that have an association relationship, to generate the information association graph.
  • the information association graph is composed of feature nodes and the associated edges connecting them. Each feature node holds one piece of feature information, and an associated edge is set between the two feature nodes that correspond to two pieces of feature information with an association relationship, connecting those two nodes.
  • FIG. 2 is a schematic diagram of an information association graph provided by an embodiment of the present application. FIG. 2 includes account A, account B, and device C; account A and account B were both registered through device C.
  • account A and account B are therefore each associated with device C.
  • feature nodes are set for account A, account B and device C respectively; an associated edge is set between the feature nodes of account A and device C, and another between the feature nodes of account B and device C, which generates the information association graph in FIG. 2.
  • the feature information is displayed in the graphical form of connected nodes. Compared with the text form of a data table, feature information that has an association relationship can be identified intuitively, which makes the feature information more convenient and intuitive to look up.
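  • The graph of FIG. 2 can be sketched as an undirected adjacency structure. This is only an illustrative representation (a dictionary of neighbour sets); the source does not prescribe a data structure, and the function name `build_graph` is hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple


def build_graph(nodes: Iterable[str], edges: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Create one feature node per item and one undirected associated edge per pair."""
    graph: Dict[str, Set[str]] = {node: set() for node in nodes}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


# FIG. 2: accounts A and B were both registered through device C.
graph = build_graph(
    nodes=["account A", "account B", "device C"],
    edges=[("account A", "device C"), ("account B", "device C")],
)
```

  Accounts A and B have no direct edge here; they are related only through the shared device node, which is what the later division into connected subgraphs relies on.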
  • the information association graph is divided into subgraphs based on the connectivity of its feature nodes, that is, the feature information is partitioned by its association relationships to obtain multiple connected subgraphs. Any two feature nodes in a connected subgraph can be reached from each other through one or more associated edges, and no associated edge exists between feature nodes that belong to two different connected subgraphs.
  • dividing the information association graph into connected subgraphs to generate at least one connected subgraph, and merging the data to be processed according to the at least one connected subgraph, includes: traversing the information association graph and dividing the feature nodes connected through associated edges into the same connected subgraph, where no associated edge exists between feature nodes of any two different connected subgraphs; and merging the feature information corresponding to the feature nodes in the same connected subgraph into the same group information.
  • starting from a first feature node, one or more second feature nodes that have an association relationship with it can be determined according to the associated edges of the first feature node, where the number of second feature nodes equals the number of associated edges of the first feature node; further, according to the associated edges of each second feature node, one or more third feature nodes that have an association relationship with that second feature node can be determined, where the third feature nodes do not repeat the first feature node; and so on, until the connected subgraph to which the first feature node belongs is obtained.
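  • The traversal described above corresponds to finding connected components. A minimal sketch using breadth-first search follows; the choice of breadth-first search and the sample graph are assumptions for illustration, since the source only requires that nodes linked by associated edges end up in the same subgraph.

```python
from collections import deque
from typing import Dict, List, Set


def connected_subgraphs(graph: Dict[str, Set[str]]) -> List[Set[str]]:
    """Partition the nodes of an undirected graph into connected subgraphs."""
    seen: Set[str] = set()
    components: List[Set[str]] = []
    for start in graph:
        if start in seen:
            continue
        # Expand outward from the first feature node, one layer of neighbours at a time.
        component = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in graph[node]:
                if neighbour not in seen:   # never revisit an already reached node
                    seen.add(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components


# Hypothetical graph: the FIG. 2 triple plus an unrelated account/phone pair.
graph = {
    "account A": {"device C"},
    "account B": {"device C"},
    "device C": {"account A", "account B"},
    "account D": {"phone E"},
    "phone E": {"account D"},
}
components = connected_subgraphs(graph)
```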
  • the feature information is merged according to the connected subgraph.
  • feature information of the same type within a connected subgraph may be merged into the same group.
  • for example, account A, account B, and device C belong to the same connected subgraph, so account A and account B can further be determined to be accounts of the same user or of the same organization.
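  • Grouping same-type feature information inside one connected subgraph can be sketched as below, assuming each node is stored as a (type, value) pair. The type labels `account` and `device` and the function name `merge_by_type` are illustrative, not taken from the source.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Node = Tuple[str, str]  # (feature type, value); the type labels are illustrative


def merge_by_type(subgraph: Set[Node]) -> Dict[str, List[str]]:
    """Group the feature values of one connected subgraph by feature type."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for feature_type, value in subgraph:
        groups[feature_type].append(value)
    # Sort so the result does not depend on set iteration order.
    return {feature_type: sorted(values) for feature_type, values in groups.items()}


subgraph = {("account", "A"), ("account", "B"), ("device", "C")}
merged = merge_by_type(subgraph)
```

  The `account` group of a subgraph is then the candidate set of accounts belonging to the same user or organization.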
  • the technical solution of this embodiment builds an information association graph from the feature information in the data to be processed and the association relationships between the feature information, and divides the information association graph based on the connectivity of its feature nodes to obtain multiple mutually independent connected subgraphs. Feature information is then merged based on the connected subgraphs. Expressing the merge through a graph makes it convenient and intuitive, solves the problem that data association relationships cannot be clearly determined in the massive data of a database, and improves the efficiency of information merging.
  • the method further includes: if there is at least one historical connected subgraph, merging the generated at least one connected subgraph with the at least one historical connected subgraph to generate at least one updated connected subgraph.
  • the historical connected subgraph can be updated through the newly created connected subgraph. Specifically, the feature information of any feature node in the newly created connected subgraph can be matched against the historical connected subgraphs, and the matching historical connected subgraph is updated with the newly created connected subgraph.
  • the update method can be to traverse each second feature node that shares an associated edge with a first feature node in the newly created connected subgraph, and determine whether the historical connected subgraph contains that second feature node.
  • if not, the second feature node is added, together with the associated edge between the first feature node and the second feature node. If so, it is determined whether an associated edge exists between the first feature node and the second feature node in the historical connected subgraph: if the edge exists, no update is needed; if it does not, the associated edge between the first feature node and the second feature node is added.
  • performing information merging on the to-be-processed data according to the at least one connected subgraph includes: performing information merging on the to-be-processed data according to the at least one updated connected subgraph.
  • the historical connected subgraph is continuously updated through the newly created connected subgraph, which makes the feature information in the connected subgraph more complete and improves the accuracy of information merging.
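  • A minimal sketch of the update procedure, assuming both the historical graph and the newly created subgraph are held as dictionaries of neighbour sets and that a node "matches" when its key is identical. Missing nodes and missing edges are added; existing ones are left untouched. The function name `merge_into_history` is hypothetical.

```python
from typing import Dict, Set

Graph = Dict[str, Set[str]]


def merge_into_history(history: Graph, new: Graph) -> Graph:
    """Add every node and associated edge of `new` that `history` does not yet have."""
    for first, neighbours in new.items():
        history.setdefault(first, set())         # add the first node if it is missing
        for second in neighbours:
            history.setdefault(second, set())    # add the second node if it is missing
            # Set insertion is a no-op when the edge already exists.
            history[first].add(second)
            history[second].add(first)
    return history


# Hypothetical data: history knows account A on device C; the new batch adds account B.
history = {"account A": {"device C"}, "device C": {"account A"}}
new = {"account B": {"device C"}, "device C": {"account B"}}
merge_into_history(history, new)
```

  After the update, the connected subgraphs can be recomputed on the merged graph, so that two historical subgraphs bridged by new edges collapse into one.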
  • Taking the user behavior data source of the e-commerce platform as an example, merging the user data of the e-commerce platform can proceed as follows: obtain the data to be processed based on the user behavior data source of the e-commerce platform, and extract the feature information and feature association information in the data to be processed. The data sources include the registration information table, member information table, card binding information table, payment information table, real-name authentication information table, order information table, and payment information table.
  • Feature information can include account number, member information, bank card information, credentials information, mobile phone number information, device information and WIFI information.
  • the user information association graph of the e-commerce platform is generated.
  • the user information association graph of the e-commerce platform includes account nodes, member information nodes, bank card information nodes, certificate information nodes, and mobile phone number nodes.
  • the accounts in the same connected subgraph may be determined to be accounts of the same user or of the same organization.
  • the information is merged through the connected subgraphs, so that any piece of feature information can be used to quickly find the other information associated with it. This makes it easier to investigate scalping, black-market activity, fraud gangs and similar offences through the associated accounts, and improves the efficiency of information query.
  • Fig. 3 is a method flowchart of a transaction query method provided by an embodiment of the application. This embodiment is applicable to querying risky transactions.
  • the method can be executed by the transaction query device provided by the embodiment of the application, and specifically includes the following steps:
  • the known risk user information may include multiple pieces of feature information, for example user real-name authentication information, account information, ID information, etc. The known risk user information may be information about users who perform illegal operations such as fraud, order brushing (placing fake orders), and money laundering.
  • performing matching in at least one connected subgraph according to the risk user information, and determining a target connected subgraph that matches the known risk user information, includes: extracting the risk feature information from the known risk user information according to a preset feature identifier; and matching the risk feature information against the feature information of the feature nodes in at least one connected subgraph, and, when the matching succeeds, determining the connected subgraph to which the successfully matched feature node belongs as the target connected subgraph.
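  • The matching step, together with the follow-on extraction of associated users and their transactions, might be sketched as follows. Nodes are assumed to be (type, value) pairs; the transaction record layout and the function names are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

Node = Tuple[str, str]  # (feature type, value); the type labels are illustrative


def find_target_subgraph(subgraphs: List[Set[Node]], risk_features: Set[Node]) -> Optional[Set[Node]]:
    """Return the first connected subgraph containing any known risk feature."""
    for subgraph in subgraphs:
        if subgraph & risk_features:
            return subgraph
    return None


def risky_transactions(transactions: List[Dict[str, str]], target: Set[Node]) -> List[Dict[str, str]]:
    """Flag transactions whose account belongs to the target connected subgraph."""
    accounts = {value for kind, value in target if kind == "account"}
    return [tx for tx in transactions if tx["account"] in accounts]


# Hypothetical data: device C is known to belong to a risk user.
subgraphs = [
    {("account", "A"), ("account", "B"), ("device", "C")},
    {("account", "D"), ("phone", "13900000000")},
]
risk_features = {("device", "C")}
transactions = [{"id": "t1", "account": "B"}, {"id": "t2", "account": "D"}]

target = find_target_subgraph(subgraphs, risk_features)
flagged = risky_transactions(transactions, target) if target else []
```

  Account B is flagged even though only device C was known to be risky, because both sit in the same connected subgraph.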
  • the established connected subgraphs may be stored in a graph database, where the graph database may be an HBase database, and ElasticSearch is used as an indexing tool for the graph database.
  • the ElasticSearch table mainly stores the information that lets users query nodes and edges by the attributes of those nodes or edges.
  • take order payment data as an example: the payment details of the order are saved in the ElasticSearch table. When a query uses the order payment method as its condition, the feature nodes and associated edges that meet the condition are first looked up in the ElasticSearch table, then the information associated with these feature nodes and associated edges is queried in the graph, and the target connected subgraph is obtained from the query results.
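  • The two-stage lookup (attribute index first, graph second) can be illustrated with a small in-memory stand-in. This sketch does not use the HBase or ElasticSearch APIs; the class `AttributeIndex`, the attribute names, and the `component_of` mapping are all hypothetical and only mimic the roles the index and the graph store play.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple


class AttributeIndex:
    """In-memory inverted index from (attribute, value) to node ids."""

    def __init__(self) -> None:
        self._index: Dict[Tuple[str, str], Set[str]] = defaultdict(set)

    def add(self, node_id: str, attributes: Dict[str, str]) -> None:
        for key, value in attributes.items():
            self._index[(key, value)].add(node_id)

    def query(self, **conditions: str) -> Set[str]:
        """Return the ids of nodes that satisfy every condition."""
        hits = [self._index.get((key, value), set()) for key, value in conditions.items()]
        return set.intersection(*hits) if hits else set()


# Stage 1: index node attributes (the role played by the index table).
index = AttributeIndex()
index.add("order_1", {"pay_method": "credit_card", "status": "paid"})
index.add("order_2", {"pay_method": "balance", "status": "paid"})

# Stage 2: map the hit nodes to their connected subgraph (the role played by the graph store).
component_of = {"order_1": 0, "order_2": 1}
hits = index.query(pay_method="credit_card")
target_ids = {component_of[node_id] for node_id in hits}
```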
  • Table 3 is a schematic diagram of the payment data table provided in the embodiment of the present application.
  • each feature node of the connected subgraph includes at least one piece of behavior attribute information; correspondingly, when matching according to the known risk user information fails, the behavior feature information of the known risk user is obtained, the behavior feature information is matched against the behavior attribute information in the connected subgraphs, and the target connected subgraph is determined according to the matching result.
  • the behavior attribute information of an account may be the IP (Internet Protocol address) home location, registration time, and registration source of the registered account.
  • the behavior attribute information of the order information includes the consignee, the delivery address, and the type of goods.
  • the behavior feature information may include, but is not limited to, the IP home location of the registered account, the transaction time, the payment method, and the home location of the delivery IP.
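  • The source does not say how the behavior matching result is scored, so the sketch below makes a loud assumption: a subgraph is scored by the largest number of behavior attributes that any one of its nodes shares with the risk user, and a hypothetical `min_matches` threshold decides whether the match counts. Attribute names and sample values are illustrative.

```python
from typing import Dict, List, Optional


def match_by_behavior(
    subgraph_attrs: Dict[str, List[Dict[str, str]]],
    behavior: Dict[str, str],
    min_matches: int = 2,   # assumed threshold, not from the source
) -> Optional[str]:
    """Return the id of the subgraph whose node attributes best match the behavior."""
    best_id, best_score = None, 0
    for subgraph_id, node_attrs in subgraph_attrs.items():
        # Score = best single-node overlap with the risk user's behavior features.
        score = max(
            (sum(1 for k, v in behavior.items() if attrs.get(k) == v) for attrs in node_attrs),
            default=0,
        )
        if score > best_score:
            best_id, best_score = subgraph_id, score
    return best_id if best_score >= min_matches else None


# Hypothetical behavior attribute information per connected subgraph.
subgraph_attrs = {
    "sg1": [
        {"ip_region": "Beijing", "register_source": "app"},
        {"ip_region": "Beijing", "pay_method": "credit_card"},
    ],
    "sg2": [{"ip_region": "Shanghai", "pay_method": "balance"}],
}
result = match_by_behavior(subgraph_attrs, {"ip_region": "Beijing", "pay_method": "credit_card"})
```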
  • the risk transaction may be verified, and when the verification succeeds, the risk transaction is intercepted.
  • the verification of risk transactions can be manual review or verification based on preset conditions.
  • the preset conditions can be transaction time, transaction type, etc. When a risk transaction meets the preset conditions, it can be intercepted to improve transaction security.
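  • Verification against preset conditions might look like the sketch below. The concrete conditions (a night-time window and a set of transaction types) and the rule that all of them must hold are assumptions chosen for illustration; the source only names transaction time and transaction type as possible conditions.

```python
from datetime import datetime
from typing import Any, Dict

# Hypothetical preset conditions: intercept listed types during the 00:00-06:00 window.
PRESET = {"types": {"virtual_goods"}, "start_hour": 0, "end_hour": 6}


def should_intercept(transaction: Dict[str, Any], preset: Dict[str, Any] = PRESET) -> bool:
    """Return True when a flagged risk transaction meets every preset condition."""
    in_window = preset["start_hour"] <= transaction["time"].hour < preset["end_hour"]
    type_matches = transaction["type"] in preset["types"]
    return in_window and type_matches


night_tx = {"type": "virtual_goods", "time": datetime(2019, 12, 20, 2, 30)}
day_tx = {"type": "virtual_goods", "time": datetime(2019, 12, 20, 14, 0)}
```

  Transactions that fail the preset check could instead be routed to manual review, the other verification path mentioned above.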
  • FIG. 4 is a schematic structural diagram of an information merging device provided by an embodiment of the present application. As shown in FIG. 4, the device includes: an information extraction module 410, an information association graph generating module 420, and an information merging module 430.
  • the information extraction module 410 is configured to obtain data to be processed based on at least two data sources, and extract feature information and feature associated information in the data to be processed;
  • the information association graph generating module 420 is configured to generate an information association graph according to the extracted characteristic information and the characteristic association information;
  • the information merging module 430 is configured to divide the information association graph into connected subgraphs, generating at least one connected subgraph, and perform information merging on the to-be-processed data according to the at least one connected subgraph.
  • the information extraction module 410 is configured as:
  • the association relationship is determined as the characteristic association information.
  • the information association graph generating module 420 is configured to:
  • an association edge is set between two characteristic nodes that have an association relationship to generate the information association graph.
  • the information merging module 430 is configured as:
  • the feature information corresponding to the feature nodes in the same connected subgraph is merged into the same group information.
  • the device further includes:
  • an updated connected subgraph determining module configured to, after the information association graph is divided into connected subgraphs to generate at least one connected subgraph, and if at least one historical connected subgraph exists, merge the generated at least one connected subgraph with the at least one historical connected subgraph to generate at least one updated connected subgraph;
  • the information merging module 430 is configured to perform information merging on the to-be-processed data according to the at least one updated connected subgraph.
  • the data source includes a user behavior data source of an e-commerce platform.
  • the information association graph is a user information association graph of the e-commerce platform, and the connected subgraph is a collection of user information with an association relationship.
  • the above products can execute the information merging method provided by any embodiment of the present application, and have the corresponding functional modules and beneficial effects for executing the information merging method.
  • FIG. 5 is a schematic structural diagram of a transaction query device provided in an embodiment of the present application.
  • the transaction query device includes a first target connected subgraph determining module 510, an associated user information determining module 520, and a risk transaction determining module 530.
  • the first target connected subgraph determining module 510 is configured to obtain known risk user information, perform matching in at least one connected subgraph according to the risk user information, and determine a target connected subgraph matching the known risk user information, wherein the at least one connected subgraph is determined according to the aforementioned information merging method;
  • the associated user information determining module 520 is configured to extract associated user information in the target connected subgraph;
  • the risk transaction determining module 530 is configured to determine the current transaction associated with the associated user information, and determine that transaction as a risk transaction.
  • the first target connected subgraph determining module 510 is configured as:
  • the risk feature information is matched with the feature information of the feature nodes in at least one connected subgraph, and when the matching is successful, the connected subgraph to which the successfully matched feature node belongs is determined as the target connected subgraph.
  • each feature node of the connected subgraph includes at least one piece of behavior attribute information;
  • the device further includes:
  • the second target connected subgraph determining module is configured to, when matching according to the known risk user information fails, obtain the behavior feature information of the known risk user, match the behavior feature information against the behavior attribute information in the connected subgraphs, and determine the target connected subgraph according to the matching result.
  • the above product can execute the transaction query method provided by any embodiment of the present application, and has the corresponding functional modules and beneficial effects for executing the transaction query method.
  • FIG. 6 is a schematic structural diagram of a computer device provided in Embodiment 5 of this application.
  • FIG. 6 shows a block diagram of a computer device 612 suitable for implementing the embodiments of the present application.
  • the computer device 612 shown in FIG. 6 is only an example, and should not bring any limitation to the function and scope of use of the embodiments of the present application.
  • the device 612 is typically a computing device that undertakes the function of merging information.
  • the computer device 612 is represented in the form of a general-purpose computing device.
  • the components of the computer device 612 may include, but are not limited to: one or more processors 616, a storage device 628, and a bus 618 connecting different system components (including the storage device 628 and the processor 616).
  • the bus 618 represents one or more of several types of bus structures, including a memory bus or a memory controller, a peripheral bus, a graphics acceleration port, a processor, or a local bus using any bus structure among multiple bus structures.
  • For example, these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
  • Computer device 612 typically includes a variety of computer system readable media. These media may be any available media that can be accessed by the computer device 612, including volatile and non-volatile media, removable and non-removable media.
  • the storage device 628 may include a computer system readable medium in the form of a volatile memory, such as a random access memory (RAM) 630 and/or a cache memory 632.
  • the computer device 612 may further include other removable/non-removable, volatile/nonvolatile computer system storage media.
  • the storage system 634 may be used to read and write non-removable, non-volatile magnetic media (not shown in FIG. 6, and generally referred to as a "hard drive").
  • Although not shown in FIG. 6, a magnetic disk drive for reading and writing a removable non-volatile magnetic disk (such as a "floppy disk") and an optical disc drive for reading and writing a removable non-volatile optical disc (such as a Compact Disc-Read Only Memory (CD-ROM), a Digital Video Disc-Read Only Memory (DVD-ROM), or other optical media) may be provided.
  • each drive can be connected to the bus 618 through one or more data media interfaces.
  • the storage device 628 may include at least one program product, and the program product has a set of (for example, at least one) program modules, and these program modules are configured to perform the functions of the embodiments of the present application.
  • a program 636 having a set (at least one) of program modules 626 may be stored in, for example, the storage device 628.
  • Such program modules 626 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data. Each of the examples or some combination may include the realization of a network environment.
  • the program module 626 generally executes the functions and/or methods in the embodiments described in this application.
  • The computer device 612 can also communicate with one or more external devices 614 (such as keyboards, pointing devices, cameras, displays 624, etc.), with one or more devices that enable users to interact with the computer device 612, and/or with any device (such as a network card, modem, etc.) that enables the computer device 612 to communicate with one or more other computing devices. Such communication can be performed through an input/output (I/O) interface 622.
  • The computer device 612 may also communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through the network adapter 620.
  • As shown in the figure, the network adapter 620 communicates with other modules of the computer device 612 through the bus 618. It should be understood that, although not shown in the figure, other hardware and/or software modules can be used in conjunction with the computer device 612, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, Redundant Arrays of Independent Disks (RAID) systems, tape drives, and data backup storage systems.
  • the processor 616 executes various functional applications and data processing by running programs stored in the storage device 628, such as implementing the information merging method provided in the foregoing embodiments of the present application.
  • the sixth embodiment of the present application also provides a computer-readable storage medium on which a computer program is stored.
  • When the program is executed by a processor, the information merging method provided in the embodiments of the present application is implemented.
  • Of course, the computer program stored on the computer-readable storage medium provided by this embodiment is not limited to the method operations described above, and can also execute the information merging method provided by any embodiment of the present application.
  • the computer storage media in the embodiments of the present application may adopt any combination of one or more computer-readable media.
  • the computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium.
  • the computer-readable storage medium may be, for example, but not limited to, an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above.
  • computer-readable storage media include: electrical connections with one or more wires, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), Erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination of the above.
  • the computer-readable storage medium can be any tangible medium that contains or stores a program, and the program can be used by or in combination with an instruction execution system, apparatus, or device.
  • the computer-readable signal medium may include a data signal propagated in baseband or as a part of a carrier wave, and computer-readable program code is carried therein. This propagated data signal can take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • the computer-readable signal medium may also be any computer-readable medium other than the computer-readable storage medium.
  • the computer-readable medium may send, propagate, or transmit the program for use by or in combination with the instruction execution system, apparatus, or device .
  • the program code contained on the computer-readable medium can be transmitted by any suitable medium, including but not limited to wireless, wire, optical cable, RF, etc., or any suitable combination of the above.
  • the computer program code used to perform the operations of the present application can be written in one or more programming languages or a combination thereof.
  • The programming languages include object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code can be executed entirely on the user's computer, partly on the user's computer, executed as an independent software package, partly on the user's computer and partly executed on a remote computer, or entirely executed on the remote computer or server.
  • Where a remote computer is involved, the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet by way of an Internet service provider).
  • FIG. 7 is a schematic structural diagram of a computer device provided in Embodiment 7 of this application.
  • FIG. 7 shows a block diagram of a computer device 712 suitable for implementing the embodiments of the present application.
  • the computer device 712 shown in FIG. 7 is only an example, and should not bring any limitation to the function and scope of use of the embodiments of the present application.
  • the device 712 is typically a computing device that undertakes transaction query functions.
  • the computer device 712 is represented in the form of a general-purpose computing device.
  • the components of the computer device 712 may include, but are not limited to: one or more processors 716, a storage device 728, and a bus 718 connecting different system components (including the storage device 728 and the processor 716).
  • the bus 718 represents one or more of several types of bus structures, including a memory bus or a memory controller, a peripheral bus, a graphics acceleration port, a processor, or a local bus using any bus structure among multiple bus structures.
  • For example, these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
  • Computer device 712 typically includes a variety of computer system readable media. These media can be any available media that can be accessed by the computer device 712, including volatile and non-volatile media, removable and non-removable media.
  • the storage device 728 may include a computer system readable medium in the form of a volatile memory, such as a random access memory (RAM) 730 and/or a cache memory 732.
  • the computer device 712 may further include other removable/non-removable, volatile/non-volatile computer system storage media.
  • the storage system 734 may be used to read and write non-removable, non-volatile magnetic media (not shown in FIG. 7 and generally referred to as a "hard drive").
  • Although not shown in FIG. 7, a magnetic disk drive for reading and writing a removable non-volatile magnetic disk (such as a "floppy disk") and an optical disc drive for reading and writing a removable non-volatile optical disc (such as a Compact Disc-Read Only Memory (CD-ROM), a Digital Video Disc-Read Only Memory (DVD-ROM), or other optical media) may be provided.
  • each drive may be connected to the bus 718 through one or more data media interfaces.
  • the storage device 728 may include at least one program product, and the program product has a set of (for example, at least one) program modules that are configured to perform the functions of the embodiments of the present application.
  • a program 736 having a set of (at least one) program module 726 may be stored in, for example, the storage device 728.
  • Such program module 726 includes but is not limited to an operating system, one or more application programs, other program modules, and program data. Each of the examples or some combination may include the realization of a network environment.
  • the program module 726 generally executes the functions and/or methods in the embodiments described in this application.
  • The computer device 712 can also communicate with one or more external devices 714 (such as a keyboard, pointing device, camera, display 724, etc.), with one or more devices that enable a user to interact with the computer device 712, and/or with any device (such as a network card, modem, etc.) that enables the computer device 712 to communicate with one or more other computing devices. Such communication can be performed through an input/output (I/O) interface 722.
  • The computer device 712 may also communicate with one or more networks (for example, a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through the network adapter 720.
  • As shown in the figure, the network adapter 720 communicates with other modules of the computer device 712 through the bus 718. It should be understood that, although not shown in the figure, other hardware and/or software modules can be used in conjunction with the computer device 712, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, Redundant Arrays of Independent Disks (RAID) systems, tape drives, and data backup storage systems.
  • the processor 716 executes various functional applications and data processing by running programs stored in the storage device 728, such as implementing the transaction query method provided in the foregoing embodiments of the present application.
  • the eighth embodiment of the present application also provides a computer-readable storage medium on which a computer program is stored.
  • When the program is executed by a processor, the transaction query method provided in the embodiments of the present application is implemented.
  • Of course, the computer program stored on the computer-readable storage medium provided by this embodiment is not limited to the method operations described above, and can also execute the transaction query method provided by any embodiment of the present application.
  • the computer storage media in the embodiments of the present application may adopt any combination of one or more computer-readable media.
  • the computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium.
  • the computer-readable storage medium may be, for example, but not limited to, an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above.
  • computer-readable storage media include: electrical connections with one or more wires, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), Erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination of the above.
  • the computer-readable storage medium can be any tangible medium that contains or stores a program, and the program can be used by or in combination with an instruction execution system, apparatus, or device.
  • the computer-readable signal medium may include a data signal propagated in baseband or as a part of a carrier wave, and computer-readable program code is carried therein. This propagated data signal can take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • the computer-readable signal medium may also be any computer-readable medium other than the computer-readable storage medium.
  • the computer-readable medium may send, propagate or transmit the program for use by or in combination with the instruction execution system, apparatus, or device .
  • the program code contained on the computer-readable medium can be transmitted by any suitable medium, including but not limited to wireless, wire, optical cable, RF, etc., or any suitable combination of the above.
  • the computer program code used to perform the operations of the present application can be written in one or more programming languages or a combination thereof.
  • The programming languages include object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code can be executed entirely on the user's computer, partly on the user's computer, executed as an independent software package, partly on the user's computer and partly executed on a remote computer, or entirely executed on the remote computer or server.
  • Where a remote computer is involved, the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet by way of an Internet service provider).
  • this application can be provided as methods, systems, or computer program products. Therefore, this application may adopt the form of hardware embodiments, software embodiments, or embodiments combining software and hardware. Moreover, this application may adopt the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, optical storage, etc.) containing computer-usable program codes.
  • These computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing equipment to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device, and the instruction device implements the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.
  • These computer program instructions can also be loaded onto a computer or other programmable data processing equipment, so that a series of operation steps are executed on the computer or other programmable equipment to produce computer-implemented processing, and the instructions executed on the computer or other programmable equipment thereby provide steps for implementing the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.
  • The technical solution of this embodiment forms an information association graph from the feature information in the data to be processed by way of the association relationships between items of feature information, and divides the information association graph based on the connectivity of its feature nodes to obtain multiple mutually independent connected subgraphs. Feature information is merged based on the connected subgraphs, which simplifies information merging through a graph representation that is convenient and intuitive, solves the problem that association relationships cannot be clearly determined among massive data in a database, and improves merging efficiency.
  • Matching one or more items of the known risk user information in the at least one connected subgraph determines the connected subgraph to which the known risk user information belongs; the other associated users in that connected subgraph are determined as risky users, and the current transactions being conducted by the associated users are determined as risky transactions.

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Computer Security & Cryptography (AREA)
  • Technology Law (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

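The embodiments of the present application disclose an information merging method, a transaction query method, an apparatus, a computer, and a storage medium. The information merging method includes: acquiring data to be processed from at least two data sources, and extracting feature information and feature association information from the data to be processed; generating an information association graph according to the extracted feature information and feature association information; and dividing the information association graph into connected subgraphs to generate at least one connected subgraph, and merging the information in the data to be processed according to the at least one connected subgraph.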

Description

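Information merging method, transaction query method, apparatus, computer, and storage medium
Cross-reference to related applications
This application is based on, and claims priority to, Chinese patent application No. 201910167233.3, filed on March 6, 2019, the content of which is incorporated herein by reference.
Technical field
The embodiments of the present application relate to data processing technology, and in particular to an information merging method, a transaction query method, an apparatus, a computer, and a storage medium.
Background
With the continuous development of Internet technology and the rise of e-commerce platforms, gang fraud on e-commerce platforms is increasingly common, and the black-market industry keeps growing in scale.
On e-commerce platforms it is very common for one user to hold multiple accounts. In one case this is normal online activity, with multiple accounts serving the user's needs; in the other, criminals use large numbers of accounts to make illegal profits, for example through fake orders, black-market operations, or money laundering. To improve the security of an e-commerce platform, determining whether multiple trading entities are the same user, whether multiple fraudulent acts were carried out by the same user, and whether multiple accounts belong to the same fraud group is increasingly important in risk control and anti-fraud.
Two approaches are currently common. The first matches business data against fixed rules and decides from the result whether different accounts belong to the same user; for example, accounts with the same ID card number and registered mobile phone number are determined to be accounts of the same user. The second derives a feature vector for each account from the user's basic data, clusters the feature vectors by unsupervised clustering, and treats the accounts in one cluster as similar accounts.
In the course of implementing the present application, the inventor found at least the following technical problems in the prior art. The first approach cannot reach a decision when data is missing; for example, ID card information is not a required field when an account is applied for, so the ID card field is missing for a large number of accounts. Furthermore, most of the ID cards, mobile phone numbers, bank cards, and so on that black-market users present for real-name authentication are bought on the black market, so the accuracy of the information cannot be guaranteed. The second approach can merge user information into a specific group by unsupervised clustering, but when a large group (containing many accounts) exists, the similarity between two accounts cannot be quantified for non-numeric attributes, so the accuracy of the decision is poor.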
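Summary
The present application provides an information merging method, a transaction query method, an apparatus, a computer, and a storage medium, so as to improve the accuracy of information merging.
In a first aspect, an embodiment of the present application provides an information merging method, including:
acquiring data to be processed from at least two data sources, and extracting feature information and feature association information from the data to be processed;
generating an information association graph according to the extracted feature information and feature association information;
dividing the information association graph into connected subgraphs to generate at least one connected subgraph, and merging the information in the data to be processed according to the at least one connected subgraph.
In a second aspect, an embodiment of the present application further provides a transaction query method, including:
obtaining known risk user information, performing matching in at least one connected subgraph according to the risk user information, and determining a target connected subgraph that matches the known risk user information, wherein the at least one connected subgraph is determined by the information merging method provided in any embodiment of the present application;
extracting associated user information from the target connected subgraph;
determining the current transaction of the associated user information, and determining that current transaction as a risky transaction.
In a third aspect, an embodiment of the present application further provides an information merging device, including:
an information extraction module, configured to acquire data to be processed from at least two data sources, and to extract feature information and feature association information from the data to be processed;
an information association graph generation module, configured to generate an information association graph according to the extracted feature information and feature association information;
an information merging module, configured to divide the information association graph into connected subgraphs to generate at least one connected subgraph, and to merge the information in the data to be processed according to the at least one connected subgraph.
In a fourth aspect, an embodiment of the present application further provides a transaction query device, including:
a first target connected subgraph determining module, configured to obtain known risk user information, perform matching in at least one connected subgraph according to the risk user information, and determine a target connected subgraph that matches the known risk user information, wherein the at least one connected subgraph is determined by the information merging method provided in any embodiment of the present application;
an associated user information determining module, configured to extract associated user information from the target connected subgraph;
a risky transaction determining module, configured to determine the current transaction of the associated user information, and to determine that current transaction as a risky transaction.
In a fifth aspect, an embodiment of the present application further provides a computer device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the information merging method provided in any embodiment of the present application when executing the program.
In a sixth aspect, an embodiment of the present application further provides a computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the information merging method provided in any embodiment of the present application.
In a seventh aspect, an embodiment of the present application further provides a computer device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the transaction query method provided in any embodiment of the present application when executing the program.
In an eighth aspect, an embodiment of the present application further provides a computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the transaction query method provided in any embodiment of the present application.
The technical solution provided by the embodiments of the present application forms an information association graph from the feature information in the data to be processed by way of the association relationships between items of feature information, and divides the information association graph based on the connectivity of its feature nodes to obtain multiple mutually independent connected subgraphs. Feature information is merged based on the connected subgraphs, which simplifies information merging through a graph representation that is convenient and intuitive, solves the problem that association relationships cannot be clearly determined among massive data in a database, and improves merging efficiency.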
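Brief description of the drawings
FIG. 1 is a flowchart of an information merging method provided in Embodiment 1 of the present application;
FIG. 2 is a schematic diagram of an information association graph provided in Embodiment 1 of the present application;
FIG. 3 is a flowchart of a transaction query method provided in Embodiment 2 of the present application;
FIG. 4 is a schematic structural diagram of an information merging device provided in Embodiment 3 of the present application;
FIG. 5 is a schematic structural diagram of a transaction query device provided in Embodiment 4 of the present application;
FIG. 6 is a schematic structural diagram of a computer device provided in Embodiment 5 of the present application;
FIG. 7 is a schematic structural diagram of a computer device provided in Embodiment 7 of the present application.
Detailed description
The present application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the present application and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present application rather than the entire structure.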
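Embodiment 1
FIG. 1 is a flowchart of an information merging method provided in Embodiment 1 of the present application. This embodiment is applicable to merging information across large amounts of data. The method can be executed by the information merging device provided in the embodiments of the present application, and includes the following steps:
S110. Acquire data to be processed from at least two data sources, and extract feature information and feature association information from the data to be processed.
The data sources provide different data to be processed, which may be real-time data transmitted by a data source or data that a data source has stored for a preset time period. Feature identifiers are determined according to the data to be merged, and the corresponding feature information, together with the association relationships between items of feature information, is extracted from the data to be processed according to the feature identifiers. A feature identifier may be the name of an item of feature information, or a character or character string that represents it; it may be predetermined, or it may be selected from the data to be processed according to the data to be merged. For example, taking the data sources of an e-commerce platform, if the data to be merged is accounts, the feature identifiers may be identifiers of account-related information, such as the account name, the user who registered the account, and the mobile phone number used to register the account.
Optionally, extracting the feature information and feature association information from the data to be processed includes: matching preset feature identifiers against the data to be processed to determine the feature information corresponding to each preset feature identifier; traversing the data to be processed to determine the association relationship between any two items of feature information; and taking the association relationships as the feature association information. In this embodiment, feature identifiers are set in advance and matched one by one against the data to be processed, yielding the feature information corresponding to each preset feature identifier and the association relationship between any two items of feature information. Taking an e-commerce platform as an example, the data sources include the platform's user behavior data sources, which may include but are not limited to the registration information table, membership information table, card-binding information table, payment information table, real-name authentication information table, and order information table. Correspondingly, the data to be merged may be accounts, and the feature information extracted from the data to be processed may include but is not limited to accounts, membership information, bank card information, ID document information, mobile phone number information, device information, and WIFI information. The preset feature identifiers are matched against these user behavior data sources to obtain the corresponding feature information and the association relationships between items of feature information. Optionally, the data sources that match each preset feature identifier and each association relationship can be determined in advance, which makes feature extraction more targeted and avoids useless matching. For example, see Table 1 and Table 2: Table 1 gives the correspondence between feature information and data sources, and Table 2 gives the correspondence between feature association relationships and data sources.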
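Table 1
Data type | Feature name | Data source
node_type_pin | Account | Registration information table
node_type_pay | Membership information | Membership information table
node_type_card | Bank card information | Card-binding information table, payment information table
node_type_idcard | ID document information | Real-name authentication information table
node_type_phone | Mobile phone number information | Registration information table, order information table
node_type_eid | Device information | Registration information table, payment information table
Table 2
Relation type | Association between features | Data source
register_with_pin_phone | Account and registration mobile phone number | Registration information table
register_on_pin_eid | Account and registration device | Registration information table
bind_to_pin_pay | Account and member ID | Membership information table
login_on_pin_eid | Account and login device | Login information table
auth_by_pin_idcard | Account and ID document information | Real-name authentication information table
reserve_with_card_phone | Bank card and cardholder mobile phone number | User card-binding information table
owns_id_idcard_card | Bank card and cardholder ID document | User card-binding information table
bind_to_pin_card | Account and bound bank card | User card-binding information table
consignee_with_pin_phone | Account and consignee mobile phone number | Order information table
trade_with_pin_card | Account and payment bank card | Payment information table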
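Step S110 can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: each data source is assumed to be a list of row dictionaries, and the table keys (`registration`, `payment`) and column names (`pin`, `phone`, `eid`, `card`) are hypothetical. Only the node and relation type names come from Table 1 and Table 2.

```python
from typing import Dict, List, Set, Tuple

Node = Tuple[str, str]         # (node type, value), e.g. ("node_type_pin", "A")
Edge = Tuple[str, Node, Node]  # (relation type, endpoint, endpoint)

# Hypothetical rules: source table -> (relation type, (node type, column), (node type, column)).
# Relation and node type names follow Tables 1 and 2; table keys and column names are assumptions.
RELATION_RULES = {
    "registration": [
        ("register_with_pin_phone", ("node_type_pin", "pin"), ("node_type_phone", "phone")),
        ("register_on_pin_eid", ("node_type_pin", "pin"), ("node_type_eid", "eid")),
    ],
    "payment": [
        ("trade_with_pin_card", ("node_type_pin", "pin"), ("node_type_card", "card")),
    ],
}


def extract_features(sources: Dict[str, List[dict]]) -> Tuple[Set[Node], Set[Edge]]:
    """Collect feature nodes and pairwise feature relations from the source tables."""
    nodes: Set[Node] = set()
    edges: Set[Edge] = set()
    for table, rows in sources.items():
        for relation, (type_a, col_a), (type_b, col_b) in RELATION_RULES.get(table, []):
            for row in rows:
                a, b = row.get(col_a), row.get(col_b)
                if a:
                    nodes.add((type_a, a))
                if b:
                    nodes.add((type_b, b))
                # A relation is recorded only when both fields are present in the row.
                if a and b:
                    edges.add((relation, (type_a, a), (type_b, b)))
    return nodes, edges
```

Restricting each relation type to the tables that can contain it mirrors the optional pre-mapping of identifiers to data sources described above, so rows from unrelated tables are never scanned for that relation.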
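S120. Generate an information association graph according to the extracted feature information and feature association information.
The information association graph presents the feature information and the feature association information in graph form. For example, items of feature information that have an association relationship can be connected graphically to obtain a graph of the associations among all the feature information.
Optionally, generating the information association graph according to the extracted feature information and feature association information includes: setting feature nodes according to the feature information; obtaining the feature nodes that have an association relationship; and, according to the feature association information, setting an association edge between each pair of feature nodes that have an association relationship, thereby generating the information association graph. The information association graph consists of feature nodes and the association edges that connect them. Each feature node carries one item of feature information, and an association edge is set between the two feature nodes that correspond to two associated items of feature information. For example, see FIG. 2, a schematic diagram of an information association graph provided in an embodiment of the present application. FIG. 2 contains account A, account B, and device C. Accounts A and B were both registered through device C, so each of them has an association relationship with device C. Feature nodes are set for account A, account B, and device C, an association edge is set between the nodes of account A and device C, and another between the nodes of account B and device C, which produces the information association graph in FIG. 2. Because the graph presents feature information as connected nodes rather than as text in data tables, associated feature information can be identified at a glance, which makes the feature information easier and more intuitive to consult.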
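The construction in S120 can be sketched with an undirected adjacency map. This is one possible representation chosen for illustration; the source does not prescribe a data structure, and the function name is hypothetical.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple


def build_association_graph(
    nodes: Iterable[Hashable],
    edges: Iterable[Tuple[Hashable, Hashable]],
) -> Dict[Hashable, Set[Hashable]]:
    """Return an undirected adjacency map: feature node -> neighbouring feature nodes."""
    graph: Dict[Hashable, Set[Hashable]] = {}
    for node in nodes:
        graph.setdefault(node, set())      # a feature node may have no edges yet
    for a, b in edges:
        graph.setdefault(a, set()).add(b)  # store each association edge in both directions
        graph.setdefault(b, set()).add(a)
    return graph


# The example of FIG. 2: accounts A and B were both registered through device C.
graph = build_association_graph(
    nodes=["account A", "account B", "device C"],
    edges=[("account A", "device C"), ("account B", "device C")],
)
```

Here `graph["device C"]` contains both accounts, which is the link that later places accounts A and B in the same connected subgraph.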
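S130. Divide the information association graph into connected subgraphs to generate at least one connected subgraph, and merge the information in the data to be processed according to the at least one connected subgraph.
After the information association graph has been determined, it is divided into connected subgraphs based on the connectivity of its feature nodes. The feature information is thus partitioned by its association relationships into multiple connected subgraphs, where any two feature nodes within one connected subgraph can be reached from each other through one or more association edges, and no association edge exists between feature nodes that belong to two different connected subgraphs.
Optionally, dividing the information association graph into connected subgraphs to generate at least one connected subgraph, and merging the information in the data to be processed according to the at least one connected subgraph, includes: traversing the feature nodes in the information association graph and placing feature nodes that are connected by association edges into the same connected subgraph, where no association edge exists between feature nodes of any two different connected subgraphs; and merging the feature information corresponding to the feature nodes of one connected subgraph into one group. Specifically, for a first feature node in the information association graph, the association edges of that node identify one or more second feature nodes that are associated with it, the number of second feature nodes being equal to the number of association edges of the first feature node. Further, the association edges of each second feature node identify one or more third feature nodes associated with it, where the third feature nodes do not repeat the first feature node. Continuing in this way yields the connected subgraph to which the first feature node belongs.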
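The traversal described above (first node, then its neighbours, then their unvisited neighbours) amounts to a breadth-first search for connected components. A minimal sketch, assuming the graph is an adjacency map from each feature node to the set of its neighbours:

```python
from collections import deque
from typing import Dict, Hashable, List, Set


def connected_subgraphs(graph: Dict[Hashable, Set[Hashable]]) -> List[Set[Hashable]]:
    """Partition the feature nodes into connected subgraphs by breadth-first search."""
    visited: Set[Hashable] = set()
    components: List[Set[Hashable]] = []
    for start in graph:
        if start in visited:
            continue
        component = {start}
        visited.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in graph.get(node, ()):
                if neighbour not in visited:   # never revisit an earlier node
                    visited.add(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components
```

Each node and edge is visited once, so the cost grows linearly with the size of the graph. A union-find structure would be an alternative when edges arrive incrementally; the source does not state which approach is used.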
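In this embodiment, feature information is merged according to the connected subgraphs; for example, feature information of the same type within one connected subgraph can be merged into one group. For example, referring to FIG. 2, account A, account B, and device C belong to the same connected subgraph, so account A and account B can further be determined to be accounts of the same user or of the same organization.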
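Grouping same-type feature information within each connected subgraph can be sketched as below. The `(feature type, value)` node encoding and the function name are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Node = Tuple[str, str]  # (feature type, value), e.g. ("account", "A")


def merge_by_type(components: Iterable[Set[Node]]) -> List[Dict[str, List[str]]]:
    """For each connected subgraph, group the feature values by feature type."""
    groups: List[Dict[str, List[str]]] = []
    for component in components:
        by_type: Dict[str, List[str]] = defaultdict(list)
        for node_type, value in sorted(component):  # sorted for a stable order
            by_type[node_type].append(value)
        groups.append(dict(by_type))
    return groups
```

For the FIG. 2 example, the single subgraph yields one group whose `account` entry lists A and B together, which is the basis for treating them as accounts of one user or organization.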
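The technical solution of this embodiment forms an information association graph from the feature information in the data to be processed by way of the association relationships between items of feature information, and divides the information association graph based on the connectivity of its feature nodes to obtain multiple mutually independent connected subgraphs. Feature information is merged based on the connected subgraphs, which simplifies information merging through a graph representation that is convenient and intuitive, solves the problem that association relationships cannot be clearly determined among massive data in a database, and improves merging efficiency.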
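In some embodiments, after the information association graph has been divided into connected subgraphs to generate at least one connected subgraph, the method further includes: if at least one historical connected subgraph exists, merging the generated at least one connected subgraph with the at least one historical connected subgraph to generate at least one updated connected subgraph. In this embodiment, the historical connected subgraphs can be updated with the newly built connected subgraphs. Specifically, the feature information of any feature node in a newly built connected subgraph is matched against the historical connected subgraphs, and the successfully matched historical connected subgraph is updated with the newly built one. The update can proceed by traversing, in the newly built connected subgraph, each second feature node that shares an association edge with a first feature node, and checking whether the historical connected subgraph contains that second feature node. If it does not, the second feature node is added together with the association edge between the first and second feature nodes. If it does, the historical connected subgraph is checked for an association edge between the first and second feature nodes; if the edge exists no update is needed, and if it does not the edge is added.
Correspondingly, merging the information in the data to be processed according to the at least one connected subgraph includes: merging the information in the data to be processed according to the at least one updated connected subgraph. In this embodiment, the historical connected subgraphs are continuously updated with newly built connected subgraphs, which makes the feature information in the connected subgraphs more complete and the information merging more accurate.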
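The update of historical connected subgraphs can be sketched as follows, with each subgraph held as its own adjacency map. One point is an assumption beyond the text: when a new subgraph shares nodes with several historical subgraphs, this sketch fuses all of them into one, since they are then connected through the new edges.

```python
from typing import Dict, Hashable, List, Set

Subgraph = Dict[Hashable, Set[Hashable]]  # adjacency map of one connected subgraph


def merge_subgraphs(history: List[Subgraph], new: List[Subgraph]) -> List[Subgraph]:
    """Merge newly built connected subgraphs into the historical ones."""
    # Copy the inputs so that the caller's historical subgraphs are not modified.
    updated = [{n: set(nb) for n, nb in sub.items()} for sub in history]
    for sub in new:
        merged = {n: set(nb) for n, nb in sub.items()}
        remaining = []
        for old in updated:
            if old.keys() & merged.keys():   # shares at least one feature node
                for node, neighbours in old.items():
                    # Missing nodes are added; missing edges are added to existing nodes.
                    merged.setdefault(node, set()).update(neighbours)
            else:
                remaining.append(old)
        remaining.append(merged)
        updated = remaining
    return updated
```

A new subgraph that matches no historical subgraph is simply appended, so the result always covers both the old and the new feature information.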
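Taking the user behavior data sources of an e-commerce platform as an example, merging the platform's user data may proceed as follows. Data to be processed is acquired from the user behavior data sources of the e-commerce platform, and feature information and feature association information are extracted from it. The data sources include the registration information table, membership information table, card-binding information table, payment information table, real-name authentication information table, and order information table, and the feature information may include accounts, membership information, bank card information, ID document information, mobile phone number information, device information, and WIFI information.
A user information association graph of the e-commerce platform is generated according to the extracted feature information and feature association information. The graph contains account nodes, membership information nodes, bank card information nodes, ID document information nodes, mobile phone number information nodes, device information nodes, and WIFI information nodes, and association edges are set between these nodes according to the association relationships of the feature information.
The information association graph is divided into connected subgraphs to generate at least one connected subgraph, each of which is a set of user information items that have an association relationship, and the information in the data to be processed is merged according to the at least one connected subgraph. The accounts in one connected subgraph can be determined to be accounts of the same user or of the same organization. Merging information by connected subgraph makes it possible to find, from any one item of feature information, the other information associated with it. This facilitates later investigation of criminal activity carried out through associated accounts, such as fake orders, black-market operations, and fraud gangs, improves the efficiency of information queries, and supports unified management of feature information.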
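Embodiment 2
FIG. 3 is a flowchart of a transaction query method provided in Embodiment 2 of the present application. This embodiment is applicable to querying risky transactions. The method can be executed by the transaction query device provided in the embodiments of the present application, and includes the following steps:
S310. Obtain known risk user information, perform matching in at least one connected subgraph according to the risk user information, and determine a target connected subgraph that matches the known risk user information, wherein the at least one connected subgraph is determined by the information merging method provided in the above embodiment.
S320. Extract associated user information from the target connected subgraph.
S330. Determine the current transaction of the associated user information, and determine that current transaction as a risky transaction.
The known risk user information may include multiple items of feature information, such as the user's real-name authentication information, account information, and ID card information. It may be the information of a user who has carried out prohibited operations such as fraud, fake orders, or money laundering. One or more items of the known risk user information are matched in the at least one connected subgraph to determine the connected subgraph to which the known risk user information belongs; the other associated users in that connected subgraph are determined as risky users, and the current transactions being conducted by the associated users are determined as risky transactions.
Optionally, performing matching in at least one connected subgraph according to the risk user information and determining the target connected subgraph that matches the known risk user information includes: extracting risk feature information from the known risk user information according to a preset feature identifier; and matching the risk feature information against the feature information of the feature nodes in at least one connected subgraph, and, when the matching succeeds, determining the connected subgraph to which the successfully matched feature node belongs as the target connected subgraph.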
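Steps S310 to S330 can be sketched as below. The node encoding, the `account` type label, and the mapping from accounts to their current transactions are hypothetical. The sketch also flags the transactions of every account in the target subgraph, including the known risk account itself, which is an assumption.

```python
from typing import Dict, List, Optional, Set, Tuple

Node = Tuple[str, str]  # (feature type, value)


def find_target_subgraph(
    risk_features: Set[Node], subgraphs: List[Set[Node]]
) -> Optional[Set[Node]]:
    """S310: return the first connected subgraph containing any risk feature."""
    for subgraph in subgraphs:
        if risk_features & subgraph:
            return subgraph
    return None


def risky_transactions(
    risk_features: Set[Node],
    subgraphs: List[Set[Node]],
    current_transactions: Dict[str, List[str]],
    account_type: str = "account",
) -> List[str]:
    """S320 and S330: collect the current transactions of all accounts in the target subgraph."""
    target = find_target_subgraph(risk_features, subgraphs)
    if target is None:
        return []
    accounts = sorted(value for node_type, value in target if node_type == account_type)
    return [tx for account in accounts for tx in current_transactions.get(account, [])]
```

Because the connected subgraphs are disjoint, returning the first match is sufficient as long as all risk features describe one user; features spanning several subgraphs would need a policy that the source does not specify.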
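In this embodiment, the connected subgraphs that have been built may be stored in a graph database. The graph database may be an HBase database, with ElasticSearch used as its indexing tool. The ElasticSearch table mainly holds the information that lets users query nodes and edges by their attributes. Taking order payment data as an example, the payment details of orders are saved in the ElasticSearch table. When a query is made with the order payment method as a condition, the feature nodes and association edges that satisfy the condition are first looked up in the ElasticSearch table, the information associated with those feature nodes and association edges is then queried in the graph, and the target connected subgraph is obtained from the query result. For example, see Table 3, a schematic payment data table provided in an embodiment of the present application.
Table 3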
Figure PCTCN2019127178-appb-000001
Figure PCTCN2019127178-appb-000002
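The two-stage lookup (attribute index first, then graph expansion) can be sketched in memory as below. This sketch does not use the HBase or ElasticSearch APIs: `INDEX` and `GRAPH` are hypothetical stand-ins, and the order nodes and the `pay_method` field are invented for illustration.

```python
from typing import Dict, List, Set

# Hypothetical stand-in for the attribute index: one document per feature node.
INDEX: List[dict] = [
    {"node": "order 1", "pay_method": "credit_card"},
    {"node": "order 2", "pay_method": "balance"},
]

# Hypothetical stand-in for the graph store: an undirected adjacency map.
GRAPH: Dict[str, Set[str]] = {
    "order 1": {"account A"},
    "account A": {"order 1", "device C"},
    "device C": {"account A"},
    "order 2": {"account D"},
    "account D": {"order 2"},
}


def query_nodes(index: List[dict], **conditions: str) -> List[str]:
    """Stage 1: find the feature nodes whose attributes satisfy every condition."""
    return [
        doc["node"]
        for doc in index
        if all(doc.get(key) == value for key, value in conditions.items())
    ]


def expand(graph: Dict[str, Set[str]], seeds: List[str]) -> Set[str]:
    """Stage 2: collect every node reachable from the seed nodes in the graph."""
    seen = set(seeds)
    stack = list(seeds)
    while stack:
        node = stack.pop()
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen


seeds = query_nodes(INDEX, pay_method="credit_card")
target = expand(GRAPH, seeds)
```

Keeping attributes in a separate index means attribute queries never scan the graph; the graph is touched only to expand the few nodes that the index returns.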
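Optionally, each feature node of a connected subgraph includes at least one item of behavior attribute information. Correspondingly, when matching according to the known risk user information fails, behavior feature information of the known risk user is obtained and matched against the behavior attribute information in the connected subgraphs, and the target connected subgraph is determined according to the matching result. For example, the behavior attribute information of an account may be the region of the IP (Internet Protocol) address used to register the account, the registration time, the registration source, and so on, and the behavior attribute information of an order includes the consignee, the shipping address, the product type, and so on. Behavior feature information may include but is not limited to the IP region of the registered account, the transaction time, the payment method, and the IP region of the delivery. The transaction behavior feature information of a known risk user is extracted and matched in each connected subgraph to determine the target connected subgraph. Searching for the target connected subgraph with multi-dimensional data improves both the accuracy and the speed of determining it.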
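The fallback match on behavior attributes can be sketched as follows. The source does not define how a match is scored, so the rule used here (count equal attribute values on a single feature node and require at least `min_matches`) is purely an assumption, as are the attribute names in the usage note.

```python
from typing import Dict, List, Optional

Subgraph = Dict[str, Dict[str, str]]  # feature node -> behavior attribute information


def match_by_behaviour(
    behaviour: Dict[str, str],
    subgraphs: List[Subgraph],
    min_matches: int = 2,
) -> Optional[Subgraph]:
    """Return the subgraph whose best node shares the most behavior attributes."""
    best: Optional[Subgraph] = None
    best_score = 0
    for subgraph in subgraphs:
        # Score a subgraph by its best-matching feature node.
        score = max(
            (
                sum(1 for key, value in behaviour.items() if attrs.get(key) == value)
                for attrs in subgraph.values()
            ),
            default=0,
        )
        if score > best_score:
            best, best_score = subgraph, score
    return best if best_score >= min_matches else None
```

For instance, a known risk user whose `reg_ip_region` and `reg_source` both equal those stored on one account node would select that node's subgraph, while a single coincidental match would not.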
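In this embodiment, after a risky transaction has been determined, the risky transaction may be verified and, when the verification succeeds, intercepted. Verification may be a manual review, or it may be carried out against preset conditions such as the transaction time or the transaction type; when the risky transaction meets the preset conditions it can be intercepted, which improves transaction security.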
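Verification against preset conditions can be sketched as a simple rule check. The specific conditions below (a blocked transaction type and a night-time window) are invented examples; the source names only transaction time and transaction type as possible conditions.

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class Transaction:
    tx_id: str
    tx_type: str
    tx_time: time


# Hypothetical preset conditions, chosen only for illustration.
BLOCKED_TYPES = {"virtual_goods"}
WINDOW_START, WINDOW_END = time(0, 0), time(5, 0)


def should_intercept(tx: Transaction) -> bool:
    """Return True when a flagged transaction meets any preset condition."""
    in_window = WINDOW_START <= tx.tx_time < WINDOW_END
    return tx.tx_type in BLOCKED_TYPES or in_window
```

Transactions that are flagged as risky but meet none of the conditions could then be routed to manual review instead of being intercepted automatically.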
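Embodiment 3
FIG. 4 is a schematic structural diagram of an information merging device provided in Embodiment 3 of the present application. As shown in FIG. 4, the device includes an information extraction module 410, an information association graph generation module 420, and an information merging module 430.
The information extraction module 410 is configured to acquire data to be processed from at least two data sources, and to extract feature information and feature association information from the data to be processed;
the information association graph generation module 420 is configured to generate an information association graph according to the extracted feature information and feature association information;
the information merging module 430 is configured to divide the information association graph into connected subgraphs to generate at least one connected subgraph, and to merge the information in the data to be processed according to the at least one connected subgraph.
Optionally, the information extraction module 410 is configured to:
match preset feature identifiers against the data to be processed to determine the feature information corresponding to each preset feature identifier;
traverse the data to be processed to determine the association relationship between any two items of feature information;
take the association relationships as the feature association information.
Optionally, the information association graph generation module 420 is configured to:
set feature nodes according to the feature information;
set, according to the feature association information, an association edge between each pair of feature nodes that have an association relationship, thereby generating the information association graph.
Optionally, the information merging module 430 is configured to:
traverse the feature nodes in the information association graph and place feature nodes that are connected by association edges into the same connected subgraph, where no association edge exists between feature nodes of any two different connected subgraphs;
merge the feature information corresponding to the feature nodes of one connected subgraph into one group.
Optionally, the device further includes:
an updated connected subgraph determining module, configured to, after the information association graph has been divided into connected subgraphs to generate at least one connected subgraph, and if at least one historical connected subgraph exists, merge the generated at least one connected subgraph with the at least one historical connected subgraph to generate at least one updated connected subgraph;
correspondingly, the information merging module 430 is configured to merge the information in the data to be processed according to the at least one updated connected subgraph.
Optionally, the data sources include user behavior data sources of an e-commerce platform; correspondingly, the information association graph is a user information association graph of the e-commerce platform, and each connected subgraph is a set of user information items that have an association relationship.
The above product can execute the information merging method provided by any embodiment of the present application, and has the functional modules and beneficial effects corresponding to that method.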
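Embodiment 4
FIG. 5 is a schematic structural diagram of a transaction query device provided in Embodiment 4 of the present application. The transaction query device includes a first target connected subgraph determining module 510, an associated user information determining module 520, and a risky transaction determining module 530.
The first target connected subgraph determining module 510 is configured to obtain known risk user information, perform matching in at least one connected subgraph according to the risk user information, and determine a target connected subgraph that matches the known risk user information, wherein the at least one connected subgraph is determined by the aforementioned information merging method;
the associated user information determining module 520 is configured to extract associated user information from the target connected subgraph;
the risky transaction determining module 530 is configured to determine the current transaction of the associated user information, and to determine that current transaction as a risky transaction.
Optionally, the first target connected subgraph determining module 510 is configured to:
extract risk feature information from the known risk user information according to a preset feature identifier;
match the risk feature information against the feature information of the feature nodes in at least one connected subgraph, and, when the matching succeeds, determine the connected subgraph to which the successfully matched feature node belongs as the target connected subgraph.
Optionally, each feature node of a connected subgraph includes at least one item of behavior attribute information;
correspondingly, the device further includes:
a second target connected subgraph determining module, configured to, when matching according to the known risk user information fails, obtain behavior feature information of the known risk user, match the behavior feature information against the behavior attribute information in the connected subgraphs, and determine the target connected subgraph according to the matching result.
The above product can execute the transaction query method provided by any embodiment of the present application, and has the functional modules and beneficial effects corresponding to that method.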
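Embodiment 5
FIG. 6 is a schematic structural diagram of a computer device provided in Embodiment 5 of the present application. FIG. 6 shows a block diagram of a computer device 612 suitable for implementing embodiments of the present application. The computer device 612 shown in FIG. 6 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present application. The device 612 is typically a computing device that performs the information merging function.
As shown in FIG. 6, the computer device 612 takes the form of a general-purpose computing device. The components of the computer device 612 may include, but are not limited to: one or more processors 616, a storage device 628, and a bus 618 that connects the different system components (including the storage device 628 and the processors 616).
The bus 618 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus structures. For example, these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
The computer device 612 typically includes a variety of computer-system-readable media. These media may be any available media that can be accessed by the computer device 612, including volatile and non-volatile media and removable and non-removable media.
The storage device 628 may include computer-system-readable media in the form of volatile memory, such as a random access memory (RAM) 630 and/or a cache memory 632. The computer device 612 may further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, the storage system 634 may be used to read and write a non-removable, non-volatile magnetic medium (not shown in FIG. 6, and commonly called a "hard disk drive"). Although not shown in FIG. 6, a magnetic disk drive for reading and writing a removable non-volatile magnetic disk (such as a "floppy disk") and an optical disc drive for reading and writing a removable non-volatile optical disc (such as a Compact Disc-Read Only Memory (CD-ROM), a Digital Video Disc-Read Only Memory (DVD-ROM), or other optical media) may be provided. In these cases, each drive may be connected to the bus 618 through one or more data media interfaces. The storage device 628 may include at least one program product having a set of (for example, at least one) program modules that are configured to perform the functions of the embodiments of the present application.
A program 636 having a set of (at least one) program modules 626 may be stored, for example, in the storage device 628. Such program modules 626 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment. The program modules 626 generally perform the functions and/or methods of the embodiments described in the present application.
The computer device 612 may also communicate with one or more external devices 614 (such as a keyboard, a pointing device, a camera, a display 624, etc.), with one or more devices that enable a user to interact with the computer device 612, and/or with any device (such as a network card, a modem, etc.) that enables the computer device 612 to communicate with one or more other computing devices. Such communication can take place through an input/output (I/O) interface 622. In addition, the computer device 612 may communicate with one or more networks (such as a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) through a network adapter 620. As shown in the figure, the network adapter 620 communicates with the other modules of the computer device 612 through the bus 618. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with the computer device 612, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, Redundant Arrays of Independent Disks (RAID) systems, tape drives, and data backup storage systems.
The processor 616 executes various functional applications and data processing by running the programs stored in the storage device 628, for example implementing the information merging method provided in the above embodiments of the present application.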
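Embodiment 6
Embodiment 6 of this application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the information merging method provided by the embodiments of this application.
Of course, for the computer-readable storage medium provided by the embodiments of this application, the computer program stored on it is not limited to the method operations described above, and can also execute the information merging method provided by any embodiment of this application.
The computer storage medium of the embodiments of this application may be any combination of one or more computer-readable media. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection having one or more wires, a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by or in combination with an instruction execution system, apparatus, or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device.
Program code contained on a computer-readable medium may be transmitted over any appropriate medium, including but not limited to wireless, wire, optical cable, RF, and so on, or any suitable combination of the above.
Computer program code for performing the operations of this application may be written in one or more programming languages or a combination of them, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).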
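Embodiment 7
FIG. 7 is a schematic structural diagram of a computer device provided by Embodiment 7 of this application. FIG. 7 shows a block diagram of a computer device 712 suitable for implementing the embodiments of this application. The computer device 712 shown in FIG. 7 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of this application. The device 712 is typically a computing device that performs the transaction query function.
As shown in FIG. 7, the computer device 712 takes the form of a general-purpose computing device. The components of the computer device 712 may include, but are not limited to: one or more processors 716, a storage apparatus 728, and a bus 718 connecting the different system components (including the storage apparatus 728 and the processors 716).
The bus 718 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus structures. By way of example, these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
The computer device 712 typically includes a variety of computer-system-readable media. These media can be any available media accessible by the computer device 712, including volatile and non-volatile media, and removable and non-removable media.
The storage apparatus 728 may include computer-system-readable media in the form of volatile memory, such as Random Access Memory (RAM) 730 and/or cache memory 732. The computer device 712 may further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, the storage system 734 may be used to read from and write to non-removable, non-volatile magnetic media (not shown in FIG. 7, commonly called a "hard disk drive"). Although not shown in FIG. 7, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (such as a "floppy disk") may be provided, as may an optical disc drive for reading from and writing to a removable non-volatile optical disc (such as a Compact Disc-Read Only Memory (CD-ROM), a Digital Video Disc-Read Only Memory (DVD-ROM), or other optical media). In these cases, each drive may be connected to the bus 718 through one or more data media interfaces. The storage apparatus 728 may include at least one program product having a set of (for example, at least one) program modules configured to perform the functions of the embodiments of this application.
A program 736 having a set of (at least one) program modules 726 may be stored, for example, in the storage apparatus 728. Such program modules 726 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment. The program modules 726 generally perform the functions and/or methods of the embodiments described in this application.
The computer device 712 may also communicate with one or more external devices 714 (such as a keyboard, a pointing device, a camera, or a display 724), with one or more devices that enable a user to interact with the computer device 712, and/or with any device (such as a network card or a modem) that enables the computer device 712 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 722. In addition, the computer device 712 may communicate with one or more networks (such as a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) through a network adapter 720. As shown in the figure, the network adapter 720 communicates with the other modules of the computer device 712 through the bus 718. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with the computer device 712, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, Redundant Arrays of Independent Disks (RAID) systems, tape drives, and data backup storage systems.
The processor 716 executes various functional applications and data processing by running the programs stored in the storage apparatus 728, for example implementing the transaction query method provided by the above embodiments of this application.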
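Embodiment 8
Embodiment 8 of this application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the transaction query method provided by the embodiments of this application.
Of course, for the computer-readable storage medium provided by the embodiments of this application, the computer program stored on it is not limited to the method operations described above, and can also execute the transaction query method provided by any embodiment of this application.
The computer storage medium of the embodiments of this application may be any combination of one or more computer-readable media. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection having one or more wires, a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by or in combination with an instruction execution system, apparatus, or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device.
Program code contained on a computer-readable medium may be transmitted over any appropriate medium, including but not limited to wireless, wire, optical cable, RF, and so on, or any suitable combination of the above.
Computer program code for performing the operations of this application may be written in one or more programming languages or a combination of them, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).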
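Note that the above are only the preferred embodiments of this application and the technical principles applied. Those skilled in the art will understand that this application is not limited to the specific embodiments described here, and that various obvious changes, readjustments, and substitutions can be made without departing from the scope of protection of this application. Therefore, although this application has been described in some detail through the above embodiments, it is not limited to them; it may include further equivalent embodiments without departing from the concept of this application, and its scope is determined by the scope of the appended claims.
Those skilled in the art should understand that the embodiments of this application may be provided as a method, a system, or a computer program product. Accordingly, this application may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Moreover, this application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage and optical storage) containing computer-usable program code.
This application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to its embodiments. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in them, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
The above are only preferred embodiments of this application and are not intended to limit its scope of protection.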
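Industrial Applicability
In the technical solution of this embodiment, the feature information in the data to be processed is formed into an information association graph through the association relationships among the feature information; the information association graph is partitioned according to the connectivity of its feature nodes into multiple mutually independent connected subgraphs; and the feature information is merged on the basis of those connected subgraphs. The graph-based approach simplifies information merging and makes it convenient and intuitive, addresses the difficulty of clearly determining data association relationships within the massive data of a database, and improves the efficiency of information merging.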
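The merging pipeline summarized above (extract features and associations, build the information association graph, split it into connected subgraphs) can be sketched roughly as follows. The preset feature keys, the `key:value` node encoding, and the rule that features appearing in the same record are associated are all assumptions made for this sketch; the application leaves those details open.

```python
from itertools import combinations
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical preset feature identifiers; the application does not list them.
FEATURE_KEYS = ("phone", "device_id", "card_no")
Edge = Tuple[str, str]


def extract(records: Iterable[Dict[str, str]]) -> Tuple[Set[str], Set[Edge]]:
    """Return feature nodes and association edges found in the records.

    Assumption: two features are associated when they occur in one record.
    """
    features: Set[str] = set()
    edges: Set[Edge] = set()
    for record in records:
        found = sorted(f"{k}:{record[k]}" for k in FEATURE_KEYS if record.get(k))
        features.update(found)
        edges.update(combinations(found, 2))
    return features, edges


def connected_subgraphs(features: Set[str], edges: Set[Edge]) -> List[Set[str]]:
    """Partition the association graph into connected subgraphs (iterative DFS)."""
    adjacency: Dict[str, Set[str]] = {f: set() for f in features}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen: Set[str] = set()
    groups: List[Set[str]] = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        seen.add(start)
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            group.add(node)
            for neighbour in adjacency[node] - seen:
                seen.add(neighbour)
                stack.append(neighbour)
        groups.append(group)  # every node in a group is merged into one record group
    return groups


# Two illustrative data sources.
source_a = [{"phone": "138", "device_id": "D1"}, {"phone": "139", "device_id": "D2"}]
source_b = [{"device_id": "D1", "card_no": "6222"}]
features, edges = extract(source_a + source_b)
groups = connected_subgraphs(features, edges)
```

In this example, the shared `device_id:D1` node links a record from each source, so their features end up in one group, while the `139`/`D2` record forms its own group.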
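In addition, one or more items of the known risk user information are matched within at least one connected subgraph to determine the connected subgraph to which the known risk user information belongs; the other associated users in that connected subgraph are determined to be risk users, and the current transactions those associated users are carrying out are determined to be risk transactions.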
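A minimal sketch of that last step is given below, assuming the target connected subgraph has already been found. The `current_transactions` lookup, the `account:` node prefix, and the transaction IDs are hypothetical stand-ins for whatever transaction store a real system would query.

```python
from typing import Dict, List, Set


def flag_risk_transactions(
    target_subgraph: Set[str],
    known_risk_features: Set[str],
    current_transactions: Dict[str, List[str]],
) -> List[str]:
    """Flag the in-progress transactions of every other node in the subgraph.

    current_transactions maps a feature node (here, an account identifier)
    to the IDs of its in-progress transactions; this lookup is assumed.
    """
    associated = target_subgraph - known_risk_features
    flagged: List[str] = []
    for node in sorted(associated):
        flagged.extend(current_transactions.get(node, []))
    return flagged


target_subgraph = {"account:A", "account:B", "device_id:D1"}
current = {"account:B": ["T1001"], "account:C": ["T2002"]}
risk_transactions = flag_risk_transactions(target_subgraph, {"account:A"}, current)
```

Only `account:B` shares a subgraph with the known risk user, so its transaction is flagged while the unrelated `account:C` is left alone.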

Claims (15)

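  1. An information merging method, comprising:
    acquiring data to be processed from at least two data sources, and extracting feature information and feature association information from the data to be processed;
    generating an information association graph according to the extracted feature information and feature association information;
    partitioning the information association graph into connected subgraphs to generate at least one connected subgraph, and merging the information of the data to be processed according to the at least one connected subgraph.
  2. The method according to claim 1, wherein extracting the feature information and feature association information from the data to be processed comprises:
    matching within the data to be processed according to a preset feature identifier, and determining the feature information corresponding to the preset feature identifier;
    traversing the data to be processed, and determining the association relationship between any two pieces of feature information;
    determining the association relationship as the feature association information.
  3. The method according to claim 1, wherein generating the information association graph according to the extracted feature information and feature association information comprises:
    setting feature nodes according to the feature information;
    obtaining the feature nodes that have an association relationship;
    setting, according to the feature association information, an association edge between each two feature nodes that have an association relationship, to generate the information association graph.
  4. The method according to claim 3, wherein partitioning the information association graph into connected subgraphs to generate at least one connected subgraph, and merging the information of the data to be processed according to the at least one connected subgraph, comprises:
    traversing the feature nodes in the information association graph, and assigning feature nodes connected by association edges to the same connected subgraph, wherein no association edge exists between any feature nodes of any two connected subgraphs;
    merging the feature information corresponding to the feature nodes in the same connected subgraph into the same group information.
  5. The method according to claim 1, wherein, after partitioning the information association graph into connected subgraphs to generate at least one connected subgraph, the method further comprises:
    if at least one historical connected subgraph exists, merging the generated at least one connected subgraph with the at least one historical connected subgraph to generate at least one updated connected subgraph;
    correspondingly, merging the information of the data to be processed according to the at least one connected subgraph comprises:
    merging the information of the data to be processed according to the at least one updated connected subgraph.
  6. The method according to any one of claims 1 to 5, wherein the data sources include a user behavior data source of an e-commerce platform; correspondingly, the information association graph is a user information association graph of the e-commerce platform, and each connected subgraph is a set of user information having association relationships.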
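  7. A transaction query method, comprising:
    acquiring known risk user information, matching within at least one connected subgraph according to the risk user information, and determining a target connected subgraph that matches the known risk user information, wherein the at least one connected subgraph is determined by the information merging method according to any one of claims 1 to 6;
    extracting associated user information from the target connected subgraph;
    determining the current transaction of the associated user information, and determining the current transaction of the associated user information as a risk transaction.
  8. The method according to claim 7, wherein matching within at least one connected subgraph according to the risk user information, and determining the target connected subgraph that matches the known risk user information, comprises:
    extracting risk feature information from the known risk user information according to a preset feature identifier;
    matching the risk feature information against the feature information of the feature nodes in at least one connected subgraph and, when the match succeeds, determining the connected subgraph to which the matched feature node belongs as the target connected subgraph.
  9. The method according to claim 7, wherein each feature node of the connected subgraph includes at least one piece of behavior attribute information; correspondingly, when matching based on the known risk user information fails, behavior feature information of the known risk user is acquired, the behavior feature information is matched against the behavior attribute information in the connected subgraph, and the target connected subgraph is determined according to the matching result.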
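  10. An information merging apparatus, comprising:
    an information extraction module, configured to acquire data to be processed from at least two data sources and to extract feature information and feature association information from the data to be processed;
    an information association graph generation module, configured to generate an information association graph according to the extracted feature information and feature association information;
    an information merging module, configured to partition the information association graph into connected subgraphs to generate at least one connected subgraph, and to merge the information of the data to be processed according to the at least one connected subgraph.
  11. A transaction query apparatus, comprising:
    a first target connected subgraph determination module, configured to acquire known risk user information, match within at least one connected subgraph according to the risk user information, and determine a target connected subgraph that matches the known risk user information, wherein the at least one connected subgraph is determined by the information merging method according to any one of claims 1 to 6;
    an associated user information determination module, configured to extract associated user information from the target connected subgraph;
    a risk transaction determination module, configured to determine the current transaction of the associated user information and to determine the current transaction of the associated user information as a risk transaction.
  12. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the information merging method according to any one of claims 1 to 6.
  13. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the information merging method according to any one of claims 1 to 6.
  14. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the transaction query method according to any one of claims 7 to 9.
  15. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the transaction query method according to any one of claims 7 to 9.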
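
As an illustration of the update step recited in claim 5, the sketch below merges newly generated connected subgraphs with historical ones. The claim does not state the merge criterion, so this sketch assumes two subgraphs are merged whenever they share at least one feature node; the function name and sample data are hypothetical.

```python
from typing import List, Set


def merge_with_history(new: List[Set[str]], history: List[Set[str]]) -> List[Set[str]]:
    """Merge new connected subgraphs into historical ones.

    Assumption: subgraphs sharing at least one feature node belong together.
    """
    merged: List[Set[str]] = []
    for component in history + new:
        component = set(component)  # copy so the inputs are not mutated
        remaining: List[Set[str]] = []
        for existing in merged:
            if existing & component:
                component |= existing  # absorb every overlapping subgraph
            else:
                remaining.append(existing)
        remaining.append(component)
        merged = remaining  # entries stay pairwise disjoint
    return merged


history = [{"phone:138", "device_id:D1"}, {"card_no:6222"}]
new = [{"device_id:D1", "card_no:6222"}, {"phone:139"}]
updated = merge_with_history(new, history)
```

Here the new subgraph containing `device_id:D1` and `card_no:6222` bridges the two historical subgraphs, so all three collapse into a single updated subgraph.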
PCT/CN2019/127178 2019-03-06 2019-12-20 Information merging method, transaction query method, apparatus, computer, and storage medium WO2020177450A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910167233.3A CN111666346A (zh) 2019-03-06 2019-03-06 Information merging method, transaction query method, apparatus, computer, and storage medium
CN201910167233.3 2019-03-06

Publications (1)

Publication Number Publication Date
WO2020177450A1 true WO2020177450A1 (zh) 2020-09-10

Family

ID=72338088

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/127178 WO2020177450A1 (zh) 2019-03-06 2019-12-20 Information merging method, transaction query method, apparatus, computer, and storage medium

Country Status (2)

Country Link
CN (1) CN111666346A (zh)
WO (1) WO2020177450A1 (zh)

Also Published As

Publication number Publication date
CN111666346A (zh) 2020-09-15

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19918081

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19918081

Country of ref document: EP

Kind code of ref document: A1