WO2017114276A1 - Graph-based method and system for analyzing users - Google Patents

Graph-based method and system for analyzing users

Info

Publication number
WO2017114276A1
WO2017114276A1 PCT/CN2016/111441 CN2016111441W WO2017114276A1 WO 2017114276 A1 WO2017114276 A1 WO 2017114276A1 CN 2016111441 W CN2016111441 W CN 2016111441W WO 2017114276 A1 WO2017114276 A1 WO 2017114276A1
Authority
WO
WIPO (PCT)
Prior art keywords
user
information
merchant
merchants
association
Prior art date
Application number
PCT/CN2016/111441
Other languages
English (en)
French (fr)
Inventor
何东杰
Original Assignee
中国银联股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国银联股份有限公司
Publication of WO2017114276A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2323 Non-hierarchical techniques based on graph theory, e.g. minimum spanning trees [MST] or graph cuts

Definitions

  • Embodiments of the present invention relate to data analysis and, in particular, to graph-based methods and systems for analyzing users.
  • according to one embodiment of the present invention, a graph-based method for analyzing users is disclosed. The method maintains a graph in which objects are vertices and the association information between objects forms the edges, wherein the objects include users and merchants and the edges indicate the association between a user and a merchant. The method includes:
  • A. a data feature parsing process, including: parsing data records of users, of merchants, and of activity between users and merchants to obtain key information, wherein the key information includes a user identifier, a merchant identifier, and consumption information generated between the user and the merchant; and generating vertex information and edge information of the graph from the obtained key information, wherein the user identifier and the merchant identifier serve as vertex information and the consumption information serves as edge information.
  • B. an association analysis process, including: analyzing the association between a first user and other users based at least on one or more merchants associated with the first user.
  • according to one embodiment of the present invention, a graph-based system for analyzing users is disclosed. The system maintains a graph in which objects are vertices and the association information between objects forms the edges, wherein the objects include users and merchants and the edges indicate the association between a user and a merchant. The system includes:
  • A. a data feature parsing module configured to: parse data records of users, of merchants, and of activity between users and merchants to obtain key information, wherein the key information includes a user identifier, a merchant identifier, and consumption information generated between the user and the merchant; and generate vertex information and edge information of the graph from the obtained key information, wherein the user identifier and the merchant identifier serve as vertex information and the consumption information serves as edge information.
  • B. an association analysis module configured to: analyze the association between a first user and other users based at least on one or more merchants associated with the first user.
  • the technical solution of the invention shortens the time needed for data updates and data analysis, which effectively improves the timeliness of the data and raises the efficiency of association analysis and classification analysis over massive data in a big data environment.
  • analysis and processing are accelerated by constructing a relationship graph of users and merchants, by strong and weak association analysis, and by classification through edge partitioning.
  • FIG. 1 is a schematic diagram of analyzing a user based on a graph in which an object is a vertex and the association information between the object and the object is an edge, according to an embodiment of the present invention.
  • FIG. 2 is a flow chart of a method for analyzing a user based on a graph, in accordance with an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of a graph-based system for analyzing users in accordance with an embodiment of the present invention.
  • a graph is a data structure defined as graph = (V, E), where V is a non-empty finite set of vertices (nodes)
  • E is the set of edges, generally written (Vx, Vy), where Vx and Vy belong to V. If an edge connects two nodes U and V, the two nodes are said to be associated.
  • a weighted graph can be used to represent relationships between two adjacent vertices other than the connection itself.
  • the present invention proposes maintaining a graph in which objects are vertices and the association information between objects forms the edges, so that association analysis between objects (users or merchants) can use graph-based association models and algorithms, improving the performance and efficiency of data analysis.
  • the user may be, for example, a bank card user or any user of a network service (e.g., online shopping), and the merchant may be any entity (e.g., a physical merchant or an online merchant) that provides a product or service.
  • FIG. 1 is a schematic diagram of analyzing a user based on a graph in which an object is a vertex and the association information between the object and the object is an edge, according to an embodiment of the present invention.
  • FIG. 1 shows users 1-7 and merchants 1-4. These 11 objects are linked by the users' consumption behavior (purchasing products or services) and form a graph. For example, after user 1 makes a purchase at merchant 1, a connection between user 1 and merchant 1 is established.
  • the vertices of the graph in FIG. 1 represent objects, and an edge between two vertices indicates the association information between them.
  • for example, the user identifier and the merchant identifier are used as vertex information.
  • the association information used as edge information may be information about consumption occurring between the user and the merchant, for example the time, time period, location, or frequency of the user's consumption events at the merchant, the consumption amount, the type of goods consumed, or the merchant identifier.
  • the present invention proposes generating a graph with users and merchants as vertices according to the characteristics of users' consumption behavior, and estimating from that graph the association between users and merchants and between users and other users.
  • in the example of FIG. 1, the vertices can be filtered by merchant identifier and consumption information according to the needs of a particular analysis.
  • in one example, when analyzing user 1, merchant 3, which has a specific merchant identifier (e.g., a convenience store), may be filtered out first. Then, among the remaining merchants associated with user 1, a search is made for users who share at least a predetermined number of associated merchants with user 1 (a strong degree of association).
  • for example, the predetermined number can be set to 3, in which case the degree of association between user 4 and user 1 is strong.
  • in one example, the analysis is directed at the relationships among the users associated with merchant 2. The filtering condition can then be set such that the amount spent at merchant 2 within a certain period of time is greater than a predetermined value (a strong degree of association). Based on this condition and the consumption information (edge information) between users 1, 4, 5, 7 and merchant 2, it can be determined which of these users are strongly associated with respect to merchant 2.
  • those skilled in the art will understand that the association between a user and merchants, and between that user and other users, may also be analyzed based on one or more items of the merchant identifier and consumption information (e.g., one or more of time, time period, location, frequency, consumption amount, and type of goods, and their various combinations).
  • FIG. 2 is a flow chart of a method for analyzing a user based on a graph, in accordance with an embodiment of the present invention.
  • in this method, a graph is maintained in which objects are vertices and the association information between objects forms the edges, wherein the objects include users and merchants and the edges indicate the association between a user and a merchant.
  • the method includes a data feature parsing process 200 and an association analysis process 300.
  • the data feature parsing process 200 includes:
  • Step 210: Parse data records of users, of merchants, and of activity between users and merchants to obtain key information, where the key information includes a user identifier, a merchant identifier, and consumption information generated between the user and the merchant;
  • Step 220: Generate vertex information and edge information of the graph from the obtained key information, where the user identifier and the merchant identifier are used as vertex information and the consumption information is used as edge information;
  • the association analysis process 300 includes analyzing the first user to be associated with other users based on at least one or more merchants associated with the first user.
  • the association analysis process 300 includes:
  • Step 310 Filter the merchant according to the predetermined condition with respect to the first user.
  • Step 320 Filter other users according to predetermined conditions with respect to the first user.
  • thus, by setting filter conditions on the merchant identifier and consumption information, the first user can be analyzed quickly in the graph to find merchants or other users that are strongly associated with the first user.
  • in one example, during the association analysis process, the one or more merchants associated with the first user are filtered according to the merchant identifier to obtain the filtered one or more merchants.
  • for example, convenience stores, specific department stores, utility payment units, and specific hotels are excluded from the one or more merchants by merchant identifier.
  • These excluded merchants can be regarded, in a particular analysis, as objects with weak association to the first user or with low analytical value. However, depending on the needs of the analysis, these merchants can be taken into consideration in other examples.
  • in another example, during the association analysis process, the one or more merchants associated with the first user are filtered according to the consumption information to obtain the filtered one or more merchants. For example, merchants whose single-transaction consumption amount is less than a predetermined value are excluded, and/or merchants whose last consumption event occurred before a specific time are excluded.
  • optionally, the consumption frequency and the type of product or service in the consumption information can also be considered.
  • the merchants associated with the first user can be screened using the merchant identifier and consumption information together.
  • as described above, by setting filter conditions on the merchant identifier and consumption information, the first user can be analyzed quickly in the graph to find merchants that are strongly associated with the first user or that match a specific association.
  • in one embodiment, during the association analysis process, other users associated with the filtered one or more merchants are determined in the graph. By first determining the merchants and then associating the first user with other users, the amount of computation can be greatly reduced and analysis efficiency improved.
  • in one example, users having a strong association with the first user are further selected from the other users according to the merchant identifier, wherein the first user and another user are determined to be strongly associated under the following preset condition: the number of merchants with which both the first user and the other user are associated exceeds a predetermined value. For example, users who share more than five associated merchants with the first user are regarded as a group that meets a specific analysis target.
  • in another example, the strength of the association between the first user and other users is further determined according to the consumption information. For example, for the same merchant, when the consumption frequencies of the first user and another user within a specific time period (for example, between two dates, or during a certain period of the day) fall in the same range (for example, 5 to 10 purchases a month), the two are regarded as strongly associated. As another example, for the same merchant, when the consumption amounts of the first user and another user within a specific time period fall in the same range (for example, consuming 5 to 10 times a month), the two are regarded as strongly associated.
  • as a further example, for the same merchant, when the products or services consumed by the first user and another user are of the same type, the two are regarded as strongly associated. It can be understood that one or more consumption factors can be combined to determine the association between users. For example, the location of a consumption event can also be taken into account.
  • in one worked example, weak association is first identified on the edge information from users to merchants.
  • a weak association is defined as follows: the merchant identifier indicates a convenience store, a specific department store, or a utility payment unit. Association analysis is performed for a specific user A to obtain all of A's non-weakly associated merchants. Then, through these merchants, all the corresponding non-weakly associated users B are obtained, and the shared merchants and consumption information between each of B1 to Bn and A are recorded.
  • when the number of merchants shared by user A and user B1 reaches at least half of all of A's non-weakly associated merchants, and/or the consumption information shows a strong association (for example, as described above), A and B1 may be considered to belong to the same group. In this way, user classification and merchant classification can be realized, the efficiency of analyzing user-oriented associations is improved, and the quality of the data service is raised.
  • in a preferred embodiment, the entire graph may be partitioned according to predetermined conditions: edges that do not satisfy the conditions are deleted, yielding one or more groups.
  • the various blocks shown in FIG. 2 may be considered as method steps, and/or considered to be operations resulting from the execution of computer program code, and/or as a plurality of coupled logic circuit elements that are constructed to perform the relevant functions.
  • although the operations are depicted in the figures in a particular order, this should not be construed as requiring that the operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve the desired results. In some cases, multitasking and parallel processing may be advantageous.
  • the system includes a data feature parsing module, an association analysis module, and an optional index module.
  • the feature parsing module is configured to maintain a graph in which the object is a vertex, and the association information between the object and the object is an edge, wherein the object includes a user and a merchant, and the edge indicates an association relationship between the user and the merchant.
  • the data feature parsing module is configured to: parse data records of users, of merchants, and of activity between users and merchants to obtain key information, where the key information includes a user identifier, a merchant identifier, and consumption information generated between the user and the merchant; and generate vertex information and edge information of the graph from the obtained key information, wherein the user identifier and the merchant identifier are used as vertex information and the consumption information is used as edge information.
  • the association analysis module is configured to analyze the first user to be associated with other users based on at least one or more merchants associated with the first user.
  • the consumption information generated between the user and the merchant includes one or more of the following: the time, time period, location, and frequency of consumption events, the consumption amount, and the type of goods consumed.
  • the association analysis module is configured to: filter the one or more merchants associated with the first user based on the merchant identity to obtain the filtered one or more merchants.
  • the association analysis module is configured to: filter the one or more merchants associated with the first user based on the consumption information to obtain the filtered one or more merchants.
  • the association analysis module is configured to determine, in the graph, other users associated with the filtered one or more merchants.
  • the association analysis module is configured to: further select, according to the merchant identifier, users having a strong association with the first user from the other users, wherein the first user and another user are determined to be strongly associated under the following preset condition: the number of merchants with which both the first user and the other user are associated exceeds a predetermined value.
  • the association analysis module is configured to further determine the strength of the association between the first user and other users according to the consumption information.
  • the index module is configured to maintain an index that uses one item of an object's key information as the key and the object's location information in the graph as auxiliary information.
  • as an example, the index module is configured to use the index to locate a first object in the graph from the first object's key information, and to find other objects associated with the first object from that location.
  • the index module may be configured to maintain an index that uses one item of an object's key information (for example, a user ID or a merchant ID) as the key and the object's location information in the graph as auxiliary information.
  • the location information indicates the positional relationship, within the storage structure of the graph (for example, an adjacency matrix or an adjacency list), between the vertex corresponding to the object and other objects.
  • the graph analysis module can quickly locate the position of an object in the graph through the index.
  • based on the data feature parsing module, the association analysis module, and the index module, update operations and analysis operations can be performed efficiently.
  • during an update operation, when the key information of an object changes, the vertex information and edge information of that object in the graph are updated in real time.
  • the exemplary embodiments can be implemented in hardware, software, or a combination thereof. For example, some aspects of the invention may be implemented in hardware, while other aspects may be implemented in software. Although aspects of the exemplary embodiments may be shown and described as block diagrams and flowcharts, it is well understood that the devices or methods described herein may be implemented, as a non-limiting example, as functional modules in a system. Furthermore, the above-described devices should not be construed as requiring such separation in all embodiments; rather, the described program components and systems can generally be integrated into a single software product or packaged into multiple software products.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Discrete Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

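A graph-based method and system for analyzing users are disclosed. The method includes: A. a data feature parsing process, including: parsing data records of users, of merchants, and of activity between users and merchants to obtain key information, wherein the key information includes a user identifier, a merchant identifier, and consumption information generated between the user and the merchant; and generating vertex information and edge information of the graph from the obtained key information, wherein the user identifier and the merchant identifier serve as vertex information and the consumption information serves as edge information; B. an association analysis process, including: analyzing the association between a first user and other users based at least on one or more merchants associated with the first user.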

Description

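Graph-based method and system for analyzing users
Technical Field
Embodiments of the present invention relate to data analysis and, in particular, to a graph-based method and system for analyzing users.
Background
With the rapid development of big data technology, data analysis directed at individual users has become possible. Traditional user analysis classifies and clusters users with methods such as Bayesian models and decision trees to discover associations between users. With large-scale data, however, association and classification algorithms directed at individual users are difficult to run effectively and often take a long time to compute. In particular, iterative model algorithms are extremely inefficient when processing large-scale data. In addition, once user information is updated, the association classification of the users must be recomputed, which greatly reduces the usefulness of the resulting data.
Summary of the Invention
According to one embodiment of the present invention, a graph-based method for analyzing users is disclosed. The method maintains a graph in which objects are vertices and the association information between objects forms the edges, wherein the objects include users and merchants and the edges indicate the association between a user and a merchant. The method includes: A. a data feature parsing process, including: parsing data records of users, of merchants, and of activity between users and merchants to obtain key information, wherein the key information includes a user identifier, a merchant identifier, and consumption information generated between the user and the merchant; and generating vertex information and edge information of the graph from the obtained key information, wherein the user identifier and the merchant identifier serve as vertex information and the consumption information serves as edge information; B. an association analysis process, including: analyzing the association between a first user and other users based at least on one or more merchants associated with the first user.
According to one embodiment of the present invention, a graph-based system for analyzing users is disclosed. The system maintains a graph in which objects are vertices and the association information between objects forms the edges, wherein the objects include users and merchants and the edges indicate the association between a user and a merchant. The system includes: A. a data feature parsing module configured to: parse data records of users, of merchants, and of activity between users and merchants to obtain key information, wherein the key information includes a user identifier, a merchant identifier, and consumption information generated between the user and the merchant; and generate vertex information and edge information of the graph from the obtained key information, wherein the user identifier and the merchant identifier serve as vertex information and the consumption information serves as edge information; B. an association analysis module configured to: analyze the association between a first user and other users based at least on one or more merchants associated with the first user.
The technical solution of the present invention shortens the time needed for data updates and data analysis, which effectively improves the timeliness of the data and raises the efficiency of association analysis and classification analysis over massive data in a big data environment. Constructing a relationship graph of users and merchants, analyzing strong and weak associations, and classifying by edge partitioning speed up analysis and processing. In addition, a graph storage architecture that can be updated in real time provides near-real-time data analysis capability.
Other features and advantages of embodiments of the present invention will also be understood from the following description when read in conjunction with the accompanying drawings, which illustrate, by way of example, the principles of embodiments of the invention.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of analyzing a user based on a graph in which objects are vertices and the association information between objects forms the edges, according to an embodiment of the present invention.
FIG. 2 is a flow chart of a graph-based method for analyzing users, according to an embodiment of the present invention.
FIG. 3 is a schematic diagram of a graph-based system for analyzing users, according to an embodiment of the present invention.
Detailed Description
In the following, the principles of the invention are described with reference to embodiments. It should be understood that the embodiments are given only so that those skilled in the art can better understand and practice the invention, not to limit its scope. For example, the many specific implementation details in this specification should not be construed as limiting the scope of the invention or of what may be claimed, but as descriptions specific to particular embodiments. For example, features described in the context of separate embodiments may be combined in a single embodiment, and features described in the context of a single embodiment may be implemented in multiple embodiments.
The present invention proposes storing and updating the data to be processed in real time based on a graph storage model. A graph is a data structure defined as graph = (V, E). V is a non-empty finite set of vertices (nodes), and E is the set of edges, generally written (Vx, Vy), where Vx and Vy belong to V. If an edge connects two nodes U and V, the two nodes are said to be associated. A weighted graph can be used to represent relationships between two adjacent vertices other than the connection itself.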
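The definition graph = (V, E) can be made concrete with a small data structure. The following is a minimal sketch, not the storage model of the patent: the class name `Graph`, its method names, and the use of one numeric weight per edge are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """graph = (V, E): V is the vertex set, E the set of edges (Vx, Vy)."""

    vertices: set = field(default_factory=set)
    # Undirected edges keyed by the unordered pair {Vx, Vy}; the value is a
    # weight that can carry a relationship beyond plain connectivity.
    edges: dict = field(default_factory=dict)

    def add_edge(self, vx, vy, weight=1.0):
        """Add (or overwrite) the edge between vx and vy."""
        self.vertices.update((vx, vy))
        self.edges[frozenset((vx, vy))] = weight

    def associated(self, u, v):
        """Two nodes are associated if an edge connects them."""
        return frozenset((u, v)) in self.edges


g = Graph()
g.add_edge("U", "V", weight=2.5)
print(g.associated("U", "V"))  # True
print(g.associated("U", "W"))  # False
```

Keying edges by an unordered pair keeps the association symmetric, which matches the undirected user-merchant relationship described here.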
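Based on this concept, the present invention proposes maintaining a graph in which objects are vertices and the association information between objects forms the edges, so that association analysis between objects (users or merchants) can use graph-based association models and algorithms, improving the performance and efficiency of data analysis. In the present invention, a user may be, for example, a bank card user or any user of a network service (e.g., online shopping), and a merchant may be any entity (e.g., a physical merchant or an online merchant) that provides a product or service.
FIG. 1 is a schematic diagram of analyzing a user based on a graph in which objects are vertices and the association information between objects forms the edges, according to an embodiment of the present invention. FIG. 1 shows users 1-7 and merchants 1-4. These 11 objects are linked by the users' consumption behavior (purchasing products or services) and form a graph. For example, after user 1 makes a purchase at merchant 1, a connection between user 1 and merchant 1 is established. The vertices of the graph in FIG. 1 represent objects, and an edge between two vertices indicates the association information between them. For example, the user identifier and the merchant identifier are used as vertex information. The association information used as edge information may be information about consumption occurring between the user and the merchant, for example the time, time period, location, or frequency of the user's consumption events at the merchant, the consumption amount, the type of goods consumed, or the merchant identifier. The present invention proposes generating a graph with users and merchants as vertices according to the characteristics of users' consumption behavior, and estimating from that graph the association between users and merchants and between users and other users.
In the example shown in FIG. 1, the vertices can be filtered by merchant identifier and consumption information according to the needs of a particular analysis.
In one example, when analyzing user 1, merchant 3, which has a specific merchant identifier (e.g., a convenience store), may be filtered out first. Then, among the remaining merchants associated with user 1, a search is made for users who share at least a predetermined number of associated merchants with user 1 (a strong degree of association). For example, the predetermined number can be set to 3, in which case the degree of association between user 4 and user 1 is strong.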
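The user 1 example can be traced in a few lines of code. This is a sketch under stated assumptions: the text does not list the edges of FIG. 1, so the adjacency below is hypothetical and only agrees with what the description states (users 1, 4, 5, and 7 are linked to merchant 2, and user 4 shares three merchants with user 1 once merchant 3 is excluded). The function name `strongly_associated` is likewise an illustration.

```python
# Hypothetical user -> merchants adjacency, consistent with the description.
USER_MERCHANTS = {
    "user1": {"merchant1", "merchant2", "merchant3", "merchant4"},
    "user2": {"merchant1", "merchant3"},
    "user3": {"merchant3"},
    "user4": {"merchant1", "merchant2", "merchant4"},
    "user5": {"merchant2", "merchant3"},
    "user6": {"merchant3", "merchant4"},
    "user7": {"merchant2"},
}


def strongly_associated(target, user_merchants, excluded, min_shared):
    """Users sharing at least `min_shared` non-excluded merchants with target."""
    own = user_merchants[target] - excluded
    result = {}
    for other, merchants in user_merchants.items():
        if other == target:
            continue
        shared = own & merchants  # `own` already omits excluded merchants
        if len(shared) >= min_shared:
            result[other] = shared
    return result


result = strongly_associated("user1", USER_MERCHANTS, {"merchant3"}, 3)
print({user: sorted(shared) for user, shared in result.items()})
# {'user4': ['merchant1', 'merchant2', 'merchant4']}
```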
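In one example, the analysis is directed at the relationships among the users associated with merchant 2. The filtering condition can then be set such that the amount spent at merchant 2 within a certain period of time is greater than a predetermined value (a strong degree of association). Based on this condition and the consumption information (edge information) between users 1, 4, 5, 7 and merchant 2, it can be determined which of users 1, 4, 5, and 7 are strongly associated with respect to merchant 2.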
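A sketch of the merchant 2 filter follows. The transactions, dates, amounts, and the threshold of 500 are invented for illustration; only the rule itself (total spent at the merchant within a period must exceed a predetermined value) comes from the description.

```python
from datetime import date

# Hypothetical edge information: (user, merchant, date, amount).
TRANSACTIONS = [
    ("user1", "merchant2", date(2015, 11, 3), 300.0),
    ("user1", "merchant2", date(2015, 11, 20), 250.0),
    ("user4", "merchant2", date(2015, 11, 8), 700.0),
    ("user5", "merchant2", date(2015, 11, 15), 40.0),
    ("user7", "merchant2", date(2015, 9, 1), 900.0),  # outside the period
]


def strong_users_for_merchant(transactions, merchant, start, end, min_total):
    """Users whose total spend at `merchant` in [start, end] exceeds min_total."""
    totals = {}
    for user, tx_merchant, day, amount in transactions:
        if tx_merchant == merchant and start <= day <= end:
            totals[user] = totals.get(user, 0.0) + amount
    return {user for user, total in totals.items() if total > min_total}


strong = strong_users_for_merchant(
    TRANSACTIONS, "merchant2", date(2015, 11, 1), date(2015, 11, 30), 500.0
)
print(sorted(strong))  # ['user1', 'user4']
```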
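Those skilled in the art will understand that the association between a user and merchants, and between that user and other users, may also be analyzed based on one or more items of the merchant identifier and consumption information (e.g., one or more of time, time period, location, frequency, consumption amount, and type of goods, and their various combinations).
Graph-based association analysis makes it possible to quickly analyze user groups and the preference trends and latent preferences of specific users. To aid understanding of the invention, further examples are described below. These examples should not be regarded as limiting.
FIG. 2 is a flow chart of a graph-based method for analyzing users, according to an embodiment of the present invention. In this method, a graph is maintained in which objects are vertices and the association information between objects forms the edges, wherein the objects include users and merchants and the edges indicate the association between a user and a merchant. The method includes a data feature parsing process 200 and an association analysis process 300.
The data feature parsing process 200 includes:
Step 210: parsing data records of users, of merchants, and of activity between users and merchants to obtain key information, wherein the key information includes a user identifier, a merchant identifier, and consumption information generated between the user and the merchant;
Step 220: generating vertex information and edge information of the graph from the obtained key information, wherein the user identifier and the merchant identifier serve as vertex information and the consumption information serves as edge information;
The association analysis process 300 includes: analyzing the association between a first user and other users based at least on one or more merchants associated with the first user.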
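Steps 210 and 220 can be sketched as a single pass over raw records. The record layout and field names (`user_id`, `merchant_id`, `time`, `amount`) are assumptions; the description does not specify a record format.

```python
from collections import defaultdict

# Hypothetical raw records; field names are assumptions for illustration.
RECORDS = [
    {"user_id": "user1", "merchant_id": "merchant1", "time": "2015-12-01T10:15", "amount": 35.0},
    {"user_id": "user1", "merchant_id": "merchant1", "time": "2015-12-09T18:40", "amount": 52.5},
    {"user_id": "user4", "merchant_id": "merchant2", "time": "2015-12-03T12:05", "amount": 120.0},
]


def build_graph(records):
    """Extract key information, then emit vertex and edge information."""
    vertices = set()
    edges = defaultdict(list)
    for record in records:
        # Step 210: key information = user ID, merchant ID, consumption info.
        user = ("user", record["user_id"])
        merchant = ("merchant", record["merchant_id"])
        consumption = {"time": record["time"], "amount": record["amount"]}
        # Step 220: identifiers become vertices; consumption info goes on the edge.
        vertices.update((user, merchant))
        edges[(user, merchant)].append(consumption)
    return vertices, dict(edges)


vertices, edges = build_graph(RECORDS)
print(len(vertices), len(edges))  # 4 2
```

Tagging each vertex with its kind (`"user"` or `"merchant"`) is a design choice of this sketch; it keeps a user ID and a merchant ID from colliding if they happen to share a value.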
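In one embodiment, the association analysis process 300 includes:
Step 310: filtering merchants, relative to the first user, according to predetermined conditions.
Step 320: filtering other users, relative to the first user, according to predetermined conditions.
Thus, by setting filter conditions on the merchant identifier and consumption information, the first user can be analyzed quickly in the graph to find merchants or other users that are strongly associated with the first user.
In one example, during the association analysis process, the one or more merchants associated with the first user are filtered according to the merchant identifier to obtain the filtered one or more merchants. For example, convenience stores, specific department stores, utility payment units, and specific hotels are excluded from the one or more merchants by merchant identifier. These excluded merchants can be regarded, in a particular analysis, as objects with weak association to the first user or with low analytical value. However, depending on the needs of the analysis, these merchants can be taken into consideration in other examples.
In another example, during the association analysis process, the one or more merchants associated with the first user are filtered according to the consumption information to obtain the filtered one or more merchants. For example, merchants whose single-transaction consumption amount is less than a predetermined value are excluded, and/or merchants whose last consumption event occurred before a specific time are excluded. Optionally, the consumption frequency and the type of product or service in the consumption information can also be taken into consideration.
It can be understood that the merchants associated with the first user can be screened using the merchant identifier and consumption information together. As described above, by setting filter conditions on the merchant identifier and consumption information, the first user can be analyzed quickly in the graph to find merchants that are strongly associated with the first user or that match a specific association.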
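The combined merchant filter might look like the sketch below. Everything concrete in it is an assumption: the `category` field stands in for whatever the merchant identifier encodes, the merchant names and values are invented, and "single-transaction amount below a predetermined value" is interpreted here as "no single purchase reaches the threshold".

```python
from datetime import date

# Hypothetical view of the first user's edges, one entry per associated merchant.
FIRST_USER_EDGES = {
    "m_cafe": {"category": "restaurant", "amounts": [80.0, 120.0], "last_time": date(2015, 12, 20)},
    "m_shop": {"category": "convenience_store", "amounts": [12.0, 9.5], "last_time": date(2015, 12, 28)},
    "m_kiosk": {"category": "newsstand", "amounts": [3.0], "last_time": date(2015, 12, 1)},
    "m_books": {"category": "bookstore", "amounts": [60.0], "last_time": date(2014, 1, 5)},
}

EXCLUDED_CATEGORIES = {"convenience_store", "department_store", "utility", "hotel"}


def filter_merchants(edges, excluded_categories, min_amount, earliest_last_time):
    """Keep merchants that pass the identifier and consumption filters."""
    kept = []
    for merchant_id, info in edges.items():
        if info["category"] in excluded_categories:
            continue  # filtered by merchant identifier
        if max(info["amounts"]) < min_amount:
            continue  # every single purchase is below the threshold
        if info["last_time"] < earliest_last_time:
            continue  # the last consumption event is too old
        kept.append(merchant_id)
    return kept


kept = filter_merchants(FIRST_USER_EDGES, EXCLUDED_CATEGORIES, 10.0, date(2015, 6, 1))
print(kept)  # ['m_cafe']
```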
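In one embodiment, during the association analysis process, other users associated with the filtered one or more merchants are determined in the graph. By first determining the merchants and then associating the first user with other users, the amount of computation can be greatly reduced and analysis efficiency improved.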
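One way to read the efficiency claim is that only users reachable through the filtered merchants are ever visited. The sketch below illustrates that traversal with an inverted merchant-to-users adjacency; the data and the function names `invert` and `users_via_merchants` are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical user -> merchants adjacency.
USER_MERCHANTS = {
    "A": {"m1", "m2", "m3"},
    "B": {"m1", "m2"},
    "C": {"m3"},
    "D": {"m4"},
}


def invert(user_merchants):
    """Build the merchant -> users adjacency from user -> merchants."""
    merchant_users = defaultdict(set)
    for user, merchants in user_merchants.items():
        for merchant in merchants:
            merchant_users[merchant].add(user)
    return merchant_users


def users_via_merchants(first_user, filtered_merchants, merchant_users):
    """Count, per other user, how many filtered merchants are shared."""
    shared = Counter()
    for merchant in filtered_merchants:
        for user in merchant_users.get(merchant, ()):
            if user != first_user:
                shared[user] += 1
    return shared


merchant_users = invert(USER_MERCHANTS)
# Suppose m3 was filtered out for user A, leaving m1 and m2.
shared = users_via_merchants("A", {"m1", "m2"}, merchant_users)
print(dict(shared))  # {'B': 2}
```

Users C and D are never touched: C is reachable only through the filtered-out merchant, and D shares no merchant with A at all.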
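In one example, during the association analysis process, users having a strong association with the first user are further selected from the other users according to the merchant identifier, wherein the first user and another user are determined to be strongly associated under the following preset condition: the number of merchants with which both the first user and the other user are associated exceeds a predetermined value. For example, users who share more than five associated merchants with the first user are regarded as a group that meets a specific analysis target.
In another example, during the association analysis process, the strength of the association between the first user and other users is further determined according to the consumption information. For example, for the same merchant, when the consumption frequencies of the first user and another user within a specific time period (for example, between two dates, or during a certain period of the day) fall in the same range (for example, 5 to 10 purchases a month), the two are regarded as strongly associated. As another example, for the same merchant, when the consumption amounts of the first user and another user within a specific time period fall in the same range (for example, consuming 5 to 10 times a month), the two are regarded as strongly associated. As a further example, for the same merchant, when the products or services consumed by the first user and another user are of the same type, the two are regarded as strongly associated. It can be understood that one or more consumption factors can be combined to determine the association between users. For example, the location of a consumption event can also be taken into account.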
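The "same range" test can be sketched as a bucket comparison. The buckets other than 5 to 10, the field names `monthly_count` and `goods_type`, and the choice to treat either criterion as sufficient are all assumptions; the description leaves the combination of factors open.

```python
# Hypothetical frequency buckets (purchases per month); the description only
# gives "5 to 10 times a month" as an example range.
FREQUENCY_RANGES = [(1, 4), (5, 10), (11, 31)]


def same_range(value_a, value_b, ranges):
    """True if both values fall inside the same inclusive range."""
    return any(low <= value_a <= high and low <= value_b <= high
               for low, high in ranges)


def edges_strongly_associated(edge_a, edge_b, ranges=FREQUENCY_RANGES):
    """Compare two users' edges to the same merchant.

    In this sketch either criterion is enough: the same monthly frequency
    bucket, or the same type of product or service.
    """
    return (same_range(edge_a["monthly_count"], edge_b["monthly_count"], ranges)
            or edge_a["goods_type"] == edge_b["goods_type"])


first = {"monthly_count": 6, "goods_type": "coffee"}
second = {"monthly_count": 9, "goods_type": "tea"}
third = {"monthly_count": 2, "goods_type": "tea"}
print(edges_strongly_associated(first, second))  # True (same frequency bucket)
print(edges_strongly_associated(first, third))   # False
print(edges_strongly_associated(second, third))  # True (same goods type)
```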
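An example according to one or more embodiments of the present invention is described below. In this example, weak association is first identified on the edge information from users to merchants. A weak association is defined as follows: the merchant identifier indicates a convenience store, a specific department store, or a utility payment unit. Association analysis is performed for a specific user A to obtain all of A's non-weakly associated merchants. Then, through these merchants, all the corresponding non-weakly associated users B are obtained, and the shared merchants and consumption information between each of B1 to Bn and A are recorded. When the number of merchants shared by user A and user B1 reaches at least half of all of A's non-weakly associated merchants, and/or the consumption information shows a strong association (for example, as described above), A and B1 may be considered to belong to the same group. In this way, user classification and merchant classification can be realized, the efficiency of analyzing user-oriented associations is improved, and the quality of the data service is raised.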
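The grouping rule in this example (weak-edge identification followed by the "at least half" test) can be sketched as follows. The edge list and merchant names are invented, and only the shared-merchant criterion is implemented; the consumption-information criterion is omitted to keep the sketch short.

```python
# Hypothetical edges: (user, merchant, is_weak). An edge is weak when the
# merchant identifier marks a convenience store, a specific department
# store, or a utility payment unit.
EDGES = [
    ("A", "cafe", False), ("A", "gym", False), ("A", "bookshop", False),
    ("A", "cinema", False), ("A", "corner_store", True),
    ("B1", "cafe", False), ("B1", "gym", False), ("B1", "corner_store", True),
    ("B2", "cinema", False),
    ("B3", "corner_store", True),
]


def same_group(user_a, edges):
    """Return users sharing at least half of user_a's non-weak merchants."""
    strong = {}
    for user, merchant, is_weak in edges:
        if not is_weak:
            strong.setdefault(user, set()).add(merchant)
    a_merchants = strong.get(user_a, set())
    group = {}
    for user, merchants in strong.items():
        common = a_merchants & merchants
        if user != user_a and common and 2 * len(common) >= len(a_merchants):
            group[user] = common
    return group


group = same_group("A", EDGES)
print({user: sorted(common) for user, common in group.items()})
# {'B1': ['cafe', 'gym']}
```

B3 shares only the weakly associated corner store with A, so it never becomes a candidate, and B2 shares one of A's four non-weak merchants, which is below half.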
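In a preferred embodiment, the entire graph may be partitioned according to predetermined conditions: edges that do not satisfy the conditions are deleted, yielding one or more groups.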
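One plausible reading of this partitioning is: delete the failing edges, then take the connected components of what remains as the groups. The description does not name an algorithm, so the depth-first component search, the edge weights, and the threshold below are assumptions.

```python
from collections import defaultdict

# Hypothetical weighted edges: (user, merchant, total amount spent).
EDGES = [
    ("u1", "m1", 500.0), ("u2", "m1", 450.0),
    ("u2", "m2", 20.0),   # fails the condition below and is deleted
    ("u3", "m2", 300.0), ("u4", "m2", 320.0),
]


def partition(edges, keep):
    """Delete edges failing `keep`, then return the connected components."""
    adjacency = defaultdict(set)
    for a, b, weight in edges:
        if keep(weight):
            adjacency[a].add(b)
            adjacency[b].add(a)
    groups, seen = [], set()
    for start in adjacency:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        groups.append(component)
    return groups


groups = partition(EDGES, keep=lambda amount: amount >= 100.0)
print([sorted(group) for group in groups])
# [['m1', 'u1', 'u2'], ['m2', 'u3', 'u4']]
```

Vertices whose edges are all deleted do not appear in any group in this sketch; whether they should form singleton groups is a choice the description leaves open.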
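The blocks shown in FIG. 2 may be regarded as method steps, and/or as operations resulting from running computer program code, and/or as a plurality of coupled logic circuit elements constructed to perform the relevant functions. Although the operations are depicted in the figures in a particular order, this should not be construed as requiring that the operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve the desired results. In some cases, multitasking and parallel processing may be advantageous.
FIG. 3 is a schematic diagram of a graph-based system for analyzing users, according to an embodiment of the present invention. As shown, the system includes a data feature parsing module, an association analysis module, and an optional index module. The feature parsing module maintains a graph in which objects are vertices and the association information between objects forms the edges, wherein the objects include users and merchants and the edges indicate the association between a user and a merchant.
According to one embodiment, the data feature parsing module is configured to: parse data records of users, of merchants, and of activity between users and merchants to obtain key information, wherein the key information includes a user identifier, a merchant identifier, and consumption information generated between the user and the merchant; and generate vertex information and edge information of the graph from the obtained key information, wherein the user identifier and the merchant identifier serve as vertex information and the consumption information serves as edge information. The association analysis module is configured to: analyze the association between a first user and other users based at least on one or more merchants associated with the first user.
The consumption information generated between the user and the merchant includes one or more of the following: the time, time period, location, and frequency of consumption events, the consumption amount, and the type of goods consumed.
In other embodiments, the association analysis module is configured to: filter the one or more merchants associated with the first user according to the merchant identifier to obtain the filtered one or more merchants.
In other embodiments, the association analysis module is configured to: filter the one or more merchants associated with the first user according to the consumption information to obtain the filtered one or more merchants.
In other embodiments, the association analysis module is configured to: determine, in the graph, other users associated with the filtered one or more merchants.
In other embodiments, the association analysis module is configured to: further select, according to the merchant identifier, users having a strong association with the first user from the other users, wherein the first user and another user are determined to be strongly associated under the following preset condition: the number of merchants with which both the first user and the other user are associated exceeds a predetermined value.
In other embodiments, the association analysis module is configured to: further determine, according to the consumption information, the strength of the association between the first user and other users.
In other embodiments, the index module is configured to maintain an index that uses one item of an object's key information as the key and the object's location information in the graph as auxiliary information. As an example, the index module is configured to use the index to locate a first object in the graph from the first object's key information, and to find other objects associated with the first object from that location. The index module may be configured to maintain an index that uses one item of an object's key information (for example, a user ID or a merchant ID) as the key and the object's location information in the graph as auxiliary information. Here, the location information indicates the positional relationship, within the storage structure of the graph (for example, an adjacency matrix or an adjacency list), between the vertex corresponding to the object and other objects. The graph analysis module can quickly locate the position of an object in the graph through the index.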
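The index described here can be illustrated with an adjacency list plus a dictionary from object ID to list position. This is a sketch under assumptions: the class name `IndexedGraph`, the list-based storage, and the use of the object ID as the indexed key item are illustrative choices, not details given in the description.

```python
class IndexedGraph:
    """Adjacency-list graph with an index from object ID to list position."""

    def __init__(self):
        self.adjacency = []  # adjacency[i] = positions of the neighbours of vertex i
        self.ids = []        # ids[i] = key information (ID) of vertex i
        self.index = {}      # key: object ID -> auxiliary information: position i

    def _position(self, object_id):
        """Return the position of object_id, creating the vertex if needed."""
        if object_id not in self.index:
            self.index[object_id] = len(self.ids)
            self.ids.append(object_id)
            self.adjacency.append(set())
        return self.index[object_id]

    def add_edge(self, user_id, merchant_id):
        """Connect a user and a merchant, keeping the index up to date."""
        u, m = self._position(user_id), self._position(merchant_id)
        self.adjacency[u].add(m)
        self.adjacency[m].add(u)

    def neighbours(self, object_id):
        """Locate the object through the index, then read its neighbours."""
        position = self.index[object_id]
        return {self.ids[i] for i in self.adjacency[position]}


graph = IndexedGraph()
graph.add_edge("user1", "merchant1")
graph.add_edge("user4", "merchant1")
print(sorted(graph.neighbours("merchant1")))  # ['user1', 'user4']
```

The dictionary lookup replaces a scan over all vertices, which is the benefit the description attributes to the index.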
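Based on the data feature parsing module, the association analysis module, and the index module, update operations and analysis operations can be performed efficiently. During an update operation, when the key information of an object changes, the vertex information and edge information of that object in the graph are updated in real time.
The exemplary embodiments can be implemented in hardware, software, or a combination thereof. For example, some aspects of the invention may be implemented in hardware, while other aspects may be implemented in software. Although aspects of the exemplary embodiments may be shown and described as block diagrams and flowcharts, it is well understood that the devices or methods described herein may be implemented, as a non-limiting example, as functional modules in a system. Furthermore, the above-described devices should not be construed as requiring such separation in all embodiments; rather, the described program components and systems can generally be integrated into a single software product or packaged into multiple software products.
Various modifications and variations of the foregoing exemplary embodiments will become apparent to those skilled in the relevant art upon reading the foregoing description in conjunction with the accompanying drawings. Therefore, embodiments of the invention are not limited to the specific embodiments disclosed, and modifications and other embodiments are intended to fall within the scope of the appended claims.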

Claims (20)

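  1. A graph-based method for analyzing users, characterized in that a graph is maintained in which objects are vertices and the association information between objects forms the edges, wherein the objects include users and merchants and the edges indicate the association between a user and a merchant, the method comprising:
    A. a data feature parsing process, comprising:
    parsing data records of users, of merchants, and of activity between users and merchants to obtain key information, wherein the key information includes a user identifier, a merchant identifier, and consumption information generated between the user and the merchant;
    generating vertex information and edge information of the graph from the obtained key information, wherein the user identifier and the merchant identifier serve as vertex information and the consumption information serves as edge information;
    B. an association analysis process, comprising:
    analyzing the association between a first user and other users based at least on one or more merchants associated with the first user.
  2. The method of claim 1, characterized in that
    the consumption information generated between the user and the merchant includes one or more of the following:
    the time, time period, location, and frequency of consumption events, the consumption amount, and the type of goods consumed.
  3. The method of claim 1, characterized in that,
    in the association analysis process,
    the one or more merchants associated with the first user are filtered according to the merchant identifier to obtain the filtered one or more merchants.
  4. The method of claim 1, characterized in that,
    in the association analysis process,
    the one or more merchants associated with the first user are filtered according to the consumption information to obtain the filtered one or more merchants.
  5. The method of claim 3 or 4, characterized in that,
    in the association analysis process,
    other users associated with the filtered one or more merchants are determined in the graph.
  6. The method of claim 3 or 4, characterized in that,
    in the association analysis process,
    users having a strong association with the first user are further selected from the other users according to the merchant identifier, wherein the first user and another user are determined to be strongly associated under the following preset condition:
    the number of merchants with which both the first user and the other user are associated exceeds a predetermined value.
  7. The method of claim 3 or 4, characterized in that,
    in the association analysis process,
    the strength of the association between the first user and other users is further determined according to the consumption information.
  8. The method of claim 1, characterized in that the method comprises:
    maintaining an index that uses one item of an object's key information as the key and the object's location information in the graph as auxiliary information.
  9. The method of claim 8, characterized in that
    the index is used to locate a first object in the graph from the first object's key information, and other objects associated with the first object are found from the first object's location in the graph.
  10. The method of claim 9, characterized in that
    the graph and the index are stored in a distributed architecture.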
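  11. A graph-based system for analyzing users, characterized in that a graph is maintained in which objects are vertices and the association information between objects forms the edges, wherein the objects include users and merchants and the edges indicate the association between a user and a merchant, the system comprising:
    A. a data feature parsing module configured to:
    parse data records of users, of merchants, and of activity between users and merchants to obtain key information, wherein the key information includes a user identifier, a merchant identifier, and consumption information generated between the user and the merchant;
    generate vertex information and edge information of the graph from the obtained key information, wherein the user identifier and the merchant identifier serve as vertex information and the consumption information serves as edge information;
    B. an association analysis module configured to:
    analyze the association between a first user and other users based at least on one or more merchants associated with the first user.
  12. The system of claim 11, characterized in that
    the consumption information generated between the user and the merchant includes one or more of the following:
    the time, time period, location, and frequency of consumption events, the consumption amount, and the type of goods consumed.
  13. The system of claim 11, characterized in that
    the association analysis module is configured to:
    filter the one or more merchants associated with the first user according to the merchant identifier to obtain the filtered one or more merchants.
  14. The system of claim 11, characterized in that
    the association analysis module is configured to:
    filter the one or more merchants associated with the first user according to the consumption information to obtain the filtered one or more merchants.
  15. The system of claim 13 or 14, characterized in that
    the association analysis module is configured to:
    determine, in the graph, other users associated with the filtered one or more merchants.
  16. The system of claim 13 or 14, characterized in that
    the association analysis module is configured to:
    further select, according to the merchant identifier, users having a strong association with the first user from the other users, wherein the first user and another user are determined to be strongly associated under the following preset condition:
    the number of merchants with which both the first user and the other user are associated exceeds a predetermined value.
  17. The system of claim 13 or 14, characterized in that
    the association analysis module is configured to:
    further determine, according to the consumption information, the strength of the association between the first user and other users.
  18. The system of claim 11, characterized in that the system further comprises:
    an index module configured to maintain an index that uses one item of an object's key information as the key and the object's location information in the graph as auxiliary information.
  19. The system of claim 18, characterized in that
    the index module is configured to use the index to locate a first object in the graph from the first object's key information, and to find other objects associated with the first object from the first object's location in the graph.
  20. The system of claim 19, characterized in that
    the system stores the graph and the index in a distributed architecture.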
PCT/CN2016/111441 2015-12-31 2016-12-22 Graph-based method and system for analyzing users WO2017114276A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201511028240.3 2015-12-31
CN201511028240.3A CN105678323A (zh) 2015-12-31 2015-12-31 Graph-based method and system for analyzing users

Publications (1)

Publication Number Publication Date
WO2017114276A1 (zh) 2017-07-06

Family

ID=56189899

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/111441 WO2017114276A1 (zh) 2015-12-31 2016-12-22 Graph-based method and system for analyzing users

Country Status (3)

Country Link
CN (1) CN105678323A (zh)
TW (1) TWI621989B (zh)
WO (1) WO2017114276A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
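CN111782847A (zh) * 2019-07-31 2020-10-16 北京京东尚科信息技术有限公司 Image processing method, apparatus, and computer-readable storage medium
CN111951035A (zh) * 2019-05-17 2020-11-17 上海树融数据科技有限公司 Consumption analysis method, system, and apparatus, and consumption analysis platform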

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
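CN105678323A (zh) * 2015-12-31 2016-06-15 中国银联股份有限公司 Graph-based method and system for analyzing users
CN107316205A (zh) * 2017-05-27 2017-11-03 银联智惠信息服务(上海)有限公司 Method, apparatus, computer-readable medium, and system for identifying cardholder attributes
CN109947865B (zh) * 2018-09-05 2023-06-30 中国银联股份有限公司 Merchant classification method and merchant classification system
CN111127089B (zh) * 2019-12-18 2023-09-19 北京数衍科技有限公司 Bill data processing method, apparatus, and electronic device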

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080300940A1 (en) * 2007-05-31 2008-12-04 Gosakan Aravamudan Capturing Consumer Requirements
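CN102254028A (zh) * 2011-07-22 2011-11-23 青岛理工大学 Personalized product recommendation method and system integrating attribute and structural similarity
CN102929892A (zh) * 2011-08-12 2013-02-13 莫润刚 System and method for precise information promotion based on social networks
CN105678323A (zh) * 2015-12-31 2016-06-15 中国银联股份有限公司 Graph-based method and system for analyzing users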

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7788254B2 (en) * 2007-05-04 2010-08-31 Microsoft Corporation Web page analysis using multiple graphs
US8655805B2 (en) * 2010-08-30 2014-02-18 International Business Machines Corporation Method for classification of objects in a graph data stream
US9372589B2 (en) * 2012-04-18 2016-06-21 Facebook, Inc. Structured information about nodes on a social networking system
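CN103838804A (zh) * 2013-05-09 2014-06-04 电子科技大学 Method for mining association rules of social network user interests based on community division
CN104915879B (zh) * 2014-03-10 2019-08-13 华为技术有限公司 Method and apparatus for mining social relationships based on financial data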

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
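CN111951035A (zh) * 2019-05-17 2020-11-17 上海树融数据科技有限公司 Consumption analysis method, system, and apparatus, and consumption analysis platform
CN111951035B (zh) * 2019-05-17 2024-06-11 嘉兴树融数据科技有限公司 Consumption analysis method, system, and apparatus, and consumption analysis platform
CN111782847A (zh) * 2019-07-31 2020-10-16 北京京东尚科信息技术有限公司 Image processing method, apparatus, and computer-readable storage medium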

Also Published As

Publication number Publication date
TWI621989B (zh) 2018-04-21
CN105678323A (zh) 2016-06-15
TW201725499A (zh) 2017-07-16

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16881065

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16881065

Country of ref document: EP

Kind code of ref document: A1